About Me

Web person at the Imperial War Museum, just completed PhD about digital sustainability in museums (the original motivation for this blog was as my research diary). Posting occasionally, and usually museum tech stuff but prone to stray. I welcome comments if you want to take anything further. These are my opinions and should not be attributed to my employer or anyone else (unless they thought of them too). Twitter: @jottevanger

Thursday, February 25, 2010

Linked Data meeting at the Collections Trust

[December 2010: I don't even know anymore if this was ever published, or if I simply edited it and it went back into draft. If the latter, duh. If the former, well, in the spirit of an end-of-year clearout here's something I wrote many months ago]
[UPDATE, March 2010: Richard Light's presentation is now available here]

On February 22nd Collections Trust hosted a meeting about Linked Data (LD) at their London Bridge offices. Aside from yours truly and a few other admitted newbies amongst the very diverse set of people in the room, there was a fair amount of experience in LD-related issues, although I think only a few could claim to have actually delivered the genuine article to the real world. We did have two excellent case studies to start discussion, though, with Richard Light and Joe Padfield both taking us through their work. CT's Chief Executive Nick Poole had invited Ross Parry to chair and tasked him with squeezing out of us a set of principles from which CT could start to develop a forward plan for the sector, although it should be noted that they didn’t want to limit things too tightly to the UK museum sector.

In the run-up to the meeting I’d been party to a few LD-related exchanges, but they’d mainly been concentrated into the 140 characters of tweets, which is pragmatic but can be frustrating for all concerned, I think. The result was that the merits, problems, ROI, technical aspects etc of LD sometimes seemed to disappear into a singularity where all the dimensions were mashed into one. For my own sanity, in order to understand the why (as well as the how) of Linked Data, I hoped to see the meeting tease these apart again as the foundation for exploring how LD can serve museums and how museums can serve the world through LD. I was thinking about these as axes for discussion:

  • Creating vs consuming Linked Data

  • End-user (typically, web) vs business, middle-layer or behind-the-scenes user

  • Costs vs benefits. ROI may be thrown about as a single idea, but it’s composed of two things: the investment and the return.

  • On-the-fly use of Linked Data vs ingested or static use of Linked Data

  • Public use vs internal drivers
So I took to this meeting a matchbox full of actual knowledge, a pocket full of confusion and this list of axes of inquiry. In the end the discussion did tread some of these axes whilst others went somewhat neglected, but it was productive in ways I didn’t expect and managed to avoid getting mired in too much technology.

To start us off, Richard Light spoke about his experiments with the Wordsworth Trust’s ModesXML database (his perennial sandbox), taking us through his approach to rendering RDF using established ontologies, to linking with other data nodes on the web (at present I think limited to GeoNames for location data, grabbed on the fly), and to cool URIs and content negotiation. Concerning ontologies, we all know the limitations of Dublin Core but CIDOC-CRM is problematic in its own way (it’s a framework, after all, not a solution), and Richard posed the question of whether we need any specific “museum” properties, or should even broaden the scope to a “history” property set. He touched on LIDO, a harvesting format but one well placed to present documents about museum objects and which tries to act as a bridge between North American formats (CDWALite) and European initiatives including CIDOC-CRM and SPECTRUM (LIDO intro here, in depth here (both PDF)). LIDO could be expressed as RDF for LD purposes.
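For anyone else who found "cool URIs and content negotiation" a bit abstract, here is a minimal sketch of the general pattern, not Richard's actual implementation. It assumes one identifier URI per object and a separate document URL for each representation, and it answers with a 303 redirect to whichever document best matches the request's Accept header. The paths and the `negotiate` and `parse_accept` names are made up for illustration.

```python
# Hypothetical cool-URI layout: /id/object/1234 identifies the object itself,
# and each representation of it lives at its own document URL.
REPRESENTATIONS = {
    "text/html": "/doc/object/1234.html",
    "application/rdf+xml": "/data/object/1234.rdf",
}
DEFAULT_TYPE = "text/html"


def parse_accept(header):
    """Return the acceptable media types in an Accept header, best q-value first."""
    ranked = []
    for position, part in enumerate(header.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for extra in fields[1:]:
            if extra.startswith("q="):
                try:
                    q = float(extra[2:])
                except ValueError:
                    q = 0.0
        if q > 0:  # q=0 means "not acceptable"
            ranked.append((-q, position, media_type))
    return [media_type for _, _, media_type in sorted(ranked)]


def negotiate(accept_header):
    """Decide where a request for /id/object/1234 should be redirected."""
    if not accept_header.strip():
        return 303, REPRESENTATIONS[DEFAULT_TYPE]
    for media_type in parse_accept(accept_header):
        if media_type in REPRESENTATIONS:
            return 303, REPRESENTATIONS[media_type]
        if media_type == "*/*":
            return 303, REPRESENTATIONS[DEFAULT_TYPE]
    return 406, None  # nothing we can serve


browser = negotiate("text/html,application/xhtml+xml;q=0.9,*/*;q=0.8")
crawler = negotiate("application/rdf+xml")
```

A browser ends up at the HTML page and an RDF-aware client at the RDF document, while both started from the same identifier.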

For Richard, the big LD challenges for museums are agreeing an ontology for cross-collection queries via SPARQL; establishing shared URLs for common concepts (people, places, events etc); developing mechanisms for getting URLs into museum data; and getting existing authorities available as LD. Richard has kindly allowed me to upload his presentation Adventures in Linked Data: bringing RDF to the Wordsworth Trust to Slideshare.
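To make the point about shared URLs concrete, here is a toy sketch in plain Python rather than a real triple store or SPARQL endpoint. It assumes two museums that happen to use the same identifier for a place and the same property to point at it; every URI, and the `match` helper, is invented for illustration. The pattern match at the end does roughly the job a single SPARQL triple pattern would do.

```python
# Placeholder identifiers throughout: none of these URIs is real.
PLACE = "http://example.org/place/grasmere"  # stands in for a shared place URI
ASSOCIATED_PLACE = "http://example.org/terms/associatedPlace"  # stands in for an agreed property

museum_a = [
    ("http://museum-a.example/object/1", ASSOCIATED_PLACE, PLACE),
    ("http://museum-a.example/object/2", ASSOCIATED_PLACE, "http://example.org/place/london"),
]
museum_b = [
    ("http://museum-b.example/object/77", ASSOCIATED_PLACE, PLACE),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    ]


# Because both collections share the place URI and the property,
# merging them is just concatenation and one pattern finds objects in both.
merged = museum_a + museum_b
objects_from_place = sorted(
    s for s, _, _ in match(merged, predicate=ASSOCIATED_PLACE, obj=PLACE)
)
```

If either museum had minted its own URI for the place, or used a different property, the same pattern would quietly miss its objects, which is why the shared ontology and shared identifiers come first in Richard's list.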

Joe Padfield took us through a number of semantic web-based projects he’s worked on at the National Gallery. I’m afraid I was too busy listening to take many notes, but go and ferret out some of his papers from conferences or look here. I did register that he was suggesting 4store as an alternative to Sesame for a triple store; that they use a CRM-based data model; that they have a web prototype built on a SPARQL interface which is damn quick; and that data mining is the key to getting semantic info out of their extensive texts because data entry is a mare. A notable selling point of SW to the “business” is that the system doesn’t break every time you add a new bit of data to the model.

Beyond this, my notes aren’t up to the task of transcribing the discussion, but I will put down here the things that stuck with me, which may be other people’s ideas or assertions or my own; I’m often no longer sure!

My thoughts in bullet-y form
I’m now more confident in my personal simplification that LD is basically about an implementation of the Semantic Web “up near the surface”, where regular developers can deploy and consume it. It seems like SW with the “hard stuff” taken out, although it’s far from trivial. It reminds me a lot of microformats (and in fact the two can overlap, I believe) in this surfacing of SW to, or near to, the browsable level that feels more familiar.

Each audience to which LD needs explaining or “selling” will require a different slant. For policy makers and funders, the open data agenda from central government should be enough to persuade them that (a) we have to make our data more readily available and (b) LD-like outputs should be attached as a condition to more funding; they can also be sold on the efficiency argument of doing more with less, avoiding the duplication of effort and using networked information to make things possible that would otherwise not be. For museum directors and managers, strings attached to funding, the “ethical” argument of open data, the inevitability argument and the potential for within-institution and within-partnership use of semantic web technology might all be motives for publishing LD, whilst for consuming it we can point to (hopefully) increased efficiency and cost savings, the avoidance of duplication etc. For web developers, for curators and registrars, for collections management system vendors, there are different motives again. But all would benefit from some co-ordination so that there genuinely is a set of services, products and, yes, data upon which museums can start to build their LD-producing and -consuming applications.

There was a lot of focus on producing LD but less on consuming it; more than this, there was a lot of focus on producing linkable data, i.e. RDF documents, rather than on linking it in some useful fashion. It's a bit like that packaging that says "made of 100% recyclable materials": OK, that's good, but I'd much rather see "made of 100% recycled materials". All angles of attack should be used in order to encourage museums to get involved. I think that the consumption aspect needs a bit of shouting about, but it also could do with some investment from organisations like Collections Trust that are in a position potentially to develop, certify, recommend, validate or otherwise facilitate LD sources that museums, suppliers etc will feel they can depend upon. This might be a matter of partnering with Getty, OCLC or Wikipedia/dbPedia to open up or fill in gaps in existing data, or giving a stamp of recommendation to GeoNames or similar sources of referenceable data. Working with CMS vendors to make it easy to use LD in Modes, Mimsy, TMS, KE EMu etc, and in fact to make it more efficient than not using LD: now that would make a difference. The benefits depend upon an ecosystem developing, so bootstrapping that is key.

SPARQL: it ain’t exactly inviting. But then again I can’t help but feel that if the data was there, we knew where to find it and had the confidence to use it, more museum web bods like me would give it a whirl. The fact that more people are not taking up the challenge of consuming LD may be partly down to this sort of technical barrier, but may also be to do with feeling that the data are insecure or unreliable. Whilst we can “control” our own data sources and feel confident to build on top of them, we can’t control dbPedia etc., so we lack confidence in building apps that depend on them (Richard observed that dbPedia contains an awful lot of muddled and wrong data, and Brian Kelly's recent experiment highlighted the same problem). In the few days since the meeting there have been more tweets on this subject, including references to this interesting-looking Google Code project for a Linked Data API to make it simpler to negotiate SPARQL. With Jeni Tennison as an owner (who has furnished me with many an XSLT insight and countless code snippets) it might actually come to something.

Tools for integrating LD into development UIs for normal devs like me – where are they?

If LD in cultural heritage needs mass in order for people to take it up, then as with semantic web tech in general we should not appeal to the public benefit angle but to internal drivers: using LD to address needs in business systems, just as Joe has shown, or between existing partners.

What do we need? Shared ontologies, LD embedded in software, help with finding data sources, someone to build relationships with intermediaries like publishers and broadcasters that might use the LD we could publish.

Outcomes of the meeting
So what did we come up with as a group? Well Ross chaired a discussion at the end that did result in a set of principles. Hopefully we'll see them written up soon coz I didn't write them down, but they might be legible on these images:

Monday, February 08, 2010

Museums and online logins pt.3: making a better “why” (with a bit of how)

Breaking down collections database silos has long been a dream for me as a user (and consequently as a developer), hence my involvement in and (perhaps unrealistically) high hopes for Europeana. Well, collections-related functions aren’t all that museum sites have in common, and lots of things would work better if a few more walls were knocked over. In Part 3 I’ll suggest some of the things that a service built around universal museum login could offer that aren’t going to happen with the current situation, but could be of value to both museums and their online users.

In the scenario in Part 2 you used your MuPPort ID to log into a museum site that you’d never visited before, and then fiddled around with your profile before using the museum’s thimble freaks’ forum. What else could you do? How about bookmarking a few items in the thimble collection’s pages? You could then tag them and put them in a set along with the thimbles you faved on the Framley Museum site, then share this set with a group of thimble lovers on the MuPPort site. Whilst you’re there you put some pictures from the V&A next to others from the British Postal Museum and perhaps use them to build a nice little timeline in Magic Studio (I’d better mention my Declaration of Interest here). How about saving some events listings, filtered to your preferences, from a few museum sites, supplemented with the events listings held in the Culture24 database? This being an OpenID site, if you gave it permission to link to your Flickr account you could also see your museum-related groups and contacts here, perhaps filtering new uploads of thimble-related images from known museum Flickr accounts for you to view and comment upon. In short, this would be a service where all your stuff from UK museums would be in one place for you to mix up, share, discuss, tag.

What would be required of museums? Well actually, unless they wanted some function based on registration that the service didn't offer or if they needed access control functionality (authorisation), nothing at all. Imagine then that most of the time you didn’t need to log into museum sites at all, only MuPPort, using the MuPPort bookmarklet to favourite object records and save searches. A museum would be added automatically to your master profile whenever this happened. When login was necessary on a given site – such as for using a forum – it would be semi-automated, much like logging into YouTube when already logged into Google (or like Athens, if you know that). But many services would be much more valuable when run across institutions and would be fit for MuPPort, or a developer building on its platform. And much of the time login is important for tying data to a user rather than for authorising that user, so lots of tools could wrap around sites in just the way that Delicious does: requiring nothing of the site itself, only that the user is logged in to Delicious. If nothing at all is required of the museum, where’s the catch?

Well, actually the more interesting question is: where’s the extra value for museums? Here we’re finally back to recommendation systems, which got me mulling over universal login again. Hardly any museum is going to learn much from the patterns of its own users alone, whether through their explicit, conscious actions such as favouriting and tagging, or the trails they implicitly leave browsing and searching its site. Put together, though, many users over many sites, hopefully doing more, adds up to a lot of knowledge and a good source of recommendations. This is good for museums and for users.
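As a rough illustration of why pooling matters, here is a toy co-favouriting recommender. It is only a sketch: the users, museums and objects are invented, and `recommend` and `only_museum` are hypothetical helpers, not part of any real service. An item is suggested if it was favourited by someone who shares at least one favourite with you.

```python
from collections import Counter

# Invented favourites data: user -> set of (museum, object) pairs.
favourites = {
    "ann": {("Museum A", "thimble-1"), ("Museum B", "thimble-9")},
    "bob": {
        ("Museum A", "thimble-1"),
        ("Museum B", "thimble-9"),
        ("Museum B", "needle-case-3"),
    },
    "cat": {("Museum A", "thimble-1")},
}


def recommend(user, data):
    """Rank unseen items by how many like-minded users favourited them."""
    mine = data[user]
    scores = Counter()
    for other, theirs in data.items():
        if other == user or not (mine & theirs):
            continue  # skip the user themselves and anyone with no overlap
        for item in theirs - mine:
            scores[item] += 1
    return [item for item, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]


def only_museum(data, museum):
    """What a single museum would see if it only had its own users' activity."""
    return {u: {i for i in items if i[0] == museum} for u, items in data.items()}


pooled = recommend("cat", favourites)
siloed = recommend("cat", only_museum(favourites, "Museum A"))
```

With the pooled data, "cat" is pointed towards two objects at Museum B; with Museum A's data alone there is nothing to recommend, because everyone there has favourited the same single thimble.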

Well it's late and I've spent too much time on this so I’m going to leave it at that. There’s clearly overlap here with what could be offered by Europeana, but in the UK there are other organisations well suited to the task (of course I’m looking at you, Nick). I wouldn’t want to say really who or what should provide a service like this, but I do think that universal login is only a part: the whole point is to build real value on top of a nexus of users and specialised content which current generalist alternatives can’t really offer. Is there a case for a service like this? I’d love to hear your thoughts. And thanks for getting this far...

I’ve put a quick survey up to find out what sort of registration-dependent activities museums run at the moment on and off their own websites. If you work on a museum website it would be really interesting to have your input, which I’ll put on this blog in due course.

Sunday, February 07, 2010

Museums and online logins pt.2: making a better “how”

(or "It's a universal tribulation")
In Part 1, I posited that one reason for museums not to integrate registration and login with their sites must be the hassle for both parties (museum and user). So, what if it wasn't a hassle to give your users a registration facility and a bunch of tools that depend on it? What if it was easy technically and yielded lots of value because it was popular? In short, what if the business case was good (which is incidentally the essence of my perspective on sustainability: sustaining appropriate levels of resources by identifying and sustaining value)? What if the users liked it because the value to them was higher, with the benefits of logged-in actions extending beyond a single museum and letting you bring together all your "stuff" from countless museums and galleries? In Part 2 I’ll look at lowering the barrier, before moving on to boosting the rewards in Part 3. First, though, let’s quickly off-road to look at the context within which museums operate online.

A digression: the Modalities of Regulation (from The Digital Sustainability Foundation Course)
I find it difficult these days to think about investment without thinking about sustainability. This isn't just because sustainability should be on your mind when you build something new (doh!), but because the things that make something worth building are the same as those that make it worth your while to keep it going. Or they can be, but that's another blog post.

An organisation’s internal drivers are vital in the question of whether and how they go about "building" and "operating", but these values and imperatives are in constant negotiation with the environment within which they operate. Larry Lessig talks about this in terms of the regulation of behaviour i.e. external factors that affect decisions, amongst them the law, the market, social norms and "architecture", which amounts to the stuff you just have to live with (whether that be the law of gravity, the presence of a mountain in your path, or your inability to see through walls).

An interesting constraint in this discussion is the market, which relates both to the users we have in mind and the ecosystem of competitors, peers and solution providers we operate within (often enough a "competitor" is also a provider of a solution, according to how you come at them c.f. Google, Wikipedia, Flickr, and many more). It's immediately apparent that the market has enabled museums to achieve a lot of social stuff without having to do the tricky stuff themselves, and as importantly to tap into a much larger audience as a result. Several brilliant examples of using Flickr make it the most notable case, for me. On the architectural side, OpenID (http://openid.net) is a liberating “constraint” with market-y aspects, being a technical framework that relies on the market for OpenID providers. OpenID hasn't yet seen a lot of museum usage, perhaps because it's a bit technical or again simply because the use case is still too weak for offering logged-in sections at all. The Brooklyn Museum lets people use their Google IDs to log in (they use this for their famous Posse, which is clearly a compelling use case) but it's not exactly a widespread practice.

Whenever thinking about solving a problem it makes sense to look at the environment, especially the market, and examine what solutions already exist, how they fail, and perhaps how they could be adopted or adapted. For now I'll just say that with Delicious, Google's SideWiki, Zotero and many other relatively generalist, site-agnostic, "wrap-around" tools we have some great foundations to build upon. They form an important aspect of “the market”, and in some cases also the architecture. Before trying to reproduce what they do we should ask the question of whether they fall short in significant ways. I would suggest that they do because, whilst their success is based upon their not requiring anything particular of the web resources they are used upon, there is not much scope for pushing crowd-sourced knowledge back to those resources.

Lower the barrier or De-hassle the users, de-hassle the techs
As a developer, if I were asked to create some functionality that needed people to register I know various ways I could go about it, but it’s not trivial and if you had to deploy it over multiple web apps it could be a bit of a nightmare for a hack like me. Most other museums are even more thinly resourced than my own in terms of developer capacity.

As a user, I’ve got enough logins, thank you, and I’d pause for thought before registering with the Louvre or the BM, let alone Colchester Castle Museum. Another password? Another profile to build and manage? And although I’m going to do much of the same stuff as on all the other museum websites I visit, no connection between them? How useful is a Facebook widget for my favourite objects from a single museum? How many such widgets will I add to my profile? (probably none because I don’t know of any right now, but perhaps that’s due to exactly this barrier: it’s worthwhile neither for users nor for single museums to make single museum apps of this sort)

So I’m proposing a UK museum passport – let’s call it a MuPPort for now, so that we remember to change it – a universal(ish) digital login service, although perhaps with an off-line dimension. This would help to lower the barrier for both sides. It would probably be built around OpenID and, to make museum coders' lives easier, would come with a bunch of widgets and code snippets. There would be access level control for those parts of a museum’s site that needed to be restricted to a particular group. Users would benefit from feature profile management tools and from various value-added stuff we’ll look at in Part 3, much of which might not even need the museum to implement MuPPort on their site because it’s not about access control, it’s just about identifying the user and keeping their stuff tied to their identity.

Imagine a scenario where, as a user, you can go to a museum site that you've never visited before and log in using your Google ID, or whatever OpenID identity you have attached to your MuPPort. Before going any further, you decide to visit the MuPPort site and check out your passport master account, where you see individual accounts for three museums you had logged into previously, together with the stub of an account for the one you just went to. You customise your profile for the new museum, then when you go and contribute to their thimble-lovers’ forum MuPPort will ensure that the right avatar and username show up and are linked through to your Facebook page and blog.
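One way to picture the scenario above is as a small data model: a master passport keyed on an OpenID identifier, holding one account per museum, with a stub created automatically on first login. This is purely a sketch of the idea; the class names, fields and example identifiers are all hypothetical, and nothing here talks to a real OpenID provider.

```python
from dataclasses import dataclass, field


@dataclass
class MuseumAccount:
    museum: str
    display_name: str = ""
    avatar_url: str = ""
    is_stub: bool = True  # True until the user customises the profile


@dataclass
class Passport:
    openid_identifier: str  # whatever identity the user attached to the passport
    accounts: dict = field(default_factory=dict)

    def log_in(self, museum):
        """Return the account for this museum, creating a stub on first visit."""
        if museum not in self.accounts:
            self.accounts[museum] = MuseumAccount(museum=museum)
        return self.accounts[museum]

    def customise(self, museum, display_name, avatar_url):
        """Fill in the per-museum profile so it is no longer a stub."""
        account = self.log_in(museum)
        account.display_name = display_name
        account.avatar_url = avatar_url
        account.is_stub = False
        return account


# Made-up identity and museum names, following the scenario in the text.
passport = Passport("https://openid.example/alice")
for name in ("Museum One", "Museum Two", "Museum Three"):
    passport.customise(name, "alice", "https://avatars.example/alice.png")

new_account = passport.log_in("Thimble Museum")  # first visit: stub only
stubs = [a.museum for a in passport.accounts.values() if a.is_stub]
```

The museum's forum would then only need to ask the passport for the right account to show the right avatar and username.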

So logging in could be made easier for everyone, but how about the more compelling case for users? Well that's where the fun starts, because whilst it could simply be a central sign-in system that then relies on individual museum websites to take care of all of the real action, for many functions it would make a lot more sense to run them centrally and gain from the pooled content as a result.

Museums and online logins pt.1: who’s doing what, where and why

It's not happening
Right now, not that many museums let users register and do extra stuff online. That assertion is very unscientific, but looking around the London nationals' websites (but looking no further than the home pages/main menus) I find:

  • British Museum: No login

  • Imperial War Museum: No login

  • National Gallery: No login

  • National Maritime Museum: No login

  • National Portrait Gallery: No login

  • Natural History Museum: Login for NaturePlus (forums, blogs, bookmarking etc.)

  • Science Museum: No login

  • Tate: No login

  • Victoria & Albert Museum: No login
Of these, all but the National Gallery are also partners in Creative Spaces, the "grown up" part of the National Museums Online Learning Project launched last year. This requires you to log in to do much of the interesting stuff, and the need for this and benefits of doing so are pretty self-evident (at least if you buy into the application, and there are others in the museum geek community who don't). The login, though, is not good outside Creative Spaces AFAIK.

I realise that my sample was purely London majors and there are perhaps other UK museums that do have registered user areas, but if this sample is at all representative, why is registration, or more pertinently the sort of activity that would require registration, apparently so scarce on museum sites? I mean, there must be lots of things we'd like to do with museum/gallery content or in museum/gallery contexts (meat-space or digital) that would need you to sign up, and there's always loads of talk about funky collaborative stuff, social media, personalisation... So if the big 'uns don't generally do it there's got to be a reason worth digging in to.

There are plenty of reasons why this may be so and I'll have a look at some later on, but in simple terms I'd put it down to "it's tricky to do" and "is it worth it anyway?" (that last referring to the value that both museums and their users would get for all the hassle). Certainly at the Museum of London we've had the registration discussion more than once. Mia was always opposed, and rightly so I think, on the grounds that it's exclusive and off-putting with little to gain from it; yet the temptation is there, especially if you're considering value-added features that depend upon registration to be effective - say, favouriting objects or saving searches. In fact, the redesign that we’re currently engaged in on the MOL sites is another reason why registration has been on my mind again, since it always brings up the suggestion of a special area for the Friends. Quite what they’d get in that area I couldn’t tell ya. It does bring up one important distinction between reasons for “doing” login: it can be for access control, or it can be for identifying the user and associating them with their stuff (or both of the above).

Of course, doing some of this stuff needn’t involve logging in to our sites at all. So the next question is, what sort of things are museums doing with third party services that involve people using (optionally or otherwise) registered identities? And would it ever make sense to try to go it alone instead?

Social media basics: Twitter, Facebook
The whole point of participating in the likes of Facebook or Twitter is to become part of a wide existing social network. Whilst museums might sometimes have specific narrower audiences in mind for some of what they do here, there’s surely nothing to be gained by doing it anywhere but where the people are. For richer but more targeted social networks, Ning is a popular alternative hosted solution.

Blogs, forums, and comments
Lots of museums run blogs, some of them on hosted services like Blogger, others on their own installations (MOL, for example). These may have their own login system (Wordpress, for example) but they often hook into OpenID too, so if users want to make comments they can use an existing identity. Forums, too, can be installed and run without having to run a registration/login system, or you can use hosted alternatives like Google or Yahoo! groups (for all their failings). And of course, people feed back to museums through “contact us” forms. It would be a fool who made these accessible only to logged-in users, but one can see how it could be useful to link together contact forms with known users.

Wikis and Wikipedia
Installing MediaWiki or using the likes of PBWiki can involve some sort of user management, although you don’t have to build your own system for this. Perhaps in some cases you can use OpenID? If users currently have to register to edit your wiki, one can see how it might be easier if this was tied to other things they might have to register for.

Wikipedia, of course, is not under your control and nor do you need to be registered to edit it, so although plenty of museums use it for their own purposes it doesn’t really fit into this discussion. Wikipedia is socially produced but is not a social space in the same way as other sites, although discussions can grow up around topics.

Sharing media
YouTube, iTunes and Flickr are the classic places for sharing media, and this is for a combination of pragmatic/economic reasons (free hosting, unlimited bandwidth, no real technical challenges, easy embedding, a UI dedicated to that medium) and because of the ready access to a user base, many of them registered, who can favourite, tag and comment on your assets. Museums’ use of Flickr is much more sophisticated overall than with, say, YouTube, owing partly to the simple fact that photographs constitute a large part of many collections (and can represent most of the rest). A recent flurry of tweets and bloggage on the relative merits of Flickr Commons and Wikimedia Commons brought out the value of the social aspect of Flickr.

Take your pick here, but Delicious is the big social bookmarking app, and as a user it’s more important to tag your bookmarks than pretty much anything else. This makes it a great source of user-generated data on bookmarked pages. Whether any museums mine the info on how their pages are bookmarked and tagged I know not, but if you’ve got time you can do it (here for example is a search for how people have bookmarked the MOL home page). If people bookmarked using a museum’s own app instead of Delicious (or Digg, Stumbleupon, whatever), what would that do for them or for the museum? Well, for users I think I’ll leave that until later, but for museums patterns of use and tagging would be a potential goldmine.

Museums exist in social contexts all around the web. Sometimes they put themselves there – on Facebook, through blogging, via Flickr – and other times they simply find themselves or their content there – mentioned on Twitter, faved in Google Reader or tagged in Delicious, written up in TripAdvisor or Wikipedia. Doubtless I have some of the details of my little environmental scan wrong but that’s not really too important: the point is the multitude of interactions and the fragmentation of the information – no, the relationships – that result. All that wealth of knowledge and opinion, all that social capital, spread all over the shop. It makes you wonder if there’s more we could be doing to marshal it for the good of all museums, because rich as this ecosystem is, it can be pretty hard to learn from it.

Museums and online logins: the preamble

I was recently asked for my thoughts on recommendation systems and cultural heritage organisations and it set off a train of thought that took me back to an old idea, one I'm sure I must have discussed with many people and which perhaps someone's working on right now.

The idea is universal login. OK "universal" might be a bit strong, so let's just start with UK-wide, museum website login. For whatever reason (possibly coz it's a rubbish idea) it's not happened yet and I want to think about why, and whether there's a way it might happen successfully. I thought I’d do a quick post on it but it needed some background and rapidly ballooned, with the result that I’ve split it into three posts (four if you count this one).

The first post looks briefly at what museums are doing with registration and login to their own site, as well as touching on some of the other places they encourage “identity-tied” user engagement without having to build their own mechanisms. Then I talk about some of the ways these fall short and speculate on why login is still pretty rare amongst museum sites.

The second post proposes “universal login” as a way to tackle one of the barriers to doing this: the effort involved.

The third post is really the first one I wanted to write, but that would have meant skipping straight to a “solution” without framing the “problem” or describing universal login, which is a necessary (but insufficient) intermediate step. It’s about the more important barrier: the lack of value in creating another login silo.

Finally, there's also a survey to find out what other museums are doing and what you think. If you're involved with a museum website I'd very much appreciate your input, which I'll wrap up into a further post in the future.
