Wednesday, November 02, 2011

New IWM websites pt.II: Collections and licensing

This is the second post about our "big bang" and the things we launched on October 4th. In the first post I talked about the new brand, e-commerce, and hosting. Here I'll talk a bit about collections and the closely related issue of licensing.

The collections
So, now we’re getting more into the area I can talk about more knowledgeably. The opportunity to help reimagine how IWM’s collections are brought to the public through digital media was one of the key attractions that brought me here 18 months ago (though there were several other extremely compelling reasons. I think some of them are still in post). I felt I could bring something useful from my experiences at the Museum of London, having been part of the team that brought an ambitious new system there just before I left.
I hope I’ll be able to write it all up properly soon, but I’ll keep it brief here. The collections online project at IWM had two objectives: firstly to build the foundational infrastructure for all future data-driven collections applications; and secondly to build the first public interface onto that infrastructure with the collections search pages in the new website. Only the bare essentials of the infrastructure were to be built in this phase: purely what was necessary to deliver to the web application and to provide enough of an architecture for us to plug in the planned extra features later on. We have plenty in line for phase 2, but we’ll tie all that stuff in with specific front-end requirements.
Simon Chambers came on board with us to project manage this one and he did an incredible job of marshalling the requirements, prioritising them, working with a number of departments and strong-willed people and getting things as quickly as possible to the point where we could deliver the baseline of what the website needed. We ultimately decided to work with Knowledge Integration, who built the CIIM for us at MoL and who could bring us an existing application that fitted our needs very well.
Essentially the CIIM, at least the part we’ve implemented so far, pulls data from the collections management system (in our case Adlib) and remodels it to serve the needs of discovery and delivery as opposed to data management. These are very different things, and the fact that a CollMS may do the latter very well doesn’t make it ideal for the former. This is as much about how the database is used across the organisation’s varied collections as it is about the technical qualities of, say, Adlib specifically. The architecture allows us to intervene in the data between its source and the front-end applications that use it: to remodel and align it, to integrate it with other data sources or enrich it, to prepare associated media, and to optimise it for full-text searching, for instance. The big job since February, when things kicked off in earnest, has been modelling the data correctly. I have to admit I seriously underestimated the complexity of getting this right, and we had a series of problems to do with the API it was extracting from and the readiness of some of the data, but in a way this illustrates why it’s a good thing to be able to do all this work away from the front-end.
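To make that remodelling step a little more concrete, here's a minimal sketch in Python. It is not the CIIM's code and it doesn't reflect Adlib's real schema: every field name (object_number, makers, media, storage_location and so on) is made up for illustration. The idea is simply that a nested, management-oriented record goes in and a flat, search-oriented document comes out.

```python
from typing import Any


def remodel_record(source: dict[str, Any]) -> dict[str, Any]:
    """Flatten a management-style record into a discovery-style document.

    All field names are hypothetical; they are not Adlib's or the CIIM's.
    """
    # Keep only makers that actually have a name recorded.
    makers = [m["name"] for m in source.get("makers", []) if m.get("name")]

    # Only expose media that has been cleared for public use.
    media_urls = [f["url"] for f in source.get("media", []) if f.get("public")]

    title = (source.get("title") or "").strip()
    description = (source.get("description") or "").strip()

    # One combined field keeps full-text searching simple.
    full_text = " ".join(p for p in [title, description, *makers] if p)

    # Internal fields (e.g. storage location) are deliberately not copied.
    return {
        "id": source["object_number"],
        "title": title,
        "makers": makers,
        "media_urls": media_urls,
        "has_media": bool(media_urls),
        "full_text": full_text,
    }
```

The point of doing this in a middle layer is that the source record stays as the collections team needs it, while the front end only ever sees the tidied-up version.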
The result is a first pass at a Solr index that, along with 3 others, lies at the heart of discovery on our new website. Try out the search engine here, or here are some good searches to get you going. Watch a video. Listen to interviews or momentous radio broadcasts (Czech alert). Oh, and if you find an object like this you'll see that it's part of a collection, and can see a complete listing of that collection. Our priorities mean that we’ve deferred implementation of some of the features we want in the longer term but we know we can leap into action soon. In fact, Tom Grinsted, our multimedia manager, has put together a project with UCL and K-Int that gives us a focus for some of this functionality and is just getting underway now, which is rather exciting. Luke Smith and Giv Parvaneh are also busy planning various projects for the next few years as part of the centenary of the First World War that will also draw on and feed into the system. So watch this space.
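For anyone curious what talking to a Solr index like the one mentioned above looks like, here's a rough sketch that builds a search URL using Solr's standard query parameters (q, fq, rows, wt). The host, the core name "collections" and the media_type field are assumptions of mine for illustration, not details of our actual setup.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical local Solr core; not the real IWM endpoint.
SOLR_SELECT = "http://localhost:8983/solr/collections/select"


def build_search_url(text: str, media_type: Optional[str] = None, rows: int = 10) -> str:
    """Build a Solr select URL for a free-text search.

    The media_type filter field is an assumed name, used only as an example.
    """
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    if media_type:
        # fq narrows the result set without affecting relevance scoring.
        params.append(("fq", f"media_type:{media_type}"))
    return f"{SOLR_SELECT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("spitfire", media_type="film", rows=5))
```

Putting the media type in a filter query rather than in the main query is a common Solr pattern, since filters can be cached and reused across searches.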

All that e-commerce work around collections has also meant reviewing the way we license our material. Recent developments at national and European level – notably the creation of the Open Government Licence by The National Archives (TNA) – and the steps that some of our peers have made in offering their assets for creative reuse to the benefit of all, have also had an impact. IWM has now launched its User Licence (essentially the OGL), which frees up almost 200,000 images, audio recordings and films for non-commercial use. Regular "fair dealing" restrictions apply to others, like this nice Ronald Searle picture. I'm afraid we've not yet got a filter in collections search for items with this licence, but try right-clicking on an image on an item page to see if you can download or embed it.
The licence applies to IWM-generated content and data too, so although we don’t yet have a public API to our collections the data around them is up for grabs. Hopefully I'll have more to report on this before too long.

In the next post I'll talk through the website itself. Stay awake at the back!

