About Me

Web person at the Imperial War Museum, just completed PhD about digital sustainability in museums (the original motivation for this blog was as my research diary). Posting occasionally, and usually museum tech stuff but prone to stray. I welcome comments if you want to take anything further. These are my opinions and should not be attributed to my employer or anyone else (unless they thought of them too). Twitter: @jottevanger
Showing posts with label mcg. Show all posts

Wednesday, March 04, 2009

Uncontroversial title

Well it's all going off on Mike's blog, and previously on the MCG list, plus other (much more) considered opinions on Tom's blog and hopefully elsewhere. The subject matter: Creative Spaces, at last. "At last" because it's not been discussed properly by our community since its launch. Actually that's not quite fair: Tom's first post went up yesterday, and it's a thorough review supplemented today, so good on him. I'd planned to write something but haven't (still) checked it out properly.
Frankie kicked things off today, and after a bumpy start the discussion on those various venues has been fruitful I think. My thoughts are scattered around the place but, as I say, precede any real knowledge. Basically I think it's an idea with plenty going for it: a necessary experiment that we'll all learn from and which, at the very least, will hopefully leave a legacy of some infrastructure (technical and organisational/political) for the 9 partners. It may be that there's a lot more than that, and it will make a very interesting case study. In the meantime, genuine congratulations to Carolyn Royston and her team for battling through all the challenges and getting this far. It's humbling, when you produce the sort of modest stuff I do, to see what's possible (if you have some resources).

Sunday, May 04, 2008

The MCG thread: 21st Century digital curation

Bridget McKenzie has blogged and kicked off a great discussion on the MCG list following a seminar last week with Carole Souter of HLF and Roy Clare of MLA. I've written a reply but as usual I do go on a bit so I'm sending a brief version there and putting more depth here.




There are so many strands to this exchange that I want to jabber about so I’m going back to the start – Bridget’s post. Bridget, if I get your drift, yes, I agree: there is a balance to be struck between the effort put into basic “digitisation” (though I think there are various ideas circulating about what that term implies) and interpreting our digitised collections, and I’m not sure we’re being helped to strike the right balance at present. Raves and gripes about past projects aside, it’s how we spend the scant funds now available that bothers me. Going from a time of relative plenty to a time where most budgets are Spartan rather than Olympian, how do we plan clearly how we spend them?

I’m torn. On the one hand, I get the sentiment which says, we need to be making things that people will enjoy, with some sort of mediation and aimed at well-defined audiences, just like any exhibition or public event we host. I accept that HLF amongst others want their funds to be used on things that we don’t automatically do, things that aren’t our “core activities”. And I take Dylan’s point that, when resources are scarce (which is always), demonstrating (actually, having) impact is really important. But….

On the other, I would suggest that digitisation being referred to as a core activity betrays the fact that it is indeed now something we have to do. The trouble is, it may be a part of our core work, but it has never been core funded (at least not in terms of funds additional to the pre-digitisation days). So it’s a cop-out to say “it’s really important, and therefore we won’t pay for it”. Who will, then? Until our core funders, whoever they might be, come up with the cash to do this newly core activity on an ongoing rather than project basis, we’ll have to act as though it’s not “core” and go begging precisely because it’s so important. But apparently HLF think it’s important enough to not fund it too, so we’re stuffed. I’m glad to hear that in Birmingham, at least, digitisation is seen as something to drive with internal funds; I guess HLF are hoping more places will go that way.

The important thing that NOF Digi did (along with other HLF and DCMS funded projects) was get a load of collections records in some sort of order and snap some nice shots of the objects (which seems to be commonly accepted as equating to the “digitisation” of a typical museum/gallery item – making a good record and a decent photo). All the other stuff isn’t flim-flam – for those many users that had a great experience of Port Cities and other such projects, that experience was in no small part due to the contextualisation and linking built on top of the vital digitisation effort. I’m sure that BMAG’s Pre-Raphaelite website will be great, but as Rachel says, the payoff is much more than whatever users it attracts. Like an exhibition, the mediated experience will be more transient than the collections (physical or surrogate) on which it is built. We can’t equate web-ready records and surrogates with physical collections, but nevertheless they are the bricks and mortar from which our public-facing offering is built, and they will last longer than the wallpaper, lighting, video games and Persian rugs with which we make it “engaging”. Besides, sometimes all that stuff ends up seeming really forced just so we could secure the funding. Kind of like the way I’m now listening to Paul’s Boutique and hoping to find a clever excuse to quote some lyrics in support of my argument!

Unsurprisingly, then, I come out on the side of those like Bridget and Mike (E, not D) who argue that investing in the fundamentals in such a way that they can be built upon in future is the way to go. To me this means, basically, getting those records and surrogates done irrespective of anticipated clicks: if the object is worth accessioning, it’s worth recording properly. Getting that content into some public-facing form is, frankly, less vital, and we need to be considering intelligent ways of doing it. Building stuff in such a way that others can do good things with it is a step in that direction, which is why I’ve been pinning my hopes on EDL doing the Right Thing with an API. If this works the right way, any size of museum could contribute content and use Europeana’s centralised brain-power to do all the hard work. Once the basics are done, the mediating content can be tied in to it. But “digitisation” is the essential part. Like Mike, I’ve told both EDL and NCO (via Bridget) that they could reasonably drop a user interface altogether if they provide for other people to programme against an aggregated collection, but conversely a public UI without an API is pointless. I say “Rapunzel, Rapunzel, let down your hair” [see side 2 for details]. And Nick, the way you describe the vision for IA is very encouraging for this precise reason. Wish we’d had the chance to talk about it t’other night!

Stephen Lowy also raised the term sustainability. What he described is better termed preservation to my mind (although successful preservation needs sustaining…), but sustainability is at the core of this, and for this we need to make a clear distinction both conceptually and architecturally between the layers of the “digitisation” – the records and surrogates ([collections] data layer), access to them (the layer of functionality), contextualisation (a mediating layer of more content), showing them (a UI layer), perhaps including engagement with the resource in Web 2-ish ways (an additional set of social layers?). We shouldn’t be required to provide all of these parts; it’s ridiculous, even with multi-institution projects. Of course we often want to anyway, but for major funders to say, “we are only interested in helping you with the basics if you’ll build all the other layers too” is dumb.

I should add that I do actually believe that interpreting and being creative with our collections (and in fact going beyond them) is also a core activity of museums, and this obviously carries into the digital realm. To take a current famous example, Launchball, which I had a riot playing with my 5 and 7 year old yesterday: this is exactly what museums are for, but it could have been (relatively) rubbish if it had been compromised in the way Mike describes, perhaps tacked onto a digitisation initiative for the sake of funds and shorn of its purity. It has all it needs: food, sickles and girls.

Mike again: “cash for sustainability is either not considered or frowned upon by funders who simply don't recognise that this is an absolute requirement in any successful (web) project.”, and Tehmina also wonders what measures are in place to avoid a repeat of situations where there is no planning for maintenance and continued content development. They’re right. At the start of each project we must be stating (a) what’s important about it, (b) how long we want the important aspects to last, (c) what strategies will be built in to assist this, and (d) what other potential sources of value the resource offers. This way we can build it appropriately (funds permitting) to ensure that we can indeed continue to realise value from the thing for the proposed lifespan. And once that period is over, we should also be in a better position to re-examine the resource, decide what’s still got potential to advance the organisation’s purpose, and maybe squeeze more value from it. And if we take the right approach to architecture, with conceptual and technical divisions between the layers, then if we’ve decided that one part is for a couple of years and another is “forever” we’ll be able to put our efforts where they matter.

Bridget asked: “what such lead bodies [HLF and MLA] should be doing to invest in 21st century digital curation?” Basically, I’d say, they should put their funds in three areas and realise that they need to be seen as separate endeavours:
  1. invest properly in strategic, sector-wide initiatives like the Information Architecture, that one would hope will do the plumbing job we need, and feed into EDL (and beyond?). Fingers crossed for this one.
  2. support simple digitisation to create the straight-ahead content to go into EDL and/or IA. It’s still got to be done. If it’s not supported with funds then MLA must ensure that digitisation is recognised by those providing the core funding as a core activity, and is adequately provided for on an ongoing basis. Not too optimistic.
  3. yes, still fund us to build some imaginative and innovative, born-to-die experimental exciting digital stuff aimed directly at the public. Who knows? Maybe.
  4. and funds need to have the right strings attached. Maybe this is sometimes related to “impact”; it should also be about identifying the “sources of value” in a resource, budgeting realistically for supporting them for a specific period, and planning for the end of its life.

Can you tell it’s a holiday weekend and I’m the only one at home?

PS. About those sickles: sorry, the Mummies also made it onto the turntable.

Wednesday, February 27, 2008

The EDL API debate - Museum Computer Group thread

Recently I kicked off a debate on the MCG mailing list (archive here, check out the February 2008 threads "APIs and EDL" and "API use-cases"). It was really productive. Inevitably the debate strayed beyond the strict bounds of considering the relevance of an API to EDL, and the functionality that it might include. Quite a bit of scepticism was heard concerning the whole project, and there was much debate around the barriers to participation, especially the generation of content and its publication through OAI gateways. In preparation for the WP3 meeting next week, I spent some time yesterday collating and summarising the discussion, which I'm posting below. I did receive some off-list responses and queries, which are not included here.
For those in a hurry, the quick summary of recommendations for an API is this:
  • be “'open', feature-rich and based on established and agreed metadata models/standards/schemas that allow multiple sources and minimise data loss.”
  • feature most of the functionality that can be accessed from the back-end
  • include terms and conditions that specifically require that UGC be flexible enough to allow any reuse with attribution
  • include a key to enable differentiated access to services for different types of users
  • enable the addition of “crowd-sourced” user-generated metadata
  • be lightweight, using REST, XML and possibly RSS and JSON

I'm still extremely interested in any more opinions on the whys and hows of an API for EDL (or even, more generally, for any digital resource built for a museum) so please do comment or e-mail me if you have anything to add.

************************************
Summary of MCG EDL/API thread
Contributors
Jeremy Ottevanger, web developer, Museum of London
Tehmina Goskar
David Dawson, Senior Policy Adviser (Digital Futures), MLA
Mike Ellis, Solutions Architect, Eduserv
Martyn Farrows, Director, Lexara Ltd
Dr John Faithfull, Hunterian Museum, University of Glasgow
Sebastian Chan, Manager, Web Services, Powerhouse Museum
Nick Poole, Chief Executive, MDA
Terry Makewell, Technical Manager, National Museums Online Learning Project
Robert Bud, Science Museum
Matthew Cock, Head of Web, The British Museum
Douglas Tudhope, Professor, Faculty of Advanced Technology, University of Glamorgan
Kate Fernie, MLA
Trevor Reynolds, Collections Registrar, English Heritage
Dylan Edgar, London Hub ICT Development Officer
Joe Cutting, consultant (ex-NMSI)
Richard Light, SGML/XML & Museum Information Consultancy (DCMI & SPECTRUM contributor, developer of MODES)
Ian Rowson, General Manager, ADLIB Information Systems
Graham Turnbull, Head of Education & Editorial, Scran
Frankie Roberto, Science Museum, London

Overview
The discussion kicked off with an introduction to EDL from JO, and a request for responses to the idea of an API for it, specifically:

  • whether and why an API would be useful to them, or influence their decision on whether to contribute content to EDL
  • what features might prove useful
  • any examples of APIs or of their application that they think provide a model for what EDL's API could offer or enable

A second e-mail followed, offering some possible use cases for museums, libraries and archives; for strategic bodies; and for third parties.
Responses fell into three main (interconnected) strands:

  • attempting to understand the role and purpose of EDL itself, and debating the value of participation
  • problems relating to the practicalities of cataloguing and digitisation of collections, and the publication/aggregation of the data
  • the API question

As well as providing useful ideas in respect of an API, the discussion made it clear that in the UK at least there is a need for some public relations work to be done to make the case for EDL, to explain its use for museums and to demonstrate that it will be doing something genuinely new and valuable. Barriers need to be as low as possible, and payoffs immediate and demonstrable. An alternative route to ensuring that there are contributors is coercion, so that funding is dependent upon participation, or a backdoor route wherein content aggregated for other purposes is submitted by aggregators, but ensuring institutional buy-in will be the best route to success and garner the most support. As Nick Poole (NP) himself stated:



The real question, to my mind, is whether museums perceive enough value in participating in something like the EDL to be worth the time it takes to get involved. People have been burned in the past by services such as Cornucopia which have tended to be relatively resource-intensive, but with little direct payoff for individual museums - I'm not surprised people are sceptical.




EDL


Questions included how EDL would fit in with existing EU and UK projects such as MICHAEL, Cornucopia, and the People’s Network Discover Service. David Dawson (DD) offered a detailed overview of its position in this network.

Cataloguing and other barriers

As John Faithfull (JF) expressed it:

I think that the current lack of killer "one stop" apps in the museum sector is not so much due to lack of projects, technologies, or even standards, but lack of available basic collection content for them to work with.

While supportive of APIs, he felt that it was the lack of online collection data that was the main problem. Infrastructural problems, such as access to a web server to enable automatic content harvesting in a sustainable fashion, were a big challenge. Nevertheless, he suggested that “the amount publicly available online is bizarre, bewildering and indefensible, given how technically simple the basic task has been for a long time.” Getting even flawed records out there is great for users (a point supported by Matthew Cock). Robert Bud raised some objections to this, if it just added “noise” and confusion to the internet.

NP also argued that shiny front ends tended to get financial priority over sorting out the data, but that we should get on and make the best of what we have (EDL being one means). He also felt that curators often put up resistance to getting their data online.

DD explained the planned architecture for content aggregation, which led to a discussion of software capable of acting as an OAI gateway, Trevor Reynolds pointing out that implementing an OAI gateway is not necessarily that simple. Richard Light (RL), Graham Turnbull, Ian Rowson and DD pointed to various products that do or might offer OAI servers (Modes, Scran-in-a-box, Adlib, MimsyXG and possibly others). NP indicated, too, that the solution should not be oriented at one service (EDL) or one protocol, but should be multilingual, and “the burden of responsibility has to be shifted onto the services themselves to ensure that they capture and preserve as much of the value in the underlying datasets as possible.”
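For anyone wondering what an OAI gateway actually has to answer, here is a minimal sketch of the harvester's side of an OAI-PMH exchange. It is only an illustration, not EDL's or anyone's real implementation: the repository address, the record identifier and the sample response are all made up, and only the protocol arguments (verb, metadataPrefix, resumptionToken) and the namespaces come from the OAI-PMH and Dublin Core specifications.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH repository."""
    if resumption_token:
        # When continuing a harvest, the token is sent instead of metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)

def parse_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", default="", namespaces=NS),
            "title": rec.findtext(".//dc:title", default="", namespaces=NS),
        })
    token = root.findtext(".//oai:resumptionToken", default="", namespaces=NS)
    return records, (token or None)

# Invented sample response from a hypothetical museum repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:museum.example.org:1234</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Roman oil lamp</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://museum.example.org/oai"))
    print(parse_records(SAMPLE_RESPONSE))
```

The harvesting side is the easy bit, of course. Trevor's point stands: the work is in running a server that produces responses like the sample from a live collections database.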

Dylan Edgar pointed to the need to measure or demonstrate impact, if only in order to get funding, whilst DD reminded us that Renaissance and Designation funding, at least, came with a requirement to make metadata available to the PNDS.

An API for EDL

Mike Ellis (ME) argued that:

The notion of an API in *any* content-rich application should be moving not only in our sphere of knowledge ("I know what an API is") but *fast* into our sphere of requirement ("give me an API or I won't play")…
…EDL should have a feature-rich API. A good rule of thumb for this functionality is to ask: "how much of what can be done by back-end and developer built web systems can be done and accessed via the API?" In an ideal world it'd be 100%. If it's 0 then run away, fast!

Applications must give us “easy, programmatic access into our data”.

Lexara’s Martyn Farrows made the case, from experience in the commercial software sector, that any API should be “'open', feature-rich and based on established and agreed metadata models/standards/schemas that allow multiple sources and minimise data loss.”

Sebastian Chan suggested that APIs may be “a *practical* alternative to the never ending (dis)agreement on 'standards'.” He suggested an API key to manage security levels and access to different services for various types of users. With regard to user generated content:

it would be prudent to have a T&C that specifically requires that UGC be flexible enough to allow any reuse with attribution. (A CC with attribution license may be a good option).
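To make the API key idea a bit more concrete, here is a rough sketch of how a key might map to a tier of user, and a tier to a set of services, with user-generated tags stored under an attribution licence. Everything in it is hypothetical (the key strings, the tier names, the service names and the add_tag function); it is my own illustration of the suggestion, not something proposed on the list or planned for EDL.

```python
# Hypothetical key registry; a real service would keep this in a database.
API_KEYS = {
    "key-public-001": "public",
    "key-museum-042": "contributor",
}

# Hypothetical services that each tier of user may call.
TIER_SERVICES = {
    "anonymous": {"search"},
    "public": {"search", "get_record"},
    "contributor": {"search", "get_record", "add_tags"},
}

# In-memory store for user-generated tags (illustration only).
user_tags = []

def tier_for(api_key):
    """Unknown or missing keys fall back to the anonymous tier."""
    return API_KEYS.get(api_key, "anonymous")

def is_allowed(api_key, service):
    """Check whether the holder of this key may call the named service."""
    return service in TIER_SERVICES[tier_for(api_key)]

def add_tag(api_key, record_id, tag):
    """Accept a user-generated tag and record the reuse licence with it."""
    if not is_allowed(api_key, "add_tags"):
        raise PermissionError("this key may not add tags")
    entry = {
        "record": record_id,
        "tag": tag,
        "tier": tier_for(api_key),
        "licence": "CC BY",  # assumed here: reuse with attribution
    }
    user_tags.append(entry)
    return entry

if __name__ == "__main__":
    print(is_allowed(None, "search"))                # anonymous users can search
    print(is_allowed("key-public-001", "add_tags"))  # public keys cannot tag
    print(add_tag("key-museum-042", "1234", "lamp"))
```

Storing the licence alongside each contribution, rather than only in the T&C, means that anything a third party pulls back out of the API carries its reuse terms with it.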

NP pointed out that in the cultural heritage sector the APIs of recent years have generally been one way i.e. enabling content aggregation. There is a need for evidence of the value that this returns to the content provider, in exchange for the cost of participation. He suggested that opening up the content to third parties is no different: the value is not gained directly by the content provider, and the cost of providing something adequate to all uses is probably too high. He wondered if therefore an API might be inbound as well as outbound, to allow “crowd-sourcing” of value-adding metadata creation.

JF was sceptical of the idea of working with an application housing his institution’s data, at least if this meant another obligation (providing the data):

We need stuff that makes everything easier/cheaper/faster/better rather than having extra things to do, at extra cost.

He pointed out that the Hunterian can already do all that they wish with their own data, and doubted that any central initiative could offer much to help them add to their capacity.

Joe Cutting (JC) suggested as his main use-case the creation of exhibition displays and interactives. He indicated the problems such applications can have, such as copyright, data integrity, completeness and validity, and service level. His recommendations could well inform an API for EDL.

In terms of technology, ME argued “lightweight every step of the way”, meaning widespread and simple technology. His preferences were REST and XML (perhaps RSS too), rather than SOAP or JSON, and JC backed him up. RL added the proviso that the XML should be in a community-agreed application (for example the SPECTRUM interchange format). Frankie Roberto argued for both XML and JSON, since the latter has advantages for data exchange and overcoming cross-site security issues with JavaScript.
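As a small illustration of the XML-and-JSON position, the sketch below serialises one record both ways and also wraps the JSON in a callback. The record fields, the element names and the callback name are invented for the example (a real service would use an agreed schema such as the one RL mentions). Treating the "cross-site" advantage as the JSONP pattern is my assumption about what was meant.

```python
import json
import xml.etree.ElementTree as ET

# Invented record with hypothetical field names.
record = {"id": "1234", "title": "Roman oil lamp", "institution": "Example Museum"}

def to_xml(rec):
    """Serialise the record as a small XML document."""
    root = ET.Element("record", id=rec["id"])
    for field in ("title", "institution"):
        ET.SubElement(root, field).text = rec[field]
    return ET.tostring(root, encoding="unicode")

def to_json(rec):
    """Serialise the same record as JSON."""
    return json.dumps(rec, sort_keys=True)

def to_jsonp(rec, callback):
    """Wrap the JSON in a function call so a script tag on another site can load it."""
    if not callback.isidentifier():
        # Refuse anything that is not a plain function name.
        raise ValueError("callback must be a simple identifier")
    return f"{callback}({to_json(rec)});"

if __name__ == "__main__":
    print(to_xml(record))
    print(to_json(record))
    print(to_jsonp(record, "showRecord"))
```

Offering both formats from the same underlying record costs very little once the data layer is in order, which is rather the point of the whole thread.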