About Me

Web person at the Imperial War Museum, just completed PhD about digital sustainability in museums (the original motivation for this blog was as my research diary). Posting occasionally, and usually museum tech stuff but prone to stray. I welcome comments if you want to take anything further. These are my opinions and should not be attributed to my employer or anyone else (unless they thought of them too). Twitter: @jottevanger

Saturday, March 08, 2008

Public API inputs

Public API inputs and outputs

[edited 17/3/2008]

At the Paris meeting we discussed the range of parameters that an API might
need to handle to perform the sort of (public-facing) tasks we envisaged. We
didn't actually talk about output, except in regard to the ability to specify
return fields, but I think that this is actually much the simpler part to work
out. I've reworked our discussion, added a few bits of my own (including the
UGC bit), and split it into sections relating to general parameters, filters
for collections queries, and UGC. No doubt lots more clarification and
revision are needed and I'm pretty unclear on some bits myself, but it's
something!

Input parameters

The “profile” includes various elements defining the operation in terms of
function, languages, values, format of returned data etc. Collections data
requests will be required for some functions, and consist of various filters.
The third table relates to operations on user generated content, including
adding, editing and getting (by user or group). We may decide that some
operations are only open to specific users or categories of user; for example,
accessing UGC of some categories might only be possible for the owner of that
UGC (via their associated API key) or the owner of the collections related to
that UGC. TBC!

Query profile (data access and data addition/editing functions)


| Parameter | Access or edit | Example values, notes |
| --- | --- | --- |
| Function [required] | | search, compare, translate, add, update |
| Return format | A, E | DC-XML, RSS, geoRSS, CDWALite, JSON, CSV. This might instead be implicit in the target URL. |
| Return fields | | Array of field names, but a default set would perhaps include GUID, title, thumbnail, short description, owner, owner type, media. Might also provide shortcuts to preset field groups. Will vary according to target entities |
| Search data | | Formal metadata; all data; expert and user tags; user tags only; “expert” tags only; specific user/expert/group tags. |
| Expanded terms | | True, false [use/don’t use thesauri etc.] |
| Requesting language | A, E | EN, FR etc. |
| Return language | A, E | As above. If only one is present presume the same. |
| Key [required] | A, E | API user key |
| User ID [required for some operations] | A, E | For the end user. Required for accessing/modifying data attached to specific users or groups. Presumably we need to authenticate and authorise in some way, too, for some operations. |
| | A, E | Perhaps multi-value, specifying rights/licensing parameters. Likely to be more complex than one field! |
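To make the profile a bit more concrete, here is a rough Python sketch of how a client might assemble some of these parameters into a query string. The wire format is entirely my assumption: the parameter names (`function`, `key`, `format`, `fields`, `lang_req`, `lang_ret`, `user`) and the comma-separated field list are hypothetical, and nothing like this was agreed in Paris. It does illustrate two rules from the table: function and key are required, and if only one language is present the other is presumed to be the same.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlencode

# Function values taken from the query profile table.
FUNCTIONS = {"search", "compare", "translate", "add", "update"}


@dataclass
class QueryProfile:
    """Hypothetical container for the query profile parameters."""
    function: str                                  # required
    key: str                                       # required: API user key
    return_format: Optional[str] = None            # e.g. "JSON"
    return_fields: List[str] = field(default_factory=list)
    requesting_language: Optional[str] = None      # e.g. "EN"
    return_language: Optional[str] = None
    user_id: Optional[str] = None                  # needed for some operations

    def to_query_string(self) -> str:
        if self.function not in FUNCTIONS:
            raise ValueError(f"unknown function: {self.function!r}")
        if not self.key:
            raise ValueError("an API key is required")
        # If only one language is present, presume the other is the same.
        req = self.requesting_language or self.return_language
        ret = self.return_language or self.requesting_language
        params = {
            "function": self.function,
            "key": self.key,
            "format": self.return_format,
            "fields": ",".join(self.return_fields) or None,
            "lang_req": req,
            "lang_ret": ret,
            "user": self.user_id,
        }
        # Leave out anything the caller did not supply.
        return urlencode({k: v for k, v in params.items() if v is not None})


profile = QueryProfile(function="search", key="demo-key",
                       return_format="JSON", return_fields=["GUID", "title"],
                       requesting_language="EN")
print(profile.to_query_string())
```

If the return format ends up being implicit in the target URL, the `format` parameter would simply drop out of this.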

Collection data filters (access only)


| Filter | Example values, notes |
| --- | --- |
| Target entities | Objects, people, places, subjects [if we are enabling anything more than objects] |
| | A unique identifier given to every record in Europeana |
| Set ID | ID for a set of entities, which may require the appropriate key, depending upon privacy settings for that set. |
| | Tricycle, ww2, treaty, Anne Briggs, documentary [multi-field search]. |
| Structured data: name | Name of object, person or place. If these use different fields, then the right one should be inferred from the target entity. Examples: photograph; sunflowers; Forlì (or Forli); Max Brod (or M Brod); Rockall |
| Structured data: date [point, range, older, younger] | 14 July 1792; 19th century. “Older than 1850” might be expressed as: “- 20000000 – 1850”; “Younger than” as: “1850-2050”; uncertainty like “1850 +- 5” as a range: “1845-1855”, though this isn’t perfect |
| Structured data: related person | For returning objects, people and places |
| Structured data: related place | For returning objects, people and places. See also “geographical” below. |
| Original language | Of object, principally, for documents (if this data is well expressed) |
| Originating institution | Good structured data, ideally (we may require an ID), but we could permit a string search across the relevant field. |
| Originating institution country | For searching by current location |
| Originating institution type | Museum, library, archive, A/V archive |
| | Including sub-parameters for grid reference, coordinates, place name, and the location of concern (e.g. place of creation, place of publication, location of subject matter) |
| | Keyword occurrence, date precision, location, location relative to user, institution type – perhaps sorting partly inferred from the fields used in search, but if these are mixed e.g. date and place plus keyword, need to sort on one before the other. |
| | text, audio, image, video or more specifically PDF, WAV, MPEG etc. |
| Format [item type] | Map, book, video perhaps. Is this data held in a structured way, and is it distinct from the media metadata? |
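The date filter is the fiddliest of these, so here is a small Python sketch of the idea of reducing "older", "younger" and uncertain dates to a plain year range. The function names and the `kind` argument are hypothetical, and the two boundary years are just the placeholder values from the examples in the table. An explicit range needs no conversion, so it is left out.

```python
from typing import Tuple

# Boundary years borrowed from the table's examples; both are placeholders.
EARLIEST_YEAR = -20_000_000
LATEST_YEAR = 2050


def date_filter_to_range(kind: str, year: int, tolerance: int = 0) -> Tuple[int, int]:
    """Turn a date filter into an inclusive (start_year, end_year) pair."""
    if kind == "older":
        return (EARLIEST_YEAR, year)
    if kind == "younger":
        return (year, LATEST_YEAR)
    if kind == "point":
        # Uncertainty such as "1850 +- 5" becomes the range 1845-1855.
        return (year - tolerance, year + tolerance)
    raise ValueError(f"unknown date filter kind: {kind!r}")


def century_to_range(century: int) -> Tuple[int, int]:
    """Treat e.g. the 19th century as 1801-1900 (one common convention)."""
    return ((century - 1) * 100 + 1, century * 100)


print(date_filter_to_range("older", 1850))
print(date_filter_to_range("point", 1850, tolerance=5))
print(century_to_range(19))
```

As noted in the table, collapsing uncertainty into a hard range isn't perfect: a record dated 1856 drops out entirely rather than just ranking lower.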

UGC operations (add, edit, view)

These operations will need user ID (or group ID) plus authentication and authorisation for certain operations (but not for viewing public data).



| Parameter | Access or edit | Example values, notes |
| --- | --- | --- |
| | A, E | For modifying tag i.e. deleting, or viewing associated items |
| | A, E | For modifying or viewing |
| UGC contributor | A, E | Perhaps multiple values, including groups, so we can look for stuff with a given tag but only when tagged by a certain set of UGC contributors |
| UGC contributor type | A, E | Content contributor vs. other user |
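The access rules for UGC are still TBC, but one possible reading can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `UgcItem` class, the `can_perform` function and the specific rule that only the contributor may edit their own UGC are all hypothetical. What it takes from the notes above is that viewing public data needs no user ID or authentication, and that non-public UGC might only be visible to its owner or to the owner of the related collections. Adding new UGC is left out, since there is no existing item to check against.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UgcItem:
    """Hypothetical record for a piece of user generated content."""
    owner_id: str             # the UGC contributor
    collection_owner_id: str  # owner of the collection item the UGC relates to
    public: bool = True


def can_perform(operation: str, item: UgcItem,
                user_id: Optional[str] = None,
                authenticated: bool = False) -> bool:
    """Decide whether an operation on an existing UGC item is allowed."""
    if operation not in ("view", "edit"):
        raise ValueError(f"unknown operation: {operation!r}")
    if operation == "view" and item.public:
        return True  # viewing public data needs no user ID or authentication
    if user_id is None or not authenticated:
        return False  # everything else needs an identified, authenticated user
    if operation == "view":
        # Non-public UGC: only its owner or the related collection's owner.
        return user_id in (item.owner_id, item.collection_owner_id)
    # Assumption: only the contributor may edit (or delete) their own UGC.
    return user_id == item.owner_id


private_tag = UgcItem(owner_id="user-1", collection_owner_id="museum-9", public=False)
print(can_perform("view", private_tag))
print(can_perform("view", private_tag, user_id="museum-9", authenticated=True))
print(can_perform("edit", private_tag, user_id="museum-9", authenticated=True))
```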
