Public API inputs and outputs

[edited 17/3/2008]

At the Paris meeting we discussed the range of parameters that an API might
need to handle in order to perform the sort of (public-facing) tasks we
envisaged. We didn't actually talk about output, except with regard to the
ability to specify return fields, but I think that this is much the simpler
part to work out. I've reworked our discussion, added a few bits of my own
(including the UGC bit), and split it into sections relating to general
parameters, filters for collections queries, and UGC. No doubt lots more
clarification and revision are needed, and I'm pretty unclear on some bits
myself, but it's something!

Input parameters

The inputs fall into three tables. The “profile” includes various elements
defining the operation in terms of function, languages, values and format of
returned data etc. Collections data requests will be required for some
functions, and consist of various filters. The third table relates to
operations on user-generated content (UGC), including adding, editing and
getting (by user or group). We may decide that some operations are only open
to specific users or categories of user; for example, accessing UGC of some
categories might only be possible for the owner of that UGC (via their
associated API key) or the owner of the collections related to that UGC. TBC!

Query profile (data access and data addition/editing functions)

*Parameter*	*Access or edit*	*Example values, notes*
Function [required]	A	search, compare, translate, add, update
Return format [required]	A, E	DC-XML, RSS, geoRSS, CDWALite, JSON, CSV. This might instead be implicit in the target URL.
Return fields	A	Array of field names, but a default set would perhaps include GUID, title, thumbnail, short description, owner, owner type, media. Might also provide shortcuts to preset field groups. Will vary according to target entities
Search data	A	Formal metadata; all data; expert and user tags; user tags only; “expert” tags only; specific user/expert/group tags.
Expanded terms	A	True, false [use/don’t use thesauri etc.]
Requesting language	A, E	EN, FR etc.
Return language	A, E	As above. If only one is present presume the same.
Key [required]	A, E	API user key
User ID [required for some operations]	A, E	For the end user. Required for accessing/modifying data attached to specific users or groups. Presumably we need to authenticate and authorise in some way, too, for some operations.
Rights/licence	A, E	Perhaps multi-value, specifying rights/licensing parameters. Likely to be more complex than one field!
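
To make the profile table a bit more concrete, here is a minimal sketch in Python of how a client might assemble a request from these parameters. Everything named here is an assumption for illustration only: the parameter names (function, format, key, request_lang, return_lang, fields), the build_query helper and the example.org URL are hypothetical and not part of any agreed API. The sketch checks the three parameters marked [required] for every call and applies the "if only one language is present presume the same" rule.

```python
from urllib.parse import urlencode

# Hypothetical parameter names; the table above only names the concepts.
REQUIRED = ("function", "format", "key")


def build_query(base_url, **params):
    """Return a request URL, checking the parameters marked [required]."""
    # Drop parameters that were passed explicitly as None.
    params = {name: value for name, value in params.items() if value is not None}

    missing = [name for name in REQUIRED if not params.get(name)]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))

    # If only one of the two language parameters is given, presume the same
    # value for the other one.
    lang = params.get("request_lang") or params.get("return_lang")
    if lang:
        params.setdefault("request_lang", lang)
        params.setdefault("return_lang", lang)

    # Return fields arrive as a list and are sent as one comma-separated value
    # (an assumed encoding; the table only says "array of field names").
    fields = params.get("fields")
    if isinstance(fields, (list, tuple)):
        params["fields"] = ",".join(fields)

    # Sort the parameters so the same request always yields the same URL.
    return base_url + "?" + urlencode(sorted(params.items()))


url = build_query(
    "https://api.example.org/query",  # placeholder endpoint
    function="search",
    format="JSON",
    key="abc123",
    return_lang="FR",
    fields=["GUID", "title", "thumbnail"],
)
```

If the return format ends up being implicit in the target URL instead, format would simply drop out of the required list.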

Collection data filters (access only)

*Parameter*	*Example values, notes*
Target entities	Objects, people, places, subjects [if we are enabling anything more than objects]
GUID	A unique identifier given to every record in Europeana
Set ID	ID for a set of entities, which may require the appropriate key, depending upon privacy settings for that set.
Keyword	Tricycle, ww2, treaty, Anne Briggs, documentary [multi-field search].
Structured data: name	Name of object, person or place. If these use different fields, then the right one should be inferred from the target entity. Examples: photograph; sunflowers; Forlì (or Forli); Max Brod (or M Brod); Rockall
Structured data: date [point, range, older, younger]	14 July 1792; 19th century. “Older than 1850” might be expressed as “-20000000 – 1850”, “younger than 1850” as “1850 – 2050”, and uncertainty like “1850 +- 5” as the range “1845 – 1855”, though this isn’t perfect.
Structured data: related person	For returning objects, people and places
Structured data: related place	For returning objects, people and places. See also “geographical” below.
Subject
Original language	Of object, principally, for documents (if this data is well expressed)
Originating institution	Good structured data, ideally (we may require an ID), but we could permit a string search across the relevant field.
Originating institution country	For searching by current location
Originating institution type	Museum, library, archive, A/V archive
Location	Including sub-parameters for grid reference, coordinates, place name, and the location of concern (e.g. place of creation, place of publication, location of subject matter)
Sorting	Keyword occurrence, date precision, location, location relative to user, institution type – perhaps sorting partly inferred from the fields used in search, but if these are mixed e.g. date and place plus keyword, need to sort on one before the other.
Media	text, audio, image, video or more specifically PDF, WAV, MPEG etc.
Format [item type]	Map, book, video perhaps. Is this data held in a structured way, and is it distinct from the media metadata?
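
The date filter is the fiddliest row in this table, so here is a rough Python sketch of the "everything becomes a range" idea from the date examples. The date_to_range function and the expression syntax it accepts are my own assumptions; the two bounds are simply the ones used in the examples (-20000000 and 2050). It only handles plain years, so full dates like 14 July 1792 and phrases like "19th century" would need more work.

```python
import re

# Bounds taken from the examples in the table: "older than" is open down to
# -20000000 and "younger than" is capped at 2050.
EARLIEST, LATEST = -20000000, 2050


def date_to_range(expr):
    """Turn a simple year expression into an inclusive (start, end) pair of years."""
    text = expr.strip().lower()

    m = re.fullmatch(r"older than (\d+)", text)
    if m:
        return (EARLIEST, int(m.group(1)))

    m = re.fullmatch(r"younger than (\d+)", text)
    if m:
        return (int(m.group(1)), LATEST)

    # Uncertainty such as "1850 +- 5" widens the point into a range.
    m = re.fullmatch(r"(\d+)\s*\+-\s*(\d+)", text)
    if m:
        year, margin = int(m.group(1)), int(m.group(2))
        return (year - margin, year + margin)

    # An explicit range such as "1845-1855".
    m = re.fullmatch(r"(\d+)\s*-\s*(\d+)", text)
    if m:
        return (int(m.group(1)), int(m.group(2)))

    # A single year is a range of length one.
    m = re.fullmatch(r"\d+", text)
    if m:
        return (int(text), int(text))

    raise ValueError("unrecognised date expression: " + expr)
```

The "isn't perfect" caveat still applies: once "1850 +- 5" has become 1845 to 1855, the fact that 1850 was the best guess is lost, which matters if we want to sort by date precision.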

UGC operations (add, edit, view)

UGC operations will need a user ID (or group ID), plus authentication and authorisation for some of them (but not for viewing public data).

*Parameter*	*Access/edit*	*Example values, notes*
tag	A, E	For modifying a tag (i.e. deleting it) or viewing the items associated with it
note	A, E	For modifying or viewing
UGC contributor	A, E	Perhaps multiple values, including groups, so we can look for stuff with a given tag but only when tagged by a certain set of UGC contributors
UGC contributor type	A, E	Content contributor vs. other user
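
As a way of thinking through who can do what with UGC, here is a small Python sketch of an authorisation check. It is very much a guess at rules we haven't agreed: the UgcItem class, its fields and the is_allowed function are all hypothetical. It encodes two ideas from above (public data can be viewed without authentication; non-public UGC might be visible only to its owner or the owner of the related collections) plus one assumption of mine, namely that only the contributor can edit their own UGC.

```python
from dataclasses import dataclass


@dataclass
class UgcItem:
    owner_id: str             # user who contributed the tag or note
    collection_owner_id: str  # owner of the collection record it is attached to
    public: bool = True


def is_allowed(operation, item, user_id=None, authenticated=False):
    """Decide whether a 'view' or 'edit' operation on a UGC item may proceed."""
    # Viewing public data needs neither a user ID nor authentication.
    if operation == "view" and item.public:
        return True

    # Everything else needs an identified, authenticated user.
    if user_id is None or not authenticated:
        return False

    # Non-public UGC: visible to its owner or the owner of the related collection.
    if operation == "view":
        return user_id in (item.owner_id, item.collection_owner_id)

    # Assumption: only the contributor may edit (or delete) their own UGC.
    if operation == "edit":
        return user_id == item.owner_id

    return False


private_note = UgcItem(owner_id="user-1", collection_owner_id="museum-7", public=False)
can_museum_view = is_allowed("view", private_note, user_id="museum-7", authenticated=True)
can_museum_edit = is_allowed("edit", private_note, user_id="museum-7", authenticated=True)
```

Group IDs would slot in the same way as user IDs, with membership checked before the ownership comparison.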

The Doofer Call

Saturday, March 08, 2008