Thursday, February 28, 2008

Tim Berners-Lee talks to Talis, namechecks museums (generally)

TB-L just did an interview with Paul Miller of the estimable Talis (transcript), talking inevitably about the Semantic Web. I've yet to read it all, but it was nice to see that in the first paragraph of his answer to the first question, he raised museums and libraries:
"I think the Semantic Web is such a broad set of technologies and is going to do
so many different things for different people. It is really difficult to put it
on one thing. What are the steps necessary right now for the life sciences
community to be able to use it for their data about proteins is probably
different from which steps do we need to be able to get interoperability between
repositories of library data and museum data."
Which is true enough, and not exactly controversial. Good to have it from the Don, though. It's all about repositories in this vision, which is fine as far as it goes, but I'm not sure it gets us all the way. TB-L points out that the common worry that SW means marking up HTML pages looks at only a small part of the picture, and that really the bulk of the SW effort will be about databases (remember, I'm skimming!).
My feeling at the moment is that the semantic web is building, but in good part through the efforts of those who are bypassing the "classical" technology. Inferred semantics are crucial in this phase, at least for the businesses that seem to be making some of the most interesting stuff. RDF and OWL are bit players here, although they have much more of a role in dealing with data that's already nicely structured. As TB-L points out, the steps towards full SW do have payoffs on the way (as they must, not least in our own sector), and data integration is just such a self-rewarding step:
"In fact, the gain from the Semantic Web comes much before that. So maybe we
should have written about enterprise and intra-enterprise data integration and
scientific data integration. So, I think, data integration is the name of the
game. That's happening, it's showing benefits. Public data as well; public data
is happening and it is providing the fodder for all kinds of mashups."

On the public side, light and loosely-coupled stuff is giving us (and will keep giving us) payoffs at lower cost than going hardcore SW, and yet it provides useful stepping stones on the way. Microformats, public APIs etc.: lowish cost, relatively immediate reward (potentially).
One more quote, and then I'll have to stop even skimming. This is about what to say to a CIO who wants to understand what SW could do for their company, but it applies to museum people too:

"Well you should take an inventory of what you have got in the way of data and
you should think about how valuable each piece of data in the company would be
if it were available to other people across the company, or if it were available
publicly, and if it were available to your partners."

And then, you should make a list of these things and tackle them in order. You should make sure you don't change the way any of your data is existing, is managed, so you don't mess up the existing systems and so on.
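
For museum people who like to see this sort of thing written down, here is a minimal sketch in Python of the "take an inventory, then tackle it in order" idea. Everything in it is my own assumption for illustration: the dataset names, the 0-5 scores, and the choice to simply add the three scores together are all invented, and none of it comes from the interview.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """One entry in a hypothetical data inventory.

    Each score is a made-up 0-5 estimate of how valuable the data
    would be if it were opened up to that audience.
    """
    name: str
    internal: int   # value if available across the organisation
    partners: int   # value if available to partners
    public: int     # value if available publicly

    def total(self) -> int:
        # Assumption: the three audiences are weighted equally.
        return self.internal + self.partners + self.public


def rank(inventory):
    """Return the inventory ordered from most to least valuable to share."""
    return sorted(inventory, key=lambda d: d.total(), reverse=True)


# Invented example entries for a museum.
inventory = [
    Dataset("visitor figures", internal=4, partners=2, public=1),
    Dataset("collections catalogue", internal=5, partners=4, public=5),
    Dataset("image library", internal=3, partners=4, public=4),
]

for dataset in rank(inventory):
    print(f"{dataset.name}: {dataset.total()}")
```

Note that the sketch only reads and ranks the inventory; it doesn't touch how any of the data is managed, which is in keeping with the advice about not messing up existing systems.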

He then goes on to talk about the developing technological picture including SPARQL, GRDDL and the rest, and from that point on I need to read a lot more attentively....

