Tuesday, November 22, 2011

Linked Data, Semantic Web, and Web 3.0

There are three "movements" in the community that are treated as synonymous in many discussions, but are becoming quite distinct in my own mind: Linked Data, the Semantic Web, and "Web 3.0". This is my current thinking on how these movements might be more precisely defined, and how the three differ.

Linked Data is a movement to get all data on the Web exposed as Triples (subject-predicate-object statements). Little attention is paid to the entity-types or relationship-types used to achieve this - the goal is just to get the data (a) published, and (b) published in a standard, computationally-consumable "model". This movement is being pushed by the W3C (in particular) and I can understand why! As with HTML, publishing data in Triples has a pretty low barrier-to-entry, which helps maximize buy-in from the global community. The football of data integration is kicked down the road to be solved at a later date, just as technologies were invented after the fact to (try to) integrate HTML-formatted data.
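To make the "kicked down the road" point concrete, here is a minimal sketch in plain Python (no RDF library). All identifiers (the `ex:`, `a:` and `b:` prefixes, the publishers, the `objects` helper) are hypothetical and invented for illustration; real Linked Data would use full URIs. It shows that merging two sets of Triples is trivial, while agreeing on what the relationships mean is not.

```python
from typing import Set, Tuple

# A Triple is (subject, predicate, object). Identifiers are hypothetical.
Triple = Tuple[str, str, str]

# Two publishers describe the same entity, each with its own predicate names.
publisher_a: Set[Triple] = {
    ("ex:book1", "a:writtenBy", "ex:person1"),
    ("ex:book1", "a:title", "An Example Book"),
}
publisher_b: Set[Triple] = {
    ("ex:book1", "b:author", "ex:person1"),
}

# Because both sources share the Triple model, merging is just a set union.
merged = publisher_a | publisher_b


def objects(graph: Set[Triple], subject: str, predicate: str) -> Set[str]:
    """Return every object linked to `subject` by `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# A query using publisher A's vocabulary does not see publisher B's statement,
# even though a human reader would say both predicates mean "author".
authors_per_a = objects(merged, "ex:book1", "a:writtenBy")
authors_per_b = objects(merged, "ex:book1", "b:author")
```

Both queries return the same person, but only because we asked twice, once per vocabulary. Nothing in the Triples themselves says that `a:writtenBy` and `b:author` are the same relationship; that is the integration work left for later.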

The Semantic Web movement spends much more time thinking about what the entities ARE, and what the relationships between them ARE (and can be). So far, this movement is being spearheaded by a relatively small number of ontology consortia; the global thought-leaders who are defining these entities and relationships are quite visible (and influential!) on various mailing lists and blogs. This "concentration of power" in the hands of a few leaders is, I believe, the result of a much higher level of difficulty in deep and accurate semantic modeling. Frankly, getting semantics right is hard! I am certainly one of the masses in this regard - when I have a semantic representation problem, I generally defer to one of these thought-leaders to tell me how to do it properly (though amusingly, not all thought-leaders agree on "properly"...)

"Web 3.0", however, in my opinion, is (should be) a completely different animal! What distinguished Web 2.0 was that the content was user-generated. Individuals could publish their own opinions, thoughts, and information through straightforward interfaces, and this data often ended up in a form that could easily be consumed by other interfaces, and "mashed-up" into enormously useful applications. So, given the analogous 3.0 moniker, I suggest that Web 3.0 should represent the combination of the Semantic Web with Web 2.0 - a Web in which individuals are producing fragments of ontologies... INDIVIDUALIZED ontologies... which can be shared, compared, mashed-up, and utilized to interpret Linked Data. Rather than the ontology being the product of consortium group-think, it is the product of an individual... perhaps representing the opinion of only the individual who published it!

This vision of Web 3.0 is what my group is pushing for, and we're trying to build the interfaces that make it (a) easy for an individual to create these personalized knowledge-fragments, (b) easy for others to use these artifacts to interpret their data "through the eyes of another" in order to promote crucial scientific discourse and disagreement, and (c) easy for anyone to compare and contrast the ideas and understand the foundation for the differences between them (and then hopefully conceive and conduct experiments to evaluate the "truth" behind these differences!)
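As a rough illustration of points (b) and (c), here is a toy sketch in plain Python. It is not how my group's interfaces work, and it is not OWL: a "knowledge fragment" is reduced to a named class plus one person's membership rule, and every name in it (`alice`, `bob`, `Fever`, the `ex:` identifiers, the thresholds) is a made-up assumption. The idea is only to show the same Linked Data being interpreted through two individuals' definitions, and the disagreement between them being computed.

```python
from typing import Callable, Dict, Set, Tuple

# Data triples with a numeric object: (subject, predicate, value).
Triple = Tuple[str, str, float]
# A toy "knowledge fragment": class name -> one individual's membership rule.
Fragment = Dict[str, Callable[[float], bool]]

# Shared Linked Data (hypothetical identifiers and values).
data: Set[Triple] = {
    ("ex:patient1", "ex:bodyTemperatureC", 37.7),
    ("ex:patient2", "ex:bodyTemperatureC", 38.4),
    ("ex:patient3", "ex:bodyTemperatureC", 36.9),
}

# Two individuals publish their own definition of the same class.
alice: Fragment = {"Fever": lambda t: t >= 38.0}
bob: Fragment = {"Fever": lambda t: t >= 37.5}


def interpret(graph: Set[Triple], fragment: Fragment,
              predicate: str, cls: str) -> Set[str]:
    """Return the subjects that belong to `cls` according to one fragment."""
    rule = fragment[cls]
    return {s for s, p, value in graph if p == predicate and rule(value)}


def disagreement(graph: Set[Triple], frag_a: Fragment, frag_b: Fragment,
                 predicate: str, cls: str) -> Set[str]:
    """Return the subjects that the two fragments classify differently."""
    return (interpret(graph, frag_a, predicate, cls)
            ^ interpret(graph, frag_b, predicate, cls))


# (b) View the data "through the eyes of another".
fever_per_alice = interpret(data, alice, "ex:bodyTemperatureC", "Fever")
fever_per_bob = interpret(data, bob, "ex:bodyTemperatureC", "Fever")

# (c) Find exactly where the two opinions diverge.
contested = disagreement(data, alice, bob, "ex:bodyTemperatureC", "Fever")
```

The contested subjects are the interesting ones: they are the cases where an experiment (or an argument) is needed to decide whose definition holds up.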

I realize that I am imposing definitions on existing words, but... I'm not satisfied with the current (loose) definitions of these words! So, this is how I define them, for myself, in order to keep them clear in my own head :-) Certainly, I think "the holy grail" is Web 3.0, and it's the goal that I am devoting my research career to achieving!