Sunday, September 23, 2007

Semantic Web or just Relational Web?


Both Tim O'Reilly and Alex Iskold from Read/WriteWeb have been talking about the Semantic Web this week. Any prediction about the web of the future involves the Semantic Web in one way or another. The reason is simple: we need better-quality search on the web. We want to be able to get all the packages out there for fall Hawaii vacations, or all the 8MP digital cameras under $200 that can ship to California with no taxes and overnight shipping.

Many years ago, Tim Berners-Lee came up with a vision of a Semantic Web in which computer agents would eventually look up all these answers for us. The vision was basically about annotating the existing web in a way computers could understand: adding metadata that represents ontologies (mainly in RDF) so that computers could infer new facts, and then retrieving meaningful information with a SQL-like query language (SPARQL). The problem is, how do we produce such an annotated web?
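To make the idea of annotated data plus a query a bit more concrete, here is a minimal sketch in plain Python. It is not RDF or SPARQL itself: it only imitates the subject/predicate/object shape of RDF statements and the pattern-matching style of a SPARQL query, using the digital camera example from above. Every identifier and value in it (the `camera:A` names, the `priceUSD` predicate, the helper functions) is made up for illustration.

```python
# Hypothetical data: each tuple is (subject, predicate, object), the same
# three-part shape an RDF statement has. All names and values are invented.
TRIPLES = [
    ("camera:A", "type", "DigitalCamera"),
    ("camera:A", "megapixels", 8),
    ("camera:A", "priceUSD", 179),
    ("camera:B", "type", "DigitalCamera"),
    ("camera:B", "megapixels", 8),
    ("camera:B", "priceUSD", 249),
    ("camera:C", "type", "DigitalCamera"),
    ("camera:C", "megapixels", 6),
    ("camera:C", "priceUSD", 149),
]


def objects(subject, predicate):
    """Return every object stated for a given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """Return every subject that has the given predicate/object pair."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def cheap_8mp_cameras(max_price):
    """Find 8MP digital cameras priced below max_price.

    Roughly what a SPARQL query would express as a graph pattern over
    type, megapixels, and price, plus a FILTER on the price.
    """
    result = []
    for cam in subjects("type", "DigitalCamera"):
        is_8mp = 8 in objects(cam, "megapixels")
        is_cheap = any(price < max_price for price in objects(cam, "priceUSD"))
        if is_8mp and is_cheap:
            result.append(cam)
    return result


print(cheap_8mp_cameras(200))  # ['camera:A']
```

The query is the easy part. It only works because someone already wrote the facts down in a structured form.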

There are basically two ways to do it as Alex Iskold summarizes very well:
1. Top-Down approach. Computer tools convert the existing web into semantic representations (most likely ontologies in RDF format). In order to produce such a representation, the computer needs to understand semantics. To me this is the real Semantic Web. It is not as far off as we imagine, but it is obviously not here right now: there are still some hurdles and some time to go ...

2. Bottom-Up approach. All the bottom-up approaches involve manual changes to existing HTML pages, and to the applications producing HTML, in order to add a metadata layer alongside the visible layer. People build ontologies manually or with the help of tools. This is a very time-consuming process that for now has only been adopted by highly scientific environments with an extreme need for formalization. It has basically only worked in very vertical markets and areas.

The Semantic Web is not yet a reality, despite having been out there as a concept for many years, because the top-down approach is beyond our current state of the art and the bottom-up approach is high maintenance. Carried to its full extent it is really impracticable, and industry has not adopted it.

So, by definition, the Semantic Web is one where computers have the ability to understand semantics. How can computers understand semantics? The only way for that to happen is for computers to understand natural language and to be able to build models of reality, with all their complex relationships, as we humans do. Can we have a Semantic Web without computers understanding natural language? I think the answer is no. We can give computers metadata to consume, and that will give us better-quality information and searches, but there will always be content that is not annotated and so falls outside the Semantic Web.

What are the alternatives and possible evolution paths from where we're at right now?
1. Partial annotation with industry standards and de facto standards could prevail: microformats and standard metadata for contacts, social networks, calendar events and geographical info, among others. Seems like the right tool at the right moment. (bottom-up)
2. Browser toolbars and extensions that mine HTML pages in search of phone numbers (as Skype recently tried, not too successfully) or addresses (like the Firefox extension Map+). (top-down)
3. Global relational databases such as Metadata could make a difference if they became a standard, mined the web accordingly, and were used in combination with pipes for retrieval. (mix)
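The difference between the first two options is easy to see in a few lines of code. The sketch below is only an illustration, with a made-up page and made-up phone numbers: one function reads the phone number the page author explicitly marked with the hCard microformat class `tel`, and the other guesses at phone numbers with a regular expression, the way a page-mining toolbar might. The class name `TelExtractor`, both function names and the US-style number pattern are assumptions for this example, not anything Skype or Map+ actually does.

```python
import re
from html.parser import HTMLParser

# Hypothetical page: one phone number is annotated with the hCard "tel"
# class, the other appears only in free text.
PAGE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="tel">555-123-4567</span>
</div>
<p>Or call our office at 555-987-6543 before 5pm.</p>
"""


class TelExtractor(HTMLParser):
    """Annotation-based: collect text only from elements marked class="tel"."""

    def __init__(self):
        super().__init__()
        self.in_tel = False
        self.found = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "tel" in classes:
            self.in_tel = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the "tel" element,
        # so nested markup inside it is not handled.
        self.in_tel = False

    def handle_data(self, data):
        if self.in_tel and data.strip():
            self.found.append(data.strip())


def annotated_phones(html):
    """Return the phone numbers the page author explicitly annotated."""
    parser = TelExtractor()
    parser.feed(html)
    return parser.found


# Assumed pattern: US-style numbers written as 3-3-4 digits with dashes.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


def mined_phones(html):
    """Mining-based: guess at phone numbers from their shape alone."""
    return PHONE_RE.findall(html)


print(annotated_phones(PAGE))  # ['555-123-4567']
print(mined_phones(PAGE))      # ['555-123-4567', '555-987-6543']
```

The annotated version is precise but only sees what somebody bothered to mark up. The mining version sees more, but it will also happily match any order number or serial number that happens to look like a phone number.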

I love all these approaches; I just wouldn't call them the "Semantic Web". It's more like adding some semantics to the web, which is why some people call it the "semantic web" (no uppercase). I see it more as part of the "Relational Web": as far as the computer is concerned, the web itself still has no semantic capability, and with these improvements we're just adding to its relational capability. There are other great things happening in the "Relational Web" right now ...

Anyway, regardless of which way the web will evolve and how we'll name it, it's happening and it's very exciting to witness :)
