Saturday, September 17, 2005

Semantic Wikipedia - At last one of my dreams is on the verge of coming true.

Every time I see Wikipedia, I just say WOW and marvel at the quality and quantity of information. Every time I encounter Cyc, I mutter to myself... What consistency!!! What beauty of logic, but how little data. I have always wondered... if only there were some way to distill the Wikipedia information and allow Cyc to reason with the union of its own assertions and the RDF from Wikipedia...

I even started off a mini project called Intelliwiki in February this year. After seeing some (rightful) opposition from the Wikipedians, I moved the project over to Jnanabase and created a section called Intelliwiki (thanks to NSK from the Wikinerds Community). I was trying to evolve some kind of Wikisyntax that would make semantic annotation easy and at the same time stay as easy for users to edit and manipulate as Wikipedia itself. After some initial enthusiasm my mind wandered off ;-) as usual. You might still find some of my ramblings at Intelliwiki interesting.

Just a week ago, I was randomly bouncing around the MediaWiki site when I encountered the post titled Semantic Mediawiki/Implementation. It was simply love at first sight. They propose the use of tags inside a sentence, like the ones shown here. Suppose it is an article about Germany; then a statement like:
The capital of Germany is the city of Berlin. It has a population of 12,345.
would be annotated as
The capital of Germany is the city of [[has capital::Berlin]]. It has a population of [[has population:=12345|12,345]].

At the time of saving the Wiki page, it would extract and save the annotation into an RDF store (or something similar) as...
* Germany - has capital - Berlin
* Germany - has population - 12345
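
The extraction step above can be sketched in a few lines of Python. This is only my reading of the proposed syntax, not the actual implementation: the function name extract_triples, the regular expression, and the "relation"/"attribute" labels are all assumptions made for illustration, and a real version would write to an RDF store rather than return a list.

```python
import re

# Assumed shape of an annotation, based on the two examples in this post:
#   [[property::target]]          or
#   [[property:=value|display text]]
ANNOTATION = re.compile(
    r"\[\[(?P<prop>[^\[\]|:]+?)(?P<sep>::|:=)"
    r"(?P<value>[^\[\]|]+)(?:\|(?P<label>[^\[\]]*))?\]\]"
)


def extract_triples(subject, wikitext):
    """Return (subject, property, value, kind) for every annotation found."""
    triples = []
    for match in ANNOTATION.finditer(wikitext):
        # '::' marks a relation, ':=' marks an attribute.
        kind = "relation" if match.group("sep") == "::" else "attribute"
        triples.append(
            (subject, match.group("prop").strip(), match.group("value").strip(), kind)
        )
    return triples


article = (
    "The capital of Germany is the city of [[has capital::Berlin]]. "
    "It has a population of [[has population:=12345|12,345]]."
)

for subject, prop, value, kind in extract_triples("Germany", article):
    print(f"{subject} - {prop} - {value} ({kind})")
```

Note that the display text after the | (12,345) is ignored here; only the machine-readable value (12345) goes into the triple. An ordinary [[link]] has no :: or := and is skipped.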

The :: and := markers are used to distinguish between relations and attributes. I do not wish to repeat what has already been explained at the Semantic MediaWiki/Implementation talk page. All I want to say is that the distinction is more than trivial and raises interesting possibilities. Read the discussion to know more.

Also, I am very happy to see that the annotation is not far away from the data itself. For example, let us say someone updates the value of the population. It is now immediately available both to the visual part of Wikipedia and to the semantic annotation. If the annotation were stored elsewhere, the article and the annotation would diverge over time and eventually become useless.

Combining annotation into Wikitext has the disadvantage of making things complex for newbies. The [[link]] is probably the most sexy tag in Wikipedia and it might not be very nice to see it cluttered. But weighing the pros and cons, I am fully in favour of the annotation method. I even wrote a demo Javascript to automatically hide the annotation for users who hate to see it and reintroduce it at the time of submission. The problem is not as easy as it sounds ;-). So if you are a code junkie you can roll up your sleeves and crank out some lines. Maybe your solution is far better than mine. At least I tried ;-). Here is a link to the DEMO
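
To give a feel for the hide-and-reintroduce idea, here is a rough sketch. It is not the Javascript demo mentioned above; it is written in Python, and the function names hide_annotations and restore_annotations, the regular expression, and the choice to show a relation as a plain [[link]] are all assumptions made for illustration. The restore step is deliberately naive: it only puts an annotation back if the visible fragment survived the edit untouched.

```python
import re

# Assumed annotation shape: [[property::target]] or [[property:=value|label]]
ANNOTATION = re.compile(
    r"\[\[([^\[\]|:]+?)(::|:=)([^\[\]|]+)(?:\|([^\[\]]*))?\]\]"
)


def hide_annotations(wikitext):
    """Return (visible_text, saved), where saved pairs each visible
    fragment with the full annotation it replaced, in document order."""
    saved = []

    def shorten(match):
        _prop, sep, value, label = match.groups()
        if sep == "::":
            # A relation is shown as an ordinary wiki link.
            visible = f"[[{value}|{label}]]" if label else f"[[{value}]]"
        else:
            # An attribute is shown as its display text (or raw value).
            visible = label or value
        saved.append((visible, match.group(0)))
        return visible

    return ANNOTATION.sub(shorten, wikitext), saved


def restore_annotations(edited, saved):
    """Put annotations back wherever the visible fragment is still present.
    Fragments the user changed are left as plain text (annotation lost)."""
    pieces = []
    cursor = 0
    for visible, original in saved:
        position = edited.find(visible, cursor)
        if position == -1:
            continue  # the user edited this fragment; nothing to restore
        pieces.append(edited[cursor:position])
        pieces.append(original)
        cursor = position + len(visible)
    pieces.append(edited[cursor:])
    return "".join(pieces)


source = (
    "The capital of Germany is the city of [[has capital::Berlin]]. "
    "It has a population of [[has population:=12345|12,345]]."
)
visible_text, saved_annotations = hide_annotations(source)
print(visible_text)
print(restore_annotations(visible_text, saved_annotations))
```

The weak spot shows up as soon as someone changes 12,345 to a new figure: the fragment no longer matches, so this sketch silently drops the population annotation instead of updating it. Handling that case properly is where the real work lies.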

Wishing the team from doccheck.com, who are sponsoring and implementing this, all the very best!!! May the blessings of God protect them from Murphy's Laws. But sometimes when I think of what these guys are up to, I begin to pray that Murphy's Laws keep them and me busy till the end of our lifetimes... so that we don't have to deal with the Dino Eggs hatching ;-). But hey, I wanna see the Chick come out, come what may... Ah, the ironies of life!!!