Sunday, January 15, 2006

Spread Firefox through educational Google Videos.

Google Video allows users to upload videos with no size constraints. So Firefox evangelists can shoot hundreds of thousands of short educational videos, e.g. science experiments. These videos can be submitted to spreadfirefox.com, where each will be prefixed with a Firefox logo along with a copyleft notice such as Creative Commons or the GFDL. The Firefox logo can sit, for example, in the top-right corner of the screen throughout the video. This can be done using a simple script or program.
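A sketch of what that stamping script might look like, assuming ffmpeg is available; the filenames, the helper name, and the 10-pixel margin are placeholders of my own:

```python
# Hypothetical helper: build the ffmpeg command that stamps a logo onto the
# top-right corner of a video for its full duration.

def logo_overlay_command(video_in, logo_png, video_out, margin=10):
    """Return the ffmpeg argv for overlaying logo_png onto video_in."""
    # overlay=main_w-overlay_w-<m>:<m> pins the logo to the top-right corner
    filter_expr = "overlay=main_w-overlay_w-%d:%d" % (margin, margin)
    return [
        "ffmpeg",
        "-i", video_in,    # the submitted educational video
        "-i", logo_png,    # the Firefox logo (with transparency)
        "-filter_complex", filter_expr,
        video_out,
    ]
```

Running the returned command (e.g. via subprocess) would produce the branded video, so spreadfirefox.com could apply it automatically to every submission.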

At the end of each video we can describe one specific advantage of using Firefox instead of IE. If there are 10,000 videos and 10 messages, the messages can be randomly distributed so that each appears on about a thousand videos.
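For illustration, here is a minimal sketch of one way that random-but-even assignment could be done; the function name and message strings are placeholders of my own:

```python
import random

def assign_messages(video_ids, messages, seed=0):
    """Map every video id to one closing message, keeping the split even."""
    rng = random.Random(seed)   # fixed seed keeps the example reproducible
    shuffled = list(video_ids)
    rng.shuffle(shuffled)
    # Round-robin over a shuffled list gives an even, random-looking split:
    # with 10,000 videos and 10 messages, each message lands on exactly 1,000.
    return {vid: messages[i % len(messages)] for i, vid in enumerate(shuffled)}
```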

Wikipedia can be used to build an initial list of experiments. Programs like Discovery Kids, Nat Geo Junior, Bill Nye etc. can be used for inspiration. A forum or wiki could be set up to minimize duplication of a few favourite experiments while leaving the rest untouched. Wikipedia can link to these videos as external links if the Wikipedians feel it is appropriate.

At the end of it we could have a valuable open educational resource that is likely to be frequently visited. And each time people watch it, they will be reminded of the value of Firefox, the value of open source, and the value of open collaboration (which inspired and enabled me to post this here and discuss it).

The typical Firefox user is usually an above-average individual, though this demographic is fast changing (for good). Most geeks, nerds, hackers, crackers and everyone in between switched to Firefox a long, long time ago. They are likely to have access to video cameras / cameraphones / webcams etc. and the necessary software. The high level of knowledge of typical evangelists will also manifest itself in high-quality videos. Just imagine the results if Larry Page or Sergey Brin asked each of their employees to contribute one video towards the movement. Imagine the quality and quantity of results. Google definitely benefits from more visibility for Google Video.

Schools can organize Firefox video shoots with hundreds of experiments. The entire activity in itself will be a Firefox switchover campaign that makes higher authorities take notice. It could also be the source of a large number of videos. School managements will like it for the publicity they will get from the videos. The kids will talk to their parents, relatives etc. about their participation in the movement. That will further multiply the buzz around Firefox.

Most important is the fact that at the end of the campaign we will be left with something of permanent value. Something that will be viewed again and again. Something that will evangelize for Firefox for all time to come without any recurring expenditure. This strategy makes maximum use of the distributed nature of the Firefox fan base and its capability to spread Firefox.

The suggestion is not a replacement for the current Firefox ad campaign but a complement to it. Both have their own unique advantages and target audiences and can be carried out in parallel.

Saturday, December 31, 2005

Negroponte's $100 laptop, per pixel. What??

As usual, I have some crazy ideas for which I have no clue what to do. The current one is about using a technique like Alex Tew's www.milliondollarhomepage.com for funding the million laptops suggested by Nicholas Negroponte. I can't promise you it is fault-free, but I am confident you will have an interesting read. Please check out my suggestion at

http://pedia.media.mit.edu/index.php/User:SudarshanP

Please let me know your comments on the discussion page. Also please feel free to edit whatever you like in the wiki.

Saturday, September 17, 2005

Semantic Wikipedia - At last one of my dreams is on the verge of coming true.

Every time I see Wikipedia, I just say WOW and marvel at the quality and quantity of information. Every time I encounter Cyc, I mutter to myself... What consistency!!! What beauty of logic, but how little data. I have always wondered... if only there were some way to distill the Wikipedia information and allow Cyc to reason with the union of its own assertions and the RDF from Wikipedia...

I even started off a mini project called Intelliwiki in February this year. After seeing some (rightful) opposition from the Wikipedians, I moved the project over to Jnanabase and created a section called Intelliwiki (thanks to NSK from the Wikinerds Community). I was trying to evolve some kind of wikisyntax that could make semantic annotation easy while remaining easy for users to edit and manipulate, like the Wikipedia. After some initial enthusiasm my mind wandered off ;-) as usual. You might still find some of my ramblings at Intelliwiki interesting.

Just a week ago, I was randomly bouncing around the MediaWiki site when I encountered the post titled Semantic Mediawiki/Implementation. It was simply love at first sight. They propose the use of tags in a sentence like the one shown here. Suppose it is an article about Germany; then a statement like:
The capital of Germany is the city of Berlin. It has a population of 12,345.
would be annotated as
The capital of Germany is the city of [[has capital::Berlin]]. It has a population of [[has population:=12345|12,345]].

When the wiki page is saved, the software would extract the annotations and save them into an RDF store (or something similar) as...
* Germany - has capital - Berlin
* Germany - has population - 12345

The :: and := are used to distinguish between relations and attributes. I do not wish to redundantly elaborate what has already been explained on the Semantic MediaWiki - Implementation talk page. All I want to say is that the distinction is more than trivial and raises interesting possibilities. Read the discussion to know more.
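To make the relation/attribute distinction concrete, here is a rough sketch of the save-time extraction step; the regexes are my own and cover only the simple forms in the Germany example, not the full Semantic MediaWiki syntax:

```python
import re

# [[predicate::object]] marks a relation; [[predicate:=value|display]] an attribute.
RELATION = re.compile(r"\[\[([^:\]]+)::([^\]|]+)(?:\|[^\]]*)?\]\]")
ATTRIBUTE = re.compile(r"\[\[([^:\]]+):=([^\]|]+)(?:\|[^\]]*)?\]\]")

def extract_triples(subject, wikitext):
    """Return subject-predicate-object triples found in one page's wikitext."""
    triples = [(subject, p.strip(), o.strip()) for p, o in RELATION.findall(wikitext)]
    triples += [(subject, p.strip(), v.strip()) for p, v in ATTRIBUTE.findall(wikitext)]
    return triples
```

Fed the annotated Germany sentence, this yields ('Germany', 'has capital', 'Berlin') and ('Germany', 'has population', '12345'), ready for an RDF store.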

Also, I am very happy to see that the annotation is not far away from the data itself. For example, let us say someone updates the value of the population. It is now immediately available both to the visual part of Wikipedia as well as to the semantic annotation. If the annotation were stored elsewhere, the article and the annotation would diverge over time and eventually become useless.

Combining annotation into wikitext has the disadvantage of making things complex for newbies. The [[link]] is probably the sexiest tag in Wikipedia and it might not be very nice to see it cluttered. But weighing the pros and cons, I am fully in favour of the annotation method. I even wrote a demo JavaScript to automatically hide the annotation for users who hate to see it and reintroduce it at the time of submission. The problem is not as easy as it sounds ;-). So if you are a code junkie you can roll up your sleeves and crank out some lines. Maybe your solution is far better than mine. At least I tried ;-). Here is a link to the DEMO

Wishing the team from doccheck.com, who are sponsoring and implementing this, all the very best!!! May the blessings of God protect them from Murphy's laws. But sometimes, when I think of what these guys are up to, I begin to pray that Murphy's laws keep them and me busy till the end of our lifetimes... so that we don't have to deal with the dino eggs hatching ;-). But hey, I wanna see the chick come out, come what may... Ah, the ironies of life!!!

Wednesday, August 24, 2005

Secure Surveillance Camera

Have you watched impossible heists on Discovery?

In it, intruders replace a surveillance camera with one that transmits a fake image, and the security officers have no clue that they are staring at a fake. Another trick is to physically hang a picture in front of the camera to fool the security officers.

If the camera has the ability to digitally sign the image using cryptographic keys embedded within it, then a replaced camera can be instantly detected and an alarm generated.
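A minimal sketch of the signing idea, using a symmetric HMAC with a factory-embedded key; the key bytes and function names are invented placeholders, and a real design would more likely use per-device asymmetric keys plus timestamps to stop replay attacks:

```python
import hashlib
import hmac

EMBEDDED_KEY = b"factory-installed-secret"  # hypothetical per-camera key

def sign_frame(frame_bytes, key=EMBEDDED_KEY):
    """Tag a frame so the monitoring station can tell it came from this camera."""
    return hmac.new(key, frame_bytes, hashlib.sha256).digest()

def verify_frame(frame_bytes, tag, key=EMBEDDED_KEY):
    """Return False for frames from a swapped camera, so an alarm can be raised."""
    expected = hmac.new(key, frame_bytes, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```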

The focusing apparatus of the camera can determine the distance of the subject and generate a warning if most of the camera's view is covered by an object at close range.

You may also be interested in checking out my earlier suggestion about GPS camera phones.

Friday, August 12, 2005

ViewRank

On the one hand we see continuous exponential growth in the computing capabilities of hardware: doubling of clock frequency, of memory, of bandwidth, of the number of cores, and so on. On the other hand we seem to be working harder and harder to please the computer rather than the other way round. Yet another amusing fact is that few people realize the processor is idle most of the time, in spite of all the bloatware we run on it. A search indexing service like “Google Desktop Search” is an example of a program that utilizes this spare processing power to make the life of the user a little more pleasant. Here is a suggestion to take that one level further.

Yet another interesting application is the dashboard, or “implicit query” as Microsoft calls it. It enhances the user experience by proactively suggesting relevant and useful information based upon what the user is doing at that moment.

The computer deals with a huge set of files. It has no clue what each file actually represents or how important it is to the user. Each action performed by a user implies a lot about the relevance of a certain piece of data to that person. However, such information is typically never captured.

  1. The importance of a document depends upon the number of times it was seen by the user. Here ‘seen’ is an important word. If the user does not see it, it does not matter in the imaginary universe of the user. For example, a configuration file that the user indirectly uses but has never seen is irrelevant from the user’s point of view. However, a configuration file that was edited by the user is very important to him.

  2. The duration for which the document was open indicates its importance to the user.

  3. The importance of a section of a document depends on the amount of time that section was visible on the screen.

  4. The importance of a section of a document depends on the amount of visible screen area allocated to it.

  5. The portion of the document that the user edits is more important to him than some portion of a document he just glanced at.

  6. Every single keystroke, every single mouse click, every single meter of mouse movement that was physically done by the user is much more precious than Gigabytes of junk he copies in and out of the system.

  7. The stuff that he dealt with more recently is usually more important to him than the stuff that he dealt with in the distant past.

  8. Error messages that one frequently encounters are more important to the user than a huge list of errors that could possibly be generated.

  9. A folder that a person keeps visiting frequently is more important to him than some subfolder in an installation directory he has no clue about. Here visiting could mean navigating through a file-open dialog, Windows Explorer or another file manager, or a cd at the command prompt.


In this context I introduce a (possibly) new concept I call ViewRank. The heuristics above for judging the relevance of data could dramatically improve the relevance of the hints and clues provided.

Each section of each file has an associated ViewRank derived from these heuristics. When you use Google Desktop Search you could then say “Sort in the order of importance to me,” and the results would actually be sorted by the ViewRank of the documents.
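As a toy illustration, a ViewRank score could combine the heuristics above roughly like this; the weights and the 30-day half-life are arbitrary choices of mine, not tuned values:

```python
import math
import time

def viewrank(seconds_visible, view_count, edit_count, last_seen, now=None,
             half_life_days=30.0):
    """Toy relevance score: edits outweigh views, and old activity decays."""
    now = time.time() if now is None else now
    age_days = max(0.0, (now - last_seen) / 86400.0)
    recency = 0.5 ** (age_days / half_life_days)  # weight halves every 30 days
    # Heuristic 5: edited documents matter far more than merely glanced-at ones.
    base = math.log1p(seconds_visible) + 2.0 * view_count + 10.0 * edit_count
    return base * recency
```

Sorting search hits by such a score is what “sort in the order of importance to me” would mean in practice: a resume edited last week outranks a leave letter glanced at months ago.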

How many times have you muttered to yourself... “I have seen this somewhere on my computer; I don’t know when and where.” Now you can have a feature that says “Search in all documents and files that I have seen at least once and arrange them in order of my ViewRank.” It must be obvious that the number of files and documents you have actually “spent time with” is a very small subset of all the files that exist. Thus you are much more likely to get the perfect hit. For example, you would have spent a lot more time on your resume than on a leave letter.

Capturing ViewRank

One solution is to ensure that all applications broadcast event information such as document opening, scrolling, editing etc. But this approach has a problem: we would have to wait until all these applications are ready to broadcast this information to the dashboard.

Another approach is screen scraping. Take a screenshot every second and save it as a compressed image into free memory or a particular folder on the hard disk. Alongside this, use a tiny invisible program that stores the keyboard and mouse events for that second. This info can be obtained easily with Windows hooks...

Analyze the images using primitive OCR software to extract all the text. I say primitive because extracting text from a screenshot is far easier than extracting it from a scanned image, thanks to the absence of noise. The co-ordinates of the captured windows should provide a clear set of heuristics for compression, searching for text etc. We also have APIs that return the DOM of many documents, and the GetWindowText function that gets us text from many GUI elements. This text would be added to a text index.

Some parts of these snaps will be extremely compressible:

  • For example, any Windows Explorer window can be remembered as just the set of icons + filenames + attributes + foldernames.

  • Only the differences between adjacent shots are stored.

  • Movies, Games etc. can be ignored.

  • Snaps of plain documents, dialog boxes, Web Pages etc are typically highly compressible.

  • Windows containing command prompts etc. can be primarily remembered as just strings.
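The store-only-differences idea can be sketched like this; the caller would feed in frames from some platform screenshot API (hypothetical here), and for simplicity whole frames are hashed to skip exact repeats, where a real system would diff screen regions:

```python
import hashlib

class ScreenLog:
    """Keep a once-a-second screen log, storing a frame only when it changed."""

    def __init__(self):
        self.frames = []         # (timestamp, raw bytes) actually stored
        self._last_digest = None

    def record(self, timestamp, frame_bytes):
        """Store frame_bytes unless it is identical to the previous shot."""
        digest = hashlib.sha256(frame_bytes).digest()
        if digest == self._last_digest:
            return False         # screen unchanged: store nothing this second
        self._last_digest = digest
        self.frames.append((timestamp, frame_bytes))
        return True
```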


Simple back-of-the-envelope calculations show that even if we wish to retain most of the pictorial information, it would generate roughly 10 MB per hour of use: about 100 MB per day, 3 GB per month, or 36 GB per year. Considering that hard drives these days can store terabytes of data, this is really insignificant. By the time the ViewRank solution is ready, I guess we will have gone through another pair of Moore’s law cycles.

If we store only text, the sizes would of course be much smaller. A lifetime’s worth of text would definitely fit onto a single CD or DVD.

Using ViewRank

It is like a virtual movie of your entire interaction with the computer. You can travel back in time to any day you choose and see what you had done. All the visible text will actually be copyable. Any hyperlinks identified in the images will be clickable, etc. Remember, this would include email messages, dialog boxes, wizards, folder navigation, interaction with IDEs, the command prompt, telnet sessions etc.

The information thus captured will enable your dashboard to give context-sensitive help and advice. For example, you might have worked on a defect numbered 23127, with info about it in your IM, mail, code etc. Whenever you encounter 23127, you could be presented with the relevant info in your dashboard.

Your desktop search will now also be able to search within everything you have actually “seen” in a specified period. This is very different from checking whether the string occurs in files whose last-access date falls within a certain range. Also, the results would be ordered by the time and effort you spent with them.

The screen captures usually make it a lot easier to reproduce bugs. In fact, one does not even need to redo the actions; all one needs to do is copy the relevant screenshots and mail them forward.

ViewRank FAQ

What about privacy?

Your computer has many programs installed on it. Your sole defence for privacy is installing applications like antivirus programs, firewalls etc. You trust that these programs themselves are not hostile to you. If the antivirus provider himself steals your info, there is very little you can do. But the logic is that stealing your info would hurt the business of an OS company or antivirus company, so they will not steal your info. After that, it is just a matter of writing clean code. As far as your privacy is concerned, you should be no more scared of this than of installing an OS, an antivirus, or a desktop search without seeing its source code.

In fact, you can be more confident about your privacy than when letting your email reside on a remote server, where you are forced to trust what the provider tells you about what they do with your mail. A desktop application is very different: if it is sending out your private info, security companies get a chance to shred the application to bits and expose the malicious behavior. So the probability of mischief is much lower, near zero. If it is open source, even better.

What about all the love letters, porn, the hacking sites, all the info I never want captured anywhere?
You can instruct the ViewRank service to pretend some programs were never launched. For example, you could have two or more browsers on your system. Use one, say Firefox, for all your useful work; use another, say IE, for all activities you are uncomfortable having recorded. The recording will never capture info or snapshots of that particular application.

Of course, in case it does capture any info that you think could cause problems, you can always delete it using an interface.

What about efficiency?
The computer can use the wasted CPU cycles to do the processing, just like peer-to-peer programs, SETI@home etc. The user will notice little or no performance degradation, as no processing will be done while the processor is busy.

An open-source application could be written based on these ideas that integrates with the dashboard. Some company could possibly come up with a product that does all this.

Do you feel there are any gaps in the solution I suggest here? Would you use it? Can you name any demerits that you see? Do you have suggestions for improvements?

Tuesday, June 28, 2005

Making it easy for the Layman to use the Semacode based "Virtual GPS"

If you are new to this blog please read the earlier post titled "The poor man's GPS"

Here is an example which illustrates how one can reach a house or building located at 15.12345N / 45.67891E.



When one is driving on a main road or highway, one is not very particular about the less significant digits of the co-ordinates. Even if we retain just 2 digits after the decimal point we get a resolution of 1.1 km, which is more than enough for getting a sense of one's location.



After getting close to the destination, one can start hunting for a particular house or building in a particular area. The integer parts of the latitude and longitude are extremely unlikely to change, and even the first digit after the decimal point changes only once in 11 km. Therefore these numbers are printed small, adjacent to the semacodes on houses; in fact, for small cities these numbers can be constant throughout the city. So we now have 3 digits of latitude and 3 digits of longitude left, using which a person can reach an 11 m x 11 m spot of land. After reaching the spot he can find the house or building using the house/building number mentioned along with the semacode.
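The split between the small-printed prefix and the 3-digit locator can be sketched as follows; the helper name is my own:

```python
def split_coordinate(value):
    """Split e.g. 15.12345 into ('15.1', '234'): the small-printed prefix
    plus the 3-digit locator (decimal places 2-4, ~11 m resolution)."""
    text = "%.5f" % value                # fix five decimal places
    whole, decimals = text.split(".")
    prefix = whole + "." + decimals[0]   # rarely changes within a town
    return prefix, decimals[1:4]
```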

When you try to imagine uses for such a system, stop asking yourself where GPSes are currently being used... The current uses of GPS are primarily limited by its cost and nothing else. Once co-ordinates are available universally, it won't be tens of thousands of users who use this, but hundreds of millions of people.

Imagine having to cram your brain with hundreds of Chinese names like Quin Qan Xing, Ching Chang Poi etc. on a visit to China. The harder part is that you have no extra info attached to these names, and you don't know the local language either. If you misspell a name due to unfamiliarity with the language, you could land up in a totally different place. But the moment you see a co-ordinate, you immediately know how far your destination is from you, how far you are from your hotel, etc. The same goes for a Chinese visitor in the US or Europe. See the previous post on how easy it is to estimate the approximate distance to your destination, with an average accuracy of about 80 percent, without even having to pull out paper for calculations.

Lasallastrasse does not bring anything to my mind if I go to Europe, and Nanjundeshwara Nagar would sound meaningful to an Indian, but to an outsider Nagashettahalli would not bring up any picture either. In such situations, co-ordinates are the easiest way to identify one's current location as well as the destination. If co-ordinates are available all over the place, one feels more comfortable, having some idea of where one is at any point in time, especially in a place where the native language is different from one's own. Even with just a low-detail map that carries co-ordinates, which is usually the case, navigation and travel become extremely easy.

The moment people know the co-ordinates of their own houses, they can specify them on mails, couriers etc. This would significantly improve the reliability of delivery, which any person would definitely like to have. The postman or courier boy could carry a GPS, or he too could rely on the co-ordinates available around him, in conjunction with the mail address.

Specifying your delivery address on Websites(that really need it)would become much easier too.

Those who want to geotag images can simply snap a semacode and then the actual scenery or object, and just MMS both to a site like geobloggers or flickr or some other moblogging site, which could geotag the image with the co-ordinate info.

The possibilities are endless; we have only begun to see the tip of the iceberg. The current state of the geospatial web is like the days when 640K of RAM seemed like more than enough storage for a PC. As more and more people use co-ordinates for day-to-day activities, we shall begin to see unforeseen innovations in this area.

Monday, June 27, 2005

How hard is it for the “Common Man” to use the Poor Man’s GPS mentioned earlier?

Many readers quickly grasped how users with a camera phone, or with just a cellphone, might use the system. However, the common question was: how will a person without these reach his destination? Some even asked if he needs to know Pythagoras' theorem or the distance formula from analytical geometry. So here is my answer.

Latitudes: A degree of latitude always corresponds to about 111 km. So a difference of one in the first decimal place of the latitude corresponds to roughly 11 km (call it ~10 km for mental arithmetic). If the difference is 3, it is ~30 km. If the numbers differ in the second decimal place by 5, the difference is about 5 km. In the third decimal place, each digit change is roughly 100 m; in the fourth, roughly 10 m; and in the fifth, roughly one metre.

Longitudes: A degree of longitude corresponds to a distance between 0 and 111 km, depending on how far it is from the equator. Within a given city or town these differences are negligible between different parts, except near the poles, which are very sparsely inhabited. We just need to multiply the difference in longitudes by a constant that applies to that city. For cities and places close to the equator, we can treat it just like we treated the latitude, assuming that each digit change in the first decimal place of the longitude means a 10 km difference, each digit change in the second place one km, and so on… As you move away from the equator, you scale the changes by roughly 3/4, 1/2, 1/4 and so on.

That way you don’t have to do any serious multiplication or division. And anyway, you are never going to the destination via the shortest path or geodesic, so you can coolly add the two numbers you found earlier, i.e. the difference between longitudes in km and the same for latitudes, and tell yourself roughly how far you are from your destination. You also always know whether your destination lies to your north or south, east or west. So moving around is more like golf: first you hit far, and as you get closer to your destination, you start hitting more carefully. It does not need any extraordinary brilliance to move around a city with some simple heuristics about big barriers like rivers and major highways, some common sense, the co-ordinates of your target scribbled in your address book, and co-ordinates imprinted on various objects like the mailboxes of houses, power poles etc.
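The golf-style estimate above amounts to something like this (111 km per degree of latitude, a cos(latitude) factor for longitude degrees, and plain addition instead of a geodesic); the function name is my own:

```python
import math

KM_PER_DEGREE = 111.0  # one degree of latitude

def rough_distance_km(lat1, lon1, lat2, lon2):
    """Add the north-south and east-west legs: deliberately coarser than a geodesic."""
    north_south = abs(lat2 - lat1) * KM_PER_DEGREE
    # cos(latitude) is the per-city constant that shrinks longitude degrees.
    avg_lat = math.radians((lat1 + lat2) / 2.0)
    east_west = abs(lon2 - lon1) * KM_PER_DEGREE * math.cos(avg_lat)
    return north_south + east_west
```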

Using Landline Telephone Numbers for GeoLocation.

Landline telephone numbers are associated with a particular physical address, which can be trivially mapped to a geographical co-ordinate. Once the mapping is available, you just need to know the destination phone number: SMS it to a specific number, along with your current location, and it can send back an SMS or MMS telling you how to get there. In short, you can use a fixed landline's phone number just the way you would use a long/lat co-ordinate.
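A toy sketch of the lookup behind such a service; the directory entries, number formats, and reply wording are all invented placeholders:

```python
# Hypothetical operator directory: fixed number -> (lat, lon) of its address.
DIRECTORY = {
    "080-2345678": (12.97160, 77.59460),
    "011-8765432": (28.61390, 77.20900),
}

def coordinate_reply(phone_number):
    """Compose the SMS the gateway would send back for a destination number."""
    coord = DIRECTORY.get(phone_number)
    if coord is None:
        return "Number not found in directory"
    return "Destination: %.5fN / %.5fE" % coord
```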

Also, once the mapping is available, we can massively automate the creation and distribution of the barcodes mentioned in my earlier article titled The Poor Man’s GPS. In some centralised office with a few employees, the barcodes for all subscribers of the telephone network could be printed out by pinpointing the addresses in the telephone directory using Google Maps. These can be delivered to the users along with their telephone bills. That way the human labour is minimized. The users can then stick the barcode next to their mailbox, for example.