Friday, May 11, 2007

Banff, interesting stuff

Some interesting ideas from the Banff conference:

1. CSurf, a context-driven, non-visual web browser. This browser for people with visual disabilities helps readers optimize their web navigation using a very innovative concept: semantic matching of context. Once the user clicks on a link, reading of the linked page is sped up by starting directly with the block that is most semantically similar to the block containing the original link. Their technology seems likely to be the basis for a mobile browser as well. I asked the question, and the answer is that they're already thinking about CMobile, which would address this market. They also have a very cool geometric analyzer that recognizes blocks within a page and seems highly reusable for other page-analysis purposes. CSurf will be launching soon, and we'll keep an eye on it as well as on the CMobile version.

2. Ryen White, of Microsoft Research, presented a prototype of a search log analyzer that tries to identify behavioral patterns in users performing web search tasks, so that personalized search interfaces and tools can be offered depending on the patterns a user adopts. It was basically a log-based study built on browser trails.

3. Marius Pasca from Google presented an improvement on the traditional approach of mining textual information from document collections. The innovative thought is that people provide knowledge when they look for knowledge, so he conducted a study mining search logs to deduce attributes (company name, city name, country population) of instances (Apple Inc., Rochester, USA) of classes (company, city, country). There seem to be open questions about the temporal variation of query logs, but it looks like a viable research path for mining textual information from logs.

4. Geospatial and temporal RSS tracker. This middleware tool takes a very interesting, out-of-the-box approach: it attempts to make news readable in a different way, with richer content than the traditional newspaper article linked as HTML. They considered adding geographic and temporal navigation of news. They used the typical Google-powered map (similar to twittervision), including a nice picture-in-picture view for associated relevant locations. Their heuristics and rules for mining the location out of the RSS are pretty arguable, as they can easily confuse the location a story is about with the location of its source.

5. Data mining of intent in emails, from Carnegie Mellon. Their VIO (virtual information officer) attempts to mine intent from emails, suggesting the suitable form to perform the action requested in the email and populating the related fields. One thing I liked about their approach is that it is, in their words, "precision biased", meaning that the system will only make a suggestion when the probability of it being relevant and appropriate is well above some threshold. In other words, better to do nothing than to do something that might be wrong. The system is first trained ("system domestication") to do form suggestion, field suggestion and field value suggestion. They obtained about 17% to 31% improvement.
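
To make the "precision biased" idea from item 5 concrete, here is a minimal sketch in Python. It is not VIO's actual code: the function name suggest_form, the form names, the scores and the 0.9 threshold are all hypothetical, and the scores are assumed to come from some already-trained classifier.

```python
from typing import Dict, Optional


def suggest_form(scores: Dict[str, float], threshold: float = 0.9) -> Optional[str]:
    """Return the best-scoring form, or None when confidence is too low.

    `scores` maps a (hypothetical) form name to the classifier's estimated
    probability that the form matches the intent of the email.
    """
    if not scores:
        return None
    best_form, best_score = max(scores.items(), key=lambda item: item[1])
    # Precision bias: stay silent unless the best guess clears the threshold.
    return best_form if best_score >= threshold else None


# Hypothetical usage: one confident case and one uncertain case.
confident = suggest_form({"travel_request": 0.95, "expense_report": 0.40})
uncertain = suggest_form({"travel_request": 0.60, "expense_report": 0.55})
```

Raising the threshold trades fewer suggestions for fewer wrong ones, which is the bias the authors describe.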
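
The context matching described in item 1 can be illustrated in a similar spirit. The talk summary above does not say how CSurf computes semantic similarity, so this sketch swaps in a plain bag-of-words cosine similarity as a stand-in. The function names and the sample text are assumptions made for illustration only.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_start_block(link_context: str, blocks: List[str]) -> int:
    """Index of the block most similar to the text around the clicked link."""
    if not blocks:
        raise ValueError("the page has no blocks")
    context = tokenize(link_context)
    scores = [cosine(context, tokenize(block)) for block in blocks]
    return max(range(len(blocks)), key=scores.__getitem__)


# Hypothetical page, already segmented into blocks.
page_blocks = [
    "Home News Sports Weather Login",
    "The city council approved the new transit budget on Tuesday",
    "Copyright 2007 all rights reserved",
]
start = best_start_block("City council approves transit budget", page_blocks)
```

A screen reader built this way would begin reading at page_blocks[start] instead of at the top of the page, skipping the navigation menu.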
