Monday, 7 February 2011

Progress February 2011

Over the last few months I've tried a number of novel approaches to make sense of the data that the crawler and semantifier return.

Many of these approaches (based on the Neo4J work) still have some potential (I think), but I wasn't able to prove it.

Recently, if only in an effort to see what's there, I went back to basics and did the simplest thing possible (almost). The Home Page now looks like this (see below): a simple tag cloud showing the most frequent concepts, plus images of the people whose pages were found to contain squarish, largish images. There is also a type-ahead scrolling search box. Showing people what is in the database before they search is a really useful way of helping them not to search for things that aren't there.

For any given concept, I decided to return only the related people first, and then the concepts that overlap via those people. This would mean that a search for plankton might return ...

Plankton -> (Peter Daines, Andy Bird, Rob Raiswell, Michael Krom etc.), which would then return the connected concepts (Fisheries, Aquatic ecology, Climate change, Environment etc.).
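To make the two-step lookup concrete, here is a minimal sketch in Python. It is not the code behind the site: the `PERSON_CONCEPTS` table, the placeholder people and the `related` function are all invented for illustration, and the real data lives in the crawler's database rather than a dictionary.

```python
from collections import Counter

# Hypothetical toy data: person -> concepts found on their pages.
# The people are placeholders, not real records from the database.
PERSON_CONCEPTS = {
    "Person A": {"Plankton", "Fisheries", "Aquatic ecology"},
    "Person B": {"Plankton", "Climate change", "Aquatic ecology"},
    "Person C": {"Plankton", "Environment"},
    "Person D": {"Robotics", "Environment"},
}


def related(concept, person_concepts):
    """Step 1: find the people linked to `concept`.
    Step 2: count the other concepts those people are linked to."""
    people = sorted(p for p, cs in person_concepts.items() if concept in cs)
    overlap = Counter()
    for person in people:
        for other in person_concepts[person]:
            if other != concept:
                overlap[other] += 1
    return people, overlap


people, overlap = related("Plankton", PERSON_CONCEPTS)
print(people)
print(overlap.most_common())
```

The counts in `overlap` say how many of the matched people share each connected concept, which is one plausible way to decide which concepts to show first.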

The result of this really simple approach looks like this...

...and bizarrely, because the concepts are interlinked, the visualisation sorts itself out. That is to say, previously I was attempting to fingerprint the connections between concepts and then sort them, but I found that the visualisation (which tries to find the best layout for interconnected concepts) actually does the same job, only in 2D space rather than in a matrix.
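As a rough illustration of why a layout can do the sorting for you, here is a small force-directed layout sketch in Python. This is an assumption on my part about the kind of algorithm involved (a spring layout in the Fruchterman-Reingold style); the function `force_layout`, the node names and the edges are all made up for the example. Every pair of nodes repels, linked nodes also attract, so tightly interlinked concepts end up near each other.

```python
import math
import random


def force_layout(nodes, edges, iterations=300, seed=0):
    """Very small spring layout: all pairs repel, linked pairs also attract."""
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between linked nodes
    linked = {frozenset(e) for e in edges}
    for step in range(iterations):
        temp = 0.1 * (1 - step / iterations)  # max move per step, cools to 0
        disp = {n: [0.0, 0.0] for n in nodes}
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                force = k * k / d  # repulsion pushes a and b apart
                if frozenset((a, b)) in linked:
                    force -= d * d / k  # attraction pulls linked nodes together
                fx, fy = dx / d * force, dy / d * force
                disp[a][0] += fx
                disp[a][1] += fy
                disp[b][0] -= fx
                disp[b][1] -= fy
        for n in nodes:
            dx, dy = disp[n]
            length = math.hypot(dx, dy)
            if length > 0:
                scale = min(length, temp) / length  # cap the move at temp
                pos[n][0] += dx * scale
                pos[n][1] += dy * scale
    return pos


# Hypothetical graph: two tightly linked groups joined by a single bridge.
NODES = ["A1", "A2", "A3", "B1", "B2", "B3"]
EDGES = [("A1", "A2"), ("A2", "A3"), ("A1", "A3"),
         ("B1", "B2"), ("B2", "B3"), ("B1", "B3"),
         ("A1", "B1")]

layout = force_layout(NODES, EDGES)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

Nothing in the code sorts or ranks anything; the grouping falls out of the positions, which is the effect described above.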

A rough beta of this work is online here. Be gentle with it; it's not finished. I now want to see if I can create some embed code that could usefully be added to another site, showing a mini collection of related concepts and URLs.