Saturday, 27 March 2010

Sprint 1, it works

So, having crawled the sites and mangled the data, here is some very simple code...


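from djangocalais.models import CalaisDocument, Entity


def related_people(name):
    # Take the first entity whose name contains `name` (case-insensitive).
    matches = Entity.objects.filter(name__icontains=name)
    if not matches:
        return
    entity = matches[0]
    # Every analysed document that mentions that entity.
    for cd in CalaisDocument.objects.filter(entities=entity):
        # List each person named in the same document.
        for e in cd.entities.all():
            if e.type.name == "Person":
                print e.name, ">", cd.__unicode__()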





m.related_people("Alastair Fitter")

... produces this....



This is what I like.... Artificial Intelligence for Dummies. I can't wait to slap an interface on this... Now to brush up on some jQuery.

1 comment:

  1. Tom:

    Very cool. In a future sprint you may want to try a trick. While the extractions from OpenCalais will really help in building a people graph, one simple technique that often works well is co-occurrence at "x" level, where x is document, page, paragraph, etc. If, for example, you have access to a corpus of publications or department overview web sites (John Doe's Research Team..., etc.), co-occurrence at the document / web page level would be a great addition to building person:person maps.

    As an aside - I strongly support the building of results into normal workflows. There are many pretty visualizations, etc - but in our experience there's one fundamental problem - people don't use them.

    Thanks for sharing.

    Regards,
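
Tom's document-level co-occurrence idea can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the `docs` mapping, the two function names, and the sample people are all made up here, and in the real project the lists of names would presumably come from the Person entities attached to each CalaisDocument.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how often each pair of people appears in the same document.

    `docs` is a hypothetical mapping of document id -> list of person names.
    """
    pairs = Counter()
    for people in docs.values():
        # De-duplicate and sort so (a, b) and (b, a) count as the same pair.
        for a, b in combinations(sorted(set(people)), 2):
            pairs[(a, b)] += 1
    return pairs


def related_by_cooccurrence(docs, name):
    """Return (person, shared_document_count) pairs for `name`, most frequent first."""
    related = Counter()
    for (a, b), n in cooccurrence_counts(docs).items():
        if a == name:
            related[b] += n
        elif b == name:
            related[a] += n
    return related.most_common()


# Made-up sample data, for illustration only.
docs = {
    "page-1": ["Alice Example", "Bob Example"],
    "page-2": ["Alice Example", "Bob Example", "Carol Example"],
    "page-3": ["Carol Example"],
}
print(related_by_cooccurrence(docs, "Alice Example"))
```

Counting shared documents gives a crude weight for each person:person edge, which could be thresholded before drawing a graph. The same function works at page or paragraph level if `docs` is keyed by those units instead.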
