<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-8560185820472870114</id><updated>2011-08-02T21:51:54.511-07:00</updated><category term='visualisation'/><category term='crawlers'/><category term='jisc'/><category term='semantic'/><category term='university of york'/><category term='python'/><category term='gephi'/><category term='neo4j'/><category term='twitter'/><category term='collaboration'/><category term='delicious'/><category term='pppeople'/><category term='crawler'/><category term='api'/><category term='open calais'/><category term='google'/><title type='text'>PPPeople PPPowered</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>27</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-7503946453095649856</id><published>2011-02-07T08:15:00.000-08:00</published><updated>2011-03-07T08:43:20.073-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='open calais'/><title type='text'>Progress February 2011</title><content type='html'>Over the last few months I've tried a number of novel approaches to &lt;i&gt;make sense&lt;/i&gt; of the data that the crawler and semantifier return.&lt;br /&gt;&lt;br /&gt;Many of these approaches (based on the Neo4J work) still have some potential (I think), but I wasn't able to prove it.&lt;br /&gt;&lt;br /&gt;Recently, if only in an effort to &lt;i&gt;see what's there&lt;/i&gt;, I went back to basics and did the &lt;b&gt;simplest thing possible&lt;/b&gt; (almost). The Home Page now looks like this (see below): a simple tag cloud showing the most frequent concepts, plus images of the people whose pages were found to contain squarish, largeish images. There is also a type-ahead scrolling search box. 
Showing people what is in the database before they search is a really useful way of helping them not to search for things that aren't there.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh4.googleusercontent.com/-bYzUkzhXDwg/TXUGpBGitAI/AAAAAAAAAOc/2wLdNbrkDT8/s1600/HomePage.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="185" src="https://lh4.googleusercontent.com/-bYzUkzhXDwg/TXUGpBGitAI/AAAAAAAAAOc/2wLdNbrkDT8/s320/HomePage.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;For any given concept, I decided to return only the related people and then their overlapping concepts. This would mean that a search for plankton might return ...&lt;br /&gt;&lt;br /&gt;Plankton -&amp;gt; (Peter Daines, Andy Bird, Rob Raiswell, Michael Krom etc.), who would in turn return the connected concepts (Fisheries, Aquatic ecology, Climate change, Environment etc.).&lt;br /&gt;&lt;br /&gt;The result of this really simple approach looks like this...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="https://lh5.googleusercontent.com/-V4ZIqlJ2hrg/TXUHH2HcCTI/AAAAAAAAAOg/4Y_02jSEFnI/s1600/DominicWatt.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="182" src="https://lh5.googleusercontent.com/-V4ZIqlJ2hrg/TXUHH2HcCTI/AAAAAAAAAOg/4Y_02jSEFnI/s320/DominicWatt.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;...and bizarrely, because the concepts are interlinked, the visualisation sorts itself out. 
That is to say, previously I was attempting to fingerprint the connections between concepts and then sort them, but I found that the visualisation (which tries to find the best layout for interconnected concepts) actually does the same job, only in 2D space rather than in a matrix.&lt;br /&gt;&lt;br /&gt;A rough beta of this work is online at &lt;a href="http://pppeople.collabtools.org.uk/"&gt;http://pppeople.collabtools.org.uk/&lt;/a&gt;. Be gentle with it; it's not finished. I now want to see if I can create some embed code that could usefully be added to another site, showing a mini collection of related concepts and URLs.</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/7503946453095649856/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2011/02/progress-february-2011.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7503946453095649856'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7503946453095649856'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2011/02/progress-february-2011.html' title='Progress February 2011'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='https://lh4.googleusercontent.com/-bYzUkzhXDwg/TXUGpBGitAI/AAAAAAAAAOc/2wLdNbrkDT8/s72-c/HomePage.png' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-914972506414185457</id><published>2010-08-06T09:24:00.000-07:00</published><updated>2010-08-06T09:24:18.156-07:00</updated><title type='text'>Neo4J + Python Crawler + Open Calais + Gephi</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_2IeQft2KL-g/TFw0DBneTiI/AAAAAAAAAM4/LDhUxFsLECQ/s1600/gephishot.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="484" src="http://1.bp.blogspot.com/_2IeQft2KL-g/TFw0DBneTiI/AAAAAAAAAM4/LDhUxFsLECQ/s640/gephishot.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;So today, having spent the morning getting our LDAP server to talk to our Cyn.in instance (it worked!), I thought I'd have a bash at this. The plan was/is to create a crawler that gets a web page, finds the pages that it links to (including Word or PDF files), pumps any data found at the Open Calais API, saves the semantic entities returned into a Neo4J database, and then lets me go and look at it all with the Gephi visualising tool (see previous posts).&lt;br /&gt;&lt;br /&gt;This is all kinda new to me and I have no idea if what I'm doing makes sense, but I hope that once I can "look" at the data, I'll be able to figure out a way of pruning it into something usable.&lt;br /&gt;&lt;br /&gt;The (ropey) code is below and the visualisation is shown above. I have no idea if this will be "traversable" yet, but it kind of proves to me that it's doable. 
Ideally I want to crawl a given pile of pages into one big soup and then get from page B to page X and "discover" the shortest route between them...&lt;br /&gt;&lt;br /&gt;How hard can it be :-)&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre&gt;&lt;span class="s1"&gt;&lt;br /&gt;&lt;a href="" name="l5"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;5    &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;import &lt;/span&gt;&lt;span class="s1"&gt;urllib2, urllib,  traceback, re, urlparse, socket, sys, random, time &lt;br /&gt;&lt;a href="" name="l6"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;6    &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;from &lt;/span&gt;&lt;span class="s1"&gt;pprint &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;import &lt;/span&gt;&lt;span class="s1"&gt;pprint &lt;br /&gt;&lt;a href="" name="l7"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;7    &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;import &lt;/span&gt;&lt;span class="s1"&gt;codecs, sys &lt;br /&gt;&lt;a href="" name="l8"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;8    &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;from &lt;/span&gt;&lt;span class="s1"&gt;calais &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;import &lt;/span&gt;&lt;span class="s1"&gt;Calais      &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#http://code.google.com/p/python-calais/&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l9"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: 
normal;"&gt;9    &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;from &lt;/span&gt;&lt;span class="s1"&gt;scraper &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;import &lt;/span&gt;&lt;span class="s1"&gt;fetch     &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#http://zesty.ca/python/scrape.py&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l10"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;10   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l11"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;11   &lt;/span&gt;&lt;/a&gt;streamWriter = codecs.lookup(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'utf-8'&lt;/span&gt;&lt;span class="s1"&gt;)[-&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;1&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l12"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;12   &lt;/span&gt;&lt;/a&gt;sys.stdout = streamWriter(sys.stdout) &lt;br /&gt;&lt;a href="" name="l13"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;13   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l14"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;14   &lt;/span&gt;&lt;/a&gt;API_KEY = &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'YOUR_KEY_HERE'&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l15"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;15   &lt;/span&gt;&lt;/a&gt;calaisapi = Calais(API_KEY, submitter=&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"python-calais demo"&lt;/span&gt;&lt;span class="s1"&gt;) &lt;br /&gt;&lt;a href="" 
name="l16"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;16   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l17"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;17   &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;def &lt;/span&gt;&lt;span class="s1"&gt;analyze(url=None): &lt;br /&gt;&lt;a href="" name="l18"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;18   &lt;/span&gt;&lt;/a&gt;    result = calaisapi.analyze_url( url ) &lt;br /&gt;&lt;a href="" name="l19"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;19   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;return &lt;/span&gt;&lt;span class="s1"&gt;result &lt;br /&gt;&lt;a href="" name="l20"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;20   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l21"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;21   &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;def &lt;/span&gt;&lt;span class="s1"&gt;neo_result(result, url): &lt;br /&gt;&lt;a href="" name="l22"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;22   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;import &lt;/span&gt;&lt;span class="s1"&gt;neo4j &lt;br /&gt;&lt;a href="" name="l23"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;23   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l24"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;24   &lt;/span&gt;&lt;/a&gt;    db = neo4j.GraphDatabase( &lt;/span&gt;&lt;span 
class="s3" style="color: green; font-weight: bold;"&gt;"simple_neo_calais_test" &lt;/span&gt;&lt;span class="s1"&gt;) &lt;br /&gt;&lt;a href="" name="l25"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;25   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l26"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;26   &lt;/span&gt;&lt;/a&gt;    result.print_summary( ) &lt;br /&gt;&lt;a href="" name="l27"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;27   &lt;/span&gt;&lt;/a&gt;     &lt;br /&gt;&lt;a href="" name="l28"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;28   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;with &lt;/span&gt;&lt;span class="s1"&gt;db.transaction: &lt;br /&gt;&lt;a href="" name="l29"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;29   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# Create the page index&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l30"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;30   &lt;/span&gt;&lt;/a&gt;        pages = db.index(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Pages"&lt;/span&gt;&lt;span class="s1"&gt;, create=True) &lt;br /&gt;&lt;a href="" name="l31"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;31   &lt;/span&gt;&lt;/a&gt;        page_node = pages[url] &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# does this page exist yet?&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l32"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;32   &lt;/span&gt;&lt;/a&gt;  
      &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if not &lt;/span&gt;&lt;span class="s1"&gt;page_node: &lt;br /&gt;&lt;a href="" name="l33"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;33   &lt;/span&gt;&lt;/a&gt;            page_node = db.node(url=url) &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# create a page&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l34"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;34   &lt;/span&gt;&lt;/a&gt;            pages[ url ] = page_node &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# Add to index&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l35"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;35   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Created:" &lt;/span&gt;&lt;span class="s1"&gt;, url &lt;br /&gt;&lt;a href="" name="l36"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;36   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l37"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;37   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Exists already:" &lt;/span&gt;&lt;span class="s1"&gt;, url &lt;br /&gt;&lt;a href="" name="l38"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;38   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a 
href="" name="l39"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;39   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;len(result.entities), &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Calais Entities"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l40"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;40   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;e &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;result.entities: &lt;br /&gt;&lt;a href="" name="l41"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;41   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;result.doc[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'info'&lt;/span&gt;&lt;span class="s1"&gt;][&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'externalID'&lt;/span&gt;&lt;span class="s1"&gt;]    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#URL&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l42"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;42   &lt;/span&gt;&lt;/a&gt;            entity_type =  e[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'_type'&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l43"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;43   &lt;/span&gt;&lt;/a&gt;            entity_value = 
e[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'name'&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l44"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;44   &lt;/span&gt;&lt;/a&gt;            relevance = e[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'relevance'&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l45"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;45   &lt;/span&gt;&lt;/a&gt;            instances = e[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'instances'&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l46"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;46   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l47"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;47   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;entity_type, entity_value, relevance, instances &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#instances is a list of contexts&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l48"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;48   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l49"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;49   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#Create an entity&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l50"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;50   &lt;/span&gt;&lt;/a&gt;            entity = 
db.node(value=entity_value, relevance=relevance ) &lt;br /&gt;&lt;a href="" name="l51"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;51   &lt;/span&gt;&lt;/a&gt;            entity_type = db.node( name= entity_type ) &lt;br /&gt;&lt;a href="" name="l52"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;52   &lt;/span&gt;&lt;/a&gt;            entity.is_a( entity_type )    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# e.g. Amazon is_a Company&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l53"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;53   &lt;/span&gt;&lt;/a&gt;            page_node.has( entity_type ) &lt;br /&gt;&lt;a href="" name="l54"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;54   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l55"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;55   &lt;/span&gt;&lt;/a&gt;    db.shutdown() &lt;br /&gt;&lt;a href="" name="l56"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;56   &lt;/span&gt;&lt;/a&gt;  &lt;br /&gt;&lt;a href="" name="l59"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;59   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l60"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;60   &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;def &lt;/span&gt;&lt;span class="s1"&gt;print_result(result): &lt;br /&gt;&lt;a href="" name="l61"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;61   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'Custom code to just show certain bits of the 
result obj'&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l62"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;62   &lt;/span&gt;&lt;/a&gt;    result.print_summary( ) &lt;br /&gt;&lt;a href="" name="l63"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;63   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l64"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;64   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Entities"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l65"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;65   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;e &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;result.entities: &lt;br /&gt;&lt;a href="" name="l66"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;66   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;e[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'_type'&lt;/span&gt;&lt;span class="s1"&gt;], &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;":"&lt;/span&gt;&lt;span class="s1"&gt;, (e[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'name'&lt;/span&gt;&lt;span class="s1"&gt;]) &lt;br /&gt;&lt;a href="" name="l67"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;67   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" 
style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;e.keys() &lt;br /&gt;&lt;a href="" name="l68"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;68   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l69"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;69   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l70"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;70   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Topics"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l71"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;71   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;t &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;result.topics: &lt;br /&gt;&lt;a href="" name="l72"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;72   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;t[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'category'&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l73"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;73   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#print t&lt;/span&gt;&lt;span class="s1"&gt; &lt;br 
/&gt;&lt;a href="" name="l74"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;74   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l75"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;75   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l76"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;76   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Relations"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l77"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;77   &lt;/span&gt;&lt;/a&gt;    result.print_relations( ) &lt;br /&gt;&lt;a href="" name="l78"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;78   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l79"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;79   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l80"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;80   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l81"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;81   &lt;/span&gt;&lt;/a&gt;suffixes_to_avoid=[ &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'css'&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'js'&lt;/span&gt;&lt;span class="s1"&gt;,&lt;/span&gt;&lt;span class="s3" 
style="color: green; font-weight: bold;"&gt;'zip'&lt;/span&gt;&lt;span class="s1"&gt;, ] &lt;br /&gt;&lt;a href="" name="l82"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;82   &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;def &lt;/span&gt;&lt;span class="s1"&gt;get_links( data, url=&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;''&lt;/span&gt;&lt;span class="s1"&gt;,suffixes_to_avoid=suffixes_to_avoid ): &lt;br /&gt;&lt;a href="" name="l83"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;83   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'I know I should use BeautifulSoup or lxml'&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l84"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;84   &lt;/span&gt;&lt;/a&gt;    links = [ ] &lt;br /&gt;&lt;a href="" name="l85"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;85   &lt;/span&gt;&lt;/a&gt;    icos = [ ] &lt;br /&gt;&lt;a href="" name="l86"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;86   &lt;/span&gt;&lt;/a&gt;    feeds = [ ] &lt;br /&gt;&lt;a href="" name="l87"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;87   &lt;/span&gt;&lt;/a&gt;    images = [ ] &lt;br /&gt;&lt;a href="" name="l88"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;88   &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;try&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l89"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;89   &lt;/span&gt;&lt;/a&gt;        found_links = 
re.findall(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;' href="?([^\s^"&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;\'&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;#]+)'&lt;/span&gt;&lt;span class="s1"&gt;, data)&lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# think this strips off anchors&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l90"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;90   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l91"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;91   &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;link &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;found_links: &lt;br /&gt;&lt;a href="" name="l92"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;92   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# fix up relative links&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l93"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;93   &lt;/span&gt;&lt;/a&gt;            link = urlparse.urljoin(url, link) &lt;br /&gt;&lt;a href="" name="l94"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;94   &lt;/span&gt;&lt;/a&gt;            link = link.replace(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"/.."&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;""&lt;/span&gt;&lt;span class="s1"&gt;) &lt;/span&gt;&lt;span class="s4" style="color: grey; 
font-style: italic;"&gt;# fix relative links&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l95"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;95   &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l96"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;96   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#check to see if path is just "/"... for example, http://theotherblog.com/&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l97"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;97   &lt;/span&gt;&lt;/a&gt;            path = urlparse.urlsplit(link)[&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;2&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l98"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;98   &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;path  == &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"/"&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l99"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;99   &lt;/span&gt;&lt;/a&gt;                link = link[:-&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;1&lt;/span&gt;&lt;span class="s1"&gt;]&lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#take off the trailing slash (or should we put it on?)&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l100"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;100  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l101"&gt;&lt;span class="ln" style="color: black; font-style: normal; 
font-weight: normal;"&gt;101  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#just in case fixups&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l102"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;102  &lt;/span&gt;&lt;/a&gt;            link = link.replace(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"'"&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;""&lt;/span&gt;&lt;span class="s1"&gt;) &lt;br /&gt;&lt;a href="" name="l103"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;103  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l104"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;104  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;link &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;not in &lt;/span&gt;&lt;span class="s1"&gt;links &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;and &lt;/span&gt;&lt;span class="s1"&gt;link[:&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;7&lt;/span&gt;&lt;span class="s1"&gt;] == &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'http://' &lt;/span&gt;&lt;span class="s1"&gt;: &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#avoid mailto:, https://&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l105"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;105  &lt;/span&gt;&lt;/a&gt;                &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: 
bold;"&gt;"." &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;path: &lt;br /&gt;&lt;a href="" name="l106"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;106  &lt;/span&gt;&lt;/a&gt;                    suffix_found = path.split(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"."&lt;/span&gt;&lt;span class="s1"&gt;)[-&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;1&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l107"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;107  &lt;/span&gt;&lt;/a&gt;                    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;suffix_found, link &lt;br /&gt;&lt;a href="" name="l108"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;108  &lt;/span&gt;&lt;/a&gt;                    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;suffix_found  &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;suffixes_to_avoid: &lt;br /&gt;&lt;a href="" name="l109"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;109  &lt;/span&gt;&lt;/a&gt;                        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;pass&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l110"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;110  &lt;/span&gt;&lt;/a&gt;                    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l111"&gt;&lt;span class="ln" style="color: black; font-style: normal; 
font-weight: normal;"&gt;111  &lt;/span&gt;&lt;/a&gt;                        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;suffix_found == &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'ico'&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l112"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;112  &lt;/span&gt;&lt;/a&gt;                            icos.append( link ) &lt;br /&gt;&lt;a href="" name="l113"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;113  &lt;/span&gt;&lt;/a&gt;                        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;elif &lt;/span&gt;&lt;span class="s1"&gt;suffix_found == &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'rss' &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;or &lt;/span&gt;&lt;span class="s1"&gt;suffix_found==&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'xml'&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l114"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;114  &lt;/span&gt;&lt;/a&gt;                            feeds.append ( link ) &lt;br /&gt;&lt;a href="" name="l115"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;115  &lt;/span&gt;&lt;/a&gt;                        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;elif &lt;/span&gt;&lt;span class="s1"&gt;suffix_found &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'gif'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: 
bold;"&gt;'png'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'jpg'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'jpeg'&lt;/span&gt;&lt;span class="s1"&gt;]: &lt;br /&gt;&lt;a href="" name="l116"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;116  &lt;/span&gt;&lt;/a&gt;                            images.append( link ) &lt;br /&gt;&lt;a href="" name="l117"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;117  &lt;/span&gt;&lt;/a&gt;                        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l118"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;118  &lt;/span&gt;&lt;/a&gt;                            links.append( link ) &lt;br /&gt;&lt;a href="" name="l119"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;119  &lt;/span&gt;&lt;/a&gt;                &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l120"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;120  &lt;/span&gt;&lt;/a&gt;                    links.append( link ) &lt;br /&gt;&lt;a href="" name="l121"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;121  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l122"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;122  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l123"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;123  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" 
style="color: navy; font-weight: bold;"&gt;except &lt;/span&gt;&lt;span class="s1"&gt;Exception, err: &lt;br /&gt;&lt;a href="" name="l124"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;124  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;err &lt;br /&gt;&lt;a href="" name="l125"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;125  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l126"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;126  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;return &lt;/span&gt;&lt;span class="s1"&gt;links, icos, images, feeds &lt;br /&gt;&lt;a href="" name="l127"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;127  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l128"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;128  &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;def &lt;/span&gt;&lt;span class="s1"&gt;get_images(content, url): &lt;br /&gt;&lt;a href="" name="l129"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;129  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#ico&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l130"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;130  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#feeds&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l131"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;131  
&lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#images&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l132"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;132  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#text&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l133"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;133  &lt;/span&gt;&lt;/a&gt;    found_images = re.findall(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;' src="?([^\s^"&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;\'&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;#]+)'&lt;/span&gt;&lt;span class="s1"&gt;, content) &lt;br /&gt;&lt;a href="" name="l134"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;134  &lt;/span&gt;&lt;/a&gt;    images = [ ] &lt;br /&gt;&lt;a href="" name="l135"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;135  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l136"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;136  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l137"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;137  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;image &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;found_images: &lt;br /&gt;&lt;a href="" name="l138"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;138  &lt;/span&gt;&lt;/a&gt;        image = urlparse.urljoin(url, 
image) &lt;br /&gt;&lt;a href="" name="l139"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;139  &lt;/span&gt;&lt;/a&gt;        image = image.replace(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"/.."&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;""&lt;/span&gt;&lt;span class="s1"&gt;) &lt;br /&gt;&lt;a href="" name="l140"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;140  &lt;/span&gt;&lt;/a&gt;        path = urlparse.urlsplit(image)[&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;2&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l141"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;141  &lt;/span&gt;&lt;/a&gt;         &lt;br /&gt;&lt;a href="" name="l142"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;142  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;path[-&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;3&lt;/span&gt;&lt;span class="s1"&gt;:] == &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;".js"&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l143"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;143  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;pass&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l144"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;144  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" 
name="l145"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;145  &lt;/span&gt;&lt;/a&gt;             images.append( image ) &lt;br /&gt;&lt;a href="" name="l146"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;146  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;return &lt;/span&gt;&lt;span class="s1"&gt;images &lt;br /&gt;&lt;a href="" name="l147"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;147  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l148"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;148  &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;def &lt;/span&gt;&lt;span class="s1"&gt;get_icos(content, url): &lt;br /&gt;&lt;a href="" name="l149"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;149  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#href="/static/img/favicon.ico"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l150"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;150  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#image/x-icon &lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l151"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;151  &lt;/span&gt;&lt;/a&gt;    found_images = re.findall(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;' href="?([^\s^"&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;\'&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;#]+\.ico)'&lt;/span&gt;&lt;span class="s1"&gt;, content) &lt;br /&gt;&lt;a 
href="" name="l152"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;152  &lt;/span&gt;&lt;/a&gt;    images = [ ] &lt;br /&gt;&lt;a href="" name="l153"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;153  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l154"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;154  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l155"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;155  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;image &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;found_images: &lt;br /&gt;&lt;a href="" name="l156"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;156  &lt;/span&gt;&lt;/a&gt;        image = urlparse.urljoin(url, image) &lt;br /&gt;&lt;a href="" name="l157"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;157  &lt;/span&gt;&lt;/a&gt;        image = image.replace(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"/.."&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;""&lt;/span&gt;&lt;span class="s1"&gt;) &lt;br /&gt;&lt;a href="" name="l158"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;158  &lt;/span&gt;&lt;/a&gt;        path = urlparse.urlsplit(image)[&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;2&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l159"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;159  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l160"&gt;&lt;span 
class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;160  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;path[-&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;3&lt;/span&gt;&lt;span class="s1"&gt;:] == &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;".js"&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l161"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;161  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;pass&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l162"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;162  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l163"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;163  &lt;/span&gt;&lt;/a&gt;             images.append( image ) &lt;br /&gt;&lt;a href="" name="l164"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;164  &lt;/span&gt;&lt;/a&gt;             &lt;br /&gt;&lt;a href="" name="l165"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;165  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;images &lt;br /&gt;&lt;a href="" name="l166"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;166  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;len(images)&amp;gt; &lt;/span&gt;&lt;span class="s2" 
style="color: blue;"&gt;0&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l167"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;167  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;len(images), &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"icos"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l168"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;168  &lt;/span&gt;&lt;/a&gt;        image = images[ &lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;0 &lt;/span&gt;&lt;span class="s1"&gt;] &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#just get the first&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l169"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;169  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l170"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;170  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;#try a default... you never know... 
might be aliased anyway...&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l171"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;171  &lt;/span&gt;&lt;/a&gt;        scheme, domain, path, query, x = urlparse.urlsplit( url ) &lt;br /&gt;&lt;a href="" name="l172"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;172  &lt;/span&gt;&lt;/a&gt;        image = &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'%s://%s/favicon.ico' &lt;/span&gt;&lt;span class="s1"&gt;% (scheme, domain) &lt;br /&gt;&lt;a href="" name="l173"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;173  &lt;/span&gt;&lt;/a&gt;         &lt;br /&gt;&lt;a href="" name="l174"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;174  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;return &lt;/span&gt;&lt;span class="s1"&gt;image &lt;br /&gt;&lt;a href="" name="l175"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;175  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l176"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;176  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l177"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;177  &lt;/span&gt;&lt;/a&gt;mimes_u_like = [ &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'text/html'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'application/pdf'&lt;/span&gt;&lt;span class="s1"&gt;, ] &lt;br /&gt;&lt;a href="" name="l178"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;178  &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" 
style="color: navy; font-weight: bold;"&gt;def &lt;/span&gt;&lt;span class="s1"&gt;get( url, mimes_u_like=mimes_u_like, fetch_links=False ): &lt;br /&gt;&lt;a href="" name="l179"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;179  &lt;/span&gt;&lt;/a&gt;    url, status, message, headers, content  = fetch(url) &lt;br /&gt;&lt;a href="" name="l180"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;180  &lt;/span&gt;&lt;/a&gt;    mime = headers[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'content-type'&lt;/span&gt;&lt;span class="s1"&gt;].split(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;";"&lt;/span&gt;&lt;span class="s1"&gt;)[&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;0&lt;/span&gt;&lt;span class="s1"&gt;].strip() &lt;br /&gt;&lt;a href="" name="l181"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;181  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l182"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;182  &lt;/span&gt;&lt;/a&gt;    page = {&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'url'&lt;/span&gt;&lt;span class="s1"&gt;:url, &lt;br /&gt;&lt;a href="" name="l183"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;183  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'mime'&lt;/span&gt;&lt;span class="s1"&gt;:mime, &lt;br /&gt;&lt;a href="" name="l184"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;184  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'status'&lt;/span&gt;&lt;span class="s1"&gt;:status, &lt;br /&gt;&lt;a href="" name="l185"&gt;&lt;span class="ln" style="color: 
black; font-style: normal; font-weight: normal;"&gt;185  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'message'&lt;/span&gt;&lt;span class="s1"&gt;:message, &lt;br /&gt;&lt;a href="" name="l186"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;186  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'headers'&lt;/span&gt;&lt;span class="s1"&gt;:headers, &lt;br /&gt;&lt;a href="" name="l187"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;187  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'content'&lt;/span&gt;&lt;span class="s1"&gt;:content, &lt;br /&gt;&lt;a href="" name="l188"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;188  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'links'&lt;/span&gt;&lt;span class="s1"&gt;:[], &lt;br /&gt;&lt;a href="" name="l189"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;189  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'images'&lt;/span&gt;&lt;span class="s1"&gt;:[], &lt;br /&gt;&lt;a href="" name="l190"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;190  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'feeds'&lt;/span&gt;&lt;span class="s1"&gt;: [] } &lt;br /&gt;&lt;a href="" name="l191"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;191  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l192"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;192  
&lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l193"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;193  &lt;/span&gt;&lt;/a&gt;    ico =  get_icos( content, url ) &lt;br /&gt;&lt;a href="" name="l194"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;194  &lt;/span&gt;&lt;/a&gt;    page[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'ico'&lt;/span&gt;&lt;span class="s1"&gt;] = ico &lt;br /&gt;&lt;a href="" name="l195"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;195  &lt;/span&gt;&lt;/a&gt;     &lt;br /&gt;&lt;a href="" name="l196"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;196  &lt;/span&gt;&lt;/a&gt;    images_found =  get_images( content, url ) &lt;br /&gt;&lt;a href="" name="l197"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;197  &lt;/span&gt;&lt;/a&gt;    page[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'images'&lt;/span&gt;&lt;span class="s1"&gt;] = images_found &lt;br /&gt;&lt;a href="" name="l198"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;198  &lt;/span&gt;&lt;/a&gt;     &lt;br /&gt;&lt;a href="" name="l199"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;199  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;fetch_links == False: &lt;br /&gt;&lt;a href="" name="l200"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;200  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;return &lt;/span&gt;&lt;span class="s1"&gt;page     &lt;/span&gt;&lt;span class="s4" style="color: grey; font-style: italic;"&gt;# this 
is wrong... I want to discover the links without necessarily getting their data&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l201"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;201  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;else&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l202"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;202  &lt;/span&gt;&lt;/a&gt;        links_found, icos, images, feeds = get_links( content, url ) &lt;br /&gt;&lt;a href="" name="l203"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;203  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;url &lt;br /&gt;&lt;a href="" name="l204"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;204  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;len( links_found ), &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"links found"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l205"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;205  &lt;/span&gt;&lt;/a&gt;        links = [ ] &lt;br /&gt;&lt;a href="" name="l206"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;206  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;len( icos ), &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"icos"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l207"&gt;&lt;span class="ln" style="color: 
black; font-style: normal; font-weight: normal;"&gt;207  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;len( images ), &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"images"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l208"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;208  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;images &lt;br /&gt;&lt;a href="" name="l209"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;209  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;len( feeds ), &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"feeds"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l210"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;210  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l211"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;211  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l212"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;212  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;link &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;links_found: &lt;br /&gt;&lt;a href="" name="l213"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;213  &lt;/span&gt;&lt;/a&gt;            url, status, message, headers, content  = fetch(link) &lt;br 
/&gt;&lt;a href="" name="l214"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;214  &lt;/span&gt;&lt;/a&gt;            mime = headers[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'content-type'&lt;/span&gt;&lt;span class="s1"&gt;].split(&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;";"&lt;/span&gt;&lt;span class="s1"&gt;)[&lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;0&lt;/span&gt;&lt;span class="s1"&gt;].strip() &lt;br /&gt;&lt;a href="" name="l215"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;215  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;mime &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;mimes_u_like: &lt;br /&gt;&lt;a href="" name="l216"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;216  &lt;/span&gt;&lt;/a&gt;                &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;\t&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"&lt;/span&gt;&lt;span class="s1"&gt;, status, mime, link &lt;br /&gt;&lt;a href="" name="l217"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;217  &lt;/span&gt;&lt;/a&gt;                &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;status == &lt;/span&gt;&lt;span class="s2" style="color: blue;"&gt;200&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l218"&gt;&lt;span class="ln" style="color: black; font-style: normal; 
font-weight: normal;"&gt;218  &lt;/span&gt;&lt;/a&gt;                    links.append( {&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'url'&lt;/span&gt;&lt;span class="s1"&gt;:url, &lt;br /&gt;&lt;a href="" name="l219"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;219  &lt;/span&gt;&lt;/a&gt;                                   &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'status'&lt;/span&gt;&lt;span class="s1"&gt;:status, &lt;br /&gt;&lt;a href="" name="l220"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;220  &lt;/span&gt;&lt;/a&gt;                                   &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'message'&lt;/span&gt;&lt;span class="s1"&gt;:message, &lt;br /&gt;&lt;a href="" name="l221"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;221  &lt;/span&gt;&lt;/a&gt;                                   &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'headers'&lt;/span&gt;&lt;span class="s1"&gt;:headers, &lt;br /&gt;&lt;a href="" name="l222"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;222  &lt;/span&gt;&lt;/a&gt;                                   &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'mime'&lt;/span&gt;&lt;span class="s1"&gt;:mime, &lt;br /&gt;&lt;a href="" name="l223"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;223  &lt;/span&gt;&lt;/a&gt;                                   &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'content'&lt;/span&gt;&lt;span class="s1"&gt;:content} &lt;br /&gt;&lt;a href="" name="l224"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;224  &lt;/span&gt;&lt;/a&gt;                               
   ) &lt;br /&gt;&lt;a href="" name="l225"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;225  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l226"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;226  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s1"&gt;len(links) , &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Used"&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l227"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;227  &lt;/span&gt;&lt;/a&gt;        page[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'links'&lt;/span&gt;&lt;span class="s1"&gt;] = links &lt;br /&gt;&lt;a href="" name="l228"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;228  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;return &lt;/span&gt;&lt;span class="s1"&gt;page &lt;br /&gt;&lt;a href="" name="l229"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;229  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l230"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;230  &lt;/span&gt;&lt;/a&gt;&lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;if &lt;/span&gt;&lt;span class="s1"&gt;__name__ == &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'__main__'&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l231"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;231  &lt;/span&gt;&lt;/a&gt;    url = &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: 
bold;"&gt;'http://theotherblog.com'&lt;/span&gt;&lt;span class="s1"&gt; &lt;br /&gt;&lt;a href="" name="l232"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;232  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Getting:"&lt;/span&gt;&lt;span class="s1"&gt;, url &lt;br /&gt;&lt;a href="" name="l233"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;233  &lt;/span&gt;&lt;/a&gt;    page = get( url, fetch_links=True ) &lt;br /&gt;&lt;a href="" name="l234"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;234  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s1"&gt;&lt;br /&gt;&lt;a href="" name="l235"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;235  &lt;/span&gt;&lt;/a&gt;    result = analyze( url ) &lt;br /&gt;&lt;a href="" name="l236"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;236  &lt;/span&gt;&lt;/a&gt;    neo_result ( result, url ) &lt;br /&gt;&lt;a href="" name="l237"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;237  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l238"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;238  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Getting %s links.." 
&lt;/span&gt;&lt;span class="s1"&gt;% str(len(page[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'links'&lt;/span&gt;&lt;span class="s1"&gt;])) &lt;br /&gt;&lt;a href="" name="l239"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;239  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;for &lt;/span&gt;&lt;span class="s1"&gt;link_dict &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;in &lt;/span&gt;&lt;span class="s1"&gt;page[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'links'&lt;/span&gt;&lt;span class="s1"&gt;]: &lt;br /&gt;&lt;a href="" name="l240"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;240  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Getting link:"&lt;/span&gt;&lt;span class="s1"&gt;, link_dict[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'url'&lt;/span&gt;&lt;span class="s1"&gt;] &lt;br /&gt;&lt;a href="" name="l241"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;241  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;try&lt;/span&gt;&lt;span class="s1"&gt;: &lt;br /&gt;&lt;a href="" name="l242"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;242  &lt;/span&gt;&lt;/a&gt;            result = analyze( link_dict[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'url'&lt;/span&gt;&lt;span class="s1"&gt;] ) &lt;br /&gt;&lt;a href="" name="l243"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;243  &lt;/span&gt;&lt;/a&gt;            neo_result ( result, 
link_dict[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'url'&lt;/span&gt;&lt;span class="s1"&gt;] ) &lt;br /&gt;&lt;a href="" name="l244"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;244  &lt;/span&gt;&lt;/a&gt;        &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;except &lt;/span&gt;&lt;span class="s1"&gt;Exception, err: &lt;br /&gt;&lt;a href="" name="l245"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;245  &lt;/span&gt;&lt;/a&gt;            &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print    &lt;/span&gt;&lt;span class="s1"&gt;link_dict[&lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;'url'&lt;/span&gt;&lt;span class="s1"&gt;], err &lt;br /&gt;&lt;a href="" name="l246"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;246  &lt;/span&gt;&lt;/a&gt; &lt;br /&gt;&lt;a href="" name="l247"&gt;&lt;span class="ln" style="color: black; font-style: normal; font-weight: normal;"&gt;247  &lt;/span&gt;&lt;/a&gt;    &lt;/span&gt;&lt;span class="s0" style="color: navy; font-weight: bold;"&gt;print &lt;/span&gt;&lt;span class="s3" style="color: green; font-weight: bold;"&gt;"Done!"&lt;/span&gt;&lt;span class="s1"&gt; &lt;/span&gt;&lt;/pre&gt;&lt;pre&gt;&lt;span class="s1"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;pre&gt;&lt;span class="s1"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;pre&gt;&lt;span class="s1"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;pre&gt;&lt;span class="s1"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/pre&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-914972506414185457?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://pppeoplepppowered.blogspot.com/feeds/914972506414185457/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/neo4j-python-crawler-open-calais-gephi.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/914972506414185457'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/914972506414185457'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/neo4j-python-crawler-open-calais-gephi.html' title='Neo4J + Python Crawler + Open Calais + Gephi'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2IeQft2KL-g/TFw0DBneTiI/AAAAAAAAAM4/LDhUxFsLECQ/s72-c/gephishot.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-3547393465527158331</id><published>2010-08-04T07:51:00.000-07:00</published><updated>2010-08-04T07:51:20.126-07:00</updated><title type='text'>Django and Neo4J</title><content type='html'>So, today I was tempted by the approach taken in&lt;a href="http://www.slideshare.net/thobe/django-and-neo4j-domain-modeling-that-kicks-ass"&gt; these slides about Neo4J and Django&lt;/a&gt;. 
My hope was (and still is) that, although the Neo4J implementation isn't complete (I imagine many of the concepts simply won't map easily from a graph database to a SQL one), I might at least be able to work in a more familiar way while I learn about neo4j.&lt;br /&gt;&lt;br /&gt;I started by grabbing the files from the &lt;a href="https://trac.neo4j.org/browser/components/neo4j.py/trunk/src/main/python/neo4j/model?rev=3957"&gt;model folder from here&lt;/a&gt;... adding...&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px; line-height: 14px;"&gt;&lt;span style="background-color: inherit; border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; color: black; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;NEO4J_RESOURCE_URI&amp;nbsp;=&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px; line-height: 14px;"&gt;&lt;span class="string" style="background-color: inherit; border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; color: blue; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;'/Users/tomsmith/neo4django_db'&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px; line-height: 14px;"&gt;&lt;span style="background-color: inherit; border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; color: black; 
margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Consolas, 'Courier New', Courier, mono, serif; font-size: 12px; line-height: 14px;"&gt;&lt;span style="background-color: inherit; border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; color: black; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;/span&gt;&lt;/span&gt;...to my settings.py file and then adding this code to a models.py file...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;from neo4j.model import django_model as models&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;class Actor(models.NodeModel):&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;name= models.Property(indexed=True)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;href = property(lambda self:('/actor/%s/' % self.node.id))&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;def __unicode__(self):&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;return self.name&lt;br 
/&gt;&lt;br /&gt;class Movie(models.NodeModel):&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;title = models.Property(indexed=True)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;year = models.Property()&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;href = property(lambda self:('/movies/%s/' % self.node.id))&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;actors = models.Relationship(Actor,type=models.Outgoing.ACTS_IN,related_name="acts_in",)&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;def title_length(self):&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;return len(self.title)&lt;br /&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;def __unicode__(self):&lt;br /&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt; &lt;/span&gt;return self.title&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;... and then from within &lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;python manage.py shell&lt;/span&gt;&lt;/span&gt; I could...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; import stuff.models as m&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie = m.Movie(title="Fear and Loathing in Las Vegas",year=1998)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, 
monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie.save()&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie.id&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;4L&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie = m.Movie(title="The Brothers Grimm",year=2005)&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie.save()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie.id&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;5L&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie = m.Movie(title="Twelve 
Monkeys",year=1995)&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie.save()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie = m.Movie(title="Memento",year=2000)&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie.save()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... now I can...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movies = m.Movie.objects.all()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; for movie in movies:print movie.id, movie, movie.href&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;...&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;7 Memento /movies/7/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span 
class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;6 Twelve Monkeys /movies/6/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;5 The Brothers Grimm /movies/5/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;4 Fear and Loathing in Las Vegas /movies/4/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;3 The Book of Eli /movies/3/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;1 10,000 years B.C. /movies/1/&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... which is fantastic! Don't you think?! The django_model is handling the db.transaction stuff. I can even do a ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; reload( m )&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... 
and it doesn't break (because the database is already connected) as my earlier code did. (I think I can even run my django instance AND be connected to the neo4j database at the same time.. phew!).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&amp;nbsp;What I think is fantastic about it is that using this approach I get to work with objects in a familiar way. And by that, I mean, having a schema-less database is all well and good, but almost the first thing I seem to find myself doing is creating &lt;b&gt;&lt;i&gt;similar types &lt;/i&gt;&lt;/b&gt;of objects. So being able to create classes (maybe even with subclasses) seems perfect for my needs ... in theory. In theory, I'd like to be able to define classes with properties but then on-the-fly maybe add extra properties to objects (without changing my class definition).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;AND I can add methods to my classes... or can I ?&amp;nbsp;&amp;nbsp;I added a simple function to my Movie class...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;def title_length(self):&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-tab-span" style="white-space: pre;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt; &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;return len(self.title)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;...and ran...&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" 
style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movies = m.Movie.objects.all()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; for movie in movies:print movie.id, movie, movie.href, movie.title_len()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;...and got...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AttributeError: 'Movie' object has no attribute 'title_len'&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... Boo! (That one is down to the call itself: the method defined above is title_length, not title_len.) At this point, after doing some looking at the Movie objects, I found ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; dir( m.Movie.objects )&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__slotnames__', '__str__', '__subclasshook__', '__weakref__', '_copy_to_model', '_inherited', '_insert', '_set_creation_counter', '_update', 'aggregate', 'all', 'annotate', 'complex_filter', 'contribute_to_class', 'count', 'create', 'creation_counter', 'dates', 'defer', 
'distinct', 'exclude', 'extra', 'filter', 'get', 'get_empty_query_set', 'get_or_create', 'get_query_set', 'in_bulk', 'iterator', 'latest', 'model', 'none', 'only', 'order_by', 'reverse', 'select_related', 'update', 'values', 'values_list']&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;...so I did a ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;gt;&amp;gt;&amp;gt;&amp;nbsp;movies = m.Movie.objects.all().order_by('year')&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;...and got ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Traceback (most recent call last):&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;File "&amp;lt;console&amp;gt;", line 1, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AttributeError: 'NodeQuerySet' object has no attribute 'order_by'&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;...and the same is true for .&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;get_or_create()&amp;nbsp;&lt;/span&gt;&lt;/span&gt;or .&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span 
class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;values()&lt;/span&gt;&lt;/span&gt; or many other methods.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... so (and this isn't a criticism) it seems that the Django model is a "&lt;a href="http://journal.thobe.org/2009/12/seamless-neo4j-integration-in-django.html?utm_source=feedburner&amp;amp;utm_medium=feed&amp;amp;utm_campaign=Feed:+thobe/wardrobe+(Wardrobe+strength)&amp;amp;utm_content=FeedBurner"&gt;work in progress&lt;/a&gt;" (the cat ate his source code) with some of the methods still to be completed. At this point I should probably start looking at the django_model source code and begin adding to the functionality it offers... except ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;I probably don't have the ability to create a database adapter, especially for one I don't understand&lt;/li&gt;&lt;li&gt;I don't know if shoe-horning neo4j into Django is a good idea. 
I'm not saying it isn't ( so far, it's tantalising ) I'm just saying I don't know.&lt;/li&gt;&lt;li&gt;When I find errors I'm not sure if it's me or not...&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;...for example, I tried...&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; actor = m.Actor(name="Joseph Melito")&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; actor.save()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; actor.id&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;10L&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; actor = m.Actor(name="Bruce Willis")&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; actor.save()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; actor = m.Actor(name="Jon 
Seda")&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; actor.save()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;...and all was fine and dandy and then I tried ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie = m.Movie.objects.get(title="Twelve Monkeys")&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;gt;&amp;gt;&amp;gt; movie.actors.add ( actor )&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... 
and got...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;Traceback (most recent call last):&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;File "&amp;lt;console&amp;gt;", line 1, in &amp;lt;module&amp;gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;File "/opt/local/lib/python2.5/site-packages/Neo4j.py-0.1_SNAPSHOT-py2.5.egg/neo4j/model/django_model/__init__.py", line 623, in __get__&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;return self._get_relationship(obj, self.__state_for(obj))&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;File "/opt/local/lib/python2.5/site-packages/Neo4j.py-0.1_SNAPSHOT-py2.5.egg/neo4j/model/django_model/__init__.py", line 708, in _get_relationship&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;states[self.name] = state = RelationshipInstance(self, obj)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp;File "/opt/local/lib/python2.5/site-packages/Neo4j.py-0.1_SNAPSHOT-py2.5.egg/neo4j/model/django_model/__init__.py", line 750, in __init__&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;self.__removed = pyneo.python.set() # contains relationships&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;AttributeError: 'module' object has no attribute 'set'&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... I wonder if Tobias has done more work on the Relationships class at all?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-3547393465527158331?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/3547393465527158331/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/django-and-neo4j.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/3547393465527158331'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/3547393465527158331'/><link 
rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/django-and-neo4j.html' title='Django and Neo4J'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-8650448419246068910</id><published>2010-08-03T08:12:00.000-07:00</published><updated>2010-08-03T08:13:03.949-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='gephi'/><category scheme='http://www.blogger.com/atom/ns#' term='visualisation'/><category scheme='http://www.blogger.com/atom/ns#' term='neo4j'/><title type='text'>Visualisation of a Neo4J database with Gephi</title><content type='html'>1. &amp;nbsp;Get the version of the Gephi app that can read neo4j databases (not the main one)&lt;br /&gt;bzr branch http://bazaar.launchpad.net/~bujacik/gephi/support-for-neo4j&lt;br /&gt;&lt;br /&gt;2. Get NetBeans, open the project you've just downloaded and build it as a Mac application (you'll find it in a folder called dist). As it happens, NetBeans is quite a lovely Python IDE, once you've added a few plugins. I added a Python plugin, a regular expression one and found I could pretty much use the Python debugger out of the box (more than I can say for Eclipse) and watch variables etc. Very nice.&lt;br /&gt;&lt;br /&gt;3. Run the code from yesterday and open the DB that gets created with Gephi... 
alter the layouts and you get something like this...&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/TFgvR9199AI/AAAAAAAAAMw/kuVSu2SVIDo/s1600/web_site_structure.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/TFgvR9199AI/AAAAAAAAAMw/kuVSu2SVIDo/s320/web_site_structure.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;... which was a small crawl of a small site (showing IDs).&lt;br /&gt;&lt;br /&gt;I have absolutely no idea what this all means, but isn't it lovely to look at? I think I'm onto something.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-8650448419246068910?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/8650448419246068910/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/1.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8650448419246068910'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8650448419246068910'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/1.html' title='Visualisation of a Neo4J database with Gephi'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_2IeQft2KL-g/TFgvR9199AI/AAAAAAAAAMw/kuVSu2SVIDo/s72-c/web_site_structure.png' 
height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-6500425058254095320</id><published>2010-08-03T02:18:00.000-07:00</published><updated>2010-08-03T02:21:49.161-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='neo4j'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><title type='text'>Working Neo4J / Python code</title><content type='html'>&lt;style type="text/css"&gt;td.linenos { background-color: #f0f0f0; padding-right: 10px; }span.lineno { background-color: #f0f0f0; padding: 0 5px 0 5px; }pre { line-height: 125%; }body .hll { background-color: #ffffcc }body  { background: #f8f8f8; }body .c { color: #408080; font-style: italic } /* Comment */body .err { border: 1px solid #FF0000 } /* Error */body .k { color: #008000; font-weight: bold } /* Keyword */body .o { color: #666666 } /* Operator */body .cm { color: #408080; font-style: italic } /* Comment.Multiline */body .cp { color: #BC7A00 } /* Comment.Preproc */body .c1 { color: #408080; font-style: italic } /* Comment.Single */body .cs { color: #408080; font-style: italic } /* Comment.Special */body .gd { color: #A00000 } /* Generic.Deleted */body .ge { font-style: italic } /* Generic.Emph */body .gr { color: #FF0000 } /* Generic.Error */body .gh { color: #000080; font-weight: bold } /* Generic.Heading */body .gi { color: #00A000 } /* Generic.Inserted */body .go { color: #808080 } /* Generic.Output */body .gp { color: #000080; font-weight: bold } /* Generic.Prompt */body .gs { font-weight: bold } /* Generic.Strong */body .gu { color: #800080; font-weight: bold } /* Generic.Subheading */body .gt { color: #0040D0 } /* Generic.Traceback */body .kc { color: #008000; font-weight: bold } /* Keyword.Constant */body .kd { color: #008000; font-weight: bold } /* Keyword.Declaration */body .kn { color: #008000; font-weight: bold } /* Keyword.Namespace */body .kp { color: #008000 } /* Keyword.Pseudo */body .kr { color: 
#008000; font-weight: bold } /* Keyword.Reserved */body .kt { color: #B00040 } /* Keyword.Type */body .m { color: #666666 } /* Literal.Number */body .s { color: #BA2121 } /* Literal.String */body .na { color: #7D9029 } /* Name.Attribute */body .nb { color: #008000 } /* Name.Builtin */body .nc { color: #0000FF; font-weight: bold } /* Name.Class */body .no { color: #880000 } /* Name.Constant */body .nd { color: #AA22FF } /* Name.Decorator */body .ni { color: #999999; font-weight: bold } /* Name.Entity */body .ne { color: #D2413A; font-weight: bold } /* Name.Exception */body .nf { color: #0000FF } /* Name.Function */body .nl { color: #A0A000 } /* Name.Label */body .nn { color: #0000FF; font-weight: bold } /* Name.Namespace */body .nt { color: #008000; font-weight: bold } /* Name.Tag */body .nv { color: #19177C } /* Name.Variable */body .ow { color: #AA22FF; font-weight: bold } /* Operator.Word */body .w { color: #bbbbbb } /* Text.Whitespace */body .mf { color: #666666 } /* Literal.Number.Float */body .mh { color: #666666 } /* Literal.Number.Hex */body .mi { color: #666666 } /* Literal.Number.Integer */body .mo { color: #666666 } /* Literal.Number.Oct */body .sb { color: #BA2121 } /* Literal.String.Backtick */body .sc { color: #BA2121 } /* Literal.String.Char */body .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */body .s2 { color: #BA2121 } /* Literal.String.Double */body .se { color: #BB6622; font-weight: bold } /* Literal.String.Escape */body .sh { color: #BA2121 } /* Literal.String.Heredoc */body .si { color: #BB6688; font-weight: bold } /* Literal.String.Interpol */body .sx { color: #008000 } /* Literal.String.Other */body .sr { color: #BB6688 } /* Literal.String.Regex */body .s1 { color: #BA2121 } /* Literal.String.Single */body .ss { color: #19177C } /* Literal.String.Symbol */body .bp { color: #008000 } /* Name.Builtin.Pseudo */body .vc { color: #19177C } /* Name.Variable.Class */body .vg { color: #19177C } /* Name.Variable.Global */body .vi { 
color: #19177C } /* Name.Variable.Instance */body .il { color: #666666 } /* Literal.Number.Integer.Long */  &lt;/style&gt;&lt;br /&gt;So, this sort of works. The team at neo4j have made the iteration work for me... Usage:&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;gt;&amp;gt;python2.6 neo.py &amp;nbsp;http://www.wherever.com&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;... and what it does is go to that web site, grab a few links and follow them, adding the pages and their data to a neo4j graph database.&lt;br /&gt;&lt;br /&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;neo4j&lt;/span&gt; &lt;span class="c"&gt;#See http://components.neo4j.org/neo4j.py/&lt;/span&gt;&lt;br /&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;traceback&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;urlparse&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;random&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nn"&gt;time&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setdefaulttimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;#seconds&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="sd"&gt;'''This is my attempt to begin to make proper classes and functions to talk to the neo4j database, it's not meant 
to be fancy or clever, I just want to be able to CRUD ok-ish'''&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;neo4j&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GraphDatabase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"crawler_example3_db"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="c"&gt;#How do we delete an index... which might have old nodes in? This doesn't work...any ideas... how do I empty an index, should I? Does deleting a node delete its reference from the index?&lt;/span&gt;&lt;br /&gt;  &lt;span class="sd"&gt;'''try:&lt;/span&gt;&lt;br /&gt;&lt;span class="sd"&gt;   for ref in db.index('pages'):&lt;/span&gt;&lt;br /&gt;&lt;span class="sd"&gt;    ref.delete()&lt;/span&gt;&lt;br /&gt;&lt;span class="sd"&gt;  except Exception, err:&lt;/span&gt;&lt;br /&gt;&lt;span class="sd"&gt;   print err&lt;/span&gt;&lt;br /&gt;&lt;span class="sd"&gt;  '''&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"pages"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;# create an index called 'pages'&lt;/span&gt;&lt;br /&gt;  &lt;span class="c"&gt;#print "Index created"&lt;/span&gt;&lt;br /&gt; &lt;span 
class="c"&gt;#return db, pages&lt;/span&gt;&lt;br /&gt;&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#print "db:", db&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="c"&gt;#############     UTILITY FUNCTIONS         #############&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;''&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="s"&gt;"I know I should use BeautifulSoup or lxml, but for simplicity it's a regex..ha"&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;found_links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;' href="?([^\s^"&lt;/span&gt;&lt;span class="se"&gt;\'&lt;/span&gt;&lt;span class="s"&gt;#]+)'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;br /&gt;  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;found_links&lt;/span&gt;&lt;span 
class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="c"&gt;#print link&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urlparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urljoin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/.."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;#fix relative links&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"'"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;'http://'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="c"&gt;#avoid mailtos&lt;/span&gt;&lt;br /&gt;    &lt;span class="n"&gt;links&lt;/span&gt;&lt;span 
class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt;  &lt;br /&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;br /&gt; &lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;wget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urllib2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;urlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="s"&gt;'text/html'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'content-type'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;unicode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;  &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span 
class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'ignore'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Error wgetting"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;br /&gt;  &lt;span class="c"&gt;#print traceback.print_exc()&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delete_all_pages&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="nb"&gt;id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;br /&gt;    &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;    
&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"deleted"&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;pass&lt;/span&gt; &lt;span class="c"&gt;# fails on last iteration?&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"All pages deleted!"&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;follow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="sd"&gt;'''this creates a page node then adds it to the "pages" index, if it's there already it'll get it'''&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="c"&gt;# Get the actual page data from the web...&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;wget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;#get the actual html&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span 
class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Boo!"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;data_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;page_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c"&gt;# does this page exist yet?&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;page_node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;page_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;data_len&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;data_len&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span 
class="c"&gt;# create a page&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page_node&lt;/span&gt; &lt;span class="c"&gt;# Add to index&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Created:"&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Exists already:"&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;br /&gt;   &lt;br /&gt;  &lt;span class="c"&gt;#Now create a page for every link in the page&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;follow&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;following links"&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="s"&gt;"found"&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;creating:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="n"&gt;link&lt;/span&gt;&lt;br /&gt;    &lt;span class="n"&gt;create_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;follow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;page_node&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="s"&gt;'given a url, return a node obj'&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span 
class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Node is none! Creating..."&lt;/span&gt;&lt;br /&gt;    &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;follow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt;   &lt;br /&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;br /&gt; &lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;list_all_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;# Just iterate through the pages to make sure the data is in 
there...&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Listing all pages..."&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;br /&gt;  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;br /&gt;    &lt;span class="c"&gt;#print node['data'][:150], "..."&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;print&lt;/span&gt; &lt;br /&gt;   &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"...done listing!"&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delete_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="s"&gt;'delete the node stored for a url and remove it from the "pages" index'&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;node&lt;/span&gt; 
&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Node with id="&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"deleted"&lt;/span&gt;&lt;br /&gt;    &lt;span class="c"&gt;#delete from index&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Node probably not found:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nb"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;# let's have a look&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_links_between_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span 
class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;# Go through every stored page and record links to the other pages it references&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Linking all pages..."&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;br /&gt;   &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;     &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'data_len'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt;     &lt;span class="n"&gt;links&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_links&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'data'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;br /&gt;     &lt;span class="k"&gt;for&lt;/span&gt; &lt;span 
class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;links&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;      &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;       &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;br /&gt;       &lt;span class="c"&gt;#look to see if a node with that url exists, if it doesn't it's created...&lt;/span&gt;&lt;br /&gt;       &lt;span class="n"&gt;other_node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;       &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;other_node&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;        &lt;span class="k"&gt;pass&lt;/span&gt;&lt;br /&gt;       &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;        &lt;span class="n"&gt;other_node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;links_to&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;      &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;       &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt;     &lt;span class="k"&gt;print&lt;/span&gt;&lt;br /&gt;    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;     &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br 
/&gt;    &lt;span class="k"&gt;print&lt;/span&gt; &lt;br /&gt;  &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;pass&lt;/span&gt; &lt;span class="c"&gt;# fails on last iteration?&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Backlink&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;neo4j&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Traversal&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;types&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;neo4j&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Outgoing&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;links_to&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;neo4j&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DEPTH_FIRST&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;stop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;neo4j&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;STOP_AT_END_OF_GRAPH&lt;/span&gt;&lt;br /&gt;  &lt;br /&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;isReturnable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span 
class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;is_start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pos&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_relationship&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;'links_to'&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;'__main__'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt; &lt;span class="c"&gt;#### LET'S GET STARTED ######&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;IndexError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"No url passed, using 'http://diveintopython.org/'"&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;'http://diveintopython.org/'&lt;/span&gt;&lt;br /&gt;  &lt;br /&gt; 
&lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Starting:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="c"&gt;######## DELETE ALL THE PAGES WE HAVE SO FAR ############&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;# Avoiding this because of ...&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#jpype._jexception.RuntimeExceptionPyRaisable: org.neo4j.graphdb.NotFoundException: Node[0] &lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#... errors later...&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#print "Deleting existing pages..."&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#delete_all_pages( ) #we may want to add new data to each page... forget this for now&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="c"&gt;########        ADD SOME ACTUAL PAGES        #################&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#print "Creating some pages "&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="n"&gt;create_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;  &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;follow&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;#Also fetches some pages linked from this page&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="c"&gt;########    NOW GET SOME NODES OUT     ############### &lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Has our data made it to the database? 
Let's see..."&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;# Do some fishing for nodes...&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;  &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_one&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;get one:'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"... 
oh yes!"&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\t&lt;/span&gt;&lt;span class="s"&gt;id:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"url:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'url'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="s"&gt;"data-length:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;'data_len'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;br /&gt;   &lt;br /&gt;   &lt;span class="c"&gt;#This should fail&lt;/span&gt;&lt;br /&gt;   &lt;span class="c"&gt;#print "This SHOULD fail.."&lt;/span&gt;&lt;br /&gt;   &lt;span class="c"&gt;#node = get_one( 'http://www.pythonware.com/daily/' )&lt;/span&gt;&lt;br /&gt;   &lt;span class="c"&gt;#print 'get one:' + str( node ) &lt;/span&gt;&lt;br /&gt;   &lt;span class="c"&gt;#print "id:", node.id , "url:", node['url'], "data length:", node['data_len']&lt;/span&gt;&lt;br /&gt;   &lt;span class="c"&gt;#print &lt;/span&gt;&lt;br /&gt;   &lt;br /&gt;  &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Probably not a node with that URL"&lt;/span&gt;&lt;br /&gt;   &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="c"&gt;#########  TRY 
TO ITERATE ALL PAGES FETCHED ################&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;list_all_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;  &lt;span class="p"&gt;)&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="c"&gt;#########    TRY TO DELETE ONE ################&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#delete_one( url ) &lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;# Now let's see if it has gone&lt;/span&gt;&lt;br /&gt; &lt;span class="c"&gt;#list_all_pages( ) #or maybe later&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="c"&gt;######### LET'S LOOK FOR RELATIONSHIPS BETWEEN PAGES ################&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"Doing linking pages..."&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;find_links_between_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt; &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;#goes to every page, looking for other pages in the database&lt;/span&gt;&lt;br /&gt; &lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt;&lt;br /&gt; &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="s"&gt;"shutting down, saving all the changes, etc"&lt;/span&gt;&lt;br /&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shutdown&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;br /&gt;&lt;/pre&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-6500425058254095320?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/6500425058254095320/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/working-neo4j-python-code.html#comment-form' title='0 
Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6500425058254095320'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6500425058254095320'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/working-neo4j-python-code.html' title='Working Neo4J / Python code'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-4005826092727806353</id><published>2010-08-02T08:56:00.001-07:00</published><updated>2010-08-02T08:56:02.436-07:00</updated><title type='text'>Network slides...</title><content type='html'>...to read later...&lt;br /&gt;&lt;br /&gt;&lt;div style="width:425px" id="__ss_4685485"&gt;&lt;strong style="display:block;margin:12px 0 4px"&gt;&lt;a href="http://www.slideshare.net/slidarko/a-perspective-on-graph-theory-and-network-science" title="A Perspective on Graph Theory and Network Science"&gt;A Perspective on Graph Theory and Network Science&lt;/a&gt;&lt;/strong&gt;&lt;object id="__sse4685485" width="425" height="355"&gt;&lt;param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=network-graph-sf2010-100705130759-phpapp02&amp;stripped_title=a-perspective-on-graph-theory-and-network-science" /&gt;&lt;param name="allowFullScreen" value="true"/&gt;&lt;param name="allowScriptAccess" value="always"/&gt;&lt;embed name="__sse4685485" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=network-graph-sf2010-100705130759-phpapp02&amp;stripped_title=a-perspective-on-graph-theory-and-network-science" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" 
height="355"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div style="padding:5px 0 12px"&gt;View more &lt;a href="http://www.slideshare.net/"&gt;presentations&lt;/a&gt; from &lt;a href="http://www.slideshare.net/slidarko"&gt;Marko Rodriguez&lt;/a&gt;.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-4005826092727806353?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/4005826092727806353/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/network-slides.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4005826092727806353'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4005826092727806353'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/network-slides.html' title='Network slides...'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-6531748766341832835</id><published>2010-08-02T08:17:00.000-07:00</published><updated>2010-08-02T08:17:39.294-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='neo4j'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='crawler'/><title type='text'>Python and Neo4J - Could this be the Funky Data Model?</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, 
sans-serif;"&gt;For many years now, I've been chasing the &lt;i&gt;&lt;a href="http://www.google.co.uk/search?sourceid=chrome&amp;amp;ie=UTF-8&amp;amp;q=%22funky+data+model%22"&gt;funky data model&lt;/a&gt;&lt;/i&gt;, a database where you can change your mind about the structure of the data on the fly, without losing all chance of ever getting the data out again. &lt;a href="http://components.neo4j.org/neo4j.py/"&gt;Neo4J&lt;/a&gt; promises that way of working.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;So, before I get all fancy, I'm trying to make a &lt;b&gt;painfully simple&lt;/b&gt; Python crawler that will go to a URL, find a few pages linked from that URL, add them to the database and then create relationships between the pages based on their links. Simple. Ahem.&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;This code doesn't quite work at the moment... 
but it's almost there....&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: x-small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;import&amp;nbsp;neo4j&amp;nbsp;#See&amp;nbsp;http://components.neo4j.org/neo4j.py/&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;import&amp;nbsp;urllib2,&amp;nbsp;traceback,&amp;nbsp;re,&amp;nbsp;urlparse,&amp;nbsp;socket,&amp;nbsp;sys&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;socket.setdefaulttimeout(4)&amp;nbsp;#seconds&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;'''This&amp;nbsp;is&amp;nbsp;my&amp;nbsp;attempt&amp;nbsp;to&amp;nbsp;begin&amp;nbsp;to&amp;nbsp;make&amp;nbsp;proper&amp;nbsp;classes&amp;nbsp;and&amp;nbsp;functions&amp;nbsp;to&amp;nbsp;talk&amp;nbsp;to&amp;nbsp;the&amp;nbsp;neo4j&amp;nbsp;database,&amp;nbsp;it's&amp;nbsp;not&amp;nbsp;meant&amp;nbsp;to&amp;nbsp;be&amp;nbsp;fancy'''&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;setup():&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;########&amp;nbsp;&amp;nbsp;DO&amp;nbsp;SOME&amp;nbsp;SETUP&amp;nbsp;STUFF&amp;nbsp;############&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;db&amp;nbsp;=&amp;nbsp;neo4j.GraphDatabase("crawler_example_db")&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#&amp;nbsp;How&amp;nbsp;do&amp;nbsp;we&amp;nbsp;empty&amp;nbsp;an&amp;nbsp;index&amp;nbsp;that&amp;nbsp;might&amp;nbsp;still&amp;nbsp;have&amp;nbsp;old&amp;nbsp;nodes&amp;nbsp;in&amp;nbsp;it?&amp;nbsp;This&amp;nbsp;doesn't&amp;nbsp;work...&amp;nbsp;any&amp;nbsp;ideas?&amp;nbsp;Should&amp;nbsp;I&amp;nbsp;empty&amp;nbsp;it&amp;nbsp;at&amp;nbsp;all?&amp;nbsp;Does&amp;nbsp;deleting&amp;nbsp;a&amp;nbsp;node&amp;nbsp;delete&amp;nbsp;its&amp;nbsp;reference&amp;nbsp;from&amp;nbsp;the&amp;nbsp;index?&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for&amp;nbsp;ref&amp;nbsp;in&amp;nbsp;db.index('pages'):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;ref.delete()&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;pages&amp;nbsp;=&amp;nbsp;db.index("pages",&amp;nbsp;create=True)&amp;nbsp;#&amp;nbsp;create&amp;nbsp;an&amp;nbsp;index&amp;nbsp;called&amp;nbsp;'pages'&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Index&amp;nbsp;created"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;db,&amp;nbsp;pages&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;#############&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;UTILITY&amp;nbsp;FUNCTIONS&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#############&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" 
style="font-size: small;"&gt;def&amp;nbsp;get_links(&amp;nbsp;data,&amp;nbsp;url=''):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'I&amp;nbsp;know&amp;nbsp;I&amp;nbsp;should&amp;nbsp;use&amp;nbsp;BeautifulSoup&amp;nbsp;or&amp;nbsp;lxml'&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;links&amp;nbsp;=&amp;nbsp;[&amp;nbsp;]&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;found_links&amp;nbsp;=&amp;nbsp;re.findall('&amp;nbsp;href="?([^\s^"\'#]+)',&amp;nbsp;data)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for&amp;nbsp;link&amp;nbsp;in&amp;nbsp;found_links:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;link&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;link&amp;nbsp;=&amp;nbsp;urlparse.urljoin(url,&amp;nbsp;link)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;link&amp;nbsp;=&amp;nbsp;link.replace("/..",&amp;nbsp;"")&amp;nbsp;#fix&amp;nbsp;relative&amp;nbsp;links&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;link&amp;nbsp;=&amp;nbsp;link.replace("'",&amp;nbsp;"")&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;link&amp;nbsp;not&amp;nbsp;in&amp;nbsp;links&amp;nbsp;and&amp;nbsp;link[:7]&amp;nbsp;==&amp;nbsp;'http://'&amp;nbsp;:#avoid&amp;nbsp;mailtos&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;links.append(&amp;nbsp;link&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;links&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;wget(url):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;handle&amp;nbsp;=&amp;nbsp;urllib2.urlopen(url)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;'text/html'&amp;nbsp;in&amp;nbsp;handle.headers['content-type']:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;data&amp;nbsp;=&amp;nbsp;unicode(handle.read(),&amp;nbsp;&amp;nbsp;errors='ignore')&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;data&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier 
New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;e:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Error&amp;nbsp;wgetting",&amp;nbsp;e,&amp;nbsp;url&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;traceback.print_exc()&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;None&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;delete_all_pages():&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for&amp;nbsp;node&amp;nbsp;in&amp;nbsp;db.node:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;id&amp;nbsp;&amp;nbsp;=&amp;nbsp;node.id&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;node.delete(&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;id,&amp;nbsp;"deleted"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;pass&amp;nbsp;#&amp;nbsp;fails&amp;nbsp;on&amp;nbsp;last&amp;nbsp;iteration?&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"All&amp;nbsp;pages&amp;nbsp;deleted!"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;create_page(db,&amp;nbsp;pages,&amp;nbsp;url,&amp;nbsp;code=200&amp;nbsp;,&amp;nbsp;follow=True):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'''this&amp;nbsp;creates&amp;nbsp;a&amp;nbsp;page&amp;nbsp;node&amp;nbsp;then&amp;nbsp;adds&amp;nbsp;it&amp;nbsp;to&amp;nbsp;the&amp;nbsp;"pages"&amp;nbsp;index,&amp;nbsp;if&amp;nbsp;it's&amp;nbsp;there&amp;nbsp;already&amp;nbsp;it'll&amp;nbsp;get&amp;nbsp;it'''&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 
'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#&amp;nbsp;Get&amp;nbsp;the&amp;nbsp;actual&amp;nbsp;page&amp;nbsp;data&amp;nbsp;from&amp;nbsp;the&amp;nbsp;web...&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;data&amp;nbsp;=&amp;nbsp;wget(&amp;nbsp;url&amp;nbsp;)&amp;nbsp;#get&amp;nbsp;the&amp;nbsp;actual&amp;nbsp;html&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;not&amp;nbsp;data:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;None&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" 
style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;data_len&amp;nbsp;=&amp;nbsp;len(data)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;page_node&amp;nbsp;=&amp;nbsp;pages[url]&amp;nbsp;#&amp;nbsp;does&amp;nbsp;this&amp;nbsp;page&amp;nbsp;exist&amp;nbsp;yet?&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;not&amp;nbsp;page_node:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;page_node&amp;nbsp;=&amp;nbsp;db.node(url=url,&amp;nbsp;code=code,&amp;nbsp;data=data,&amp;nbsp;data_len=data_len)&amp;nbsp;#&amp;nbsp;create&amp;nbsp;a&amp;nbsp;page&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;pages[&amp;nbsp;url&amp;nbsp;]&amp;nbsp;=&amp;nbsp;page_node&amp;nbsp;#&amp;nbsp;Add&amp;nbsp;to&amp;nbsp;index&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Created:"&amp;nbsp;,&amp;nbsp;url&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Exists&amp;nbsp;already:"&amp;nbsp;,&amp;nbsp;url&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#Now&amp;nbsp;create&amp;nbsp;a&amp;nbsp;page&amp;nbsp;for&amp;nbsp;every&amp;nbsp;link&amp;nbsp;in&amp;nbsp;the&amp;nbsp;page&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;follow:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"\tfollowing&amp;nbsp;links"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;i&amp;nbsp;=&amp;nbsp;0&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;links&amp;nbsp;=&amp;nbsp;get_links(data,&amp;nbsp;url)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;len(&amp;nbsp;links&amp;nbsp;)&amp;nbsp;,&amp;nbsp;"found"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;len(links)&amp;nbsp;&amp;gt;&amp;nbsp;20:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;links&amp;nbsp;=&amp;nbsp;links[:19]&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for&amp;nbsp;link&amp;nbsp;in&amp;nbsp;links:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"\tcreating:",&amp;nbsp;&amp;nbsp;link&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;create_page(db,&amp;nbsp;pages,&amp;nbsp;link,&amp;nbsp;follow=False)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;page_node&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;get_one(url):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;'given&amp;nbsp;a&amp;nbsp;url&amp;nbsp;with&amp;nbsp;return&amp;nbsp;a&amp;nbsp;node&amp;nbsp;obj'&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;node&amp;nbsp;=&amp;nbsp;pages&amp;nbsp;[&amp;nbsp;url&amp;nbsp;]&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;node&amp;nbsp;==&amp;nbsp;None:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Node&amp;nbsp;is&amp;nbsp;none!"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;node&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;return&amp;nbsp;None&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;list_all_pages(db):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#&amp;nbsp;Just&amp;nbsp;iterate&amp;nbsp;through&amp;nbsp;the&amp;nbsp;pages&amp;nbsp;to&amp;nbsp;make&amp;nbsp;sure&amp;nbsp;the&amp;nbsp;data&amp;nbsp;is&amp;nbsp;in&amp;nbsp;there...&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Listing&amp;nbsp;all&amp;nbsp;pages..."&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for&amp;nbsp;node&amp;nbsp;in&amp;nbsp;db.node:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;node&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;node['data'][:150],&amp;nbsp;"..."&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"...done&amp;nbsp;listing!"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, 
monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;delete_one(&amp;nbsp;url&amp;nbsp;):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;''&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;node&amp;nbsp;=&amp;nbsp;pages[&amp;nbsp;url&amp;nbsp;]&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;node:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;node.delete()&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Node&amp;nbsp;with&amp;nbsp;id=",&amp;nbsp;node.id,&amp;nbsp;"deleted"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Node&amp;nbsp;probably&amp;nbsp;not&amp;nbsp;found:",&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;dir(node)&amp;nbsp;#&amp;nbsp;let's&amp;nbsp;have&amp;nbsp;a&amp;nbsp;look&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def&amp;nbsp;find_links_between_pages(db):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#&amp;nbsp;For&amp;nbsp;every&amp;nbsp;page,&amp;nbsp;create&amp;nbsp;relationships&amp;nbsp;between&amp;nbsp;it&amp;nbsp;and&amp;nbsp;any&amp;nbsp;linked&amp;nbsp;pages&amp;nbsp;already&amp;nbsp;in&amp;nbsp;the&amp;nbsp;index&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', 
Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Linking&amp;nbsp;all&amp;nbsp;pages..."&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for&amp;nbsp;node&amp;nbsp;in&amp;nbsp;db.node:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;str(node),&amp;nbsp;node['url'],&amp;nbsp;node['data_len']&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;links&amp;nbsp;=&amp;nbsp;get_links(&amp;nbsp;node['data'],&amp;nbsp;node['url'])&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;for&amp;nbsp;link&amp;nbsp;in&amp;nbsp;links:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#look&amp;nbsp;to&amp;nbsp;see&amp;nbsp;if&amp;nbsp;a&amp;nbsp;node&amp;nbsp;with&amp;nbsp;that&amp;nbsp;url&amp;nbsp;exists&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;other_node&amp;nbsp;=&amp;nbsp;get_one(&amp;nbsp;link&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;if&amp;nbsp;not&amp;nbsp;other_node:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;pass&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;else:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;other_node.links_to(&amp;nbsp;node&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span 
class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;pass&amp;nbsp;#&amp;nbsp;fails&amp;nbsp;on&amp;nbsp;last&amp;nbsp;iteration?&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 
'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;if&amp;nbsp;__name__&amp;nbsp;==&amp;nbsp;'__main__':&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;####&amp;nbsp;LET'S&amp;nbsp;GET&amp;nbsp;STARTED&amp;nbsp;######&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;db,&amp;nbsp;pages&amp;nbsp;=&amp;nbsp;setup(&amp;nbsp;)&amp;nbsp;#&amp;nbsp;I&amp;nbsp;can't&amp;nbsp;connect&amp;nbsp;on&amp;nbsp;load,&amp;nbsp;because&amp;nbsp;then&amp;nbsp;I&amp;nbsp;can't&amp;nbsp;reload(this_module)&amp;nbsp;because&amp;nbsp;it&amp;nbsp;is&amp;nbsp;already&amp;nbsp;connected&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;url&amp;nbsp;=&amp;nbsp;sys.argv[1]&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;IndexError:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"No&amp;nbsp;url&amp;nbsp;passed,&amp;nbsp;using&amp;nbsp;'http://diveintopython.org/'"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;url&amp;nbsp;=&amp;nbsp;'http://diveintopython.org/'&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Starting:",&amp;nbsp;url&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;########&amp;nbsp;DELETE&amp;nbsp;ALL&amp;nbsp;THE&amp;nbsp;PAGES&amp;nbsp;WE&amp;nbsp;HAVE&amp;nbsp;SO&amp;nbsp;FAR&amp;nbsp;############&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#&amp;nbsp;Avoiding&amp;nbsp;this&amp;nbsp;because&amp;nbsp;of&amp;nbsp;...&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#jpype._jexception.RuntimeExceptionPyRaisable:&amp;nbsp;org.neo4j.graphdb.NotFoundException:&amp;nbsp;Node[0]&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#...&amp;nbsp;errors&amp;nbsp;later...&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;"Deleting&amp;nbsp;existing&amp;nbsp;pages..."&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#delete_all_pages(&amp;nbsp;)&amp;nbsp;#we&amp;nbsp;may&amp;nbsp;want&amp;nbsp;to&amp;nbsp;add&amp;nbsp;new&amp;nbsp;data&amp;nbsp;to&amp;nbsp;each&amp;nbsp;page...&amp;nbsp;forget&amp;nbsp;this&amp;nbsp;for&amp;nbsp;now&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;########&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;ADD&amp;nbsp;SOME&amp;nbsp;ACTUAL&amp;nbsp;PAGES&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#################&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;"Creating&amp;nbsp;some&amp;nbsp;pages&amp;nbsp;"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;create_page(&amp;nbsp;db,&amp;nbsp;pages,&amp;nbsp;url,&amp;nbsp;200&amp;nbsp;)&amp;nbsp;#Also&amp;nbsp;fetches&amp;nbsp;some&amp;nbsp;pages&amp;nbsp;linked&amp;nbsp;from&amp;nbsp;this&amp;nbsp;page&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;########&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;NOW&amp;nbsp;GET&amp;nbsp;SOME&amp;nbsp;NODES&amp;nbsp;OUT&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;###############&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Has&amp;nbsp;our&amp;nbsp;data&amp;nbsp;made&amp;nbsp;it&amp;nbsp;to&amp;nbsp;the&amp;nbsp;database?&amp;nbsp;Let's&amp;nbsp;see..."&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#&amp;nbsp;Do&amp;nbsp;some&amp;nbsp;fishing&amp;nbsp;for&amp;nbsp;nodes...&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;with&amp;nbsp;db.transaction:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;node&amp;nbsp;=&amp;nbsp;get_one(&amp;nbsp;url&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;'\tget&amp;nbsp;one:'&amp;nbsp;+&amp;nbsp;str(&amp;nbsp;node&amp;nbsp;)&amp;nbsp;,&amp;nbsp;"...&amp;nbsp;oh&amp;nbsp;yes!"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"\tid:",&amp;nbsp;node.id&amp;nbsp;,&amp;nbsp;"url:",&amp;nbsp;node['url'],&amp;nbsp;"data-length:",&amp;nbsp;node['data_len']&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#This&amp;nbsp;should&amp;nbsp;fail&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;"This&amp;nbsp;SHOULD&amp;nbsp;fail.."&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#node&amp;nbsp;=&amp;nbsp;get_one(&amp;nbsp;'http://www.pythonware.com/daily/'&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;'get&amp;nbsp;one:'&amp;nbsp;+&amp;nbsp;str(&amp;nbsp;node&amp;nbsp;)&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;"id:",&amp;nbsp;node.id&amp;nbsp;,&amp;nbsp;"url:",&amp;nbsp;node['url'],&amp;nbsp;"data&amp;nbsp;length:",&amp;nbsp;node['data_len']&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#print&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 
'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;except&amp;nbsp;Exception,&amp;nbsp;err:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Probably&amp;nbsp;not&amp;nbsp;a&amp;nbsp;node&amp;nbsp;with&amp;nbsp;that&amp;nbsp;URL"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;err&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;db.shutdown(&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;db,&amp;nbsp;pages&amp;nbsp;=&amp;nbsp;setup(&amp;nbsp;)&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#########&amp;nbsp;&amp;nbsp;TRY&amp;nbsp;TO&amp;nbsp;ITERATE&amp;nbsp;ALL&amp;nbsp;PAGES&amp;nbsp;FETCHED&amp;nbsp;################&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: 
small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;list_all_pages(&amp;nbsp;db&amp;nbsp;)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#########&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;TRY&amp;nbsp;TO&amp;nbsp;DELETE&amp;nbsp;ONE&amp;nbsp;################&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#delete_one(&amp;nbsp;url&amp;nbsp;)&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#&amp;nbsp;Now&amp;nbsp;let's&amp;nbsp;see&amp;nbsp;if&amp;nbsp;it&amp;nbsp;has&amp;nbsp;gone&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#list_all_pages(&amp;nbsp;)&amp;nbsp;#or&amp;nbsp;maybe&amp;nbsp;later&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" 
style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;#########&amp;nbsp;LET'S&amp;nbsp;LOOK&amp;nbsp;FOR&amp;nbsp;RELATIONSHIPS&amp;nbsp;BETWEEN&amp;nbsp;PAGES&amp;nbsp;################&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"Doing&amp;nbsp;linking&amp;nbsp;pages..."&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;find_links_between_pages(&amp;nbsp;)&amp;nbsp;#goes&amp;nbsp;to&amp;nbsp;every&amp;nbsp;page,&amp;nbsp;looking&amp;nbsp;for&amp;nbsp;other&amp;nbsp;pages&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;print&amp;nbsp;"shutting&amp;nbsp;down,&amp;nbsp;saving&amp;nbsp;all&amp;nbsp;the&amp;nbsp;changes,&amp;nbsp;etc"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span 
class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;db.shutdown()&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-6531748766341832835?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/6531748766341832835/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/python-and-neo4j-could-this-be-funky.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6531748766341832835'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6531748766341832835'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/08/python-and-neo4j-could-this-be-funky.html' title='Python and Neo4J - Could this be the Funky Data Model?'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-8742727252927701317</id><published>2010-07-13T08:40:00.000-07:00</published><updated>2010-08-02T08:47:18.129-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='delicious'/><category scheme='http://www.blogger.com/atom/ns#' term='api'/><category scheme='http://www.blogger.com/atom/ns#' term='twitter'/><category scheme='http://www.blogger.com/atom/ns#' term='google'/><title 
type='text'>Frustrating APIs</title><content type='html'>It's been a frustrating week.&lt;br /&gt;&lt;br /&gt;At first I wanted to work with the Twitter API. I found that OAuth is a bit of a pig to work with, and potentially not what I want. I created an HTML-scraping version of the API that might just get me the information I need simply, but it wasn't long before Twitter realised what was up and blocked my IP.&lt;br /&gt;&lt;br /&gt;Then I wanted to work with the Delicious API. I found their API is a bit of a pig to work with, and potentially not what I want. I created an HTML-scraping version of the API that might just get me the information I need simply, but it wasn't long before Yahoo realised what was up and blocked my IP.&lt;br /&gt;&lt;br /&gt;Next I wanted to work with the Google Search API. I found their API is a bit of a pig to work with, and potentially not what I want. I created an HTML-scraping version of the API that might just get me the information I need simply, but it wasn't long before Google realised what was up and blocked my IP.&lt;br /&gt;&lt;br /&gt;I'm spotting a pattern. 
APIs are crap.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-8742727252927701317?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/8742727252927701317/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/07/frustrating-apis.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8742727252927701317'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8742727252927701317'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/07/frustrating-apis.html' title='Frustrating APIs'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-5657786376914927613</id><published>2010-04-03T06:27:00.000-07:00</published><updated>2010-04-03T06:27:40.615-07:00</updated><title type='text'>Discovery and Construction of Authors’ Profile from Linked Data</title><content type='html'>&lt;object style="width:600px;height:777px" &gt;&lt;param name="movie" 
value="http://static.issuu.com/webembed/viewers/style1/v1/IssuuViewer.swf?mode=embed&amp;amp;viewMode=presentation&amp;amp;layout=http%3A%2F%2Fskin.issuu.com%2Fv%2Flight%2Flayout.xml&amp;amp;showFlipBtn=true&amp;amp;documentId=100403132401-fa39416eebd343a99214cd371e2b1efa&amp;amp;docName=discovery_and_construction_of_authors__profile_fro&amp;amp;username=everythingability&amp;amp;loadingInfoText=Discovery%20and%20Construction%20of%20Authors%E2%80%99%20Profile%20from%20Linked%20Data&amp;amp;et=1270301152688&amp;amp;er=61" /&gt;&lt;param name="allowfullscreen" value="true"/&gt;&lt;param name="menu" value="false"/&gt;&lt;embed src="http://static.issuu.com/webembed/viewers/style1/v1/IssuuViewer.swf" type="application/x-shockwave-flash" allowfullscreen="true" menu="false" style="width:600px;height:777px" flashvars="mode=embed&amp;amp;viewMode=presentation&amp;amp;layout=http%3A%2F%2Fskin.issuu.com%2Fv%2Flight%2Flayout.xml&amp;amp;showFlipBtn=true&amp;amp;documentId=100403132401-fa39416eebd343a99214cd371e2b1efa&amp;amp;docName=discovery_and_construction_of_authors__profile_fro&amp;amp;username=everythingability&amp;amp;loadingInfoText=Discovery%20and%20Construction%20of%20Authors%E2%80%99%20Profile%20from%20Linked%20Data&amp;amp;et=1270301152688&amp;amp;er=61" &gt;&lt;/embed&gt;&lt;/object&gt;&lt;br /&gt;&lt;br /&gt;This looks useful and relevant. 
No idea where I found it, will dig it out later.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-5657786376914927613?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/5657786376914927613/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/04/discovery-and-construction-of-authors.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/5657786376914927613'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/5657786376914927613'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/04/discovery-and-construction-of-authors.html' title='Discovery and Construction of Authors’ Profile from Linked Data'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-7438929304885401261</id><published>2010-04-01T04:38:00.000-07:00</published><updated>2010-04-01T04:38:47.342-07:00</updated><title type='text'>The Next Sprint</title><content type='html'>Before I tidy up the code so far (it's a mess trust me) ... the next Sprint will be all about integration. 
The reason for this is that it's all very well having a noodley toy with which people can browse and explore and vanity surf, but unless it is integrated into a tool that someone uses very regularly it will never have the chance to prove itself.&lt;br /&gt;&lt;br /&gt;The tool we will be using as a collaborative platform will be &lt;a href="http://cyn.in/"&gt;Cyn.in&lt;/a&gt; because ...&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;it has a plethora of tools: wikis, blogs, discussions, events etc&lt;/li&gt;&lt;li&gt;it has a lovely UI&lt;/li&gt;&lt;li&gt;it is built on Plone, and Python is my tool of choice&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;My work following Easter will look something like this...&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Find out more about Plone (I've bought a book :-)&lt;/li&gt;&lt;li&gt;Create a script that inserts objects into the ZODB (in this case events crawled from the University site)&lt;/li&gt;&lt;li&gt;Create a new "content type"... I would quite like to create a "Location" object.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Create (or install) an RSS reader so we can add tweets, feeds etc.&amp;nbsp;&lt;/li&gt;&lt;li&gt;See if there's a LaTeX option for the kinky scientists&lt;/li&gt;&lt;li&gt;Make a portlet to display this stuff&lt;/li&gt;&lt;li&gt;Lighten the overall look n feel.&lt;/li&gt;&lt;li&gt;Make a "content parser" so that key terms in a taxonomy (e.g. SIPIG (a committee), VC011 (my room number) and dps4 (my boss)) can be automatically replaced with links to the relevant items.&lt;/li&gt;&lt;li&gt;Create (or install) a JSON reader so that we can potentially find "related people" dynamically and display them on people's profiles&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;That looks like two weeks' work to me (or more).... 
I may be gone sometime...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-7438929304885401261?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/7438929304885401261/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/04/next-sprint.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7438929304885401261'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7438929304885401261'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/04/next-sprint.html' title='The Next Sprint'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-4289845647407908361</id><published>2010-03-31T06:40:00.000-07:00</published><updated>2010-03-31T06:40:20.408-07:00</updated><title type='text'>Put It All Together... Version 0.1</title><content type='html'>I couldn't resist connecting the web site tag cloud terms to people associated with those terms to a person's hypertree.&lt;br /&gt;&lt;br /&gt;Here it is... 
&lt;a href="http://pppeople.everythingability.com/"&gt;PPPeople PPPowered version Alpha 0.1&lt;/a&gt;&amp;nbsp;(yes, there are bugs and interface issues, but it's still lots of fun).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-4289845647407908361?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/4289845647407908361/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/put-it-all-together.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4289845647407908361'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4289845647407908361'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/put-it-all-together.html' title='Put It All Together... Version 0.1'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-1276109309571856901</id><published>2010-03-30T08:27:00.000-07:00</published><updated>2010-03-30T08:31:00.530-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pppeople'/><category scheme='http://www.blogger.com/atom/ns#' term='visualisation'/><title type='text'>My very first Hyperforest</title><content type='html'>&lt;span style="font-size: small;"&gt;&lt;/span&gt;&lt;span style="font-size: small;"&gt;So today I decided to look a little deeper at WHAT I want to plot on a Hypertree anyway. It's quite a difficult question. 
The purpose of all this, ultimately, is to be a browsing experience (not a reporting one), so I don't care if the diagram is an accurate reflection; all that matters is that it is accurate enough to flush out some interesting connections.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;The diagram should look plausible but then have some oddities in it. So.... today I took a simplistic view that went...&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;What web documents is someone mentioned in&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;What topics are associated with those documents &lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;What terms are those documents associated with (and which topics)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-size: small;"&gt;Who else is on the documents that have those terms&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-size: small;"&gt;I also threw in some relevancy numbers for good luck (thanks @TomTague for the tip)&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt; &lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;The end results are quite good...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;Gustav Delius - a mathematician who is involved in lots of web projects looks like this...&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a 
href="http://1.bp.blogspot.com/_2IeQft2KL-g/S7IW8UlH8AI/AAAAAAAAALI/0o8hB0wH2Os/s1600/GustavDelius.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_2IeQft2KL-g/S7IW8UlH8AI/AAAAAAAAALI/0o8hB0wH2Os/s320/GustavDelius.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&amp;nbsp;Trevor Sheldon - the Deputy Vice Chancellor looks like this...&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S7IXX4ycs-I/AAAAAAAAALQ/wzLkxHQDFbE/s1600/TrevorSheldon.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S7IXX4ycs-I/AAAAAAAAALQ/wzLkxHQDFbE/s320/TrevorSheldon.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Roger Burrows - Sociologist who might be said to have concerns about technology and privacy (forgive me if I'm wrong Roger)... 
looks like this...&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_2IeQft2KL-g/S7IX3n_LojI/AAAAAAAAALY/xD3ps1qFxZ8/s1600/RogerBurrows.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_2IeQft2KL-g/S7IX3n_LojI/AAAAAAAAALY/xD3ps1qFxZ8/s320/RogerBurrows.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;Sue Hodges - Educator definitely not into social media self-promotion looks like this... &lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S7IYCeBiC5I/AAAAAAAAALg/LIhZBh7CckM/s1600/SueHodges.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S7IYCeBiC5I/AAAAAAAAALg/LIhZBh7CckM/s320/SueHodges.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;It's looking promising. The pictures look great... 
now if I can just fathom what they &lt;i&gt;mean&lt;/i&gt; I'll be laughing.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;Tomorrow I need to research talking to the ZODB and how to make custom object types in Plone so that I can take this data and integrate it into a social networking site we're running.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-1276109309571856901?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/1276109309571856901/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/my-very-first-hyperforest.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/1276109309571856901'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/1276109309571856901'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/my-very-first-hyperforest.html' title='My very first Hyperforest'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2IeQft2KL-g/S7IW8UlH8AI/AAAAAAAAALI/0o8hB0wH2Os/s72-c/GustavDelius.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-8920027084723649277</id><published>2010-03-29T08:36:00.000-07:00</published><updated>2010-03-29T08:46:26.367-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pppeople'/><category scheme='http://www.blogger.com/atom/ns#' term='visualisation'/><category scheme='http://www.blogger.com/atom/ns#' term='jisc'/><title type='text'>My Very First Hypertree</title><content type='html'>Sometimes, in fact most of the time for me, I find it difficult to work with certain concepts until I can see them. I began today by making templates that simply show the data I have gathered from Open Calais so that I could get a feel for what's there.&lt;br /&gt;&lt;br /&gt;And just now I started adding a little &lt;a href="http://thejit.org/docs/files/Hypertree-js.html"&gt;Hypertree&lt;/a&gt; magic. Here's some of the data displayed quite simply.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S7DHA3CWuoI/AAAAAAAAAK4/yxtmdN6wTsg/s1600/HowardHypertree.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S7DHA3CWuoI/AAAAAAAAAK4/yxtmdN6wTsg/s320/HowardHypertree.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;The funny shape makes me realise that I've made a completely arbitrary three-way split of pages, people and terms and that I need to make it build its hierarchy dynamically. 
Ooh that one hurts.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;It looks even stranger when someone happens to have been in lots of pages about meeting minutes. Here's Jo Casey as a Hypertree.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_2IeQft2KL-g/S7DHsLn7keI/AAAAAAAAALA/3QfXNerV9Fk/s1600/CaseyHypertree.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/_2IeQft2KL-g/S7DHsLn7keI/AAAAAAAAALA/3QfXNerV9Fk/s320/CaseyHypertree.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;I'm pleased that I'm able to display something. 
I now need to work on the user interface and add a nice search interface and popup areas for "people information" and maybe to be able to load more data as you click on a node.&lt;br /&gt;&lt;br /&gt;I also now need to think about which connections between people are most important and perhaps how to structure the industry terms.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-8920027084723649277?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/8920027084723649277/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/my-very-first-hypertree.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8920027084723649277'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8920027084723649277'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/my-very-first-hypertree.html' title='My Very First Hypertree'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_2IeQft2KL-g/S7DHA3CWuoI/AAAAAAAAAK4/yxtmdN6wTsg/s72-c/HowardHypertree.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-59758268168203883</id><published>2010-03-29T03:32:00.000-07:00</published><updated>2010-03-29T03:32:13.422-07:00</updated><title type='text'>This Week's Plan</title><content type='html'>My plans for this sprint are to 
create a navigable interface for the data Open Calais has returned that focusses on people and connections between people. To begin with this might just be "people pages" or an A to Z page but I also want to quickly start using a few lovely visualisations.&lt;br /&gt;&lt;br /&gt;I then want to host this data on a site that people can play with (and get it into a Subversion repository). I will initially use Django simply because I'm very familiar with it, but the intention is that this will add data to a Cyn.in (or Plone) site, so I have bought a book on Plone development so I can start thinking about saving data to the ZODB...&lt;br /&gt;&lt;br /&gt;So that I can start simply and learn how the ZODB works, the plan is to...&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Create a crawler for events that saves regular "Events" objects into the ZODB.&amp;nbsp;&lt;/li&gt;&lt;li&gt;Create a &lt;b&gt;new content type&lt;/b&gt;&amp;nbsp;for "Places". This may have a lat/long and in the long term should be displayed on a map.... maybe on an iPhone. This is a bit more involved.&lt;/li&gt;&lt;li&gt;Create an RSS or RDF grabber Product so that I can integrate people, places, concepts etc with other linked data sources... if only to grab an image that represents York the city.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;Slight concern:&lt;/span&gt;&lt;/b&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt; This sort of means that I'll be splitting my direction... one strand will look at visualisation in Django/MySQL, the other at data manipulation and integration within Cyn.in (the environment that will ultimately host this data). 
I'm hoping that once I have got to the end of the visualisation experimentation and the Cyn.in integration work I will be able to pull the two neatly together somehow.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Visualisation Tools&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;a href="http://thejit.org/home/"&gt;Infovis&lt;/a&gt;&amp;nbsp;looks simple enough and perfect for what I need.&amp;nbsp;I particularly like the&amp;nbsp;&lt;a href="http://thejit.org/Jit/Examples/Hypertree/example1.html"&gt;Hypertree&lt;/a&gt;&amp;nbsp;thing (shown below).&amp;nbsp;Here Jeni has taken some&amp;nbsp;&lt;a href="http://www.jenitennison.com/visualisation/offences.html?offences=http://www.jenitennison.com/data/scheme/offence/burglary"&gt;government open data about burglary and shown it on a hypertree&lt;/a&gt;... brilliant!&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://thejit.org/static/img/demos/hypertree.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://thejit.org/static/img/demos/hypertree.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://live.yworks.com/yfiles-ajax/#demos"&gt;yFiles library&lt;/a&gt; has a Graph viewer component but it seems too &lt;i&gt;complete&lt;/i&gt;, and it's not as black&amp;nbsp;(which is cool).&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://live.yworks.com/yfiles-ajax/resources/images/demoscreens/viewer.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://live.yworks.com/yfiles-ajax/resources/images/demoscreens/viewer.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;a 
href="http://vis.stanford.edu/protovis/ex/"&gt;Protovis&lt;/a&gt;&amp;nbsp;has some very nice display options; I may come back to this later.&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://vis.stanford.edu/protovis/ex/jobs-sm.png?3.0" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://vis.stanford.edu/protovis/ex/jobs-sm.png?3.0" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Both&amp;nbsp;&lt;a href="http://raphaeljs.com/"&gt;Raphael&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="http://processingjs.org/exhibition"&gt;Processing&lt;/a&gt;&amp;nbsp;look extremely powerful (go look at the examples) but I will come back to them later once I've done the simple(r) stuff.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;One of the lessons learned from the project already is that once you get lots of people working in an &lt;i&gt;open&lt;/i&gt; environment, the "Activity Stream" soon resembles a fire-hose of "too much" information. This suggests that the Home Page of any collaborative environment should display aggregate data or visualisations rather than the actual data as a way to make the Home Page useful. 
So it's going to be interesting to try and find novel ways of displaying what for the most part will be conversations.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-59758268168203883?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/59758268168203883/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/this-weeks-plan.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/59758268168203883'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/59758268168203883'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/this-weeks-plan.html' title='This Week&apos;s Plan'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-2892032364915857650</id><published>2010-03-27T05:14:00.000-07:00</published><updated>2010-03-27T05:14:49.464-07:00</updated><title type='text'>Sprint 1, it works</title><content type='html'>So having crawled the sites, mangled the data... 
some very simple code...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;from djangocalais.models import *&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;def related_people(name):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;es = Entity.objects.filter(name__icontains=name)[0]&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;cds = CalaisDocument.objects.filter(entities=es)&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;for cd in cds:&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;for e in cd.entities.all():&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 
&amp;nbsp;if e.type.name == "Person":&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;print e.name , "&amp;gt;", &amp;nbsp;cd.__unicode__()&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;m.related_people("Alastair Fitter")&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... produces this....&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S632RZsCPEI/AAAAAAAAAKw/v76mMiTMBrM/s1600/RelatedPeopleTerminal.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/S632RZsCPEI/AAAAAAAAAKw/v76mMiTMBrM/s320/RelatedPeopleTerminal.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is what I like.... Artificial Intelligence for Dummies. I can't wait to slap an interface on this... 
Now to brush up on some jQuery.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-2892032364915857650?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/2892032364915857650/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/sprint-1-it-works.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/2892032364915857650'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/2892032364915857650'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/sprint-1-it-works.html' title='Sprint 1, it works'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_2IeQft2KL-g/S632RZsCPEI/AAAAAAAAAKw/v76mMiTMBrM/s72-c/RelatedPeopleTerminal.png' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-8829789265646173922</id><published>2010-03-26T08:53:00.000-07:00</published><updated>2010-03-26T08:56:48.684-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pppeople'/><category scheme='http://www.blogger.com/atom/ns#' term='university of york'/><category scheme='http://www.blogger.com/atom/ns#' term='semantic'/><category scheme='http://www.blogger.com/atom/ns#' term='open calais'/><title type='text'>Sprint 1 Done!</title><content type='html'>&lt;div 
class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_2IeQft2KL-g/S6zXkzrHk-I/AAAAAAAAAKg/IJ47SGRaxLU/s1600/YorkTags.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_2IeQft2KL-g/S6zXkzrHk-I/AAAAAAAAAKg/IJ47SGRaxLU/s320/YorkTags.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;York University's web site has been crawled. The data was then manipulated (or added to) using Open Calais - although I still have to leave that cooking a while longer. Then, using the Google Visualisation API, I created a tag cloud of what are called the top "social tags" for the semantic information.&lt;br /&gt;&lt;br /&gt;The entities returned look really interesting. I need to find out how they relate to each other and how I can find connections between people.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S6zYS9GV1kI/AAAAAAAAAKo/1pNf1Fnbu28/s1600/Entities.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/S6zYS9GV1kI/AAAAAAAAAKo/1pNf1Fnbu28/s320/Entities.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;The next step is to clean up the code, make it semi-functional so that at the very least you can explore the data, and then put it into Google Code ... with some instructions! These will look something like this...&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;1. 
Install Django and dependencies&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;2. crawl (http://www.york.ac.uk)&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;3. analyze()&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;4. &amp;nbsp;./manage.py runserver&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;What a week...&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-8829789265646173922?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/8829789265646173922/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/sprint-1-done.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8829789265646173922'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/8829789265646173922'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/sprint-1-done.html' title='Sprint 1 Done!'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2IeQft2KL-g/S6zXkzrHk-I/AAAAAAAAAKg/IJ47SGRaxLU/s72-c/YorkTags.png' height='72' 
width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-6666592277235663421</id><published>2010-03-25T10:47:00.000-07:00</published><updated>2010-03-25T10:47:08.911-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='semantic'/><category scheme='http://www.blogger.com/atom/ns#' term='python'/><category scheme='http://www.blogger.com/atom/ns#' term='crawlers'/><title type='text'>Python Crawling Tools</title><content type='html'>I have been here before, here in this case being in a place where I think, "Surely someone has written a fantastic python crawler that is easy to use and extend that is open source". They may have done, but I can't find it.&lt;br /&gt;&lt;br /&gt;My particular problem is this. I want it to be easy to use; something like this would be nice...&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;from crawler import Crawler&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;class myCrawler(Crawler):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; def handle(self, url, html):&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; 
&amp;nbsp;#do_something_here&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;c = myCrawler(url="http://wherever.com")&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;c.crawl_type = "nice"&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;c.run()&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;... and also, I'd like it to be well designed enough not to run out of memory, to be &lt;b&gt;well documented&lt;/b&gt;, to know about the strange unicode formats that live out there on the web, not to open too many sockets, to do the right thing when a URL times out... and hey, to be quick it probably should be threaded to boot.&lt;br /&gt;&lt;br /&gt;After days (and days) of downloading and trying crawlers, there have been some worth noting.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.harvestmanontheweb.com/"&gt;HarvestMan&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://scrapy.org/"&gt;Scrapy&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;...but I have also tried crawl-e, creepycrawler, curlspider, flyonthewall, fseek, hmcrawler, jazz-crawler, ruya, supercrawler, webchuan, webcrawling, yaspider and more! All of these are rubbish...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The problem is this. It IS possible to write a crawler in 20 or so lines in python that will work... except it won't. 
It won't handle pages that redirect to themselves, it won't handle links that are ../../ relative, it won't be controllable in any way.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;My problem with the two "best contenders" was this. Firstly, &lt;b&gt;HarvestMan&lt;/b&gt;, no matter how hard I configure it, always saves the crawled pages to disk. I had this problem in 2006ish, and I still have it today. Anand (lovely guy), the developer, has rolled in many of my change requests since then so that the pseudo code above is almost a reality, but only almost. HarvestMan is still a front-runner, if only because of the thought that has gone into the configuration options.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;Scrapy&lt;/b&gt; looked very cool too, but it seems more geared towards getting specific data from known pages, rather than wandering around the web willy nilly. I would have to create my own crawler within Scrapy. I will maybe come back to this but in general, armed with python and a few regular expressions you can't half get a lot done.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, in my attempt to get "round the loop" once, that is, to &lt;b&gt;a.&amp;nbsp;gather some data from a few sites&lt;/b&gt; (namely, the University of York's sites), &lt;b&gt;b.&amp;nbsp;manipulate it&lt;/b&gt; in some way (in this case, pump it at &lt;a href="http://www.opencalais.com/about"&gt;Open Calais&lt;/a&gt; and see what we get back... poor man's artificial intelligence) and then &lt;b&gt;c.&amp;nbsp;present it&lt;/b&gt; (maybe as a tag cloud, or something more fancy if I have time), I needed a crawler that was very simple to use.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So ...&lt;/div&gt;&lt;div&gt;a. Despite wanting to use an "off the shelf" crawler, I found a crawler that almost worked and hacked it until it worked. 
It's not threaded, it's not clever but it does the job.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I had to do some hand pruning to look at the mime-types of the pages returned, and remove Betsie pages (of which there were thousands)... but I will try and roll that back into the crawler.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;b. I found a Django application called &lt;a href="http://media.jesselegg.com/djangocalais/"&gt;django_calais&lt;/a&gt;, which after adding the last line to my Page model, like this...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;class Page(models.Model):&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;url = models.URLField(unique=True, null=False, blank=False, db_index=True)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;title = models.CharField(max_length=300, null=True, blank=True)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;crawl_date = models.DateTimeField(default=datetime.now)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;html = 
models.TextField()&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;type = models.CharField(max_length=100, null=True, blank=True)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;size = models.IntegerField(null=True, blank=True)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;calais_content_fields = [('title', 'text/txt'), ('url', 'text/html'), ('html', 'text/html')]&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... 
I could then run....&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;def analyze():&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;pages = Page.objects.filter(type="text/html")&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;for page in pages:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;print page.__unicode__(), page.url&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;try:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;CalaisDocument.objects.analyze(page, fields= [('title', 'text/txt'), ('url', 'text/html'), ('html', 'text/html'),])&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; 
&amp;nbsp;except Exception, err:&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;print err&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;... and have my Calais application populated with People, Organisations, Companies, Facilities etc., all of which are related back to my Page model.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I haven't got to the presentation stage yet (the Calais analyzer is still running), BUT after getting quite anxious about spending too long looking for an adequate crawler, the semantic bit has already proven itself. So maybe tomorrow I will be able to present some data...&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And then I'd better check it in to a repository or something.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-6666592277235663421?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/6666592277235663421/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/python-crawling-tools.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6666592277235663421'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6666592277235663421'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/python-crawling-tools.html' title='Python Crawling Tools'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-1730095130793103937</id><published>2010-03-22T05:53:00.000-07:00</published><updated>2010-03-22T05:54:19.244-07:00</updated><title type='text'>Review of Python Crawling Tools</title><content type='html'>&lt;a href="http://www.ohloh.net/p/WenChuan"&gt;http://www.ohloh.net/p/WenChuan&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Web crawlers are funny things. I have had a little experience of working with web crawlers in the past with a mixed set of results. Many were too difficult to configure or simply not robust enough. At this stage of the project, I would ideally like a crawler that is easy to adapt and yet trustworthy enough that I can set it running and leave it.&lt;br /&gt;&lt;br /&gt;Since I don't want to crawl very large portions of the internet, and am instead looking just to work with small corners of it, I will be happy with something simple. Worth trialling are alternatives to "doing it yourself", such as &lt;a href="http://pipes.yahoo.com/pipes/"&gt;Yahoo Pipes&lt;/a&gt; and the &lt;a href="http://80legs.com/"&gt;80legs.com&lt;/a&gt; crawling service as well as the &lt;a href="http://developer.yahoo.com/search/boss/"&gt;Yahoo Boss&lt;/a&gt; search engine builder. 
Whenever possible I will consider using these tools, simply because of the &lt;b&gt;time needed to crawl for data&lt;/b&gt;.&lt;br /&gt;&lt;br /&gt;My preferred language is python but I wonder if there is a more web-centric language to think about? I vaguely remember &lt;a href="http://www.rebol.net/cookbook/"&gt;Rebol&lt;/a&gt; as having URLs as base types. Maybe Rebol is worth exploring.&lt;br /&gt;&lt;br /&gt;I started by looking at a list of python crawlers at Ohloh &lt;a href="http://www.ohloh.net/p?sort=users&amp;amp;q=python+crawler"&gt;http://www.ohloh.net/p?sort=users&amp;amp;q=python+crawler&lt;/a&gt;&amp;nbsp;and from these tested these....&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Testing Notes...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://sourceforge.net/projects/ruya/"&gt;Ruya&lt;/a&gt; (Not bad, has before_crawl and after_crawl functions that you overshadow)&lt;br /&gt;&lt;br /&gt;&lt;a href="http://wwwsearch.sourceforge.net/mechanize/"&gt;Mechanize&lt;/a&gt; (Kind of browser simulation - including form filling)&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="color: #333333;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;pre style="font-family: courier, monospace; font-size: 12px; margin-left: 10px;"&gt;&lt;span class="pykw" style="color: #9668d7;"&gt;import&lt;/span&gt; re&lt;br /&gt;&lt;span class="pykw" style="color: #9668d7;"&gt;from&lt;/span&gt; mechanize &lt;span class="pykw" style="color: #9668d7;"&gt;import&lt;/span&gt; Browser&lt;br /&gt;&lt;br /&gt;br = Browser()&lt;br /&gt;br.open(&lt;span class="pystr" style="color: #a08070;"&gt;"http://www.example.com/"&lt;/span&gt;)&lt;br /&gt;&lt;span class="pycmt" style="color: #a34727;"&gt;# follow second link with element text matching regular expression&lt;br /&gt;&lt;/span&gt;response1 = br.follow_link(text_regex=&lt;span class="pystr" style="color: #a08070;"&gt;r"cheese\s*shop"&lt;/span&gt;, nr=1)&lt;br /&gt;&lt;span 
class="pykw" style="color: #9668d7;"&gt;assert&lt;/span&gt; br.viewing_html()&lt;br /&gt;&lt;span class="pykw" style="color: #9668d7;"&gt;print&lt;/span&gt; br.title()&lt;br /&gt;&lt;span class="pykw" style="color: #9668d7;"&gt;print&lt;/span&gt; response1.geturl()&lt;/pre&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ohloh.net/p/crawl-e"&gt;Crawl-e&lt;/a&gt; ( Not bad )&lt;br /&gt;&lt;span class="Apple-style-span" style="color: #333333; font-family: helvetica, arial, sans-serif; font-size: 12px; line-height: 18px;"&gt;&lt;i&gt;The CRAWL-E developers are very familiar with how TCP and HTTP works and using that knowledge have written a web crawler intended to maximize TCP throughput. This benefit is realized when crawling web servers that utilize persistent HTTP connections as numerous requests will be made over a single TCP connection thus increasing the throughput.&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="color: #333333; font-family: helvetica, arial, sans-serif; font-size: 12px; line-height: 18px;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;#Squzer - missing in action&lt;br /&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/supercrawler/"&gt;SuperCrawler&lt;/a&gt;&amp;nbsp;- not bad but the code looks a bit terse.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ohloh.net/p/fseek"&gt;Fseek&lt;/a&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="color: #333333; font-family: helvetica, arial, sans-serif; font-size: 12px; line-height: 18px;"&gt;&lt;i&gt;Fseek is a python-based web crawler. 
The user-interface is implemented using Django, the back-end uses pyCurl to fetch pages, and Pyro is used for IPC.&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="color: #333333; font-family: helvetica, arial, sans-serif; font-size: 12px; line-height: 18px;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="color: black; font-family: Times; font-size: medium; font-style: normal; line-height: normal;"&gt;Says it's Django but it's not. I liked the idea of it being "presentation ready"... could be handy.&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; cannot create writable FSEEK_DATA_DIR /var/fseek.&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; ImportError: No module named Pyro.naming&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; failed to load external entity "/var/fseek/solr/etc/jetty.xml"&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; Needed Pyro&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.ohloh.net/p/WenChuan"&gt;Webchuan&lt;/a&gt; ( XPathy)&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: helvetica, arial, sans-serif; font-size: 12px; line-height: 18px;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #333333; font-family: inherit; font-size: 1em; font-weight: inherit; line-height: 1.5em; margin-bottom: 1.5em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: left; vertical-align: baseline;"&gt;&lt;i&gt;WebChuan is a set of open source libraries and tools for getting and parsing web pages of website. 
It is written in Python, based on Twisted and lxml.&lt;/i&gt;&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #333333; font-family: inherit; font-size: 1em; font-weight: inherit; line-height: 1.5em; margin-bottom: 1.5em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: left; vertical-align: baseline;"&gt;&lt;i&gt;It is inspired by GStreamer. WebChuan is designed to be back-end of web-bot, it is easy to use, powerful, flexible, reusable and efficient.&lt;/i&gt;&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #333333; font-family: inherit; font-size: 1em; font-style: inherit; font-weight: inherit; line-height: 1.5em; margin-bottom: 1.5em; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-align: left; vertical-align: baseline;"&gt;&lt;span class="Apple-style-span" style="color: black; font-family: Times; font-size: medium; line-height: normal;"&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; Error in the setup.py script removed the description and it worked.&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://code.google.com/p/jazz-crawler/"&gt;Jazz crawler&lt;/a&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp; Broke on unicode data (added any2ascii wrappers and it solved it -- hack)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &amp;nbsp;Then borked on a draw_graph (probably too big)&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;Interesting because it builds a graph (and calculates PageRank) of the data gathered enabling an out-of-the-box simple visualisation...&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: 
center;"&gt;&lt;a href="https://dl.dropbox.com/u/93802/test.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="200" src="https://dl.dropbox.com/u/93802/test.png" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.harvestmanontheweb.com/"&gt;Harvestman&lt;/a&gt;&lt;br /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;I have used this before and found it "too powerful"; I never did work out how to get it not to save crawled files to disk. I was unhappy with the config.xml approach to running the crawler too. If Scrapy proves unsuccessful I will return to this because it's a great product.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://plone.org/products/externalsitecatalog"&gt;ExternalSiteCatalog&lt;/a&gt;&amp;nbsp;- an integration of Harvestman with Plone - may be useful later.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://github.com/wehriam/awspider/"&gt;AWCrawler&lt;/a&gt;&amp;nbsp;- a python spider that saves data to Amazon S3 etc&lt;br /&gt;&lt;br /&gt;&lt;a href="http://doc.scrapy.org/intro/overview.html"&gt;Scrapy&lt;/a&gt; (XPath and pipelines)&lt;br /&gt;&amp;nbsp;&amp;nbsp; &lt;br /&gt;&lt;div&gt;Scrapy is new (to me) and looks easy to modify. 
I will experiment with this, though will need to learn some XPath to begin with.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-1730095130793103937?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/1730095130793103937/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/review-of-crawling-tools.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/1730095130793103937'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/1730095130793103937'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/review-of-crawling-tools.html' title='Review of Python Crawling Tools'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-7835875383233655925</id><published>2010-03-22T05:09:00.000-07:00</published><updated>2010-03-22T05:09:44.001-07:00</updated><title type='text'>The Sprint Cycle</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S6daaMhqosI/AAAAAAAAAKI/vOlWjjjBEqc/s1600-h/Sprint+0.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S6daaMhqosI/AAAAAAAAAKI/vOlWjjjBEqc/s320/Sprint+0.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" 
style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;So that I don't "bite off more than I can chew", my aim is to make several iterations of the Sprint Cycle (at a guess, four-ish). The dotted areas on the diagram above show the gaping holes in my knowledge. The first iteration will be about getting a better understanding of each of the areas and finding the simplest tools for working with them.&lt;br /&gt;&lt;br /&gt;Sprint 1: The aim is to produce the &lt;b&gt;simplest thing possible that is remotely useful&lt;/b&gt;. I will attempt (this week) to make a full traversal of the cycle. At this point I don't have the details of people's profiles on social media sites (the survey hasn't been published yet), and I don't have any real knowledge about working with repositories, nor of working with LinkedData. I will need to work with what I have, namely the data that is already "out there"... web pages and links.&lt;br /&gt;&lt;br /&gt;Whilst the aim may be to make something more like the &lt;a href="http://researchportal.be/en/person/johan-ackaert-(UH_0847)/collaboration.html#tabs"&gt;Research Portal&lt;/a&gt; (shown below), this week is going to be all about doing something MUCH SIMPLER.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S6dc1zGiBeI/AAAAAAAAAKQ/vjiQfNkCOW4/s1600-h/ResearchPortal.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/S6dc1zGiBeI/AAAAAAAAAKQ/vjiQfNkCOW4/s320/ResearchPortal.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Aims&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Review crawling components (and mine York's existing data - wherever it is)&lt;/li&gt;&lt;li&gt;Display content in a TagCloud&lt;/li&gt;&lt;li&gt;Attempt to integrate with some LinkedData&lt;/li&gt;&lt;li&gt;Attempt to present data or relationships in a novel 
way&lt;/li&gt;&lt;li&gt;Publish the Social Media Survey to hopefully find out more about what is being used at York.&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;The first iteration of the cycle looks like this. And nothing like thrashing around in frogspawn.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S6dd1eFYDdI/AAAAAAAAAKY/mpSUc1uIl1A/s1600-h/Sprint+1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/S6dd1eFYDdI/AAAAAAAAAKY/mpSUc1uIl1A/s320/Sprint+1.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-7835875383233655925?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/7835875383233655925/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/sprint-cycle.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7835875383233655925'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7835875383233655925'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/sprint-cycle.html' title='The Sprint Cycle'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_2IeQft2KL-g/S6daaMhqosI/AAAAAAAAAKI/vOlWjjjBEqc/s72-c/Sprint+0.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-5782996989285366343</id><published>2010-03-22T04:49:00.000-07:00</published><updated>2010-03-22T04:53:28.437-07:00</updated><title type='text'>Milestone 1 Report</title><content type='html'>I have trialled a &lt;a href="http://www.theotherblog.com/Articles/2010/03/22/the-tools/"&gt;number of tools&lt;/a&gt;&amp;nbsp;(SocialText, Jive, Elgg, Confluence, LifeRay Social Office and Cyn.in) with a number of teams which include...&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="color: #4d4d4d; font-family: 'Lucida Grande', 'Lucida Sans Unicode', 'Lucida Sans', Tahoma, 'DejaVu Sans', 'Bitstream Vera Sans', Arial, Verdana, 'Verdana Ref', sans-serif; font-size: 12px; line-height: 20px;"&gt;&lt;/span&gt;&lt;br /&gt;&lt;ul class="navTree navTreeLevel0" style="font-size: 12px; line-height: 1.6667em; list-style-image: initial; list-style-position: inside; list-style-type: none; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="Social Policy Research Unit" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot bt-active navtipTarget" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; 
padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/spru" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;/a&gt;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/spru" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;Social Policy Research U&lt;/a&gt;nit&lt;/span&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="A space to discuss possible future directions for research computing at York. 
" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/research-computing-opportunities" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Research Computing Opportunities&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="The Forum creates space for interested staff and graduate students to meet to discuss sustainability issues " class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; 
margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/sustainability-forum" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Sustainability Forum&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="SATSU is an internationally recognised social science research centre exploring the dynamics, practices, and possibilities of contemporary science and technology" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; 
padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/satsu" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;/a&gt;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/satsu" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;Science And Technologies Studies U&lt;/a&gt;nit&lt;/span&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 
0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/elearning-development" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;/a&gt;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/elearning-development" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;eLearning Development&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 
forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/computing-service" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Computing Service&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="The project that aims to improve collaboration between staff, researchers and between universities at York." 
class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/collaborative-tools-project" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Collaborative Tools Project&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="This is an informal group for people in the University who have a marketing role." 
class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/marketing-interest-group" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Marketing Interest Group&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 
0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/humanities-research" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Humanities Research&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="Help with research" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/liaison-librarians" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; 
outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Liaison Librarians&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/stockholm-environment-institute" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; 
border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Stockholm Environment Institute&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/library" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 
0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Library&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/planning" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 
0px;"&gt;Planning&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="IT Services collaboration for the PURE project" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/pure-project" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;PURE Project&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; 
margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="York Centre for Complex Systems Analysis (YCCSA)" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/complex-systems" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Complex Systems&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 
12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/philosophy" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Philosophy&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" 
href="http://collaborate.york.ac.uk/home/employability-co-ordination-group" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Employability Co-ordination Group&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div class="state-published navTreeFolderish navTreeLevel1 forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/18th-century-studies" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 
0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;18th Century Studies&lt;/span&gt;&lt;/a&gt;&lt;/div&gt;&lt;/li&gt;&lt;li class="navTreeItem visualNoMarker navTreeFolderish" style="font-size: 12px; line-height: 1.6667em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;br /&gt;&lt;div bt-xtitle="Project to explore use of business analytics / data visualisation in areas of student recruitment and admissions, using Tableau and simplified layers of data warehouse" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/business-analytics" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;&lt;img alt="Space" height="16" 
src="http://collaborate.york.ac.uk/folder_icon.png" style="border-bottom-style: solid; border-bottom-width: 0px; border-color: initial; border-left-style: solid; border-left-width: 0px; border-right-style: solid; border-right-width: 0px; border-top-style: solid; border-top-width: 0px; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: text-bottom;" width="16" /&gt;&amp;nbsp;&lt;/a&gt;&lt;span style="font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;a class="state-published navTreeFolderish" href="http://collaborate.york.ac.uk/home/business-analytics" style="color: #4d4d4d; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-style: none; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: none;"&gt;Business Analytics&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div bt-xtitle="Project to explore use of business analytics / data visualisation in areas of student recruitment and admissions, using Tableau and simplified layers of data warehouse" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;... and although we now have 117 registered members, many of those are not particularly active... BUT many more are contributing to the discussion about the needs and directions of the project, from a mixture of backgrounds: research, development, management, support and finance.
The important thing is that the &lt;b&gt;debate surrounding the project is happening in the project tool itself&lt;/b&gt;, and not hidden on email lists or password protected wikis.&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S6daBSsA4oI/AAAAAAAAAKA/kZ9OP0r4QgY/s1600-h/CyninDiscussions.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/S6daBSsA4oI/AAAAAAAAAKA/kZ9OP0r4QgY/s320/CyninDiscussions.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div bt-xtitle="Project to explore use of business analytics / data visualisation in areas of student recruitment and admissions, using Tableau and simplified layers of data warehouse" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;The Social Media Survey is complete and awaiting publication. 
This will hopefully find sites that we don't know exist and the extent of the use of different tools at the university.&lt;/div&gt;&lt;div bt-xtitle="Project to explore use of business analytics / data visualisation in areas of student recruitment and admissions, using Tableau and simplified layers of data warehouse" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;br /&gt;&lt;/div&gt;&lt;div bt-xtitle="Project to explore use of business analytics / data visualisation in areas of student recruitment and admissions, using Tableau and simplified layers of data warehouse" class="state-published navTreeFolderish navTreeLevel1 navtip forcelink hot" style="cursor: pointer; font-size: 12px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0.4167em; padding-left: 0.5em; padding-right: 0.5em; padding-top: 0.4167em;" title=""&gt;&lt;br /&gt;&lt;/div&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-5782996989285366343?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/5782996989285366343/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/milestone-1-report.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/5782996989285366343'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/5782996989285366343'/><link rel='alternate' 
type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/milestone-1-report.html' title='Milestone 1 Report'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_2IeQft2KL-g/S6daBSsA4oI/AAAAAAAAAKA/kZ9OP0r4QgY/s72-c/CyninDiscussions.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-4212628941741247112</id><published>2010-03-18T04:54:00.000-07:00</published><updated>2010-03-18T04:54:47.360-07:00</updated><title type='text'>Milestones</title><content type='html'>&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Milestone 1: People and Site&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;a. People. &lt;/b&gt;The first milestone is about assembling a team of people engaged with the as-yet non-existent PPPeople PPPowered technology. This team will provide the data, particularly their social media usage and profiles, on which the initial data-mining can be built.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Goals: &lt;/b&gt;To have at least 20 people, in a number of teams, willing to work with me on the JISC project, providing hard data, usage and feedback.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;b. Presentation. &lt;/b&gt;The data that is ultimately mined needs to be "hosted" in a software tool that provides a level of service that is valuable enough to be used frequently. This will ensure that meaningful relationships between people are presented and then subsequently &lt;i&gt;pruned&lt;/i&gt; by participants.
&amp;nbsp;The choice of tool at this point was initially between Wordpress, Drupal, Elgg and Liferay.&lt;br /&gt;&lt;br /&gt;The conceptual framing of what this tool should be (a people directory, a people discovery tool, a profile repository or a personal news aggregator) is quite important for setting expectations (for re-visiting the site).&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Outcomes:&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;To have a site up and running, available to staff at the University of York, that at least displays their profile in some way.&lt;/li&gt;&lt;li&gt;A blog post reporting on the project so far, the plan, the tool chosen (with reasons) and invitations to help with the plan (which semantic repositories to use (and how), which crawling tools to use).&lt;/li&gt;&lt;li&gt;Review the approaches taken by other social networking sites to gather any useful ideas. For example, are there benefits to be gained by leveraging an existing social network and evangelising the use of a certain social network in order to ease data gathering?&lt;/li&gt;&lt;/ol&gt;&lt;b&gt;&lt;/b&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s1600-h/exclamation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;i&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s320/exclamation.png" /&gt;&lt;/i&gt;&lt;/a&gt;&lt;i&gt;This phase is very dependent on people's time, availability and willingness to engage, and also on the richness of the data returned from the survey.
If nobody at all uses social media or completes the survey we won't have a lot to begin with.&lt;/i&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Milestone 2: Just Data and Display&lt;/span&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;div style="text-align: center;"&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S6IQqR-NaXI/AAAAAAAAAJk/dPbBdMCHQtg/s1600-h/Stage1.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/S6IQqR-NaXI/AAAAAAAAAJk/dPbBdMCHQtg/s320/Stage1.png" /&gt;&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;The first stages of displaying mined data will avoid the complexities of semantic reasoning about the data and of using unusual sources of data, and will begin by simply asking people to complete a survey.
This will hopefully result in a spreadsheet of sites, blogs and social media memberships (Twitter, LinkedIn, CiteULike etc) and will form the basis for exploring slightly less explicit data (connections in LinkedIn, followers on Twitter, mentions on Twitter or on other people's blogs).&lt;br /&gt;&lt;br /&gt;To display this data, initially we will need to adapt the profile in the presentation tool, perhaps including an RSS aggregation.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Outcomes:&amp;nbsp;&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;A survey asking people to identify their social media accounts, interests in the form of keywords, URLs etc&amp;nbsp;&lt;/span&gt;&lt;/b&gt;&lt;/li&gt;&lt;li&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-weight: normal;"&gt;A site with at least 20 members showing their "simple" social media membership and some of the data (recent tweets, friends etc).&amp;nbsp;&lt;/span&gt;&lt;/b&gt;&lt;/li&gt;&lt;li&gt;The initial adaptation code (module/plugin) uploaded to the SVN site.&lt;/li&gt;&lt;li&gt;A blog post showing developments and with comments from members.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s1600-h/exclamation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;i&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s320/exclamation.png" /&gt;&lt;/i&gt;&lt;/a&gt;&lt;i&gt;This will require a good understanding of the plug-in architecture of the presentation tool.
The whole point of this project is to be integrated into a tool that is usable and used.&lt;/i&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Milestone 3: More Deeply Mined Data&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;This stage looks to find information beyond what is given, perhaps crawling Google searches and finding more distant links between resources. Ideally any crawling or querying tools should be well integrated into the presentation tool OR componentized and talk to the presentation tool via XML-RPC or similar.&lt;br /&gt;&lt;br /&gt;We will experiment with (probably Python-based) web-crawlers such as Domo, Harvestman, Mechanize and Scrapy. In addition we will look at free or low-cost tools and services available. These may include...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;a href="http://pipes.yahoo.com/pipes/"&gt;Yahoo Pipes&lt;/a&gt;&amp;nbsp;- online data manipulation and routing&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.80legs.com/"&gt;80 legs&lt;/a&gt;&amp;nbsp;- a crawling service&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.paterva.com/web4/index.php/maltego"&gt;Maltego&lt;/a&gt;&amp;nbsp;- a desktop open source forensics application&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.picalo.org/"&gt;Picalo&lt;/a&gt; - desktop forensics application (or any other tool)&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Outcomes:&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;More interesting data, richer connections&lt;/li&gt;&lt;li&gt;A generic crawling methodology&amp;nbsp;&lt;/li&gt;&lt;/ol&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s1600-h/exclamation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" 
src="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s320/exclamation.png" /&gt;&lt;/a&gt;&lt;i&gt;&amp;nbsp;Whilst I am more than familiar with creating simple crawlers this will need a more standalone, robust, better architected approach.&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Milestone 4: Semantified Data&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This stage will look at how to apply semantic tools, or a more reasoned understanding of the data gathered. We will be looking at ...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Yahoo Boss for general searching.&lt;/li&gt;&lt;li&gt;OpenCalais, to attempt to discover entities in unstructured data&lt;/li&gt;&lt;li&gt;DBPedia, Edina, Freebase for understanding entities like Towns, Universities, Concepts etc&lt;/li&gt;&lt;li&gt;ePrints research repository to connect people via research outputs&lt;/li&gt;&lt;/ol&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s1600-h/exclamation.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s320/exclamation.png" /&gt;&lt;/a&gt;&lt;i&gt;&amp;nbsp;SPARQL is very new to me. I need more understanding of RDF, LinkedData etc. 
Although the JISC Dev8D conference gave me new insights into the possibilities presented by LinkedData and open data I still feel I have a way to go to fully understand this area.&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;Milestone 5: Slightly More Sophisticated Presentation&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;Depending on the data gathered, there will be opportunities to present the data in more interesting and usable ways. Initially we will attempt to use simple visualisations such as&amp;nbsp;&lt;a href="http://code.google.com/apis/visualization/documentation/gallery/annotatedtimeline.html"&gt;timelines&lt;/a&gt;, &lt;a href="http://www.drasticdata.nl/DrasticTreemapGApi/index.html"&gt;tree maps&lt;/a&gt;, &lt;a href="http://visapi-gadgets.googlecode.com/svn/trunk/wordcloud/doc.html"&gt;word clouds&lt;/a&gt; etc. 
which can be easily integrated into the presentation tool.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S6IDO2DnQ9I/AAAAAAAAAJc/peQdHEXqCQo/s1600-h/MentionMap.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://3.bp.blogspot.com/_2IeQft2KL-g/S6IDO2DnQ9I/AAAAAAAAAJc/peQdHEXqCQo/s320/MentionMap.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;More ambitious visualisations, such as the Mention Map example above, will be explored if appropriate.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;a href="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s1600-h/exclamation.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;i&gt;&lt;img border="0" src="http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s320/exclamation.png" /&gt;&lt;/i&gt;&lt;/a&gt;&lt;i&gt;&amp;nbsp;Need to understand more about the maths behind networks and visualisation.
Luckily Gustav Delius is around to advise.&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-4212628941741247112?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/4212628941741247112/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/milestones.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4212628941741247112'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4212628941741247112'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/milestones.html' title='Milestones'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_2IeQft2KL-g/S6ISZ_UIXiI/AAAAAAAAAJs/kJyUv7dnSs4/s72-c/exclamation.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-6948395889596937418</id><published>2010-03-16T10:10:00.000-07:00</published><updated>2010-03-16T10:10:49.703-07:00</updated><title type='text'>Basic Plan</title><content 
type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_2IeQft2KL-g/S5-5_6rIEWI/AAAAAAAAAI4/9J1C1crAqHs/s1600-h/JISC+Architecture.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="491" src="http://1.bp.blogspot.com/_2IeQft2KL-g/S5-5_6rIEWI/AAAAAAAAAI4/9J1C1crAqHs/s640/JISC+Architecture.png" width="640" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;This shows the basic plan for the project. The hope is that the &lt;i&gt;first run&lt;/i&gt; of this process should be complete as soon as possible. This means that the REASONING and VISUALISATION stages may be omitted, in which case the data can be presented as a simple list.&lt;br /&gt;&lt;br /&gt;This makes the choice of the DATA-GATHERING and PRESENTATION tools very important. We want them to be very well integrated. Ironically, given the importance of the PRESENTATION stage, namely what user profile data it stores (social media usernames etc), what features it supports (RSS aggregation, Twitter presentation, Plugin development opportunities etc).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-6948395889596937418?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/6948395889596937418/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/basic-plan.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6948395889596937418'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/6948395889596937418'/><link rel='alternate' type='text/html' 
href='http://pppeoplepppowered.blogspot.com/2010/03/basic-plan.html' title='Basic Plan'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2IeQft2KL-g/S5-5_6rIEWI/AAAAAAAAAI4/9J1C1crAqHs/s72-c/JISC+Architecture.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-3935698870996603696</id><published>2010-03-06T07:30:00.000-08:00</published><updated>2010-03-16T07:31:42.037-07:00</updated><title type='text'>Reflections on Tyre-Kicking</title><content type='html'>&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-3935698870996603696?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/3935698870996603696/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/reflections-on-tyre-kicking.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/3935698870996603696'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/3935698870996603696'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/reflections-on-tyre-kicking.html' title='Reflections on Tyre-Kicking'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-7234255694038788645</id><published>2010-02-16T07:30:00.000-08:00</published><updated>2010-03-16T10:29:59.753-07:00</updated><title type='text'>Creating Teams of "Tyre-kickers"</title><content type='html'>&lt;div style="text-align: right;"&gt;&lt;span class="Apple-style-span" style="font-size: xx-large;"&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="background-color: #999999;"&gt;Collaboration is impossible to fake&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/div&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;br /&gt;One of the problems with evaluating collaboration tools is that they can't be compared on features. One tool's blog can be very different from another tool's blog. A small feature missing from a wiki can make it less easy to use.&lt;br /&gt;&lt;br /&gt;The only real way to test which collaboration tool will best suit the PRESENTATION layer, the part where any mined data is displayed &lt;b&gt;and used&lt;/b&gt;, is to use the tool with a number of groups of real, live, collaborating people. With this in mind I have tried to engage a number of teams willing to "throw their all into testing an environment"... their nickname is "Tyre-Kickers".&lt;br /&gt;&lt;br /&gt;A large component of this project's success will depend not on whether the technology works but on whether people use it.&lt;br /&gt;&lt;br /&gt;One of the problems with this Tyre-Kicking approach is that once people have invested time and effort into working with, and more importantly around, a tool, they may have grown to like the tool despite its warts and be loath to move on to another tool because all their content exists somewhere else.&lt;br /&gt;&lt;br /&gt;I am aware that the different teams will differ in their engagement: some will want and need to use the tools whereas others will dabble.
Some groups will work in the same office whilst others will be dispersed, across and beyond the university.&lt;br /&gt;&lt;br /&gt;The Tyre-Kickers I have held meetings with, and introduced to up to three tools, are currently...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Applications Deployment - A Computing Service team&lt;/li&gt;&lt;li&gt;Biology - Development Team&lt;/li&gt;&lt;li&gt;Collaborative Tools Project&lt;/li&gt;&lt;li&gt;Communications - Small departmental team&lt;/li&gt;&lt;li&gt;Computing Service - Large departmental team&lt;/li&gt;&lt;li&gt;Court, Country, City. British Art 1660 - 1730 - Humanities project team&lt;/li&gt;&lt;li&gt;Digital Library - Development team&lt;/li&gt;&lt;li&gt;Finance at University of York - Small department&lt;/li&gt;&lt;li&gt;Humanities Research Centre - Large cross disciplinary departments&lt;/li&gt;&lt;li&gt;IT Support Office - Small team&lt;/li&gt;&lt;li&gt;Liaison Librarians - Large dispersed team&lt;/li&gt;&lt;li&gt;Marketing Interest Group - Dispersed team&lt;/li&gt;&lt;li&gt;Mathematics - Small team&lt;/li&gt;&lt;li&gt;Planning at University of York - Small team&lt;/li&gt;&lt;li&gt;Research Themes - Small project team looking at strategy&lt;/li&gt;&lt;li&gt;Science and Technology Studies Unit&lt;/li&gt;&lt;li&gt;Sociology - Small team&lt;/li&gt;&lt;li&gt;Stockholm Environment Institute - Large collaboration team&lt;/li&gt;&lt;li&gt;Sustainability at University of York - Small evangelisation team&lt;/li&gt;&lt;li&gt;Tourism and Topography in Britain - Humanities project team&lt;/li&gt;&lt;li&gt;Web Office - Small departmental team&lt;/li&gt;&lt;li&gt;Wireless Strategy - Small computing service team&lt;/li&gt;&lt;li&gt;York Centre for Complex Systems Analysis - Large cross departmental team&lt;/li&gt;&lt;li&gt;YorkShare HQ Revamp - Small development team&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;&lt;br /&gt;Of the teams above, some have barely used the tools, 
some already have some systems in place, whilst others have, if anything, been using the tools too much, investing lots of time and effort in adding and linking content.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-7234255694038788645?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/7234255694038788645/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/creating-teams-of-tyre-kickers.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7234255694038788645'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/7234255694038788645'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/creating-teams-of-tyre-kickers.html' title='Creating Teams of &quot;Tyre-kickers&quot;'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-2376747509423503847</id><published>2010-02-02T07:31:00.000-08:00</published><updated>2010-03-16T07:32:42.391-07:00</updated><title type='text'>The Tools</title><content type='html'>List of tools to be used&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-2376747509423503847?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://pppeoplepppowered.blogspot.com/feeds/2376747509423503847/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/02/tools.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/2376747509423503847'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/2376747509423503847'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/02/tools.html' title='The Tools'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-4684516201117770210</id><published>2010-01-16T10:30:00.000-08:00</published><updated>2010-03-16T10:51:41.888-07:00</updated><title type='text'>The Collaborative Tools Project</title><content type='html'>The Collaborative Tools Project aims to improve collaboration at the University of York. Currently York has no central intranet system and no real support for academic blogging. A dozen or so MediaWiki wikis exist for specific technical teams (those who knew who to ask) but this was rightly seen as not an option going forward in terms of scalability.&lt;br /&gt;&lt;br /&gt;As a result, understandably, people have gone off-piste as it were and made use of Wordpress, Blogger, Google Docs and the like, or are running their own servers with tools like Wordpress, Moodle and Drupal.&lt;br /&gt;&lt;br /&gt;The Collaborative Tools Project looks to offer blogging, instant messaging, wikis, rich media hosting, discussion tools for all staff at the university. 
It is not looking to work with students because the requirements for student-facing systems are very different.&lt;br /&gt;&lt;br /&gt;Work began a few years ago gathering Use Cases to discover what both academic and other staff at York needed. These can be distilled down to...&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Collaborative Document Editing - working on documentation, bid documents, research papers etc&lt;/li&gt;&lt;li&gt;Project Workgroup - private wikis, blogs, file-sharing etc&lt;/li&gt;&lt;li&gt;Public Presentation - public-facing blogs, media rich, communities of practice, aesthetically pleasing&lt;/li&gt;&lt;li&gt;Other Stuff - meeting support, sharing minutes, gathering consensus, broadcasting&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;How the PPPeople PPPowered project fits with the Collaborative Tools Project&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Interestingly, or perhaps predictably, these Use Cases threw up what people wanted and needed but little in the way of what might improve the information eco-system at the university. &amp;nbsp;Almost nobody came to the Collaborative Tools Project with needs that were community focused.&amp;nbsp;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This JISC project will create a people-oriented, browsable area of discoverability where people will simply be able to find out about each other, their work and interests. 
Whatever tools (blogging, wikis etc) we provide for the staff will integrate with the PPPeople project, which will act as a &lt;i&gt;semi-intelligent hub&lt;/i&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Because the PPPeople project is embedded in a larger project, it is guaranteed plenty of real users providing feedback and use cases during its development.&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-4684516201117770210?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/4684516201117770210/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/01/collaborative-tools-project.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4684516201117770210'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/4684516201117770210'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/01/collaborative-tools-project.html' title='The Collaborative Tools Project'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-8560185820472870114.post-1137453719007508390</id><published>2010-01-09T07:29:00.000-08:00</published><updated>2010-03-18T09:41:35.049-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='pppeople'/><category scheme='http://www.blogger.com/atom/ns#' term='collaboration'/><category scheme='http://www.blogger.com/atom/ns#' term='jisc'/><title type='text'>Brief Project Description</title><content type='html'>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_2IeQft2KL-g/S6JXttTqOTI/AAAAAAAAAJ0/9MrxmVXNmVU/s1600-h/YorkPeopleDirectory.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://1.bp.blogspot.com/_2IeQft2KL-g/S6JXttTqOTI/AAAAAAAAAJ0/9MrxmVXNmVU/s320/YorkPeopleDirectory.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;The aims of this project are to help enable collaboration at the University of York by making people and their work more easily &lt;b&gt;discoverable&lt;/b&gt; and more &lt;b&gt;richly linked&lt;/b&gt;. 
Currently at York the people directory is very basic (shown above), revealing nothing of a person's work, interests, colleagues or personality.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;This project looks to create richer connections between people using...&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;ul&gt;&lt;li&gt;mined data - such as web pages containing their email address or blog posts or Google queries&lt;/li&gt;&lt;li&gt;known data - from research repositories such as &lt;a href="http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.5008"&gt;CiteSeer&lt;/a&gt;, Mendeley,&amp;nbsp;&lt;a href="http://eprints.whiterose.ac.uk/"&gt;ePrints at White Rose Consortium&lt;/a&gt;&amp;nbsp;or &lt;a href="http://www.citeulike.org/search/all?q=Gustav+Delius"&gt;CiteULike&lt;/a&gt;&lt;/li&gt;&lt;li&gt;known organisational data - such as what department somebody belongs to&lt;/li&gt;&lt;li&gt;social media usage - such as &lt;a href="http://twitter.com/gustavdelius"&gt;Twitter&lt;/a&gt; followers, &lt;a href="http://uk.linkedin.com/pub/gustav-delius/5/5B2/196"&gt;LinkedIn Connections&lt;/a&gt; and Delicious tags&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://3.bp.blogspot.com/_2IeQft2KL-g/S5-Y34721aI/AAAAAAAAAIA/pOwxdwu5bVc/s1600-h/wireframe.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" 
src="http://3.bp.blogspot.com/_2IeQft2KL-g/S5-Y34721aI/AAAAAAAAAIA/pOwxdwu5bVc/s320/wireframe.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Above is a wireframe showing a simplification of the sorts of elements we expect to be able to present, including tags, people, links and related articles.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;The Caveats&lt;/span&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;span class="Apple-style-span" style="font-size: x-large;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;Of course, it is tempting to believe that creating an Amazon-esque recommendation list (below) is easy. 
Creating these sorts of tools can be deceptively difficult, with the development always on the &lt;i&gt;brink of a breakthrough&lt;/i&gt; and never quite able to fully deliver.&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_2IeQft2KL-g/S5-eVuTuArI/AAAAAAAAAIQ/FI6kFX3lVNY/s1600-h/Amazon.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="241" src="http://2.bp.blogspot.com/_2IeQft2KL-g/S5-eVuTuArI/AAAAAAAAAIQ/FI6kFX3lVNY/s320/Amazon.png" width="320" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;Having experienced these pitfalls, this project will initially aim to &lt;b&gt;implement the simplest technological solution available&lt;/b&gt;. We will begin by looking at the known, later extending towards the reasoned and guessed connections. We will begin with explicit connections and later explore semantic relationships.&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;Rather than beginning with the technology, we will &lt;b&gt;simply ask people&lt;/b&gt; which social media tools they use, and use their answers to pre-populate our knowledge of people. This will take the form of a survey. 
In addition to asking people for their social media usage, the tool should allow people to explicitly make connections.&amp;nbsp;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/_2IeQft2KL-g/S5-iMpMCRlI/AAAAAAAAAIY/QUYtFhASUYo/s1600-h/connections1.jpeg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="320" src="http://2.bp.blogspot.com/_2IeQft2KL-g/S5-iMpMCRlI/AAAAAAAAAIY/QUYtFhASUYo/s320/connections1.jpeg" width="315" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;So that this tool doesn't become a technological folly, presenting beautifully visualised connections between people but never actually being used, the presentation of this data and these connections needs to be &lt;b&gt;embedded in an everyday online environment&lt;/b&gt;. 
By that, I mean that this tool should not be a destination in its own right but should be stumbled across naturally whilst looking someone up or engaging in discussion.&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/8560185820472870114-1137453719007508390?l=pppeoplepppowered.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://pppeoplepppowered.blogspot.com/feeds/1137453719007508390/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/brief-project-description.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/1137453719007508390'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/8560185820472870114/posts/default/1137453719007508390'/><link rel='alternate' type='text/html' href='http://pppeoplepppowered.blogspot.com/2010/03/brief-project-description.html' title='Brief Project Description'/><author><name>tom</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_2IeQft2KL-g/S6JXttTqOTI/AAAAAAAAAJ0/9MrxmVXNmVU/s72-c/YorkPeopleDirectory.png' height='72' width='72'/><thr:total>1</thr:total></entry></feed>
