everythingability
An attempt to create an "if you like this person, you may want to know about these people" interface. http://pppeople.collabtools.org.uk/
PPPeople PPPowered
If you like this person you may also like
The Cunning “Plan”
• Crawler - to get the data
• Semantic Engine - to understand the data
• Database - to save the data
• Visualisation - to show the data
• Social Media Account Details - to extend the data
• Social Integration - to lure people into the data
Crawlers: AKA spiders, bots, scrapers, data mining
80legs can crawl over 5,000,000 web pages in 1 hour
Yahoo BOSS
http://www.ibm.com/developerworks/linux/library/l-spider/?ca=dgr-lnxw01WebSpiderLinux
Extractiv
ScraperWiki
But Yahoo already has!!!
Python crawlers
• Mechanize
• Harvestman
• Scrapy
• Spynner
99 on Google Code!
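Whatever crawler is picked, the core step is the same: read a page, pull out its links, and keep the ones worth following. A minimal sketch of that step using only the Python standard library rather than any of the tools named above. The page content, the `staff.example` host, and the `extract_links` helper are all hypothetical, and fetching is left out so the example runs offline.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect absolute link targets from the <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return same-site links found in html, in page order, without duplicates."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    site = urlparse(base_url).netloc
    seen = []
    for link in parser.links:
        if urlparse(link).netloc == site and link not in seen:
            seen.append(link)
    return seen


# Hypothetical page content, standing in for a fetched staff-directory page.
SAMPLE_PAGE = """
<a href="/people/a-smith">A Smith</a>
<a href="/people/b-jones">B Jones</a>
<a href="http://elsewhere.example/">External</a>
<a href="/people/a-smith">A Smith again</a>
"""

links = extract_links("http://staff.example/people/", SAMPLE_PAGE)
for link in links:
    print(link)
```

A real crawl would wrap this in a queue of pages still to visit and add politeness rules (robots.txt, rate limits), which is most of what the dedicated crawlers provide.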
Database
http://neo4j.org/
The largest production cluster has over 100 TB of data in over 150 machines.
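The reason to reach for a graph store is that "you may also like" is a graph question: who is two hops away, and through how many shared connections? A rough sketch of that query over a plain Python dictionary, not the Neo4j API. The names, the links between them, and the `may_also_like` function are invented for illustration.

```python
from collections import Counter

# Hypothetical data: each person maps to the set of people they are linked to.
LINKS = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "erin"},
    "dave": {"bob"},
    "erin": {"carol"},
}


def may_also_like(person, links, limit=3):
    """Rank people two hops away by how many connections they share with person."""
    direct = links.get(person, set())
    scores = Counter()
    for friend in direct:
        for candidate in links.get(friend, set()):
            # Skip the person themselves and anyone they already link to.
            if candidate != person and candidate not in direct:
                scores[candidate] += 1
    # Highest score first; ties are broken by name so the order is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


print(may_also_like("alice", LINKS))
print(may_also_like("dave", LINKS))
```

In a graph database the same idea becomes a two-step traversal from the starting node, which is what makes that kind of store a natural fit here.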
Social Media Account Details
Visualisation
http://www.twitt3d.com
http://www.neuroproductions.be/twitter_friends_network_browser/
Neo4j + Gephi
http://thejit.org/
Social Media Integration
Lessons Learned
You’re on your own
“In theory”
Neo4j
Gephi
Treebeard
Freebase
Wikipedia
Delicious
Betsy
Harvestman
Bug
Missing API
Data Cleansing
• People with one name
• Telephone numbers
• United Kingdom
• Lecturer
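The items above read like things an entity extractor handed back as "people" when they were not. A small sketch of a filter for them, assuming that reading. The stop-list, the phone-number pattern, the sample candidates, and the `looks_like_person` helper are illustrative guesses, and rejecting single-word names is only one possible policy.

```python
import re

# Assumed stop-list of non-person terms; both entries come from the slide.
NOT_PEOPLE = {"united kingdom", "lecturer"}

# Assumed pattern: a string made only of digits, spaces, brackets, + and -.
PHONE_PATTERN = re.compile(r"^[\d\s()+\-]{7,}$")


def looks_like_person(name):
    """Return True if name survives the simple cleansing rules below."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    if not cleaned:
        return False
    if PHONE_PATTERN.match(cleaned):
        return False  # telephone number, not a name
    if cleaned.lower() in NOT_PEOPLE:
        return False  # place or job title picked up as a person
    if len(cleaned.split()) < 2:
        return False  # single-word name: too ambiguous to link reliably
    return True


# Hypothetical extractor output.
candidates = [
    "Ada Lovelace",
    "Madonna",
    "+44 (0)20 7946 0000",
    "United Kingdom",
    "Lecturer",
    "  Alan   Turing ",
]

people = [" ".join(c.split()) for c in candidates if looks_like_person(c)]
print(people)
```

Rules like these stay crude on purpose: each one removes a class of junk seen in the data, and the stop-list grows as new junk turns up.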
Data Scrying &
Not working with people slows you down
Working with people slows you down
“It’s just one big matrix”
Bad Semantics
Jargon Buster
SIPIGWSGDPS
V/C/011
Zero Point Energy
Codex Alimentarius
Dept. Buster
Browse vs Search
No data creation
Cheap tricks: Pictures and Google
What Brings People Back?
The “jiggle” is everything
Conclusion
• I’m onto something