Lecture4 Social Web

Social Web Lecture 4

How can we MINE, ANALYSE and VISUALISE the Social Web? (1)

Marieke van Erp, The Network Institute

VU University Amsterdam

• UGC (user-generated content) provides an enormous wealth of data

• insights into users’ daily lives

• insights into communities

• insights into trends

To whom it may concern

• Politicians

• Companies

• Governmental institutions

• You?

The Age of Big Data

• 25 billion tweets on Twitter in 2010, by 175 million users

• 360 billion pieces of content on Facebook in 2010, by 600 million different users

• 35 hours of video uploaded to YouTube every minute

• 130 million photos uploaded to Flickr per month

Questions to Ask

• Who uploads/talks? (age, gender, nationality, community)

• What are the trending topics?

• What else do these users like?

• Who are the most/least active users?

• etc.
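
Two of the questions above (most/least active users, trending topics) can be sketched with simple counting. This is a minimal illustration only: the `posts` list is made-up toy data, and treating hashtags as "topics" is an assumption, not something the lecture prescribes.

```python
from collections import Counter

# Hypothetical toy data: (user, text) pairs standing in for real posts.
posts = [
    ("alice", "Loving the new #python release"),
    ("bob", "#python and #bigdata meetup tonight"),
    ("alice", "More #bigdata reading"),
    ("carol", "Coffee first"),
    ("alice", "#python tips thread"),
]

def activity_per_user(posts):
    """Count how many posts each user wrote."""
    return Counter(user for user, _ in posts)

def trending_hashtags(posts):
    """Count hashtag occurrences across all posts (case-insensitive)."""
    tags = Counter()
    for _, text in posts:
        for word in text.split():
            if word.startswith("#") and len(word) > 1:
                tags[word.lower()] += 1
    return tags

activity = activity_per_user(posts)
most_active = activity.most_common(1)[0][0]
least_active = min(activity, key=activity.get)  # ties: first user seen wins
top_tags = trending_hashtags(posts).most_common(2)

print(most_active, least_active, top_tags)
```

On real data the same counts would come from an API export or a database query; only the counting step is shown here.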

What do you prefer?

Image: http://www.co.olmsted.mn.us/prl/propertyrecords/RecordingDocuments/PublishingImages/forms.jpg

The Rise of the Data Scientist

http://radar.oreilly.com/2010/06/what-is-data-science.html

The Rise of the Data Scientist

• Data Science enables the creation of data products

• Data products are applications that acquire their value from the data, and create more data as a result.

• Users are in a feedback loop: they constantly provide information about the products they use, which gets used in the data product.

Popular Data Products

Data Mining 101

(Inspired by George Tziralis’ FOSS Conf’09, John Elder IV’s Salford Systems Data Mining Conf. and Toon Calders’ slides)

Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns in data.

http://www.freefoto.com/images/33/12/33_12_7---Pebbles_web.jpg

Data Mining 101

Databases, Statistics, Artificial Intelligence

• Data input & exploration

• Preprocessing

• Data mining algorithms

• Evaluation & Interpretation

Data Input & Exploration

• What data do I need to answer question X?

• What variables are in the data?

• Basic stats of my data?
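
As a rough sketch of the exploration step, the snippet below lists which variables occur in a dataset and computes basic statistics for one of them. The `records` list and its field names (`age`, `likes`) are hypothetical examples, not data from the lecture.

```python
import statistics

# Hypothetical records: one dict per user, with made-up variable names.
records = [
    {"user": "alice", "age": 24, "likes": 10},
    {"user": "bob", "age": 31, "likes": 2},
    {"user": "carol", "age": None, "likes": 7},
    {"user": "dave", "age": 45, "likes": 1},
]

def variables(records):
    """List every variable (key) that occurs in the data."""
    return sorted({key for row in records for key in row})

def basic_stats(records, field):
    """Summarise one numeric variable, skipping missing values."""
    values = [row[field] for row in records if row.get(field) is not None]
    return {
        "count": len(values),
        "missing": len(records) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

print(variables(records))
print(basic_stats(records, "likes"))
print(basic_stats(records, "age"))
```

Reporting the number of missing values alongside the statistics is a deliberate choice here: it already hints at the cleanup work the preprocessing step will need.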

Input & Exploration in ‘LikeMiner’

Preprocessing

• Cleanup!

• Choose a suitable data model

• What happens if you integrate data from multiple sources?

• Reformat your data
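
The preprocessing steps above can be illustrated with a small sketch that cleans names, reformats two differently shaped sources into one data model, and merges them. Both sources and all field names are invented for illustration; real sources will disagree in messier ways.

```python
# Hypothetical inputs: two sources describing the same users differently.
source_a = [
    {"name": " Alice ", "likes": "Python;Big Data"},
    {"name": "BOB", "likes": ""},
]
source_b = [
    {"username": "alice", "interests": ["python", "music"]},
    {"username": "carol", "interests": ["film"]},
]

def clean_name(raw):
    """Cleanup: trim whitespace and lowercase so names can be matched."""
    return raw.strip().lower()

def from_source_a(row):
    """Reformat a source A row into the common model: (user, set of interests)."""
    parts = row["likes"].split(";")
    interests = {part.strip().lower() for part in parts if part.strip()}
    return clean_name(row["name"]), interests

def from_source_b(row):
    """Reformat a source B row into the same common model."""
    interests = {item.strip().lower() for item in row["interests"]}
    return clean_name(row["username"]), interests

def integrate(rows_a, rows_b):
    """Merge both sources into one dict: user -> set of interests."""
    merged = {}
    converted = [from_source_a(r) for r in rows_a] + [from_source_b(r) for r in rows_b]
    for user, interests in converted:
        merged.setdefault(user, set()).update(interests)
    return merged

users = integrate(source_a, source_b)
print(users)
```

Note that matching users by a cleaned name is a simplifying assumption; when integrating real sources, two accounts with the same name are not necessarily the same person.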

Preprocessing in ‘LikeMiner’

Data mining algorithms

• Classification: Generalising a known structure and applying it to new data

• Association: Finding relationships between variables

• Clustering: Discovering groups and structures in data
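
To make the three families above concrete, here is one deliberately tiny stand-in for each: nearest-neighbour classification, pair counting for association, and one-dimensional k-means for clustering. These are illustrative choices, not the algorithms the lecture singles out, and all data is made up.

```python
from collections import Counter
from itertools import combinations

def classify_1nn(labelled, point):
    """Classification: give a new point the label of its nearest known example."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(labelled, key=lambda example: dist(example[0], point))
    return label

def pair_support(baskets):
    """Association: count how often two items occur together."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def kmeans_1d(values, centres, rounds=10):
    """Clustering: group numbers around k centres (plain k-means in 1-D)."""
    centres = list(centres)
    groups = [[] for _ in centres]
    for _ in range(rounds):
        groups = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)
        # Move each centre to the mean of its group; keep it if the group is empty.
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return groups, centres

# Hypothetical examples: (posts per day, likes per day) -> user type.
labelled = [((1.0, 1.0), "casual"), ((9.0, 8.0), "power user")]
label = classify_1nn(labelled, (8.0, 9.0))

# Hypothetical sets of liked topics, one per user.
support = pair_support([["film", "music"], ["music", "film", "sport"], ["sport"]])

groups, centres = kmeans_1d([1, 2, 3, 10, 11, 12], centres=[0, 5])
print(label, support.most_common(1), groups, centres)
```

The key difference shows in the inputs: classification needs labelled examples, while association and clustering work on unlabelled data.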

Mining in ‘LikeMiner’

• Filter users by interests

• Construct user graphs

• Run PageRank on the graphs to mine representativeness

• Result: set of influential users

• Compare page topics to user interests to find the pages most representative of each topic
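
The PageRank step can be sketched as a plain power iteration over a user graph. This is a generic textbook version under stated assumptions, not LikeMiner's actual implementation: the graph, the meaning of an edge, the damping factor of 0.85 and the number of rounds are all illustrative choices.

```python
def pagerank(graph, damping=0.85, rounds=50):
    """Power-iteration PageRank on a dict: node -> list of nodes it links to."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(rounds):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / n
        rank = new_rank
    return rank

# Hypothetical user graph, assumed to be already filtered by interest:
# an edge u -> v means user u points to user v.
graph = {
    "alice": ["carol"],
    "bob": ["carol"],
    "carol": ["alice"],
    "dave": ["carol", "alice"],
}

ranks = pagerank(graph)
influential = sorted(ranks, key=ranks.get, reverse=True)
print(influential)
```

Users that many others point to end up at the top of `influential`; users nobody points to (here `bob` and `dave`) keep only the baseline share.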

Interpreting your results

Data Mining is not easy

Mining Social Web Data

Source: http://kunau.us/wp-content/uploads/2011/02/Screen-shot-2011-02-09-at-9.03.46-PM-w600-h900.png

Single Person

Source: http://infosthetics.com/archives/2011/12/all_the_information_facebook_knows_about_you.html

See also: http://www.youtube.com/watch?feature=player_embedded&v=kJvAUqs3Ofg

Populations

http://www.brandrants.com/brandrants/obama/

Brand Sentiment via Twitter

http://flowingdata.com/2011/07/25/brand-sentiment-showdown/
