2012. 12. 07
http://kse.kaist.ac.kr
KAIST
?
•– / , , , ,
– ; , ,
– ; data.gov(data.or.kr), recovery.gov, challenge.gov
•– (Polarization; balkanization)
– ;
– ; ;
What Can We Do?
• ?
• ? ( , , )
2011.11
2012.12
What Can We Do?
• , , – “ ” ; /– : , , – /
• -– : ,
, , ( )
• “” (Doug Schuler, 1999)
• : – : , MMORPG
–––––
• : – , , ,
–
•– : , – : , – : , – : 2.0, ,
Figure labels: Applications; Content provider; Content; Networking; Fixed access; Radio access
References
Schuler 1994, Social computing, CACM 1994
Wikipedia http://wikipedia.org/wiki/Social_computing
,
Wang et al. 2007, Social computing: From Social Informatics to Social Intelligence, IEEE Intelligent Systems
Social computing: From social informatics to social intelligence, Wang et al., IEEE Intelligent Systems, 22(2), 79-83 2007
•––––
• , ,
• /
•
•
• (sentiment analysis/opinion mining)
•
•
• …
• LinkedIn’s People You May Know (PYMK)–
• Facebook’s PYMK– ;
–
• Google/Amazon’s web log analytics–
( , )
DJ Patil, Building Data Science Teams, 2012
“Acquire, process, and leverage data in a timely fashion to create efficiencies, iterate on and develop new products and navigate the competitive landscape”
DJ Patil, Building Data Science Teams, 2012
/ / Product delivery (e.g., leveraging smarts in product features)
/
• :
• : /
• :
•––––
•– , ,
•–– API –––
•
• (Structured data)–– ( , , )
• (Semi-structured data)–
( )
– ( “self-describing” )
– (XML, JSON )
• (Unstructured data)––
http://www.dcs.bbk.ac.uk/~ptw/teaching/ssd/notes.html
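The "self-describing" nature of semi-structured formats such as XML and JSON can be illustrated with a minimal sketch. The record below is hypothetical (invented for illustration, not taken from the slides); the point is only that field names travel with the data, so a program can discover the structure without a predefined schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented for illustration only.
json_text = '{"name": "Alice", "tags": ["hci", "mobile"]}'
xml_text = "<person><name>Alice</name><tag>hci</tag><tag>mobile</tag></person>"

# Both parsers recover field names from the document itself.
record = json.loads(json_text)
root = ET.fromstring(xml_text)

json_fields = sorted(record.keys())            # keys carried inside the JSON text
xml_fields = sorted({child.tag for child in root})  # element names carried inside the XML text
xml_tags = [t.text for t in root.findall("tag")]    # repeated elements act like a list

print(json_fields)
print(xml_fields, xml_tags)
```

Structured data (for example, a relational table) would instead keep the field names in a separate schema, and unstructured data (free text) carries no field names at all.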
:
• : HTTP(Hypertext Transfer Protocol)
•GET
response
• GET /parking/space.asp
200 OK… data data
( / / )
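The GET exchange on this slide can be sketched as plain text handling, without any network access. The host name `example.com`, the `Content-Type` header, and the helper names are assumptions made for illustration; only the path `/parking/space.asp` and the `200 OK` status come from the slide.

```python
def build_get_request(path, host):
    """Return the text of a minimal HTTP/1.0 GET request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"


def parse_response(raw):
    """Split a raw HTTP response into (status_code, reason, body)."""
    head, _, body = raw.partition("\r\n\r\n")   # headers end at the first blank line
    status_line = head.split("\r\n")[0]          # e.g. "HTTP/1.0 200 OK"
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason, body


# Hypothetical host; the path is the one shown on the slide.
request = build_get_request("/parking/space.asp", "example.com")

# Hand-written response text standing in for what a server might send back.
raw_response = "HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\n\r\ndata data"
status, reason, body = parse_response(raw_response)

print(request.splitlines()[0])
print(status, reason, body)
```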
:
•– IP ,
– Request method (GET, POST, etc.)
– Status code (success: 200, not found: 404)
–– Referrer URL:
– User agent ( )
– ,
64.12.105.154 - - [16/Feb/2001:06:59:35 -0800] "GET /cgi-bin/Count.cgi?df=gecbhome&dd=B HTTP/1.0" 404 211
64.12.97.10 - - [16/Feb/2001:06:59:37 -0800] "GET /java/FixFontHeadline.class HTTP/1.0" 200 2898
64.12.97.9 - - [16/Feb/2001:06:59:43 -0800] "GET /graphics/trombone.gif HTTP/1.0" 200 1050
64.12.96.206 - - [16/Feb/2001:06:59:58 -0800] "GET /images/joinband.jpg HTTP/1.0" 200 13457
64.12.97.9 - - [16/Feb/2001:07:00:30 -0800] "GET /images/parade.jpg HTTP/1.0" 200 22754
128.93.11.53 - - [16/Feb/2001:10:20:53 -0800] "GET /schedule.shtml HTTP/1.0" 200 7103
128.93.11.53 - - [16/Feb/2001:10:26:48 -0800] "GET /index.shtml HTTP/1.0" 200 8650
128.93.11.53 - - [16/Feb/2001:10:21:18 -0800] "GET /about.shtml HTTP/1.0" 200 9151
128.93.11.53 - - [16/Feb/2001:10:26:25 -0800] "GET /communty.shtml HTTP/1.0" 200 5731
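Log entries in this style can be split into fields with a regular expression. The sketch below assumes each entry sits on its own line in Common Log Format and that the last field is the response size in bytes; the pattern and helper names are illustrative, not part of the original slides.

```python
import re
from collections import Counter

# One Common Log Format entry: host, identity, user, [time], "request", status, size.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

# Three entries from the sample log, assumed to be one per line.
sample = [
    '64.12.105.154 - - [16/Feb/2001:06:59:35 -0800] "GET /cgi-bin/Count.cgi?df=gecbhome&dd=B HTTP/1.0" 404 211',
    '128.93.11.53 - - [16/Feb/2001:10:20:53 -0800] "GET /schedule.shtml HTTP/1.0" 200 7103',
    '128.93.11.53 - - [16/Feb/2001:10:26:48 -0800] "GET /index.shtml HTTP/1.0" 200 8650',
]


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


entries = [e for e in map(parse_line, sample) if e is not None]

# Requests per client address; note that one address may hide many users behind NAT.
requests_per_ip = Counter(e["ip"] for e in entries)
print(requests_per_ip.most_common())
```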
:
•
•
•– NAT(Network Address Translation)
–
: API
• REST – REpresentational State Transfer
– HTTP (client/server + stateless server)
–– URL
– - (State transfer)
– (JSON, XML )
GET http://search.twitter.com/trends.json
Returns the top ten topics that are currently trending on Twitter.
GET Read
POST Create
PUT Update
DELETE Delete
How to access top ten Twitter topics?
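The mapping between HTTP methods and create/read/update/delete operations can be sketched with Python's standard `urllib.request.Request`. The base URL `http://api.example.com/topics` and the payload are hypothetical; the requests are only constructed here, not sent.

```python
import json
from urllib.request import Request

# The method table from the slide, keyed by operation.
CRUD_TO_HTTP = {"create": "POST", "read": "GET", "update": "PUT", "delete": "DELETE"}


def build_request(operation, url, payload=None):
    """Build (but do not send) a request for one CRUD operation on a resource URL."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return Request(url, data=data, headers=headers, method=CRUD_TO_HTTP[operation])


BASE = "http://api.example.com/topics"  # hypothetical endpoint, for illustration only

read_req = build_request("read", BASE + "/42")
create_req = build_request("create", BASE, {"name": "kaist"})
delete_req = build_request("delete", BASE + "/42")

for req in (read_req, create_req, delete_req):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would actually send it; each resource is addressed by its URL, and the server keeps no session state between calls.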
: API
•
• 140
• REST API: , ,
•
: API
• Twitter REST API v1.1
– https://dev.twitter.com/docs/api/1.1
• My Applications (OAuth + Access tokens)
– https://dev.twitter.com/apps
use Net::Twitter::Lite;

# Replace the placeholder strings with the credentials from "My Applications"
my $nt = Net::Twitter::Lite->new(
    consumer_key        => 'my consumer key',
    consumer_secret     => 'my consumer secret',
    access_token        => 'my access token',
    access_token_secret => 'my access token secret',
);

# search "google", return 100 results
my $r = eval { $nt->search({ q => "google" }) };
for my $status ( @{ $r->{results} } ) {
    print "$status->{text}\n";
}
Karok smpai lebam with @SyarhDinie @MyoArieff @muhd_google @DzuLHarith
Google Sources Say Company Didn't Buy ICOA Wireless (Arik Hesseldahl/AllThingsD) http://t.co/neX31U01
Wide character in print at test.pl line 14.
RT @NMB_gplus: [高野祐衣] 寝 !! http://t.co/p4p9P8vh #nmb
LG Optimus G2 Allegedly Packs 2GHz Quad-Core Processor, 5-inch Display http://t.co/txmdmKel #tekfalke
RT @NancyGraceHLN: Did #TotMom detectives overlook Google search on Anthony home computer for suffocation methods?
Google compra un proveedor de redes wifi por 308 millones http://t.co/L97itVep
http://search.cpan.org/~mmims/Net-Twitter-3.18004/lib/Net/Twitter.pod
: www.data.go.kr
:
• :
• : wget curl
• :––– URL
– URL
– URL
• : Nutch ( ), Heritrix ( )
:
•
Figure labels: Web; seed pages; URLs crawled and parsed; URL frontier; unseen Web
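The crawler loop behind this diagram (seed pages, a URL frontier, and a set of already-seen URLs) can be sketched as a breadth-first traversal. To keep the example runnable offline, page fetching is passed in as a function and replaced by a small in-memory link table; `TOY_WEB`, `crawl`, and the `example.com` URLs are hypothetical names invented for illustration.

```python
from collections import deque


def crawl(seeds, fetch_links, max_pages=100):
    """Breadth-first crawl: pop a URL from the frontier, parse it, enqueue unseen links."""
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    frontier = deque(seeds)             # URLs discovered but not yet crawled
    seen = set(seeds)                   # everything ever placed on the frontier
    crawled = []                        # URLs crawled and parsed, in visit order
    while frontier and len(crawled) < max_pages:
        url = frontier.popleft()
        crawled.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return crawled


# Hypothetical link graph standing in for real pages (no network access).
TOY_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/"],
    "http://example.com/c": [],
}

order = crawl(["http://example.com/"], lambda url: TOY_WEB.get(url, []))
print(order)
```

A production crawler such as Nutch adds politeness delays, robots.txt handling, and URL normalization, none of which are modeled in this sketch.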
:
•– ( )– ( )– ( )
•–– GPS
•
•
•
• , ,
:
• :
Personal Sensing
Public Sensing
Social Sensing
SENSE
LEARN
INFORM, SHARE, PERSUASION
Mobile Sensing Architecture
Mobile Computing Cloud
:
•
Nexus One, Galaxy Nexus, iPhone 4/5, Samsung Galaxy S3, HTC Incredible, Galaxy Tab / iPad 2
(GPS/ )
:
•
•
•– ( GPS)– ( )
•
•
• Reliability: availability, fault tolerance
• Scalability
• Extensibility
• Manageability
http://www.slideshare.net/cloudera/flume-intro100715
Figure labels: agents; real-time aggregators; Collection Manager; Collection Planning
: Flume, Scribe, Chukwa
•––––
Review Spotlight: A User Interface for Summarizing User-generated Reviews Using Adjective-Noun Word Pairs, Koji Yatani, Michael Novati, Andrew Trusty, Khai N. Truong, CHI 2011
?
• (Rivadeneira et al.)– :
– :
– (impression formation)
– (recognizing):
“ ”
•
•
•( )
• 8
• 4– , – 30
•
•–––
•– : (Chinese food)– : (great steak)
•––
• :
Review Spotlight
•
Review Spotlight, Tag-cloud-like interface
Review Spotlight
1. -
2.
3.
4.
Review Spotlight
1. -– (Part-of-speech tagging)
– ( )
– : “The food is great” => great food
2.– ,
– :
:
: 10~30
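The adjective-noun pairing step (for example, "The food is great" => great food) can be sketched as follows. This is a simplified illustration, not the Review Spotlight implementation: a tiny hand-written lexicon `TOY_POS` stands in for a real part-of-speech tagger, and pairing each adjective with the nearest noun in the same sentence is an assumed rule.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon standing in for a real part-of-speech tagger.
TOY_POS = {
    "food": "NOUN", "steak": "NOUN", "service": "NOUN",
    "great": "ADJ", "slow": "ADJ", "chinese": "ADJ",
}


def extract_pairs(review):
    """Pair each adjective with the nearest noun in the same sentence."""
    pairs = []
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = re.findall(r"[a-z']+", sentence)
        nouns = [(i, w) for i, w in enumerate(words) if TOY_POS.get(w) == "NOUN"]
        for i, word in enumerate(words):
            if TOY_POS.get(word) == "ADJ" and nouns:
                _, noun = min(nouns, key=lambda n: abs(n[0] - i))
                pairs.append(f"{word} {noun}")
    return pairs


# Invented example reviews, for illustration only.
reviews = [
    "The food is great. Service was slow.",
    "Great steak and great food!",
]

# Count how often each pair occurs across all reviews.
pair_counts = Counter(p for r in reviews for p in extract_pairs(r))
print(pair_counts.most_common())
```

The resulting counts are the kind of frequency signal a tag-cloud-like interface could use to decide how prominently to display each pair.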
Review Spotlight
3. – SentiWordNet ( , )
– , ,
––– ;
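A lexicon-based polarity lookup in the spirit of this step can be sketched as below. SentiWordNet assigns positivity and negativity scores to word senses; the scores in `TOY_SCORES`, the threshold, and the three labels are invented for illustration and are not taken from SentiWordNet or from the Review Spotlight paper.

```python
# Hypothetical (positive, negative) scores per adjective; not real SentiWordNet values.
TOY_SCORES = {
    "great": (0.75, 0.0),
    "slow": (0.0, 0.5),
    "chinese": (0.0, 0.0),
}


def polarity(word, threshold=0.1):
    """Label a word by the difference between its positive and negative scores."""
    pos, neg = TOY_SCORES.get(word.lower(), (0.0, 0.0))
    net = pos - neg
    if net > threshold:
        return "positive"
    if net < -threshold:
        return "negative"
    return "neutral"


# Label each adjective-noun pair by the polarity of its adjective (the first word).
pairs = ["great food", "slow service", "chinese food"]
labels = {pair: polarity(pair.split()[0]) for pair in pairs}
print(labels)
```

An interface could then map each label to a display attribute such as color; that mapping is left out here because the slide text does not preserve it.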
Review Spotlight
4. ––– 4
Review Spotlight
•
• : , / , ,
• : / , , ,
• : , API,