Studying online conversations in the Korean blogosphere: A network approach
Anatoliy Gruzd
Dalhousie University, Canada
Chung Joo Chung
State University of New York at Buffalo, USA
Han Woo PARK
YeungNam University, Korea
International Sunbelt Social Network Conference, Riva del Garda, Italy
July 3, 2010
Anatoliy Gruzd
Automated Discovery of Online Networks
Automated Discovery of Social Networks among Blog Readers/Commentators
What content-based features of online interactions help to uncover nodes and ties between online participants?
ICTA - Online Tool for Social Network Discovery http://TextAnalytics.net
Dataset
OhMyNews – popular blogging website in Korea
Single blog authored by 방짜 (bangzza) http://blog.ohmynews.com/bangzza
~1,000 comments (April 2008 - April 2009)
Sample blog post and comments
Automated Discovery of Social Networks
Chain (reply-to) method
Visualized by CMU ORA
Automated Discovery of Social Networks
Name Network Approach
Method: Connect the sender to people mentioned in the message
Discovered tie(s): Ann -> Steve, Ann -> Natasha
Method: Connect people whose names co-occur in the same message(s)
Discovered tie(s): Steve <-> Natasha
FROM: Ann
“Steve and Natasha, I couldn't wait to see your site.
I knew it was going to [be] awesome!”
This approach looks for personal names in the content of the comments to identify social connections between online participants.
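The two tie rules in the Ann example can be sketched in Python as follows. This is only an illustration, not the ICTA implementation: the function name `name_network_ties` and the `known_names` set are assumptions made for the example, and real name discovery in comment text is much harder than a lookup against a known list.

```python
from itertools import combinations


def name_network_ties(sender, message, known_names):
    """Return (directed, undirected) ties found in a single message.

    known_names is a hypothetical, pre-built set of participant names;
    the slides do not say how such a list is obtained.
    """
    # Naive name spotting: split on whitespace and strip common punctuation.
    words = {w.strip(".,!?:;\"'") for w in message.split()}
    mentioned = sorted(n for n in known_names if n in words and n != sender)

    # Rule 1: connect the sender to each person mentioned in the message.
    directed = [(sender, name) for name in mentioned]
    # Rule 2: connect people whose names co-occur in the same message.
    undirected = list(combinations(mentioned, 2))
    return directed, undirected


directed, undirected = name_network_ties(
    "Ann",
    "Steve and Natasha, I couldn't wait to see your site.",
    {"Ann", "Steve", "Natasha"},
)
print(directed)    # sender -> mentioned person
print(undirected)  # co-occurring names
```

Ties from many messages could then be accumulated into one weighted graph, with a tie's weight counting how often it was discovered.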
Automated Discovery of Social Networks
Name Network Approach
• Main Communicative Functions of Personal Names (Leech, 1999)
– getting attention and identifying addressee
– maintaining and reinforcing social relationships
• Names are "one of the few textual carriers of identity" in discussions on the web (Doherty, 2004)
• Their use is crucial for the creation and maintenance of a sense of community (Ubon, 2005)
Network representation of blog comments
Semi-automated network evaluation
Nodes in the network:
1 난 (I)
2 사진쟁이 (Photographer)
3 그래서 (and, so)
4 테츠 (Tetz)
5 방짜 (Bangzza)
6 댓글 (comment)
7 녹두 (Nokdu)
8 ㅋㅋ (laughter, ":)")
9 좀 (a little, a bit)
10 사람 (people)
Among the 10 nodes, only 2, 4, 5 and 7 are names or IDs of participants in the Bangzza blog.
Clues suggesting that a word is likely to be a nickname
• context words such as "님" (an honorific) or "씨" (Mr./Ms.)
• full name, which is almost always three characters
• punctuation indicative of someone being addressed (e.g., "/" or ":")
• combination of characters (Korean, English and/or Chinese), symbols (e.g., underscores, hyphens) and numbers
• patterns indicative of non-native words
– phonetic koreanization of English (e.g., "미디어몽골" = mediamogul = Media Mogul)
– phonetic romanization of Korean (e.g., "jihwaja" = 지화자)
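Several of these clues are simple enough to check mechanically. The sketch below is a rough illustration under stated assumptions, not the authors' algorithm: the function name `nickname_clues`, the clue labels, and the character ranges are all choices made for this example. Each clue is only a weak signal (any three-syllable Korean word passes the "full name" check), and the koreanization/romanization patterns are left out because they would need a transliteration resource.

```python
import re

HONORIFICS = ("님", "씨")      # context words from the list above
ADDRESS_PUNCT = ("/", ":")    # punctuation that suggests an addressee


def nickname_clues(candidate, following_text=""):
    """Return the (hypothetical) clue labels that fire for a word candidate.

    following_text is whatever comes right after the candidate in the comment.
    """
    clues = []
    nxt = following_text.lstrip()
    if nxt.startswith(HONORIFICS):
        clues.append("honorific")
    if nxt.startswith(ADDRESS_PUNCT):
        clues.append("address punctuation")

    # Count how many character classes the candidate mixes.
    classes = sum([
        re.search(r"[가-힣]", candidate) is not None,           # Hangul syllables
        re.search(r"[A-Za-z]", candidate) is not None,         # Latin letters
        re.search(r"[\u4e00-\u9fff]", candidate) is not None,  # Chinese characters
        re.search(r"[0-9]", candidate) is not None,            # digits
        re.search(r"[_\-]", candidate) is not None,            # underscores, hyphens
    ])
    if classes >= 2:
        clues.append("mixed characters")

    # Korean full names are almost always three syllables long.
    if re.fullmatch(r"[가-힣]{3}", candidate):
        clues.append("three-character full name")
    return clues


print(nickname_clues("방짜", "님"))
print(nickname_clues("bangzza_01", ": hello"))
```

A candidate that collects several clues could be ranked higher for manual review, which fits the semi-automated evaluation described earlier.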
Words that are NOT likely to be used as a nickname
• a word candidate is a phrase
– e.g., if the "FROM" field is used more like a subject line (possible indicators include white spaces and length)
• a word candidate consists of a single character (e.g., "a" or "ㄱ")
• a word candidate consists of netspeak
– emoticons (e.g., "=_=")
– slang and abbreviations (e.g., using "2MB" to refer to the former Korean president)
– onomatopoeia (e.g., "ㅋㅋ" = heehee, "하하" = haha)
Words that are NOT likely to be used as a nickname (2)
• a word candidate appears more than one time in the comment
• a word candidate consists of random characters (e.g., "ㅁㄴㅇㄹ" or "asdf")
• a word candidate is a short, conversational word or phrase (e.g., "나" = me, "아이고" = oh no, "그래서" = so/therefore)
• a word candidate is a common word or idea in the given context/topic (e.g., "대한민국" = Republic of Korea, "쥐체사상" = a newly created word used to refer to political fanatics)
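The negative clues lend themselves to a simple rule-based filter. The following Python sketch is an assumption-laden illustration rather than the authors' procedure: the function name `unlikely_nickname_reasons`, the reason labels, and the small `STOPLIST` (seeded only with examples quoted on these slides) are hypothetical. Random Latin strings such as "asdf" are not detected here, since that would need a dictionary or a language model.

```python
import re

# Hypothetical stoplist built from the examples on the slides; a real one
# would be much larger and specific to the blog's topic.
STOPLIST = {"나", "아이고", "그래서", "하하", "2MB", "대한민국"}


def unlikely_nickname_reasons(candidate, comment_text=""):
    """Return the reasons (if any) why a candidate is probably not a nickname."""
    reasons = []
    if re.search(r"\s", candidate):
        reasons.append("phrase")
    if len(candidate) == 1:
        reasons.append("single character")
    # Only punctuation and symbols, e.g. "=_=".
    if re.fullmatch(r"[\W_]+", candidate):
        reasons.append("emoticon")
    # Only bare Hangul jamo, e.g. "ㅋㅋ" (onomatopoeia) or "ㅁㄴㅇㄹ" (random keys).
    if re.fullmatch(r"[ㄱ-ㅎㅏ-ㅣ]+", candidate):
        reasons.append("bare jamo")
    if candidate in STOPLIST:
        reasons.append("common or conversational word")
    if comment_text.count(candidate) > 1:
        reasons.append("repeated in comment")
    return reasons


for word in ["ㅋㅋ", "=_=", "그래서", "방짜"]:
    print(word, unlikely_nickname_reasons(word))
```

A candidate with an empty reason list is not confirmed as a nickname; it has only survived these exclusion rules and would still need to be checked against the positive clues or by hand.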
Conclusion
A network representation of comments posted to a blog makes it much easier to analyze social interactions among online participants
Even in a blog dominated by mostly anonymous and argumentative commentators, a community can still emerge
Suggested future improvements to our network discovery algorithm.
Acknowledgments
Jaeeun Yoo at the University of Toronto for her help with the data analysis
The project is in part supported by