Data Exploration and SE - Jan 8 2014, mc lab, Seoul, South Korea

Data Exploration and SE

Sanghee Kim

sanghee.kim@colorodo.edu

Session 1

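Science paradigms

Big data / small data

“Do you really think the data you have is big?”

The data processing flow

Collecting (generating) data

Processing data

Analyzing data

Visualizing data

Data processing and related tools

Reference for each tool: http://goo.gl/ooYExB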

Google BigQuery

Apache Lucene

Many Eyes

Google Chart API

matplotlib

pandas, NumPy

OpenRefine

Data Wrangler

Tableau

NodeXL

Splunk

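Collecting (generating) data

Processing data

Analyzing data

Visualizing data

“Let's analyze Twitter data.”

Processing data

Analyzing data

Visualizing data

Just give it a try

Get the tools ready

Collecting (generating) data: Twitter API, Twython

Processing data: Python, Twython, IPython, Pandas

Analyzing data: Splunk, Python, IPython, Pandas

Visualizing data: Splunk, matplotlib, Google Chart API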


This query shows that the most-retweeted tweet about the Nexus 5 is from Google, which suggests that they have a strong voice in reaching their fans.

data: https://github.com/sangheestyle/bisonsampledata

presentation: http://goo.gl/MLFf96

Trying it with Splunk

Interesting query 1 of 3

source="/Users/kimsanghee/Dev/datastore4bison/nexus_5_raw.csv.zip:./nexus_5_raw.csv"

This query shows that, at launch time, the most-retweeted tweet about the Nexus 5 is from Sundar Pichai, a senior vice president at Google who oversees Android, Chrome and Google Apps. This suggests that he has a strong voice in reaching fans.

Analyzing with Twitter data

Interesting query 2 of 3

Interesting query 3 of 3

The top tweets show which organization was most influential over the 19 days.

The second-ranked tweet is about a promotional giveaway for a free Nexus 5.

http://mobilesyrup.com/2013/11/02/win-a-google-nexus-5/
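The top-tweet ranking behind the three queries can also be reproduced outside Splunk. The following is a minimal pandas sketch, not the project's actual code: the column names (`screen_name`, `text`, `retweet_count`) and the sample rows are assumptions made for illustration, since the layout of `nexus_5_raw.csv` is not shown in the slides.

```python
import pandas as pd


def top_retweeted(tweets: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    """Return the n tweets with the highest retweet count, one row per text."""
    # Retweets repeat the original text, so keep only the highest count per text.
    deduped = (
        tweets.sort_values("retweet_count", ascending=False)
        .drop_duplicates(subset="text")
    )
    return deduped.head(n).reset_index(drop=True)


# Hypothetical sample rows standing in for the exported tweets.
sample = pd.DataFrame(
    {
        "screen_name": ["account_a", "account_b", "account_c", "account_a"],
        "text": ["launch news", "giveaway", "review", "launch news"],
        "retweet_count": [120, 80, 5, 118],
    }
)

top = top_retweeted(sample, n=2)
print(top[["screen_name", "retweet_count"]])
```

Deduplicating on the text before ranking is a design choice: without it, many copies of the same retweeted message would fill the top of the list and hide the second and third most influential tweets.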

Analyzing with Twitter data

“Analyzing with Twitter data + changing the tools and the thinking”

Bison: Project Overview

Objective: Analyzing tweets about mobile devices

Source & demo: https://github.com/sangheestyle/bison

How big: 789,051 tweets

Tools: Python, Pandas, NumPy, Google Chart

Members: Jacob, Sanghee

What happened?

http://goo.gl/L26mmP

What happened once again?

Only two weeks!

http://goo.gl/1yaekZ

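What do they use?

http://goo.gl/OzYu0J

When do they tweet?

http://goo.gl/Y28HrQ

Where do they live?

http://goo.gl/vyi1Gy

“Does changing the tool change only the tool?”

Things to think about

What do you think of this? (What do you like? What don't you?)

Should two graphs be shown side by side for accuracy?

If you were to extend this, how?

What additional data would let you do more, and what would you do with it?

Would the model you built hold elsewhere? (time period, data size, other items in the same category, other categories…)

Session 1 wrap-up + mid-point retrospective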

Session 2

“40 percent of major decisions are based not on facts, but on the manager’s gut”

from Software Analytics = Sharing Information by Thomas Zimmermann http://goo.gl/WQ0BKv

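The data processing flow

Collecting (generating) data

Processing data

Analyzing data

Visualizing data

“Let's analyze the data that comes out of Git.”

Processing data

Analyzing data

Visualizing data

Just give it a try

Get the tools ready

Collecting (generating) data: Git

Processing data: Python, IPython, Pandas

Analyzing data: Splunk, Python, IPython, Pandas

Visualizing data: Splunk, matplotlib, Google Chart API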
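As a rough illustration of the collect-and-process steps for Git data, the sketch below counts commits per author and month from text in the shape that `git log --pretty=format:"%an|%ad" --date=short` produces. The author names, dates, and the `commits_per_author_month` helper are made up for this example; the slides do not show the actual processing code.

```python
from collections import Counter

# Sample text in the shape produced by:
#   git log --pretty=format:"%an|%ad" --date=short
# The author names and dates below are hypothetical.
SAMPLE_LOG = """\
alice|2013-11-02
bob|2013-11-02
alice|2013-11-15
alice|2013-12-01
bob|2013-12-03
"""


def commits_per_author_month(log_text: str) -> Counter:
    """Count commits per (author, YYYY-MM) pair from 'author|date' lines."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last '|' so an author name containing '|' still parses.
        author, date = line.rsplit("|", 1)
        counts[(author, date[:7])] += 1
    return counts


counts = commits_per_author_month(SAMPLE_LOG)
for (author, month), n in sorted(counts.items()):
    print(author, month, n)
```

A count like this only measures how often someone commits, so it is a starting point for discussion rather than a measure of contribution.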

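“Let's look over something prepared in advance, together as a group.”

“Let's look at our group's characteristics over time.”

“Who is doing the best? But beware of being fooled by appearances!”

“Conflict zones! Where is the UN?”

“Let's look at something else too.”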

https://github.com/twbs/bootstrap/graphs

“Is it really okay for us to do this?”

Things to think about

Do not push an immature model

Correlation

Incentives

From SE lecture by Professor Ruth Dameron (University of Colorado, Boulder)

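Development: How can we work with less pain?

Education: What kind of training should we create?

HR: What kind of people do we need? What about the organizational structure?

Organizational culture: What are the characteristics of our organization?

Extending the idea

Where and how will the data be collected?

Does the data sufficiently reflect the group?

The data can change continuously.

The information can differ depending on how you analyze it.

Make assumptions, talk with others, and extend your thinking.

Make use of the experts within the group.

Try adjusting outliers rather than cutting them off.

Deliberately switch tools.

(What else?)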
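One possible reading of "adjusting outliers rather than cutting them off" is to pull extreme values back to a quantile bound so that every observation stays in the data set. The sketch below shows that idea with pandas; the 5% and 95% bounds, the `clip_outliers` helper, and the sample numbers are assumptions chosen for illustration, not values from the talk.

```python
import pandas as pd


def clip_outliers(
    values: pd.Series, lower_q: float = 0.05, upper_q: float = 0.95
) -> pd.Series:
    """Pull values outside the given quantiles back to the quantile bounds."""
    lower = values.quantile(lower_q)
    upper = values.quantile(upper_q)
    return values.clip(lower=lower, upper=upper)


# Hypothetical commit counts per developer, with one extreme value.
commits = pd.Series([3.0, 5.0, 4.0, 6.0, 5.0, 4.0, 250.0])
adjusted = clip_outliers(commits)

# Every observation is kept; only the extremes move toward the bounds.
print(adjusted.tolist())
```

Compared with dropping the extreme row, this keeps the group size unchanged, which matters when the question is whether the data still reflects the whole group.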

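Key points

“(Review the current system, derive improvements, apply them) × continuous repetition”

“So in the end, what will you do, and why?”

“Can a developer's ability be judged by the number of commits?”

Group discussion

Session 2 wrap-up + final retrospective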
