Machine Learning and Data at Meetup Evan Estola Meetup.com [email protected] @estola



DESCRIPTION

Presentation given for Tech Talks at Meetup event on 8/27/13


Page 1: Machine learning and data at Meetup

Machine Learning and Data at Meetup

Evan Estola
Meetup.com
[email protected]
@estola

Page 2: Machine learning and data at Meetup

My Background

● Software Engineer/Data Scientist
● Machine learning team
● At Meetup since May 2012
● BS Computer Science
    ○ Information Retrieval
    ○ Data Mining
    ○ Math
        ■ Linear Algebra
        ■ Graph Theory

Page 3: Machine learning and data at Meetup

You

● Data Scientists?
● Engineers?
● Statisticians?
● Students?
● Non-technical?

Page 4: Machine learning and data at Meetup

What this talk is

● Super secret peek into Meetup!
● Meetup recommendations examples
● How we do recommendations (model/features)
● Lessons learned/what’s next

Page 5: Machine learning and data at Meetup

What this talk isn’t

● What is a data scientist?
● What is big data?
● How does matrix factorization or gradient boosted decision trees or map reduce or this framework I hope you’ll use work?

Page 6: Machine learning and data at Meetup

Why Meetup data is cool

● Real people meeting up
● Every meetup could change someone's life
● No ads, just do the best thing
● Oh and 114 million rsvps by >14 million members
● 2.7 million rsvps in the last 30 days
    ○ ~1/second

Page 7: Machine learning and data at Meetup
Page 8: Machine learning and data at Meetup

Data at Meetup

● User data
● Site monitoring/performance
● AB testing
● Recommendations*

Page 9: Machine learning and data at Meetup

“Everything is a recommendation”

● Not my phrase
● Not actually true yet
● Working on it

Page 10: Machine learning and data at Meetup

Recommendation

Page 11: Machine learning and data at Meetup
Page 12: Machine learning and data at Meetup
Page 13: Machine learning and data at Meetup

Topic Recommendations

● New registrant
● Don’t know anything about you yet!
● Most popular is boring/repetitive

Algorithm:
    ○ Group local meetups by topic
    ○ Select topic with most groups
    ○ Remove those groups
    ○ Repeat
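The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Meetup's actual code: the function name `recommend_topics`, the `max_topics` cutoff, the alphabetical tie-break, and the sample groups are all assumptions made for the example.

```python
from collections import defaultdict


def recommend_topics(group_topics, max_topics=5):
    """Greedily pick topics that cover the most not-yet-covered local groups.

    group_topics maps a group name to the set of topics it is tagged with.
    """
    remaining = dict(group_topics)  # groups not yet covered by a chosen topic
    chosen = []
    while remaining and len(chosen) < max_topics:
        # Step 1: group the remaining local meetups by topic.
        by_topic = defaultdict(set)
        for group, topics in remaining.items():
            for topic in topics:
                by_topic[topic].add(group)
        if not by_topic:
            break  # the remaining groups have no topics at all
        # Step 2: select the topic with the most groups (ties go to the
        # alphabetically first topic, an arbitrary choice for this sketch).
        best = max(sorted(by_topic), key=lambda t: len(by_topic[t]))
        chosen.append(best)
        # Step 3: remove those groups. Step 4: repeat.
        for group in by_topic[best]:
            del remaining[group]
    return chosen


# Hypothetical local groups, for illustration only.
local_groups = {
    "NY Hikers": {"hiking", "outdoors"},
    "Trail Runners": {"running", "outdoors"},
    "Weekend Walks": {"hiking", "outdoors"},
    "Python NYC": {"python", "tech"},
    "Data Nerds": {"data", "tech"},
}
print(recommend_topics(local_groups, max_topics=3))  # ['outdoors', 'tech']
```

Removing the covered groups before picking again is what keeps the list from being "boring/repetitive": once "outdoors" is chosen, "hiking" no longer scores, so the next pick comes from a different part of the local scene.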

Page 14: Machine learning and data at Meetup
Page 15: Machine learning and data at Meetup
Page 16: Machine learning and data at Meetup

Group/Event Recommendations

● Replaced a topic only system
● Inputs:
    ○ Member, location, topics, facebook friends? demographics?
● Outputs:
    ○ Ranking

Page 17: Machine learning and data at Meetup

Collaborative Filtering

● Classic recommendations approach
● Users who like this also like this
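One simple way to read "users who like this also like this" is co-occurrence counting: for a given group, count how many of its members also belong to each other group. The sketch below assumes that reading. The function names and the sample memberships are hypothetical, and production collaborative filtering usually relies on more elaborate models.

```python
from collections import Counter
from itertools import combinations


def co_join_counts(memberships):
    """For every group, count how many members it shares with each other group."""
    counts = {}
    for groups in memberships.values():
        for a, b in combinations(sorted(groups), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def also_joined(group, memberships, top_n=3):
    """Return the groups most often joined alongside `group`."""
    counts = co_join_counts(memberships).get(group, Counter())
    # Sort by shared-member count (descending), then by name for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]


# Hypothetical members and the groups they belong to.
memberships = {
    "ann": {"Hiking", "Running"},
    "bob": {"Hiking", "Running", "Python"},
    "cat": {"Hiking", "Running"},
    "dan": {"Running"},
}
print(also_joined("Hiking", memberships))  # ['Running', 'Python']
```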

Page 18: Machine learning and data at Meetup

Why Recs at Meetup are hard

● Incomplete Data (topics)
● Cold start
● Asking user for data is hard
● Going to meetups is scary
● Sparsity
    ○ Location
    ○ Groups/person
    ○ Membership: 0.001%
    ○ Compare to Netflix: 1%
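The percentages above can be read as the density of the member-by-group matrix: the share of all possible (member, group) pairs that are actual memberships. That reading is an assumption, since the slide gives only the figures. A small sketch with made-up numbers:

```python
def density_percent(n_members, n_groups, n_memberships):
    """Percentage of possible (member, group) pairs that are real memberships."""
    return 100.0 * n_memberships / (n_members * n_groups)


# Hypothetical numbers: 1,000 members, 200 groups, 2 memberships per member.
print(density_percent(1_000, 200, 2_000))  # 1.0
```

The lower this percentage, the fewer overlapping memberships there are to learn from, which is why plain collaborative filtering struggles on very sparse data.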

Page 19: Machine learning and data at Meetup

Supervised Learning/Classification

● “Inferring a function from labeled training data”
● Joined Meetup/Didn’t join Meetup
● “Features”
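To make "labeled training data" concrete, the sketch below turns one (member, group) pair into a feature dictionary plus a joined/didn't-join label. The `Member` and `Group` classes and the exact feature definitions are assumptions for illustration. The feature names echo the Topic Match and State Match slides, whose real definitions are not given in the transcript.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    topics: set = field(default_factory=set)
    state: str = ""


@dataclass
class Group:
    topics: set = field(default_factory=set)
    state: str = ""


def extract_features(member, group):
    """Turn one (member, group) pair into a dict of 0/1 features."""
    share_topic = bool(member.topics & group.topics)
    same_state = bool(member.state) and member.state == group.state
    return {
        "TopicMatch": 1.0 if share_topic else 0.0,
        "StateMatch": 1.0 if same_state else 0.0,
    }


def labeled_example(member, group, joined):
    """Pair the features with the label: 1 if the member joined, else 0."""
    return extract_features(member, group), 1 if joined else 0


member = Member(topics={"hiking"}, state="NY")
group = Group(topics={"hiking", "outdoors"}, state="NY")
print(labeled_example(member, group, joined=True))
```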

Page 20: Machine learning and data at Meetup

Topic Match

Page 21: Machine learning and data at Meetup

State Match

Page 22: Machine learning and data at Meetup

Logistic Regression

● Score
    ○ “Probability”
    ○ Ranking
● Fast + Easy
● Weights!
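As a rough illustration of why logistic regression is "fast + easy" and yields readable weights, here is a tiny pure-Python fit using batch gradient descent on the log loss. The training data, learning rate, and epoch count are invented for the example. The talk does not say which library or optimizer Meetup used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(features, weights, bias):
    """Score in (0, 1): the model's "probability" that the member joins."""
    z = bias + sum(weights[k] * features.get(k, 0.0) for k in weights)
    return sigmoid(z)


def train_logistic(examples, feature_names, epochs=500, lr=0.5):
    """Fit one weight per feature with batch gradient descent on the log loss."""
    weights = {name: 0.0 for name in feature_names}
    bias = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = {name: 0.0 for name in feature_names}
        grad_b = 0.0
        for features, label in examples:
            error = predict(features, weights, bias) - label
            for k in feature_names:
                grad_w[k] += error * features.get(k, 0.0)
            grad_b += error
        for k in feature_names:
            weights[k] -= lr * grad_w[k] / n
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical labeled examples: (features, joined?).
examples = [
    ({"TopicMatch": 1, "StateMatch": 1}, 1),
    ({"TopicMatch": 1, "StateMatch": 0}, 1),
    ({"TopicMatch": 0, "StateMatch": 1}, 0),
    ({"TopicMatch": 0, "StateMatch": 0}, 0),
    ({"TopicMatch": 1, "StateMatch": 1}, 1),
    ({"TopicMatch": 0, "StateMatch": 1}, 1),
]
weights, bias = train_logistic(examples, ["TopicMatch", "StateMatch"])
print(weights, bias)
```

In this toy data every topic match led to a join, so the fitted weight for TopicMatch comes out larger than the one for StateMatch. Being able to inspect the weights this way is the "Weights!" bullet in practice.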

Page 23: Machine learning and data at Meetup

Group recommendation weights

● TopicMatch 1.21
● TopicMatchExtended 0.17
● FacebookFriends 0.15
● SecondDegreeFacebook 0.79
● AgeUnmatch -2.20
● GenderUnmatch -2.6
● StateMatchFeature 0.44
● CityMatch 0.02
● DistanceBucket <2 1.39
● DistanceBucket 2-5 0.83
● DistanceBucket 5-10 0.60
● DistanceBucket >10 n/a
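To show how weights like these become a ranking, the sketch below plugs the listed values into the logistic function and sorts a few candidate groups by score. Several things are assumed rather than taken from the talk: every feature is treated as a 0/1 indicator, the bias is set to 0.0 because none is listed, the ">10" bucket contributes nothing because its weight is "n/a", and the candidate groups are hypothetical.

```python
import math

# Weights as listed on the slide. The ">10" distance bucket has no weight
# there ("n/a"), so it is left out and contributes 0 (an assumption).
WEIGHTS = {
    "TopicMatch": 1.21,
    "TopicMatchExtended": 0.17,
    "FacebookFriends": 0.15,
    "SecondDegreeFacebook": 0.79,
    "AgeUnmatch": -2.20,
    "GenderUnmatch": -2.6,
    "StateMatchFeature": 0.44,
    "CityMatch": 0.02,
    "DistanceBucket <2": 1.39,
    "DistanceBucket 2-5": 0.83,
    "DistanceBucket 5-10": 0.60,
}


def score(features, weights=WEIGHTS, bias=0.0):
    """Logistic score. The bias is assumed to be 0.0; a different constant
    would shift every score but leave the ranking unchanged."""
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates):
    """Sort (group_name, features) pairs by descending score."""
    return sorted(candidates, key=lambda c: score(c[1]), reverse=True)


# Hypothetical candidates, with features assumed to be 0/1 indicators.
candidates = [
    ("near, off-topic", {"DistanceBucket <2": 1}),
    ("near, on-topic, age mismatch",
     {"TopicMatch": 1, "DistanceBucket <2": 1, "AgeUnmatch": 1}),
    ("far, on-topic", {"TopicMatch": 1, "DistanceBucket 5-10": 1}),
]
for name, features in rank(candidates):
    print(f"{score(features):.2f}  {name}")
```

With these inputs the age mismatch (-2.20) more than cancels the topic match (1.21), so the nearby on-topic group drops to last place.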

Page 24: Machine learning and data at Meetup

Making up features

● “Zipscore”
● All topics not created equal
● Facebook likes

Page 25: Machine learning and data at Meetup

Real data is gross

● Preprocessing is critical!
    ○ missing data
    ○ outliers
    ○ log scale
    ○ bucketing
    ○ selection/sampling (not introducing bias)
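A few of these preprocessing steps are easy to sketch as small helpers. Everything here is illustrative: the helper names, the default values, and the clipping limits are assumptions. The distance buckets reuse the labels from the weights slide, which does not state the unit or how exact boundary values are assigned.

```python
import math


def fill_missing(value, default=0.0):
    """Missing data: replace None with a default value."""
    return default if value is None else value


def clip(value, low, high):
    """Outliers: limit a value to the range [low, high]."""
    return max(low, min(high, value))


def log_scale(count):
    """Log scale: compress heavy-tailed counts; log1p maps 0 to 0."""
    return math.log1p(count)


def distance_bucket(distance):
    """Bucketing: map a distance to a label from the weights slide.

    Boundary handling (e.g. exactly 10 falls in ">10") is an assumption.
    """
    if distance < 2:
        return "<2"
    if distance < 5:
        return "2-5"
    if distance < 10:
        return "5-10"
    return ">10"


print(fill_missing(None), clip(250, 0, 100), distance_bucket(7))
```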

Page 26: Machine learning and data at Meetup

Cleaning data

● Schenectady
● Beverly Hills
● Astronaut
● Fake RSVP boosts (+100 guests!)
● Rsvp hogs

Page 27: Machine learning and data at Meetup
Page 28: Machine learning and data at Meetup
Page 29: Machine learning and data at Meetup

TO THE FUTURE!

● Hadoop
● Clicks
● Impressions
● People to people recommendations?
● Recommending people to groups?

Page 30: Machine learning and data at Meetup

Thanks!

Smart people come work with me.
http://www.meetup.com/jobs/

Special thanks:
● Chris Halpert
● Victor J Wang