Internalizing location services with GeoNames

DESCRIPTION

Presented by John Marc Imbrescia, Senior Software Engineer, Etsy.com

Etsy recently chose to bring our location services in house. We used the open source GeoNames data set and built the tools we needed to let members select their location, show translations of place names, and feed data into our search database for local, regional, and country-based searches. This talk will cover the implementation details and the decisions we made along the way: how we mapped places from our old data set to the GeoNames data; the internal tools we built, including a Solr core for place name autosuggest; the modifications to our Listings Search and Shop Search cores; and the different ways we use location-based search around the site, both distance based and region based, using GeoNames hierarchy data. There will also be a discussion about choosing to release some of the tools we built for this project as open source, and about the decisions behind the non-search (display, etc.) elements of the project, the tools we chose for them, and why.

Internalizing location services with GeoNames

John Marc Imbrescia
Senior Software Engineer - Etsy

Wednesday, May 1, 13

Internalizing location services with GeoNames

May 2 2013

What is Etsy.com?
The world’s online handmade marketplace.

What is Etsy.com?

• 20 million unique items
• 18 million daily item searches
• 800,000 sellers
• 28 million unique views per month
• Developer blog: codeascraft.etsy.com
• 450 worldwide employees

Our Problem
Location names were only in English

• Search based on English names
• Display and search needed to be i18n friendly
• API limits and speed concerns meant we needed a new solution

What do we use Location for?
More than just search

• Display
• Local Search

• No Mapping
• No Bounding boxes

What do we use Location for?
Item Search

What do we use Location for?
Location Display

How did this use to work?

• Yahoo API
• Every lookup was an API call
• Stored user input and API response
• Searched based on text match of API response
• Not radius using lat/lon
• No way to internationalize

What Services did Etsy need to Internalize location services?

• Lookup - Autosuggest
• Update - Scripts to refresh data
• Display - Built into the PHP stack
• Search - Existing, modified for new pattern

What we have now

• GeoNames as a data source
• Feeds “geonamessuggest” Solr core
• SQLite database for place name lookup
• GeoName IDs used for local search
• Leverages GeoName hierarchy data
• Built-in internationalization
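
As a rough illustration of the SQLite place name lookup, here is a minimal sketch of a localized lookup that falls back to the default name when no translation exists. The table names, column names, and the `display_name` helper are assumptions made for this example and are not taken from the talk; the IDs are only sample values.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE place (
            geoname_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        CREATE TABLE alternate_name (
            geoname_id INTEGER NOT NULL,
            lang       TEXT NOT NULL,
            name       TEXT NOT NULL,
            PRIMARY KEY (geoname_id, lang)
        );
        """
    )
    # Sample rows for illustration only.
    conn.executemany(
        "INSERT INTO place VALUES (?, ?)",
        [(2950159, "Berlin"), (3169070, "Rome")],
    )
    conn.executemany(
        "INSERT INTO alternate_name VALUES (?, ?, ?)",
        [(3169070, "it", "Roma"), (3169070, "de", "Rom")],
    )
    conn.commit()
    return conn


def display_name(conn, geoname_id, lang):
    """Return the place name in `lang`, or the default name if untranslated."""
    row = conn.execute(
        "SELECT name FROM alternate_name WHERE geoname_id = ? AND lang = ?",
        (geoname_id, lang),
    ).fetchone()
    if row is not None:
        return row[0]
    row = conn.execute(
        "SELECT name FROM place WHERE geoname_id = ?", (geoname_id,)
    ).fetchone()
    return row[0] if row is not None else None
```

Because the database is a single local file, a lookup like this needs no network call, which fits the bullet about replacing per-lookup API calls.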

How did we get here?

• Mapped old locations to GeoNames
• Added GeoName ID hierarchy to listing search
• Pushed out SQLite database to webservers
• Slowly transitioned lookup and search services
• Did side-by-side testing to look for anomalies

What are the data types?
GeoNames

Schemas
GeoNames

• 775k entries
• 1.4m alternate spellings
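
The schema shown on the slide is not in this transcript. For orientation, here is a hedged sketch that parses one row of the public GeoNames tab-delimited dump, which has 19 columns per row. The `Place` type and the choice of which columns to keep are assumptions for this example, and the sample row in the usage note is illustrative.

```python
from typing import NamedTuple


class Place(NamedTuple):
    """Subset of the GeoNames dump columns kept for this example."""
    geoname_id: int
    name: str
    latitude: float
    longitude: float
    feature_class: str
    feature_code: str
    country_code: str
    admin1_code: str
    population: int


def parse_geoname_line(line: str) -> Place:
    """Parse one tab-delimited row of the main GeoNames dump."""
    # Strip only the newline: trailing columns may legitimately be empty.
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 19:
        raise ValueError(f"expected 19 columns, got {len(cols)}")
    return Place(
        geoname_id=int(cols[0]),
        name=cols[1],
        latitude=float(cols[4]),
        longitude=float(cols[5]),
        feature_class=cols[6],
        feature_code=cols[7],
        country_code=cols[8],
        admin1_code=cols[10],
        population=int(cols[14] or 0),  # population may be blank
    )
```

Usage, with an illustrative row:

```python
sample = "\t".join([
    "5128581", "New York City", "New York City", "NYC,Nueva York",
    "40.71427", "-74.00597", "P", "PPL", "US", "", "NY", "", "", "",
    "8175133", "10", "57", "America/New_York", "2013-01-01",
])
place = parse_geoname_line(sample)
```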

Geonamessuggest
Our autosuggest for place names

• Localized
• GeoIP
• Population

Geonamessuggest
Schema

Distance and population come first
Sort function
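
The actual sort function from the slide is not in this transcript. The sketch below shows one way suggestions could be ranked so that nearby and populous places come first. The scoring formula, the `rank_suggestions` helper, and the candidate fields are assumptions for illustration, not the function used in the geonamessuggest core. The user's coordinates stand in for a GeoIP estimate.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def score(candidate, user_lat, user_lon):
    """Hypothetical score: reward population, penalize distance (log scale)."""
    distance = haversine_km(user_lat, user_lon,
                            candidate["lat"], candidate["lon"])
    return math.log10(candidate["population"] + 1) - math.log10(1 + distance)


def rank_suggestions(candidates, user_lat, user_lon):
    """Return candidates sorted best-first for a user at the given location."""
    return sorted(candidates,
                  key=lambda c: score(c, user_lat, user_lon),
                  reverse=True)
```

With a formula like this, a user located in Maine who types "Portland" would see the smaller but much closer Portland first, while a user in Seattle would see the larger Oregon city first.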

GeonameId Hierarchy
Local listing search

• Each listing gets a hierarchy of geonameids
• Local search is a filter on this ID
• Fast & reliable
• Enables powerful functionality
• Kept old data fields

GeonameId Collection
Local listing search

• Each listing gets a hierarchy of geonameids
• Local search is a filter query on this ID
• Fast & reliable
• Enables powerful functionality
• Kept old data fields
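
The idea in the bullets above can be sketched in a few lines: each listing stores the full chain of IDs from its place up to the top of the hierarchy, so a local, regional, or country search becomes a simple membership filter on one ID. The parent map, the small made-up IDs, and the `geoname_hierarchy` field name are hypothetical; in Solr this would roughly correspond to a filter query on a multi-valued ID field.

```python
# Hypothetical child -> parent map with made-up ids:
# 30 = a city, 20 = its state or region, 10 = its country, 40 = another city.
PARENT = {30: 20, 20: 10, 40: 20}


def hierarchy(geoname_id, parent):
    """Return the chain of ids from the place up to the root, place first."""
    chain, seen = [], set()
    current = geoname_id
    while current is not None and current not in seen:  # guard against cycles
        chain.append(current)
        seen.add(current)
        current = parent.get(current)
    return chain


def index_listing(listing, parent):
    """Return a copy of the listing with its id hierarchy attached."""
    indexed = dict(listing)
    indexed["geoname_hierarchy"] = hierarchy(listing["geoname_id"], parent)
    return indexed


def local_search(listings, geoname_id):
    """Keep listings whose hierarchy contains the requested id."""
    return [l for l in listings if geoname_id in l["geoname_hierarchy"]]
```

Filtering on the region ID matches listings in every city under that region, and filtering on a city ID matches only that city, with no distance math at query time.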

CONFERENCE PARTY
The Tipsy Crow: 770 5th Ave
Starts after Stump The Chump
Your conference badge gets you in the door

TOMORROW
Breakfast starts at 7:30
Keynotes start at 8:30

CONTACT
John Marc Imbrescia
@thejohnmarc
johnmarc@etsy.com
