
Aardvark Anatomy of a Large-Scale Social Search Engine


Page 1: Aardvark Anatomy of a Large-Scale Social Search Engine

Aardvark

Anatomy of a Large-Scale Social Search Engine

Page 2: Aardvark Anatomy of a Large-Scale Social Search Engine

The Library vs. The Village

The Library
- Keywords are used to search
- Knowledge base comes from a small number of publishers
- Content is created before the question is asked
- Trust is based on authority
- Traditional search engines (e.g. Google)

The Village
- Questions are phrased in natural language
- Answers are generated in real time
- Anyone in the community might answer
- Trust is based on intimacy


Page 3: Aardvark Anatomy of a Large-Scale Social Search Engine

Purpose

- Harness the power of the village to answer questions not easily answered with traditional search engines.
  - Works best for subjective questions
  - “What is a good Italian restaurant in the North End of Boston with live entertainment on Fridays?”
- Utilize the power of mobile devices to provide quick answers from knowledgeable users.


Page 4: Aardvark Anatomy of a Large-Scale Social Search Engine

Aardvark Components

Crawler / Indexer
- Finds and labels resources
- Users, not documents

Query Analyzer
- Classifies queries
  - Filters out non-questions, trivial questions, and inappropriate questions
  - Determines if the query is specific to a location
- Determines the topic
  - Uses natural language processing to find salient phrases and determine what is semantically significant
  - Uses a taxonomy of popular topics

Ranking Function
- Ranks resources (users) to determine which provide the best information on a topic

UI
- Web, IM, mobile devices

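The Query Analyzer steps listed above can be sketched in a few lines. This is only an illustration: the keyword sets, the `TAXONOMY`, `LOCATION_HINTS`, and `BLOCKLIST` tables, and the `Analysis` type are all hypothetical stand-ins, and simple keyword matching is used here in place of the natural language processing the slide mentions.

```python
import re
from dataclasses import dataclass, field

# All tables below are hypothetical placeholders, not Aardvark's real data.
QUESTION_WORDS = {"what", "where", "who", "when", "why", "how", "which"}
TAXONOMY = {
    "restaurants": {"restaurant", "dinner", "food", "italian"},
    "travel": {"hotel", "flight", "trip"},
}
LOCATION_HINTS = {"boston", "chicago", "seattle"}
BLOCKLIST = {"spamword"}


@dataclass
class Analysis:
    accepted: bool
    reason: str = ""
    is_location_specific: bool = False
    topics: list = field(default_factory=list)


def analyze(query: str) -> Analysis:
    """Classify a query, then tag its location flag and topics."""
    words = set(re.findall(r"[a-z']+", query.lower()))

    # Filter out non-questions, trivial questions, inappropriate questions.
    if not query.strip().endswith("?") and not (words & QUESTION_WORDS):
        return Analysis(False, "non-question")
    if len(words) < 3:
        return Analysis(False, "trivial")
    if words & BLOCKLIST:
        return Analysis(False, "inappropriate")

    # Determine location specificity and topics from the taxonomy.
    topics = sorted(t for t, keywords in TAXONOMY.items() if words & keywords)
    return Analysis(
        True,
        is_location_specific=bool(words & LOCATION_HINTS),
        topics=topics,
    )


result = analyze(
    "What is a good Italian restaurant in the north end of Boston "
    "with live entertainment on Fridays?"
)
print(result.accepted, result.is_location_specific, result.topics)
```

A real analyzer would replace the keyword tables with trained classifiers and a much larger taxonomy; the point here is only the order of the steps: filter first, then tag location and topic.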

Page 5: Aardvark Anatomy of a Large-Scale Social Search Engine

Aardvark Architecture

Page 6: Aardvark Anatomy of a Large-Scale Social Search Engine

User Experience

- Answerers are approached directly, via IM or mobile application, to answer a specific question they should know something about.
- Answerers are found within the user's social network.
- Interaction is real-time.
- One-on-one conversation.


Page 7: Aardvark Anatomy of a Large-Scale Social Search Engine

Ranking

- Similar in concept to traditional ranking: a statistical probability that the user can answer a question on a topic is computed.
- Also takes into account the “connectedness” of the users.
- Asks for feedback on the quality of the answer after the question is answered.

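The two signals named on this slide, topical ability and connectedness, can be combined in a small sketch. The multiplicative form below, the `Candidate` fields, and every number are assumptions made for illustration; the slide does not state the actual scoring function.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    # Assumed: estimated probability that this user can answer on each topic.
    topic_expertise: dict
    # Assumed: connectedness between this user and the asker, in [0, 1].
    connectedness: float


def score(candidate: Candidate, query_topics: dict) -> float:
    """Connectedness times the topic-weighted chance of a good answer.

    query_topics maps topic -> weight of that topic in the question.
    """
    topic_match = sum(
        weight * candidate.topic_expertise.get(topic, 0.0)
        for topic, weight in query_topics.items()
    )
    return candidate.connectedness * topic_match


def rank(candidates: list, query_topics: dict) -> list:
    """Return candidates ordered from best to worst score."""
    return sorted(candidates, key=lambda c: score(c, query_topics), reverse=True)


# Hypothetical users and topic weights.
alice = Candidate("alice", {"restaurants": 0.8, "boston": 0.5}, connectedness=0.9)
bob = Candidate("bob", {"restaurants": 0.9}, connectedness=0.2)
query_topics = {"restaurants": 0.6, "boston": 0.4}

for c in rank([alice, bob], query_topics):
    print(c.name, round(score(c, query_topics), 3))
```

In this toy data, bob knows slightly more about restaurants, but alice is far better connected to the asker and also knows Boston, so she ranks first. Feedback collected after each answer could then be used to adjust the expertise estimates.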

Page 8: Aardvark Anatomy of a Large-Scale Social Search Engine

Does it work?

- 87.7% of questions received an answer
- 57.2% were answered in less than 10 minutes
- 70.4% of answers were rated “good”
- In my experience, it works great for:
  - Generating ideas
  - Getting pointed in the right direction
  - Getting opinions on subjective topics


Page 9: Aardvark Anatomy of a Large-Scale Social Search Engine

References

- “Anatomy of a Large-Scale Social Search Engine” by Damon Horowitz and Sepandar D. Kamvar
  http://vark.com/aardvarkFinalWWW2010.pdf
- Try it yourself: www.vark.com