Personalized Federated Search at LinkedIn · Personalized Federated Search at LinkedIn Dhruv Arya [email protected] Viet Ha-Thuc [email protected] Shakti Sinha [email protected]

Personalized Federated Search at LinkedIn

Dhruv [email protected]

Viet [email protected]

Shakti [email protected]

LinkedIn2029 Stierlin Ct

Mountain View, CA, USA

ABSTRACTLinkedIn has grown to become a platform hosting diversesources of information ranging from member profiles, jobs,professional groups, slideshows etc. Given the existence ofmultiple sources, when a member issues a query like “soft-ware engineer”, the member could look for software engi-neer profiles, jobs or professional groups. To tackle thisproblem, we exploit a data-driven approach that extractssearcher intents from their profile data and recent activitiesat a large scale. The intents such as job seeking, hiring,content consuming are used to construct features to person-alize federated search experience. We tested the approachon the LinkedIn homepage and A/B tests show significantimprovements in member engagement. As of writing this pa-per, the approach powers all of federated search on LinkedInhomepage.

Categories and Subject DescriptorsH.4 [Information Systems Applications]: Miscellaneous

General TermsTheory

KeywordsFederated search, social network, ranking, machine learning,information retrieval

1. INTRODUCTIONLinkedIn has grown over the years from a professional net-

working site to becoming a platform containing multiple pro-fessional information sources such as member profiles, jobs,professional groups, member posts and slideshows. At thesame time, the member base has also increased quickly andcurrently has more than 350 million members. The mem-bers visit the site for various reasons ranging from search-ing and recruiting candidates to looking for jobs or finding

Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for components of this work owned by others than theauthor(s) must be honored. Abstracting with credit is permitted. To copy otherwise, orrepublish, to post on servers or to redistribute to lists, requires prior specific permissionand/or a fee. Request permissions from [email protected]’15, October 19–23, 2015, Melbourne, Australia.Copyright is held by the owner/author(s). Publication rights licensed to ACM.ACM 978-1-4503-3794-6/15/10 ...$15.00.DOI: http://dx.doi.org/10.1145/2806416.2806615 .

Figure 1: LinkedIn Federated Search Result Pagefor Query “Software Engineer”

professional content etc. As the number of information ver-ticals and member base increase, the problem of serving theright information to fulfill each individual member’s infor-mation need becomes more and more critical. This problemcontains two subtasks including selecting the right verticalsand aggregating the vertical results into a single ranking(See Figure 1) and typically is referred as federated search.

The area of federated search originated from meta searchand has been actively researched in the field of informa-tion retrieval [5] and particularly in the context of Websearch. Diaz [2] addresses the problem of vertical selection,i.e., whether or not to show a specific item above the Webresults given a query. Ponnuswami et. al. [4] and Arguelloet. al. [1] propose machine learning approaches for aggre-gating vertical results into single search result pages. Theydemonstrate effectiveness of the approaches on Bing search.More recently, Lefortier et. al. [3] present a way to blendindividual vertical results and individual Web results with acase study on Yandex video search.

However, the problem of federated search on LinkedInpresents unique challenges. First, the level of personal-ization in a platform like LinkedIn is much deeper thangeneral Web search. For instance, if a member enters a

arX

iv:1

602.

0492

4v1

[cs

.IR

] 1

6 Fe

b 20

16

http://dx.doi.org/10.1145/2806416.2806615

query like “software engineer”, depending on if he or she isa recruiter, job seeker or professional content consumer, themember could expect to see software engineers’ profiles, jobsor slideshows on the topic. Second, individual results andblocks of results from different verticals are often associatedwith different features. Moreover, even if a feature is com-mon, it is not equally important to them. This challenge issimilar to federated Web search. Nonetheless, unlike Websearch which typically blends either blocks of results [1][4] orindividual results [3] from different verticals, our system ag-gregates both individual results (e.g. jobs as shown in Figure1) and blocks of vertical results (e.g. people and professionalgroup verticals). Thus, the system has to normalize featuresand eventually make relevance scores comparable across re-sult verticals and result types (individual vs. block). Third,in Web search, Web results are typically the primary verti-cal thus they can be used as a “pivot” to normalize scores ofthe other verticals [4]. In our problem, the primary verticalsvary depending on queries and other search context whichcould includes searcher’s data, past activities, location etc.

To overcome these challenges, we propose a data-drivenapproach to personalize federated search. Specifically, wemine members’ data and their recent activities at a largescale to understand whether or not they currently have in-tent of hiring, job seeking or content consuming etc. Thisinsight coupled with other signals are used to select verticalsand aggregate vertical results into a single search result pagepersonally relevant to each of our members. To make thesesignals comparable across different result categories, includ-ing verticals and result types (block vs. individual), we con-struct composite features combining the signals for each ofthe categories. Then, we let learning algorithms estimatedifferent weights for all of the combinations (i.e., normalizethe signals across the result categories) from training data.

A/B tests done on the LinkedIn homepage shows improve-ments in member engagement and downstream traffic to theverticals. At the time of this writing, the work currentlypowers all of the federated search on the LinkedIn home-page. We organize the rest of the paper as follows. Section2 details how we formulate federated search problem andour proposed approach. Section 3 describes searcher intentfeatures and other signals used in the system. We discuss ex-perimental results in Section 4. Finally, concluding remarkscan be found in Section 5.

2. OVERALL APPROACH

2.1 Problem Statement and Overall FrameworkGiven a pair of (query, searcher), our task is to select

from a list of all possible verticals including people (mem-bers), jobs, companies, professional groups, member posts,slideshows etc. a primary vertical and a set of secondaryverticals, then rank the primary individual results and thesecondary vertical blocks in a single ranked list as shown inFigure 1, without changing the order in the primary verti-cal. The reason for this constraint is two fold. First, webelieve the base ranker of each vertical is the best one torank results within its domain. Second, this keeps memberexperience consistent between federated search and verticalsearch.

The overall framework is described in Figure 2. When amember issues a query q, the query is passed to verticalsand triggers the corresponding vertical search engines to get

Figure 2: Federated Search Overall Framework

the top K results for each. In preliminary vertical selectionphase, the federated scorer extracts features (which are de-scribed later in Section 3) and computes a relevance scorefor each of the verticals. The top vertical is selected as theprimary one and the rest are selected as candidates for sec-ondary verticals. Then, in aggregation phase, these candi-dates compete with individual results in the primary verticalto form the final ranking. Note that these candidates are notguaranteed to show up in the ranking. Instead, dependingon queries, searchers and vertical results, all, some or noneof these candidates could be selected. The aggregation al-gorithm and the process of training the federated scorer aredescribed in the next subsections.

2.2 Aggregation AlgorithmIn this section we discuss how our aggregation algorithm

works. Input to the algorithm includes results from the pri-mary vertical P along with all the candidate secondary ver-tical clusters C and a federated scorer fs. We go througheach of primary vertical result Pi computing the relevancescore for this result using the federated scorer (the secondloop in Algorithm 1). We compare this score with the rel-evance scores of all candidate secondary vertical clusters.If the former is higher, we pick the primary vertical resultfor ith position. If on the other hand there exists a candi-date secondary vertical cluster that has a higher relevancescore than the primary result, we add the secondary verticalcluster to the aggregated rank list and move on to the nextprimary result. We repeat this process till we position allthe primary results. Any secondary vertical results left aredropped.

2.3 Federated Scorer TrainingAs presented above, the purpose of federated scorer is to

provide a universal relevance score for each vertical blockas well as each vertical individual result. The scores haveto be comparable across verticals (for preliminary vertical

Algorithm 1: Federated Search Aggregation Algorithm

Input : Individual results from primary vertical PSecondary vertical clusters CFederated scorer fs

Output: Aggregated result ranked listsortedSecondaryV erticals→ []; rankList→ [];for i=1 to len(C) do

// data structure sorted based on fs(Ci)

sortedSecondaryVerticals.add(Ci)endj ← 1;for i=1 to len(P) do

if fs(Pi) > fs(sortedSecondaryVerticals.get(j))then

rankList[i] = Pi

i← i + 1else

rankList[i] = sortedSecondaryVerticals.get(j)sortedSecondaryVerticals.remove(j)j ← j + 1

end

endreturn rankList

selection) and between vertical blocks and vertical individualresults (for result aggregation). In this work, we train afederated scorer predicting probability that a vertical blockor vertical individual result gets clicked if it appears on asearch result page (SERP) shown to the member.

Traditionally, ground truth data is labeled by human judges.However, this approach is expensive and not scalable. More-over, it is very hard for the judges to judge the relevance onbehalf of some other member, making it challenging to applythe approach for personalized settings. Thus, in this workthe training data is collected from click logs via a random-ization experiment exposed to a small random fraction ofLinkedIn search traffic. In this experiment, given a querywe apply some business rules to come up with a few eligi-ble verticals. We randomly pick one as a primary verticaland leave the others as secondary ones. Then, we randomlyinsert the secondary verticals (as blocks of results) into theprimary ranking without re-ordering the primary individualresults. Clicked results (either primary individual results orsecondary vertical blocks) are labeled positive and skippedresults (unclicked ones ranked above the last clicked result ina ranking) are labeled as negative. The results ranked belowthe last clicked one are dropped since it is unknown if themember ignored these results or simply did not see them.The benefit of randomization is that it avoids the bias to-wards the original vertical selection and ranking. Given thetraining data, we apply logistic regression to train a feder-ated scorer. The features used to train the scorer are de-scribed in the next section.

3. FEATURES

3.1 Searcher IntentA query can be ambiguous in the light of all information

that exists in multiple verticals. For example, if a memberenters a query like “machine learning”, he or she could beinterested in recruiting machine learning people, looking for

jobs related to machine learning, finding professional groupson the topic to join, discovering content on the topic or someof the use cases. To tackle this issue, we take a data-drivenapproach to personalize search results. For instance, if weknow that the member is currently looking for a job, he orshe is likely to be more interested in job results than theother verticals. Similarly, if the member is hiring machinelearning scientists, people results should be more importantto him or her.

Intents of searchers such as job seeking, hiring, contentconsuming etc. are inferred from their profiles and pastbehaviors. At a high level, if a user’s title is recruiter, he orshe is likely to have hiring intent. Similarly, if a user is afinal year student, he or she could have job seeking intent.Another source of data used to infer user intents is theirrecent activities. For example, if users recently searched orapplied for jobs, they tend to have job seeking intent. Wetrain a machine-learned model combining all of the signals topredict intents for all of the member base. The model is runon a daily basic to update members’ intents dynamically. Itis worth noting that a member could have multiple intentsat the same time.

A remaining challenge is that some evidence such as know-ing a searcher has job seeking intent might be associatedone or a few verticals (e.g. job vertical), but not all ofthem. Some evidence might be related to multiple verti-cals but with different levels of importance. To overcomethis issue, we construct composite features, capturing bothsearcher intents and result categories including both verti-cals and result types (individual or block). For instance, thefeature below only fires if the searcher has job seeking intentand the result is a block of jobs. Otherwise, it has value ofzero. We create all of the combinations of intents and resultcategories and learn different weights for them. In essence,we let the learning algorithm associate each of the evidencewith the categories and normalize them across the categoriesfrom training data.

f =

1 if searcher has job seeking intentand the result is job vertical block

0 otherwise

3.2 Keyword IntentThe feature category aims to capture the intents (result

categories) of queries. Specifically, we mine historical clicklogs to estimate p(result category| query), for instance theprobability that members click on professional group verti-cal block for the query “leadership”. These probabilities arecomputed offline for every head query and use this insightto construct the features online. One issue of this approachis that the probabilities are biased towards the previous ver-tical selection and ranking algorithms. Resolving this is afuture direction of this work.

3.3 Base Ranking FeaturesWe also exploit information provided by vertical first pass

rankers (base rankers) to construct features. One exam-ple could be relevance scores from the first pass rankers.These features also have an effect of minimizing the inconsis-tency between the federated scorer and the first pass rankers(recall that the order in the primary vertical is kept un-changed). One issue with this kind of information is thatthe relevance scores might not be calibrated well across ver-

ticals. To resolve this issue, we again construct compositefeatures like relevance score if result is an individual job or aslideshow vertical block. For the later, we compute relevancescore for each block by averaging scores of the top results inthe block. Then, we let the learning algorithm normalize thescores across result categories by learning different weightsfor the features from the training data.

4. EXPERIMENTAL RESULTSBaseline is a legacy federated search algorithm at LinkedIn.

It uses a set of business rules based on past member in-teraction with verticals associated with keywords and rele-vance scores returned by vertical ranking functions. It canbe viewed as a function on the feature sets described in Sec-tions 3.2 and 3.3. Although the function is manually defined,it has been running on live traffic for a relatively long timeand have been iteratively refined. The key difference be-tween the baseline and the new approach is that the later ishighly personalized by using searcher intent as a key signal.

We conducted A/B test for sufficient duration of time (6weeks) to account for any novelty effect. A query is firsttagged for existence of entities like names, job titles, skillsetc. As we are interested in exploratory search, we onlyexperimented with non-name queries. Based on this condi-tion, a random portion of LinkedIn federated search trafficfrom our world-wide member base is used for A/B testing.The federated search combines results from all of seven ver-ticals available on LinkedIn including people, job, company,university, group, slideshow and members’ posts. The traf-fic is then randomly split into control and treatment buck-ets. Each bucket ends up with several hundreds of thousandsearches. Searches in the control bucket are processed bythe baseline and the treatment bucket uses the proposedapproach. We look at three metrics including primary verti-cal click-throught-rate (CTR), secondary vertical CTR andsecondary vertical switches. The difference between the sec-ond and the third metrics is that the former is defined onclicks on individual results within secondary clusters whilethe later is based on clicks on cluster headers that take usersto vertical search pages.

Table 1 shows that the proposed approach is better thanthe baseline on all of the metrics (all of the improvementsare statistically significant). Specifically, the proposed ap-proach is 0.62% better than the baseline on primary verticalCTR. In terms of secondary vertical engagement, the pro-posed approach largely improves over the baseline: 4.31%and 10.66% improvements on secondary vertical CTR andsecondary vertical switches. It is somewhat surprising thatthe improvement on primary vertical CTR is much lowerthan on secondary vertical CTR. It is probably becausethe proposed approach shows more relevant secondary ver-ticals, members are more likely to switch to secondary ver-tical search result pages (10.66% higher). Thus, they haveless chance to engage in the primary results. A deep diveinto log data also reveals that the baseline tends to over-emphasize primary results and on average ranks secondaryvertical clusters lower in result pages. Fully understandingand modeling the trade-off between member engagement onprimary and secondary results is another future direction ofthis work.

Metric Improvements

Primary Vertical CTR +0.62%Secondary Vertical CTR +4.31%Secondary Vertical Switches +10.66%

Table 1: Metrics improvements of treatment overbaseline. Due to business sensitivity, we only showrelative improvements instead of absolute metricvalues.

5. CONCLUSIONSIn this paper, we present the problem of personalized fed-

erated search at LinkedIn and propose a data-driven ap-proach for this problem. Our approach takes into accountmembers’ data and activities to infer their intents such ashiring and job seeking. This insight combined with other sig-nals are used to select primary and candidate secondary ver-ticals and then to aggregate primary individual results andthe secondary clusters into a personalized ranking. Thoughpresented in LinkedIn federated search context, the approachis applicable other domains where vertical selection and ag-gregation are highly personalized. Our A/B tests show thatthe approach could significantly improve user engagement.The approach is currently serving all of federated search onLinkedIn homepage.

One future direction of this work is to remove the bias to-wards the previous vertical selection and ranking algorithmsin the current keyword intent features. Another directionis to understand and model the trade-off between memberengagement on primary verticals and engagement on sec-ondary verticals. Given the insight, we will determine thebest balance in terms of member experience.

6. REFERENCES[1] J. Arguello, F. Diaz, and J. Callan. Learning to

aggregate vertical results into web search results. InProceedings of the 20th Conference on Information andKnowledge Management, pages 201–210, 2011.

[2] F. Diaz. Integration of news content into web results. InProceedings of the Second International Conference onWeb Search and Web Data Mining, pages 182–191,2009.

[3] D. Lefortier, P. Serdyukov, F. Romanenko, andM. de Rijke. Blending vertical and web results - A casestudy using video intent. In Proceedings of the 36thEuropean Conference on IR Research, pages 184–196,2014.

[4] A. K. Ponnuswami, K. Pattabiraman, D. Brand, andT. Kanungo. Model characterization curves forfederated search using click-logs: predicting userengagement metrics for the span of feasible operatingpoints. In Proceedings of the 20th InternationalConference on World Wide Web, pages 67–76, 2011.

[5] M. Shokouhi and L. Si. Federated search. Foundationsand Trends in Information Retrieval, 5(1):1–102, 2011.

Documents

Personalized Federated Search at LinkedIn · Personalized Federated Search at LinkedIn Dhruv Arya [email protected] Viet Ha-Thuc [email protected] Shakti Sinha [email protected]