Finding Co-solvers on Twitter, with a Little Help from Linked Data

Milan Stankovic, Hypios, Université Paris-Sorbonne, France*
Matthew Rowe, KMi, Open University, UK
Philippe Laublet, Université Paris-Sorbonne, France


Page 1: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Finding Co-solvers on Twitter, with a Little Help from Linked Data

Milan Stankovic, Hypios, Université Paris-Sorbonne, France*
Matthew Rowe, KMi, Open University, UK
Philippe Laublet, Université Paris-Sorbonne, France

Page 2: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Outline

• Context
• Problem
• Our Approach
• Evaluation
• Example of use
• Conclusion and questions

Page 3: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Context: Innovation on the Web

Innovation Seekers

Solvers from industry, research, etc.

Academia

Page 4: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Problem: Find Collaborators

Innovation Seeker

Problem

???

solver

Page 5: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Problem: Find Collaborators

Innovation Seeker

Problem

???

solver

• How to find collaborators who complement the solver’s competence with regard to the problem?

• How to find collaborators who are compatible with the solver in terms of teamwork?

Page 6: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Problem: Find Collaborators

Problem

solver

Complementary Competence

Interest Similarity

Social Similarity

Inspired by social studies on team composition and on the factors that influence good teamwork

Page 7: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach

profiling >> profile extension >> calculation of similarities >> ranking

Implementation and tests performed using data from Twitter

Page 8: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profiling

Page 9: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profiling

solver

candidate collaborators

problem

conceptual

social

Page 10: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profiling

• Conceptual Profiles
  – users: Zemanta is used to extract DBpedia concepts from the textual elements that the user created on Twitter (tweets, bio, etc.). Profiles contain concepts and the frequency of their occurrence.
  – problem: the text of the innovation problem is processed with Zemanta to extract concepts.
• Social Profiles
  – contain all the contacts of a given user on Twitter
• Both types of profiles are in vector form.
• Simple on purpose: the aim is to capture most topics, not to specialize in the topics of highest expertise.
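
A minimal sketch of the two profile types as sparse vectors. It assumes the concept extraction step (Zemanta in the slides) has already run. The concept identifiers, the function names, and the choice of binary weights for social profiles are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

def conceptual_profile(concepts):
    """Map each extracted concept to its frequency of occurrence."""
    return dict(Counter(concepts))

def social_profile(contacts):
    """Represent a user's contacts as a sparse vector.

    ASSUMPTION: every contact gets weight 1; the slides only say that
    social profiles contain all contacts and are in vector form.
    """
    return {contact: 1 for contact in set(contacts)}

# Hypothetical extractor output for one user's tweets and bio.
extracted = [
    "dbpedia:Semantic_Web",
    "dbpedia:Linked_data",
    "dbpedia:Semantic_Web",
]
profile = conceptual_profile(extracted)
contacts = social_profile(["alice", "bob", "alice"])
```

Keeping both profiles as dictionaries means the same vector similarity functions can later be applied to either type.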

Page 11: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profile Extension

Page 12: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profile Extension

• Why extend profiles:
  – imperfection of the source data (tweets)
  – incompleteness of coverage (due to differences in vocabulary, some concepts may go unnoticed)
  – to perform broader/lateral matching

Page 13: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profile Extension

• How
  – hyProximity (HPSR): a graph measure using Linked Data (tested on DBpedia)
  – DMSR: a distributional measure inspired by Normalized Google Distance
  – PRF: Pseudo Relevance Feedback

Page 14: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profile Extension

• HPSR (hyProximity)

$$\mathrm{HPSR}(c_1, c_2) = \sum_{K_i \in K(c_1, c_2)} ic(K_i) + \sum_{p \in P} link(p, c_1, c_2) \cdot pond(p, c_1)$$

[Figure: example graph with skos:broader and dct:subject links]
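
The structure of the HPSR formula can be sketched as two sums. The slide does not define its terms, so the sketch takes them as caller-supplied functions. It assumes that K(c1, c2) is a set of categories shared by the two concepts (suggested by the skos:broader and dct:subject labels), that ic is an information-content score, that link is 1 when property p connects the two concepts, and that pond is a weight for p. All toy values below are hypothetical.

```python
def hpsr(c1, c2, common_categories, ic, properties, link, pond):
    """Sum a category-based term and a property-based term."""
    category_term = sum(ic(k) for k in common_categories(c1, c2))
    property_term = sum(link(p, c1, c2) * pond(p, c1) for p in properties)
    return category_term + property_term

# Hypothetical toy graph: concepts A and B share one category and one link.
SHARED = {frozenset(("A", "B")): ["Cat1"]}
IC = {"Cat1": 2.0}
LINKS = {("p1", "A", "B")}

def toy_common_categories(c1, c2):
    return SHARED.get(frozenset((c1, c2)), [])

def toy_ic(category):
    return IC[category]

def toy_link(p, c1, c2):
    return 1 if (p, c1, c2) in LINKS else 0

def toy_pond(p, c):
    return 0.5  # ASSUMPTION: constant weight, for illustration only

score_ab = hpsr("A", "B", toy_common_categories, toy_ic, ["p1"], toy_link, toy_pond)
score_ac = hpsr("A", "C", toy_common_categories, toy_ic, ["p1"], toy_link, toy_pond)
```

Passing the components as functions keeps the sketch independent of any particular Linked Data source.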

Page 15: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profile Extension

• DMSR – Distributional Measure of Semantic Relatedness

c1 c2 c16 c18 c32

c1 c2 c15 c43 c56

c1 c3 c4 c10 c13

c1 and c2 are more related than c1 and c3
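
The intuition behind the three rows can be checked with a raw co-occurrence count. This is a simplified sketch: it assumes each row is one profile, and it does not reproduce the actual DMSR normalization, which the slides do not spell out. The function names are illustrative.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(profiles):
    """Count in how many profiles each unordered concept pair co-occurs."""
    counts = Counter()
    for profile in profiles:
        for a, b in combinations(sorted(set(profile)), 2):
            counts[(a, b)] += 1
    return counts

def cooccurrence(counts, a, b):
    """Look up a pair regardless of argument order."""
    return counts[tuple(sorted((a, b)))]

profiles = [
    ["c1", "c2", "c16", "c18", "c32"],
    ["c1", "c2", "c15", "c43", "c56"],
    ["c1", "c3", "c4", "c10", "c13"],
]
counts = cooccurrence_counts(profiles)
```

Here c1 and c2 appear together in two profiles, while c1 and c3 appear together in only one, which matches the statement on the slide.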

Page 16: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Profile Extension

• PRF: Pseudo Relevance Feedback
  – Distributional measure based on the profiles appearing in the n best-ranked solutions.
  – The same measure of co-occurrence as DMSR, applied to the set of the first 10 suggestions
  – This method can be applied with any ranking technique
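
A rough sketch of the pseudo relevance feedback idea: take the profiles of the n best-ranked suggestions and add to the problem profile the concepts that co-occur most often with the problem's own concepts. The scoring rule, the parameter k, and the function name are assumptions made for illustration; the slides only state that a DMSR-style co-occurrence measure is applied to the first 10 suggestions.

```python
from collections import Counter

def prf_expand(problem_concepts, ranked_profiles, n=10, k=2):
    """Expand a problem profile from the top-n ranked candidate profiles.

    ASSUMPTION: a candidate concept scores one point for every problem
    concept it shares a profile with; the k best-scoring concepts are added.
    """
    problem = set(problem_concepts)
    scores = Counter()
    for profile in ranked_profiles[:n]:
        concepts = set(profile)
        shared = len(concepts & problem)
        if shared == 0:
            continue
        for concept in concepts - problem:
            scores[concept] += shared
    # Sort by descending score, then by name for a deterministic result.
    best = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return sorted(problem) + [concept for concept, _ in best[:k]]

ranked = [["a", "x", "y"], ["a", "x"], ["z"]]
expanded = prf_expand(["a"], ranked, n=10, k=1)
```

Because the function only needs an ordered list of profiles, it can sit on top of any ranking technique, as the last bullet notes.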

Page 17: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Similarities

Page 18: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Similarities

Complementarity (Similarity with difference topics)

Conceptual Similarity (Similarity of conceptual profiles)

Social Similarity (Similarity of Social Profiles)

Page 19: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Our Approach: Similarities

• Vector Similarity Measures
  – Weighted Overlap
  – Cosine Similarity

[Formula images: weighted overlap (weights w_i) and cosine]
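
The two vector measures could be sketched as follows for sparse profiles stored as dictionaries. Cosine similarity is the standard definition. The slide's formula for weighted overlap was not preserved, so the version below (the sum of the first vector's weights over shared concepts) is an assumption and may differ from the authors' definition.

```python
import math

def cosine_similarity(u, v):
    """Standard cosine similarity between two sparse vectors."""
    dot = sum(weight * v[key] for key, weight in u.items() if key in v)
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def weighted_overlap(u, v):
    """ASSUMPTION: sum of u's weights w_i over the concepts shared with v."""
    return sum(weight for key, weight in u.items() if key in v)

solver = {"Semantic_Web": 2, "Linked_data": 3}
candidate = {"Linked_data": 1, "Venture_capital": 5}
overlap = weighted_overlap(solver, candidate)
similarity = cosine_similarity(solver, candidate)
```

The same functions apply to conceptual and social profiles, since both are in vector form.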

Page 20: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Ranking

• By one similarity measure
  – complementarity
  – conceptual similarity
  – social similarity
• By a linear combination of measures: a*Comp+b*ConcSim+c*SocSim
• By a product of measures: Comp*ConcSim*SocSim
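
A small sketch of the two composite ranking functions. The candidate names, their scores, and the default coefficients a = b = c = 1 are hypothetical values chosen for illustration.

```python
def linear_score(comp, conc_sim, soc_sim, a=1.0, b=1.0, c=1.0):
    """a*Comp + b*ConcSim + c*SocSim"""
    return a * comp + b * conc_sim + c * soc_sim

def product_score(comp, conc_sim, soc_sim):
    """Comp*ConcSim*SocSim"""
    return comp * conc_sim * soc_sim

def rank(candidates, score):
    """Order candidate names by descending composite score."""
    return sorted(candidates, key=lambda name: score(*candidates[name]), reverse=True)

# Hypothetical (Comp, ConcSim, SocSim) triples for two candidates.
candidates = {
    "x": (0.9, 0.1, 0.5),
    "y": (0.5, 0.5, 0.4),
}
by_linear = rank(candidates, linear_score)
by_product = rank(candidates, product_score)
```

With these values the linear combination puts "x" first, while the product puts "y" first: a product penalizes a candidate that scores very low on any single measure, and a score of zero on one measure removes the candidate from contention.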

Page 21: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Evaluation

• Evaluation 1
  – recommending a collaborator to a group of solvers
  – a group of 3 solvers (experts in Semantic Web) is trying to solve 3 cross-disciplinary problems
  – problems inspired by real challenges (workshops, calls for papers, etc.)
• Evaluation 2
  – recommending collaborators to individual solvers
  – 12 Twitter users, experts in Semantic Web, look for collaborators for the same 3 problems

Page 22: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Evaluation: Metrics

• Discounted Cumulative Gain
  – what is the value of considering the first 10 suggestions, and what is the quality of their ordering
• Average Precision
  – what is the cumulative benefit of considering each next suggestion in a particular ranking
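
Both metrics can be sketched in a few lines. The slides do not say which variants were used, so this assumes the common log2-discounted form of DCG cut off at 10 suggestions, and an average precision normalized by the number of relevant suggestions found in the ranking. The relevance judgments in the example are made up.

```python
import math

def dcg(relevances, k=10):
    """Discounted cumulative gain over the first k suggestions.

    ASSUMPTION: gain rel_i discounted by log2(i + 1) for 1-based rank i.
    """
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def average_precision(relevant_flags):
    """Mean of precision@i taken at each rank i holding a relevant suggestion."""
    hits = 0
    total = 0.0
    for i, flag in enumerate(relevant_flags, start=1):
        if flag:
            hits += 1
            total += hits / i
    return total / hits if hits else 0.0

# Hypothetical judgments for a ranking of three suggestions.
example_dcg = dcg([1, 0, 1])
example_ap = average_precision([True, False, True])
```

DCG rewards placing relevant suggestions early, while average precision accumulates the benefit of each additional relevant suggestion as one moves down the ranking.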

Page 23: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Evaluation 1

• Discounted Cumulative Gain: compatibility

Page 24: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Evaluation 1

• Discounted Cumulative Gain: conceptual similarity

Page 25: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Evaluation 2

• Composite Ranking Functions: Product
  – Comp*ConcSim*SocSim
  – PRF(Comp*ConcSim*SocSim): PRF problem profile expansion with composite similarity.
  – HPSR(Comp)*ConcSim*SocSim: HPSR expansion performed on difference topics prior to calculating the complementarity (similarity with difference topics).
  – Comp*DMSR(ConcSim)*SocSim: DMSR expansion performed over the seed user profile prior to calculating interest similarity.
  – HPSR(Comp)*DMSR(ConcSim)*SocSim: composite function in which HPSR is used to expand profile topics and DMSR to expand the seed user topic profile prior to calculating the similarities.

Page 26: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Evaluation 2

• Discounted Cumulative Gain

Comp*ConcSim*SocSim

PRF(Comp*ConcSim*SocSim)

HPSR(Comp)*ConcSim*SocSim

Comp*DMSR(ConcSim)*SocSim

HPSR(Comp)*DMSR(ConcSim)*SocSim

Page 27: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Evaluation 2

• Average Precision (Cumulative)

Comp*ConcSim*SocSim

PRF(Comp*ConcSim*SocSim)

HPSR(Comp)*ConcSim*SocSim

Comp*DMSR(ConcSim)*SocSim

HPSR(Comp)*DMSR(ConcSim)*SocSim

Page 28: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Conclusions

• The Linked Data-based concept expansion technique (hyProximity) gives the best results when expanding topics for Compatibility measures. A distributional one works slightly better for Conceptual Similarity measures.

• In a composite ranking function, expanding profiles with hyProximity is beneficial if applied only to Compatibility. Expansion in both Compatibility and Conceptual Similarity has negative effects.

• All profile expansion techniques, applied individually, have positive effects compared to direct similarity calculation with no expansion.

Page 29: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Take Away

Problem

Expansion

• Compatibility: hyProximity, a Linked Data-based measure
• Conceptual Similarity: DMSR, a distributional measure

Page 30: Finding Co-solvers on Twitter, with the Little Help from Linked Data

Example

Problem: Semantic Web representation of start-up history for start-up performance indicators

davidsrose, fundingpost, ECVentureCapita, BVCA, vc20, AndySack, CVCACanada, Austin_Startups, tgmtgm, davidblerner

User: Milan Stankovic (@milstan)

Suggestions:

• Angel investor specialized in technology startups
• Entrepreneur, Social Networks (KLOUT), Metrics
• Investors and Entrepreneurs, Information technology
• Investors and Entrepreneurs, Information technology

Page 31: Finding Co-solvers on Twitter, with the Little Help from Linked Data

?