Representation Learning on Networks Yuxiao Dong Microsoft Research, Redmond Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University) Hao Ma (MSR & Facebook AI) and Kuansan Wang (MSR)

Representation Learning on Networks - Yuxiao Dong's Homepage · References 1. Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Chi Wang, Kuansan Wang, and Jie Tang. NetSMF: Large-Scale


Representation Learning on Networks

Yuxiao Dong

Microsoft Research, Redmond

Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University)

Hao Ma (MSR & Facebook AI) and Kuansan Wang (MSR)


Networks

• Economic networks
• Social networks
• Networks of neurons
• Biomedical networks
• Internet
• Information networks

Slides credit: Jure Leskovec


The Network & Graph Mining Paradigm

feature engineering → hand-crafted feature matrix $X$ → machine learning models → graph & network applications

• $x_{ij}$: node $v_i$'s $j$-th feature, e.g., $v_i$'s PageRank value

Graph & network applications:

• Node label inference;
• Link prediction;
• User behavior… …
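To make the hand-crafted feature matrix concrete, here is a minimal sketch that builds $X$ with two columns per node: degree and a PageRank value from plain power iteration. The toy graph, the damping factor 0.85, and the function names are assumptions for illustration, not part of the talk.

```python
def pagerank(adj, damping=0.85, iters=100):
    """Power-iteration PageRank; assumes an undirected graph with no isolated nodes."""
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / n for v in adj}
        for u, nbrs in adj.items():
            share = damping * rank[u] / len(nbrs)  # u splits its rank evenly
            for v in nbrs:
                nxt[v] += share
        rank = nxt
    return rank


def feature_matrix(adj):
    """Return (nodes, X) where row i of X is [degree, pagerank] of nodes[i]."""
    pr = pagerank(adj)
    nodes = sorted(adj)
    return nodes, [[float(len(adj[v])), pr[v]] for v in nodes]


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
nodes, X = feature_matrix(adj)
```

Each column here had to be chosen by hand, which is the step that representation learning replaces.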


Representation Learning for Networks

feature learning (replacing feature engineering) → latent feature matrix $Z$ (replacing the hand-crafted matrix) → machine learning models → graph & network applications

Graph & network applications:

• Node label inference;
• Node clustering;
• Link prediction;
• … …

• Input: a network $G = (V, E)$
• Output: $Z \in \mathbb{R}^{|V| \times k}$, $k \ll |V|$, a $k$-dimensional vector $Z_v$ for each node $v$.


Network Embedding: Random Walk + Skip-Gram

• sentences in NLP ↔ vertex paths in networks
• skip-gram (word2vec): the center $w_i$ with its context $w_{i-2}, w_{i-1}, w_{i+1}, w_{i+2}$

Perozzi et al. DeepWalk: Online learning of social representations. In KDD’14, pp. 701–710.
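The analogy between sentences and vertex paths can be sketched as follows: uniform random walks play the role of sentences, and a sliding window turns each walk into (node, context) training pairs for skip-gram. The toy graph, walk length, number of walks, and window size are illustrative assumptions; the skip-gram training step itself is omitted.

```python
import random


def random_walk(adj, start, length, rng):
    """Uniform random walk with `length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


def skipgram_pairs(walk, window):
    """All (node, context) pairs at distance <= window inside one walk."""
    pairs = []
    for i, w in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((w, walk[j]))
    return pairs


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
corpus = [random_walk(adj, v, 10, rng) for v in adj for _ in range(5)]
pairs = [p for walk in corpus for p in skipgram_pairs(walk, window=2)]
```

The resulting `pairs` list is the multiset that a skip-gram model would be trained on.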


Random Walk Strategies

• Random Walk

– DeepWalk (walk length > 1)

– LINE (walk length = 1)

• Biased Random Walk

• 2nd order Random Walk

– node2vec

• Metapath guided Random Walk

– metapath2vec
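As an illustration of a biased, 2nd-order random walk, the sketch below follows the commonly described node2vec rule: when moving from `prev` to `cur`, a candidate next node gets unnormalized weight 1/p if it is `prev`, 1 if it is also a neighbor of `prev`, and 1/q otherwise. The function names and the toy graphs are assumptions for illustration.

```python
import random


def node2vec_step(adj, prev, cur, p, q, rng):
    """Sample the next node given the previous and the current node."""
    nbrs = adj[cur]
    weights = []
    for x in nbrs:
        if x == prev:            # return to where we came from
            weights.append(1.0 / p)
        elif x in adj[prev]:     # stay close to prev (distance 1)
            weights.append(1.0)
        else:                    # move away from prev (distance 2)
            weights.append(1.0 / q)
    return rng.choices(nbrs, weights=weights, k=1)[0]


def node2vec_walk(adj, start, length, p, q, rng):
    """Walk with `length` nodes; the first step is uniform."""
    walk = [start, rng.choice(adj[start])]
    while len(walk) < length:
        walk.append(node2vec_step(adj, walk[-2], walk[-1], p, q, rng))
    return walk[:length]


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
rng = random.Random(0)
walk = node2vec_walk(adj, 0, 8, p=1.0, q=2.0, rng=rng)
```

With p = q = 1 every weight is 1, so the walk reduces to the uniform random walk used by DeepWalk.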


Application: Embedding Heterogeneous Academic Graph

Microsoft Academic Graph

metapath2vec

• https://academic.microsoft.com/

• https://www.openacademic.ai/oag/

• metapath2vec: scalable representation learning for heterogeneous networks. In KDD 2017.
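A metapath-guided random walk on a heterogeneous academic graph can be sketched as below: at each step the walker may only move to neighbors whose type matches the next symbol of the metapath, which is repeated cyclically. The tiny author/paper/venue graph, the metapath "APVPA", and all names are hypothetical; this is not the released metapath2vec code.

```python
import random


def metapath_walk(adj, node_type, start, metapath, length, rng):
    """Random walk whose node types follow `metapath` (first type == last type)."""
    assert node_type[start] == metapath[0]
    cycle = len(metapath) - 1          # the last symbol wraps onto the first
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % cycle]
        candidates = [x for x in adj[walk[-1]] if node_type[x] == wanted]
        if not candidates:             # dead end: no neighbor of the wanted type
            break
        walk.append(rng.choice(candidates))
    return walk


# Hypothetical graph: authors a1, a2; papers p1, p2; venue v1.
adj = {
    "a1": ["p1"], "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "v1"], "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
walk = metapath_walk(adj, node_type, "a1", "APVPA", 9, random.Random(0))
types = "".join(node_type[x] for x in walk)
```

The walks are then fed to skip-gram exactly as in the homogeneous case.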


Application 1: Related Venues

• https://academic.microsoft.com/

• https://www.openacademic.ai/oag/

• metapath2vec: scalable representation learning for heterogeneous networks. In KDD 2017.


Application 2: Similarity Search (Institution)

[Figure: institution similarity search results; institutions shown: Harvard, Stanford, Columbia, Yale, UChicago, Johns Hopkins, Microsoft, Google, AT&T Labs, MIT, Facebook, CMU]

• https://academic.microsoft.com/

• https://www.openacademic.ai/oag/

• metapath2vec: scalable representation learning for heterogeneous networks. In KDD 2017.


Network Embedding

Input: adjacency matrix $A$ → Random Walk → Skip-Gram → Output: vectors $Z$

• Random Walk

– DeepWalk (walk length > 1)

– LINE (walk length = 1)

• Biased Random Walk

• 2nd order Random Walk

– node2vec

• Metapath guided Random Walk

– metapath2vec


Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization

• DeepWalk
• LINE
• PTE
• node2vec

Notation:

• $A$: adjacency matrix
• $D$: degree matrix
• $\mathrm{vol}(G) = \sum_i \sum_j A_{ij}$
• $b$: number of negative samples
• $T$: context window size

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.


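Understanding Random Walk + Skip Gram

Skip-gram context: the center $w_i$ with its context $w_{i-2}, w_{i-1}, w_{i+1}, w_{i+2}$

Skip-gram with negative sampling implicitly factorizes the matrix whose $(w, c)$ entry is

$\log\left(\frac{\#(w,c)\,|\mathcal{D}|}{b\,\#(w)\,\#(c)}\right)$

What is this matrix in terms of the graph $G = (V, E)$?

• Adjacency matrix $A$
• Degree matrix $D$
• Volume of $G$: $\mathrm{vol}(G)$

• $\#(w, c)$: co-occurrence of $w$ & $c$
• $\#(w)$: occurrence of node $w$
• $\#(c)$: occurrence of context $c$
• $\mathcal{D}$: node-context pair $(w, c)$ multiset
• $|\mathcal{D}|$: number of node-context pairs

Levy and Goldberg. Neural word embeddings as implicit matrix factorization. In NIPS 2014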
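The quantity $\log\left(\frac{\#(w,c)\,|\mathcal{D}|}{b\,\#(w)\,\#(c)}\right)$ can be computed directly from a multiset of node-context pairs. The sketch below does that for a tiny hand-made multiset; the pair list and the function name are illustrative assumptions.

```python
import math
from collections import Counter


def shifted_pmi(pairs, b):
    """log(#(w,c) * |D| / (b * #(w) * #(c))) for every observed pair (w, c)."""
    pair_count = Counter(pairs)                 # #(w, c)
    w_count = Counter(w for w, _ in pairs)      # #(w)
    c_count = Counter(c for _, c in pairs)      # #(c)
    total = len(pairs)                          # |D|
    return {
        (w, c): math.log(n * total / (b * w_count[w] * c_count[c]))
        for (w, c), n in pair_count.items()
    }


# Hypothetical multiset D of (node, context) pairs.
pairs = [("a", "b"), ("a", "b"), ("b", "a"), ("a", "c")]
scores = shifted_pmi(pairs, b=1)
```

Pairs that never co-occur have no entry here; in the matrix view they correspond to $\log 0$, which is why a truncated logarithm is used when the matrix is factorized explicitly.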


Understanding Random Walk + Skip Gram

Suppose the multiset $\mathcal{D}$ is constructed from random walks on the graph $G$:

$\log\left(\frac{\#(w,c)\,|\mathcal{D}|}{b\,\#(w)\,\#(c)}\right)$

• $\#(w, c)$: co-occurrence of $w$ & $c$
• $\#(w)$: occurrence of node $w$
• $\#(c)$: occurrence of context $c$
• $\mathcal{D}$: node-context pair $(w, c)$ multiset
• $|\mathcal{D}|$: number of node-context pairs


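Understanding Random Walk + Skip Gram

• Partition the multiset $\mathcal{D}$ into several sub-multisets according to the way in which each node and its context appear in a random walk node sequence.
• More formally, for $r = 1, 2, \cdots, T$, we define

$\mathcal{D}_{\overrightarrow{r}} = \{(w, c) : (w, c) \in \mathcal{D},\ w = w_j^n,\ c = w_{j+r}^n\}$

$\mathcal{D}_{\overleftarrow{r}} = \{(w, c) : (w, c) \in \mathcal{D},\ w = w_{j+r}^n,\ c = w_j^n\}$

where $w_j^n$ is the $j$-th node of the $n$-th random walk. The sub-multisets distinguish direction and distance.

$\log\left(\frac{\#(w,c)\,|\mathcal{D}|}{b\,\#(w)\,\#(c)}\right)$

• $\#(w, c)$: co-occurrence of $w$ & $c$
• $\#(w)$: occurrence of node $w$
• $\#(c)$: occurrence of context $c$
• $\mathcal{D}$: node-context pair $(w, c)$ multiset
• $|\mathcal{D}|$: number of node-context pairs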


Understanding Random Walk + Skip Gram

Let the length of the random walk $L \to \infty$.

• $\#(w, c)$: co-occurrence of $w$ & $c$
• $\mathcal{D}$: $(w, c)$ multiset


Understanding Random Walk + Skip Gram

the length of random walk 𝐿 → ∞
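The limit can be sketched as follows, after the argument in the cited WSDM’18 paper (Qiu et al.). The notation is an assumption of this sketch: $P = D^{-1}A$ is the random-walk transition matrix, $d_w$ is the degree of node $w$, the graph is undirected and connected, $\#(w,c)_{\overrightarrow{r}}$ counts the pairs where $c$ appears $r$ steps after $w$ (with $\mathcal{D}_{\overrightarrow{r}}$ the corresponding sub-multiset), and the arrows denote convergence in probability.

```latex
\begin{aligned}
\frac{\#(w,c)_{\overrightarrow{r}}}{|\mathcal{D}_{\overrightarrow{r}}|}
  &\xrightarrow{p} \frac{d_w}{\mathrm{vol}(G)}\,(P^r)_{w,c},
\qquad
\frac{\#(w,c)_{\overleftarrow{r}}}{|\mathcal{D}_{\overleftarrow{r}}|}
  \xrightarrow{p} \frac{d_c}{\mathrm{vol}(G)}\,(P^r)_{c,w}, \\
\frac{\#(w,c)}{|\mathcal{D}|}
  &\xrightarrow{p} \frac{1}{2T}\sum_{r=1}^{T}
     \left(\frac{d_w}{\mathrm{vol}(G)}\,(P^r)_{w,c}
         + \frac{d_c}{\mathrm{vol}(G)}\,(P^r)_{c,w}\right), \\
\frac{\#(w)}{|\mathcal{D}|}
  &\xrightarrow{p} \frac{d_w}{\mathrm{vol}(G)},
\qquad
\frac{\#(c)}{|\mathcal{D}|}
  \xrightarrow{p} \frac{d_c}{\mathrm{vol}(G)}, \\
\frac{\#(w,c)\,|\mathcal{D}|}{\#(w)\,\#(c)}
  &\xrightarrow{p} \frac{\mathrm{vol}(G)}{2T}
     \left(\frac{1}{d_c}\sum_{r=1}^{T}(P^r)_{w,c}
         + \frac{1}{d_w}\sum_{r=1}^{T}(P^r)_{c,w}\right)
   = \frac{\mathrm{vol}(G)}{T}
     \left(\sum_{r=1}^{T}P^r D^{-1}\right)_{w,c}.
\end{aligned}
```

The last equality uses the symmetry of $P^r D^{-1}$ on undirected graphs, that is, $d_w\,(P^r)_{w,c} = d_c\,(P^r)_{c,w}$. Dividing by $b$ and taking the logarithm gives the matrix that DeepWalk implicitly factorizes.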


Understanding Random Walk + Skip Gram

DeepWalk is asymptotically and implicitly factorizing

$\log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$

• $A$: adjacency matrix
• $D$: degree matrix
• $\mathrm{vol}(G) = \sum_i \sum_j A_{ij}$
• $b$: number of negative samples
• $T$: context window size

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.


Unifying DeepWalk, LINE, PTE, & node2vec as Matrix Factorization

Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18. The most cited paper in WSDM’18 as of May 2019

• DeepWalk

• LINE

• PTE

• node2vec


NetMF: explicitly factorizing the DeepWalk matrix

DeepWalk is asymptotically and implicitly factorizing

$\log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$

NetMF applies matrix factorization to this matrix explicitly.

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.


The NetMF algorithm

1. Construction: $S = \log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$
2. Factorization

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.
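The two steps, construction and factorization, can be sketched with dense NumPy arrays for a small graph. This is an illustrative sketch under stated assumptions, not the authors' released code: it computes the matrix powers exactly, applies an element-wise truncated logarithm $\log(\max(\cdot, 1))$ to avoid $\log 0$, and uses a full SVD. The function names and the toy graph are hypothetical.

```python
import numpy as np


def deepwalk_matrix(A, T=10, b=1.0):
    """vol(G)/(b*T) * (sum_{r=1..T} (D^-1 A)^r) * D^-1, before the logarithm."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    vol = deg.sum()
    P = A / deg[:, None]                  # D^-1 A (row-normalized)
    power, total = np.eye(A.shape[0]), np.zeros_like(A)
    for _ in range(T):
        power = power @ P                 # (D^-1 A)^r
        total += power
    return (vol / (b * T)) * total / deg[None, :]   # right-multiply by D^-1


def netmf_embedding(A, dim, T=10, b=1.0):
    """Step 1: build the truncated-log matrix. Step 2: rank-`dim` SVD."""
    M = deepwalk_matrix(A, T=T, b=b)
    log_M = np.log(np.maximum(M, 1.0))    # element-wise truncated logarithm
    U, sigma, _ = np.linalg.svd(log_M)
    return U[:, :dim] * np.sqrt(sigma[:dim])


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
Z = netmf_embedding(A, dim=2, T=3, b=1.0)
```

Both the construction and the SVD operate on a dense $|V| \times |V|$ matrix, which is affordable only for small graphs.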


Results

• Predictive performance on varying the ratio of training data;
• The x-axis represents the ratio of labeled data (%)

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.


Results

1. Qiu et al. Network embedding as matrix factorization: unifying deepwalk, line, pte, and node2vec. In WSDM’18.

Explicit matrix factorization (NetMF) offers

performance gains over implicit matrix factorization

(DeepWalk & LINE)


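Network Embedding

Input: adjacency matrix $A$ → Output: vectors $Z$, via one of:

• Random Walk + Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
• $S = f(A)$ → (dense) Matrix Factorization: NetMF

NetMF: incorporate network structures $A$ into the similarity matrix $S$, and then factorize $S$

$f(A) = \log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$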


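Challenges

NetMF is not practical for very large networks: the matrix

$S = \log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$

is a dense $|V| \times |V|$ matrix.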


NetMF

How can we solve this issue?

1. Construction: $S = \log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$
2. Factorization

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


NetSMF: Sparse

How can we solve this issue?

1. Sparse Construction: $S = \log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$
2. Sparse Factorization

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


Sparsify $S$

For the random-walk matrix polynomial

$L = D - \sum_{r=1}^{T} \alpha_r D (D^{-1}A)^{r}$,

where $\sum_{r=1}^{T} \alpha_r = 1$ and the $\alpha_r$ are non-negative, one can construct a $(1+\epsilon)$-spectral sparsifier $\widetilde{L}$ with $O(n \log n / \epsilon^{2})$ non-zeros in time $O(T^{2} m \log^{2} n / \epsilon^{2})$ for undirected graphs ($n = |V|$, $m = |E|$).

• Dehua Cheng, Yu Cheng, Yan Liu, Richard Peng, and Shang-Hua Teng. Efficient Sampling for Gaussian Graphical Models via Spectral Sparsification. COLT 2015.
• Dehua Cheng, Yu Cheng, Yan Liu, Richard Peng, and Shang-Hua Teng. Spectral sparsification of random-walk matrix polynomials. arXiv:1502.03496.


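Sparsify $S$

For the random-walk matrix polynomial

$L = D - \sum_{r=1}^{T} \alpha_r D (D^{-1}A)^{r}$,

where $\sum_{r=1}^{T} \alpha_r = 1$ and the $\alpha_r$ are non-negative, one can construct a $(1+\epsilon)$-spectral sparsifier $\widetilde{L}$ with $O(n \log n / \epsilon^{2})$ non-zeros in time $O(T^{2} m \log^{2} n / \epsilon^{2})$ ($n = |V|$, $m = |E|$).

With $\alpha_r = 1/T$, the polynomial $\sum_{r=1}^{T} \alpha_r D (D^{-1}A)^{r} = D - L$ is the core of

$S = \log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right) = \log\left(\frac{\mathrm{vol}(G)}{b}\,D^{-1}(D - L)\,D^{-1}\right)$

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019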
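One way to build a sparse stand-in for $\frac{1}{T}\sum_{r=1}^{T} D (D^{-1}A)^{r}$ is to sample short paths instead of multiplying matrices. The sketch below is a simplified, unweighted-graph version of that idea and is an assumption of this note rather than the paper's exact algorithm: pick a random edge and a length $r$, extend the edge on both sides by random walks so the path has $r$ edges, and credit the two endpoints of the path with weight $m / M$ ($m$ edges, $M$ samples). The function name and the toy graph are hypothetical.

```python
import random
from collections import defaultdict


def path_sampling_sparsifier(adj, T, num_samples, rng):
    """Sparse, symmetric estimate of (1/T) * sum_{r=1..T} D (D^-1 A)^r."""
    directed = [(u, v) for u in adj for v in adj[u]]   # both orientations
    m = len(directed) // 2                             # undirected edge count
    weight = m / num_samples
    sparse = defaultdict(float)
    for _ in range(num_samples):
        u, v = rng.choice(directed)       # random edge with random orientation
        r = rng.randint(1, T)             # path length, uniform in 1..T
        k = rng.randint(1, r)             # position of the edge on the path
        for _step in range(k - 1):        # walk k-1 steps backwards from u
            u = rng.choice(adj[u])
        for _step in range(r - k):        # walk r-k steps forwards from v
            v = rng.choice(adj[v])
        sparse[(u, v)] += weight          # credit both orientations so the
        sparse[(v, u)] += weight          # estimate stays symmetric
    return dict(sparse)


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
sparse = path_sampling_sparsifier(adj, T=3, num_samples=1000, rng=random.Random(0))
```

The number of non-zeros is at most twice the number of samples, regardless of how dense the exact polynomial is, and the entries sum to $\mathrm{vol}(G) = 2m$ just like the exact matrix.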


NetSMF: Sparse

Factorize the constructed sparse matrix

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


NetSMF: bounded approximation error

[Figure: the matrix $M$ and its sparse approximation $\widetilde{M}$]

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


#non-zeros

~4.5 Quadrillion → 45 Billion

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


Effectiveness:

• (sparse MF) NetSMF ≈ (explicit MF) NetMF > (implicit MF) DeepWalk/LINE

Efficiency:

• Sparse MF can handle billion-scale network embedding

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


Embedding Dimension?

1. Qiu et al. NetSMF: Network embedding as sparse matrix factorization. In WWW 2019


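Network Embedding

Input: adjacency matrix $A$ → Output: vectors $Z$, via one of:

• Random Walk + Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
• $S = f(A)$ → (dense) Matrix Factorization: NetMF
• $S = f(A)$ → Sparsify $S$ → (sparse) Matrix Factorization: NetSMF

$f(A) = \log\left(\frac{\mathrm{vol}(G)}{bT}\left(\sum_{r=1}^{T}(D^{-1}A)^{r}\right)D^{-1}\right)$

Incorporate network structures $A$ into the similarity matrix $S$, and then factorize $S$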


ProNE: Faster & more scalable network embedding

1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019


Embedding enhancement via spectral propagation

$R_d \leftarrow D^{-1}A\,(I_n - \widetilde{L})\,R_d$

• $\widetilde{L}$ is the spectral filter of $L = I_n - D^{-1}A$
• $D^{-1}A\,(I_n - \widetilde{L})$ is $D^{-1}A$ modulated by the filter in the spectrum

The idea of Graph Neural Networks

1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019
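The update $R_d \leftarrow D^{-1}A\,(I_n - \widetilde{L})\,R_d$ can be sketched for a small graph by building $\widetilde{L}$ from an explicit eigendecomposition. Everything below is an assumption for illustration: the filter is passed in as a function `g` of the eigenvalues, the Gaussian-shaped `band_pass` and its parameters are arbitrary choices, and a scalable implementation would approximate the filter rather than compute eigenvectors.

```python
import numpy as np


def spectral_propagation(A, R, g):
    """Return D^-1 A (I - L_tilde) R with L_tilde = U g(Lambda) U^-1."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    deg = A.sum(axis=1)
    d_sqrt = np.sqrt(deg)
    # I - D^-1/2 A D^-1/2 is symmetric and similar to L = I - D^-1 A,
    # so it has the same (real) eigenvalues.
    L_sym = np.eye(n) - A / np.outer(d_sqrt, d_sqrt)
    lam, V = np.linalg.eigh(L_sym)
    U = V / d_sqrt[:, None]            # D^-1/2 V: eigenvectors of L
    U_inv = V.T * d_sqrt[None, :]      # V^T D^1/2: inverse of U
    L_tilde = (U * g(lam)[None, :]) @ U_inv
    P = A / deg[:, None]               # D^-1 A
    return P @ (np.eye(n) - L_tilde) @ R


def band_pass(lam, mu=0.2, theta=0.5):
    """Illustrative Gaussian-shaped filter; mu and theta are arbitrary here."""
    return np.exp(-0.5 * ((lam - mu) ** 2 - 1.0) * theta)


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
A = np.zeros((6, 6))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
R = np.random.default_rng(0).standard_normal((6, 2))   # initial embedding
R_enhanced = spectral_propagation(A, R, band_pass)
```

As a sanity check, with the identity filter $g(\lambda) = \lambda$ the modulated Laplacian equals $L$ itself and the update reduces to $(D^{-1}A)^2 R_d$.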


Performance

ProNE offers 10-400X speedups (ProNE with 1 thread vs. baselines with 20 threads)

[Figure: running-time comparison, 20 threads vs. 1 thread; annotated running times: 19 hours, 98 mins, 10 mins]

ProNE embeds 100,000,000 nodes with 1 thread in 29 hours, with superior performance

1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019


A general embedding enhancement framework

1. Zhang et al. ProNE: Fast and Scalable Network Representation Learning. In IJCAI 2019


Network Embedding

Input: adjacency matrix $A$ → Output: vectors $Z$, via one of:

• Random Walk + Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
• $S = f(A)$ → (dense) Matrix Factorization: NetMF
• $S = f(A)$ → Sparsify $S$ → (sparse) Matrix Factorization: NetSMF
• (sparse) Matrix Factorization → $Z = f(Z')$: ProNE

ProNE: factorize $A$, and then incorporate network structures via spectral propagation


Network Embedding

Input: adjacency matrix $A$ → Output: vectors $Z$, via one of:

• Random Walk + Skip-Gram: DeepWalk, LINE, node2vec, metapath2vec
• $S = f(A)$ → (dense) Matrix Factorization: NetMF (theory & better accuracy)
• $S = f(A)$ → Sparsify $S$ → (sparse) Matrix Factorization: NetSMF (handles billion-scale graphs)
• (sparse) Matrix Factorization → $Z = f(Z')$: ProNE (10-400X speedups)


References

1. Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Chi Wang, Kuansan Wang, and Jie Tang. NetSMF: Large-Scale Network Embedding as Sparse Matrix Factorization. WWW 2019.
2. Jie Zhang, Yuxiao Dong, Yan Wang, Jie Tang, and Ming Ding. ProNE: Fast and Scalable Network Representation Learning. IJCAI 2019.
3. Jiezhong Qiu, Yuxiao Dong, Hao Ma, Jian Li, Kuansan Wang, and Jie Tang. Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec. WSDM 2018.
4. Yuxiao Dong, Nitesh V. Chawla, Ananthram Swami. metapath2vec: Scalable Representation Learning for Heterogeneous Networks. KDD 2017.
5. Fanjing Zhang, et al. OAG: Toward Linking Large-scale Heterogeneous Entity Graphs. ACM KDD 2019.
6. Wu, Shi, Dong, Huang, Chawla. Neural Tensor Decomposition. WSDM 2019.


Microsoft Academic Graph

https://academic.microsoft.com as of Sep. 2019

The graph data is open!

• 230 million authors
• 25,570 institutions
• 48,757 journals
• 4,307 conferences
• 664,862 fields of study
• 228 million papers/patents/books/preprints
• Years covered: 1800 to 2019


Thank you!

Papers & data & code available at https://ericdongyx.github.io/


Joint work with Jiezhong Qiu, Jie Zhang, Jie Tang (Tsinghua University)

Hao Ma (MSR & Facebook AI) and Kuansan Wang (MSR)