
Appendix 1:

Research Directions of the 2018 Tencent Rhino-Bird

Elite Training Program

Direction 1: Machine Learning and Related Applications

Subject 1.1: Time Series Analysis and Modeling of User Behavior

Analyze tera-scale data using various machine learning algorithms (including deep

learning, graph learning, reinforcement learning etc.) and large-scale computing

clusters. In addition, explore effective user behavior modeling tools (such as user

segmentation, content recommendation, anomaly detection, visualization etc.) to help

improve user experience and system efficiency.

Mentor Profile

He received a bachelor's degree in Biomedical Engineering from Zhejiang University,

a master's degree in Control Theory and Engineering from Zhejiang University, and a

Ph.D. from the University of Texas at Arlington in Computer Science. During this

time, he worked as a visiting student and research intern at Microsoft Research Asia

and the IBM T. J. Watson Institute. He has published over 30 papers at major

conferences and in journals in related fields (ICML, NIPS, CVPR, ICCV, AAAI, IJCAI, SIGKDD

etc.) and worked as a key data scientist at two U.S. startups that were listed on

NASDAQ and NYSE respectively. He also worked at Didi Chuxing

before he joined Tencent. Currently, he is an expert researcher.

Subject 1.2: Training Acceleration and Structural Learning in Large-scale

Distributed Deep Learning

This subject focuses on the following two aspects:

1. Compression and acceleration of deep learning models: Reduce space usage of the

models during storage and operation and accelerate their computing speed during

inference, through the quantization or sparsification of the parameters and/or

gradients in the deep learning models.

2. Structural learning of deep learning models: Explore a more effective deep learning

neural network structure for large-scale data scenarios and achieve automatic learning

to reduce the research cost of deep learning and improve the accuracy of deep

learning models.
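
As a rough illustration of the sparsification idea in item 1, the sketch below zeroes out the smallest-magnitude weights and keeps only the non-zero entries. Magnitude-based pruning is just one common choice and is an assumption here, as are the function names `prune_by_magnitude` and `to_sparse` and the toy weight values; the text does not specify a method.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def to_sparse(weights):
    """Store only the non-zero entries as (index, value) pairs."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


# Toy example (hypothetical values): drop half of the weights.
weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.02, 0.0, 0.45]
pruned = prune_by_magnitude(weights, sparsity=0.5)
sparse = to_sparse(pruned)
print(pruned)
print(sparse)
```

Storing only the surviving entries is what reduces space usage; whether accuracy holds up after pruning depends on the model and usually requires fine-tuning, which this sketch does not cover.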

Mentor Profile

Mentor 1: He received a Ph.D. from the Institute of Automation, Chinese Academy of

Sciences, and is now a Tencent senior researcher. His main research interests are deep learning

and distributed learning, especially the application of quantization methods in both

fields to enhance model training and inference efficiency.

Mentor 2: He graduated from the Beijing University of Aeronautics and Astronautics.

He has been engaged in machine learning for many years at Baidu and Tencent.

Currently, he is a Tencent senior researcher, with research interests focused on

machine learning platform construction, large-scale distributed system design, deep

learning, hyperparameter learning, online learning, boosting etc.

Subject 1.3: Transfer Learning and Parallel Acceleration of Large-scale Graph

Algorithms

This subject focuses on two areas:

1. An aspect-based recommender system that can improve recommendation coverage

and accuracy. Because annotating aspect data is very costly, it is hoped that

transfer learning algorithms can transfer knowledge from domains with existing

annotations to domains with unlabeled data, and improve efficiency when constructing

aspect-based recommender systems.

2. Parallel acceleration of traditional graph algorithms, such as maximal biclique

enumeration (MBE), has been a hot topic in parallel algorithm research. The traditional

solution mainly uses a DFS-based serial algorithm for MBE. How to solve MBE with a

parallel algorithm is still an open question.
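
To make the MBE problem in item 2 concrete, here is a minimal brute-force sketch for a tiny bipartite graph. It is not the DFS-based serial algorithm mentioned above: it simply tries every subset of left vertices and closes it, which is exponential and only meant to show what the output looks like. The function name `maximal_bicliques` and the example graph are hypothetical.

```python
from itertools import combinations


def maximal_bicliques(adj):
    """Enumerate maximal bicliques of a bipartite graph.

    `adj` maps each left vertex to the set of its right neighbours.
    Returns a set of (left_vertices, right_vertices) frozenset pairs.
    """
    left = sorted(adj)
    found = set()
    for r in range(1, len(left) + 1):
        for subset in combinations(left, r):
            # Right side: neighbours shared by every vertex in the subset.
            right = set.intersection(*(set(adj[u]) for u in subset))
            if not right:
                continue
            # Left side: every left vertex adjacent to all of `right`.
            closed_left = frozenset(u for u in left if right <= set(adj[u]))
            found.add((closed_left, frozenset(right)))
    return found


# Hypothetical toy graph: left vertices u1..u3, right vertices a..c.
adj = {"u1": {"a", "b"}, "u2": {"a", "b", "c"}, "u3": {"c"}}
bicliques = maximal_bicliques(adj)
for left_side, right_side in sorted(bicliques, key=lambda p: sorted(p[0])):
    print(sorted(left_side), sorted(right_side))
```

Each subset can be processed independently, which hints at why the problem is attractive for parallelization, but a practical parallel algorithm would also need pruning and load balancing that this sketch ignores.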

Mentor Profile

Mentor 1: He received a Ph.D. from the Hong Kong University of Science and

Technology. His main research interests are transfer learning theory & applications

and heterogeneous data fusion. During his Ph.D. study, he published several papers at

top conferences such as KDD, AAAI, and IJCAI. In addition, he has been a reviewer

for IJCAI, AAAI, PAMI, SDM, TCSVT and other conferences and journals.

Mentor 2: He received a Ph.D. from the Department of Systems Engineering and

Engineering Management, The Chinese University of Hong Kong. His main research

interests are graph theory and data mining, graph-based large-scale distributed

machine learning, social network analysis and recommender systems. He has

published four papers at the top data mining conferences KDD, WWW, CIKM and

DASFAA, and has served as a reviewer for conferences such as KDD, WWW, CIKM,

WSDM and SDM and journals such as VLDBJ and TKDE.

Subject 1.4: Research on Core Algorithms and Application of Reinforcement

Learning in the Physical World

In recent years, remarkable achievements have been made in reinforcement learning

(RL) in the areas of virtual world games and simulation (e.g. AlphaGo, CMU Poker,

OpenAI Dota 2), but it has few applications in the physical world. How to build

bridges between the virtual world and physical world and effectively deploy the

models obtained through training in virtual simulators into the real world, or conduct

efficient RL training directly in the real world and apply the corresponding core

algorithms to the lives of users, are challenging and important issues. The results will

help in applying general artificial intelligence in the real world.

Mentor Profile

Mentor 1: He received a Ph.D. from the University of Wisconsin at Madison and is now a

Tencent expert researcher. Prior to joining Tencent, he was a senior research scientist

at Intel Research in Silicon Valley, USA. He proposed the world-leading DC flow

algorithm and has published over 10 papers at top conferences such as CVPR, ICCV,

and ICML. His current research interests are deep reinforcement learning and

computer vision.

Mentor 2: He received a Ph.D. from the University of Southern California and is now a Tencent

expert researcher. Prior to joining Tencent, he taught at the University of Central

Florida. He has published and presented nearly 20 papers at CVPR, ICCV, NIPS,

ICML, ICLR and other top conferences. His current research interests are deep

reinforcement learning and computer vision.

Subject 1.5: Research on Core Algorithms of Reinforcement Learning in Game

AI

Remarkable achievements have also been made in RL in the areas of game AI in

limited scenarios (e.g. Atari, ViZDoom, AlphaGo and OpenAI Dota 2). Researchers are

seeking solutions to important challenges, such as how to build a common game AI

platform that can be used in complex strategy games involving multiple intelligent

agents (such as StarCraft and King of Glory) to accurately estimate and understand

incomplete game scenarios, make long-term game strategy planning in collaboration

with different intelligent agents, and achieve victory. The results will help promote RL

in game AI.

Mentor Profile

He received a Ph.D. from Tsinghua University and is now a Tencent senior researcher.

Before joining Tencent, he was a postdoctoral researcher at Cornell University and

Rutgers University. He has published several papers at ICML and other top

conferences. His current research interests are deep reinforcement learning and

computer vision.

Subject 1.6: Massive Social Relationship Chain Computing Oriented to

Information Security

Social relations are an important user profile feature that helps WeChat and QQ

understand their users. Taking the social relations of 800 million active WeChat users as

an example, the complete representation is an adjacency matrix of 800 million by

800 million. However, this is quite inconvenient when performing analytic or

machine learning tasks. The computational cost is also very high. Network embedding

is a graph representation learning method that maps each network node to a

vector in a low-dimensional space, which makes relational computation

more efficient. The

representative algorithms of network embedding include DeepWalk (2014),

node2vec (KDD 2016), and LINE, released by Microsoft in 2015. However,

the open source algorithms have performance and functionality problems in practical

application. This subject mainly studies and implements efficient relational algorithms

that meet the needs of business applications.
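
A back-of-the-envelope calculation shows why the full adjacency matrix is impractical compared with an embedding table. The 800 million user count comes from the text; the embedding dimension of 128 and float32 storage are assumptions chosen only for illustration.

```python
USERS = 800_000_000      # active users, from the text
EMBED_DIM = 128          # assumed embedding dimension
BYTES_PER_FLOAT = 4      # assumed float32 storage

# Dense adjacency matrix with one bit per entry.
dense_bytes = USERS * USERS // 8
# Embedding table with one EMBED_DIM-dimensional vector per user.
embed_bytes = USERS * EMBED_DIM * BYTES_PER_FLOAT

print(f"dense adjacency matrix: {dense_bytes / 1e15:.0f} PB")
print(f"embedding table:        {embed_bytes / 1e9:.1f} GB")
```

In practice the graph would be stored as sparse adjacency lists rather than a dense matrix, so the first figure is an upper bound; the point is that fixed-length vectors give a compact input that downstream models can consume directly.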

Mentor Profile

He received a Ph.D. in machine learning in Italy and is now a Tencent senior

researcher. His doctoral dissertation was published in ACL (long paper). He has been

devoted to the application of machine learning in practical business scenarios,

including e-commerce, information, O2O and information security.

Subject 1.7: Optimization of Conversion Modeling and Conversion Rate

Estimation Based on Deep Neural Networks

In Internet advertising scenarios, conversion rate estimation has become an important

strategic link that affects advertising performance. Different industries have different

definitions of conversion. Conversion types may include account registration, paying

for downloads, order purchases etc. It is challenging to model conversion rate

estimates in these scenarios. We want to present a unified modeling method that uses

all advertising behavior data but avoids the interaction of different types of conversion

data and achieves an accurate estimation of different types of conversions.

Mentor Profile

He graduated from Shanghai Jiao Tong University in Computer Application

Technology and is now a Tencent senior researcher mainly engaged in data mining,

machine learning, and related research. He has published six international conference

papers, two of which were published in CIKM and AAAI as the primary author. At

present, he is chiefly responsible for estimating conversion rates in social advertising

and participating in the strategy of conversion optimization. Some projects have won

the Tencent Technology Breakthrough Prize.

Subject 1.8: Research on AI in MOBA Games

MOBA (multiplayer online battle arena) games have been the hottest games in the

market in recent years. Both League of Legends and King of Glory have hundreds of

millions of users, and eSports competitions attract global attention. The key attractions

of MOBA games lie in their rich and varied characters, skill sets, and strategic and

tactical cooperation. These real-time, high-DOF, complex games also present a good

environment for artificial intelligence technology research. How to use existing

artificial intelligence technology to achieve normal role operations in MOBA games

and reach or exceed the level of human players is a challenging subject and a

major concern.

Mentor Profile

He received a bachelor's degree from Fudan University and a Ph.D. from the School

of Computing, National University of Singapore. His Ph.D. study focus was

processing of document images. He worked as a postdoctoral researcher at the

National University of Singapore and was responsible for the application of machine

learning in medical imaging. Before he joined Tencent, he served as a researcher at

the Institute for Infocomm Research, Singapore, and was responsible for the

application of machine learning in such areas as intelligent transportation systems and

character recognition. Currently, he is a senior researcher and his main research

interest is the application and exploration of artificial intelligence in games.

Subject 1.9: Research on Deep Neural Network Algorithms Oriented to

Automated Pronunciation Evaluation and Feedback

Automated pronunciation evaluation is one of the core modules of computer-aided

language learning (CALL). The speech model in traditional evaluation systems is

based on a speech recognition setup, which ignores the specific needs of the evaluation

task and makes it difficult to evaluate non-standard pronunciation. At the

same time, traditional evaluation algorithms are based on certain specific acoustic and

vocal features. The recognition and extraction of these features require a lot of

training and data, which poses difficulties for practical application. The purpose of this

subject is to explore the construction of DNN algorithms oriented to pronunciation

evaluation to achieve end-to-end mapping from speech to evaluation results, and to

improve the correlation between evaluation results and manual evaluation to achieve

pronunciation evaluation and guidance feedback.

Mentor Profile

He received bachelor's and master's degrees from Tsinghua University and a Ph.D. from

MIT, and is now a Tencent senior researcher. His research interests include large-scale

numerical simulation, statistical analysis, stochastic simulation, optimization

algorithms, model prediction and uncertainty analysis. He has published several

applied mathematics papers in SIAM. Currently, he is responsible for refining

optimization algorithms for deep learning models, and for the development of and

algorithm research on speech evaluation technology.

Subject 1.10: Low-dimensional Coding of Multimodal Samples

Low-dimensional coding of samples is a fundamental issue in the field of machine

learning. It is also a desirable technique for practical applications such as word

embedding in NLP. Generative models based on Bayesian inference have been very

successful over past decades. Most are used to describe the mapping of samples from

a low-dimensional space to the observation space. In recent years, generative adversarial

networks (GANs) have gained prominence for learning the probability distribution of

samples. However, GANs are often hard to train when working with

multimodal samples. In this subject, we will explore GAN-based low-dimensional

encoding of multimodal samples.

Mentor Profile

He is a graduate of Fudan University (Bachelor), Tsinghua University (Master) and

the Department of Computer Science, Princeton University (Ph.D.). He was a postdoctoral

researcher at the California Institute of Technology and an associate professor at The

Chinese University of Hong Kong, and is now a distinguished scientist at Tencent. He

has been an editorial board member of the journals Theoretical Computer Science and

International Journal of Quantum Information. His main research interests are quantum and

classical randomized algorithms, complexity analysis, distributed protocol design, and

their applications in large-scale data processing, machine learning and basic

research of artificial intelligence.

Direction 2: Quantum Computing

Subject 2.1: Quantum Machine Learning Algorithms

Quantum algorithms show an exponential computing advantage in solving certain

large-scale machine learning tasks. Understanding the advantages of quantum

computers for certain types of tasks and conditions is one of the most important

research areas in the field of quantum computing. During the joint training program,

through working with mentors and team members, a student will develop new highly-

efficient quantum machine learning algorithms by studying known quantum

algorithms.

Mentor Profile

He graduated from Fudan University (Bachelor), Tsinghua University (Master) and









Direction 3: Speech Technology

Subject 3.1: Integrating Prediction Network with End to End Adaptive Speech

Recognition System

Currently, end-to-end speech recognition systems lack adaptability

and robustness. They can only achieve performance comparable to hybrid speech

recognition when given a tremendous amount of training data, so their accuracy is still

poor in many scenarios. In this project, we want to construct prediction networks based on both

acoustic and language information, which can capture knowledge about the speaker,

noise, accent etc. Finally, the predicted information is integrated into the end-to-end

system to perform fast adaptation.

Mentor Profile

He is an IEEE fellow, ACM distinguished scientist, and Tencent distinguished

scientist. He worked at Microsoft for several years. His main research interest is

language recognition, and he has published 2 books and more than 170 papers.

Subject 3.2: Robust Multi-talker Speech Recognition System for the Cocktail

Party Problem

Although speech recognition has achieved good performance in some

scenarios, its accuracy degrades considerably in many real, complex, noisy scenarios, and it is

still far from ready for real applications. Processing multi-talker overlapped

speech is especially challenging. This project seeks to use advanced deep

learning techniques, such as permutation invariant training (PIT) and deep clustering

(DPCL), and integrate multi-microphone processing and fast speaker adaptation

technologies. We hope that the technologies proposed in this project can significantly

improve system performance on overlapped speech.

Mentor Profile

He is a Ph.D. holder who previously worked at Shanghai Jiao Tong University and is

now a Tencent expert scientist. Currently, he focuses on speech recognition, speaker

recognition, deep learning etc. He has published nearly 100 papers.

Subject 3.3: Key Technologies for Speech Information Security in Low-resource

Complex Social Scenarios

This research focuses on UGC keyword recognition in scenarios with complex Internet

channels and cross-language and multilingual environments. Speech may be a

passage in a low-resource foreign language, far-field audio collected in a complex

channel scenario, or a recently popular live broadcast clip. In these complex social

scenarios, the main research methods for keyword triggering and retrieval in

low-resource speech include:

1. Adaptation methods for low-resource multi-language neural network acoustic

models: Starting from an existing neural network acoustic model trained on

adequate language data, and using only the limited target-language data available,

apply various model adaptation technologies to train a target-language acoustic

model whose performance is similar to that of the existing model.

2. Computing performance optimization of the neural network: One of the directions

is to carry out quantization or subspace clustering for large network parameter sets

from different perspectives. This reduces the precision or the number of distinct

values used to represent the parameters and accelerates computing. We hope to carry out

research and implementation of n-bit quantization neural networks in the speech keyword field.
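
The n-bit quantization mentioned in item 2 can be sketched as uniform (affine) quantization of a weight list to integer codes plus a scale and an offset. This is only one possible scheme and is an assumption, as are the names `quantize` and `dequantize` and the toy values; the text does not state which quantizer is intended.

```python
def quantize(weights, n_bits):
    """Uniformly quantize floats to n-bit integer codes plus (scale, offset)."""
    if n_bits < 1:
        raise ValueError("n_bits must be at least 1")
    levels = (1 << n_bits) - 1            # highest integer code
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]


# Toy example (hypothetical values): 2-bit codes, i.e. four levels.
weights = [-1.0, -0.4, 0.1, 0.3, 1.0]
codes, scale, lo = quantize(weights, n_bits=2)
restored = dequantize(codes, scale, lo)
print(codes)
print([round(v, 3) for v in restored])
```

Each weight now needs only `n_bits` bits instead of 32, at the price of a rounding error of at most half a quantization step per weight.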

Mentor Profile

He joined Dolby Laboratories after graduating from the Institute of Automation,

Chinese Academy of Sciences. He was in charge of the speech front-end (single and

multi-channel enhancement, echo and reverberation cancellation, source localization),

next-generation speech codec TCS, robust speech transmission, real-time conference

speech recognition, keyword retrieval and other projects. He has published 17 papers

at various international speech conferences and in journals and has been granted more

than 10 U.S. patents. His current interests include low-resource minority-language

keyword retrieval, decoder acceleration, single-channel Internet audio enhancement

etc.

Subject 3.4: Voice and Music Processing Technology

This research covers three areas. The first is single-channel speech enhancement based on

neural networks, addressing problems that are difficult to solve with traditional signal

processing, such as the cocktail party problem. The second is intelligent restoration of the

singing voice: adjusting the voice features of each sound in singing that is off rhythm or

out of tune. The third is voice conversion, which changes the personal

characteristics of one person's voice through voice processing so that it takes on the

characteristics of another person's voice, while keeping the original

semantic information unchanged.

Mentor Profile

He is a Tencent expert engineer. He graduated from Beijing Institute of Technology and

worked successively at ZTE and Tencent Technologies. He has more than 10 years’

experience in voice-related technology research and development. He has conducted

in-depth research on real-time voice communication technologies and holds many patents

on signal processing and network-related technologies. In recent years, the team he

leads has actively explored new technologies and built up solid technical

expertise in voice enhancement, voice conversion, and sound

beautification based on neural networks. The team has also published papers in related

fields at top conferences such as Interspeech.

Direction 4: Natural Language Processing

Subject 4.1: Technologies and Applications of Deep Text Comprehension Based

on Semantic Analysis and Knowledge Reasoning

To study and explore deep text comprehension technologies based on semantic

analysis and knowledge reasoning, and their applications in open domain chats and

other scenarios.

Mentor Profile

He graduated from the Department of Computer Science and Technology, Tsinghua

University. He was formerly a lead researcher at Microsoft Research Asia and a senior

algorithm expert at Alibaba Group. His research interests include semantic

understanding and intelligent human-computer interaction. He has published more

than 20 papers at international conferences including ACL, EMNLP, WWW, SIGIR,

CIKM, and AAAI. He has served multiple times as a member of the program

committees of conferences such as ACL, EMNLP, WWW, and AAAI, and as a

reviewer for the journals TOIS and TKDE.

Subject 4.2: Technologies and Applications of Text Generation Based on Deep

Neural Networks

To study and explore text generation technologies based on deep neural networks and

their applications in automatic conversation generation and text style generation.

Mentor Profile

He graduated from the University of Science and Technology of China and was formerly a

researcher at Microsoft Research Asia. He joined Tencent as an expert researcher. His

research interests include dialog interaction and text generation. He has published

multiple papers at international conferences including EMNLP, WWW, and KDD.

Subject 4.3: NLP Technology in Tencent Information Security

How to represent articles and sentences is a hot topic in NLP research. The current

major approach is to represent sentences by learning useful features from large

unlabeled corpora. Many researchers have tried unsupervised sentence

representation methods. Google's Doc2Vec and SkipThought, Facebook's Sentence2vec,

and work presented at ICLR'18 have proposed sentence representation frameworks, but

several key issues remain unresolved:

1. How to embed the semantics of words into sentences.

2. How to effectively represent both long articles and Chinese articles.

3. How to define objective functions that transform unsupervised problems into

self-supervised ones for learning.

The current platform has accumulated a large number of articles of different lengths and

in different areas. The major direction here is to effectively train article representation

models and use transfer learning to apply the learned information to downstream NLP tasks.

Mentor Profile

He graduated from the University of Sydney and is mainly engaged in the application

and research of natural language processing. He has extensive experience in

information extraction, text categorization, knowledge graphs and machine learning.

He worked in the Australian financial sector on intelligent anti-money laundering

and risk profiling, building machine learning prediction models

with natural language processing technology and recommending prevention

programs. Currently, he is engaged in basic research on natural language processing.

Subject 4.4: Machine Translation Based on Differentiable Neural Computers

Compared with traditional CNNs and RNNs, the differentiable neural

computer (DNC), as a general framework, has greater memory and generalization

capabilities. However, some problems still restrict its practical application:

1. The complex network structure makes optimization difficult, and the parameters are

very sensitive.

2. Some addressing operations allow only a modest degree of parallelism,

which makes it difficult to use GPU acceleration effectively.

This subject targets these two problems: it optimizes the DNC network and constructs

a new generation of DNC-based neural machine translation (NMT) models.

Mentor Profile

He received a Ph.D. from the Institute of Computing Technology, Chinese Academy

of Sciences, with research interests in natural language processing and deep learning.

He has published dozens of papers at top international conferences such as ACL,

EMNLP, IJCAI, and AAAI and has long served as a reviewer for top conferences and

journals such as ACL, EMNLP, Neural Computation and JCST.

Subject 4.5: Multi-objective Function Optimization for NMT: Translation and

Translation Quality Evaluation

Neural machine translation (NMT) is an important research issue in AI and NLP.

Existing NMT uses maximum likelihood estimation as the optimization goal and

does not incorporate a quantitative evaluation of translation quality. The purpose of this

subject is to explore ways to improve NMT optimization strategies. The aim is to

achieve multi-objective optimization of both the maximum likelihood objective and

translation evaluation metrics by improving model structures and adjusting

optimization goals, so as to improve translation quality and evaluate translation usability.
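
One way to read the multi-objective goal above is as a weighted sum of a likelihood term and an expected evaluation-metric term. The formulation below is a sketch under that assumption, not the subject's stated method; the symbols $\theta$ (model parameters), $(x, y)$ (source sentence and reference translation), $\hat{y}$ (sampled translation), $\Delta$ (a cost such as one minus sentence-level BLEU) and the weight $\lambda$ are all introduced here for illustration.

```latex
\mathcal{L}(\theta)
  = -\lambda \sum_{(x, y)} \log p_\theta(y \mid x)
  + (1 - \lambda) \sum_{(x, y)}
      \mathbb{E}_{\hat{y} \sim p_\theta(\cdot \mid x)}
      \left[ \Delta(\hat{y}, y) \right],
  \qquad 0 \le \lambda \le 1
```

Setting $\lambda = 1$ recovers plain maximum likelihood training, while smaller values put more weight on the evaluation metric.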

Mentor Profile


of Sciences, with research interests including machine translation, natural language

processing, and dialog systems. He has been a core R&D member on a number

of research projects, such as major projects of the 863 Program, general projects of

the Ministry of Education and the Samsung SVoice (Chinese, Japanese) intelligent

assistant system. He has published more than 10 papers at ACL, EMNLP, AAAI

and other top international conferences. Currently, he is engaged in the development

and improvement of translation engines and related NLP tools.

Subject 4.6: Reading Comprehension and Q&A

This subject covers answering questions given reference passages, including

question understanding, understanding of the reference information, answer

extraction and other natural language processing techniques.

Subject 4.7: Application of Reinforcement Learning in Natural Language

Processing

Based on real product scenarios and data, this subject explores the application of reinforcement

learning in natural language processing, including sequence generation, multi-round

dialogs, Q&A and other technical directions.

Subject 4.6 & 4.7 Mentor Profile

He received a Ph.D. from the Institute of Theoretical Physics, Chinese Academy of

Sciences in Statistical Physics. He is currently responsible for technology and product

applications related to machine learning and natural language understanding,

including dialog systems, reading comprehension, machine translation and other

directions. He has published multiple papers at top conferences such as ACL and

NIPS.

Subject 4.8: Construction of Large-scale Knowledge Graphs and Application in

Q&A Systems

This research is about the construction of large-scale domain knowledge graphs. It

focuses on knowledge acquisition, knowledge representation, and

knowledge-based Q&A.

Subject 4.9: Chatbots Based on Generative Model

This subject studies generative model-based chatbots, including the multi-round interaction

mechanism, domain knowledge fusion, dialog style transfer and diversification,

interaction-based online learning etc.


Mentor Profile

He received a Ph.D. from the State University of New York at Buffalo. He is currently

responsible for the R&D and product applications of chatbots, and has published

multiple papers at top conferences such as ACL, SIGIR, and IJCAI.

Direction 5: Visual and Multimedia Computing

Subject 5.1: Research on Key Technologies for Facial Detection and Recognition

The human face is one of the most important types of visual information. Automatic

face detection and recognition research is a hot and difficult issue in the fields of

artificial intelligence and computer vision and is highly valued in industry and

academia. There is great demand for human facial recognition technology in finance,

mobile, video surveillance and other related fields. The subject incorporates advanced

computer vision technology and uses deep learning as its main technical means. It

focuses on facial recognition, face liveness detection, 3D facial reconstruction and

recognition and other core technologies.

Mentor Profile

He received a master’s degree and a Ph.D. in Information Engineering from The Chinese

University of Hong Kong. He is an IEEE senior member and now a

Tencent expert engineer. He was a postdoctoral researcher at The Chinese University

of Hong Kong and Michigan State University. He worked at the Institute of Advanced

Technology of the Chinese Academy of Sciences as an associate fellow and then

fellow (Ph.D. supervisor). His research interests include artificial intelligence,

computer vision and facial detection and recognition. He has published and presented

more than 20 high-quality papers in top international journals and conferences,

including the top 3 computer vision international conferences (CVPR, ICCV, and

ECCV) and the top multimedia international conference (ACM MM).

Subject 5.2: Research on Image and Video Editing Technologies

This subject involves image processing, editing, generation and other issues. It

includes the study of low-level vision problems in images/videos and explores new research

tasks for GANs, capsule networks and other models in images/videos.

Subject 5.3: Research on Deep Video Understanding Technologies

Video understanding requires not only learning the representational significance of

single-frame images but also modeling temporal correlation between video frames.

Video understanding issues include video classification, action recognition, action

proposal, action localization, video captioning etc.


Mentor Profile

He received bachelor’s and master’s degrees from the School of Computer Science and

Technology, Harbin Institute of Technology, and a Ph.D. from the Department of Electronic

Engineering, The Chinese University of Hong Kong. He is now a Tencent

expert researcher. Before joining Tencent, he worked at Huawei’s Noah’s

Ark Lab in Hong Kong. He is now primarily engaged in deep learning for image/video applications

and multimodal deep learning research. He has presented and published multiple

papers at top international conferences and journals.

Subject 5.4: Research on Computer Vision Technologies in Augmented Reality

The computer vision technologies involved in augmented reality include image/video-

based SLAM technology, 3D scenario understanding, and others. The mentee can

focus on visual SLAM, 3D reconstruction, scenario analysis and other research issues.

Subject 5.5: Research on Computer Vision Technology in Robots

This subject explores the application of computer vision in robots. Typical research areas of visual

technology in robots include learning to grasp, robot navigation, learning to run etc.


Mentor Profile

He received a Ph.D. in Computer Science and Electronic Engineering from Columbia

University. He was a research scientist at the IBM Thomas J. Watson Research Center and is now a

Tencent expert researcher. He has won the Facebook Ph.D. Scholarship, the

Outstanding Doctoral Dissertation Award of Columbia University, the Young

Investigator Award of the Computer Vision and Pattern Recognition International

Conference (CVPR) and the Best Paper Honor of the Special Interest Group on

Information Retrieval (SIGIR). He has long been engaged in basic research and

product development in computer vision, machine learning, data mining, information

retrieval and other fields. So far, he has published or had accepted more than 100

papers, most at internationally-authoritative journals and conferences (such as IEEE,

IEEE TPAMI, NIPS, ICML, KDD, CVPR, ICCV, ECCV, IJCAI, AAAI, UAI, SIGIR,

SIGCHI), and has been cited more than 3,600 times according to Google Scholar. He

has served as a visiting editorial board member and reviewer for many authoritative

journals. Since 2007, he has been a member of the program committees of top

international conferences such as NIPS, CVPR and ICCV.

Subject 5.6: Research and Application of Deep Learning Technology in

Advertising Images

Multimodal information (including text information, object information, logo

information, etc.) in advertising images has positive significance for promoting

creative ads, understanding user preferences, and improving the impact of advertising.

This subject mainly studies the algorithms and applications of deep learning technology

in the multimodal information extraction of advertising images, including optical

character recognition (OCR), object detection, logo recognition, basic attribute

analysis (definition, similarity), CTR estimation and so on.

Mentor Profile

He received a Ph.D. from the School of Data and Computer Science, Sun Yat-sen

University. His main areas of interest include the detection & tracking of video

objects, image & text recognition, application of deep learning and distance metric

learning in the computer vision field and more. He has published 11 papers in

journals and conferences such as IEEE Transactions on Image Processing (TIP) and

JCST. He won the Excellent Paper Award of the National Conference on Image and

Graphics and first prizes in the China Graduate Contest on Smart-city Technology and Creative

Design. He currently engages in the research and application of advertising image

recognition algorithms.

Subject 5.7: Research on Text & Image Multimodal Relevance Based on Deep

Learning

This subject focuses on deep learning-based image recognition technology and multimodal

research that associates images with NLP. Specific topics include analyzing article topic

models, generating keyword content from images, jointly analyzing topic models and

image content, and applying the latest deep learning techniques to make

breakthroughs in research on the relevance between article titles, content, and

images.

Mentor Profile

He received a Ph.D. in Pattern Recognition and Artificial Intelligence from the Chinese

Academy of Sciences and is mainly engaged in the theory and application of computer

vision, machine learning, and reinforcement learning. He has published nine papers in

computer vision journals (including Trans. Image Processing, Neurocomputing,

Signal Processing Letters, etc.) and important international conferences, as well as a

translated book on computer vision. He has applied for a patent and was engaged in

scenario classification, large-scale object classification, game AI R&D (including Go,

Texas Hold'em), and intelligent customer service Q&A systems. He is currently

responsible for the research and application of image/video content-based AI.

Subject 5.8: Image Content Understanding and Sentiment Retrieval Based on

Deep Learning

A general image retrieval engine is designed to match image content against the object

or person terms in a user's query. However, in particular scenarios, the image needs

to match not only object content, but also the specific needs and desires of users. In

this area, we need to understand image content and run sentiment analysis for images

to satisfy sentiment retrieval of images in certain scenarios, such as music

backgrounds and radio posters.

Mentor Profile

He received a Ph.D. from The Chinese University of Hong Kong and is now a Tencent

senior researcher. He was responsible for algorithms and applications of search and

recommendation tasks. He has published multiple papers at top international

conferences (such as AAAI, SIGIR, WWW) and major international conferences

(such as CIKM, SIGSPATIAL, ICONIP). He has been nominated for the ICONIP

Best Paper Award, granted a patent, and contributed a chapter for the book

Encyclopedia of Social Network Analysis and Mining. He has also served as a

reviewer for many authoritative international journals, such as IEEE Transactions on

Knowledge and Data Engineering, IEEE Transactions on Multimedia, Neural Networks

and others. Currently, he is mainly engaged in research on understanding image

content, sentiment retrieval, and automatic image composition.

Subject 5.9: Research on Key Technologies for Object Detection & Recognition

Object detection & recognition research is a popular and difficult issue in the field of

artificial intelligence and computer vision, and highly valued in the industry and

academia. This subject targets the great demand for general object detection

technology in finance, mobile Internet, video surveillance and other related fields,

combines advanced computer vision technology, uses deep learning as the main

technical means, and focuses on breakthroughs in object detection & recognition in

different scenarios.

Mentor Profile

He is a Tencent senior researcher. He worked as a research assistant and obtained a

Ph.D. from The Chinese University of Hong Kong. He was a senior researcher at

Hong Kong Lenovo Research and the Hong Kong Jiu Ling Institute of AI Technology.

His research interests include artificial intelligence, computer vision, object detection

& recognition. He has been granted an international patent and three domestic patents.

Subject 5.10: Research on Advertising Image Generation Based on GAN

Network

From the dawn of the Internet, a variety of advertising such as banner ads, text ads,

graphic ads and dynamic creative ads have emerged. Exploring new ways to generate

advertising is of utmost importance, such as micro-advertising to attract attention and

improve user experience, dynamic banner creation to save tremendous labor and help

create a more personalized advertising system. Based on understanding advertising

content and the GAN network, this area will dynamically generate more advertising

images through in-depth knowledge and the dynamic combination of materials,

templates, texts, styles, and fonts, and produce superior advertising images for display

through dynamic selection (ranking issue).

Mentor Profile

He is a graduate of the Beijing University of Aeronautics and Astronautics and is now a

Tencent senior researcher. He worked in the core teams of Baidu and Alibaba and has

conducted in-depth research in various AI fields such as computer vision,

computational advertising, LBS, SLAM, and robotics. He holds more than ten patents

and currently is engaged in research and application of computer vision in products

and advertising recommendations.

Subject 5.11: Visual Computing of the Human Face

The human face is one of the important research objects in the fields of computer vision

and computer graphics and plays an important role in many visual tasks. According to

the statistics of authoritative image websites, face images account for more than 60%

of daily photos. Whether in to-C scenarios such as face retrieval, liveness detection,

and beauty and makeup, or in to-B scenarios such as security monitoring and man-machine

interaction, face visual computing has important research and practical value. This area relies on the

Tencent platform and takes facial images as the key research objects. It covers many

hot issues of computer vision, computer graphics (such as illuminance calibration,

face detection, 3D reconstruction, pose estimation, appearance modeling, attribute

editing) and their optimization and improvement effects on facial images. Mentees will not

only have access to world-class research subjects and the best young researchers in the

industry but also have the opportunity to make outstanding contributions in fields

such as facial image processing and to bring the research results to millions of

users.

Mentor Profile

He received a Ph.D. from Zhejiang University and is now a Tencent senior researcher.

Prior to joining Tencent, he worked at DJI as an innovative algorithm pre-

development engineer. He has published many first-author papers in CVPR, ECCV,

TIP and other top international computer vision conferences and journals,

and has served as a reviewer for CVPR, PG, TIP, TPAMI and other conferences and

journals. He has rich scientific research and practical experience, and his research

interests cover the interdisciplinary areas of computer vision and computer graphics, such

as 3D reconstruction, computational photography, appearance modeling and inverse

rendering.

Subject 5.12: Character Tracking & Recognition in Video Scenarios

Character tracking & recognition in video scenarios is an important research direction

in the field of video analysis and video understanding. It aims to understand the

position, actions, and relationships of characters in the video. This area has attracted a

great deal of attention from the industry and academia and involves many key

technologies in the field of computer vision, such as face detection, tracking,

character recognition, and semantic understanding. However, the tracking &

recognition of characters in videos, especially in open-scene videos, is still a

challenging issue because of the complexity of video content, the difficulty of

distinguishing foreground from background, rapid scene changes, and other problems.

In recent years, the development of deep learning technology has provided a feasible

solution to this problem. Relying on Tencent's advantages in data, technology, and

infrastructure, this area aims to study a deep learning method based on weak

supervision, use an end-to-end deep network structure to realize automatic tracking

& recognition of characters in video scenarios, and apply the method to various

Tencent services.

Mentor Profile

He received a Ph.D. in Computer Science from the University of Exeter, UK, and is

now a Tencent senior researcher. He was a postdoctoral researcher at the Visual Geometry

Group (VGG), University of Oxford. He is now responsible for face-related algorithm

research. His main research interests include deep learning, computer vision, face

detection, tracking & recognition etc.

Subject 5.13: Medical Imaging AI

The interdisciplinary integration of artificial intelligence and medical science will

bring about disruptive changes in the future medical field. Tencent has a high level of

technical reserves in medical imaging AI, has made a large investment, and

established cooperation with more than 100 top domestic hospitals.

In November 2017, the Ministry of Science and Technology recruited Tencent as the

AI "national team" to build the "open innovation platforms". This area will use the

massive medical imaging data and annotations obtained by Tencent from its partner

hospitals to develop early screening algorithms for diseases (including cancer,

cardiovascular and cerebrovascular diseases, cranial nerve diseases) based on deep

learning, including lesion localization, segmentation, benign and malignant

classification etc.

Mentor Profile

He is a Tencent expert researcher. He holds bachelor’s and master’s degrees from

Tsinghua University and a Ph.D. from the University of Maryland. He joined Siemens

Corporate Research in the USA after graduation and invented the projection space

learning method that has been widely used in Siemens intelligent image analysis

products. He has published 3 books and more than 100 papers, which have been cited

more than 4,500 times. He has been granted nearly 70 U.S. patents. Currently, he is a

senior IEEE member, an associate editor of IEEE Journal of Biomedical and Health

Informatics (impact factor 3.45), and an AIMBE fellow. He has won the second prize

of China National Science & Technology Progress Award, Thomas Alva Edison

Patent Award, and EACTS Techno-College Innovation Award.

Subject 5.14: Multimodal WeChat User Profile Analysis

Analyze the content of the UGC images, videos, and texts in WeChat Moments,

and create multi-dimensional and hierarchical user profiles to assist recommender

systems in different fields.

Subject 5.15: Construction of a Massive Image Database and Evaluation

Protocols in WeChat Ecology

This subject constructs a multi-label, hierarchical massive image database that is applicable to

WeChat scenarios. The labels need to embody both concrete and abstract visual

semantic concepts.


Mentor Profile

He is a graduate of the Institute of Computing Technology, Chinese Academy of

Sciences, where his Ph.D. research focused on multimodal, multi-granularity, large-scale

face retrieval. During his Ph.D. study, he published 15 computer vision papers at

international conferences and in journals, including CVPR (CCF A class), ICCV (CCF

A class) and the top journal TIP (CCF A class). He is now at Tencent, engaged in user

profile R&D.

Subject 5.16: Audio and Video Quality Evaluation

This research focuses on audio, video and image quality evaluation, which includes

full reference evaluation, partial reference evaluation and no reference evaluation. It

involves algorithm research on objective quality analysis of audio, video and images

by combining psychoacoustic models and the human visual system, with the intention

of providing objective assessment criteria that can be easily applied and satisfy

subjective needs.

Subject 5.17: Target Recognition and Tracking

This subject focuses on the research and application of computer vision based on deep

learning. It combines product data with user behavior to create a personalized and

intelligent product experience. Primary research directions include gesture

recognition, human pose recognition, image/video editing, generation and

understanding, target detection and tracking and recognition.

Subjects 5.16 & 5.17 Mentor Profile

He graduated from the South China University of Technology and now is a Tencent

expert engineer. He has been engaged in research on system architecture, network

technology, performance optimization, audio and video processing technology,

machine learning applications and other fields. He has been granted dozens of patents.

In recent years, his primary research focus has been on the exploration and application of new

technologies. He has rich experience in computer vision analysis and high-

performance neural network modeling.

Subject 5.18: Video Coding and Processing Technologies

A better visual experience can be provided by combining video/image processing and

coding technologies including video classification, automatic video effect

beautification, automatic video editing and synopsis, object tracking and recognition,

AI video compression, video super-resolution, AI flow control and video

communications.

Mentor Profile

He received a Ph.D. in Electrical and Computer Engineering from the University of

California, San Diego. Before he joined Tencent, he worked for Apple, responsible

for R&D of iTunes and FaceTime-related video technologies. Now he is with Tencent,

dedicated to enhancing the user experience for video-related applications. His

research interests include video analysis, processing, and coding, as well as machine

learning for video.

Direction 6: Research on Data Mining and Related Applications

Subject 6.1: Application of Reinforcement Learning Technology in Advertising

Recommender Systems

This research focuses on how to apply reinforcement learning technology to

advertising recommender systems, design reinforcement learning algorithms, explore

users’ potential interests, combine CTR estimation to learn the best online

recommendation strategy, and maximize the revenue of recommendation platforms.

Mentor Profile

He received a Ph.D. in Computer Science and Engineering from the Hong Kong University

of Science and Technology and is now a Tencent senior researcher. His main

research interests include transfer learning, recommender systems, machine learning,

etc. During his Ph.D. study, he published many papers at KDD, AAAI and other

conferences and journals, and was a reviewer for many international conferences

and journals such as IJCAI, WWW, and TKDE.

Subject 6.2: Social Network Structure Mining

This research studies the structures and attributes of the WeChat social network,

including user characteristics in the social network, the similarity between users, user

influence etc. Related technical areas involve machine learning, complex networks,

network representation learning, user influence modeling, influence maximization and

so on.

Mentor Profile

He received a master’s degree from the School of Mathematics, South China

University of Technology and is now a Tencent expert researcher. He joined Tencent

after graduation and works in data mining. At present, he is mainly responsible for

WeChat social data mining, WeChat social Lookalike and WeChat social

communication analysis and modeling. He has led APP social recommendation, friend

circle mining, user profile construction and other projects. He has also given keynote

speeches at InfoQ and other industry conferences.

Subject 6.3: Research on Forecasting Marital/Child-Rearing Status of Mass

Users

Mining users’ marital status from massive data is a typical user data modeling task.

It involves such typical machine learning tasks as training sample screening,

feature engineering, and model optimization. The main

challenge in this area is how to use Tencent’s massive user behavior data to mine

feature combinations applicable to marital status classification and select a model

algorithm that can effectively handle million-dimensional features.

Mentor Profile

He received a Ph.D. from the National University of Singapore in Machine Learning.

He was a postdoctoral researcher at Duke University and a researcher at GE and is now

a Tencent senior researcher. He has published more than 30 papers at international

conferences and in journals. His main research interests are machine learning,

Bayesian statistical models, and compressed sensing.

Subject 6.4: Summary Generation of Game Video Content

Personalized content recommendation is a popular application in the Internet field

nowadays. Video content attracts much attention, and video information extraction

and processing has also been an important research area of pattern recognition and

artificial intelligence. In the games and video field, the main concerns are how to

quickly extract valuable information from the massive amounts of games and videos of

different types generated each day, and how to apply title/abstract generation, key

content capture and other techniques to improve users’ click intention and stickiness

when personalized video content is recommended.

Mentor Profile

He received a Ph.D. from the University of Science and Technology of China in Pure

Mathematics. He joined Huawei Technologies after graduation, responsible for

research on the application of data mining technology in telecommunications,

including CRM, personalized recommendation, text mining and other fields.

Currently, he is engaged in data mining technology and applications in the game field,

providing a better user experience through user profile analysis and personalized

services, and offering more valuable operational support for services.

Subject 6.5: Page Quality Analysis Based on Social Data

This subject aims to build new PeopleRank, TrustRank, and other models based on WeChat

social communication data to analyze page quality and improve search

results.

Mentor Profile

He is a graduate of the Institute of Computing Technology, Chinese Academy of

Sciences. He is currently responsible for WeChat search and recommendation

technology research and product applications. He has published multiple papers at

top-level conferences such as ACL and AAAI.

Subject 6.6: Research on News Hotspot Mining and Ranking Predictions

Hotspot discovery and hotspot tracking are key parts of the recommender system. We

need to mine hot topics and emergencies from real-time news data. We hope to

discover potential hot news in time when hotspots haven’t completely broken out,

combine WeChat social communication data, track the latest events as they develop,

and form the key time series of events.

Mentor Profile

He received a Ph.D. from Stevens Institute of Technology. Currently, he is responsible

for basic data construction of WeChat Top Stories, including mining high-quality

articles, low-quality articles, hot news and others.

Subject 6.7: WeChat Official Account Authority Research

This research has two focuses. On the one hand, based on WeChat social data and user

behavior data, we can identify elites in various fields and use the reading behavior of

these elites to determine the degree of authority of WeChat official accounts

(including content depth, etc.). On the other hand, NLP technology can be used to

determine the degree of authority of the text from the content of the article itself.

We can then recommend highly authoritative content to elites, improve the reputation of “Top

Stories” among them, and guide official accounts to create more high-quality content.

Subject 6.8: Research on User Experience of Mini Programs

With the growing prosperity of mini programs, a large number of mini program

developers are pouring in, but the quality of the developed mini programs is uneven. We

hope to establish a model to judge the quality of mini programs from both the

code of the mini programs and the behavior of users. Specifically, based on

behavioral sequence modeling, the model determines whether users can use the mini program

fluently, whether there is fraud, whether there is malicious traffic and so on; it also

combines the sequence model with NLP technology to determine whether the content and the

title of a mini program match.

Subjects 6.7 & 6.8 Mentor Profile

He received a master’s degree from Beijing University of Posts and

Telecommunications. He is currently responsible for vertical search technologies in

WeChat Search, including search rankings, search satisfaction, intention recognition,

official account profile, Mini Program profile, search growth and other directions. He

is also engaged in the basic data construction of WeChat Top Stories, including

mining high-quality articles, low-quality articles, hot news and others.

Subject 6.9: Large-scale Knowledge Graph Construction in Game Field

Extracting large-scale and high-quality structured data has always been one of the

difficulties in constructing a knowledge graph. Traditional text information

extraction extracts structured information from large-scale, unstructured text in

accordance with the knowledge graph schema. However, this method limits the

coverage of entities, relations etc. For these reasons, open information extraction has

gradually become the focus of research. The goal is to extract open-type entities,

relations, events, and other multi-level semantic unit information from massive,

redundant, heterogeneous, non-standard, large-scale texts that contain massive noise.

Our research focus is to use open information extraction techniques to extract

high-quality, standard and game-related triplets from large-scale, unstructured data to

construct a large-scale knowledge graph in the game field.

Mentor Profile

He received a bachelor’s degree from Jilin University and master’s and Ph.D. degrees from

the University of Trento, Italy. The main research interests during his Ph.D. study were

checking large-scale knowledge graphs through gamification (games with a

purpose) and the construction of a large-scale linguistic resource (ontology). His current

research interest is building a knowledge graph in the game field and trying to find

more applications.

Direction 7: Database Storage Technology Research

Subject 7.1: Historical Database Storage Optimization

Historical data records trace the evolution of the current database. Efficient retrospection of

data changes and historical value inquiry are of great significance, especially in the

financial sector. For example, regulators may require the balance changes of a

certain account over the past five years, which requires rapid retrospection of

historical data. Historical databases and temporal databases will better realize the

value of data in the context of big data. Studying the storage and management of

massive data formed in historical and temporal databases is proving to be quite

promising and relevant.

Mentor Profile

He received a master’s degree in Software Engineering from the University of Science and

Technology of China. He is a distinguished mentor for engineering master's students at

the School of Information, Renmin University of China, a national-level

senior engineer, and a member of the expert advisory group for the Database Technique

Conference China (DTCC). He has been engaged in database engine development,

database architecture design, and database technology management for 20 years,

working at Kingbase, Oracle, Teamsun Technology and other companies. He is now a

Tencent expert engineer, engaged in R&D of distributed database (TDSQL). He won a

silver Technology Breakthrough Prize and the top prize for the Beijing Science &

Technology Progress Award. He has published two database-related books and applied

for more than ten patents.

Direction 8: Network Research

Subject 8.1: Research on Scalable and Highly Reliable RDMA Networks

The emergence of high-performance computing, distributed applications, and cloud

storage business places requirements for higher bandwidth and lower delay on cloud

networks. RDMA over Converged Ethernet (RoCEv2) can meet these needs quite

well. However, there are still many problems with RDMA deployment in a super-

large scale Ethernet environment. Based on Tencent’s RDMA network environment,

this area will study and optimize RDMA flow control, congestion control, QoS and

other mechanisms, to lay a solid foundation for building scalable and highly reliable

RDMA networks.

Mentor Profile


of Sciences, with research interests of data center networks and reconfigurable

computing. He worked at Microsoft Research Asia and was responsible for R&D of

DCN, NFV, RDMA, SmartNIC and other fields. He worked at the Microsoft US

headquarters designing the Microsoft cloud network acceleration system based on

SmartNIC. Now he is responsible for intelligent NIC R&D, cloud network system

planning, and network research. He has published many papers at top international

network conferences (SIGCOMM, CoNEXT, INFOCOM, ATC, ToN etc.).

Subject 8.2: Distributed Algorithms on Large-scale Social Network

WeChat and QQ have 1 billion and 800 million active users respectively, and involve

diverse link status and information exchange. Traditional methods often fail to

process such super-large graphs efficiently. In this project, we will explore the

design of distributed algorithms and flow algorithms for super-large graphs and test

successful algorithms in real data scenarios.

Mentor Profile

He is a graduate of Fudan University (Bachelor), Tsinghua University (Master) and








