Harnessing Heterogeneous Computing for Cloud- and Mobile-based Visual Search Dr. Ren Wu Distinguished Scientist, Baidu [email protected]

"Harnessing Heterogeneous Computing for Cloud- and Mobile-Based Visual Search," A Presentation From Baidu


Baidu

Every day

5b+ queries

500m+ users

100m+ mobile users

100m+ photos

Google == American Baidu !?

Baidu Stock

Baidu Q2’14

Deep Learning Applications

• Speech recognition

• Image recognition

• Optical character recognition (OCR)

• Language translation

• Web search

• Computational Ads (CTR)

• …

Progress of Deep Learning at Baidu

• Big improvement on speech & image recognition (2013)

• Speech: error rate reduced by 25%

• OCR: error rate reduced by 30%

• Face: LFW benchmark, 94% correct

• DNN CTR for search ads launched on May 20, 2013, serving billions of search queries every day, with substantial improvement

http://stu.baidu.com

Baidu – Visual Search

Visual Search Example

Visually similar images

The competition

Baidu

Another Example

The competition Baidu

Image uploaded

Baidu, Google search results

Image uploaded

Visually Similar Images - Comparison

The competition

Peak upload rate of 100 million images per day! #1 iOS app for 3 weeks

Baidu Motu (百度魔图): PK大咖 (face-off with celebrities)

Deep Learning vs. Human Brain

Deep learning: pixels → edges → object parts (combination of edges) → object models

Deep Architecture in the Brain: Retina (pixels) → Area V1 (edge detectors) → Area V2 (primitive shape detectors) → Area V4 (higher level visual abstractions)

Slide credit: Andrew Ng

Voice, Text, Image, User

Neural Network, DNN, and AI

Why does it work now but not before?

What has happened?

Big Data

High Performance Computing

Big Data @ Baidu

• >2000PB Storage

• 10-100PB/day Processing

• 100b-1000b Webpages

• 100b-1000b Index

• 1b-10b/day Update

• 100TB~1PB/day Log

Heterogeneous Computing

1993 world #1: Thinking Machines CM-5/1024, 131 GFlops

2013: Samsung Note 3 smartphone (Qualcomm Snapdragon 800), 129 GFlops

2000 world #1: ASCI White (IBM RS/6000 SP), 6 MW power, 106 tons, 12.3 TFlops

2013: Two Mac Pro workstations (dual AMD GPUs each), 14 TFlops
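The comparison above comes down to simple ratios. As a rough sketch using only the peak figures quoted on the slide (the dictionary keys and the `ratio` helper are illustrative names, not from the talk):

```python
# Peak-throughput figures quoted on the slide, converted to GFlops.
systems = {
    "CM-5/1024 (1993)": 131.0,
    "Note 3 smartphone (2013)": 129.0,
    "ASCI White (2000)": 12_300.0,
    "Two Mac Pros (2013)": 14_000.0,
}


def ratio(newer: str, older: str) -> float:
    """Return the newer system's peak as a multiple of the older system's peak."""
    return systems[newer] / systems[older]


phone_vs_cm5 = ratio("Note 3 smartphone (2013)", "CM-5/1024 (1993)")
macpro_vs_asci = ratio("Two Mac Pros (2013)", "ASCI White (2000)")

print(f"Smartphone vs. CM-5/1024: {phone_vs_cm5:.2f}x")
print(f"Two Mac Pros vs. ASCI White: {macpro_vs_asci:.2f}x")
```

These are peak numbers only; sustained throughput on real workloads would differ on every one of these machines.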

Deep Learning at Scale

Voice, Text, Image, User

DNN for Speech 10k hours of voice data

10b training samples

Months on a GPU cluster

High Performance Computing

Datasets

• Image recognition: 100 million

• OCR: 100 million

• Speech: 10 billion

• CTR: 100 billion

Projected training data to grow 10x each year

Training time: weeks to months on GPU clusters
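To make the 10x-per-year projection concrete, here is a small sketch that compounds the dataset sizes listed above. The three-year horizon and the `projected_size` helper are assumptions chosen for illustration; the slide gives only the current sizes and the growth factor.

```python
# Approximate training-set sizes from the slide (number of samples).
datasets = {
    "Image recognition": 100e6,
    "OCR": 100e6,
    "Speech": 10e9,
    "CTR": 100e9,
}

GROWTH_PER_YEAR = 10  # "grow 10x each year", as stated on the slide
HORIZON_YEARS = 3     # assumption: an arbitrary horizon for illustration


def projected_size(current: float, years: int) -> float:
    """Compound the current sample count by the yearly growth factor."""
    return current * GROWTH_PER_YEAR ** years


for name, size in datasets.items():
    future = projected_size(size, HORIZON_YEARS)
    print(f"{name}: {size:.0e} samples now, {future:.0e} after {HORIZON_YEARS} years")
```

At this rate every dataset gains three orders of magnitude in three years, which is the pressure on training time and cluster size that the slide points to.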

Big data + Deep learning + HPC = Intelligence

Infrastructure

Baidu Brain – 100x bigger than Google’s

Artificial Intelligence

Big data + Deep learning + High performance computing = Intelligence

Omnipotent

Mobile Applications of DNN

“Mobile Baidu (手机百度): know it anytime”

DNN on Mobile Phones

Samsung Galaxy Note 3

AT&T version

SAMSUNG-SM-N900A

Snapdragon 800

Android 4.3

No connectivity needed!

Image Recognition

Jen-Hsun Huang’s Keynote

Rob Fergus’s talk yesterday

Image processed at cloud

(data center)

VS.

World’s First Mobile DNN App

• Image recognition on the mobile device

• Real time, no connectivity needed

• Directly from the video stream: what you point at is what you get

• Everything is done within the device

• OpenCL based, highly optimized

• Large deep neural network models

• Thousands of objects: flowers, dogs, bags, etc.

• Unleashed the full potential of the device hardware

• Smartphones now, wearables and IoT devices tomorrow
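On-device recognition of this kind amounts to running a trained network's forward pass locally, with no server round trip. The following is a minimal sketch of that idea in plain Python, not the app's implementation: the app is described as OpenCL based, while the tiny two-layer model, its weights, the labels, and the function names below are all hypothetical.

```python
import math


def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]


def softmax(x):
    """Numerically stable softmax over a list of scores."""
    peak = max(x)
    exps = [math.exp(v - peak) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, model, labels):
    """Run the forward pass locally and return the top label with its probability."""
    hidden = relu(dense(features, model["w1"], model["b1"]))
    probs = softmax(dense(hidden, model["w2"], model["b2"]))
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical toy weights standing in for a trained model shipped with the app.
model = {
    "w1": [[1.0, -1.0], [-1.0, 1.0]],
    "b1": [0.0, 0.0],
    "w2": [[2.0, 0.0], [0.0, 2.0]],
    "b2": [0.0, 0.0],
}
labels = ["flower", "dog"]

label, confidence = classify([0.9, 0.1], model, labels)
print(label, round(confidence, 3))
```

In a real deployment the per-layer multiply-accumulate loops are the part that would be offloaded to the device's GPU, since each output element can be computed independently.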

Exceptional performance!

DNNs Everywhere

Supercomputers: 1000s of GPUs

Datacenters: 100k-1m servers

Tablets, smartphones: 700m (in China)

Wearable devices, IoTs: Billions?

Supercomputer used for training

Trained DNNs then deployed to data centers (cloud), smartphones, and even wearables and IoTs

OpenCL-based Open Ecosystem

• Diverse industry participation, from cell phones to supercomputers

o Processor vendors, system OEMs, middleware vendors, application developers.

• OpenCL is the industry standard embraced by many companies.

Third party names are the property of their owners. * Courtesy of Simon McIntosh-Smith and Tom Deakin

Artificial Intelligence

Big data + Deep learning + Heterogeneous computing = Success

Omnipresence

Thank you!

Dr. Ren Wu

[email protected]

@韧在百度