
INSTITUTE OF COMPUTING TECHNOLOGY Cloud-Sea …novel.ict.ac.cn/zxu/Talks/Xu at ISC-BigData 2013.pdf · on ZB of Data Zhiwei Xu Institute ... Techniques for Big Data Computing in


Page 1

Cloud-Sea Computing on ZB of Data

Zhiwei Xu
Institute of Computing Technology (ICT)
Chinese Academy of Sciences (CAS)
www.ict.ac.cn, [email protected]

This research is supported in part by the National Basic Research Program of China (Grant 2011CB302502), the Strategic Priority Program of Chinese Academy of Sciences (Grant XDA06010400), and the Guangdong Talents Program.

Page 2

We Are Entering a ZB Computing Era

• Two historical observations:
  – Per-capita capacity: Mega → Giga → Tera
  – Worldwide capacity: Peta → Exa → Zetta
• Two major challenges:
  – Capacities increase 1000X, while power (and energy) stays at 1X
  – Enable existing and new workloads (and values)

Capacity        1986                     2007                     2030
                Per Capita   Worldwide   Per Capita   Worldwide   Per Capita   Worldwide
Storage         4.3 MB       21 PB       44.7 GB      295 EB      5.23 TB      41.8 ZB
Communication   12 MB        59 PB       9.86 GB      65 EB       2.88 TB      23 ZB
GP Computing    0.06 MIPS    0.3 PIPS    0.97 GIPS    6.39 EIPS   4.98 TIPS    40 ZIPS
SP Computing    0.09 MIPS    0.44 PIPS   28.6 GIPS    189 EIPS    321 TIPS     2570 ZIPS

1986 and 2007 data: Hilbert and López, Science 2011: 332 (6025), 60-65.

2030 projection: from a conservative estimation by ICT, CAS.
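As a quick consistency check on the storage figures, dividing each worldwide total by the matching per-capita value gives the population each column implies. The sketch below assumes decimal (SI) prefixes; the helper name `implied_population` is made up for illustration and is not from the talk.

```python
# Storage figures from the slide: (per capita, worldwide), in bytes.
# ASSUMPTION: decimal (SI) prefixes, e.g. 1 PB = 1e15 bytes.
MB, GB, TB = 1e6, 1e9, 1e12
PB, EB, ZB = 1e15, 1e18, 1e21

storage = {
    1986: (4.3 * MB, 21 * PB),
    2007: (44.7 * GB, 295 * EB),
    2030: (5.23 * TB, 41.8 * ZB),
}


def implied_population(per_capita, worldwide):
    """People implied by dividing the worldwide total by the per-capita figure."""
    return worldwide / per_capita


implied = {year: implied_population(*row) for year, row in storage.items()}

for year, people in implied.items():
    print(f"{year}: about {people / 1e9:.1f} billion people")
```

Under these assumptions the three columns imply roughly 4.9, 6.6, and 8.0 billion people, so the per-capita and worldwide figures are mutually consistent.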

Page 3

Workload Mega Trend: e-People

• e-People = Computing for the Masses
  – IT that directly benefits the masses (billions of individuals), not institutions
    • e-People, not e-Business, e-Science, e-Government
  – Computer science utilizing the human-cyber-physical ternary universe
    • Ternary computing, not just cyber computing (unary computing)
    • e-People is not fully realized if we have to use cyber devices

[Figure: diagram; labels: Institutional Computing (e-Business, e-Science, e-Government); Cyberspace Computing (IT services, IT software, IT hardware); Billions of users; Trillions of devices; Millions of verticals; ZB of data; Human-facing devices are not enough]

Video is currently the #1 load.

2.88 TB = 8 HD movies per day!
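The "8 HD movies per day" figure can be reproduced with a short calculation. This is only a sketch: it assumes that the 2.88 TB is a per-capita amount per year and that one HD movie is about 1 GB. Neither assumption is stated on the slide.

```python
TB, GB = 1e12, 1e9

per_capita_volume = 2.88 * TB   # 2030 per-capita communication figure from the talk
days = 365                      # ASSUMPTION: the 2.88 TB is spread over one year
hd_movie_size = 1 * GB          # ASSUMPTION: size of one HD movie, not given in the talk

bytes_per_day = per_capita_volume / days
movies_per_day = bytes_per_day / hd_movie_size

print(f"{bytes_per_day / GB:.1f} GB/day, about {movies_per_day:.0f} HD movies per day")
```

With a different movie size the count scales inversely; for example, 2 GB per movie would give about 4 movies per day.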

Page 4

The Chinese Academy of Sciences NICT Project

• New generation ICT

– 10-year research project (2012-2021)

– 19 institutes, over 200 faculty members

– Targeting potential mainstream markets of 2020-2030

– Aiming at China’s needs in 2020-2050

• Human-cyber-physical ternary computing for ZB of data

– Functional sensing

– Customizable Internet

– Cloud-sea computing

Page 5

Functional Sensing: Acquisition of Home Appliances Data

• Application examples (2020-2030)
  – Web search → Grid search
    • “Top 100 green households in Beijing and London”
  – Appliances R&D
    • Utilizing field data for all appliances (better than software beta-test)
• Acquisition challenge
  – Can we acquire massive and accurate field data in a timely manner from billions of households, for each and every appliance (lamp, refrigerator, etc.) in every household, with 1 (~3) sensors per home?

Page 6

Traditional Sensing

• One sensor per device
  – ~50 devices per home, 220V@50Hz
  – Up to 128th harmonics
    • 256 samples/cycle, 10 bytes/sample
  – 6.4 MB/s, or 200TB per year per home
  – For China, 200TB x 0.5 billion homes = 100 ZB per year

[Figure: current waveform of a heater in one cycle]
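The data volumes on this slide follow from multiplying the stated parameters. The sketch below reproduces the arithmetic. Reading 256 samples/cycle as twice the 128th harmonic (the Nyquist rate) is an interpretation, not something the slide states, and decimal units are assumed.

```python
devices_per_home = 50      # ~50 devices per home, one sensor each
mains_hz = 50              # 220V@50Hz mains: 50 cycles per second
samples_per_cycle = 256    # presumably 2 x 128, to capture up to the 128th harmonic
bytes_per_sample = 10
homes_in_china = 0.5e9     # 0.5 billion homes

SECONDS_PER_YEAR = 365 * 24 * 3600

bytes_per_second = devices_per_home * mains_hz * samples_per_cycle * bytes_per_sample
bytes_per_year_per_home = bytes_per_second * SECONDS_PER_YEAR
bytes_per_year_china = bytes_per_year_per_home * homes_in_china

print(f"Per home: {bytes_per_second / 1e6:.1f} MB/s")
print(f"Per home: {bytes_per_year_per_home / 1e12:.0f} TB per year")
print(f"China:    {bytes_per_year_china / 1e21:.0f} ZB per year")
```

The result is 6.4 MB/s per home, which rounds to the slide's 200 TB per home per year and about 100 ZB per year for China.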

Page 7

Functional Sensing

• One sensor per home
• Function is formalized behavior
  – Type 0: human sensor
  – Type 1: current smart meters
  – Type 2: on-off behavior data for each device
  – Type 3: event behavior data
  – Type 4: finite behavior data (up to kth harmonics for a given finite k)
  – Type 5: infinite behavior data
• Data storage needs can be reduced 10,000 times
  – 20GB/year per home for aggregated data
  – 1TB/year per home for disaggregated data for each device
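The reduction factors can be checked against the 200TB per year per home quoted for traditional sensing. This is a minimal sketch, assuming decimal units and the 0.5 billion homes used earlier in the talk; the variable names are illustrative only.

```python
GB, TB = 10**9, 10**12

traditional_per_home = 200 * TB    # traditional sensing, per home per year
aggregated_per_home = 20 * GB      # functional sensing, aggregated data
disaggregated_per_home = 1 * TB    # functional sensing, per-device data
homes_in_china = 5 * 10**8         # 0.5 billion homes

aggregated_reduction = traditional_per_home / aggregated_per_home
disaggregated_reduction = traditional_per_home / disaggregated_per_home

# Nationwide yearly totals under functional sensing, in bytes.
aggregated_china = aggregated_per_home * homes_in_china
disaggregated_china = disaggregated_per_home * homes_in_china

print(f"Aggregated:    {aggregated_reduction:,.0f}x smaller, "
      f"{aggregated_china / 1e18:.0f} EB per year for China")
print(f"Disaggregated: {disaggregated_reduction:,.0f}x smaller, "
      f"{disaggregated_china / 1e21:.1f} ZB per year for China")
```

The 10,000-fold figure corresponds to the aggregated case. Keeping disaggregated per-device data gives a 200-fold reduction by the same calculation.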

Page 8

The REST 2.0 Architecture for Cloud-Sea Computing

[Figure: architecture diagram. Labels recoverable from the figure:]

• Cloud-side functions: aggregation, request-response, big data
  – EB-scale billion-thread servers; PB-scale servers; CDN/CGN
  – Unit counts: 100s units; 10Ks units; millions
• Links between cloud and sea: SeaHTTP, HTTP 2.0+
• Sea-side functions: sensing, interaction, local processing
  – Seaport: billions of units, TB-PB/unit
  – Sea Zone: billions, GB-TB
  – Trillions, KB-GB

Page 9

New Gadgets for Homes

• GB sensor nodes @0.2W

• TB “smart phones” @2W

• PB wuTV (home datacenter) @20W

• PB Personal Watson (iPC) @200W

[Figure: home network diagram; labels: SeaHTTP, wuTV, iPC, Home, HTTP 2.0+]

Page 10

Three examples of Data Computing

• Off-line (back end): RCFile for Apache Hive
  – Production use: Facebook, Taobao, Netflix, Twitter, Yahoo!, LinkedIn, AOL, Salesforce.com, etc.
  – http://en.wikipedia.org/wiki/RCFile
• On-line (front end): CCIndex on HBase
  – Production use in Taobao, Tencent
• High-speed communication: DataMPI

Alexa Top Sites (2013.06.14)

1. Facebook

2. Google

3. YouTube

4. Yahoo!

5. Baidu

6. Wikipedia

7. Windows Live

8. Twitter

9. QQ (Tencent)

10. Taobao

22. eBay

Page 11

DataMPI open sourced at datampi.org

[Figure: execution time (EXEC Time) of Hadoop and DataMPI on the Sort and PageRank benchmarks; values shown: 99 sec, 18 sec, 364 sec, 103 sec]

Page 12

Billion-Thread Server

[Figure: two architectures compared. Labels recoverable from the figure:]

• Traditional Architecture of Datacenters (REST 1.0 Requests)
  – Network layers: Core, Aggregation, Access
  – Software: Applications, Runtime Environment, Application Management, Hypervisor, OS
  – Hardware: Chipset, CPU, NIC, Memory, Disk
• Architecture of Cloud-Sea Server (REST 2.0 Requests)
  – Software: Applications, Micro OS, Nano Kernel
  – Hardware: Workload Processing Unit (WPU), Memory, Storage
• Goals: Reduce Datacenter Layers; Simplify SW/HW Stacks

Page 13

Cloud-Sea Storage

• Emphasize power-on efficiency (70% HW peak), while matching latency, scalability, and resilience needs
• Innovations
  – stable sets
  – metadata clustering
  – network RAID
• 40 benchmark apps: reduces latency 123 times, backend load 50 times

[Figure: plot; axis labels: Time, Addresses]

Page 14

Elastic Processor

• A new architecture style (FISC)
  – Featuring function instructions executed by programmable ASIC accelerators
  – Targeting 1000 GOPS/W applications
• Results: 932 GOPS/W for machine learning

              CISC (Intel X86)   RISC (ARM)   FISC (Function Instruction Set Computer)
Chip types    10s                1K           10K
Power         10~100W            1~10W        0.1~1W
Apps/chip     10M                100K         10K

Page 15

References

• Rui Hou, Tao Jiang, Liuhang Zhang et al.: Cost Effective Data Center Servers. HPCA-19, 2013: 179-187
• Zhiwei Xu: High-Performance Techniques for Big Data Computing in Internet Services. Invited speech at SC12, SC Companion 2012: 1861-1895
• Zhiwei Xu: Measuring Green IT in Society. IEEE Computer 45(5): 83-85 (2012)
• Zhiwei Xu: How Much Power Is Needed for a Billion-Thread High-Throughput Server? Frontiers of Computer Science 6(4): 339-346 (2012)
• Zhiwei Xu, Guojie Li: Computing for the Masses. Commun. ACM 54(10): 129-137 (2011)
• Jingjie Liu, Lei Nie, Zhiwei Xu: The Input-Sensing Problem in Ternary Computing and Its Application in Household Energy-Saving. GreenCom 2011: 131-138
• Yongqiang He, Rubao Lee, Yin Huai, Zheng Shao, Namit Jain, Xiaodong Zhang, Zhiwei Xu: RCFile: A Fast and Space-Efficient Data Placement Structure in MapReduce-based Warehouse Systems. ICDE 2011: 1199-1208
• Xiaoyi Lu, Bing Wang, Li Zha, Zhiwei Xu: Can MPI Benefit Hadoop and MapReduce Applications? ICPP Workshops 2011: 371-379
• Qi Guo, Tianshi Chen, Yunji Chen, Zhi-Hua Zhou, Weiwu Hu, Zhiwei Xu: Effective and Efficient Microprocessor Design Space Exploration Using Unlabeled Design Configurations. IJCAI 2011: 1671-1677
• Yongqiang Zou, Jia Liu, Shicai Wang, Li Zha, Zhiwei Xu: CCIndex: A Complemental Clustering Index on Distributed Ordered Tables for Multi-dimensional Range Queries. NPC 2010: 247-261

Page 16

Thank you!

[email protected]