
1

Partition for Pipelined and Parallel Computing

Hongtao Du, AICIP Research

Dec 1, 2005

Part 2

2

Partition Scheme

3

Driving Force

• Data-driven
  – How to divide data sets into different sizes for multiple computing resources
  – How to coordinate data flows along different directions so that the appropriate data reaches the suitable resources at the right time

• Function-driven
  – How to perform different functions of one task on different computing resources at the same time

4

Data - Flynn's Taxonomy

• Single Instruction Flow, Single Data Stream (SISD)

• Multiple Instruction Flow, Single Data Stream (MISD)

• Single Instruction Flow, Multiple Data Stream (SIMD)
  – MPI, PVM

• Multiple Instruction Flow, Multiple Data Stream (MIMD)
  – Shared memory
  – Distributed memory

5

Data Partitioning Schemes

[Figure: data partitioning schemes – block, scatter, contiguous point, contiguous row]
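To make these schemes concrete, here is a minimal Python sketch (my own illustration; the function names are not from the slides), assuming a 1-D index space for block and scatter partitioning and a row index space for the contiguous-row scheme.

```python
# Minimal sketch of the partitioning schemes named above; names are illustrative.

def block_partition(n, p):
    """Block: worker w gets a contiguous chunk of roughly n/p items."""
    size = -(-n // p)  # ceiling division
    return [list(range(w * size, min((w + 1) * size, n))) for w in range(p)]

def scatter_partition(n, p):
    """Scatter (cyclic): item i goes to worker i % p."""
    return [list(range(w, n, p)) for w in range(p)]

def contiguous_rows(m, p):
    """Contiguous rows: each worker gets a band of rows of an m-row image."""
    band = -(-m // p)
    return [list(range(w * band, min((w + 1) * band, m))) for w in range(p)]

if __name__ == "__main__":
    print(block_partition(10, 3))    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
    print(scatter_partition(10, 3))  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
    print(contiguous_rows(8, 4))     # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

Block and contiguous-row schemes keep each worker's data local, while scattering spreads neighbouring items across workers, which helps balance load when the per-item cost varies across the image.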

6

Communication Patterns and Costs

• Communication expense is the primary concern in data-driven partitioning.

• Successor/Predecessor (S-P) pattern
• North/South/East/West (NSEW) pattern

Cost parameters: the message preparation latency, the transmission speed (bytes/s), the number of processors p, the number of data items n, and the length d of each data item to be transmitted.
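The slide's cost formulas did not survive extraction, so the following is only a hedged sketch of a standard latency-plus-bandwidth estimate built from the parameters above; the factors 2 (S-P: exchange with two neighbours) and 4 (NSEW: exchange with four neighbours) are assumptions, not the original formulas.

```python
# Hypothetical latency-bandwidth communication cost model; the neighbour
# counts (2 for S-P, 4 for NSEW) are assumptions, not the slides' formulas.

def transfer_cost(messages, n, d, prep_latency, speed_bytes_per_s):
    """Cost of `messages` transfers, each carrying n items of d bytes."""
    return messages * (prep_latency + (n * d) / speed_bytes_per_s)

def sp_cost(n, d, prep_latency, speed):
    """Successor/Predecessor: each processor exchanges with two neighbours."""
    return transfer_cost(2, n, d, prep_latency, speed)

def nsew_cost(n, d, prep_latency, speed):
    """North/South/East/West: each processor exchanges with four neighbours."""
    return transfer_cost(4, n, d, prep_latency, speed)

if __name__ == "__main__":
    # Example: 512 boundary pixels of 4 bytes, 50 us prep latency, 100 MB/s link
    print(sp_cost(512, 4, 50e-6, 100e6))
    print(nsew_cost(512, 4, 50e-6, 100e6))
```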

7

Understanding Data-driven

• The arrivals of data initiate and synchronize operations in the systems.

• The whole system in execution is modeled as a network linked by data streams.

• Granularity of the algorithm: the size of the data block transmitted between processors. The flows of data blocks form data streams.

• Granularity selection: a trade-off between computation and communication
  – Large: reduces the degree of parallelism; increases computation time; little overlap between processors.
  – Small: increases the degree of overlap; increases communication and overhead time.
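As a toy illustration of this trade-off (my own model, not the slides'), the sketch below charges each data block a fixed overhead on top of its computation time and assumes a simple p-stage pipeline; very small blocks pay the overhead many times, while very large blocks leave the pipeline mostly empty.

```python
# Toy granularity model (an assumption, not from the slides): n items split
# into blocks of size g, each block paying a fixed overhead t_ovh on top of
# its computation time, streamed through a p-stage pipeline.

def pipelined_time(n, g, t_item, t_ovh, p):
    blocks = -(-n // g)                # number of data blocks (ceiling)
    t_block = g * t_item + t_ovh       # process and forward one block
    return (p - 1 + blocks) * t_block  # pipeline fill + one block per step

if __name__ == "__main__":
    n, t_item, t_ovh, p = 10_000, 1e-6, 1e-4, 4
    for g in (10, 100, 1000, n):
        print(f"block size {g:>5}: {pipelined_time(n, g, t_item, t_ovh, p):.4f} s")
```

With these made-up parameters the total time is minimised at an intermediate block size, which is exactly the trade-off described above.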

8

Data Dependency

• Decreases or even eliminates the speedup

• Caused by edge pixels that lie on different blocks

[Figure: edge-pixel dependencies for block and reverse-diagonal partitions]
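One standard way to break this dependency, shown here only as a hedged sketch (the slides do not prescribe a method), is to pad each block with a halo of rows borrowed from its neighbours so that a neighbourhood operation on edge pixels can be computed locally.

```python
# Sketch of halo (ghost-row) padding for a row-partitioned image; the helper
# name and the halo width of 1 (enough for a 3x3 neighbourhood) are my choices.

def split_rows_with_halo(image, p, halo=1):
    """Split `image` (a list of rows) into p bands, each padded with up to
    `halo` rows borrowed from the neighbouring bands."""
    m = len(image)
    band = -(-m // p)  # ceiling division
    parts = []
    for w in range(p):
        lo, hi = w * band, min((w + 1) * band, m)
        parts.append(image[max(lo - halo, 0):min(hi + halo, m)])
    return parts

if __name__ == "__main__":
    img = [[r * 10 + c for c in range(6)] for r in range(8)]
    for part in split_rows_with_halo(img, p=4):
        print(len(part), "rows in this band (own rows plus borrowed halo rows)")
```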

9

Function

• Partitioning procedure
  – Evaluating the complexity of each individual process in the function and the communication between processes
  – Clustering processes according to objectives
  – Partitioning optimization
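The clustering step is not spelled out on the slide, so the following is only a rough sketch under assumed inputs: each process has a compute cost, each pair of processes a communication volume, and the two clusters exchanging the most data are merged greedily as long as a per-cluster load limit is respected.

```python
# Rough greedy clustering sketch (my own illustration, not the slides' method).

def cluster_processes(compute_cost, comm_cost, max_load):
    """compute_cost: {proc: cost}; comm_cost: {(a, b): volume}."""
    clusters = {p: {p} for p in compute_cost}

    def load(c):
        return sum(compute_cost[p] for p in clusters[c])

    def traffic(ca, cb):  # communication crossing the two clusters
        return sum(v for (a, b), v in comm_cost.items()
                   if (a in clusters[ca] and b in clusters[cb])
                   or (a in clusters[cb] and b in clusters[ca]))

    while True:
        keys, best = list(clusters), None
        for i, ca in enumerate(keys):
            for cb in keys[i + 1:]:
                if load(ca) + load(cb) <= max_load:
                    t = traffic(ca, cb)
                    if t > 0 and (best is None or t > best[0]):
                        best = (t, ca, cb)
        if best is None:
            return list(clusters.values())
        _, ca, cb = best
        clusters[ca] |= clusters.pop(cb)  # merge the heaviest-talking pair

if __name__ == "__main__":
    compute = {"A": 3, "B": 2, "C": 4, "D": 1}
    comm = {("A", "B"): 10, ("B", "C"): 1, ("C", "D"): 8}
    print(cluster_processes(compute, comm, max_load=6))  # {A, B} and {C, D}
```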

10

Space-time-domain Expansion

• Definition: sacrificing the processing time to meet the performance requirements.

Time complexity: O(max(m, n))

11

One Dimension Partitioning

• Keeping the processing size to one column at a time.

• Repeatedly feeding in data until the process finishes.

• Increases the time complexity by a factor of n (the number of columns)
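A minimal sketch of this idea (the column operation here, a running sum, is an arbitrary stand-in chosen only for illustration): an m × n image is streamed through a unit that handles one column per pass, so n passes are needed.

```python
# One-dimensional partitioning sketch: a fixed one-column processing element
# is fed n times, once per column of an m x n image. Names are illustrative.

def process_column(column):
    """Stand-in for the fixed-size processing element (here: a running sum)."""
    total, out = 0, []
    for v in column:
        total += v
        out.append(total)
    return out

def process_by_columns(image):
    """Feed the image in column by column and reassemble the result."""
    m, n = len(image), len(image[0])
    cols = [process_column([image[r][c] for r in range(m)]) for c in range(n)]
    return [[cols[c][r] for c in range(n)] for r in range(m)]

if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(process_by_columns(img))  # [[1, 2, 3], [5, 7, 9], [12, 15, 18]]
```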

12

Two Dimension Partitioning

• Fixing the processing size to a two-dimensional subset of the original data.

• Increases the time complexity by a factor of (m · n) / (k · l), where m × n is the size of the original data and k × l is the size of each subset.
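A hedged sketch of the tiling view (names and tile sizes are illustrative): the m × n image is processed one k × l tile per pass, so the number of passes, and hence the factor above up to rounding, is ceil(m/k) · ceil(n/l).

```python
# Two-dimensional partitioning sketch: enumerate the k x l tiles of an
# m x n image; the number of tiles is ceil(m/k) * ceil(n/l).

def tiles(m, n, k, l):
    """Yield ((row_start, row_end), (col_start, col_end)) for each tile."""
    for r0 in range(0, m, k):
        for c0 in range(0, n, l):
            yield (r0, min(r0 + k, m)), (c0, min(c0 + l, n))

if __name__ == "__main__":
    m, n, k, l = 8, 6, 4, 3
    tile_list = list(tiles(m, n, k, l))
    print(len(tile_list), "passes")  # ceil(8/4) * ceil(6/3) = 4
    print(tile_list[0])              # ((0, 4), (0, 3))
```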

13

Resource Constraints

• Multi-processor
  – Software implementation
  – Homogeneous system
  – Heterogeneous system

• Hardware/software (HW/SW) co-processing
  – Software and hardware components are co-designed
  – Process scheduling

• VLSI
  – Hardware implementation
  – Communication time is negligible

14

Multi-processor

• Heterogeneous system
  – Contains computers with different types of parallelism.
  – Communication overheads add extra delays.
  – Communication tasks such as allocating buffers and setting up DMA channels have to be performed by the CPU and cannot be overlapped with the computation.

• Host/Master - a powerful processor

• Bottleneck processor - the processor taking the longest amount of time to perform the assigned task.
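A small illustration (not from the slides) of why the bottleneck processor matters: the parallel time equals the time of the slowest processor, so an unbalanced assignment caps the speedup regardless of how many processors finish early.

```python
# Bottleneck illustration: parallel time is set by the slowest processor.

def speedup(assigned_times):
    """assigned_times: per-processor execution time for its assigned work."""
    serial = sum(assigned_times)
    parallel = max(assigned_times)  # the bottleneck processor
    return serial / parallel

if __name__ == "__main__":
    print(speedup([10, 10, 10, 10]))  # balanced assignment: 4.0
    print(speedup([25, 5, 5, 5]))     # bottlenecked assignment: 1.6
```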

15

HW/SW Co-processing

• System structure
  – SW: a single general-purpose processor, e.g. Pentium or PowerPC
  – HW: a single hardware coprocessor, e.g. FPGA or ASIC
  – A block of shared memory

• Design view
  – Hardware components: RTL components (adders, multipliers, ALUs, registers)
  – Software component: general-purpose processor
  – Communication: between the software component and the local memory

• 90-10 Partitioning
  – The most frequent loops generally correspond to 90 percent of the execution time but consist of only simple designs
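As a hedged illustration of 90-10 partitioning (a greedy sketch of my own, not the slides' procedure, with hypothetical profile data), the hot loops that dominate execution time are moved to hardware first, ranked by time saved per unit of hardware area, until the area budget runs out.

```python
# Greedy 90-10 style HW/SW split sketch; profile values and the selection
# heuristic are illustrative assumptions, not the slides' method.

def choose_hw_loops(profile, area_budget):
    """profile: {loop: (time_fraction, hw_area, hw_speedup)}."""
    def gain(item):
        frac, area, speedup = item[1]
        return (frac - frac / speedup) / area  # time saved per unit area
    hw, used = [], 0.0
    for loop, (frac, area, speedup) in sorted(profile.items(), key=gain, reverse=True):
        if used + area <= area_budget:
            hw.append(loop)
            used += area
    return hw

if __name__ == "__main__":
    profile = {"fir_filter": (0.55, 3.0, 10.0),   # hypothetical profile data
               "dct":        (0.30, 4.0, 8.0),
               "control":    (0.15, 6.0, 1.5)}
    print(choose_hw_loops(profile, area_budget=7.0))  # ['fir_filter', 'dct']
```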

16

VLSI

• Constraints
  – Execution time (DSP ASIC)
  – Power consumption
  – Design area
  – Throughput

• Examples
  – Globally asynchronous, locally synchronous on-chip bus (Time)
  – 4-way pipelined memory partitioning (Throughput)

17

Questions?

Thank you!
