WEEKLY REPORT

Pin Yi Tsai

Thur., Oct 24, 2013

2013/10/24 meeting at Delta R621

OUTLINE

• Current Work
  • Compute Integral Image – parallel version
  • Why is the difference so small?
  • An accidental error
• In Process
  • Compute 11 types of features

COMPUTE INTEGRAL IMAGE – PARALLEL VERSION

• Computation and communication time

input 16x16:

  serial version: 0.006336 ms

  parallel version (for loop outside the kernel function): 6.80778 ms

  parallel version (for loop inside the kernel function): 5.88559e-39 ms
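
For reference, a two-pass integral image can be sketched on the host as below. This is only an illustration of the row pass followed by the column pass, under the assumption of a row-major float array. It is not the code of the kernels used in this report, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Row pass: replace each element with the running sum of its row.
// Rows do not depend on each other, so each row could go to a separate worker.
void prefixSumRows(std::vector<float>& img, std::size_t rows, std::size_t cols) {
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 1; c < cols; ++c)
            img[r * cols + c] += img[r * cols + c - 1];
}

// Column pass: replace each element with the running sum of its column.
// Columns do not depend on each other either.
void prefixSumColumns(std::vector<float>& img, std::size_t rows, std::size_t cols) {
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 1; r < rows; ++r)
            img[r * cols + c] += img[(r - 1) * cols + c];
}

// Integral image: ii(r, c) = sum of all pixels with row <= r and column <= c.
// The image is taken by value so the caller's data is left untouched.
std::vector<float> integralImage(std::vector<float> img, std::size_t rows, std::size_t cols) {
    prefixSumRows(img, rows, cols);
    prefixSumColumns(img, rows, cols);
    return img;
}
```

The two loops over `r` in the row pass and over `c` in the column pass are the ones that a parallel version would distribute across threads.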

COMPUTE INTEGRAL IMAGE (CONT.)

input 640x480:

serial version: 5.1607 ms

parallel version: 4.94058 ms

WHY IS THE DIFFERENCE SO SMALL?

• Profile:

Time : 4.91024 ms

======== Profiling result:

Time(%)  Time      Calls  Avg       Min       Max       Name
71.71    2.75ms    1      2.75ms    2.75ms    2.75ms    computeByColumn(float*, int)
10.91    418.56us  2      209.28us  209.06us  209.50us  [CUDA memcpy HtoD]
10.08    386.46us  2      193.23us  191.10us  195.36us  [CUDA memcpy DtoH]
 7.31    280.22us  1      280.22us  280.22us  280.22us  computeByRow(float*, int, int)

Accesses to non-contiguous memory

Memory access is too time-consuming
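
The point about memory access can be illustrated with plain index arithmetic. The sketch below assumes the image is stored row-major in a flat float array; the helper names are hypothetical. It only shows how far apart consecutive accesses are when walking along a row versus down a column. On the GPU the actual cost also depends on how threads are mapped to rows and columns, which is not shown in this report.

```cpp
#include <cstddef>

// Offset (in elements) of pixel (row, col) in a row-major array that is `cols` wide.
std::size_t flatIndex(std::size_t row, std::size_t col, std::size_t cols) {
    return row * cols + col;
}

// Distance in bytes between consecutive accesses when walking along one row.
std::size_t rowWalkStrideBytes(std::size_t cols) {
    return (flatIndex(0, 1, cols) - flatIndex(0, 0, cols)) * sizeof(float);
}

// Distance in bytes between consecutive accesses when walking down one column.
std::size_t columnWalkStrideBytes(std::size_t cols) {
    return (flatIndex(1, 0, cols) - flatIndex(0, 0, cols)) * sizeof(float);
}
```

For the 640x480 input, a walk along a row moves by one float per step, while a walk down a column moves by 640 floats per step, so each step lands in a different part of memory.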

AN ACCIDENTAL ERROR

• Occurred when copying data from a Mat to a one-dimensional float array

0 175 175 175 175 175

0 174 174 174 174 174

6.78807e-29 175 175 175 175 175

0 175 175 175 175 175

0 175 175 175 175 175

6.79909e-29 134 151 158 136 142

0 138 132 135 140 135

6.80354e-29 136 136 143 142 137

AN ACCIDENTAL ERROR (CONT.)

• Why?

• memset(ar,0,sizeof(float)*(image1.step+1)*(image1.rows+1));

• The size passed to memset is not correct.
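
One way to avoid this kind of size error is to let the container zero-initialize itself instead of computing a byte count for memset by hand. The sketch below is an assumption about the intended layout, not the report's actual code: it takes `ar` to be a (rows + 1) x (cols + 1) float array whose first row and first column stay zero. `GrayImage` is a hypothetical stand-in for an 8-bit single-channel cv::Mat, where `step` is the length of one stored row in bytes and can differ from the number of columns.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an 8-bit, single-channel image such as a cv::Mat.
struct GrayImage {
    std::size_t rows;
    std::size_t cols;
    std::size_t step;                 // bytes per stored row, may be larger than cols
    std::vector<unsigned char> data;  // rows * step bytes
};

// Copy the pixels into a (rows + 1) x (cols + 1) float array.
// The extra first row and first column are left at zero.
std::vector<float> toPaddedFloatArray(const GrayImage& image) {
    const std::size_t paddedRows = image.rows + 1;
    const std::size_t paddedCols = image.cols + 1;

    // The vector is sized in elements and filled with 0.0f,
    // so no byte count has to be worked out by hand.
    std::vector<float> ar(paddedRows * paddedCols, 0.0f);

    for (std::size_t r = 0; r < image.rows; ++r)
        for (std::size_t c = 0; c < image.cols; ++c)
            // Source rows are addressed with step, destination rows with paddedCols.
            ar[(r + 1) * paddedCols + (c + 1)] =
                static_cast<float>(image.data[r * image.step + c]);

    return ar;
}
```

The source is indexed with `step` and the destination with `cols + 1`, so the two sizes are never mixed in a single expression.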

The End