ENHANCEMENT OF TEGRA TABLET'S COMPUTATIONAL PERFORMANCE BY GEFORCE DESKTOP AND WIFI

Di Zhao

The Ohio State University

GPU Technology Conference 2014, March 24-27 2014, San Jose California

1 TEGRA-WIFI-GEFORCE

Tegra tablet and Geforce desktop are among the most popular home-based computing platforms;

Wifi is the most popular home networking platform;

Widely applied to entertainment, healthcare, gaming, etc;

1.1 DEVELOPMENT OF TEGRA AND GEFORCE

Device      Core                       GFLOPS (single)
Tegra 4     CPU 4+1 / GPU 72           ~ 80
Tegra K1    CPU 4+1 / 192 CUDA core    ~ 180
TITAN       2688 CUDA core             ~ 4500
TITAN Z     5760 CUDA core             ~ 8000

1.2 DEVELOPMENT OF WIFI

Protocol    Theoretical Speed (Mbit/s)    Frequency (GHz)    Release Date
802.11      1 - 2                         2.4                1997-06
802.11a     6 - 54                        5                  1999-09
802.11b     1 - 11                        2.4                1999-09
802.11g     6 - 54                        2.4                2003-06
802.11n     15 - 150                      2.4/5              2009-10
802.11ac    < 866.7                       5                  2014-01
802.11ad    < 6912                        60                 2012-12

IEEE 802.11 Network Standards

THE PROBLEM

Mobile applications such as computer graphics or healthcare often require heavy computation;

Mobile applications have time constraints because users do not want to wait for seconds;

Geforce has large GFLOPS; can Tegra be supported by Geforce over Wifi?

In this talk, experiences with Tegra-Wifi-Geforce are introduced, and a medical imaging example is discussed;

Tegra has limited GFLOPS; when an application exceeds Tegra's computational ability, what should we do?

2 SETUP DEVELOPMENT ENVIRONMENT FOR TEGRA-WIFI-GEFORCE

Setup of the development environment for the Tegra tablet;

Testing the Wifi device;

Setup of the development environment for the Geforce desktop;

Development of the communication model;

2.1 SETUP OF DEVELOPMENT ENVIRONMENT FOR TEGRA TABLET

Install Visual Studio on the development computer;

Set up Android Debug Bridge (ADB) for the Tegra tablet, and connect the Tegra tablet to the development computer by USB cable, Bluetooth or Wifi;

Enable developer mode for Android on the Tegra tablet;

Install the latest version of the Tegra Android Development Pack;

If everything works fine, you will see:
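A quick way to confirm the ADB connection from the steps above is to list the attached devices. The sketch below, in Python, parses the text that `adb devices` prints; the helper names and the sample serial number are illustrative assumptions and not part of the talk.

```python
import subprocess


def parse_adb_devices(output: str) -> dict:
    """Parse the text printed by `adb devices` into {serial: state}."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the "List of devices attached" header,
        # and "* daemon ..." notices printed when the server starts.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def connected_devices() -> dict:
    """Run `adb devices`; requires the Android platform tools on PATH."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Hypothetical sample output; a real tablet reports its own serial number.
sample = "List of devices attached\n0123456789ABCDEF\tdevice\n"
print(parse_adb_devices(sample))
```

A tablet that is attached but has not yet authorized the computer shows the state `unauthorized` instead of `device`.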

2.2 TESTING WIFI SPEED

Measure the real Wifi speed with a tool; this speed is unrelated to the Internet speed;

Iperf is a tool to measure maximum TCP bandwidth, delay jitter, datagram loss, etc;

Maximum TCP bandwidth is an important parameter when developing applications for Tegra-Wifi-Geforce;

TCP bandwidth can be read from the Iperf output;
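To illustrate what a maximum TCP bandwidth measurement involves, here is a minimal Python sketch that times a bulk transfer over one TCP connection. It is not Iperf and does not reproduce its output; the function names and the 8 MB transfer size are assumptions. It runs over the loopback interface so it is self-contained; on a real setup the receiver would run on the Geforce desktop and the sender on the Tegra tablet.

```python
import socket
import threading
import time


def _drain(server: socket.socket, received: list) -> None:
    """Accept one connection and count every byte until the sender closes."""
    conn, _ = server.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    received.append(total)


def measure_tcp_throughput(total_bytes: int = 8 * 1024 * 1024,
                           host: str = "127.0.0.1") -> float:
    """Send total_bytes over one TCP connection and return Mbit/s."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind((host, 0))            # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    received = []
    receiver = threading.Thread(target=_drain, args=(server, received))
    receiver.start()

    payload = b"\0" * 65536
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            n = min(len(payload), total_bytes - sent)
            client.sendall(payload[:n])
            sent += n
    receiver.join()                   # wait until the receiver saw end-of-stream
    elapsed = time.perf_counter() - start
    server.close()

    return received[0] * 8 / elapsed / 1e6


print(f"{measure_tcp_throughput():.1f} Mbit/s")
```

Loopback numbers are far higher than anything Wifi delivers; only the measurement logic carries over.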

2.3 PROGRAMMING TEGRA-WIFI-GEFORCE

2.3.1.1 Tegra Android Development Pack, Android Application (Java), Android Application with Native Code (Java, C++), Android Native Application (C++)

2.3.1.2 Graphic Programming: OpenGL ES, OpenCV, etc.

2.3.2 CUDA C/C++, CUDA Fortran, MATLAB Parallel Computing Toolbox, other commercial libraries, existing software, etc.

2.3.3 TCP/IP Socket Program

2.4 COMMUNICATION

Tegra and Geforce run different code, not SIMD;

Tegra and Geforce cooperate and communicate for the application;

Tegra tablet and Geforce desktop are different computers, so heterogeneous computing with OpenCL may not be an option;

Tegra tablet and Geforce desktop communicate only by Wifi;

Any library?

2.4 COMMUNICATION

Figure: Blocked Point-to-Point Communication between Geforce and Tegra (diagram labels: Tegra, Communication, Geforce)

2.4 COMMUNICATION

Works well between Tegra and Geforce;

Advantage: no conflicts from too much data or no data;

Advantage: the communication is easy to program;

Disadvantage: low efficiency;

Disadvantage: limited to a single Geforce and a single Tegra;

Better solution?

Blocked Point-to-Point Communication between Geforce and Tegra
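A blocked point-to-point exchange of this kind can be sketched with plain TCP-style sockets. The Python example below is an illustration under assumptions, not the talk's implementation: the length-prefixed framing, the function names, and the squaring step that stands in for the heavy computation are all hypothetical, and a local socket pair replaces the Wifi link.

```python
import socket
import struct
import threading


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Blocking send: 4-byte big-endian length header, then the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Block until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock: socket.socket) -> bytes:
    """Blocking receive of one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def geforce_side(sock: socket.socket) -> None:
    """Stand-in for the desktop: wait for a request, compute, reply."""
    values = struct.unpack("!4f", recv_msg(sock))
    squared = [v * v for v in values]   # placeholder for the heavy computation
    send_msg(sock, struct.pack("!4f", *squared))


def tegra_side(sock: socket.socket, values) -> list:
    """Stand-in for the tablet: send the request, then block for the result."""
    send_msg(sock, struct.pack("!4f", *values))
    return list(struct.unpack("!4f", recv_msg(sock)))


# A local socket pair stands in for the Wifi link between the two machines.
tegra_sock, geforce_sock = socket.socketpair()
worker = threading.Thread(target=geforce_side, args=(geforce_sock,))
worker.start()
result = tegra_side(tegra_sock, [1.0, 2.0, 3.0, 4.0])
worker.join()
tegra_sock.close()
geforce_sock.close()
print(result)
```

Each side waits for the other at every step, which is why this mode is simple to program but leaves one device idle while the other works.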

2.5 APPLICATIONS FOR TEGRA-WIFI-GEFORCE

After the setup of the development environment, applications on Tegra-Wifi-Geforce can be developed by:

Evaluation of computational requirements;

Evaluation of communication requirements;

Evaluation of the programming difficulty: Geforce is much easier to program than Tegra;

Deciding the separation point that splits the application between Tegra tablet and Geforce desktop;
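The computational and communication evaluations on this slide can be combined into a rough estimate of whether moving a stage to Geforce pays off. The Python sketch below is a back-of-the-envelope model under assumptions: it uses the approximate peak GFLOPS figures from the earlier table, while the workload size, the data volume and the 100 Mbit/s measured bandwidth are hypothetical values chosen only for illustration.

```python
def local_time(work_gflop: float, tegra_gflops: float) -> float:
    """Seconds to run the stage on the tablet at its peak rate."""
    return work_gflop / tegra_gflops


def offload_time(work_gflop: float, geforce_gflops: float,
                 bytes_moved: float, wifi_mbit_s: float) -> float:
    """Seconds to run the stage on the desktop plus the Wifi transfer."""
    compute = work_gflop / geforce_gflops
    transfer = bytes_moved * 8 / (wifi_mbit_s * 1e6)
    return compute + transfer


def should_offload(work_gflop: float, bytes_moved: float,
                   tegra_gflops: float = 180.0,     # Tegra K1, approximate peak
                   geforce_gflops: float = 4500.0,  # TITAN, approximate peak
                   wifi_mbit_s: float = 100.0) -> bool:  # hypothetical measurement
    """True when computing remotely and moving the data is faster than local."""
    return (offload_time(work_gflop, geforce_gflops, bytes_moved, wifi_mbit_s)
            < local_time(work_gflop, tegra_gflops))


# Hypothetical stage: 90 GFLOP of work that produces 5 MB of data.
print(should_offload(work_gflop=90.0, bytes_moved=5e6))
```

Peak GFLOPS are never reached in practice, so such an estimate only indicates which side of the separation point a stage is likely to belong on.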

3 Enhancement of Tegra's Computational Performance by GeForce

MOBILE HEALTH MARKET WORTH $20.7 BILLION BY 2018

1000000 KJ

Healthcare apps for the physicians;

Healthcare apps for the public;

Currently, medical images in healthcare apps appear mainly in medical image viewers;

ITK on the iOS

3.1 THE EXAMPLE ON TEGRA-WIFI-GEFORCE: ULTRASOUND SIMULATION

Ultrasound simulation consists of two parts: signal simulation and image reconstruction;

Given the settings, ultrasound signals are simulated by solving physical equations rather than acquired from real medical equipment;

Based on the simulated signals, ultrasound images are reconstructed;

Figure: Settings -> (Communication 1) -> Signal Simulation -> (Communication 2) -> Image Reconstruction

3.2 SIMULATION OF SIGNAL

In the simulation equation, both variables t and p are discretized, and solving over the two variables yields the matrix of simulated signals: the sensor data.

EVALUATION OF COMPUTATIONAL REQUIREMENTS

Generally, the ultrasound signals are calculated by numerical methods;

Ultrasound signal simulation is computationally intensive: with the finite difference method or the finite element method, a tridiagonal system of size P (the discretization size of the variable p) is solved at each of the T time steps;
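The per-time-step cost comes from the tridiagonal solve. A standard way to solve such a system is the Thomas algorithm, sketched below in Python; the talk does not say which solver it uses, so this is an assumption, and the small test system (P = 5, a 1D Laplacian-style matrix) is purely illustrative. The sketch does no pivoting and therefore assumes a well-conditioned, for example diagonally dominant, matrix.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm in O(P).

    lower[i] multiplies x[i-1] (lower[0] is unused),
    diag[i]  multiplies x[i],
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    """
    n = len(diag)
    c = [0.0] * n   # modified upper diagonal
    d = [0.0] * n   # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Illustrative system: -x[i-1] + 2*x[i] - x[i+1] = 1 with P = 5 unknowns.
P = 5
lower = [0.0] + [-1.0] * (P - 1)
diag = [2.0] * P
upper = [-1.0] * (P - 1) + [0.0]
rhs = [1.0] * P
print(solve_tridiagonal(lower, diag, upper, rhs))
```

One solve costs O(P), so a full simulation costs on the order of T × P operations per channel and scan line, which is what makes the signal simulation the expensive stage.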

EVALUATION OF COMPUTATIONAL REQUIREMENTS

Signal simulation can reach large GFLOPS;

Figure labels: Number of Channels C, One scan line

EVALUATION OF COMPUTATIONAL REQUIREMENTS

Image reconstruction steps: Beamforming, TGC, Filtering, Envelope Detection, Log Compression, Scan Conversion;

FFT and IFFT convert between the Image Domain and the Frequency Domain;

Image reconstruction needs only small GFLOPS and is easy to program. On Tegra K1, CUDA, cuFFT and cuBLAS are available. Signal processing?
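Two of the reconstruction steps, envelope detection and log compression, are small enough to sketch directly. The Python example below is one common formulation and not necessarily the one used in the talk: the envelope is taken as the magnitude of the analytic signal built in the frequency domain, a naive O(n²) DFT stands in for a real FFT library such as cuFFT, and the 60 dB dynamic range and the test pulse are assumed values.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform, kept short for illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def envelope(signal):
    """Envelope detection: magnitude of the analytic signal."""
    n = len(signal)
    spectrum = dft(signal)
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        for k in range(1, n // 2):
            weights[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            weights[k] = 2.0
    analytic = dft([s * w for s, w in zip(spectrum, weights)], inverse=True)
    return [abs(z) for z in analytic]


def log_compress(env, dynamic_range_db=60.0):
    """Map the envelope to [0, 1] over a fixed dynamic range in dB."""
    peak = max(env)
    if peak == 0.0:
        return [0.0] * len(env)
    out = []
    for e in env:
        db = 20.0 * math.log10(max(e, 1e-12) / peak)
        out.append(min(1.0, max(0.0, 1.0 + db / dynamic_range_db)))
    return out


# Hypothetical RF line: a Gaussian-windowed pulse centered at sample 32.
rf = [math.exp(-((t - 32) / 8.0) ** 2) * math.cos(2 * math.pi * 0.25 * t)
      for t in range(64)]
pixels = log_compress(envelope(rf))
print(max(range(64), key=lambda t: pixels[t]))  # index of the brightest sample
```

Both steps work independently on each scan line, which fits the slide's point that reconstruction is light and easy to program.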

EVALUATION OF COMMUNICATION REQUIREMENTS AND PROGRAMMING DIFFICULTY

Communication 1: small;

Communication 2: for simulating an image, fps × scan resolution × channels × precision (float or double), generally several MB per second;

Image data is transferred in scan resolution, not in image resolution. After the scan data is received, the image resolution is obtained by interpolation.

Image reconstruction is much easier to program than signal simulation;

The separation point: the simulated ultrasound signal;
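The Communication 2 formula can be evaluated directly. In the Python sketch below, the frame rate, scan resolution and channel count are hypothetical values picked to land in the "several MB per second" range; only the formula itself comes from the slide.

```python
BYTES_PER_SAMPLE = {"float": 4, "double": 8}


def scan_data_rate(fps: int, scan_resolution: int, channels: int,
                   precision: str = "float") -> int:
    """Bytes per second: fps x scan resolution x channels x precision."""
    return fps * scan_resolution * channels * BYTES_PER_SAMPLE[precision]


# Hypothetical settings: 10 frames/s, 2048 samples per channel, 64 channels.
rate = scan_data_rate(fps=10, scan_resolution=2048, channels=64)
print(f"{rate / 1e6:.2f} MB/s, {rate * 8 / 1e6:.1f} Mbit/s")
```

Comparing the result in Mbit/s with the measured TCP bandwidth shows whether the Wifi link can sustain the chosen frame rate; switching to double precision doubles the requirement.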

3.3 ULTRASOUND SIMULATION

The parameter settings are transferred to Geforce through Wifi (communication 1);

The ultrasound signals are simulated on Geforce;

The simulated signals are transferred to the Tegra tablet through Wifi (communication 2);

The image is reconstructed on the Tegra tablet;

3.4 FUTURE RESEARCH

Ultrasound simulation on Tegra K1 using CUDA and its libraries;

Better communication mode for Tegra-Wifi-Geforce;

Real-time 3D+time ultrasound simulation: faster signal simulation and faster image reconstruction;

4 DISCUSSION

Open-source Medical Image Software for Mobile Devices:

Dicoogle Mobile;

Droid Dicom Viewer;

ITK on the iOS;

MIRC Viewer;

KiwiViewer;

Endrov;

MORE MEDICAL IMAGE

Drishti is a volume exploration and presentation tool: http://sf.anu.edu.au/Vizlab/drishti/;

ITK-SNAP is a software application used to segment structures in 3D medical images: http://www.itksnap.org/;

VTK is an open-source, freely available software system for 3D computer graphics, image processing and visualization: http://www.vtk.org/;

Voreen is a volume rendering engine: http://www.voreen.org/;

InVesalius is open-source software for reconstruction of CT and MRI images: http://www.cti.gov.br/invesalius/;

GIMIAS is a workflow-oriented environment focused on biomedical image computing and simulation: www.gimias.net;

Open-source Medical Image Software

EVEN MORE MEDICAL IMAGE?

Please visit my poster:

THANKS !