University of Iowa | Mobile Sensing Laboratory Static Memory Management for Efficient Mobile Sensing Applications EMSOFT 2015 Farley Lai, Daniel Schmidt, Octav Chipara Department of Computer Science

Static Memory Management for Efficient Mobile Sensing Applications


Page 1: Static Memory Management for Efficient Mobile Sensing Applications

University of Iowa | Mobile Sensing Laboratory

Static Memory Management for Efficient Mobile Sensing Applications

EMSOFT 2015

Farley Lai, Daniel Schmidt, Octav Chipara

Department of Computer Science

Page 2: Static Memory Management for Efficient Mobile Sensing Applications


• A class of applications that process continuous input data streams and may produce continuous output streams

– real-time processing

– efficient resource management

Emerging Mobile Sensing Applications

Introduction

[Figure: Sensing Stream Processing. Blocks shown: Speaker Models, Speech Recording, VAD, Feature Extraction, HTTP Upload, Speaker Identifier]

Page 3: Static Memory Management for Efficient Mobile Sensing Applications


• Workload: stream operations on frames of samples

– e.g., windowing, splitting, or appending

– stream operations tend to be memory-intensive

• Goal: implement stream operations efficiently

– reduce memory footprint

– reduce the number of memory accesses

• Challenges:

– handle complex interactions between components

– avoid unnecessary memory copies

– enable data sharing between components

The Memory Management Challenge

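To make the cost of an unnecessary copy concrete, the sketch below contrasts a windowing operation that copies every frame with one that shares the underlying storage. It is only an illustration using Python's standard `array` and `memoryview` types; the function names `windows_copy` and `windows_shared`, the sample values, and the window and hop sizes are assumptions, not part of the presented system.

```python
from array import array

def windows_copy(samples, size, hop):
    """Return overlapping windows; slicing an array copies each window."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, hop)]

def windows_shared(samples, size, hop):
    """Return overlapping windows as memoryview slices that share storage."""
    view = memoryview(samples)
    return [view[i:i + size] for i in range(0, len(samples) - size + 1, hop)]

samples = array("h", range(8))      # 8 hypothetical 16-bit samples
copied = windows_copy(samples, 4, 2)
shared = windows_shared(samples, 4, 2)

samples[2] = 99                     # overwrite one sample in place
# The copied window still holds the old value; the shared window sees the new one.
print(copied[0][2], shared[0][2])
```

The shared variant allocates no per-window storage, which is the kind of saving the goals on this slide aim for.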

Page 4: Static Memory Management for Efficient Mobile Sensing Applications


• Dynamic memory management

– specialized data structures to implement memory management

• e.g., SigSeg [Girod, et al. 2008] – linked list of buffered samples

– a level of indirection in accessing streaming data

• Static memory management

– no runtime overhead

– requires precise knowledge of the variable live ranges

• difficult to achieve in complex applications

• must be time-efficient to be included in compilers

Approaches to Memory Management


[Girod2008] L. Girod, Y. Mei, R. Newton, S. Rost, A. Thiagarajan, H. Balakrishnan, and S. Madden, “XStream: a Signal-Oriented Data Stream Management System,” in ICDE, 2008.
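The difference between the two approaches can be sketched as follows. The `Segment` class is a hypothetical linked list of buffered samples in the spirit of SigSeg, not XStream's actual implementation, and `BUFFER` and `FRAME_B_OFFSET` stand in for a layout that a compiler would fix ahead of time.

```python
class Segment:
    """One node in a linked list of buffered samples (illustrative only)."""
    def __init__(self, samples, next_segment=None):
        self.samples = samples
        self.next = next_segment

def read_dynamic(head, index):
    """Walk the list to reach sample `index`; each hop is an extra indirection."""
    hops = 0
    node = head
    while index >= len(node.samples):
        index -= len(node.samples)
        node = node.next
        hops += 1
    return node.samples[index], hops

# Static layout: one preallocated buffer with offsets decided ahead of time.
BUFFER = [10, 11, 12, 13, 14, 15]
FRAME_B_OFFSET = 3  # hypothetical offset assigned by a compiler

def read_static(index):
    """Read from the second frame through a precomputed offset; no list walking."""
    return BUFFER[FRAME_B_OFFSET + index]

head = Segment([10, 11, 12], Segment([13, 14, 15]))
value_dynamic, hops = read_dynamic(head, 4)
value_static = read_static(1)
print(value_dynamic, hops, value_static)
```

Both reads return the same sample, but only the dynamic one pays for traversal at run time.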

Page 5: Static Memory Management for Efficient Mobile Sensing Applications


• Application model

• Static analysis

• Memory layout

• Evaluation

• Conclusions

Outline


Page 6: Static Memory Management for Efficient Mobile Sensing Applications


• StreamIt – synchronous data flow (SDF) language

– application = graph of filters connected with FIFO channels

• limited memory operations: pop(), peek(), and push()

• known consumption and production rates

A Model for Stream Applications

[Figure: Filter::work() with pop and peek on its INPUT channel and push on its OUTPUT channel]
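The filter model can be mimicked in a few lines. This is a Python sketch rather than StreamIt code: the `Channel` and `MovingAverage` classes and the rates (peek 3, pop 1, push 1) are assumptions chosen for illustration.

```python
from collections import deque

class Channel:
    """FIFO channel offering the three operations of the model."""
    def __init__(self):
        self.items = deque()

    def push(self, value):
        self.items.append(value)

    def pop(self):
        return self.items.popleft()

    def peek(self, offset):
        return self.items[offset]

class MovingAverage:
    """Hypothetical filter with fixed rates: peek 3, pop 1, push 1 per firing."""
    PEEK, POP, PUSH = 3, 1, 1

    def __init__(self, inp, out):
        self.inp, self.out = inp, out

    def can_fire(self):
        return len(self.inp.items) >= self.PEEK

    def work(self):
        total = sum(self.inp.peek(i) for i in range(self.PEEK))
        self.out.push(total / self.PEEK)
        self.inp.pop()

inp, out = Channel(), Channel()
for sample in [3, 6, 9, 12]:
    inp.push(sample)

lpf = MovingAverage(inp, out)
while lpf.can_fire():
    lpf.work()
print(list(out.items), list(inp.items))
```

Because the rates are constants, the number of items left in each channel after any firing sequence can be computed without running the program, which is what makes static scheduling possible.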

Page 7: Static Memory Management for Efficient Mobile Sensing Applications


• StreamIt – synchronous data flow language

– applications are constructed hierarchically

• pipeline of streams

• split-joins (splitter and joiner)

– pass-by-value semantics

• a naïve implementation would incur a significant number of copies

A Model for Stream Applications

[Figure: example stream graph with nodes Source, Duplicate, LPF1, LPF2, Round-Robin, Subtract, Sink]

Page 8: Static Memory Management for Efficient Mobile Sensing Applications


• SDFs may be executed in a cyclo-static schedule

– the complete memory behavior of the program may be observed within one execution of the schedule

• Our solution: static analysis + memory layout

Insight

[Figure: example stream graph with nodes Source, Duplicate, LPF1, LPF2, RoundRobin, Subtract, Sink]

INIT PHASE: Source,3 DUP,3 LPF1,1 LPF2,1 RR,1 Sub,1 Sink

STEADY PHASE: Source,1 DUP,1 LPF1,1 LPF2,1 RR,1 Sub,1 Sink
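The insight can be checked with a small token-counting simulation. The sketch below fires the filters in the order of the two phases and tracks how many items each channel holds. The channel names and all rates (for example, each LPF peeking 3 items while popping 1) are assumptions made for illustration, and the Sink is assumed to fire once per phase.

```python
# name: (pops per input channel, pushes per output channel, peek depth per input channel)
# All rates and channel names are assumed for this illustration.
FILTERS = {
    "Source": ({}, {"src>dup": 1}, {}),
    "DUP":    ({"src>dup": 1}, {"dup>lpf1": 1, "dup>lpf2": 1}, {}),
    "LPF1":   ({"dup>lpf1": 1}, {"lpf1>rr": 1}, {"dup>lpf1": 3}),
    "LPF2":   ({"dup>lpf2": 1}, {"lpf2>rr": 1}, {"dup>lpf2": 3}),
    "RR":     ({"lpf1>rr": 1, "lpf2>rr": 1}, {"rr>sub": 2}, {}),
    "Sub":    ({"rr>sub": 2}, {"sub>sink": 1}, {}),
    "Sink":   ({"sub>sink": 1}, {}, {}),
}

INIT = [("Source", 3), ("DUP", 3), ("LPF1", 1), ("LPF2", 1),
        ("RR", 1), ("Sub", 1), ("Sink", 1)]
STEADY = [("Source", 1), ("DUP", 1), ("LPF1", 1), ("LPF2", 1),
          ("RR", 1), ("Sub", 1), ("Sink", 1)]

def run_phase(schedule, occupancy, peak):
    """Fire each filter the scheduled number of times, counting tokens only."""
    for name, count in schedule:
        pops, pushes, peeks = FILTERS[name]
        for _ in range(count):
            for channel, need in pops.items():
                # A firing needs enough tokens for both its pops and its peeks.
                assert occupancy[channel] >= max(need, peeks.get(channel, 0)), name
                occupancy[channel] -= need
            for channel, made in pushes.items():
                occupancy[channel] += made
                peak[channel] = max(peak[channel], occupancy[channel])

channels = {c for pops, pushes, _ in FILTERS.values() for c in list(pops) + list(pushes)}
occupancy = dict.fromkeys(channels, 0)
peak = dict.fromkeys(channels, 0)

run_phase(INIT, occupancy, peak)
after_init = dict(occupancy)
run_phase(STEADY, occupancy, peak)

# The steady phase returns every channel to the occupancy it started with.
print(after_init == occupancy, peak["dup>lpf1"], after_init["dup>lpf1"])
```

Under these assumed rates, one steady-phase iteration leaves the channel state unchanged, so the peak occupancies observed in a single iteration bound the buffer sizes for the whole run.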

Page 9: Static Memory Management for Efficient Mobile Sensing Applications


• Location Sharing

– an output element is pushed from an unmodified input element

– each I/O element is associated with a pop/push index

• Temporal Sharing

– an output element reuses the input element storage

– each I/O element is associated with a live range [i, j]

• Builds on abstract interpretation

– build a Control-Flow Graph (CFG) for each filter

– abstract interpretation of memory operations

Component Analysis


Page 10: Static Memory Management for Efficient Mobile Sensing Applications


• Abstract interpretation of memory operations

– memory counter (MC) – relative order of operations

– indexes of current push (out) and pop (in)

– live range for each input (LIN) and output (LOUT) element

• Indexes and live ranges represented as intervals

• Subset of rules for determining live ranges:

Component Analysis

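push: $\dfrac{MC,\ out,\ LOUT}{LOUT[out] \sqcup MC,\quad out{+}{+},\quad MC{+}{+}}$

pop: $\dfrac{MC,\ in,\ LIN}{LIN[in] \sqcup MC,\quad in{+}{+},\quad MC{+}{+}}$

join: $\dfrac{(MC_1,\ in_1,\ out_1) \qquad (MC_2,\ in_2,\ out_2)}{(MC = \max(MC_1, MC_2),\quad in = in_1 \sqcup in_2,\quad out = out_1 \sqcup out_2)}$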

Page 11: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

Live ranges: LIN[0] = [0,0]; LOUT[0] = ∅, LOUT[1] = ∅

RULE (pop): $\dfrac{MC,\ in,\ LIN}{LIN[in] \sqcup MC,\quad in{+}{+},\quad MC{+}{+}}$

STATE: (MC = 0, in = [0,0], out = [0,0]) → (MC = 1, in = [1,1], out = [0,0])

CFG: $LIN[0] = LIN[0] \sqcup [0,0]$

Page 12: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

Live ranges: LIN[0] = [0,0]; LOUT[0] = [1,1], LOUT[1] = ∅

RULE (push): $\dfrac{MC,\ out,\ LOUT}{LOUT[out] \sqcup MC,\quad out{+}{+},\quad MC{+}{+}}$

STATE: (MC = 1, in = [1,1], out = [0,0]) → (MC = 2, in = [1,1], out = [1,1])

CFG: $LOUT[0] = LOUT[0] \sqcup [1,1]$

Page 13: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

Live ranges: LIN[0] = [0,0]; LOUT[0] = [1,1], LOUT[1] = ∅

RULE (join): $\dfrac{(MC_1,\ in_1,\ out_1) \qquad (MC_2,\ in_2,\ out_2)}{(MC = \max(MC_1, MC_2),\quad in = in_1 \sqcup in_2,\quad out = out_1 \sqcup out_2)}$

STATE: (MC = 1, in = [1,1], out = [0,0]) and (MC = 2, in = [1,1], out = [1,1]) → (MC = 2, in = [1,1], out = [0,1])

Page 14: Static Memory Management for Efficient Mobile Sensing Applications

Example of Component Analysis

Live ranges: LIN[0] = [0,0]; LOUT[0] = [1,1], LOUT[0,1] = [2,2]

RULE (push): $\dfrac{MC,\ out,\ LOUT}{LOUT[out] \sqcup MC,\quad out{+}{+},\quad MC{+}{+}}$

STATE: (MC = 2, in = [1,1], out = [0,1]) → (MC = 3, in = [1,1], out = [1,2])

CFG: $LOUT[0,1] = LOUT[0,1] \sqcup [2,2]$
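The four example steps can be replayed with a small interpreter for the push, pop, and join rules. This is a sketch under stated assumptions, not the authors' implementation: intervals are modeled as `(lo, hi)` tuples with `None` for ∅, an interval-valued index is expanded into its individual elements, and live ranges are merged element-wise at a join (the slides list only a subset of the rules and do not show this case).

```python
def join(a, b):
    """Interval join: the smallest interval covering both; None is the empty interval."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def shift(a):
    """Abstract version of index++: increment both bounds."""
    return (a[0] + 1, a[1] + 1)

def touch(ranges, index, mc):
    """Join MC into the live range of every element the index interval may denote."""
    updated = dict(ranges)
    for i in range(index[0], index[1] + 1):
        updated[i] = join(updated.get(i), (mc, mc))
    return updated

def pop(s):
    return {**s, "LIN": touch(s["LIN"], s["in"], s["MC"]),
            "in": shift(s["in"]), "MC": s["MC"] + 1}

def push(s):
    return {**s, "LOUT": touch(s["LOUT"], s["out"], s["MC"]),
            "out": shift(s["out"]), "MC": s["MC"] + 1}

def join_states(a, b):
    """Merge two control-flow paths; merging LIN and LOUT element-wise is an assumption."""
    merged = {"MC": max(a["MC"], b["MC"]),
              "in": join(a["in"], b["in"]),
              "out": join(a["out"], b["out"])}
    for key in ("LIN", "LOUT"):
        elements = set(a[key]) | set(b[key])
        merged[key] = {e: join(a[key].get(e), b[key].get(e)) for e in elements}
    return merged

# Example CFG: a pop, then a branch that may or may not push, then a push.
s0 = {"MC": 0, "in": (0, 0), "out": (0, 0), "LIN": {}, "LOUT": {}}
s1 = pop(s0)
s2 = join_states(s1, push(s1))
s3 = push(s2)
print(s2["MC"], s2["in"], s2["out"])
print(s3["MC"], s3["in"], s3["out"], s3["LIN"], s3["LOUT"])
```

The MC, in, and out values of `s1`, `s2`, and `s3` agree with the STATE entries of the example. The per-element live ranges in `LOUT` follow from the expansion assumed here and are presented differently than on the slide, which shows the last update under the interval index [0,1].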

Page 15: Static Memory Management for Efficient Mobile Sensing Applications


• Component analysis constructs a memory fragment

– captures live ranges for temporal reuse

– captures location sharing edges

• Whole program analysis constructs a memory graph

– stitches together memory fragments

– simulates the schedule to

• connect location sharing edges into paths and

• extend live ranges with the phase number and invocation index

• Our approach:

– analysis is precise when there is no input dependency

– otherwise, it is a sound approximation

Whole Program Analysis


Page 16: Static Memory Management for Efficient Mobile Sensing Applications

• Empirical insights

– split-joins can be eliminated for manipulating location-shared elements

– a filter usually can reuse its input memory

• Heuristic approaches to resolving temporal reuse conflicts

Memory Layout

[Figure: memory layouts for filters A and B in three cases: No conflict, Append on Conflict (AoC), Insert-in-Place (IP). Legend: A, B, other comps, A memory, B memory]
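The notion of a temporal reuse conflict can be illustrated with a generic greedy layout. The sketch below is neither AoC nor IP, whose details are not recoverable from this slide; it only shows that two buffers may share storage when their live ranges do not overlap and that a conflict forces extra storage. The buffer names, sizes, and live ranges are hypothetical.

```python
def overlaps(a, b):
    """Two closed live ranges conflict when they share at least one time step."""
    return a[0] <= b[1] and b[0] <= a[1]

def greedy_layout(buffers):
    """Place each buffer in the first slot whose occupants never overlap it in time.

    `buffers` maps a name to (size, live_range).
    Returns the slot index per buffer and the size of each slot.
    """
    slots = []       # each slot: {"size": int, "members": [live_range, ...]}
    placement = {}
    for name, (size, live) in sorted(buffers.items(), key=lambda kv: kv[1][1]):
        for index, slot in enumerate(slots):
            if all(not overlaps(live, other) for other in slot["members"]):
                slot["members"].append(live)
                slot["size"] = max(slot["size"], size)
                placement[name] = index
                break
        else:
            # Conflict with every existing slot: allocate new storage.
            slots.append({"size": size, "members": [live]})
            placement[name] = len(slots) - 1
    return placement, [slot["size"] for slot in slots]

buffers = {
    "A_in":  (4, (0, 2)),
    "A_out": (4, (1, 4)),   # overlaps A_in, so it cannot share that slot
    "B_out": (4, (5, 7)),   # live only after A_in is dead, so it can reuse storage
}
placement, sizes = greedy_layout(buffers)
print(placement, sum(sizes))
```

In this toy case the three buffers fit in 8 units instead of 12, because `B_out` reuses the slot of `A_in`.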

Page 17: Static Memory Management for Efficient Mobile Sensing Applications

• Intel x86_64 on Mac OS X 10.10.3

– 3GHz Intel Xeon CPU E5-1680 v2

– 32KB L1 instruction + 32KB L1 data caches

– 256KB L2 + 25MB L3 caches

• StreamIt Compiler

– baseline default settings without optimizations

– enabled cache optimizations with –cacheopt

– gcc -O3 to compile generated C/C++ code

• 11 micro benchmarks from StreamIt

• 3 macro benchmarks from real MSAs

– BeepBeep [Peng, C., et al. 2007]

– MFCC and Crowd [Xu, C., et al. 2013]

Experimental Setup


Evaluation

Page 18: Static Memory Management for Efficient Mobile Sensing Applications

– ESMS reduces both channel buffer sizes and the number of memory operations from splitters, joiners, and reordering filters

Memory Usage on Intel x86_64

– 45% to 96% reductions; 73% reduction on average

Page 19: Static Memory Management for Efficient Mobile Sensing Applications

– Compared with baseline StreamIt:

– The average speedups of AA, AoC, and IP are 3, 3.1, and 3, respectively, while the average speedup of CacheOpt is only 1.07.

– ESMS improves performance by eliminating unnecessary memory operations and reducing cache/memory references.

Speedup on Intel x86_64

Page 20: Static Memory Management for Efficient Mobile Sensing Applications


• Static memory management is effective for stream languages

– whole program memory behaviors may be characterized

– both location and temporal sharing opportunities are exploited

– performance improvement due to fewer memory operations and references

• ESMS provides significant performance improvements

– 45% to 96% data size reduction

– 73% code size reduction

– 3X speedup

Conclusions


Page 21: Static Memory Management for Efficient Mobile Sensing Applications


• National Science Foundation (NeTs grant #1144664 )

• Carver Foundation (grant #14-43555 )

Acknowledgements


CSense Toolkit

Page 22: Static Memory Management for Efficient Mobile Sensing Applications


Questions?

Thank You
