TaintART: A Practical Multi-level Information-Flow Tracking System for Android RunTime Mingshen Sun * , Tao Wei , and John C.S. Lui * * The Chinese University of Hong Kong Baidu X-Lab October 25 @ CCS 2016


TaintART: A Practical Multi-level Information-Flow Tracking System for Android RunTime

Mingshen Sun∗, Tao Wei†, and John C.S. Lui∗

∗ The Chinese University of Hong Kong
† Baidu X-Lab

October 25 @ CCS 2016


Introduction – Android Malware

Mobile devices have become the biggest target among all threats

Report

From 2004 to 2013 we detected nearly 200,000 samples of malicious mobile code. In 2014 there were 295,539 new programs, while the number was 884,774 in 2015. (Kaspersky) 1

1 https://securelist.com/analysis/kaspersky-security-bulletin/73839/mobile-malware-evolution-2015/

Mingshen Sun (CUHK) TaintART @ CCS 2016 October 25, 2016 2 / 24


Introduction – Android Malware

Android malware samples accounted for 98% of all mobile threats 2

• Trojan
• Spyware
• Phishing apps
• Ransomware
• Rootkit
• …

2 http://www.phonearena.com/news/Malware-on-Android---a-myth-or-real-threat_id37322


Rest of the talk…

1 Introduction to dynamic taint analysis

2 TaintART: A Practical Multi-level Information-Flow Tracking System for Android RunTime

  • Introduction to dynamic taint analysis system
  • TaintART
  • Background of Dalvik and ART
  • System Design of TaintART
  • Implementation & Case Study
  • Evaluation by macro/micro benchmark

3 Summary


Introduction

Dynamic taint analysis (a.k.a. dynamic information-flow analysis)

1 label (taint) sensitive data from certain functions (sources)
2 handle label transitions (taint propagation) between variables, files, and procedures at runtime
3 detect when tainted data is transmitted out of the device through certain functions (sinks)
4 such a transmission indicates data leakage
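The source, propagation, and sink steps can be illustrated with a minimal sketch. Everything here is hypothetical and only mirrors the idea: the `Tainted` wrapper, the function names, and the made-up device ID are illustrative assumptions, not TaintART code or Android APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value paired with a taint label (hypothetical wrapper)."""
    value: str
    tainted: bool = False


def read_device_id() -> Tainted:
    # Step 1 (source): data returned by a sensitive function is labeled.
    # The ID below is a made-up placeholder.
    return Tainted("356938035643809", tainted=True)


def concat(a: Tainted, b: Tainted) -> Tainted:
    # Step 2 (propagation): the result is tainted if either operand is.
    return Tainted(a.value + b.value, a.tainted or b.tainted)


leaks: list[str] = []


def send_to_network(data: Tainted) -> None:
    # Steps 3-4 (sink): tainted data leaving the device is recorded as a leak.
    if data.tainted:
        leaks.append(data.value)


payload = concat(Tainted("id="), read_device_id())
send_to_network(payload)           # tainted: recorded
send_to_network(Tainted("hello"))  # untainted: ignored
```

Only the payload derived from the source ends up in `leaks`; the untainted message passes through the sink unreported.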


Introduction

Applications of dynamic taint analysis systems (a.k.a. dynamic information-flow analysis)

1 attack detection and prevention
2 information policy enforcement
3 testing in software engineering
4 data lifetime and scope analysis


Introduction

Current status of dynamic taint analysis tools for Android

• TaintDroid is a notable system released in 2010 by William Enck et al., and many systems are based on TaintDroid
• TaintDroid was designed for a VM-based system and implemented on legacy Android versions (2.1, 2.3, 4.1, and 4.3)
• recent Android versions adopted an ahead-of-time (AOT) compilation strategy and introduced the new Android RunTime (ART) to replace the Dalvik VM
• this raises portability, compatibility, and performance issues

Figure: Percentage (y-axis, 0 to 0.4) by Android SDK Version (x-axis, API levels 1 to 23, with Android 2.3, 4.0, 4.4, 5.0, and 6.0 marked). Series: Target SDK Version (Oct. 2015), Minimum SDK Version (Oct. 2015), Target SDK Version (Feb. 2016), Minimum SDK Version (Feb. 2016). Means annotated in the plot: 19.11, 11.37, 20.69, 12.68.


TaintART

• We design and implement TaintART, a dynamic information-flow tracking system which targets the latest Android runtime
• TaintART introduces a multi-level taint label to tag sensitivity levels
• TaintART instruments Android’s compiler and utilizes processor registers for taint storage
• TaintART needs only register accesses and achieves faster taint propagation than TaintDroid


Background – Android App Environment

The Dalvik app environment
• source code -> dex bytecode -> optimized dex bytecode -> run

The ART app environment
• source code -> dex bytecode -> compiled native code -> run


System Design – Overview of TaintART

• The TaintART compiler in the installation stage
• The TaintART runtime in the runtime stage


System Design – Taint Tag Storage

Taint tag storage
• The TaintART compiler will reserve registers for taint storage


System Design – Taint Propagation Logic

Taint tag propagation (from R1 to R0)

1 mask
2 shift
3 merge

Figure: Taint tag propagates from R1 to R0.
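The mask, shift, and merge steps can be sketched on a single packed storage word. This is an illustration under stated assumptions, not the code TaintART emits: it assumes a 2-bit tag per register (matching the four levels 00 to 11), that register i's tag sits at bits 2i and 2i+1 of a 32-bit word, and that "merge" overwrites the destination tag as in a move. The helper names are hypothetical.

```python
TAG_BITS = 2                    # assumed: two bits encode levels 0..3
TAG_MASK = (1 << TAG_BITS) - 1  # 0b11
WORD_MASK = 0xFFFFFFFF          # assumed 32-bit taint storage word


def get_tag(storage: int, reg: int) -> int:
    """Read the tag of register `reg` from the packed storage word."""
    return (storage >> (reg * TAG_BITS)) & TAG_MASK


def propagate(storage: int, dest: int, src: int) -> int:
    """Copy the tag of `src` onto `dest` using mask, shift, merge."""
    # 1. mask: isolate the source tag in place.
    src_tag = storage & (TAG_MASK << (src * TAG_BITS))
    # 2. shift: move it down to bit 0, then up to the destination slot.
    shifted = (src_tag >> (src * TAG_BITS)) << (dest * TAG_BITS)
    # 3. merge: clear the destination's old tag and OR in the new one.
    cleared = storage & ~(TAG_MASK << (dest * TAG_BITS))
    return (cleared | shifted) & WORD_MASK


storage = 0b1001                             # R1 tag = 0b10, R0 tag = 0b01
storage = propagate(storage, dest=0, src=1)  # R0 now carries R1's tag
```

Because all three steps are bitwise operations on one register-sized word, no memory access is needed for the tag itself, which is the property the slides attribute to register-based storage.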


System Design – Taint Propagation Logic

Taint propagation logic
• classes of instructions (instruction type and related locations)
• e.g., move, boolean not, add, etc.
• the location is an abstraction over the potential registers containing variables or constants

Method invocation taint propagation

Binder IPC & native code taint propagation

Table 1: Descriptions of multi-level aware taint propagation logic.

HInstruction (Location)                   Semantic                Taint Propagation Logic Description
HParallelMove(dest, src)                  dest ← src              Set dest taint to src taint; if src is constant then clear dest taint
HUnaryOperation(out, in)                  out ← in                Set out taint to in taint; unary operations ∈ {!, -, ~}
  HBooleanNot, HNeg, HNot
HBinaryOperation(out, first, second)      out ← first ⊗ second    Set out taint to max(first taint, second taint); ⊗ ∈ {+, -, *, /, %, <<, >>, &, |, ^}
  HAdd, HSub, HMul, HDiv, HRem,
  HShl, HShr, HAnd, HOr, HXor
HArrayGet(out, obj, index)                out ← obj[index]        Set out taint to obj taint
HArraySet(value, obj, index)              obj[index] ← value      Set obj taint to value taint
HStaticFieldGet(out, base, offset)        out ← base[offset]      Set out taint to base[offset] field taint
HStaticFieldSet(value, base, offset)      base[offset] ← value    Set base[offset] field taint to value taint
HInstanceFieldGet(out, base, offset)      out ← base[offset]      Set out taint to base[offset] field taint
HInstanceFieldSet(value, base, offset)    base[offset] ← value    Set base[offset] field taint to value taint
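A few of the rules in Table 1 can be sketched as operations on a map from locations to taint levels. The `TaintState` class, its method names, and the `CONSTANT` marker are hypothetical and chosen for illustration; the real logic is applied by the compiler to HInstructions, not by a Python object.

```python
CONSTANT = None  # stand-in for a constant source location (assumption)


class TaintState:
    """Maps location names to taint levels 0..3; unknown locations are 0."""

    def __init__(self):
        self.tags = {}

    def taint_of(self, loc):
        return self.tags.get(loc, 0)

    def parallel_move(self, dest, src):
        # dest <- src; a constant source clears the destination taint.
        self.tags[dest] = 0 if src is CONSTANT else self.taint_of(src)

    def unary(self, out, in_):
        # out <- op(in): the output inherits the input's taint.
        self.tags[out] = self.taint_of(in_)

    def binary(self, out, first, second):
        # out <- first (op) second: keep the higher of the two levels.
        self.tags[out] = max(self.taint_of(first), self.taint_of(second))

    def array_get(self, out, obj):
        # out <- obj[index]: the element inherits the array object's taint.
        self.tags[out] = self.taint_of(obj)

    def array_set(self, value, obj):
        # obj[index] <- value: the array object takes the value's taint.
        self.tags[obj] = self.taint_of(value)


state = TaintState()
state.tags["imei"] = 1                # level 1 value
state.tags["gps"] = 2                 # level 2 value
state.binary("sum", "imei", "gps")    # max(1, 2)
state.array_set("sum", "buf")
state.array_get("x", "buf")
state.parallel_move("sum", CONSTANT)  # overwritten by a constant
```

Taking the maximum in the binary rule is what makes the label multi-level: mixing level 1 and level 2 data yields level 2 rather than a single tainted bit.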


Implementation & Case Study

Taint sources and privacy leakage levels
• four levels: no leakage, device identity, sensor data & location data leakage, and sensitive content
• classes or services: TelephonyManager, SensorManager, LocationManager, ContentResolver, File, Camera, and MediaRecorder

Case study for privacy tracking
• analyze popular apps at runtime
• track data flows
• Taobao leaks device identity, sensor data, and location data at runtime

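Table 2: Taint Sources and Privacy Leakage Levels

Level     Leaked Data          Source                Class/Service
0 (00)    No Leakage           N/A                   N/A
1 (01)    Device Identity      IMSI                  TelephonyManager
1 (01)    Device Identity      IMEI                  TelephonyManager
1 (01)    Device Identity      ICCID                 TelephonyManager
1 (01)    Device Identity      SN                    TelephonyManager
2 (10)    Sensor Data          Accelerometer         SensorManager
2 (10)    Sensor Data          Rotation              SensorManager
2 (10)    Location Data        GPS Location          LocationManager
2 (10)    Location Data        Last Seen Location    LocationManager
2 (10)    Location Data        Network Location      LocationManager
3 (11)    Sensitive Content    SMS                   ContentResolver
3 (11)    Sensitive Content    MMS                   ContentResolver
3 (11)    Sensitive Content    Contacts              ContentResolver
3 (11)    Sensitive Content    Call log              ContentResolver
3 (11)    Sensitive Content    File content          File
3 (11)    Sensitive Content    Camera                Camera
3 (11)    Sensitive Content    Microphone            MediaRecorder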
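The level encoding in Table 2 can be written down as a small lookup. The enum, the dictionary, and the `leak_level` helper are hypothetical names for illustration; only the level numbers and the class/service names come from the table. Combining several sources by taking the maximum is an assumption, chosen to mirror a max-based rule for multi-level labels.

```python
from enum import IntEnum


class LeakLevel(IntEnum):
    NO_LEAKAGE = 0b00
    DEVICE_IDENTITY = 0b01
    SENSOR_OR_LOCATION = 0b10
    SENSITIVE_CONTENT = 0b11


# Class/service -> level, as listed in Table 2.
SOURCE_LEVELS = {
    "TelephonyManager": LeakLevel.DEVICE_IDENTITY,
    "SensorManager": LeakLevel.SENSOR_OR_LOCATION,
    "LocationManager": LeakLevel.SENSOR_OR_LOCATION,
    "ContentResolver": LeakLevel.SENSITIVE_CONTENT,
    "File": LeakLevel.SENSITIVE_CONTENT,
    "Camera": LeakLevel.SENSITIVE_CONTENT,
    "MediaRecorder": LeakLevel.SENSITIVE_CONTENT,
}


def leak_level(source_classes):
    """Highest level among the sources that reached a sink (assumed rule)."""
    return max(
        (SOURCE_LEVELS[c] for c in source_classes),
        default=LeakLevel.NO_LEAKAGE,
    )


# A flow mixing device identity, sensor data, and location data:
observed = leak_level(["TelephonyManager", "SensorManager", "LocationManager"])
```

Under this assumed rule, a flow that mixes level 1 and level 2 sources is reported at level 2.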


Evaluation – Macrobenchmarks

Macrobenchmarks (overhead)
• app launch time: 6%
• app installation time: 12%
• contacts read/write: 20% / 1.2%

Table 5: Macrobenchmark results.

Macrobenchmark Name (ms)    Original (with Optimizing Backend)    TaintART
App Launch Time             348.2                                 370.3
App Installation Time       1680.5                                1886.3
Contacts Read/Write         7.0/9538.5                            8.4/9655.2
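The relative overheads follow directly from the timings in Table 5. The sketch below recomputes them; the helper name `overhead_pct` and the split of the read/write row into two entries are illustrative choices.

```python
def overhead_pct(original_ms: float, taintart_ms: float) -> float:
    """Relative slowdown of TaintART over the original runtime, in percent."""
    return (taintart_ms - original_ms) / original_ms * 100.0


# (original, TaintART) timings in milliseconds, copied from Table 5.
results = {
    "App Launch Time": (348.2, 370.3),
    "App Installation Time": (1680.5, 1886.3),
    "Contacts Read": (7.0, 8.4),
    "Contacts Write": (9538.5, 9655.2),
}

overheads = {
    name: round(overhead_pct(orig, taint), 1)
    for name, (orig, taint) in results.items()
}

for name, pct in overheads.items():
    print(f"{name}: {pct}%")
```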


Evaluation – Microbenchmarks

Compiler microbenchmark: compilation time
• 80 built-in apps in AOSP project
• 19.9% overhead

Figure: Comparison of compilation time. Compilation Time (milliseconds, 0 to 3,000) vs. Index of Built-in Apps (0 to 80). Series: Original, TaintART. Values annotated in the plot: 336.067 and 403.064.


Evaluation – Microbenchmarks

Compiler microbenchmark: instruction overhead
• 21% overhead in total
• only 0.8% overhead for memory-related instructions

Figure: Comparison of instruction numbers for different types. Number of Instructions (ARM), by compiler:

Instruction type                     TaintART Compiler    Original Compiler
I. Memory access instructions        5,028,730            4,988,969
II. Data processing instructions     5,159,454            2,747,897
III. Multiply instructions           2,379                2,379
IV. Branch/control instructions      1,906,759            1,909,696
V. Barrel shifter instructions       747,014              904,164
VI. VFP instructions                 61,898               61,693
VII. Other instructions              132,760              130,338
Total                                13,038,994           10,745,136
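The two headline numbers on this slide can be recomputed from the per-type instruction counts in the figure. The sketch below assumes the larger total (13,038,994) belongs to the TaintART compiler, which is consistent with the reported overhead; the dictionary keys and helper name are illustrative.

```python
# Instruction counts by type (I..VII), copied from the figure.
taintart = {
    "I": 5_028_730, "II": 5_159_454, "III": 2_379, "IV": 1_906_759,
    "V": 747_014, "VI": 61_898, "VII": 132_760,
}
original = {
    "I": 4_988_969, "II": 2_747_897, "III": 2_379, "IV": 1_909_696,
    "V": 904_164, "VI": 61_693, "VII": 130_338,
}


def pct_change(new: int, old: int) -> float:
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0


total_overhead = pct_change(sum(taintart.values()), sum(original.values()))
memory_overhead = pct_change(taintart["I"], original["I"])

# Which instruction type accounts for most of the added instructions?
extra = {k: taintart[k] - original[k] for k in taintart}
largest = max(extra, key=extra.get)

print(f"total: {total_overhead:.1f}%, memory access: {memory_overhead:.1f}%")
print(f"largest increase: type {largest} (+{extra[largest]:,})")
```

The increase is concentrated in type II (data processing), while memory access instructions barely change.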


Evaluation – Microbenchmarks

Java microbenchmark: CaffeineMark 3.0
• 14% overhead overall
• scores 99.8% higher than the legacy Dalvik environment

Figure: CaffeineMark 3.0 Java microbenchmark. CaffeineMark 3.0 Score (0 to 150,000) by Benchmark Item of CaffeineMark 3.0 (Sieve, Loop, Logic, String, Float, Method, Overall). Series: Optimizing Compiler Backend, Quick Compiler Backend, Interpreter Only, TaintART, Dalvik VM (4.4). Bar labels for two series, each in item order: 49,826.5, 101,470.4, 124,526.2, 25,328.2, 38,948.3, 39,657.8, 53,926.7; and 49,084.1, 56,800.2, 110,183.4, 26,303.8, 35,675.7, 34,255.8, 46,306.9.


Evaluation – Others

• Memory microbenchmark: 0.4%
• IPC microbenchmark: 4%
• Compatibility evaluation

Figure: Memory Usage of CaffeineMark (kB, 0 to 25,000) vs. Elapsed Time of Launching CaffeineMark (seconds, 0 to 25). Series: ART (Optimizing Backend), TaintART.

Table 6: IPC Throughput Benchmark (10,000 pairs of messages).

Macrobenchmark Name    Original     TaintART     Overhead
Execution Time         2987 ms      3117 ms      4.35%
Memory (client)        51 572 kB    53 170 kB    3.10%
Memory (server)        38 812 kB    39 689 kB    2.26%


Summary

Tracking information flows
• dynamic taint analysis for the new Android RunTime
• register-based taint storage and compiler instrumentation
• evaluated with micro/macro benchmarks


Thank you!

Thank you.

Questions?


Backup Slides for TaintART – Taint Tag Storage

Taint tag spilling
• the register allocator will temporarily store extra variables in the main memory
