Understanding the Characteristics of Android Wear OS

Renju Liu, Felix Xiaozhu Lin

Purdue ECE

Presentation By: Pratik Jain

Motivation

Interactive wearables, like smart watches, are a newcomer to the spectrum of mobile computers.

Integrate computing even tighter with our daily lives.

Substantial increase in demand for smart watches.

Usage Patterns & Device Hardware

Users interact with wearable devices frequently throughout daily use.

Each interaction is short (< 10 s) and is dedicated to a simple task.

Due to the limited content that can be displayed on one screen, users spend a short time on one screen before switching to the next.

Tiny battery capacity (200 – 400 mAh)

Slower CPU – fewer cores

Simpler CPU – scaled-down but often architecturally identical to a handheld’s CPU

Android Wear OS

One of the most popular OSes for interactive wearables.

Wearable OS with the most public information.

Supports third-party applications and features a redesigned system UI, including Cards for notifications, Context streams, and voice input.

Apps
– Renovated UI
– Follow Android’s conventional programming paradigm
– Written in Java
– Compiled ahead-of-time
– Executed atop the managed Android Runtime

Major OS components
– System Server: key daemon hosting the core OS services
– Surface Flinger: daemon controlling UI animation
– Clockwork: OS shell that implements the system UI

Benchmark Scenarios

A benchmark suite of 15 benchmarks falling into the following 4 categories:

1. Wakeup – Due to internal or external events, the device transitions out of suspended mode and presents brief information. Because wakeups are frequent in daily use, energy efficiency is the most important metric.

2. Single input – A waking wearable device responds to a single input from the user. Because the user is waiting, the device needs to achieve low UI latency.

3. Continuous interaction – Users interact with the device continuously. The resultant UI animation requires the device to produce a steady stream of graphic frames.

4. Sensing – For the execution of wearable apps, sensor data is sampled and processed periodically to collect context information.

METHODOLOGY

Experimental Setup

All the benchmarks are run on 2 state-of-the-art Android Wear devices:

LG Watch R

Samsung Gear Live

Qualcomm’s APQ8026 system-on-chip

Android Wear 5.0 “Lollipop”

Power Management

Batteries have tiny contacts which are incompatible with commodity power monitors.

A compatible interface circuit is carved out from a smartphone battery.

Used the interface as an adapter between the smart watch and an external power monitor.

The battery interface carved out from Nexus 5

The interface (flipped) connected to the LG watch R

Toolset

Used the following to examine system behaviors at different levels and granularities

Systrace – for capturing global system events such as scheduling, I/O activities, and IPC

Android Runtime’s built-in function tracer – for recording function call history in individual processes

Linux perf – for sampling CPU performance counters.

Tackling profiling overhead

Event tracing is the major source of profiling overhead.

Memory overhead can be overwhelming when tracing function invocations.

2 ways are used to tackle the overhead:

In quantifying global system behaviors, the paper relies only on system events; function traces are collected in extra runs.

In quantifying function-level activities, a constant overhead of 4 µs is deducted from each traced function invocation.
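The constant-overhead correction can be illustrated with a minimal sketch. The 4 µs figure comes from the slide; the record layout, the function name `deduct_overhead`, the clamping at zero, and the sample durations are assumptions made for illustration, not the paper's tooling.

```python
from collections import defaultdict

# Constant per-invocation tracing overhead quoted on the slide.
TRACE_OVERHEAD_US = 4.0


def deduct_overhead(invocations, overhead_us=TRACE_OVERHEAD_US):
    """Sum corrected durations per function.

    invocations: iterable of (function_name, measured_duration_us).
    Each traced invocation has the constant overhead subtracted; results
    are clamped at zero (an assumption) so very short calls cannot go
    negative.
    """
    totals = defaultdict(float)
    for name, measured_us in invocations:
        totals[name] += max(0.0, measured_us - overhead_us)
    return dict(totals)


# Hypothetical measurements, in microseconds.
sample = [("setLight", 14.0), ("setLight", 9.0), ("getSimpleName", 3.0)]
corrected = deduct_overhead(sample)
```

A per-invocation correction like this matters most for short, frequently called functions, where the tracing cost can be comparable to the function's own runtime.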

CPU Usage

CPU usage is collected at two granularities:

Task-level breakdown. An analyzer is built to identify the tasks.

Function-level breakdown. To further locate the performance hot spots in System Server, the following 2 metrics are employed:

Exclusive CPU cycles are spent in the function’s own code.

Inclusive CPU cycles are spent in the function’s code as well as in all subroutines being called.

Both metrics include the time spent in both user and kernel spaces and do not cover the time when a task is off CPU due to being scheduled out.
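The difference between the exclusive and inclusive metrics can be sketched from a function trace of one thread. The event format, the function `cpu_breakdown`, and the traced names below are hypothetical; the sketch also assumes timestamps are per-thread CPU time, so off-CPU periods are already excluded, and that functions are not recursive (recursion would double-count inclusive time).

```python
from collections import defaultdict


def cpu_breakdown(events):
    """Compute inclusive and exclusive time per function.

    events: list of (timestamp, "enter" | "exit", function_name) for a
    single thread, properly nested and ordered by timestamp.
    """
    inclusive = defaultdict(float)
    exclusive = defaultdict(float)
    stack = []  # each frame: [name, enter_time, time_spent_in_callees]
    for ts, kind, name in events:
        if kind == "enter":
            stack.append([name, ts, 0.0])
            continue
        frame_name, start, callee_time = stack.pop()
        if frame_name != name:
            raise ValueError("unbalanced trace at %r" % (name,))
        total = ts - start
        inclusive[name] += total                # own code plus all callees
        exclusive[name] += total - callee_time  # own code only
        if stack:
            stack[-1][2] += total               # charge the caller's callee time
    return dict(inclusive), dict(exclusive)


# Hypothetical trace: layout() calls measure() and then draw().
trace = [
    (0, "enter", "layout"),
    (2, "enter", "measure"),
    (5, "exit", "measure"),
    (6, "enter", "draw"),
    (9, "exit", "draw"),
    (10, "exit", "layout"),
]
inc, exc = cpu_breakdown(trace)
```

Here `layout` has an inclusive time of 10 units but only 4 units of exclusive time, because 6 units are spent in its two callees.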

Idle Time Analysis

The amount and duration of the observed idle episodes are unusual.

Some idle episodes match system events known to cause idle, e.g. I/O and power management.

Others are often rooted in OS services stalling while serving apps’ requests.

IdleChecker is an analyzer that helps map anomalous idle episodes to the responsible code regions, based on a simple rationale:

The function calls and IPC transactions spanning an anomalous idle episode are suspicious.

IdleChecker runs the following steps for each idle episode:

1. Identifies suspicious app tasks that are blocked throughout the entire idle episode but run after the episode.

2. For each suspicious task, identifies two suspicious CPU time quanta: the one right before the idle episode and the one right after it.

3. Examines the suspicious quanta, looking for IPC transactions spanning the idle episode.

4. Identifies the function invocations that either coincide with the IPC or span the idle episode.
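The IdleChecker steps above can be sketched roughly as follows. This is not the authors' implementation: the data layout (time intervals as tuples), the function `idle_checker`, and all task and function names are assumptions chosen to make the four steps concrete.

```python
def idle_checker(idle, quanta, ipcs, calls):
    """Map one idle episode to suspicious function invocations.

    idle:   (start, end) of the idle episode
    quanta: {task: [(start, end), ...]} CPU time quanta per task
    ipcs:   [(task, start, end), ...] IPC transactions
    calls:  [(task, function_name, start, end), ...] function invocations
    """
    idle_start, idle_end = idle
    report = {}
    for task, runs in quanta.items():
        # Step 1: the task is blocked during the episode but runs after it.
        before = [q for q in runs if q[1] <= idle_start]
        after = [q for q in runs if q[0] >= idle_end]
        during = [q for q in runs if q[0] < idle_end and q[1] > idle_start]
        if during or not before or not after:
            continue
        # Step 2: the quanta right before and right after the episode.
        q_before = max(before, key=lambda q: q[1])
        q_after = min(after, key=lambda q: q[0])
        # Step 3: IPCs that start in the first quantum and end in the second.
        spanning = [
            (s, e) for t, s, e in ipcs
            if t == task
            and q_before[0] <= s <= q_before[1]
            and q_after[0] <= e <= q_after[1]
        ]
        # Step 4: calls that overlap such an IPC or span the whole episode.
        suspects = set()
        for t, name, s, e in calls:
            if t != task:
                continue
            spans_idle = s <= idle_start and e >= idle_end
            with_ipc = any(s < ipc_e and e > ipc_s for ipc_s, ipc_e in spanning)
            if spans_idle or with_ipc:
                suspects.add(name)
        report[task] = sorted(suspects)
    return report


# Hypothetical trace data; times are in milliseconds.
result = idle_checker(
    idle=(100, 180),
    quanta={"app": [(90, 100), (180, 190)], "logger": [(50, 60)]},
    ipcs=[("app", 98, 182)],
    calls=[
        ("app", "requestLayout", 95, 185),
        ("app", "sendRequest", 97, 99),
        ("app", "onDraw", 186, 189),
    ],
)
```

In this example `logger` is ignored because it never runs after the episode, and `onDraw` is not reported because it neither overlaps the spanning IPC nor spans the idle episode.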

Thread-level Parallelism

Metric widely used for gauging an interactive system’s need for core count.

Average number of busy CPU cores during the non-idle time.

$TLP = \frac{\sum_{i=1}^{n} c_i \cdot i}{1 - c_0}$

$c_0$ - fraction of the total time when no threads are running

$c_i$ - fraction of time when exactly $i$ threads are running simultaneously

$n$ - number of cores available.

For measurement, all 4 cores are forced online
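As a worked example of the TLP metric (the average number of busy cores during non-idle time), the sketch below computes it from a list of time slices. The sample format, the function `thread_level_parallelism`, and the numbers are assumptions for illustration; the symbols c_0 and c_i in the comments denote the fraction of time with zero and with exactly i running threads.

```python
def thread_level_parallelism(samples, n_cores=4):
    """Return the average number of busy cores during non-idle time.

    samples: list of (duration, running_threads) slices covering the
    measurement window. Computes sum(c_i * i for i = 1..n) / (1 - c_0).
    """
    total = sum(duration for duration, _ in samples)
    if total <= 0:
        raise ValueError("need a positive amount of observed time")
    c = [0.0] * (n_cores + 1)
    for duration, running in samples:
        # At most n_cores threads can run at the same time.
        c[min(running, n_cores)] += duration / total
    busy = 1.0 - c[0]
    if busy <= 0.0:
        return 0.0  # the window was entirely idle
    return sum(i * c[i] for i in range(1, n_cores + 1)) / busy


# Hypothetical window of 100 ms: 50 ms idle, 30 ms with one running
# thread, 15 ms with two, and 5 ms with four.
tlp = thread_level_parallelism([(50, 0), (30, 1), (15, 2), (5, 4)])
```

For this window the numerator is 0.3 + 0.3 + 0.2 = 0.8 and the non-idle fraction is 0.5, giving a TLP of 1.6.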

Microarchitectural behaviors

Microarchitecture design is a mystery.

Using Linux perf, the paper samples the performance counters of the Cortex-A7 CPU on the test devices.

It observes branch prediction, cache, and TLB behavior in all benchmarks.

RESULTS

Where do CPU cycles go?

Intensive OS execution often dominates the global CPU usage.

Many costly OS services are likely to make software unnecessarily complicated

The CPU time distribution of hot functions is highly skewed.

Manipulating basic data structures consumes substantial CPU cycles.

Legacy OS functions may become serious performance bottlenecks

OS execution bottlenecks: setLight(), Layout(), computeOom(), getSimpleName()

Idle Episodes

Plentiful and of a variety of lengths

Improper OS designs
– Interference from voice UI
– Legacy support for device suspending

Performance overprovision during continuous interaction

Design implications
– Hunting OS inefficiencies
– Filling idle time with useful work
– Reducing CPU & GPU clock rates, which will shrink idle episodes
– Predictive execution

Thread-level parallelism

Short interactions exhibit substantial TLP, which is on par with desktop workloads.

While apps are mostly single-threaded, OS daemons contribute to TLP significantly.

A wearable device needs at least two cores.

Microarchitectural behaviors

A significant mismatch exists between the OS and CPU microarchitecture, particularly in L1 icache, iTLB, and branch predictor.

The mismatch is largely due to the OS code complexity, and will not be eliminated by a unilateral enhancement of wearable CPU.

OS should be trimmed down to match the simplicity of its apps.

Related Work

Gao et al. find that smartphone workloads show limited TLP, concluding that they need no more than two cores.

ProfileDroid contributes an approach for characterizing smartphone apps at multiple layers.

Min et al. study the battery usage of smart watches.

WearDrive creates synthetic benchmarks to shed light on wearable storage.

RisQ and TypingRing target gesture recognition.

iShadow tracks gaze in real time.

Ha et al. build wearables for cognitive assistance.

Cornelius et al. focus on user identification.

Recap

In-depth analysis of one of the most popular wearable OSes, Android Wear.

Examination of 4 key aspects: CPU usage, idle episodes, TLP, and micro-architectural behaviors – in fifteen benchmarks.

Discovery of serious OS inefficiencies and system bottlenecks that were widespread but unknown before.

The results clearly point out the system bottlenecks for immediate optimization and have strong implications on future wearable system software and hardware design.

THANK YOU!
