• Given any normal voice command, we obtain the attack ultrasound with
the following steps.
• We adopt amplitude modulation in step 3 and add the same carrier wave
in the final step.
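The four steps can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the sampling rates, the 4 kHz cutoff, the 30 kHz carrier, and all function names are assumptions chosen for the example, and the filter and resampler are idealized FFT-based versions.

```python
import numpy as np

def lowpass_fft(x, fs, cutoff_hz):
    """Step 1: ideal (brick-wall) low-pass filter applied in the frequency domain."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def upsample_fft(x, factor):
    """Step 2: band-limited upsampling by zero-padding the spectrum.

    Assumes x has already been low-pass filtered below its Nyquist frequency.
    """
    n = len(x)
    spectrum = np.fft.rfft(x)
    padded = np.zeros(n * factor // 2 + 1, dtype=complex)
    padded[: len(spectrum)] = spectrum
    # irfft normalizes by the new length, so rescale to keep the amplitude.
    return np.fft.irfft(padded, n=n * factor) * factor

def make_attack_signal(voice, fs_in=16_000, factor=12,
                       cutoff_hz=4_000, carrier_hz=30_000):
    """Turn a voice command into an ultrasound signal (all parameters hypothetical)."""
    fs_out = fs_in * factor
    filtered = lowpass_fft(voice, fs_in, cutoff_hz)       # step 1
    upsampled = upsample_fft(filtered, factor)            # step 2
    peak = np.max(np.abs(upsampled))
    if peak > 0:
        upsampled = upsampled / peak                      # keep |m(t)| <= 1
    t = np.arange(len(upsampled)) / fs_out
    carrier = np.cos(2 * np.pi * carrier_hz * t)
    modulated = upsampled * carrier                       # step 3: amplitude modulation
    attack = modulated + carrier                          # step 4: add the same carrier
    return attack, fs_out
```

With these example values the modulated sidebands span 26 kHz to 34 kHz, so the whole signal stays above the 20 kHz limit of human hearing.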
Inaudible Voice Commands
Liwei Song, Prateek Mittal
Department of Electrical Engineering, Princeton University
• A typical microphone consists of a transducer, an amplifier, a low-pass
filter, and an analog-to-digital converter.
• The transducer and the amplifier are not perfectly linear modules,
resulting in the following non-linear function [2].
$$S_{out} = \sum_{i=1}^{\infty} G_i S_{in}^i = G_1 S_{in} + G_2 S_{in}^2 + \cdots$$
• We can leverage the microphone’s inherent non-linearity to obtain
normal voice frequencies from the processing of ultrasound frequencies.
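As a worked illustration of this point, consider only the second-order term of the non-linear function. The symbols m(t) (the voice command) and f_c (the ultrasound carrier frequency) are introduced here for the example and are not named on the poster; the input is assumed to be the amplitude-modulated signal with the carrier added.

```latex
% Assumed input: voice command m(t) amplitude-modulated onto a carrier at f_c,
% with the same carrier added
S_{in}(t) = \bigl(1 + m(t)\bigr)\cos(2\pi f_c t)

% Second-order term of the non-linear function
G_2 S_{in}^2(t) = G_2 \bigl(1 + m(t)\bigr)^2 \cos^2(2\pi f_c t)
                = \frac{G_2}{2}\bigl(1 + 2m(t) + m^2(t)\bigr)\bigl(1 + \cos(4\pi f_c t)\bigr)

% Components that survive the microphone's low-pass filter
\frac{G_2}{2}\bigl(1 + 2m(t) + m^2(t)\bigr)
    = \frac{G_2}{2} + G_2\, m(t) + \frac{G_2}{2}\, m^2(t)
```

The linear term G_1 S_in and the component at 2 f_c stay in the ultrasound band and are removed by the low-pass filter, while the term G_2 m(t) is a scaled copy of the voice command at normal voice frequencies. Higher-order terms are ignored in this sketch.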
Motivation
• Voice assistants are becoming increasingly popular
in IoT devices.
• Previous attacks on voice assistants leverage the
gap between speech recognition systems and
human voice perception [1].
• The limitation is that attack sounds are audible and
conspicuous to device owners.
• Can we inaudibly control voice assistants?
Typical Diagram of a Microphone
1. Low-Pass Filtering
2. Upsampling
3. Ultrasound Modulation
4. Carrier Wave Addition
• Attack demonstration of the command “OK Google, take a picture”.
• Attack ranges for two devices with different input powers.
Input Power (W)      9.2   11.8   14.8   18.7   23.7
Range, Phone (cm)    222    255    277    313    354
Range, Echo (cm)     145    168    187    213    239
• You can scan the QR code in the title to see our attack demo.
Attack Scenario for Inaudible Voice Commands
[1] N. Carlini et al., “Hidden voice commands”, USENIX Security, 2016.
[2] N. Roy et al., “Backdoor: making microphones hear inaudible sounds”, MobiSys, 2017.
[3] G. Zhang et al., “DolphinAttack: inaudible voice commands”, CCS, 2017 (concurrent work).
Attack Algorithm
Attack Overview
Non-Linearity Insight
Attack Experiments
References
Apple Siri
Amazon Alexa
Google Assistant
• Challenge: How to design inaudible attacks?
• Solution: We transmit ultrasounds (frequencies above 20 kHz) to attack
victim devices.
• Challenge: How to control voice assistants?
• Solution: We exploit the non-linearity of the microphone to convert
ultrasounds into normal voice commands.
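The conversion from ultrasound to an audible-band signal can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real device: the gains G1 and G2, the 30 kHz carrier, the 1 kHz test tone standing in for a voice command, the 8 kHz filter cutoff, and the function name are all hypothetical, and the microphone is truncated to its first two non-linear terms.

```python
import numpy as np

FS = 192_000          # simulation sampling rate (assumption)
CARRIER_HZ = 30_000   # ultrasound carrier (assumption)
TONE_HZ = 1_000       # test tone standing in for a voice command
G1, G2 = 1.0, 0.1     # hypothetical linear and second-order gains

def simulate_microphone(s_in, fs, g1=G1, g2=G2, cutoff_hz=8_000):
    """Apply S_out = G1*S_in + G2*S_in^2, then an ideal low-pass filter."""
    nonlinear = g1 * s_in + g2 * s_in ** 2
    spectrum = np.fft.rfft(nonlinear)
    freqs = np.fft.rfftfreq(len(nonlinear), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(nonlinear))

# Transmitted signal: amplitude-modulated command plus the carrier itself.
t = np.arange(FS // 10) / FS
command = 0.5 * np.sin(2 * np.pi * TONE_HZ * t)
ultrasound = (1.0 + command) * np.cos(2 * np.pi * CARRIER_HZ * t)

# What the simulated microphone records after its low-pass filter.
recorded = simulate_microphone(ultrasound, FS)

# Locate the strongest non-DC component of the recording.
spectrum = np.abs(np.fft.rfft(recorded))
freqs = np.fft.rfftfreq(len(recorded), d=1.0 / FS)
non_dc = freqs > 0
peak_hz = freqs[non_dc][np.argmax(spectrum[non_dc])]
print(f"strongest recorded component: {peak_hz:.0f} Hz")
```

Although the transmitted signal contains only components between 29 kHz and 31 kHz, the strongest non-DC component of the simulated recording lies at the test-tone frequency. With g2 set to 0 the recording is essentially silent, which is why the attack depends on the non-linearity.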
Non-linear function
Non-Linearity of the Microphone