Machine learning techniques for sidechannel analysis · Machine learning techniques as side-channel attacks • Input: data, labels ๏ points of interest from the measurement trace

Machine learning techniques for side-

channel analysisAnnelie Heuser

• Side-channel analysis and its terminology

• Dictionary: Side-channel to Machine learning

• When can machine learning be helpful?

• New application: semi-supervised learning

Outline

Side-channel analysis

Alice Bob

cryptography


Alice Bob

cryptography


Alice Bob

cryptography

Time Sound electromagnetic emanation

Side-channel information


Alice Bob

cryptography

Time Sound electromagnetic emanation

Side-channel information

secret key / sensitive data

• Measurement setup๏ AES 128๏ attacking SBox

operation in the first round

๏ key enumeration feasible


<latexit sha1_base64="M1TlK1rDxVLQnSYQCH6W9bmGYNY=">AAACFHicbZDLSgMxFIYzXmu9VV26CRahKpQZEXRZdOOyor1Ap5ZMmrahmWRIzkjL0Idw46u4caGIWxfufBvTaRfa+kPgy3/OITl/EAluwHW/nYXFpeWV1cxadn1jc2s7t7NbNSrWlFWoEkrXA2KY4JJVgINg9UgzEgaC1YL+1bhee2DacCXvYBixZki6knc4JWCtVu4E+8AGAJDcBmowKqS3JBKEyzGNsK8iERvcvz8+auXybtFNhefBm0IeTVVu5b78tqJxyCRQQYxpeG4EzYRo4FSwUdaPDYsI7ZMua1iUJGSmmaRLjfChddq4o7Q9EnDq/p5ISGjMMAxsZ0igZ2ZrY/O/WiOGzkUz4TKKgUk6eagTCwwKjxPCba4ZBTG0QKjm9q+Y9ogmFGyOWRuCN7vyPFRPi55b9G7O8qXLaRwZtI8OUAF56ByV0DUqowqi6BE9o1f05jw5L8678zFpXXCmM3voj5zPH5Kpnxs=</latexit>

• Hamming weight model

• Electromagnetic emanation easily distinguishable regarding HW of intermediate operation


−0.400 −0.350 −0.300 −0.2500

0.02

0.04

0.06

0.08

0.1

0.12

Power Consumption At Time A

Pro

ba

bili

ty

HW 0

HW 1

HW 2

HW 3

HW 4

HW 5

HW 6

HW 7

HW 8

Side-channel attacks• …are real in practice • Beginning 2016: FBI asks Apple to

bypass their encryption

• Handful methods to break into the encrypted iPhone

๏ software bugs ๏ side-channel attacks ๏ glitch attack ๏ invasive attacks

Documents released by Snowden: NSA is studying the use of side-channel attacks to break into iPhones

Side-channel attacks• …are real in practice

• attacking Philips Hue smart lamps

• side-channel attack revealed the global AES-CCM key used to encrypt and verify firmware updates

• insert malicious update: lamps infect each other with a worm that has the potential to control the device

Paper: Eyal Ronen et al, IoT Goes Nuclear: Creating a ZigBee Chain

Reaction

Side-channel analysis• unprofiled / profiled

• measurement traces

• point in time / point of interest

• intermediate operations / sensitive variable / leakage model

• distinguisher / attack

• success rate / guessing entropy



















Side-channel analysis to machine learning

• unprofiled / profiled unsupervised / supervised

• measurement trace data

• point in time / point of interest feature

• intermediate operations / sensitive variable / leakage model label

• distinguisher / attack classification algorithm

• success rate / guessing entropy accuracy

Machine learning techniques as side-channel attacks

• Input: data, labels๏ points of interest from the measurement trace๏ leakage models (intermediate operations)

• Mainly in supervised scenarios: support vector machines, random forest, Naive Bayes, deep learning




• Nature of side-channel leakage is still unknown





• Advantages:๏ suitable in “unperfect scenarios”๏ more resistant to imprecisions





• Advantages:๏ suitable in “unperfect scenarios”๏ more resistant to imprecisions

• Disadvantages:๏ time / computational complexity (additional tuning)๏ community is not yet trusting (“only empirical results”)

New perspective• Traditional profiled scenario: realistic?

New perspective• Semi-supervised learning: a more realistic assessment

Semi-supervised learning• Techniques for label prediction:

๏ self-training

๏ label spreading

• Classification algorithms:

๏ Support vector machines

๏ Naive Bayes

๏ Template attack (standard + pooled version)


๏ self-training

๏ label spreading



๏ Naive Bayes


• Datasets (13 k in total)๏ 100 labeled + 12.9k unlabeled๏ 500 labeled + 12.5k unlabeled๏ 1k labeled + 12k unlabeled๏ 3k labeled + 10k unlabeled๏ 5k labeled + 8k unlabeled๏ 10k labeled + 3k unlabeled


๏ self-training

๏ label spreading



๏ Naive Bayes


• Datasets (13 k in total)๏ 100 labeled + 12.9k unlabeled๏ 500 labeled + 12.5k unlabeled๏ 1k labeled + 12k unlabeled๏ 3k labeled + 10k unlabeled๏ 5k labeled + 8k unlabeled๏ 10k labeled + 3k unlabeled

Semi-supervised learning• Supervised

• Semi-supervised<latexit sha1_base64="ULIfhbKwC1zGEczIHcyNER2tPJg=">AAADvXicbZLbbtNAEIadhEMTDk3hkpsVKRFXq12ffUVbbhASqKhpWqmOovVmk67ik+w1KJg8Gg/CG/AYjGMnPTGSvePZ75/9Nd4gDWWuCPnTancePX7ydK/be/b8xcv9/sGrcZ4UGRfnPAmT7DJguQhlLM6VVKG4TDPBoiAUF8HyY7V/8V1kuUzikVqlYhKxRSznkjMFpelB67cfiIWMS8WCImTZukxLgj0erX+lJcUOJLfW3Xvd63a7Z/KnQEO0ja8nNx9n4y/1x+gY1tHx4TQ9RMj3ka+SNCtCUekpIUAMkU2xRWHxMHFARLBBKrFpYZOCpkKtLWphzxgiV8cOqVGzQm0Xyqhh6RLVrIldaOva2LKBodgwKtYxMDUb1NiiDtahn0c2CmRhvUahg711cIO60M7TMakc6NitUQs73tYAWdaoi4kHqIF1uzbrbVA4bGuAGjvUsDeoAwagmUEb1DE3Y4vkrBmbL+LZ7m9N+wMCRqpADxPaJAOtidNp/68/S3gRiVjxkOX5FSWpmpQsU5KHYt3zi1ykjC/ZQlxBGrNI5JNyc8/W6B1UZmieZPDECm2qtxUli/J8FQVARkxd5/f3quL/9q4KNXcnpYzTQomY1wfNixCpBFWXFs1kJrgKV5Awnknwivg1yxhXcLXvnBIkyRKGk697MBp6fxAPk7GOKcH0mz44OmmGtKe90d5q7zWqOdqR9kk71c413j5sf26ftUedDx3RCTtxjbZbjea1dic6P/4BL4jz8g==</latexit>

<latexit sha1_base64="btLmPrbIaBm1cd/t4w12AiLNTEY=">AAAFSnichZRLb9QwEMfT7hba8mrhyMWiZYVUYdl5OMmtlAtCAhX1KTVV5WS9bZSnEgephHwovgd3xA1unLkhLoyT3e1uW4mRIk/Gf/9mPHbi53FYSkK+Lyz2+kt37i6vrN67/+Dho7X1x4dlVhWBOAiyOCuOfV6KOEzFgQxlLI7zQvDEj8WRH71W80cfRVGGWbovL3NxmvDzNByFAZcQOlvvvV3xfHEeprXkfhXzoqnzmmIWJM1n5djgXI0EuzPjfwXN6srKXvhJILABer+DJjZAe4fvrl72X115m2f5Jpou6IStYm7yNvM85MksL6pYQN4BBJIqliH0qErS2mzqoKkRdGr0UhY8TMP0vFFZbpXF3BcxKlUrh51Q0ZNwOKZTQraojt1IESwXE3OAGAz6ANWWgx23gXjNbGxZTbcZpLuYsQHSTezCQB1sMwjrFBPH8wBpdUirRTKGqTVAjo4dqpAMM71F2gw7bIJkFrZg2qGYGvCqMtu2koHOxKbbtGAaATca95Q5rdaxMSWAM02skw7sYN2Zgm1sK5WJCWtpoG6bBektq6UaQCUzVAO27hJsmapcAxqiqAAl7uRslUp1wKXYacuFTbI2uW1jk3ZNiLacCF1RiQN6HSpTtTJsTqiGNUu13SnVwLqJOqRroW7/JNoyZiplNqgNbCmmbcAxjEtw6RwTdSpgmjUcnTlR2bM3wRPpcPrlnK1tEExaQzcdOnY2tLHtnq398oZZUCUilUHMy/KEklye1ryA6xiLZtWrSpHzIOLn4gTclCeiPK3bj75BzyEyRKOsgCeVqI3Orqh5UpaXiQ/KhMuL8vqcCt42d1LJkXNah2leSZEGXaJRFSOZIfUHQcOwEIGML8HhQRFCrSi44AUPJPxn5rL4WRZBc8pmFVpDrzfipnOowz3D9IO+sb0zbtKy9lR7pr3QqGZr29obbVc70ILel9633o/ez/7X/u/+n/7fTrq4MF7zRJuzpf4/IB1fgg==</latexit>

Semi-supervised learning• Influence of noise: in high noise scenarios this approach may not be

beneficial, optimal signal-to-noise ratio?

• Number of measurements: in which restricted (practical) scenarios semi-supervised learning is beneficial?

• Number of classes: what is the amount of classes where label predictions can be beneficial?

• Misclassification of labels: how to limit misclassification of labels? which algorithms cope best with misclassification?

• Generalisation of models: What are the benefits when considering leakage measurements from different devices and therefore with different leakage distributions?

Questions?

[email protected]

mailto:[email protected]

Countermeasures

Countermeasures• Implementation level countermeasures:

๏ masking: adding additional randomness๏ hiding: adding additional noise, decreasing signal-to-noise ratio๏ drawbacks: high implementation overhead, depending on

cryptographic primitive

Countermeasures• Implementation level countermeasures:

๏ masking: adding additional randomness๏ hiding: adding additional noise, decreasing signal-to-noise ratio๏ drawbacks: high implementation overhead, depending on

cryptographic primitive

• Protocol level: Leakage-Resilience๏ modelling attackers with the capability to monitor side-channel

information๏ require a side-channel secure initialization in order to obtain a

fresh session key for every cryptographic operation๏ drawback: mostly theoretical, not been tested (thoroughly) in

practice yet