Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
Breaking up the black boxErlang on Embedded Systemsby Magnus Feuer, CTO Feuerlabs
Monday, August 27, 2012
Breaking up the black box:
1. The problem
2. Insights
3. The tool
4. Feuerlabs
Monday, August 27, 2012
1. The problem
Monday, August 27, 2012
Case Study: OEM recall
Monday, August 27, 2012
Case Study: OEM recall• Millions of vehicles to a shop
Monday, August 27, 2012
Case Study: OEM recall• Millions of vehicles to a shop
• Many are safety critical
Monday, August 27, 2012
Case Study: OEM recall• Millions of vehicles to a shop
• Many are safety critical
• Billions in cost
Monday, August 27, 2012
Case Study: OEM recall• Millions of vehicles to a shop
• Many are safety critical
• Billions in cost
• Negative brand impact
Monday, August 27, 2012
Fault searching
Monday, August 27, 2012
Fault searching• Software/Hardware/Mechanical?
Monday, August 27, 2012
Fault searching• Software/Hardware/Mechanical?
• Months of uncertainty
Monday, August 27, 2012
Fault searching• Software/Hardware/Mechanical?
• Months of uncertainty
• Very little data to go on
Monday, August 27, 2012
Fault searching• Software/Hardware/Mechanical?
• Months of uncertainty
• Very little data to go on
• Engineering pressure
Monday, August 27, 2012
Obstacles
Monday, August 27, 2012
Obstacles• Fault recreation is almost impossible
Monday, August 27, 2012
Obstacles• Fault recreation is almost impossible
• No way to monitor the vehicles in the field
Monday, August 27, 2012
Obstacles• Fault recreation is almost impossible
• No way to monitor the vehicles in the field
• Operating in a black-box environment
Monday, August 27, 2012
Root cause
Source: http://spectrum.ieee.org/green-tech/advanced-cars/this-car-runs-on-code/
Monday, August 27, 2012
Root cause• The industry is approaching server-level
complexity in embedded system
Source: http://spectrum.ieee.org/green-tech/advanced-cars/this-car-runs-on-code/
Monday, August 27, 2012
Root cause• The industry is approaching server-level
complexity in embedded system
• Still using embedded tools
Source: http://spectrum.ieee.org/green-tech/advanced-cars/this-car-runs-on-code/
Monday, August 27, 2012
Root cause• The industry is approaching server-level
complexity in embedded system
• Still using embedded tools
• Mercedes Benz navigation system – 20M SLOC
Source: http://spectrum.ieee.org/green-tech/advanced-cars/this-car-runs-on-code/
Monday, August 27, 2012
Root cause• The industry is approaching server-level
complexity in embedded system
• Still using embedded tools
• Mercedes Benz navigation system – 20M SLOC
• 50+ Networked Control Units
Source: http://spectrum.ieee.org/green-tech/advanced-cars/this-car-runs-on-code/
Monday, August 27, 2012
It will get worse:
Monday, August 27, 2012
It will get worse:• Software becoming a di!erentiator
Monday, August 27, 2012
It will get worse:• Software becoming a di!erentiator
• Device Segmentation
Monday, August 27, 2012
It will get worse:• Software becoming a di!erentiator
• Device Segmentation
• Multicore is coming with threading
Monday, August 27, 2012
It will get worse:• Software becoming a di!erentiator
• Device Segmentation
• Multicore is coming with threading
• Connected devices are becoming mandatory
Monday, August 27, 2012
2. The insight
Monday, August 27, 2012
Lessons learned in the field
Monday, August 27, 2012
Lessons learned in the field• Realized that we needed to break up
the black box
• This is a distributed system
• Both server and device stack needed
Monday, August 27, 2012
Cascading requirements
Monday, August 27, 2012
Cascading requirements
Staging
Key Revocation
Transfer
Key Rotation
Authentication
Threading
Tra!c O
verride
Rollback
Encryption
Encryption
Simulations vs. H
ardware
Binary Delta Creation
Dependency R
esolve
Watchdog Trigger
Dom
ain Based Unit Testing
Record-R
eplay
Provisioning
Race Conditions
Versioning
Cascading Rollback
Regression w
/ Live Data
Sandboxing
Load Balancing
Tra!c Throttling
Validation
Access Control
Core A!
nity
Rating Plans
Deploym
ent
Pooled Plans
Testing
Package Mgm
t.
Authentication
Third Party KeysValidation
Device/Server keys
Atom
ic Update
Key ChainsR
ollback
Configuration Segmentation Data Budget FOTA Multi-core
Monday, August 27, 2012
3. The tool
Monday, August 27, 2012
Why Erlang?
Monday, August 27, 2012
Why Erlang?• Open Source Software
Monday, August 27, 2012
Why Erlang?• Open Source Software
• Lightweight processes
Monday, August 27, 2012
Why Erlang?• Open Source Software
• Lightweight processes
• Message based
Monday, August 27, 2012
Why Erlang?• Open Source Software
• Lightweight processes
• Message based
• Created by Ericsson to handle telco infrastructure
Monday, August 27, 2012
Erlang usage:
1) http://www.trifork.com/cases/cloud/vocalink2) http://www.drdobbs.com/parallel/using-erlang-to-build-reliable-fault-tol/220600332?cid=RSSfeed_DDJ_HighPerformanceComputing3) http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf4) https://github.com/blog/112-supercharged-git-daemon5) http://www.erlang.org/doc.html
Monday, August 27, 2012
Erlang usage:- Vocalink Real-Time Payment Switch [1]
1) http://www.trifork.com/cases/cloud/vocalink2) http://www.drdobbs.com/parallel/using-erlang-to-build-reliable-fault-tol/220600332?cid=RSSfeed_DDJ_HighPerformanceComputing3) http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf4) https://github.com/blog/112-supercharged-git-daemon5) http://www.erlang.org/doc.html
Monday, August 27, 2012
Erlang usage:- Vocalink Real-Time Payment Switch [1]
- Yahoo [2]
1) http://www.trifork.com/cases/cloud/vocalink2) http://www.drdobbs.com/parallel/using-erlang-to-build-reliable-fault-tol/220600332?cid=RSSfeed_DDJ_HighPerformanceComputing3) http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf4) https://github.com/blog/112-supercharged-git-daemon5) http://www.erlang.org/doc.html
Monday, August 27, 2012
Erlang usage:- Vocalink Real-Time Payment Switch [1]
- Yahoo [2]
- Facebook chat [3]
1) http://www.trifork.com/cases/cloud/vocalink2) http://www.drdobbs.com/parallel/using-erlang-to-build-reliable-fault-tol/220600332?cid=RSSfeed_DDJ_HighPerformanceComputing3) http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf4) https://github.com/blog/112-supercharged-git-daemon5) http://www.erlang.org/doc.html
Monday, August 27, 2012
Erlang usage:- Vocalink Real-Time Payment Switch [1]
- Yahoo [2]
- Facebook chat [3]
- Github [4]
1) http://www.trifork.com/cases/cloud/vocalink2) http://www.drdobbs.com/parallel/using-erlang-to-build-reliable-fault-tol/220600332?cid=RSSfeed_DDJ_HighPerformanceComputing3) http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf4) https://github.com/blog/112-supercharged-git-daemon5) http://www.erlang.org/doc.html
Monday, August 27, 2012
Erlang usage:- Vocalink Real-Time Payment Switch [1]
- Yahoo [2]
- Facebook chat [3]
- Github [4]
- Telco (AXD301)[5]
1) http://www.trifork.com/cases/cloud/vocalink2) http://www.drdobbs.com/parallel/using-erlang-to-build-reliable-fault-tol/220600332?cid=RSSfeed_DDJ_HighPerformanceComputing3) http://www.erlang-factory.com/upload/presentations/31/EugeneLetuchy-ErlangatFacebook.pdf4) https://github.com/blog/112-supercharged-git-daemon5) http://www.erlang.org/doc.html
Monday, August 27, 2012
The key factor enabling Erlang
Monday, August 27, 2012
The key factor enabling ErlangEmbedded systems today are as powerful as servers 15 years ago
Monday, August 27, 2012
The key factor enabling ErlangEmbedded systems today are as powerful as servers 15 years ago
1998 Server 2011 Embedded
Compaq ProLiant 7000 Embest SBC6045
Pentium II Xeon 450 Mhz Atmel AT91SAM9G45 400 Mhz
256MB RAM 256MB RAM
6GB disk (140Mbit/sec) 32GB SD card (115Mbit/sec)
Monday, August 27, 2012
Use speed to build robustness
Monday, August 27, 2012
Use speed to build robustness• Telco’s 90s needs are today’s embedded
requirements
Monday, August 27, 2012
Use speed to build robustness• Telco’s 90s needs are today’s embedded
requirements
• VMs are no longer too resource intensive
Monday, August 27, 2012
Use speed to build robustness• Telco’s 90s needs are today’s embedded
requirements
• VMs are no longer too resource intensive
• High-level languages allow multicore utilization
Monday, August 27, 2012
Why Erlang is better than X
Monday, August 27, 2012
Why Erlang is better than X• Built for communication
Monday, August 27, 2012
Why Erlang is better than X• Built for communication
• Runtime upgrade
Monday, August 27, 2012
Why Erlang is better than X• Built for communication
• Runtime upgrade
• Graceful failure
Monday, August 27, 2012
Why Erlang is better than X• Built for communication
• Runtime upgrade
• Graceful failure
• Enables replay of input data
Monday, August 27, 2012
Solving the OEM recall case
Monday, August 27, 2012
Solving the OEM recall case• Design a probe to be deployed
Monday, August 27, 2012
Solving the OEM recall case• Design a probe to be deployed
• Push probe over the air
Monday, August 27, 2012
Solving the OEM recall case• Design a probe to be deployed
• Push probe over the air
• Harvest log and capture data OTA
Monday, August 27, 2012
Solving the OEM recall case• Design a probe to be deployed
• Push probe over the air
• Harvest log and capture data OTA
• Replay in lab to recreate fault
Monday, August 27, 2012
Solving the OEM recall case• Design a probe to be deployed
• Push probe over the air
• Harvest log and capture data OTA
• Replay in lab to recreate fault
• Design and deploy fix (if applicable)
Monday, August 27, 2012
Probe deployment 1-2-3
Monday, August 27, 2012
Probe deployment Step 1: Push probe to vehicles
Monday, August 27, 2012
Probe deployment Step 1: Push probe to vehicles
Telematics Server
TPS MAFEngine Controller
Telematics Device
Battery Controller
Monitor
Monitor package
CAN
bus
Monday, August 27, 2012
Probe deploymentStep 2: Fault detection
Telematics Server
TPS MAFEngine Controller
Telematics Device
Battery Controller
Monitor
CAN
bus TPS
Monday, August 27, 2012
Probe deploymentStep 3: Capture and report
Telematics Server
TPS MAFEngine Controller
Telematics Device
Battery Controller
Monitor
CAN
bus TPS
Monday, August 27, 2012
Inside the vehicle1. Normal operations
Monday, August 27, 2012
Inside the vehicle1. Normal operations
Engine Control Unit
CAN Bus
Transmission Control Unit
Throttle Position Sensor
Erlang Probe
CAN Chipset
Monday, August 27, 2012
Inside the vehicle1. Normal operations
Engine Control Unit
CAN Bus
Transmission Control Unit
Throttle Position Sensor
Erlang Probe
CAN Chipset
Monday, August 27, 2012
Inside the vehicle2. Capture TPS data
Engine Control Unit
CAN Bus
Transmission Control Unit
Throttle Position Sensor
Erlang Probe
CAN Chipset
Capture DBData + Time stamp
Monday, August 27, 2012
Recreate fault in HQ Lab3. Replay data to test harness
Monday, August 27, 2012
Recreate fault in HQ Lab3. Replay data to test harness
Capture DBData + Time stamp
CAN Bus
Transmission Control Unit
Erlang Probe
CAN Chipset
Monday, August 27, 2012
Recreate fault in HQ Lab3. Replay data to test harness
Capture DBData + Time stamp
CAN Bus
Transmission Control Unit
Erlang Probe
CAN Chipset
Monday, August 27, 2012
The end result
Monday, August 27, 2012
The end result• Shortens root cause location time
Monday, August 27, 2012
The end result• Shortens root cause location time
• Immediate action item for the OEMs
Monday, August 27, 2012
The end result• Shortens root cause location time
• Immediate action item for the OEMs
• Identifies QA or design issue
Monday, August 27, 2012
The end result• Shortens root cause location time
• Immediate action item for the OEMs
• Identifies QA or design issue
• Fixes installed without customer involvement
Monday, August 27, 2012
4. Feuerlabs
Monday, August 27, 2012
What do we do?
Monday, August 27, 2012
What do we do?• Feuerlabs provides device management servers,
data communication, and embedded development frameworks to implement industrial-level M2M solutions
Monday, August 27, 2012
What do we do?• Feuerlabs provides device management servers,
data communication, and embedded development frameworks to implement industrial-level M2M solutions
• Embedded stack, Exosense Device
Monday, August 27, 2012
What do we do?• Feuerlabs provides device management servers,
data communication, and embedded development frameworks to implement industrial-level M2M solutions
• Embedded stack, Exosense Device
• Server side, Exosense Server
Monday, August 27, 2012
What do we do?• Feuerlabs provides device management servers,
data communication, and embedded development frameworks to implement industrial-level M2M solutions
• Embedded stack, Exosense Device
• Server side, Exosense Server
• Communication - Provide roaming data plans
Monday, August 27, 2012
Feuerlabs’ domain
TPS MAFEngine Controller
Battery Controller
CAN
bus
Monday, August 27, 2012
Feuerlabs’ domain
TPS MAFEngine Controller
Battery Controller
CAN
bus
Telematics Server
Telematics Device
Monday, August 27, 2012
Monday, August 27, 2012
Exosense ServerExosense Device
Device Application
Server Application
NOC monitoring
Data, RPC
Config, updates, RPC
Sensor Pkg
DAQ Module
JSON Router
SNMPRouter
RPCRPC
ConfigConfig
MonitorMonitorExoport
Exoport
Monday, August 27, 2012
Exosense Device Stack
Monday, August 27, 2012
Exosense Device Stack• MPLv2
Monday, August 27, 2012
Exosense Device Stack• MPLv2
• Reference Hardware
Monday, August 27, 2012
Exosense Device Stack• MPLv2
• Reference Hardware
• Mixed language apps through DBUS
Monday, August 27, 2012
Exosense Device Stack• MPLv2
• Reference Hardware
• Mixed language apps through DBUS
• Package management w/ upgrades
Monday, August 27, 2012
Exosense Device Stack• MPLv2
• Reference Hardware
• Mixed language apps through DBUS
• Package management w/ upgrades
• Bi-directional encrypted RPCs
Monday, August 27, 2012
Exosense Device Stack
Monday, August 27, 2012
Exosense Device Stack
Linux Kernel
Erlang
Device Drivers File Systems
Exosense Device Platform
Geospatial
GPS
RPC Mgr
RPC Queue
Alarm Mgr
Data Monitor
i2c
i2c bus
SocketCAN
CAN bus
Backend ServerConnection Scheduler
Backend ServerConnection
Protocol
Encryption
Mobile Communication
Integration
Graphics Mgr
Framebu!er
Busgraph
Touchscreen
Device Mgr
Data Acqusition
Mgr
Config Mgr
Config Database
Config Sync
Queue
Application
Monday, August 27, 2012
Exosense Server Stack
Monday, August 27, 2012
Exosense Server Stack• Device and Tra"c manager
Monday, August 27, 2012
Exosense Server Stack• Device and Tra"c manager
• Configuration management
Monday, August 27, 2012
Exosense Server Stack• Device and Tra"c manager
• Configuration management
• Update manager
Monday, August 27, 2012
Exosense Server Stack• Device and Tra"c manager
• Configuration management
• Update manager
• Mixed assets
Monday, August 27, 2012
Exosense Server Stack• Device and Tra"c manager
• Configuration management
• Update manager
• Mixed assets
• Data plan aware
Monday, August 27, 2012
Conclusion
Monday, August 27, 2012
Conclusion• Connectivity is happening
Monday, August 27, 2012
Conclusion• Connectivity is happening
• Increased complexity is happening
Monday, August 27, 2012
Conclusion• Connectivity is happening
• Increased complexity is happening
• A new toolset, and mindset, is needed
Monday, August 27, 2012
Conclusion• Connectivity is happening
• Increased complexity is happening
• A new toolset, and mindset, is needed
• Erlang provides the tools
Monday, August 27, 2012