Some Real-Time-ish Problems-ish
THOMAS GUSTAFSSON
• Present some real-time problems from my career
• Think hard and discuss among yourselves for a while
• Discussions about possible solutions
6 December 2019 Info class internal Department / Name / Subject 2
Agenda
• Scania is a world-leading manufacturer of
− heavy trucks
− buses
− engines
• Some 50000 employees around the world
• R&D in Södertälje, Sweden, and São Paulo, Brazil
Scania
• About 3500 employees, mainly based in Södertälje, Sweden
• Examples of R&D development
− engine
− gearbox
− chassis
− cab
− Scania electrical system
− a lot of ECUs and the software for some of them
• A lot of SW is developed by Scania, so there are a lot of SW developer opportunities
Scania R&D
• Test automation framework for complete-vehicle HIL testing (past)
• Logging system for autonomous vehicles (current)
• Base software for autonomous vehicles (current)
• All work has boiled down to system development in general and sw development and sw engineering in particular
My career at Scania
• A Hardware-in-the-Loop (HIL) rig connects to the inputs and outputs of Electronic Control Units (ECUs) and "fools" them into behaving as if they were in a real vehicle
• Purpose of HIL is to
− run regression test suites, to make sure new software changes do not change old behavior
− run dangerous tests, e.g., remove a brake management system sensor at 90 km/h
− manipulate hw that is difficult to get to in a real vehicle, e.g., sensors in the engine
Test automation framework for HIL testing
• A Scania vehicle is determined by its Scania On-board Product Specification (SOPS)
− A SOPS describes which function product codes (FPCs) the vehicle has, and their values
− For instance, FPC1 describes the type of vehicle: A is truck and B is bus
• To run a test suite:
− read SOPS and configure input to test system according to FPC conditions
− flash ECUs
− parameterize ECUs
− run tests
− collect test results
Test automation framework for HIL testing
• Electrical system consists of some main CAN buses
− red for safety critical systems, e.g., engine management system, brake management system, air production system
− yellow for less critical systems, e.g., information cluster, external lights system
− green for non-critical systems, e.g., infotainment system
− brown for ADAS-related sensors
• And a lot of sub-buses to ECUs
• As many as possible of these CAN buses shall be recorded during testing in HIL
Test automation framework for HIL testing
• Web based GUI
• Utility programs like boot program, html and websockets server, process monitor
• Log domains: CAN, CCP/XCP, cameras, IO, ethernet, etc.
• Each log domain is its own
• Originally Windows based but today Linux based
• C++ with a CMake build chain targeting both platforms
Logging system
• Most log domains implement the actor pattern
− a dedicated thread reading the sensor data and putting it into a FIFO
− a dedicated thread reading from the FIFO doing some useful stuff with the data
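The two-thread structure above can be sketched as follows. This is a minimal illustration only, not the logging system's actual code: the function names, the stop sentinel, and the use of a plain list as the "sensor" are assumptions made for the example.

```python
import queue
import threading

_STOP = object()  # hypothetical sentinel: tells thread 2 that no more samples will arrive


def reader(samples, fifo):
    """Thread 1: read sensor data and put it into the FIFO."""
    for sample in samples:  # stands in for a blocking read from a bus or camera
        fifo.put(sample)
    fifo.put(_STOP)


def worker(fifo, publish):
    """Thread 2: take data from the FIFO and do something useful with it."""
    while True:
        sample = fifo.get()
        if sample is _STOP:
            break
        publish(sample)  # e.g., publish to middleware


def run_log_domain(samples):
    """Run one log domain until its input is exhausted; return what was published."""
    fifo = queue.Queue()
    published = []
    t1 = threading.Thread(target=reader, args=(samples, fifo))
    t2 = threading.Thread(target=worker, args=(fifo, published.append))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return published
```

Because the reader thread only enqueues, it gets back to the sensor quickly even when the second thread is slow; the FIFO absorbs the difference.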
Logging system
[Figure: sensor to be logged, read by thread 1; thread 2 does something with the data, like publish to middleware]
Logging system
[Figure: log computer. Labels: CAN Bus1, CAN Bus2, CAN log domain; Camera1, Camera2, switch, camera log domain; lidar1, lidar2, switch, lidar log domain; publish/subscribe middleware; write to disk]
• An application framework is needed for the applications to be based on
− abstraction layer so the underlying OS can be switched out
− an application consists of an init method, a step function, and an execution profile like periodic execution
− upon startup, all applications start and the OS schedules them
Base software
• The phase between applications may change at each startup
• The behavior of the system can, but most likely will not, be slightly different each time it runs
• Determinism is important so test results are meaningful
Base software
• A small PIC processor reads the seat heater knob
− each knob position corresponds to a resistance
• A digital code shall be generated to an ASIC that controls the heater
Seat heater controller
[Figure: seat heater knob with positions 1 to 4, the PIC, and the ASIC]
• Best practice requirements on code
− no interrupts, or show that interrupt bursts are properly handled
− code guidelines: no dynamic memory, no pointers
− consistency checks in code
− checksums on permanent data structures
− variables originating from different code paths that shall match
− use of watchdog
Seat heater controller
• How to log many (40+) CAN buses in HIL?
• How to do response time analysis on logged CAN traffic?
• How to ensure time sync across log domains?
• How to ensure synchronized execution of applications using the application framework?
• What is a good sw architecture for the seat heater problem?
Real-time problems to ponder
• Given:
− Around 40 CAN buses
− up to 10 meters apart (CAN specification says how long stubs can be)
− the CAN frames shall be synchronized in time
• Solution 1
− centralized recording
− time synchronization by the recorder’s clock
• Solution 2
− decentralized recording
− time synchronization must be solved by some means
Test automation framework and HIL testing
• CAN frames are sent on a bus
• Several CAN controllers can connect to the bus
• In the arbitration phase, it is determined which CAN controller may continue to send its frame
• All CAN controllers read and receive the frame at (roughly) the same time
Test automation framework and HIL testing
Test automation framework and HIL testing
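[Figure: CAN synchronization. A sender puts a synch message with payload 123 on the bus. Two CAN controllers, each connected to its own computer, receive it. The first records timestamp: 99, channel: 1, payload: 123; the second records timestamp: 100023, channel: 1, payload: 123]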
• Now we know that the first CAN controller's timestamp 99 refers to the same point in time as the second CAN controller's timestamp 100023
• A computer program can thus
− get all CAN frames
− wait for the same synch message from each CAN controller
− form a global time
− sort buffered frames
− repeat
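The merge step can be sketched as below. This is a simplified illustration under stated assumptions: frames are `(timestamp, channel, payload)` tuples as in the figure, the synch message is recognized by its payload (a real system would more likely use a dedicated CAN ID), each controller sees the synch message exactly once per batch, and clock drift between synch messages is ignored. The function name `merge_frames` and the controller names are hypothetical.

```python
def merge_frames(frames_by_controller, sync_payload):
    """Map every controller's local timestamps onto one global time axis and sort.

    frames_by_controller: dict of controller name -> list of
    (timestamp, channel, payload) tuples in local controller time.
    Global time 0 is the moment the synch message was seen on the bus.
    """
    merged = []
    for controller, frames in frames_by_controller.items():
        # Local timestamp at which this controller received the synch message.
        sync_ts = next(ts for ts, _, payload in frames if payload == sync_payload)
        for ts, channel, payload in frames:
            merged.append((ts - sync_ts, controller, channel, payload))
    merged.sort(key=lambda frame: frame[0])  # stable sort on global time
    return merged


# Timestamps 99 and 100023 are taken from the slide; the other frames are made up.
example = {
    "controller_1": [(99, 1, 123), (150, 1, 7)],
    "controller_2": [(100023, 1, 123), (100050, 2, 8)],
}
merged_example = merge_frames(example, sync_payload=123)
```

In `merged_example` both copies of the synch message land at global time 0, and the frame with payload 8 (global time 27) sorts before the frame with payload 7 (global time 51), even though its local timestamp is far larger.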
Test automation framework and HIL testing
• Performs a worst case response time analysis on each message, based on published work co-authored by Reinder J. Bril: Controller Area Network (CAN) schedulability analysis: Refuted, revisited and revised, 2007
• Message properties are taken from CAN databases, and which CAN frames are actually used is taken from real logs
• Outputs a list of potential response time problems
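As a rough illustration of what such an analysis computes, here is a sketch of a simple sufficient response time test in the spirit of the cited paper: the queuing delay w of a message is found by fixed-point iteration over the interference from higher-priority messages, with blocking bounded by the longest lower-priority frame or the message's own frame. The slides do not say which variant the tool implements, so treat this as an assumption. The message names, the numbers, the use of bit times as unit, and deadline equal to period are all made up for the example.

```python
def worst_case_response_times(messages, tau_bit=1):
    """Return {name: response time or None} for messages sorted highest priority first.

    Each message is a dict with C (transmission time), T (period) and J (queuing
    jitter), all in the same integer unit (here: bit times). The deadline is
    assumed equal to the period; None marks a potential response time problem.
    """
    results = {}
    for i, m in enumerate(messages):
        higher = messages[:i]
        lower = messages[i + 1:]
        # Blocking: longest lower-priority frame, or the message's own frame.
        blocking = max([k["C"] for k in lower] + [m["C"]])
        w = blocking
        while True:
            interference = sum(
                -(-(w + k["J"] + tau_bit) // k["T"]) * k["C"]  # integer ceiling
                for k in higher
            )
            w_next = blocking + interference
            response = m["J"] + w_next + m["C"]
            if response > m["T"]:
                results[m["name"]] = None
                break
            if w_next == w:  # fixed point reached
                results[m["name"]] = response
                break
            w = w_next
    return results


# Hypothetical message set, 135 bit times per frame.
EXAMPLE = [
    {"name": "engine", "C": 135, "T": 1000, "J": 0},
    {"name": "brake", "C": 135, "T": 2000, "J": 0},
    {"name": "cluster", "C": 135, "T": 4000, "J": 0},
]
EXAMPLE_RESULT = worst_case_response_times(EXAMPLE)
```

For this message set the bounds come out as 270, 405, and 540 bit times, so nothing is flagged. A message whose bound exceeds its period would be reported as `None`, which corresponds to an entry in the list of potential response time problems.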
Test automation framework and HIL testing
• It works, and since the load is split over several computers it can support logging 40+ CAN buses
• The solution consists of several programs
− synch message sender
− CAN frame receiver
− Merger that sorts all CAN frames
− Logger of sorted CAN frames to file
− Worst case response time analysis program
• The distributed solution is much more complex than the centralized solution, even if they had the same number of lines of code
Reflections on CAN logging solution
• This problem can be split into two problems
− single computer logging
− distributed logging
• Log domains on a single computer can rely on the same PC clock for timestamping
• Important to timestamp as close to the source as possible
• After the timestamping is done, the time it takes to save to disk does not matter
Synched time between log domains
• In distributed logging, a global time must be established
• There are protocols for this
− NTP can achieve millisecond level synch
− PTP can achieve sub-millisecond or even sub-microsecond level synch
• When a global time is established, the same holds as for a single computer
− timestamp as close to the source as possible
Synched time between log domains
• On a high abstraction level there are two options for a sw architecture
− event based
− time triggered
• The most sensible sw architecture for this problem and such a limited CPU is time triggered
− state machine
− time slots
Seat heater
• Time slots
− start by resetting free running timer
− do some work
− wait until the timer reaches a specific value
− repeat for next slot
• Slots dedicated for
− sampling
− clocking digital signal
− logic
• Check the different counters regularly and rewrite GPIO registers
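The time-slot idea can be sketched with a simulated timer. This is an illustration only: on the PIC this would be C or assembly spinning on a hardware timer register, and the `SimTimer` class, the slot length of 10 ticks, and the work costs are all assumptions made for the example.

```python
class SimTimer:
    """Stand-in for a free-running hardware timer (hypothetical)."""

    def __init__(self):
        self.count = 0  # current timer register value
        self.total = 0  # total elapsed ticks, kept only for inspection

    def reset(self):
        self.count = 0

    def advance(self, ticks):
        self.count += ticks
        self.total += ticks


def run_slot(timer, work, slot_ticks):
    """Reset the timer, do the work, then wait until the slot has elapsed."""
    timer.reset()
    work(timer)
    if timer.count > slot_ticks:
        raise RuntimeError("slot overrun")
    while timer.count < slot_ticks:
        timer.advance(1)  # on hardware: spin while reading the timer register


def make_work(cost, trace, label):
    """Build a work item that records when it starts and takes `cost` ticks."""
    def work(timer):
        trace.append((timer.total, label))
        timer.advance(cost)
    return work


def run_cycle(costs, slot_ticks=10):
    """Run one cycle of sampling, logic, and clocking out the digital signal."""
    timer = SimTimer()
    trace = []
    labels = ("sampling", "logic", "clocking digital signal")
    for cost, label in zip(costs, labels):
        run_slot(timer, make_work(cost, trace, label), slot_ticks)
    return trace, timer.total
```

Whatever the individual work costs are, as long as each fits in its slot, every slot starts at the same point in the cycle. For example, `run_cycle((3, 7, 2))` and `run_cycle((1, 1, 1))` both start the slots at ticks 0, 10, and 20 and finish after 30 ticks, which is what makes the timing of the digital signal predictable.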
Seat heater
• Solution in user-space
− one application must be aware of all other applications; denote this the ticker, which knows about clock ticks
− each application must wait to be started by the ticker
− each application has a period time that is converted into clock ticks by the ticker
− when the remaining clock ticks reach zero for an application, the ticker releases it
− the synchronization primitives can be, e.g., mutexes and condition variables
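A user-space ticker along these lines might look like the sketch below, using a condition variable (which wraps a mutex) as the synchronization primitive. The class and method names are hypothetical, the clock ticks are generated by a plain loop instead of a real timer, and a production version would also need to handle step functions that overrun their period.

```python
import threading


class Ticker:
    """Releases registered applications when their tick countdown reaches zero."""

    def __init__(self):
        self._cond = threading.Condition()
        self._period = {}     # application name -> period in clock ticks
        self._remaining = {}  # application name -> ticks until next release
        self._pending = {}    # application name -> releases not yet consumed
        self._running = True

    def register(self, name, period_ticks):
        """Called from an application's init method with its period."""
        with self._cond:
            self._period[name] = period_ticks
            self._remaining[name] = period_ticks
            self._pending[name] = 0

    def wait_for_release(self, name):
        """Called at the start of the step function; False means shut down."""
        with self._cond:
            while self._pending[name] == 0 and self._running:
                self._cond.wait()
            if self._pending[name] == 0:
                return False
            self._pending[name] -= 1
            return True

    def tick(self):
        """One clock tick: count down and release applications that are due."""
        with self._cond:
            for name in self._remaining:
                self._remaining[name] -= 1
                if self._remaining[name] == 0:
                    self._remaining[name] = self._period[name]
                    self._pending[name] += 1
            self._cond.notify_all()

    def stop(self):
        with self._cond:
            self._running = False
            self._cond.notify_all()


def run_app(ticker, name, log):
    """An application: wait for the ticker's go, then run the step function."""
    while ticker.wait_for_release(name):
        log.append(name)  # stands in for the step function body


def demo():
    ticker = Ticker()
    log = []
    threads = []
    for name, period in (("fast", 2), ("slow", 4)):
        ticker.register(name, period)
        thread = threading.Thread(target=run_app, args=(ticker, name, log))
        thread.start()
        threads.append(thread)
    for _ in range(8):  # eight clock ticks
        ticker.tick()
    ticker.stop()
    for thread in threads:
        thread.join()
    return log
```

Since all countdowns start from the same tick, the phase between the applications is fixed by the ticker instead of by the order in which the OS happened to start them. In `demo()`, eight ticks release the application with period 2 four times and the one with period 4 twice.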
Base software application synchronization
• Solution in user-space
Base software application synchronization
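[Figure: sequence between application 1 and the ticker. The init method sends the period time to the ticker; the application waits at the start of its step function; clock ticks pass; the ticker says go!; the application again waits at the start of its step function; clock ticks pass; the ticker says go!]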
• Solution in kernel space
− The operating system must have some mechanism to start applications synchronized
− The operating system must have a notion of period times
− The operating system must have a notion of priority and possibly priority inheritance
Base software application synchronization