System Engineering
OMAP4 SysLink Overview
SysLink Team
1/13/2011
Topics
• SysLink Architecture Overview
• SysLink Notify Overview
• SysLink Protocol Overview
• Device Error Handling
• Ducati Tracing – Trace Daemon
SysLink Architecture Overview
SysLink - Introduction
• SysLink
  – IPC for OMAP4 and beyond
  – An evolution of both DSP/BIOS Link and DSP/BIOS Bridge
• Main Features
  – Supports symmetrical IPC interfaces
  – Decoupled IPC & MM frameworks
  – Scalable and modular IPC architecture
  – Retains the IPC and device management features developed for OMAP3 IPC
    • Device loading & exception handling
    • Dynamic memory mapping
    • Power management
  – Support for the ELF format
  – Flexibility to support parallel and custom 3rd-party IPC
  – Enables remote procedure calls
OMAP4 SysLink Architecture
[Figure: block diagram of the OMAP4 SysLink architecture. The ARM Cortex-A9 side is split into user and kernel levels and contains MessageQ/IPC, RCM, ProcMgr, the Loader, Power Mgmt, the TILER Memory Manager, HWSpinLock, remoteproc, DEH, the IOMMU, the Notify module, and the Mailbox driver. The SysM3 and AppM3 cores each contain a Notify driver, MessageQ/IPC, RCM, Power Mgmt, TILER Memory Manager, and SpinLock modules, connected to the A9 via the mailbox and IPI.]
IPC interrupt mechanisms:
  – A9 – SysM3: Mailbox
  – A9 – AppM3: Mailbox & IPI
  – SysM3 – AppM3: IPI
SysM3-AppM3 (SysLink Functionality Split)
• All SysLink IPC modules are available on both the SysM3 and AppM3 cores, with a few functional differences
• Common IPC Components/Features
  – Notify, MessageQTransportShm, NameServerRemoteNotify, SLPM, MessageQ, NameServer, SharedRegion
  – Ipc synchronization between each pair of processors
  – Traces
  – Exception Context Buffer
• SysM3 only
  – Mailbox interrupt Rx
  – A9-to-AppM3 interrupt rerouting
  – M3-A9 power management notification
SysLink Notify Overview
Notify Driver - Features
• Rationale
  – Enable multiple clients to multiplex over a single physical interrupt between a pair of processors
  – Keep the payload to a minimum, and allow higher-level protocols to define their own transport memory needs
  – Generic enough to handle physical interrupts without any messaging capability
• Features
  – Simple and quick method of sending a 32-bit message
  – Multiple clients can register for the same event
    • Clients may be processes, threads/tasks, or kernel modules
    • A callback function can be registered with an associated parameter to receive events from the remote processor
    • All clients get notified when an event is received
  – Multiple clients can send notifications to the same event
    • Events are sent across to the remote processor
    • A 32-bit payload value can be sent along with the event
    • Synchronous messaging
  – Event IDs are prioritized
    • 0: highest priority
    • 31: lowest priority
    • If multiple events are set, the highest-priority event is always signaled first
  – Notification can be unregistered when no longer required
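The event prioritization above (ID 0 highest, 31 lowest) can be modeled as picking the lowest set bit from the pending-event mask. This is an illustrative helper, not part of the SysLink API:

```c
/* Simplified model of Notify event prioritization: event 0 is highest
 * priority and 31 is lowest, so the lowest set bit in the pending mask
 * is the event serviced first. */
int notify_highest_pending(unsigned int pending)
{
    for (int id = 0; id < 32; id++)
        if (pending & (1u << id))
            return id;
    return -1; /* no event pending */
}
```

A real driver would typically use a count-trailing-zeros instruction instead of a loop, but the ordering is the same.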
Notify Send Event – From A9 to AppM3
[Figure: sequence diagram spanning the A9 application, the kernel Notify and Mailbox drivers, SysM3, and AppM3. Notify_sendEvent in the A9 application issues an ioctl into the kernel Notify driver (notify_ducatidrv_send_event), which sends the event through the mailbox (omap_mbox_msg_send). The MBOX ISR on SysM3 runs NotifyDriverShm_isr, which clears the notify flag in shared memory, acks the mailbox, and reroutes AppM3-bound events over IPI; NotifyDriverShm_isr_appm3 and ti_sdo_ipc_notify_exec then deliver the callback to the AppM3 application.]
Notify Send Event – From AppM3 to A9
[Figure: sequence diagram. Notify_sendEvent on AppM3 calls NotifyDriverShm_sendEvent, which sets the notify flag in shared memory and triggers the mailbox through the Mailbox HW driver (InterruptDucati, via InterruptDucati_intSend). On the A9, notify_shmdrv_isr runs notify_exec, and notify_add_buf_by_pid unblocks the pending notify_drv_read system call (entered from a user-space read()), returning the callback pointer to user space.]
SysLink Protocol Overview
SysLink IPC Protocol - Modules
Here are the main SysLink IPC modules:
[Figure: module diagram]
  – Helper modules: MultiProc, NameServer (NameServerRemoteNotify), SharedRegion, Gates (GateMP: GatePeterson, GateHWSpinLock)
  – Data mover modules: MessageQ, ListMP, Heaps (HeapBufMP, HeapMemMP)
  – Notify / NotifyDrivers: NotifyDriverShm
  – Transports: MessageQTransportShm
ListMP
• Purpose
  – Provides an open-ended queue
  – Enables variable-size messaging automatically
• Design Features
  – Basic multi-processor doubly-linked circular list (ListMP)
  – Simple multi-reader, multi-writer protocol
  – ListMP management is protected internally using an appropriate multiprocessor lock module – GateMP
  – Uses shared memory elements with portable pointers
  – Building block for the MessageQ transport and the HeapBufMP buffer
• Usage
  – Instances of ListMP objects can be created and deleted by any processor
  – Each ListMP instance is identified by a system-wide unique string name
  – To use a ListMP instance, clients open it by name and receive a handle
  – Elements can be put at the tail of the list and removed from the head of the list
  – Elements can be inserted and removed at/from an intermediate location
  – Provides an API for list traversal
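The list structure above can be sketched as a single-process model. The real ListMP stores portable (SharedRegion) pointers in shared memory and guards updates with a GateMP; this sketch omits both and only shows the circular doubly-linked list with put-tail/get-head semantics:

```c
#include <stddef.h>

/* Single-process model of ListMP: a circular doubly linked list with a
 * sentinel head element. */
struct listmp_elem {
    struct listmp_elem *next;
    struct listmp_elem *prev;
};

static void listmp_init(struct listmp_elem *head)
{
    head->next = head->prev = head;
}

static int listmp_empty(const struct listmp_elem *head)
{
    return head->next == head;
}

/* Put an element at the tail of the list */
static void listmp_put_tail(struct listmp_elem *head, struct listmp_elem *e)
{
    e->prev = head->prev;
    e->next = head;
    head->prev->next = e;
    head->prev = e;
}

/* Remove and return the element at the head, or NULL if empty */
static struct listmp_elem *listmp_get_head(struct listmp_elem *head)
{
    if (listmp_empty(head))
        return NULL;
    struct listmp_elem *e = head->next;
    head->next = e->next;
    e->next->prev = head;
    return e;
}
```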
MessageQ
• Purpose
  – Homogeneous or heterogeneous multi-processor messaging
  – Message delivery is localized to pluggable transports, achieving flexibility
• Design Features
  – Basic IPC queue with a single reader and multiple writers
  – Supports the structured sending and receiving of variable-length messages
  – Supports two priorities of messages
  – No restrictions on the number of messages on the queue
  – Uses NameServer for name management
  – Uses a transport (MessageQTransportShm) for actual message delivery
• Usage
  – The reader owns a queue and creates the queue
  – Writers open a queue by name
  – Writers can also use a received message to send a response back
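The two-priority behavior above can be modeled in a single process. This is only a sketch of the priority ordering (names are illustrative, not the MessageQ API); the real module is a single-reader/multi-writer queue over a shared-memory transport:

```c
/* Single-process model of MessageQ's two message priorities: mq_get
 * drains high-priority messages before normal ones. */
#define MQ_CAP 16

struct mini_mq {
    int hi[MQ_CAP], norm[MQ_CAP];
    int hi_n, norm_n;
};

int mq_put(struct mini_mq *q, int msg, int high_prio)
{
    int *buf = high_prio ? q->hi : q->norm;
    int *n = high_prio ? &q->hi_n : &q->norm_n;
    if (*n == MQ_CAP)
        return -1; /* queue full */
    buf[(*n)++] = msg;
    return 0;
}

int mq_get(struct mini_mq *q, int *msg)
{
    int *buf, *n;
    if (q->hi_n)        { buf = q->hi;   n = &q->hi_n; }
    else if (q->norm_n) { buf = q->norm; n = &q->norm_n; }
    else return -1;     /* empty */
    *msg = buf[0];
    (*n)--;
    for (int i = 0; i < *n; i++)  /* shift; a ring buffer would avoid this */
        buf[i] = buf[i + 1];
    return 0;
}
```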
IpcMgr
• Purpose
  – One-step setup for system integrators
  – Dynamic configurability of shared memory objects, without static assignment and management of memory, which becomes complex as IPC processor pairs scale
• Design Features
  – Automates the setup of the IPC modules (Notify, MessageQTransportShm, NameServerRemoteNotify) between a pair of processors, for easier setup
  – Provides a handshaking mechanism between each pair of processors (A9-SysM3, A9-AppM3, SysM3-AppM3) to synchronize startup sequences
  – Provides IPC infrastructure setup synchronization by passing system configuration info from a slave processor to a master processor between a pair of processors
  – Creates the various IPC modules based on the shared IPC infrastructure information, for ready-made usage between multiple processors
• Usage
  – Configures the IPC modules in each user process's context by calling the setup() and destroy() functions of the IPC modules
  – Integrated with the device management API
MultiProc
• Purpose
  – The simplest unified interface and processor definition across multiple processors
• Design Features
  – Basic lowest-level helper module that centralizes the management of processor ids
  – Essentially a lookup table of processor names and their ids
  – The MultiProc order dictates the master-slave behavior assignment of certain asymmetric modules like IpcMgr
• Usage
  – Processor ids start at 0 and ascend without skipping values
  – The id can be available at configuration time or at initialization time
  – Configured automatically in each user process
  – OMAP4 MultiProc configuration:
    • MPU: 3
    • SysM3: 2
    • AppM3: 1
    • DSP: 0
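The OMAP4 MultiProc configuration above amounts to a name-to-id lookup table where the array index is the processor id. The helper below is illustrative, not the real MultiProc API:

```c
#include <string.h>

/* OMAP4 MultiProc table from the slide: index == processor id */
static const char *const multiproc_names[] = { "DSP", "AppM3", "SysM3", "MPU" };

/* Return the processor id for a name, or -1 (invalid id) if unknown */
int multiproc_get_id(const char *name)
{
    for (int id = 0; id < 4; id++)
        if (strcmp(multiproc_names[id], name) == 0)
            return id;
    return -1;
}
```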
SharedRegion
• Purpose
  – Eases address translation across multiple processors, each with its own address space
  – A HeapMemMP within a SharedRegion provides a readily available shared memory manager
• Design Features
  – The SharedRegion module is used to identify shared memory regions in the system
  – Provides SharedRegion pointers (SRPtr), i.e. pointers portable from one processor to another
  – Each SharedRegion can have a shared memory heap (HeapMemMP) created within the region, which can be used to provide a generic shared memory allocator
  – This module creates a shared memory region lookup table containing the processor's view of every shared region in the system
  – Each processor has its own lookup table
  – Each processor's view of a particular shared memory region can be found at the same table index across all lookup tables
  – Each table entry is a base/length pair; this table, along with the 32-bit SRPtr, is used to do a quick address translation
  – An SRPtr is built from a few bits that identify the SharedRegion plus a relative offset into that region
• Usage
  – SharedRegion information from the M3 base images is read during IpcMgr synchronization to populate the region tables on the A9 as well as in each user process
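The SRPtr translation described above can be sketched as follows. The 4-bit index width is an assumption for illustration (it is configurable in real SysLink), and each processor would use its own base table:

```c
#include <stdint.h>

/* Illustrative SRPtr encoding: the upper bits of the 32-bit SRPtr select
 * a SharedRegion table entry; the lower bits are the offset within the
 * region. */
#define SR_INDEX_BITS  4u
#define SR_OFFSET_MASK ((1u << (32u - SR_INDEX_BITS)) - 1u)

uint32_t srptr_make(uint32_t region, uint32_t offset)
{
    return (region << (32u - SR_INDEX_BITS)) | (offset & SR_OFFSET_MASK);
}

/* Translate an SRPtr to a local address using this processor's base table */
uint32_t srptr_to_local(uint32_t srptr, const uint32_t base[])
{
    return base[srptr >> (32u - SR_INDEX_BITS)] + (srptr & SR_OFFSET_MASK);
}
```

The same SRPtr decodes to a different local address on each processor, because each processor's table holds its own base for the region.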
NameServer
• Purpose
  – Distributed table management, avoiding update synchronization problems and cyclic dependencies
• Design Features
  – NameServer enables an application and other modules to store and retrieve values based on a name
  – The NameServer module manages local name/value pairs
  – Each NameServer instance manages a different name/value table (customization possible)
  – Names in a specific table must be unique, but the same name can be used in different tables
  – Supports values of different lengths
  – Name lookup across processors is supported by a separate transport layer
• Usage
  – NameServer/NameServerRemoteNotify is used by IPC components such as HeapBufMP and ListMP to get the portable pointer (SharedRegion pointer) that gives the location of these module objects in shared memory; this is mainly used when opening an instance of the above modules
  – NameServer/NameServerRemoteNotify is used by MessageQ to exchange MessageQ ids between Ducati and the MPU
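A single local table of the kind NameServer manages can be sketched as below. This models only the unique-name add/lookup behavior; the real module supports multiple tables, variable-length values, and remote lookup via NameServerRemoteNotify:

```c
#include <string.h>

/* Minimal single-table model of a NameServer: unique names mapping to
 * 32-bit values (e.g. a SharedRegion pointer). */
#define NS_MAX 8

struct ns_entry { const char *name; unsigned int value; };
struct ns_table { struct ns_entry e[NS_MAX]; int count; };

int ns_add(struct ns_table *t, const char *name, unsigned int value)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->e[i].name, name) == 0)
            return -1; /* names in a table must be unique */
    if (t->count == NS_MAX)
        return -1;
    t->e[t->count].name = name;
    t->e[t->count].value = value;
    t->count++;
    return 0;
}

int ns_get(const struct ns_table *t, const char *name, unsigned int *value)
{
    for (int i = 0; i < t->count; i++)
        if (strcmp(t->e[i].name, name) == 0) {
            *value = t->e[i].value;
            return 0;
        }
    return -1; /* not found */
}
```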
Gates - GateMP
• Purpose
  – Unified interface over different proxy implementations
  – Common allocation and management of the different gate interfaces from any processor
• Design Features
  – Critical-region mechanism to protect common resources
  – Abstracted interface over various implementations of hardware/software gates
  – Simple enter/leave APIs (no timeouts)
• Implementations
  – GateHWSpinLock
    • Hardware gate based on the SpinLock hardware IP
    • Supports protection between any number of processors
    • Default gate for the current OMAP4 SysLink software
  – GatePeterson
    • Software gate based on Peterson's algorithm, for protection between 2 processors
    • GatePeterson objects are created in shared memory
    • Default gate in the OMAP4 SysLink 1.0 software, or for SoCs not supporting the H/W SpinLock IP
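The two-party Peterson lock underlying GatePeterson can be sketched as below. In the real gate, the flag and turn fields live in shared memory with one selfId per processor; here `volatile` merely stands in for the memory barriers a real multiprocessor implementation needs:

```c
/* Sketch of Peterson's algorithm for two contenders (self = 0 or 1). */
struct gate_peterson {
    volatile int flag[2]; /* flag[i]: contender i wants the gate */
    volatile int turn;    /* whose turn it is to wait */
};

void gp_enter(struct gate_peterson *g, int self)
{
    int other = 1 - self;
    g->flag[self] = 1;
    g->turn = other;
    while (g->flag[other] && g->turn == other)
        ; /* spin until the other side leaves or yields its turn */
}

void gp_leave(struct gate_peterson *g, int self)
{
    g->flag[self] = 0;
}
```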
Heaps
• HeapBufMP
  – Purpose
    • Fast, deterministic allocation and free time for a HeapBufMP buffer block; memory calls are non-blocking
  – Design Features
    • A fixed-size buffer heap implementation that can be used in a multiprocessor system with shared memory
    • HeapBufMP manages a single fixed-size buffer, split into equally sized allocatable blocks
    • HeapBufMP buffer blocks are maintained using a ListMP object
    • This module is instance-based; each instance requires shared memory
    • The HeapBufMP module uses a NameServer instance to store instance information when created
    • Blocks can be allocated and freed on different processors
• HeapMemMP
  – Purpose
    • Serves as a default allocator for defining custom shared memory objects, and also as a macro-allocator for other shared memory heaps
  – Design Features
    • A variable-size buffer heap implementation that can be used in a multiprocessor system
    • Non-deterministic allocation and free time; more flexible but less efficient than HeapBufMP
    • Can be created by default within a SharedRegion
    • All other features are the same as HeapBufMP
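The fixed-block scheme that makes HeapBufMP deterministic can be modeled in a single process: one buffer is split into equal blocks kept on a free list, so alloc and free are O(1). The real module keeps this list (a ListMP) in shared memory under a gate; sizes here are arbitrary:

```c
#include <stddef.h>

/* Single-process model of a fixed-size block heap (HeapBufMP-style). */
#define BLK_SIZE  32
#define BLK_COUNT 4

union heap_block {
    union heap_block *next;       /* valid only while the block is free */
    unsigned char data[BLK_SIZE];
};

struct heap_buf {
    union heap_block pool[BLK_COUNT];
    union heap_block *free_list;
};

void heap_init(struct heap_buf *h)
{
    for (int i = 0; i < BLK_COUNT - 1; i++)
        h->pool[i].next = &h->pool[i + 1];
    h->pool[BLK_COUNT - 1].next = NULL;
    h->free_list = h->pool;
}

void *heap_alloc(struct heap_buf *h)
{
    union heap_block *b = h->free_list;
    if (b)
        h->free_list = b->next;   /* O(1) pop from the free list */
    return b;                     /* NULL when exhausted */
}

void heap_free(struct heap_buf *h, void *p)
{
    union heap_block *b = p;
    b->next = h->free_list;       /* O(1) push back onto the free list */
    h->free_list = b;
}
```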
IPU-PM
• Purpose
  – Represents the IPU sub-system and IPU-managed resources to the host OS PM frameworks
  – Provides sub-system power management operations asynchronously to the main processor
  – Decouples idle processing and state transitions for maximum power savings
  – Hibernation gets the IPU sub-system out of the way of system PM when it is not in active use; that is, it allows the greater Core power domain (wherein the IPU resides) to be transitioned to low power states by the MPU with no coordination overhead
  – Common resource pools provide the greatest system flexibility
• Design Features
  – Provides power management for the IPU sub-system and its directly managed resources
  – Provides a local interface to IPU clients for acquiring, activating, and tracking shared resources whose pools are managed on the MPU side
  – Provides APIs for setting frequency, latency, and bandwidth constraints for applicable HW modules
  – Provides for resource cleanup in the event of an unrecoverable error in the IPU SS
  – Participates in system power management operations such as suspend/resume, with context save and restore
  – Provides a framework for forwarding system event notifications to registered IPU clients
  – Supports self-hibernation: a zero-power, rapid-recovery state when the IPU SS is not in active use
  – Supports efficient I2C controller sharing across the MPU/IPU processor domains
• Usage
  – Resources are activated via the system PM (provided with power and clocks) when requested, and deactivated when released
  – Constraints may be placed to ensure the required performance and to limit low-power/high-latency transitions to meet the use case requirements
  – Clients may register for system event notifications such as suspend and resume to take appropriate local actions
SysLink Daemon
• SysLink Daemon
  – A user-side daemon process, mainly used to load the Ducati cores
  – Privilege – User: Media, Group: Video
  – Also responsible for restarting the Ducati cores in case of an MMU fault or other remote core exceptions
  – A common place to create the HeapBufMPs needed by all applications
• Rationale for using the SysLink Daemon
  – To have a persistent process that is detached from the parent process that starts it
  – The loader is pushed to user space to avoid file access from the kernel
Ducati Bring Up Sequence
[Figure: sequence diagram spanning the SysLink daemon, ProcMgr lib, the kernel (remoteproc), SysM3, and AppM3. The daemon uses ProcMgr to load SysM3 (Load (SYS)) and start it (Start (SYS)): remoteproc releases the reset, BIOS boots, and Ipc-BIOS runs Ipc_start and Ipc_attach to the MPU. The daemon then loads and starts AppM3 (Load (APP), Start (APP)) the same way; AppM3's Ipc-BIOS runs Ipc_start, attaches to the MPU and to SysM3. On the MPU side, Ipc_getconfig and Ipc_setup run, followed by Ipc_attach to SysM3 and to AppM3.]
RCM Client
• Features
  – Establishes a connection with an RCM server by name
  – Allocates memory for remote command messages
  – Sends remote command messages and receives the return context
  – Supports both synchronous and asynchronous execution modes
• Design
  – Multiple RCM client instances are supported
  – The RCM client runs in the caller's thread context, i.e. the IL client / OMX proxy component; on Ducati, the RCM client runs in the calling task's context
  – An RCM client can connect only to a single RCM server
  – Multiple RCM clients can connect to a single RCM server
  – The RCM client is implemented as a user-side library
  – Supports one or many RCM clients per process
• Usage
  – Remote OMX component initialization, function invocation, and callbacks, by invoking remote functions (RPC SKEL functions) on the remote core
RCM Server
• Features
  – Receives messages from the connected RCM client(s)
  – Invokes remote functions (RPC skel functions)
  – Registers and unregisters remote functions (RPC skel functions)
  – Gets the return value from the RPC skel; there are two kinds of return values: the RCM return value (used if there is any error in the RCM layers) and the OMX return value (the actual return value of the OMX_XXX call)
  – Sends the return context
• Design
  – One RCM server instance per process (multiple can be supported)
  – The RCM server runs in its own thread context
  – The RCM server thread waits on receiving a message; as soon as one arrives, it unblocks and executes the required remote function
  – Multiple RCM clients can connect to a single RCM server
  – The RCM server is implemented as a user-side library
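The register/invoke cycle above can be sketched as a dispatch table: skeleton functions are registered by index, and the index carried in a command message selects the function to run. Names and signatures are illustrative, not the RCM API; note the two return values, mirroring the RCM-layer vs OMX-layer split described above:

```c
/* Single-process model of RCM dispatch. */
#define RCM_MAX_FXNS 8

typedef int (*rcm_fxn)(int arg);

struct rcm_server {
    rcm_fxn table[RCM_MAX_FXNS];
    int count;
};

/* Register a skeleton function; the returned index travels in messages */
int rcm_register(struct rcm_server *s, rcm_fxn f)
{
    if (s->count == RCM_MAX_FXNS)
        return -1;
    s->table[s->count] = f;
    return s->count++;
}

/* Server side of exec: look up the function index and invoke it.
 * Return value is the "RCM-layer" status; *result carries the
 * function's own ("OMX-style") return value. */
int rcm_exec(struct rcm_server *s, int fxn_idx, int arg, int *result)
{
    if (fxn_idx < 0 || fxn_idx >= s->count)
        return -1; /* RCM-layer error: unknown function */
    *result = s->table[fxn_idx](arg);
    return 0;
}

/* Example skeleton function */
static int skel_double(int x) { return 2 * x; }
```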
Device Error Handling
Handled Errors
• HW detected errors
  – Unicache MMU (detected on Ducati and forwarded to the A9 through Notify)
    • ACTION: Terminate Ducati, as this is a major error
  – L2 MMU faults (detected on the A9)
    • ACTION: Terminate Ducati, as this is a major error
    • Detection is made on Ducati as well, since an MMU fault is added on the Cortex-M3 for ES2.0; this gives the ability to retrieve the PC where the issue occurred, for precise faults
  – Ducati data aborts (detected on Ducati and forwarded to the A9)
    • ACTION: Terminate Ducati, as this is a major error
• SW detected errors
  – Endless-loop tasks, detected on the A9 using a watchdog timer
    • ACTION: Terminate Ducati, as this is a major error
  – A9 process terminated abnormally
    • ACTION: Clean up the resources associated with this process
  – SysLink Daemon crash
    • ACTION: Restart the daemon
Resource Management Software Design (1/2)
• The SysLink Daemon is in charge of:
  – Loading the base image code
  – Initializing/uninitializing Ducati
• The A9-side Ducati Resource Manager kernel driver is in charge of:
  – Tracking resources requested by Ducati
• Each Linux user process that uses Ducati performs the following operations (depending on its needs):
  – Opens a device file on the SysLink IPC kernel driver, to be able to communicate with the remote processor
  – Opens a device file on the IO-MMU kernel driver, to be able to share buffers using the SysLink DMM feature
  – Opens a device file on the TILER kernel driver, to be able to allocate TILER buffers
  – Opens a device file on the remoteproc kernel driver, to be able to get notifications when the remote processor is stopped/started
  – The Linux kernel closes the device files automatically if the process does not do so explicitly, and the resources are cleaned up in this context
Resource Management Software Design (2/2)
• Resources that can be allocated by Ducati tasks:
  – TILER memory
    • Allocations are requested from the process that created the Ducati task, on behalf of the Ducati task
    • The Tiler driver is in charge of tracking allocated memory per process
  – Regular memory, using SW DMM for huge memory blocks
    • Allocations are requested from the process that created the Ducati task; the DMM driver is in charge of tracking the allocated memory per process
  – Local BIOS heaps (from SDRAM or L2RAM)
    • Release of this memory must be managed on the Ducati side, as the A9 is not aware of it
  – HW resources
    • The Ducati-side Resource Manager is in charge of tracking allocated HW resources at task level
    • Release of these resources is managed on the Ducati side, as the A9 tracks at subsystem level only
  – Power resources (same as for HW resources)
    • The Ducati Power Manager is in charge of tracking set power constraints by module… at task level
    • Release of these constraints must be managed on the Ducati side, as the A9 tracks at subsystem level only
MMU Fault Handling
• Handles MMU faults and provides a stack dump from when the MMU fault occurred.
• Expectations from users of SysLink:
  – Register and wait for the MMU fault notification
  – On MMU fault notification, close the IPC handle(s); if any user still has open handle(s), SysLink does not initiate the recovery process
• Ducati is reloaded once all user processes have closed their IPC handles.
• See the figure on the next slide for the sequence flow of how an MMU fault is handled.
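The recovery gating described above (reload only after all user processes close their handles) can be modeled as a simple handle count checked against a fault flag. This is illustrative, not SysLink code:

```c
/* Simplified model of MMU-fault recovery gating: reload is initiated
 * only once the last open IPC handle is closed after a fault. */
struct recovery_state {
    int open_handles; /* IPC handles still held by user processes */
    int fault;        /* an MMU fault has been signaled */
    int reloading;    /* reload of Ducati has been kicked off */
};

void rs_fault(struct recovery_state *s)
{
    s->fault = 1;
    if (s->open_handles == 0)
        s->reloading = 1; /* nothing to wait for */
}

void rs_close_handle(struct recovery_state *s)
{
    if (s->open_handles > 0 && --s->open_handles == 0 && s->fault)
        s->reloading = 1; /* last handle closed: start the reload */
}
```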
MMU Fault Handling – Sequence
[Figure: sequence diagram (sd Ducati_mmu_fault_recovery) spanning the application, the Tiler driver, the SysLink daemon/DEH, the kernel device_error_handler, remote_proc, IOMMU/DMM_USER, and Ducati. The application, the daemon, and the Tiler driver register for MMU fault notification. On a Ducati MMU fault, the fault callback reaches device_error_handler, which sets the processor state to FAULT and kicks off the recovery sequence: the daemon releases Ducati resources, the processor is stopped, IPC handles and the IOMMU handle are released, and the MMU entries are cleaned. The notified applications release their handles similarly, including the Tiler driver handle, whose close cleans the Tiler allocations. Reloading starts after all processes have closed their IPC handles.]
Ducati Exceptions (Data Aborts)
• Ducati can crash for reasons other than MMU faults; these errors are handled using the SYS error mechanism.
• Ducati sends a SYS error event to the device error handler on the A9 side.
• The event is notified to the registered users on the A9.
• Expectations from users:
  – Register for the SYS error notification
  – On error notification, close the IPC handle(s)
• See the next slide for the sequence flow.
Ducati Exceptions (Data Aborts) – Sequence
[Figure: sequence diagram (sd Ducati_crash_recovery) spanning the application, the SysLink daemon/DEH, the kernel device_error_handler, and the Ducati ducati_exception_handler. The daemon and applications register for exception notification. When Ducati crashes due to an exception, the exception handler sends a SYS ERR message (or a watchdog interrupt), releases system resources, and gates IPC communication. The kernel device_error_handler dumps out the exception information and notifies the registered users; the daemon and applications close their handles to IPC, Tiler, and the IOMMU, after which the recovery process starts.]
Ducati Hang
• Ducati hangs, where a task on Ducati spins in an endless loop, are detected using a GP timer/watchdog-like mechanism.
• Expectations from users:
  – Register for the hang notification
  – On event notification, close the IPC handles
A9 Process Abnormal Termination
• A Ducati component/task needs to know when an A9 process terminates abnormally, so that it can release the corresponding Ducati resources.
• Such an occurrence is communicated to Ducati using the PID_DEATH event.
• The Ducati component registers with the SysLink framework to receive this notification.
• See the following slide for a sequence flow illustrating the case where an A9 process that exchanged buffers with Ducati is terminated.
A9 Process Abnormal Termination – Sequence
[Figure: sequence diagram (sd A9 Process abnormal termination) spanning the application, the kernel (IOMMU/DMM_USER, device_error_handler), the Ducati task, and the ducati_resource_mgr. The resource manager registers an app_release_notification callback. The A9 application creates a Ducati task (whose A9 PID is saved), opens the IOMMU/DMM_USER device, and allocates a buffer; the kernel tracks the buffer against that PID while it is in use. When the app dies, the kernel closes the file descriptor, the app_release_notification callback fires, and notify_A9_task_term(pid) makes the Ducati resource manager release and kill the task created by that A9 PID while the kernel releases the DMM resources.]
SysLink Daemon Crash
• The SysLink Daemon is in charge of loading the base image and setting up IPC.
• Ducati is put in the reset state upon a SysLink Daemon crash.
• Expectations from users:
  – Register for the PROC_STOP event
  – On event notification, close the IPC handles
• See the following slide for the sequence flow.
SysLink Daemon Crash – Sequence
[Figure: sequence diagram (sd Syslink_daemon crash) spanning the application, the SysLink daemon/DEH, the kernel (device_error_handler, remote_proc, IOMMU/DMM_USER), and Ducati. The application registers for PROC STOP. On a daemon crash, the device_error_handler releases the daemon's resources, stops the processor via remote_proc, sets the state to PROC_STOPPED, and removes the MMU entries. PROC_STOP is notified to the registered applications, which close their IPC handles and release their IOMMU/DMM_USER handles.]
Ducati Crash Info - Example
• Error Type
  1. MMU Fault (source core not identifiable)
  2. SysError (source core (SysM3 or AppM3) is identified)
• Execution State
  – Type of task (Task, Swi, Hwi)
  – Task handle
  – Address and size of the stack
  – Snapshot of the internal state registers
• Stack
  – Stack of the offending thread
  – Bottom up (grows from bottom to top)
  – 0xbebebebe marks unfilled blocks
  – Depends on the type of build profile (whole_program_debug results in an optimized (shorter) stack)
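The 0xbebebebe fill pattern mentioned above also allows peak stack usage to be estimated: scan from the fill end of the dump for the first word that is no longer the fill value. The scan direction below assumes a descending stack dumped starting at its low (filled) end; this is an illustrative helper, not SysLink code:

```c
#include <stddef.h>
#include <stdint.h>

/* Estimate used stack bytes from a dump: words still holding the
 * 0xbebebebe fill value were never written. */
size_t stack_used_bytes(const uint32_t *low_end, size_t words)
{
    size_t still_filled = 0;
    while (still_filled < words && low_end[still_filled] == 0xbebebebeu)
        still_filled++;
    return (words - still_filled) * sizeof(uint32_t);
}
```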
Ducati Crash Info - Locating the Error Source
• Note the PC (R15) from the Execution State.
• Look up the corresponding symbol in the map file:
  – The map file is found under the package/cfg folder of the binary source folder; it has the same name as the loaded base image, with “.map” appended
  – Look under the “GLOBAL SYMBOLS: SORTED BY Symbol Address” section to easily find the symbol whose memory range encloses the PC address
  – Search for the symbol in the source code; the location can be further confirmed by matching function arguments and variable addresses with the values of R0, R1, R2, R3, etc.
  – The LR indicates the return point of the current subroutine and can also be used to confirm the location by identifying the nesting pattern; the debug profile is more meaningful in this regard, as optimization might not preserve the call sequence
Ducati Tracing
Ducati Tracing - Trace Daemon
• The SysLink Trace Daemon dumps out the traces from the Ducati cores.
• The tracing mechanism is based on a circular buffer in shared memory; each Ducati M3 has its own shared memory buffer.
• The trace daemon wakes up at regular intervals to dump out any traces available from Ducati; the interval at which it dumps the traces can be changed.
• On BIOS, SysMin is configured to route the prints to the trace buffer.
  – Use System_printf and System_flush to print and flush the traces to shared memory.
• To start the trace daemon:
  – ./syslink_tracedaemon.out
• Dump format:
[APPM3]: APP M3:MultiProc id = 1
[APPM3]: APPM3: Ipc_attach to host ...Host DONE
[APPM3]: APPM3: IPC attach to SYSM3... DONE
------
[SYSM3]: Ipc_start status = 0
[SYSM3]: Ipc_attach to host ...Host DONE
[SYSM3]: Ipc_attach to AppM3 ...AppM3 DONE