HSA RUNTIME
YEN-CHING CHUNG, NATIONAL TSING HUA UNIVERSITY
OUTLINE
Introduction
HSA Core Runtime API (Pre-release 1.0 provisional)
  Initialization and Shut Down
  Notifications (Synchronous/Asynchronous)
  Agent Information
  Signals and Synchronization (Memory-Based)
  Queues and Architected Dispatch
Summary
© Copyright 2014 HSA Foundation. All Rights Reserved
INTRODUCTION (1)
The HSA core runtime is a thin, user-mode API that provides the interface necessary for the host to launch compute kernels to the available HSA components.
The overall goal of the HSA core runtime design is to provide a high-performance dispatch mechanism that is portable across multiple HSA vendor architectures.
The dispatch mechanism differentiates the HSA runtime from other language runtimes by architected argument setting and kernel launching at the hardware and specification level.
The HSA core runtime API is standard across all HSA vendors, so that languages which use the HSA runtime can run on the platforms of any vendor that supports the API.
The implementation of the HSA runtime may include kernel-level components (required for some hardware components, for example AMD Kaveri) or may be entirely user-space (for example, simulators or CPU implementations).
INTRODUCTION (2)
[Figure: The software architecture stack without the HSA runtime. Programming model layer: OpenCL, Java, OpenMP, and DSL applications. Language runtime layer: OpenCL, Java, OpenMP, and DSL runtimes. Below them, one driver per vendor (Vendor 1 … Vendor m), each driving Component 1 … Component N.]
[Figure: The software architecture stack with the HSA runtime. The same applications and language runtimes sit on top of an HSA runtime and HSA finalizer per HSA vendor (HSA Vendor 1 … HSA Vendor m), each driving Component 1 … Component N.]
INTRODUCTION (3)
[Figure: Program flow from Start Program to Exit Program, comparing the OpenCL runtime with the HSA runtime on an agent.]
OpenCL runtime steps: Platform, Device, and Context Initialization; Build Kernel; SVM Allocation and Kernel Arguments Setting; Command Queue; Resource Deallocation
HSA runtime steps: HSA Runtime Initialization and Topology Discovery; HSAIL Finalization and Linking; HSA Memory Allocation; Enqueue Dispatch Packet; HSA Runtime Close
INTRODUCTION (4)
HSA Platform System Architecture Specification support
  Runtime initialization and shutdown
  Notifications (synchronous/asynchronous)
  Agent information
  Signals and synchronization (memory-based)
  Queues and architected dispatch
  Memory management
HSAIL support
  Finalization, linking, and debugging
Image and Sampler support
[Figure: HSA runtime program flow: HSA Runtime Initialization and Topology Discovery; HSAIL Finalization and Linking; HSA Memory Allocation; Enqueue Dispatch Packet; HSA Runtime Close.]
RUNTIME INITIALIZATION AND SHUTDOWN
OUTLINE
Runtime Initialization API: hsa_init
Runtime Shut Down API: hsa_shut_down
Examples
HSA RUNTIME INITIALIZATION
When the API is invoked for the first time in a given process, a runtime instance is created.
A typical runtime instance may contain information about the platform, topology, reference count, queues, signals, etc.
The API can be called multiple times by applications.
  Only a single runtime instance will exist for a given process.
  Whenever the API is invoked, the reference count is increased by one.
HSA RUNTIME SHUT DOWN
When the API is invoked, the reference count is decreased by 1.
When the reference count drops below 1:
  All the resources associated with the runtime instance (queues, signals, topology information, etc.) are considered invalid, and any attempt to reference them in subsequent API calls results in undefined behavior.
  The user may call hsa_init to initialize the HSA runtime again.
  The HSA runtime may release the resources associated with it.
EXAMPLE – RUNTIME INITIALIZATION (1)
Data structure for the runtime instance
If hsa_init is called more than once, increase ref_count by 1
EXAMPLE – RUNTIME INITIALIZATION (2)
When hsa_init is called for the first time, allocate resources and set the reference count
Get the number of HSA agents
Initialize agents
Create an empty agent list
If initialization failed, release resources
Create topology table
EXAMPLE - RUNTIME INSTANCE (1)
Platform Name: Generic
Agent: 2, Memory: 1, Cache: 1

Agents
Field               Agent-0    Agent-1
node_id             0          0
id                  0          0
type                CPU        GPU
vendor              Generic    Generic
name                Generic    Generic
wavefront_size      0          64
queue_size          200        200
group_memory        0          64
fbarrier_max_count  1          1
is_pic_supported    0          1
…

Memory
node_id         0
id              0
segment_type    111111
address_base    0x0001
size            2048 MB
peak_bandwidth  6553.6 mbps

Cache
node_id          0
id               0
levels           1
associativity    1
cache size       64KB
cache line size  4
is_inclusive     1
EXAMPLE - RUNTIME INSTANCE (2)
Platform Header File
  *base_address = 0x00001
  Size = 248
  system_timestamp_frequency_mhz = 200
  signal_maximum_wait = 1/200
  *node_id
  no_nodes = 1
  *agent_list
  no_agent = 2
  *memory_descriptor_list
  no_memory_descriptor = 1
  *cache_descriptor_list
  no_cache_descriptor = 1
Agent-0
  node_id = 0
  id = 0
  agent_type = 1 (CPU)
  vendor[16] = Generic
  name[16] = Generic
  wavefront_size = 0
  queue_size = 200
  group_memory_size_bytes = 0
  fbarrier_max_count = 1
  is_pic_supported = 0
  …
Agent-1
  node_id = 0
  id = 0
  agent_type = 2 (GPU)
  vendor[16] = Generic
  name[16] = Generic
  wavefront_size = 64
  queue_size = 200
  group_memory_size_bytes = 64
  fbarrier_max_count = 1
  is_pic_supported = 1
  …
Memory
  node_id = 0
  id = 0
  supported_segment_type_mask = 111111
  virtual_address_base = 0x0001
  size_in_bytes = 2048MB
  peak_bandwidth_mbps = 6553.6
Cache
  node_id = 0
  id = 0
  levels = 1
  *associativity -> 1, NULL
  *cache_size -> 64KB, NULL
  *cache_line_size -> 4, NULL
  *is_inclusive -> 1, NULL
[Figure: the list pointers in the platform header reference NULL-terminated lists; the offsets shown in the figure are 0; 45 and 165; 285; and 325.]
EXAMPLE – RUNTIME SHUT DOWN
If ref_count < 1, then free the list; Otherwise decrease the ref_count by 1.
NOTIFICATIONS (SYNCHRONOUS/ASYNCHRONOUS)
OUTLINE
Synchronous Notifications
  hsa_status_t
  hsa_status_string
Asynchronous Notifications
Example
SYNCHRONOUS NOTIFICATIONS
Notifications (errors, events, etc.) reported by the runtime can be synchronous or asynchronous.
The HSA runtime uses the return values of API functions to pass notifications synchronously.
A status code is defined as an enumeration, hsa_status_t, to capture the return value of any API function that has been executed, except accessors/mutators.
The notification is a status code that indicates success or error.
  Success is represented by HSA_STATUS_SUCCESS, which is equivalent to zero.
  An error status is assigned a positive integer and its identifier starts with the HSA_STATUS_ERROR prefix.
  The status code can help to determine the cause of the unsuccessful execution.
STATUS CODE QUERY
Query additional information on a status code (hsa_status_string)
Parameters
  status (input): Status code that the user is seeking more information on
  status_string (output): An ISO/IEC 646 encoded English language string that potentially describes the error status
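The status-code convention and the string query can be modelled in a few lines of Python. Only "success is zero" and "errors are positive with an HSA_STATUS_ERROR prefix" come from the slides; the numeric error values, the description strings, and the helper names `model_status_string` and `is_error` are assumptions made for this sketch.

```python
from enum import IntEnum


class Status(IntEnum):
    # Numeric error values are made up for illustration.
    HSA_STATUS_SUCCESS = 0
    HSA_STATUS_ERROR = 1
    HSA_STATUS_ERROR_INVALID_ARGUMENT = 2


_DESCRIPTIONS = {
    Status.HSA_STATUS_SUCCESS: "The function executed successfully.",
    Status.HSA_STATUS_ERROR: "A generic error occurred.",
    Status.HSA_STATUS_ERROR_INVALID_ARGUMENT: "An argument is invalid.",
}


def model_status_string(status):
    """Model of a status query: returns (status of the query, description)."""
    try:
        return Status.HSA_STATUS_SUCCESS, _DESCRIPTIONS[Status(status)]
    except ValueError:
        # Unknown code: the query itself reports an error.
        return Status.HSA_STATUS_ERROR_INVALID_ARGUMENT, None


def is_error(status):
    """Any value other than HSA_STATUS_SUCCESS signals an error."""
    return status != Status.HSA_STATUS_SUCCESS
```

Callers would typically test the returned status first and only then look at the description string.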
ASYNCHRONOUS NOTIFICATIONS
The runtime passes asynchronous notifications by calling user-defined callbacks.
  For instance, queues are a common source of asynchronous events because the tasks queued by an application are asynchronously consumed by the packet processor.
  Callbacks are associated with queues when they are created.
  When the runtime detects an error in a queue, it invokes the callback associated with that queue and passes it an error flag (indicating what happened) and a pointer to the erroneous queue.
The HSA runtime does not implement any default callbacks.
Take care when using blocking functions within a callback implementation: a callback that does not return can leave the runtime in an undefined state.
EXAMPLE - CALLBACK
Pass the callback function when creating the queue
If the queue is empty, set the event and invoke the callback
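A minimal Python model of the callback mechanism, assuming a hypothetical `ModelQueue` class and a made-up `"QUEUE_EMPTY"` error flag. It only shows the binding of a callback at queue creation and the arguments it receives (error flag and queue); it does not reproduce the real queue-creation signature.

```python
class ModelQueue:
    """Hypothetical queue that remembers the callback given at creation."""

    def __init__(self, size, callback=None):
        self.size = size
        self.packets = []
        self.callback = callback  # the runtime supplies no default callback

    def report_error(self, error_flag):
        # The runtime detects a problem and notifies the application
        # asynchronously through the callback bound to this queue.
        if self.callback is not None:
            self.callback(error_flag, self)


events = []


def on_queue_error(error_flag, queue):
    # Keep callbacks short and non-blocking: a callback that never
    # returns can leave the runtime in an undefined state.
    events.append((error_flag, queue.size))


queue = ModelQueue(size=4, callback=on_queue_error)
queue.report_error("QUEUE_EMPTY")  # hypothetical error flag
```

A queue created without a callback silently drops the notification in this model, which is why applications are expected to supply one.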
AGENT INFORMATION
OUTLINE
Agent information
  hsa_node_t
  hsa_agent_t
  hsa_agent_info_t
  hsa_component_feature_t
Agent information manipulation APIs
  hsa_iterate_agents
  hsa_agent_get_info
Example
INTRODUCTION
The runtime exposes a list of agents that are available in the system.
  An HSA agent is a hardware component that participates in the HSA memory model.
  An HSA agent can submit AQL packets for execution.
  An HSA agent may also be, but is not required to be, an HSA component. It is possible for a system to include HSA agents that are neither an HSA component nor a host CPU.
HSA agents are defined as opaque handles of type hsa_agent_t.
The HSA runtime provides APIs for applications to traverse the list of available agents and query attributes of a particular agent.
AGENT INFORMATION (1)
Opaque agent handle (hsa_agent_t)
Opaque NUMA node handle (hsa_node_t)
  An HSA memory node is a node that delineates a set of system components (host CPUs and HSA components) with "local" access to a set of memory resources attached to the node's memory controller and appropriate HSA-compliant access attributes.
AGENT INFORMATION (2)
Component features
  An HSA component is a hardware or software component that can be a target of the AQL queries and conforms to the memory model of the HSA.
Values
  HSA_COMPONENT_FEATURE_NONE = 0: No component capabilities. The device is an agent, but not a component.
  HSA_COMPONENT_FEATURE_BASIC = 1: The component supports the HSAIL instruction set and all the AQL packet types except Agent dispatch.
  HSA_COMPONENT_FEATURE_ALL = 2: The component supports the HSAIL instruction set and all the AQL packet types.
AGENT INFORMATION (3)
Agent attributes
Values
  HSA_AGENT_INFO_MAX_GRID_DIM
  HSA_AGENT_INFO_MAX_WORKGROUP_DIM
  HSA_AGENT_INFO_QUEUE_MAX_PACKETS
  HSA_AGENT_INFO_CLOCK
  HSA_AGENT_INFO_CLOCK_FREQUENCY
  HSA_AGENT_INFO_MAX_SIGNAL_WAIT
  HSA_AGENT_INFO_NAME
  HSA_AGENT_INFO_NODE
  HSA_AGENT_INFO_COMPONENT_FEATURES
  HSA_AGENT_INFO_VENDOR_NAME
  HSA_AGENT_INFO_WAVEFRONT_SIZE
  HSA_AGENT_INFO_CACHE_SIZE
AGENT INFORMATION MANIPULATION (1)
hsa_iterate_agents: Iterate over the available agents, and invoke an application-defined callback on every iteration
  If the callback returns a status other than HSA_STATUS_SUCCESS for a particular iteration, the traversal stops and the function returns that status value.
Parameters
  callback (input): Callback to be invoked once per agent
  data (input): Application data that is passed to callback on every iteration. Can be NULL.
AGENT INFORMATION MANIPULATION (2)
hsa_agent_get_info: Get the current value of an attribute for a given agent
Parameters
  agent (input): A valid agent
  attribute (input): Attribute to query
  value (output): Pointer to a user-allocated buffer in which to store the value of the attribute. If the buffer passed by the application is not large enough to hold the value of attribute, the behavior is undefined.
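The traversal contract (call the callback once per agent, stop on the first status other than success) can be sketched in Python. `model_iterate_agents`, `model_agent_get_info`, the `STOP` status, and the dictionary-based agent table are hypothetical; the real functions take C handles and an attribute enumeration.

```python
SUCCESS = 0  # stands in for HSA_STATUS_SUCCESS
STOP = 1     # hypothetical non-success status used to end the walk

# Placeholder agent table; attribute names are illustrative.
AGENTS = [
    {"name": "Generic", "type": "CPU", "wavefront_size": 0},
    {"name": "Generic", "type": "GPU", "wavefront_size": 64},
]


def model_iterate_agents(callback, data=None):
    """Call callback(agent, data) per agent; stop on the first non-success."""
    for agent in AGENTS:
        status = callback(agent, data)
        if status != SUCCESS:
            return status
    return SUCCESS


def model_agent_get_info(agent, attribute):
    """Return the value of one attribute of an agent."""
    return agent[attribute]


def find_first_gpu(agent, data):
    if model_agent_get_info(agent, "type") == "GPU":
        data.append(agent)
        return STOP  # found it: end the traversal early
    return SUCCESS


found = []
result = model_iterate_agents(find_first_gpu, found)
```

Returning a non-success status from the callback is the only way to end the walk early, so applications commonly reserve one status value for "found, stop here".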
EXAMPLE - AGENT ATTRIBUTE QUERY
Copy agent attribute information
Get the agent handle of Agent 0
SIGNALS AND SYNCHRONIZATION (MEMORY-BASED)
OUTLINE
Signal
Signal manipulation API
  Create/Destroy
  Query
  Send
  Atomic Operations
Signal wait
  Get time out
  Signal Condition
Example
SIGNAL (1)
HSA agents can communicate with each other by using coherent global memory, or by using signals.
A signal is represented by an opaque signal handle.
A signal carries a value, which can be updated or conditionally waited upon via an API call or HSAIL instruction.
The value occupies four or eight bytes depending on the machine model in use.
SIGNAL (2)
Updating the value of a signal is equivalent to sending the signal.
In addition to the update (store) of signals, the API for sending signals must support other atomic operations with specific memory order semantics.
  Atomic operations: AND, OR, XOR, Add, Subtract, Exchange, and CAS
  Memory order semantics: Release and Relaxed
SIGNAL CREATE/DESTROY
Create a signal
Parameters
  initial_value (input): Initial value of the signal.
  signal_handle (output): Signal handle.
Destroy a signal previously created by hsa_signal_create
Parameter
  signal_handle (input): Signal handle.
SIGNAL LOAD/STORE
Atomically read the current signal value with acquire semantics
Atomically read the current signal value with relaxed semantics
Send and atomically set the value of a signal with release semantics
Send and atomically set the value of a signal with relaxed semantics
SIGNAL ADD/SUBTRACT
Send and atomically increment the value of a signal by a given amount with release semantics
Send and atomically increment the value of a signal by a given amount with relaxed semantics
Send and atomically decrement the value of a signal by a given amount with release semantics
Send and atomically decrement the value of a signal by a given amount with relaxed semantics
SIGNAL AND (OR, XOR)/EXCHANGE
Send and atomically perform a logical AND operation on the value of a signal and a given value with release semantics
Send and atomically perform a logical AND operation on the value of a signal and a given value with relaxed semantics
Send and atomically set the value of a signal and return its previous value with release semantics
Send and atomically set the value of a signal and return its previous value with relaxed semantics
SIGNAL WAIT (1)
The application may wait on a signal, with a condition specifying the terms of the wait.
Signal wait condition operator
Values
  HSA_EQ: The two operands are equal.
  HSA_NE: The two operands are not equal.
  HSA_LT: The first operand is less than the second operand.
  HSA_GTE: The first operand is greater than or equal to the second operand.
SIGNAL WAIT (2)
The wait can be done either in the HSA component via an HSAIL wait instruction or via a runtime API defined here.
  Waiting on a signal returns the current value at the opaque signal object.
  The wait may have a runtime-defined timeout, which indicates the maximum amount of time that an implementation can spend waiting.
The signal infrastructure allows for multiple senders/waiters on a single signal.
Wait reads the value, hence acquire synchronization may be applied.
SIGNAL WAIT (3)
Signal wait
Parameters
  signal_handle (input): A signal handle
  condition (input): Condition used to compare the passed and signal values
  compare_value (input): Value to compare with
  return_value (output): Pointer to the location into which the current signal value is read
SIGNAL WAIT (4)
Signal wait with timeout
Parameters
  signal_handle (input): A signal handle
  timeout (input): Maximum wait duration (a value of zero indicates no maximum)
  long_wait (input): Hint indicating that the signal value is not expected to meet the given condition in a short period of time. The HSA runtime may use this hint to optimize the wait implementation.
  condition (input): Condition used to compare the passed and signal values
  compare_value (input): Value to compare with
  return_value (output): Pointer to the location into which the current signal value is read
EXAMPLE – SIGNAL WAIT (1)
[Figure: timelines of thread_1 and thread_2 sharing one signal]
value = 0
thread_1: hsa_signal_wait_timeout_acquire(value == 2); thread_1 is blocked
thread_2: hsa_signal_add_relaxed(value = value + 3); value = 3
thread_2: hsa_signal_subtract_relaxed(value = value - 1); value = 2
thread_1: condition satisfied, the signal value is returned and the execution of thread_1 continues
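The two-thread timeline can be reproduced with a small Python model built on `threading.Condition`. `ModelSignal` and its methods are hypothetical names; Python locks do not express the acquire/release/relaxed distinction, so the sketch only shows the blocking and wake-up behaviour, not the memory ordering.

```python
import threading


class ModelSignal:
    """Hypothetical signal: an integer value plus waiters."""

    def __init__(self, initial_value=0):
        self._value = initial_value
        self._cond = threading.Condition()

    def add(self, amount):
        with self._cond:
            self._value += amount
            self._cond.notify_all()  # updating the value sends the signal

    def subtract(self, amount):
        self.add(-amount)

    def wait_eq(self, compare_value, timeout=None):
        """Block until value == compare_value or timeout; return the value seen."""
        with self._cond:
            self._cond.wait_for(lambda: self._value == compare_value, timeout)
            return self._value


signal = ModelSignal(0)
seen = []

thread_1 = threading.Thread(
    target=lambda: seen.append(signal.wait_eq(2, timeout=5))
)
thread_1.start()

signal.add(3)       # value = 3, condition not met
signal.subtract(1)  # value = 2, thread_1 may continue
thread_1.join()
```

The waiter returns the current value whether the condition was met or the timeout expired, so callers need to re-check the value they get back.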
EXAMPLE – SIGNAL WAIT (2)
If signal_handle is invalid, then return signal invalid status
Signal wait condition function
Compare tmp->value with compare_value to see if the condition is satisfied
If timeout = 0, then return signal time out status
If the condition is satisfied, then return signal and status
QUEUES AND ARCHITECTED DISPATCH
OUTLINE
Queues
  Queue Types and Structure
  HSA runtime API for Queue Manipulations
Architected Queuing Language (AQL) Support
  Packet type
  Packet header
Examples
  Enqueue Packet
  Packet Processor
INTRODUCTION (1)
An HSA-compliant platform supports the allocation of multiple user-level command queues.
  A user-level command queue is characterized as runtime-allocated, user-level accessible virtual memory of a certain size, containing packets defined in the Architected Queuing Language (AQL packets).
Queues are allocated by HSA applications through the HSA runtime.
HSA software receives memory-based structures to configure the hardware queues to allow for efficient software management of the hardware queues of the HSA agents.
This queue memory shall be processed by the HSA Packet Processor as a ring buffer.
Queues are read-only data structures.
  Writing values directly to a queue structure results in undefined behavior.
  However, HSA agents can directly modify the contents of the buffer pointed to by base_address, or use runtime APIs to access the doorbell signal or the service queue.
INTRODUCTION (2)
Two queue types, AQL and Service Queues, are supported.
  An AQL Queue consumes AQL packets that are used to specify the information of kernel functions that will be executed on the HSA component.
  A Service Queue consumes agent dispatch packets that are used to specify runtime-defined or user-registered functions that will be executed on the agent (typically, the host CPU).
INTRODUCTION (3)
AQL queue structure
INTRODUCTION (4)
In addition to the data held in the queue structure, the queue also defines two properties (readIndex and writeIndex) that define the location of the "head" and "tail" of the queue.
  readIndex: The read index is a 64-bit unsigned integer that specifies the packetID of the next AQL packet to be consumed by the packet processor.
  writeIndex: The write index is a 64-bit unsigned integer that specifies the packetID of the next AQL packet slot to be allocated.
Neither index is directly exposed to the user, who can only access them by using dedicated HSA core runtime APIs.
The available index functions differ on the index of interest (read or write), the action to be performed (addition, compare and swap, etc.), and the memory consistency model (relaxed, release, etc.).
INTRODUCTION (5)
The read index is automatically advanced when a packet is read by the packet processor.
When the packet processor observes that
  the read index matches the write index, the queue can be considered empty;
  the write index is greater than or equal to the sum of the read index and the size of the queue, the queue is full.
The doorbell_signal field of a queue contains a signal that is used by the agent to inform the packet processor to process the packets it writes.
The value signaled on the doorbell is equal to the ID of the packet that is ready to be launched.
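The empty and full conditions above translate directly into two predicates. The helper names are hypothetical, and mapping a packet ID to a slot with a modulo is an assumption that follows from the queue being processed as a ring buffer.

```python
def is_empty(read_index, write_index):
    """Empty: the read index has caught up with the write index."""
    return read_index == write_index


def is_full(read_index, write_index, queue_size):
    """Full: the write index is a whole queue ahead of the read index."""
    return write_index >= read_index + queue_size


def slot_of(packet_id, queue_size):
    # Assumption: indices grow monotonically and the slot wraps around.
    return packet_id % queue_size


QUEUE_SIZE = 4
```

For example, with a queue of four slots, read index 5, and write index 9, the queue is full and packet 9 would reuse slot 1 once packet 5 has been consumed.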
INTRODUCTION (6)
The new task might be consumed by the packet processor even before the doorbell signal has been signaled by the agent.
  This is because the packet processor might be already processing some other packets and observes that there is new work available, so it processes the new packets.
In any case, the agent must ring the doorbell for every batch of packets it writes.
QUEUE CREATE/DESTROY
Create a user mode queue
  When a queue is created, the runtime also allocates the packet buffer and the completion signal.
  The application should only rely on the status code returned to determine if the queue is valid.
Destroy a user mode queue
  A queue must not be accessed after it has been destroyed.
  When a queue is destroyed, the state of the AQL packets that have not yet been fully processed becomes undefined.
GET READ/WRITE INDEX
Atomically retrieve the read index of a queue with acquire semantics
Atomically retrieve the write index of a queue with acquire semantics
Atomically retrieve the read index of a queue with relaxed semantics
Atomically retrieve the write index of a queue with relaxed semantics
SET READ/WRITE INDEX
Atomically set the read index of a queue with release semantics
Atomically set the read index of a queue with relaxed semantics
Atomically set the write index of a queue with release semantics
Atomically set the write index of a queue with relaxed semantics
COMPARE AND SWAP WRITE INDEX
Atomically compare and set the write index of a queue with acquire/release/relaxed/acquire-release semantics
Parameters
  queue (input): A queue
  expected (input): The expected index value
  val (input): Value to copy to the write index if expected matches the observed write index
Return value
  Previous value of the write index
ADD WRITE INDEX
Atomically increment the write index of a queue by an offset with release/acquire/relaxed/acquire-release semantics
Parameters
  queue (input): A queue
  val (input): The value to add to the write index
Return value
  Previous value of the write index
ARCHITECTED QUEUING LANGUAGE (AQL)
An HSA-compliant system provides a command interface for the dispatch of HSA agent commands.
  This command interface is provided by the Architected Queuing Language (AQL).
AQL allows HSA agents to build and enqueue their own command packets, enabling fast and low-power dispatch.
AQL also provides support for HSA component queue submissions.
  The HSA component kernel can write commands in AQL format.
AQL PACKET (1)
AQL packet format
Values
  Always reserved packet (0): Packet format is set to always reserved when the queue is initialized.
  Invalid packet (1): Packet format is set to invalid when the readIndex is incremented, making the packet slot available to the HSA agents.
  Dispatch packet (2): Dispatch packets contain jobs for the HSA component and are created by HSA agents.
  Barrier packet (3): Barrier packets can be inserted by HSA agents to delay processing subsequent packets. All queues support barrier packets.
  Agent dispatch packet (4): Dispatch packets contain jobs for the HSA agent and are created by HSA agents.
AQL PACKET (2)
HSA signaling object handle used to indicate completion of the job
EXAMPLE - ENQUEUE AQL PACKET (1)
An HSA agent submits a task to a queue by performing the following steps:
  1. Allocate a packet slot (by incrementing the writeIndex)
  2. Initialize the packet and copy the packet to a queue associated with the Packet Processor
  3. Mark the packet as valid
  4. Notify the Packet Processor of the packet (with the doorbell signal)
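The four enqueue steps can be sketched as a single-threaded Python model. `ModelQueue`, `add_write_index`, and `enqueue` are hypothetical names, the packet is a plain dictionary, and the plain integer updates stand in for the atomic index operations a real multi-producer implementation would need.

```python
ALWAYS_RESERVED, INVALID, DISPATCH = 0, 1, 2  # packet format values from the slides


class ModelQueue:
    def __init__(self, size):
        self.size = size
        self.slots = [{"format": ALWAYS_RESERVED} for _ in range(size)]
        self.read_index = 0
        self.write_index = 0
        self.doorbell = None  # last packet ID signaled, None if never rung

    def add_write_index(self, n):
        """Advance the write index by n and return its previous value."""
        previous = self.write_index
        self.write_index += n
        return previous


def enqueue(queue, payload):
    # 1. Allocate a packet slot by incrementing the write index.
    if queue.write_index >= queue.read_index + queue.size:
        return None  # queue full; the caller retries later
    packet_id = queue.add_write_index(1)
    slot = queue.slots[packet_id % queue.size]
    # 2. Initialize the packet body while it is still not valid.
    slot["payload"] = payload
    # 3. Mark the packet as valid by writing its format last.
    slot["format"] = DISPATCH
    # 4. Notify the packet processor through the doorbell.
    queue.doorbell = packet_id
    return packet_id


q = ModelQueue(size=2)
ids = [enqueue(q, "kernel_a"), enqueue(q, "kernel_b"), enqueue(q, "kernel_c")]
```

Writing the format field last matters: until that write, the packet processor still sees a reserved or invalid slot and will not launch a half-initialized packet.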
EXAMPLE - ENQUEUE AQL PACKET (2)
[Figure: Dispatch Queue with ReadIndex and WriteIndex]
Allocate an AQL packet slot
Initialize packet
Copy the packet into the queue. Note that a lock can be used here to prevent a race condition in a multithreaded environment
Send doorbell signal
EXAMPLE - PACKET PROCESSOR
[Figure: Dispatch Queue with ReadIndex and WriteIndex]
Receive doorbell
If there is any packet in the queue, process the packet.
Get packet content
Check if barrier packet
Update readIndex, change packet state to invalid, and send completion signal.
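A matching consumer loop, again as a hypothetical Python model rather than the slide's own listing. Packets are dictionaries, the completion signal is modelled as an entry appended to a list, and barrier handling is reduced to a comment because the slides do not describe the dependency mechanism.

```python
INVALID, DISPATCH, BARRIER = 1, 2, 3  # packet format values from the slides


def process_packets(slots, read_index, write_index, completed):
    """Consume packets in [read_index, write_index); return the new read index."""
    size = len(slots)
    while read_index != write_index:  # queue not empty
        slot = slots[read_index % size]
        if slot["format"] == BARRIER:
            pass  # a real processor would wait here before moving on
        elif slot["format"] == DISPATCH:
            slot["run"]()  # launch the job
        slot["format"] = INVALID  # slot may be reused by producers
        read_index += 1
        completed.append(slot["completion"])  # send the completion signal
    return read_index


log, done = [], []
slots = [
    {"format": DISPATCH, "run": lambda: log.append("a"), "completion": "sig0"},
    {"format": BARRIER, "completion": "sig1"},
    {"format": DISPATCH, "run": lambda: log.append("b"), "completion": "sig2"},
    {"format": INVALID},
]
new_read_index = process_packets(slots, read_index=0, write_index=3, completed=done)
```

The loop stops as soon as the read index reaches the write index, which is the empty condition; the fourth slot is never touched because no packet was written there.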
MEMORY MANAGEMENT
OUTLINE
Memory registration and deregistration
Memory region and memory segment
APIs for memory region manipulation
APIs for memory registration and deregistration
INTRODUCTION
One of the key features of HSA is its ability to share global pointers between the host application and code executing on the HSA component.
  This ability means that an application can directly pass a pointer to memory allocated on the host to a kernel function dispatched to a component, without an intermediate copy.
When a buffer created in the host is also accessed by a component, programmers are encouraged to register the corresponding address range beforehand.
  Registering memory expresses an intention to access (read or write) the passed buffer from a component other than the host. This is a performance hint that allows the runtime implementation to know ahead of time which buffers will be accessed by some of the components.
When an HSA program no longer needs to access a registered buffer in a device, the user should deregister that virtual address range.
MEMORY REGION/SEGMENT
A memory region represents a virtual memory interval that is visible to a particular agent, and contains properties about how memory is accessed or allocated from that agent.
Memory segments
Values
  HSA_SEGMENT_GLOBAL = 1
  HSA_SEGMENT_PRIVATE = 2
  HSA_SEGMENT_GROUP = 4
  HSA_SEGMENT_KERNARG = 8
  HSA_SEGMENT_READONLY = 16
  HSA_SEGMENT_IMAGE = 32
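Because the segment values are distinct powers of two, they combine into a bit mask such as the supported_segment_type_mask = 111111 shown in the runtime instance example. A short Python sketch, using a hypothetical `Segment` flag type with the values listed above:

```python
from enum import IntFlag


class Segment(IntFlag):
    GLOBAL = 1
    PRIVATE = 2
    GROUP = 4
    KERNARG = 8
    READONLY = 16
    IMAGE = 32


all_segments = Segment(0b111111)  # all six bits set, as in the example mask
supports_kernarg = bool(all_segments & Segment.KERNARG)
global_and_group = Segment.GLOBAL | Segment.GROUP
```

Testing a single segment is a bitwise AND against the mask, and building a mask is a bitwise OR of the segments of interest.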
MEMORY REGION INFORMATION
Attributes of a memory region
Values
  HSA_REGION_INFO_BASE_ADDRESS
  HSA_REGION_INFO_SIZE
  HSA_REGION_INFO_NODE
  HSA_REGION_INFO_MAX_ALLOCATION_SIZE
  HSA_REGION_INFO_SEGMENT
  HSA_REGION_INFO_BANDWIDTH
  HSA_REGION_INFO_CACHED
MEMORY REGION MANIPULATION (1)
Get the current value of an attribute of a region
Iterate over the memory regions that are visible to an agent, and invoke an application-defined callback on every iteration
  If the callback returns a status other than HSA_STATUS_SUCCESS for a particular iteration, the traversal stops and the function returns that status value.
MEMORY REGION MANIPULATION (2)
Allocate a block of memory
Deallocate a block of memory previously allocated using hsa_memory_allocate
Copy a block of memory
  Copying a number of bytes larger than the size of the memory regions pointed to by dst or src results in undefined behavior.
MEMORY REGISTRATION/DEREGISTRATION
Register memory
Parameters
  address (input): A pointer to the base of the memory region to be registered. If a NULL pointer is passed, no operation is performed.
  size (input): Requested registration size in bytes. A size of zero is only allowed if address is NULL.
Deregister memory previously registered using hsa_memory_register
Parameter
  address (input): A pointer to the base of the memory region to be deregistered. If a NULL pointer is passed, no operation is performed.
EXAMPLE
Allocate a memory space
Use hsa_region_get_info to get the size in bytes of this memory space
Register this memory space as a performance hint
When the operation is finished, deregister and free this memory space
SUMMARY
Covered
  HSA Core Runtime API (Pre-release 1.0 provisional)
    Runtime Initialization and Shutdown (Open/Close)
    Notifications (Synchronous/Asynchronous)
    Agent Information
    Signals and Synchronization (Memory-Based)
    Queues and Architected Dispatch
    Memory Management
Not covered
  Extension of Core Runtime
  HSAIL Finalization, Linking, and Debugging
  Images and Samplers
QUESTIONS?