A step towards data orientation



Johan Torp. STHLM Game Developer Forum, 5/5 2011.

My background



› M.Sc. Computer Science. OO (Java) and functional programming (Haskell)
› Worked ~5 years outside game industry. C++, generic programming & boost, DbC
› AI coder at DICE ~2½ years

Optimal game dev credentials?


PC & normal sized apps = cache schmache

Games on consoles: 5000 L2 misses = ~1ms

- data-oriented design ftw!

› A lot of OO code and knowledge out there
› Incrementally moving from OO to cache-friendlier code

› Facts needed before looking at code
› Cache-friendly pathfinding
› Async vs sync code
› Questions

› Visual domain specific scripting language
› Gameplay / AI code in C++
› NavPower pathfinding middleware
› EASTL containers
› We love to blow up parts of our game worlds – and call this destruction

› PS3 has 1 core: 32KB data and 32KB instruction L1 cache, 512KB L2 I+D cache
› 360 has 3 cores: 32KB data and 32KB instruction L1 cache for each core, 1MB L2 I+D shared by all cores

› 1 L1 cache miss ~= 40 cycles

“You miss L1 so much that you cry yourself to sleep every night with a picture of it under your pillow” @okonomiyonda

› 1 L2 cache miss ~= 600 cycles
› 1 L2 cache miss ~= 20 matrix multiplications
› Other than heavy calculations: CPU performance ~= cache misses
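The cycle counts above can be turned into frame time with a little arithmetic. This is a minimal sketch that assumes a 3.2 GHz clock (the clock rate is not stated on the slides) and the ~600 cycles per L2 miss quoted above; `missCostMs` is a hypothetical helper name.

```cpp
#include <cassert>

// Assumption: 3.2 GHz CPU clock (not stated in the slides).
const double kCyclesPerSecond = 3.2e9;
// From the slides: one L2 miss costs roughly 600 cycles.
const double kCyclesPerL2Miss = 600.0;

// Estimated stall time in milliseconds for a given number of L2 misses.
double missCostMs(int l2Misses) {
    assert(l2Misses >= 0);
    const double cycles = l2Misses * kCyclesPerL2Miss;
    return cycles / kCyclesPerSecond * 1000.0;
}
```

Under these assumptions, `missCostMs(5000)` comes out a little under 1 ms, in line with the rule of thumb that 5000 L2 misses cost about 1 ms.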

Keep copy of common data nearby … in a compact representation

Pointer chasing thrashes both I-cache and D-cache

Often better to copy frequently accessed data once each frame, access copy instead


/// Temporary struct containing information about a single sensor.
/// Never stored between updates.
struct VisionInfo
{
    VisionInfo(const AiSettings& settings, EntryComponent& owner, ...);

    Vec3 eyePos;
    Vec2 eyeForwardXz;
    uint playerId;

    // Extracted from settings
    float centralAngle;
    float peripheralAngle;
    float seeingDistance;
    bool seeThroughTerrain;
};
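The idea of copying frequently accessed data once and then reading only the copy could look roughly like the sketch below. All types and functions here (`Settings`, `Sensor`, `SensorInfo`, `gatherSensorInfo`, `countSensorsSeeing`) are hypothetical stand-ins, not the engine's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical long-lived objects, reached through pointers (cold, scattered).
struct Settings { float seeingDistance; /* ...many unrelated fields... */ };
struct Sensor {
    const Settings* settings; // pointer chase needed to reach the distance
    Vec3 eyePos;
};

// Compact per-frame copy holding only what the inner loop reads.
struct SensorInfo {
    Vec3 eyePos;
    float seeingDistanceSq;
};

// Gather step, run once per frame: all pointer chasing happens here.
void gatherSensorInfo(const std::vector<const Sensor*>& sensors,
                      std::vector<SensorInfo>& out) {
    out.clear();
    out.reserve(sensors.size());
    for (std::size_t i = 0; i < sensors.size(); ++i) {
        assert(sensors[i] && sensors[i]->settings);
        const float d = sensors[i]->settings->seeingDistance;
        SensorInfo info = { sensors[i]->eyePos, d * d };
        out.push_back(info);
    }
}

// Hot loop: reads only the contiguous copies, no pointers followed.
int countSensorsSeeing(const std::vector<SensorInfo>& infos, const Vec3& target) {
    int count = 0;
    for (std::size_t i = 0; i < infos.size(); ++i) {
        const float dx = target.x - infos[i].eyePos.x;
        const float dy = target.y - infos[i].eyePos.y;
        const float dz = target.z - infos[i].eyePos.z;
        if (dx * dx + dy * dy + dz * dz <= infos[i].seeingDistanceSq) ++count;
    }
    return count;
}
```

The gather step pays for the cache misses once; every later query touches only a small, contiguous array.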


› Temporary data structures common in Data-Oriented Design
› Stack variables or alloca()
› Not suited for large amounts or large edge cases

› Put aside 8 x 128KB blocks for “scratch pad calculations”
› Linear allocator – doesn’t free within block
› Return whole block when done – zero fragmentation
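A linear allocator of the kind described above can be sketched in a few lines. `LinearArena` is a hypothetical name and this is not the actual scratch pad implementation; it only illustrates "bump a pointer, never free within the block, release everything at once".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical linear ("bump") allocator over one fixed block.
// Offsets are aligned relative to the block start, so the block itself
// is assumed to be suitably aligned by the caller.
class LinearArena {
public:
    LinearArena(void* block, std::size_t size)
        : m_begin(static_cast<char*>(block)), m_size(size), m_used(0) {}

    // Returns memory from the block, or nullptr when the block is exhausted.
    // alignment must be a power of two.
    void* allocate(std::size_t bytes, std::size_t alignment = 16) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::size_t aligned = (m_used + alignment - 1) & ~(alignment - 1);
        if (aligned > m_size || bytes > m_size - aligned) return nullptr;
        m_used = aligned + bytes;
        return m_begin + aligned;
    }

    // No per-allocation free: the whole block is released at once.
    void reset() { m_used = 0; }

    std::size_t used() const { return m_used; }

private:
    char* m_begin;
    std::size_t m_size;
    std::size_t m_used;
};
```

Because nothing is freed individually, allocation is a couple of arithmetic operations and the block cannot fragment.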

Find a good slot in fragmented memory space -> Expensive!
Container of new:ed objects scattered in memory -> Poor cache locality!
Mix short/long lived allocations -> fragmentation -> Lose memory over time!

You should prefer pre-allocated flat vectors and try to minimize new/malloc








› Find path
› Load / unload nav mesh section
› Add / remove obstacles
› Path invalidation detection
› Can go-tests
› Line- / can go straight-tests, circle tests, triangle tests


Collect and batch process for good cache locality

› Pathfinder - find path, path invalidation, circle/line tests
› Random position generator - can go-tests
› Manager - load nav mesh, obstacles, destruction, updates

Let some line tests in AI decision making remain synchronous

class Pathfinder
{
public:
    virtual PathHandle* findPath(const PathfindingPosition& start,
                                 const PathfindingPosition& end,
                                 float corridorRadius,
                                 PathHandle::StateListener* listener) = 0;

    virtual void releasePath(PathHandle* path) = 0;

    virtual bool canGoStraight(Vec3Ref start, Vec3Ref end, Vec3* collision = nullptr) const = 0;
};

typedef eastl::fixed_vector<Vec3, 8> WaypointVector;
typedef eastl::fixed_vector<float, 8> WaypointRadiusVector;

struct PathHandle
{
    enum State { ComputingPath, ValidPath, NoPathAvailable, RepathingRequired };

    class StateListener
    {
    public:
        virtual void onStateChanged(PathHandle* handle) = 0;
    };

    PathHandle() : waypoints(pathfindingArena()), radii(pathfindingArena()) {}

    WaypointVector waypoints;
    WaypointRadiusVector radii;
    State state;
};




class NavPowerPathfinder : public Pathfinder
{
public:
    virtual PathHandle* findPath(...) override;
    virtual PathHandle* findPathFromDestination(...) override;
    virtual void releasePath(...) override;
    virtual bool canGoStraight(...) const override;

    void updatePaths();
    void notifyPathListeners();

private:
    bfx::PolylinePathRCPtr m_paths[MaxPaths];
    PathHandle m_pathHandles[MaxPaths];
    PathHandle::StateListener* m_pathHandleListeners[MaxPaths];
    u64 m_usedPaths, m_updatedPaths, m_updatedValidPaths;
};
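The `u64` members next to the fixed-size arrays suggest that slot occupancy is tracked with one bit per path. That reading is an assumption, as is a limit of 64 paths; the slides do not confirm either. A minimal sketch of such a bitmask slot table, with the hypothetical name `SlotMask`:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-capacity slot table: one bit per slot, at most 64 slots.
class SlotMask {
public:
    static const int MaxSlots = 64;

    SlotMask() : m_used(0) {}

    // Claims the lowest free slot; returns -1 when all slots are taken.
    int acquire() {
        for (int i = 0; i < MaxSlots; ++i) {
            const std::uint64_t bit = std::uint64_t(1) << i;
            if ((m_used & bit) == 0) {
                m_used |= bit;
                return i;
            }
        }
        return -1;
    }

    // Marks a previously acquired slot as free again.
    void release(int slot) {
        assert(isUsed(slot));
        m_used &= ~(std::uint64_t(1) << slot);
    }

    bool isUsed(int slot) const {
        assert(slot >= 0 && slot < MaxSlots);
        return ((m_used >> slot) & 1u) != 0;
    }

private:
    std::uint64_t m_used;
};
```

The returned index would address the parallel arrays directly, so acquiring or releasing a path never touches the heap.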

1. Copy all new NavPower paths -> temporary representation
2. Drop unnecessary points for all paths
3. Corridor adjust all paths
4. Copy temporaries -> PathHandles

typedef eastl::vector<CorridorNode> Corridor;

ScratchPadArena scratch;
Corridor corridor(scratch);
corridor.resize(navPowerPath.size()); // Will allocate memory using scratch pad


for (...) // Loop through all paths in their corridor representation
    dropUnnecessaryPoints(it->corridor, scratchPad);

for (...)
    shrinkEndPoints(it->corridor);

for (...)
    calculateCornerDisplacements(it->corridor);

for (...)
    displaceCorners(it->corridor);

for (...)
    shrinkSections(it->corridor);

for (...)
    copyCorridorToHandle(it->corridor, it->pathHandle);
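The loop structure above runs one step over every path before starting the next step. A self-contained sketch of that pattern follows; the two steps (`dropCollinearPoints`, `pathLength`) are simplified, hypothetical stand-ins and not the corridor adjustment used in the talk.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point { float x, y; };
typedef std::vector<Point> Path;

// Stand-in step 1: drop interior points that lie on the straight line
// between the last kept point and the next point.
// (Exact comparison keeps the sketch short; real code would use a tolerance.)
void dropCollinearPoints(Path& path) {
    if (path.size() < 3) return;
    Path kept;
    kept.reserve(path.size());
    kept.push_back(path.front());
    for (std::size_t i = 1; i + 1 < path.size(); ++i) {
        const Point a = kept.back();
        const Point b = path[i];
        const Point c = path[i + 1];
        const float cross = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
        if (cross != 0.0f) kept.push_back(b);
    }
    kept.push_back(path.back());
    path.swap(kept);
}

// Stand-in step 2: total length of a path.
float pathLength(const Path& path) {
    float length = 0.0f;
    for (std::size_t i = 1; i < path.size(); ++i) {
        const float dx = path[i].x - path[i - 1].x;
        const float dy = path[i].y - path[i - 1].y;
        length += std::sqrt(dx * dx + dy * dy);
    }
    return length;
}

// Batch driver: finish each step for all paths before moving to the next,
// so each step's code and data stay hot while it runs.
// The output array is pre-sized by the caller.
void processPaths(std::vector<Path>& paths, std::vector<float>& lengths) {
    assert(lengths.size() == paths.size());
    for (std::size_t i = 0; i < paths.size(); ++i)
        dropCollinearPoints(paths[i]);
    for (std::size_t i = 0; i < paths.size(); ++i)
        lengths[i] = pathLength(paths[i]);
}
```

The alternative, running every step on one path before touching the next path, does the same work but keeps swapping which code and data are in cache.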


void NavPowerManager::update(float frameTime)
{
    m_streamingManager.update();
    m_destructionManager.update();
    m_obstacleManager.update();

    for (PositionGeneratorVector::const_iterator it = ...)
        (**it).update();

    for (PathfinderVector::const_iterator it = m_pathfinders.begin(), ...)
        (**it).updatePaths();
    for (PathfinderVector::const_iterator it = m_pathfinders.begin(), ...)
        (**it).notifyPathListeners();
}

[Figure: code blocks labeled AI Decision Making Code, Pathfinding Runtime Code, NavPower Code and Animation Code]


› Keep pathfinding code/data cache hot
› Avoid call sites' cache running cold
› Easier to jobify / SPUify
› Easy to schedule and avoid spikes

















[Figure labels: Waypoint Data, Corridor Radii, Waypoint Positions]

Each server update

1. Each AI decision making
2. Pathfinding manager update
   All pathfinding requests
   All corridor adjustments
   All PathHandle notifications -> path following -> server locomotion
3. Network pulse. Server locomotion -> client locomotion
4. ...rest of update

No extra latency added

› Callbacks. Delay? Fire in batch?
› Handle+poll instead of callbacks. Poll in batch?
› Record messages/events, act on them later... in batch?
› Assume success, recover from failure next update
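The handle+poll option above could be sketched as follows. `SquareService` is a deliberately trivial, hypothetical stand-in for an expensive service such as pathfinding: the call site only records a request, the work happens in one batch during `update()`, and the caller polls its handle afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical asynchronous service: callers get a handle and poll it later.
class SquareService {
public:
    enum State { Pending, Done };
    typedef std::size_t Handle;

    // Records the request; no work is done at the call site.
    Handle request(int input) {
        Job job = { input, 0, Pending };
        m_jobs.push_back(job);
        return m_jobs.size() - 1;
    }

    // Called once per update: processes every pending request in one batch.
    void update() {
        for (std::size_t i = 0; i < m_jobs.size(); ++i) {
            if (m_jobs[i].state == Pending) {
                m_jobs[i].result = m_jobs[i].input * m_jobs[i].input;
                m_jobs[i].state = Done;
            }
        }
    }

    // Polling interface: the result may only be read once the job is Done.
    State state(Handle h) const {
        assert(h < m_jobs.size());
        return m_jobs[h].state;
    }
    int result(Handle h) const {
        assert(state(h) == Done);
        return m_jobs[h].result;
    }

private:
    struct Job { int input; int result; State state; };
    std::vector<Job> m_jobs;
};
```

A caller would issue `request()` during decision making, let the manager call `update()` once, and then check `state()` before reading `result()`. Handles are plain indices here, which is exactly where the index fiddling and lifetime issues come from.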

+ Cache friendly & parallelizable
+ Easy to profile & schedule
+ Avoid bugs with long synchronous callback chains
+ Modular

- More glue code: managers, handles, polling update calls, multiple representations of the same data

- More bugs: index fiddling, lifetime handling, latency, representations drifting out of sync

- Callstack won’t tell you everything: a breakpoint in sync code gives an easy-to-debug vertical slice...

...but can we afford vertical deep dives?

› Do not have to abandon OO nor rewrite the world
› Start small, batch a bit, cut worst pointer chasing, avoid deep dives, grow from there
› Much easier to rewrite a system in a DO fashion afterwards

Existing code is crystallized knowledge, refactor incrementally to learn!

› Background: Console caches, heap allocations expensive, temporary memory
› AI decision making – pathfinding – animation
› Code: Async abstractions, handles, scratch pad, fixed_vector, batch processing
› Latency analysis, pros & cons sync vs async

Think about depth/width of calls, try to stay within your system, keep hot data nearby

Avoid rewritis. You can retract your synchronous tentacles slowly.

email: johan.torp@dice.se
twitter: semanticspeed
slides: www.johantorp.com