Compact Framework Memory Management
Chris Tacke, Principal Partner, OpenNETCF Consulting
www.OpenNETCF.com
Introduction
Microsoft Windows CE memory architecture
Windows CE application memory usage
GC heap: how it grows and shrinks
JIT heap: how it grows and shrinks
Demo - Remote Performance Monitor (RPM)
Managed application costs
Managed code and deterministic behavior
Diagram: Windows CE virtual memory layout
Slot 0 starts at 0x0000 0000
Slot 1 starts at 0x0200 0000
Slots 2-31 start at 0x0400 0000
Shared memory sits above the process slots, below 0x8000 0000
Slot 0 layout
Code loads at the bottom
App data and resources load above code
Stack and heaps load in reserved space above resources
Remainder is free virtual space for the process
RAM DLLs load at the top of the slot
In-ROM DLLs are in Slot 1 (XIP, execute in place)
Other processes are paged in
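The slot boundaries in the diagram are plain address arithmetic: each process slot spans 0x0200 0000 bytes (32 MB), so slot n starts at n × 0x0200 0000. A minimal sketch of that arithmetic follows; SlotMath and its members are illustrative names, not a Windows CE or .NET Compact Framework API.

```csharp
using System;

// Illustrative helper only; SlotMath is not part of Windows CE or the .NET Compact Framework.
public static class SlotMath
{
    public const uint SlotSize = 0x02000000; // 32 MB per process slot
    public const int ProcessSlotCount = 32;  // slots 0-31, per the diagram

    // Base virtual address of a process slot.
    public static uint SlotBase(int slot)
    {
        if (slot < 0 || slot >= ProcessSlotCount)
            throw new ArgumentOutOfRangeException("slot");
        return (uint)slot * SlotSize;
    }

    // Slot index for an address in the process-slot range, or -1 if the
    // address lies above the 32 process slots (for example in shared memory).
    public static int SlotOf(uint address)
    {
        uint index = address / SlotSize;
        return index < ProcessSlotCount ? (int)index : -1;
    }

    public static void Main()
    {
        Console.WriteLine("Slot 1 base: 0x{0:X8}", SlotBase(1)); // 0x02000000
        Console.WriteLine("Slot 2 base: 0x{0:X8}", SlotBase(2)); // 0x04000000
        Console.WriteLine("0x03FFFFFF is in slot {0}", SlotOf(0x03FFFFFF)); // slot 1
    }
}
```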
Executing a Managed Application
The typical managed developer’s view
Diagram: Application, Managed references, Compact Framework, Native stuff
My application
Loads some references
Uses the Microsoft .NET Compact Framework
Calls some arcane “Native stuff”
Diagram: the same application as it is actually laid out in memory, across Slot 0, Slot 1, Slots 2-31, and shared memory (0x0000 0000 to 0x8000 0000). Labeled regions: AppDomain heap, GC heap (reference types), JITted code, metadata, miscellaneous, the CLR native DLLs (mscoree.dll, mscoree2_0.dll, netcfagl2_0.dll), and the managed assemblies (MyApp.exe, MyReference.dll, mscorlib.dll, System.dll, System.Data.dll)
1. CLR loads
2. App and class library assemblies load
3. EE JIT compiles code
4. Reference types allocated from GC heap
5. Types generated in memory from metadata
6. Miscellaneous allocations
A Deeper Look
Diagram: GC heap segments (64+K each) holding Objects A through H, with the allocation pointer (ptr)
GC heap holds allocations for reference types in 64K segments
Allocations in a segment are fast because they are simple pointer arithmetic
Eventually an allocation will require more memory than the heap contains
Another segment is added to the heap through a VirtualAlloc call and allocation continues. NOTE: Segments are not necessarily contiguous memory.
If more memory is not available, an OutOfMemoryException (OOM) is thrown
If an object requires > 64K allocation, a segment large enough is created for the allocation (not necessarily a 64K multiple)
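The allocation behavior described above can be modeled in a few lines. This is a toy sketch, not the CLR's allocator: SegmentedHeap and its members are hypothetical names, and the maxBytes budget merely stands in for whatever memory VirtualAlloc could supply on a real device.

```csharp
using System;
using System.Collections.Generic;

// Toy model of segment-based bump allocation. This is NOT the CLR's allocator;
// SegmentedHeap and its members are hypothetical names for illustration.
public class SegmentedHeap
{
    public const int SegmentSize = 64 * 1024;

    private class Segment
    {
        public readonly int Size;
        public int Next; // offset of the next free byte (the "ptr" in the diagram)
        public Segment(int size) { Size = size; }
    }

    private readonly List<Segment> segments = new List<Segment>();
    private readonly int maxBytes; // stands in for the memory VirtualAlloc could supply
    private int reservedBytes;

    public SegmentedHeap(int maxBytes) { this.maxBytes = maxBytes; }

    // Returns the index of the segment that received the allocation.
    public int Allocate(int bytes)
    {
        if (bytes <= 0) throw new ArgumentOutOfRangeException("bytes");

        if (segments.Count > 0)
        {
            Segment active = segments[segments.Count - 1];
            if (active.Next + bytes <= active.Size)
            {
                active.Next += bytes; // fast path: simple pointer arithmetic
                return segments.Count - 1;
            }
        }

        // Slow path: add a segment, sized up for objects larger than 64K.
        int size = Math.Max(SegmentSize, bytes);
        if (reservedBytes + size > maxBytes)
            throw new OutOfMemoryException("no memory for a new segment");

        Segment fresh = new Segment(size);
        fresh.Next = bytes;
        segments.Add(fresh);
        reservedBytes += size;
        return segments.Count - 1;
    }

    public static void Main()
    {
        SegmentedHeap heap = new SegmentedHeap(256 * 1024);
        Console.WriteLine(heap.Allocate(40 * 1024));  // 0: first segment is created
        Console.WriteLine(heap.Allocate(20 * 1024));  // 0: still fits in the first segment
        Console.WriteLine(heap.Allocate(100 * 1024)); // 1: larger than 64K, gets its own segment
    }
}
```

The model never reuses space inside a segment; that only happens after a collection, which is covered later in the deck.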
GC Resource Recovery
Resources are recovered through a process called “collection”
A collection is triggered by specific conditions:
For every 1 MB of heap data allocated (not tied directly to heap size)
An OOM on most large allocations (bitmaps, etc.)
The application receives a WM_HIBERNATE message
The application moves to the background
The application explicitly calls GC.Collect
GC Collection Activities
A collection will always:
Free unreferenced objects in the heap that have no finalizer or whose finalizer has already run
A collection may:
Compact the GC heap (combine small free spaces by moving objects)
Shrink the GC heap itself
Release (pitch) JITted code
To fully understand GC resource recovery, we need a deeper understanding of allocation
GC Internals
Diagram: Objects A through E in the heap (Object D has no Finalize, Object E has Finalize); roots (strong references: globals, statics, locals) and the finalization queue, which references Object E
Strong references to objects are called “roots” and are maintained by the CLR
When new objects are created, a reference is stored as a root
If an object has a Finalize method, a reference to it is also added to the finalization queue
As objects are destroyed or go out of scope, roots are removed
During collection the GC walks all roots, marking all actively referenced objects in the heap (mark)
After marking all objects from roots, unmarked objects are released (sweep)
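The mark and sweep phases can be sketched over a small object graph. This is a toy model under stated assumptions: HeapObject and ToyCollector are hypothetical types, and the real collector works on raw heap memory rather than on a List.

```csharp
using System;
using System.Collections.Generic;

// Toy mark-and-sweep over an object graph. Hypothetical types for illustration;
// the real collector works on raw heap memory, not on a List.
public class HeapObject
{
    public readonly string Name;
    public readonly List<HeapObject> References = new List<HeapObject>();
    public bool Marked;
    public HeapObject(string name) { Name = name; }
}

public static class ToyCollector
{
    // Returns the names of the objects released by the sweep.
    public static List<string> Collect(List<HeapObject> heap, List<HeapObject> roots)
    {
        // Mark: flag everything reachable from the roots.
        Stack<HeapObject> pending = new Stack<HeapObject>(roots);
        while (pending.Count > 0)
        {
            HeapObject obj = pending.Pop();
            if (obj.Marked) continue; // already visited (also handles cycles)
            obj.Marked = true;
            foreach (HeapObject child in obj.References) pending.Push(child);
        }

        // Sweep: release whatever was not marked.
        List<string> released = new List<string>();
        for (int i = heap.Count - 1; i >= 0; i--)
        {
            if (heap[i].Marked)
            {
                heap[i].Marked = false; // reset for the next collection
            }
            else
            {
                released.Add(heap[i].Name);
                heap.RemoveAt(i);
            }
        }
        return released;
    }

    public static void Main()
    {
        HeapObject a = new HeapObject("A"), b = new HeapObject("B"), c = new HeapObject("C");
        a.References.Add(b); // A -> B
        List<HeapObject> heap = new List<HeapObject>(new HeapObject[] { a, b, c });
        List<HeapObject> roots = new List<HeapObject>(new HeapObject[] { a });

        List<string> released = Collect(heap, roots); // C has no root and is swept
        Console.WriteLine(string.Join(",", released.ToArray())); // C
    }
}
```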
GC Internals
Diagram: Objects A, C, and E (has Finalize); roots, the finalization queue, the freachable queue, and the finalizer thread
When a finalizable object is destroyed, its root reference is removed as usual
During GC collection, objects in the finalization queue are not released
Instead, a reference is added to a special root called the freachable queue and the finalization queue entry is removed
Once collection is complete, a separate thread runs Finalize on all items in the freachable queue
Once Finalize has been run, the freachable reference is removed
On the next collection the object memory will be released
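The two-collection lifetime of a finalizable object can be modeled with three collections of names. This is a toy sketch: ToyFinalization is a hypothetical type, and the finalizer thread is modeled by an explicit RunFinalizers call rather than a real thread.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the finalization and freachable queues. Hypothetical types;
// in the real CLR the finalizer runs on its own thread, modeled here by an explicit call.
public class ToyFinalization
{
    public readonly List<string> Heap = new List<string>();
    public readonly List<string> Roots = new List<string>();
    public readonly List<string> FinalizationQueue = new List<string>();
    public readonly Queue<string> FreachableQueue = new Queue<string>();

    public void Allocate(string name, bool hasFinalizer)
    {
        Heap.Add(name);
        Roots.Add(name);
        if (hasFinalizer) FinalizationQueue.Add(name);
    }

    public void Collect()
    {
        for (int i = Heap.Count - 1; i >= 0; i--)
        {
            string obj = Heap[i];
            if (Roots.Contains(obj) || FreachableQueue.Contains(obj))
                continue; // still referenced: survives
            if (FinalizationQueue.Remove(obj))
                FreachableQueue.Enqueue(obj); // needs Finalize first: survives this collection
            else
                Heap.RemoveAt(i); // no pending finalizer: memory released
        }
    }

    // Stands in for the finalizer thread draining the freachable queue.
    public void RunFinalizers()
    {
        while (FreachableQueue.Count > 0)
            Console.WriteLine("Finalize({0})", FreachableQueue.Dequeue());
    }

    public static void Main()
    {
        ToyFinalization gc = new ToyFinalization();
        gc.Allocate("D", false);
        gc.Allocate("E", true);
        gc.Roots.Clear(); // both objects go out of scope

        gc.Collect();     // D is released; E moves to the freachable queue
        Console.WriteLine("After 1st collection: {0} object(s)", gc.Heap.Count); // 1
        gc.RunFinalizers();
        gc.Collect();     // E's memory is released only now
        Console.WriteLine("After 2nd collection: {0} object(s)", gc.Heap.Count); // 0
    }
}
```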
After Collection
Diagram: heap segments after a collection, with surviving Objects A, C, and H, a new Object X, and the allocation pointer (ptr) moving through the holes
After a collection cycle, allocations begin again from the beginning of the heap. An allocation is made in the first “hole” into which it will fit
Allocation pointer moves to the beginning of the first segment
Allocator will attempt to fill holes with subsequent allocations. The pointer will continue to move up until either a large enough hole is found or we reach the top of the GC heap and new memory must be allocated
Segments hold a pointer to the first available hole, so locating it is still fast
The same process happens with the “active” segment before new segments are created
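Filling the first hole that fits is a first-fit search. The sketch below is a toy model under stated assumptions: Hole and FirstFit are hypothetical names, and a heap is reduced to an ordered list of free ranges.

```csharp
using System;
using System.Collections.Generic;

// Toy first-fit allocation into the holes left by a collection.
// Hypothetical model: the heap is just an ordered list of free ranges.
public class Hole
{
    public int Offset;
    public int Size;
    public Hole(int offset, int size) { Offset = offset; Size = size; }
}

public static class FirstFit
{
    // Returns the offset of the allocation, or -1 if no hole is large enough
    // (the caller would then grow the heap with a new segment).
    public static int Allocate(List<Hole> holes, int bytes)
    {
        for (int i = 0; i < holes.Count; i++) // walk up from the start of the heap
        {
            Hole hole = holes[i];
            if (hole.Size < bytes) continue;

            int offset = hole.Offset;
            hole.Offset += bytes; // shrink the hole from the front
            hole.Size -= bytes;
            if (hole.Size == 0) holes.RemoveAt(i);
            return offset;
        }
        return -1;
    }

    public static void Main()
    {
        List<Hole> holes = new List<Hole>();
        holes.Add(new Hole(0, 16));   // small hole at the start of the segment
        holes.Add(new Hole(64, 128)); // larger hole further up

        Console.WriteLine(Allocate(holes, 100)); // 64: skips the 16-byte hole
        Console.WriteLine(Allocate(holes, 16));  // 0: fills the first hole exactly
        Console.WriteLine(Allocate(holes, 64));  // -1: nothing left that is big enough
    }
}
```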
GC Heap Compaction
Diagram: Objects A, C, and H moved together in the heap; address range 0x0000 0000 to 0x0010 0000 (1 MB)
When the GC heap has 750K or more of free space (holes) during collection, a compaction occurs
Note that the GC heap does not necessarily shrink when this occurs
If the GC heap exceeds 1 MB in size, segments will be released down to the 1 MB line if they are empty
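The two thresholds on this slide can be restated as a pair of checks. This is only a sketch of the stated rules: CompactionPolicy is a hypothetical helper, and it ignores segment granularity (a real heap can only release whole empty segments).

```csharp
using System;

// Hypothetical helper restating the thresholds from the slide; not a CLR API.
public static class CompactionPolicy
{
    public const int CompactThreshold = 750 * 1024; // free space that triggers compaction
    public const int ShrinkFloor = 1024 * 1024;     // heap is not shrunk below 1 MB

    public static bool ShouldCompact(int freeBytes)
    {
        return freeBytes >= CompactThreshold;
    }

    // Upper bound on the bytes of empty segments that may be released.
    public static int ReleasableBytes(int heapBytes, int emptySegmentBytes)
    {
        if (heapBytes <= ShrinkFloor) return 0;
        return Math.Min(emptySegmentBytes, heapBytes - ShrinkFloor);
    }

    public static void Main()
    {
        Console.WriteLine(ShouldCompact(800 * 1024)); // True
        // 1.5 MB heap with 768K of empty segments: only 512K is above the 1 MB line.
        Console.WriteLine(ReleasableBytes(1536 * 1024, 768 * 1024) / 1024); // 512
    }
}
```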
JIT Compiler
Has its own heap separate from the GC heap
Does not highly optimize during compiling
Speed of compile chosen over code density or execution speed
Roughly 50% as “good” as the full .NET Framework (FFx) compiler or a typical native C++ compiler
Optimized for short code paths
Method calls are 2-5 times slower than in the FFx or native code
Avoid recursion and heavy use of abstraction layers
JIT Heap Growth (JIT Compiling)
IL is compiled per method and on the fly
Small increments
The generated native code is 2-3 times the size of the source IL
JIT heap growth is unbounded in .NET Compact Framework 2.0
JIT Heap Shrinkage (Code Pitching)
Shrinks by almost all code
Unlike growth, shrinking is not incremental
Compiler keeps only JITted code for current call stack
Triggers:
Memory allocation failure
WM_HIBERNATE message received
Application is sent to the background
Remote Performance Monitor (RPM)
The Fixed Costs of Managed Code
.NET CF 2.0 takes ~2.2 MB in ROM (compressed)
CLR native components require 650K virtual space, max of 650K physical
Slot 1 if in ROM, Slot 0 if installed in the field
.NET CF managed assemblies take 3.8 MB of shared virtual space plus 1-2 MB of physical space from the 32 MB process slot
This is a one-time cost per device (shared across all managed applications)
The Fixed Costs of Managed Code
Applications allocate out of the 1 GB shared slot as memory-mapped files
Applications are compressed and will grow in virtual size ~50% when mapped
Required physical memory is demand paged in (probably ~50% of virtual size required to prevent thrashing)
The Dynamic Costs of Managed Code
CLR data structures require <250K
JIT heap is typically 250K-500K from the 32 MB process space
Long code paths create requirements to hold more JITted code at one time. Even without the perf hit for method calls this can be a problem
Almost all JITted code is pitched when app is in the background
GC heap size is unbounded
Comes from the 32 MB process space
The Dynamic Costs of Managed Code
Each thread gets a 64K stack
Allocated from 32 MB process space
Not freed until the thread is collected
“Controlling” the GC
Rule 1: The GC cannot be directly controlled except by calling GC.Collect
When you call GC.Collect the GC must:
1. Suspend all threads in the process
2. Traverse all roots
3. Mark all objects with a referring root
4. Traverse all objects in the GC heap to see if they’re marked
5. Traverse the finalizer queue
“Controlling” the GC
Any GC.Collect() call is inherently expensive and it’s extremely rare that an application should ever call it
Since finalizers are run on a separate thread, it is not guaranteed, and is actually unlikely, that Finalize method execution will be done when the call to GC.Collect returns
Deterministic finalization is not possible
Never assume finalization if you manually call GC.Collect
“Real Time” and Determinism
The inherent problem with managed environments
What exactly is “real time”?
“Real time” is defined as a system with deterministic results. This means that you can guarantee that when an action occurs, the time for the reaction has a defined upper bound.
Example: An interrupt occurs and the handler must run within some bounded time frame or bad things happen (system failure, damaged product, injury, etc.)
“Real Time” and Determinism
The inherent problem with managed environments
By their very nature, all garbage-collected systems (FFx, .NET CF, Java, etc.) are inherently nondeterministic
GC can run at any time
Time required to collect is indeterminate
GC suspends all other threads during collection
Reminder: Since finalizers are run on a separate thread it is not guaranteed, and is actually unlikely, that finalizers will be done when GC.Collect returns. This means deterministic finalization is not possible
Determinism in the Managed Environment
Achieving the impossible
Requires exhaustive testing and a thorough understanding of the entire system, not just your application. Not recommended for critical or non-fault-tolerant systems.
Determinism in the Managed Environment
Achieving the impossible
To get deterministic behavior we must:
Identify what parts of our code need determinism
Determinism of entire app is not achievable
Protect those parts from the effects of GC collection
Remember my Three Rules of Managed Determinism
The Three Rules of Managed Determinism
1. Collection occurs when certain known trigger events occur (e.g. OOM, 1 MB allocation, etc.)
2. Collection runs on the current thread, not its own thread
3. Thread behavior is “adjustable” at the API level
Determinism in the Managed Environment
Step 1: Handle GC triggers
Allocation-based triggers (e.g. OOM or 1 MB GC heap threshold)
Never, ever (ever) make an allocation in your real-time code
Beware of boxing (allocations made on your behalf)
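Boxing is easy to miss because no new appears in the source. The sketch below shows that each conversion of a value type to object yields a separate heap object, and contrasts a boxing collection with a generic one. BoxingDemo is an illustrative name; note that creating the List<int> is itself an allocation, so it belongs outside any real-time loop.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Each conversion of a value type to object allocates a new box on the GC heap.
    public static bool BoxesAreDistinct(int value)
    {
        object first = value;  // boxing allocation
        object second = value; // another boxing allocation
        return !object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreDistinct(42)); // True

        ArrayList untyped = new ArrayList();
        untyped.Add(42); // boxes: ArrayList stores object

        List<int> typed = new List<int>(16); // capacity reserved up front
        typed.Add(42);   // no boxing: List<int> stores int directly

        // string.Format takes object arguments, so the int below is boxed too.
        string text = string.Format("{0}", 42);
        Console.WriteLine(text); // 42
    }
}
```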
Determinism in the Managed Environment
Step 1: Handle GC triggers
WM_HIBERNATE event
It’s your system, do your best to know what else is running
Open systems are far more fragile than closed systems
Intercept WM_HIBERNATE with your own message pump (e.g. IMessageFilter implementation)
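The shape of such a filter can be sketched with stand-in types. Everything here is an assumption made so the example compiles on its own: ToyMessage and IToyMessageFilter are hypothetical substitutes for the message type and filter interface your UI framework provides, and the WM_HIBERNATE value should be verified against your SDK headers. Whether swallowing the message is acceptable depends on your system, since the OS sends it because memory is low.

```csharp
using System;

// Stand-in types so the sketch compiles on its own. On a device you would use
// the message type and filter interface supplied by your UI framework.
public struct ToyMessage
{
    public int Msg;
    public ToyMessage(int msg) { Msg = msg; }
}

public interface IToyMessageFilter
{
    // Return true to swallow the message, false to let it through.
    bool PreFilterMessage(ref ToyMessage m);
}

public class HibernateFilter : IToyMessageFilter
{
    // Assumed value of WM_HIBERNATE; verify against your SDK headers.
    public const int WM_HIBERNATE = 0x03FF;

    // Set by the application while real-time code is executing.
    public volatile bool RealTimeSectionActive;

    public bool PreFilterMessage(ref ToyMessage m)
    {
        return m.Msg == WM_HIBERNATE && RealTimeSectionActive;
    }

    public static void Main()
    {
        HibernateFilter filter = new HibernateFilter();
        ToyMessage hibernate = new ToyMessage(WM_HIBERNATE);

        Console.WriteLine(filter.PreFilterMessage(ref hibernate)); // False: passes through
        filter.RealTimeSectionActive = true;
        Console.WriteLine(filter.PreFilterMessage(ref hibernate)); // True: swallowed
    }
}
```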
Determinism in the Managed Environment
Step 1: Handle GC triggers
Application moving to the background
Disallow it during execution of real-time code sections
Use mutexes, critical sections, etc.
Determinism in the Managed Environment
Step 1: Handle GC triggers
Direct calls to GC.Collect
Even if you’re tempted, just don’t do it
Determinism in the Managed Environment
Step 2: Protect yourself
Take advantage of Rules 2 and 3: GC runs on the current thread, and thread behavior is API adjustable.
Put your real-time code in a separate thread
Assuming we’ve handled Step 1 and make no allocations, GC will always run in the non-real-time thread
Raise your real-time thread’s priority (above normal application priorities) via CeSetThreadPriority
Set your real-time thread’s quantum to “run to completion” via CeSetThreadQuantum
Keep your real-time thread to a minimum to avoid system impact
Sample Real Time IST
    // Thread2 and the named EventWaitHandle are assumed here to be the
    // OpenNETCF.Threading types from the OpenNETCF Smart Device Framework.
    private void LaunchStartup()
    {
        Thread2 ist = new Thread2(ISTProc);

        // set IST priority well above app priority
        // (on Windows CE a lower number is a higher priority; normal threads run at 251)
        ist.RealTimePriority = 200;

        // run to completion
        ist.RealTimeQuantum = 0;

        ist.Start();
    }

Sample Real Time IST
    private void ISTProc()
    {
        EventWaitHandle interruptWaitHandle = new EventWaitHandle(
            false,
            EventResetMode.AutoReset,
            "ISR_EVENT_NAME");

        // allocate any variables here, outside the IST loop
        while (true)
        {
            interruptWaitHandle.WaitOne();
            // handle the interrupt here
            // do *NOT* allocate any variables
        }
    }
Questions?
Newsgroups
microsoft.public.dotnet.framework.compactframework
Blogs
http://blog.opennetcf.org
http://blogs.msdn.com/netcfteam
Slide deck is on CommNet
Code will be posted in my blog
Chris Tacke, OpenNETCF Consulting
www.OpenNETCF.com
ctacke@OpenNETCF.com
Resources
Need developer resources on this subject?
Stop by the MED Content Publishing Team Station in the Microsoft Pavilion or visit the MED Content Publishing Team Wiki site:
http://msdn.microsoft.com/mobility/wiki
© 2006 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.