Microsoft Corporation 1
CrashDump
2010-04-20
Agenda
• Problem
  • Unexplainable field phenomena
• New Developments
  • in CrashDump
• Solution
  • How to get to ‘The Why’?
• Opportunity
  • New work and timelines
Unexplainable Field Phenomena
• All of these devices worked normally after reboot.
• No defects were found by file system scan.
RE Break 7729 0 x86fre fbl_core1 100318-2018 ps-ps Timeout during installing break in webio dll.msg
RE I O Stress Failure # 60589.msg
RE Bugcheck KERNEL_DATA_INPAGE_ERROR (7a).msg
Unexplainable Field Phenomena
• Microsoft has been tracking these for years
  • Things aren’t getting better
  • Customers expect a solution from us
  • We have nothing to give them
• There are 2 theories
  • The device has a flaw
  • The device has been mishandled
Theory #1: The device has a flaw
• Goal: Address the flaw.
• Assumption:
  • ATA devices are sophisticated enough to perform their own internal ‘crashdump’.
  • Microsoft is not able to address these issues alone.
• Digression: Microsoft attempts to do this with the millions of crash reports it receives every day
  • In general, user mode crashes are available to partners from http://winqual.microsoft.com through different portals.
  • Partners with kernel mode drivers can download ~50 randomly selected CABs for a given bucket through the WER portal.
  • Partners only receive external mini dumps. Full dumps and internal crashes may only be given out by selected groups.
  • Kernel mode crashes typically are driver issues that cause Blue Screens of Death or reset the machine. Analysis of data has found that device failure is a significant source of perceived “driver issues”.
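The partner-access rule on this slide (roughly 50 randomly selected CABs per bucket) can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `(bucket_id, cab_name)` report tuples and the `sample_cabs_per_bucket` helper are hypothetical and do not describe how the WER portal is actually implemented.

```python
import random
from collections import defaultdict


def sample_cabs_per_bucket(reports, per_bucket=50, seed=None):
    """Group (bucket_id, cab_name) pairs by bucket, then pick up to
    `per_bucket` CABs at random from each bucket (hypothetical helper)."""
    rng = random.Random(seed)  # seedable so a run can be reproduced
    buckets = defaultdict(list)
    for bucket_id, cab_name in reports:
        buckets[bucket_id].append(cab_name)

    sample = {}
    for bucket_id, cabs in buckets.items():
        # A bucket with fewer CABs than the limit is returned whole.
        k = min(per_bucket, len(cabs))
        sample[bucket_id] = rng.sample(cabs, k)
    return sample
```

The per-bucket cap keeps a very common failure from drowning out rare ones, which is one plausible reason to sample by bucket rather than across all reports.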
Agenda
• Problem
  • Unexplainable field phenomena
• New Developments
  • in CrashDump
• Solution
  • How to get to ‘The Why’?
• Opportunity
  • New work and timelines
Windows devices customer experience data flow
[Diagram. Labels: End user; Cloud Services (OCA, SQM, RAC); IHV; MS info; Vendor info. Outcome: Improved reliability of Windows storage experience]
Response Example
OCA process and workflow
• Crash occurs on the client
• WER client collects crash data
• Microsoft shares data with software developers
• Software developers troubleshoot
• Software developers respond to Microsoft and Customer
OCA’s Expanding Focus
+Devices
+Drivers
+ISVs
MSFT
Theory #2: The device has been mishandled
• Goal: Enable proper device handling.
• Assumptions:
  • Device has background scan information about internal issues, error handling, and results of attempted corrections.
  • This background scan information would be useful to manufacturers if there were a method for delivering it from active deployed systems.
  • Background scanning can result in actionable requests from devices, improving robustness, and raising handling issues to the user’s attention.
Agenda
• Problem
  • Unexplainable field phenomena
• New Developments
  • in CrashDump
• Solution
  • How to get to ‘The Why’?
• Opportunity
  • New work and timelines
How to get to ‘The Why’
• How to transport?
  • Time limited
  • Size negotiation
  • Security
• When to transport?
  • Host triggers
  • Device triggers
  • Dump persistence and recycling
• What to transport?
  • Bucketization
  • Device CrashDump (flavors?)
  • Background scan info
  • Does DSM affect collection content?
• How big are we willing to let this feature become?
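To make the "time limited" and "size negotiation" transport questions concrete, here is a minimal sketch of one way a host could pull a device dump. It is not a proposed ATA command set: `negotiate_size`, `transfer_dump`, the `read_chunk(offset, length)` callback, and all parameters are hypothetical names chosen for illustration.

```python
import time


def negotiate_size(host_max_bytes, device_dump_bytes):
    """Agree on a transfer size: never more than the host will accept,
    never more than the device actually holds."""
    return min(host_max_bytes, device_dump_bytes)


def transfer_dump(read_chunk, total_bytes, chunk_bytes, time_budget_s,
                  clock=time.monotonic):
    """Read up to `total_bytes` in chunks, stopping once the time budget
    is spent. Returns (data, complete); `complete` is False when the
    transfer was cut short."""
    deadline = clock() + time_budget_s
    data = bytearray()
    while len(data) < total_bytes and clock() < deadline:
        want = min(chunk_bytes, total_bytes - len(data))
        chunk = read_chunk(len(data), want)
        if not chunk:  # device has nothing more to give
            break
        data += chunk
    return bytes(data), len(data) == total_bytes
```

An incomplete result ties into the "dump persistence and recycling" bullet: if the budget runs out, the device would need to keep the dump until the host comes back for the rest.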
Background Scan Coordination Components
• Idle time notification
• Power event notification
• Background Scan vs. Power policy precedence
• Host/Device Event synchronization (TimeStamped)
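The timestamped host/device event synchronization listed above could, under simple assumptions, amount to merging two event logs onto one timeline. The sketch below assumes each log is already sorted by time and that a single constant offset maps the device clock to the host clock; `merge_event_logs`, the event descriptions, and the offset are hypothetical.

```python
import heapq


def merge_event_logs(host_events, device_events, device_offset_s=0.0):
    """Merge two time-sorted lists of (timestamp_s, description) into one
    list of (timestamp_s, source, description) on the host timeline.
    Device timestamps are shifted by `device_offset_s` first."""
    host = ((t, "host", desc) for t, desc in host_events)
    device = ((t + device_offset_s, "device", desc)
              for t, desc in device_events)
    # heapq.merge keeps the combined stream ordered by timestamp.
    return list(heapq.merge(host, device))
```

With both logs on one timeline, a question such as "did the power event interrupt a background scan?" becomes a matter of reading adjacent entries.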
Background Scan Coordination Considerations
Agenda
• Problem
  • Unexplainable field phenomena
• New Developments
  • in CrashDump
• Solution
  • How to get to ‘The Why’?
• Opportunity
  • New work and timelines
New work and Timelines
• Call for feedback, now.
• Proposal for T13 in June
• Approval in August