To wrap up the recent news concerning live dumps, which I discussed here and here, I will cover the topic in a bit more detail and provide full source code to a tool you can use to explore the feature yourself.
Brief History of Live Dumps
Windows has always provided the ability to create non-invasive, live crash dumps through a kernel debugger using the “.dump” command (this does not crash the computer, see “.crash”). Additionally, tools like LiveKD allow you to create “mirror dumps” of a live system (only kernel memory). However, until these recent changes in Windows 8.1, these capabilities were not easily accessible to system administrators in most scenarios, due to the impractical necessity of attaching a kernel debugger to the target machine. Note that invasive crash dumps (those that cause a bug check such as keyboard-initiated dumps) are not the same thing as live dumps.
NtSystemDebugControl is an undocumented function, previously the home of a serious security bug that allowed unrestricted access to all of physical memory, that provides a host of debugging capabilities for kernel components (mostly meant for a kernel debugger). The following sections discuss how this function works with corresponding kernel functions to generate dump files. Please take a look at kdtypes.h in the NDK for more information about the data types discussed here.
Triage Dump – Parameter 29
Since at least Vista, NtSystemDebugControl has provided a means to create a triage dump via SYSDBG_COMMAND value 29 (SysDbgGetTriageDump). In fact, this was the only control code you could issue to this API without having a kernel debugger attached and Windows booted in debug mode. Attempting to issue any other command without meeting those requirements would result in status code 0xc0000354, or STATUS_DEBUGGER_INACTIVE. Parameter 29 creates a triage dump through an internal function, DbgkCaptureLiveDump, which uses a set of internal kernel functions that manipulate debugger “snapshot data” (e.g., DbgkpTriageDumpSnapData). The parameter that must accompany value 29 is SYSDBG_TRIAGE_DUMP, which specifies the thread handles to be dumped, along with various values that will be written in the triage dump header. The caller must also supply an output buffer of at least 132 KB and at most 1 MB for the dump data. The structure of the input parameter from the NDK is shown below.
typedef struct _SYSDBG_TRIAGE_DUMP
} SYSDBG_TRIAGE_DUMP, *PSYSDBG_TRIAGE_DUMP;
The ProcessHandles field must be 0 and the ThreadHandles field must be non-zero. The Handles field must be an array of opened thread handles. The remaining fields are optional.
As you can guess from this structure definition, triage dump generation as it is implemented in the kernel has the annoying requirement that you must specify one or more thread handles in the SYSDBG_TRIAGE_DUMP structure. If you do not pass in the required thread handles, the function will return STATUS_INVALID_PARAMETER. DbgkCaptureLiveDump will iterate over the thread handles you supply, take a reference on the associated PsThreadType object, and get “snapshot data” from the thread by queueing an APC to the thread. This APC will execute a series of Dbgk* functions that collect thread stack information in cooperation with the thread. All of this data is appended to the triage dump data buffer.
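The constraints above can be sketched as a user-mode pre-check that runs before the call is ever made. This is only an illustration: `triage_request` is a hypothetical, trimmed stand-in for SYSDBG_TRIAGE_DUMP (not the real NDK layout), the function name is made up, and "132k" is assumed to mean 132 * 1024 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Output buffer limits quoted in the text: at least 132 KB, at most 1 MB.
   Assumption: KB means 1024 bytes here. */
#define TRIAGE_BUFFER_MIN ((size_t)132 * 1024)
#define TRIAGE_BUFFER_MAX ((size_t)1024 * 1024)
static_assert(TRIAGE_BUFFER_MIN < TRIAGE_BUFFER_MAX, "buffer limits are inverted");

/* Hypothetical stand-in holding only the fields the text constrains.
   Handles are modeled as plain integers so the sketch stays portable. */
typedef struct triage_request {
    uint32_t process_handles;  /* must be 0 */
    uint32_t thread_handles;   /* number of entries in handles, must be > 0 */
    const uintptr_t *handles;  /* array of opened thread handles */
} triage_request;

/* Returns true when the request meets the constraints described in the text.
   A request that fails here would be expected to be rejected by the kernel. */
bool triage_request_is_plausible(const triage_request *req, size_t buffer_size)
{
    if (req == NULL || req->handles == NULL)
        return false;
    if (req->process_handles != 0 || req->thread_handles == 0)
        return false;
    for (uint32_t i = 0; i < req->thread_handles; i++) {
        if (req->handles[i] == 0)  /* treat a zero handle as unopened */
            return false;
    }
    return buffer_size >= TRIAGE_BUFFER_MIN && buffer_size <= TRIAGE_BUFFER_MAX;
}
```

Checking on the caller's side like this mostly saves a round trip into the kernel; the kernel's own checks remain the authority.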
If you want to learn more about triage dumps, skywing has a good blog post here.
Kernel Dump – Parameter 37 (Windows 8.1+)
In the latest NDK, the highest value for SYSDBG_COMMAND is 36. Windows 8.1 introduces a new value to SYSDBG_COMMAND, value 37, and NtSystemDebugControl has been updated to allow this command (along with 29, which operates the same as it has in the past) to execute with or without a debugger attached and regardless of debug mode. Under the hood, NtSystemDebugControl calls DbgkCaptureLiveKernelDump, a new internal function that interacts with the I/O manager and memory manager to dump memory while the machine is still active. Microsoft has introduced a set of 57 new internal functions (prefixed with IopLiveDump) to support this mechanism, which I assume to be the “CrashAPI” I recently saw in WinDbg output.
For parameter 37, NtSystemDebugControl takes as input the structure below (undocumented).
typedef struct _SYSDBG_LIVEDUMP_CONTROL
} SYSDBG_LIVEDUMP_CONTROL, *PSYSDBG_LIVEDUMP_CONTROL;
This structure is similar to the SYSDBG_TRIAGE_DUMP structure with a few notable differences:
- You must specify an output file handle for the dump to be written. The kernel takes care of filling it for you, instead of the old way of just giving you back a buffer.
- You have the option of canceling the dump process via an alertable object you can signal (more on that later)
- A placeholder flag field for fine-grained control over what memory pages are included in the dump
- A placeholder flag field for fine-grained control over various dump options
I call these placeholder flags, because they really don’t do much at the moment. The values for the control flags currently supported are shown below.
typedef union _SYSDBG_LIVEDUMP_CONTROL_FLAGS
ULONG UseDumpStorageStack: 1;
ULONG CompressMemoryPagesData: 1;
ULONG IncludeUserSpaceMemoryPages: 1;
ULONG Reserved: 29;
Of particular interest to me is the UseDumpStorageStack bit, since it appears to relate directly to my research on this website. I don’t see this flag used anywhere in the LiveDump kernel functions (except for a single check in IopLiveDumpValidateParameters, which validates that the control flags value is not 3 – CompressMemoryPagesData + UseDumpStorageStack), so the feature appears to be unimplemented. Given that the function that actually writes buffered dump data to the output dump file (IopLiveDumpWriteBuffer) simply calls ZwWriteFile, it’s pretty obvious the live dump feature exclusively uses the normal I/O stack and not the crash dump stack. Just to be sure, I put a breakpoint on the dump port driver’s DiskDumpWrite function and issued a live dump request with the UseDumpStorageStack bit set to 1. The breakpoint was not hit.
The flag values for controlling the pages included in the dump are shown below.
typedef union _SYSDBG_LIVEDUMP_CONTROL_ADDPAGES
ULONG HypervisorPages: 1;
ULONG Reserved: 31;
Currently the only supported flag here adds hypervisor pages to the generated dump (note that the option to add user space pages lives in the control flags instead).
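To make the two flag fields concrete, here is a small sketch that builds their 32-bit values from booleans. The mask values are inferred from the declaration order of the bit-fields, assuming the first-declared field occupies the least significant bit (which matches the "3" quoted for CompressMemoryPagesData + UseDumpStorageStack). The macro and function names are hypothetical, not part of any Windows header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Control flag masks, inferred from declaration order (assumption:
   first bit-field lands in the least significant bit). */
#define LIVEDUMP_USE_DUMP_STORAGE_STACK   0x1u
#define LIVEDUMP_COMPRESS_MEMORY_PAGES    0x2u
#define LIVEDUMP_INCLUDE_USER_SPACE_PAGES 0x4u

/* Add-pages flag mask, under the same assumption. */
#define LIVEDUMP_ADD_HYPERVISOR_PAGES     0x1u

/* The text describes this pair of bits as the value 3. */
static_assert((LIVEDUMP_USE_DUMP_STORAGE_STACK | LIVEDUMP_COMPRESS_MEMORY_PAGES) == 3u,
              "mask values disagree with the text");

/* Packs the three control options into one 32-bit value. */
uint32_t livedump_control_flags(bool use_dump_stack, bool compress, bool include_user)
{
    uint32_t flags = 0;
    if (use_dump_stack) flags |= LIVEDUMP_USE_DUMP_STORAGE_STACK;
    if (compress)       flags |= LIVEDUMP_COMPRESS_MEMORY_PAGES;
    if (include_user)   flags |= LIVEDUMP_INCLUDE_USER_SPACE_PAGES;
    return flags;
}

/* Packs the single add-pages option into one 32-bit value. */
uint32_t livedump_add_pages(bool hypervisor_pages)
{
    return hypervisor_pages ? LIVEDUMP_ADD_HYPERVISOR_PAGES : 0u;
}
```

Explicit masks are used instead of C bit-fields because bit-field layout is implementation-defined, so masks keep the sketch portable across compilers.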
The crash API generates a kernel dump as follows:
- Acquire an exclusive lock, IopLiveDumpLock
- Validate the SYSDBG_LIVEDUMP_CONTROL structure
- Control flags cannot be 3 (CompressMemoryPagesData + UseDumpStorageStack)
- The dump file handle must have been created for synchronous I/O
- If specified, the cancel object must be an event, process, thread or timer object
- Allocate a global structure (IopLiveDumpContext) to hold the settings passed in via SYSDBG_LIVEDUMP_CONTROL and other internal data
- Allocate per-processor memory page mappings
- Initialize processor “corral” – I haven’t delved into this new concept yet, but it appears to be the “right” way to do the old rootkit trick of queueing a DPC to each processor to cause it to “spin” while a single dedicated processor does some hacky work that must have exclusivity. Surely this is how this new live dump feature is able to accomplish its goal while the system is still running (similar to the crash dump environment which is single thread/CPU, high IRQL, interrupts disabled, etc).
- Estimate dump buffer space requirements, discarding certain page ranges
- Capture calculated memory pages using page “mirroring” callbacks. When mirroring is complete (IopLiveDumpEndMirroringCallback), add the standard dump stuff (KdDebuggerDataBlock, KiProcessorBlock, loaded module list, processor data, kernel data, and so on), fill the dump header, and populate the dump bitmap.
- Trigger a processor corral state change (4), which presumably simulates the restrictions of a single-processor crash environment while the dump data is written to disk.
I’m sure I didn’t get that exactly right, but I think it’s pretty close. I’ll continue to update this blog post as I learn more.
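The validation step in the list above can be sketched in isolation. Everything here is hypothetical scaffolding: the request struct, the enums, and the order of the checks are my own stand-ins rather than what IopLiveDumpValidateParameters actually does, and the flag rule is taken literally as "the value must not equal 3".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of the optional cancel object. */
typedef enum cancel_kind {
    CANCEL_NONE,     /* no cancel object supplied */
    CANCEL_EVENT,
    CANCEL_PROCESS,
    CANCEL_THREAD,
    CANCEL_TIMER,
    CANCEL_OTHER     /* any other object type */
} cancel_kind;
static_assert(CANCEL_NONE == 0, "a zero-initialized request must mean no cancel object");

/* Hypothetical summary of the properties the validation step looks at. */
typedef struct livedump_request {
    uint32_t control_flags;    /* raw control flags value */
    bool file_is_synchronous;  /* dump file handle opened for synchronous I/O */
    cancel_kind cancel;        /* kind of cancel object, if any */
} livedump_request;

typedef enum livedump_check {
    LIVEDUMP_OK,
    LIVEDUMP_BAD_REQUEST,
    LIVEDUMP_BAD_FLAGS,
    LIVEDUMP_FILE_NOT_SYNCHRONOUS,
    LIVEDUMP_BAD_CANCEL_OBJECT
} livedump_check;

/* Applies the three rules from the list, in list order (order is assumed). */
livedump_check livedump_validate(const livedump_request *req)
{
    if (req == NULL)
        return LIVEDUMP_BAD_REQUEST;
    /* Literal reading of the rule; whether other bits are masked off
       before the comparison is not known. */
    if (req->control_flags == 3u)
        return LIVEDUMP_BAD_FLAGS;
    if (!req->file_is_synchronous)
        return LIVEDUMP_FILE_NOT_SYNCHRONOUS;
    if (req->cancel == CANCEL_OTHER)
        return LIVEDUMP_BAD_CANCEL_OBJECT;
    return LIVEDUMP_OK;
}
```

Returning a distinct result per rule is a choice made for readability in this sketch; the kernel's actual return codes for each failure are not covered here.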
I wrote a small utility over the weekend called LiveDump that lets you create both a triage dump using parameter 29 and a complete/kernel dump (user + kernel space or a bitmap kernel dump) using the new parameter 37.
LiveDump must be run as administrator. Its usage:
LiveDump.exe [type] [options] FileName
triage : create a triage dump (parameter 29)
kernel : create a kernel dump (parameter 37)
Options (triage dump only):
-p : PID to dump
Options (kernel dump only):
-c : compress memory pages in dump
-d : Use dump stack (currently not implemented in Windows 8.1, 9600.16404.x86fre.winblue_gdr.130913-2141)
-h : add hypervisor pages
-u : also dump user space memory
FileName is the full path to the dump file to create.
The kernel dump capability has options that mirror the new structures and flags discussed earlier.
Since triage dumps are per-process, you will need to specify what process you want to dump. The tool will then create a triage dump of the first 16 threads in that process. The example below dumps the System process.
C:\Users\tester\Desktop>LiveDump.exe triage -p 4 dump.dmp
Attempting to create a triage dump...
Triage dump is for PID 4 with 16 threads.
Dump file 'dump.dmp' written successfully!
Note: If you are seeing failure status 0xc0000022, you will need to launch the tool from an elevated cmd prompt by right-clicking and selecting “run as administrator”.
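For reference, the failure codes mentioned in this post can be mapped to a short hint. The sketch below is not taken from the LiveDump source; the enum and function names are hypothetical. The numeric values are the standard NTSTATUS codes for STATUS_INVALID_PARAMETER (0xC000000D), STATUS_ACCESS_DENIED (0xC0000022) and STATUS_DEBUGGER_INACTIVE (0xC0000354).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hint categories for the status codes discussed in the post. */
typedef enum status_hint {
    HINT_NONE,             /* call succeeded */
    HINT_CHECK_ARGUMENTS,  /* STATUS_INVALID_PARAMETER */
    HINT_RUN_ELEVATED,     /* STATUS_ACCESS_DENIED */
    HINT_NEEDS_DEBUGGER,   /* STATUS_DEBUGGER_INACTIVE */
    HINT_UNKNOWN           /* anything else */
} status_hint;

/* Maps a raw NTSTATUS value to a hint category. */
status_hint hint_for_status(uint32_t status)
{
    switch (status) {
    case 0x00000000u: return HINT_NONE;
    case 0xC000000Du: return HINT_CHECK_ARGUMENTS;
    case 0xC0000022u: return HINT_RUN_ELEVATED;
    case 0xC0000354u: return HINT_NEEDS_DEBUGGER;
    default:          return HINT_UNKNOWN;
    }
}

/* Returns a human-readable message for a hint category. */
const char *hint_text(status_hint hint)
{
    static const char *const text[] = {
        "success",
        "invalid parameter: check the handles and structure fields",
        "access denied: run from an elevated command prompt",
        "debugger inactive: this command needs a kernel debugger and debug mode",
        "unrecognized status code"
    };
    assert((size_t)hint < sizeof text / sizeof text[0]);
    return text[hint];
}
```

A caller would pass the value returned by NtSystemDebugControl to `hint_for_status` and print `hint_text` of the result.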
One final word: this tool and research were done in a few hours over the weekend, so I haven’t done much in the way of testing or validating the resulting dumps (other than loading them in WinDbg). Please let me know if you run into any issues (@lilhoser).
Ok, one last final word. I might be just a touch flattered that Microsoft could have been watching my research about how to use the crash dump stack outside of its intended purpose.
My first inkling of big brother watching was with the first release of Windows 8, when Microsoft made extensive changes to the dump stack (incorporating it more tightly in hibernate/resume for the new hybrid boot) and introduced a new read capability in the dump port driver. The lack of a read capability in the pre-Windows 8 dump stack was something I had to hack around in my original research in order to use the dump stack to read and write to disk. Given that a read capability would be required for how they leveraged the dump stack in hybrid boot, it’s probably just coincidence. Furthermore, the fact that there was such a short timeline between my research (April of 2012) and the initial RTM release of Windows 8 (late summer 2012) makes it unlikely my research really influenced their product roadmap.
Of course, this new UseDumpStorageStack bit is the second such occurrence of what I shall call “narcissistic projection of presumed hero worship”, as this represents the first time Microsoft has tried to leverage the dump stack during runtime for non-crashing/non-hibernation reasons. They added a lot of code to support it too (“crash API”). Maybe it will start showing up in other places, like a secret storage stack rootkit that NSA uses.