The crash dump stack represents a pristine path to disk, because it is an entirely separate I/O path to the device. There are many benefits of using this operating system “backdoor” to disk, such as defeating bootkits, maintaining persistence on a system, or other stealthy operations. This page explores two driver bypass techniques to accomplish this goal: an original, incomplete bypass technique that targeted Windows XP – Windows 7, and a new updated technique specific to Windows 8.
This technique has “bypass” in its name, because the idea here is to skip the kernel and crashdmp.sys drivers and “talk” directly to the dump port driver to coerce it to communicate with the dump miniport to complete I/O requests, as if it were in dump mode.
However, as I found out early on in this research, there were several challenges to overcome in order to safely use the dump stack outside of its intended environment:
- There is no way to communicate with the stack: recall that the dump port driver has no DRIVER_OBJECT or DEVICE_OBJECT, so traditional means of communication (such as sending an IRP) do not work. During normal operating system operation, the dump stack is literally just binary code sitting unused and uninitialized in memory, with many of the data structures that code depends on not being completely initialized (only pre-initialization has occurred).
- Post-initialization has not been done: any post-initialization tasks that the kernel or crashdmp.sys normally carry out would need to be emulated, such as calling the dump port and miniport DriverEntry functions with appropriately-initialized arguments.
- We don’t have a priori knowledge: When the kernel performs post-initialization, it has the advantage of state that it saved during pre-initialization. We don’t have this information readily available, so we have to find ways to rediscover it independently.
- We have to simulate crash mode: Since all of the code we are going to use in the dump port and miniport driver requires crash mode, we will have to artificially create this environment. This includes temporarily disabling the normal I/O path, so that it doesn’t contend with the crash path.
- Dump port driver internal functions are inadequate: before Windows 8, the crash stack was only capable of writing to disk. For true read/write capability, we have to find a way to read as well as write.
After a few months of grueling research, the driver bypass technique was shown to work. The general steps in the technique are (for more details on any of these steps, please see the publications page for whitepapers):
1. Identify the Crash Dump Port and Miniport Drivers – Walk the list of memory-resident kernel modules to locate the dump port and miniport drivers. Once located, call their entry points with special arguments to initialize them.
2. Get Boot Device Information – Send special I/O control codes to the normal I/O path drivers to get information about the boot disk, which we will use when sending our own I/O request.
3. Find StartIo or DispatchCrb Routines – Scan the dump port driver’s text section for “magic bytes” that allow us to identify these routines that carry out I/O in crash dump mode.
4. Find and Initialize the Dump Port Driver’s Device Extension – Locate a pointer to this dump port driver’s internal structure and initialize some fields in it to set up our I/O request.
5. Instantiate a SCSI/IDE Request Block – Using the boot device information obtained in Step 2, fill in a SCSI_REQUEST_BLOCK (SRB) or IDE_REQUEST_BLOCK (IRB) structure that instructs the disk to read 512 bytes from the first sector on disk.
6. Call StartIo or DispatchCrb – Pass the SRB/IRB from Step 5 to the I/O routine.
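To make Step 5 more concrete, here is a minimal user-mode sketch of the command such a request could carry: a SCSI READ(10) command descriptor block asking for one 512-byte block at logical block address 0. The `demo_srb` structure and `build_read10` function are hypothetical names invented for illustration. The real SCSI_REQUEST_BLOCK has many more fields (addressing, flags, timeouts) that must be filled in from the boot device information, and the choice of READ(10) over other read commands is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SCSI_OP_READ10 0x28u   /* READ(10) opcode */
#define SECTOR_SIZE    512u    /* assumed logical block size */

/* Hypothetical, simplified stand-in for the SRB fields used here.
   This is NOT the layout of the real SCSI_REQUEST_BLOCK. */
struct demo_srb {
    uint8_t  cdb[16];               /* command descriptor block */
    uint8_t  cdb_length;            /* number of valid CDB bytes */
    uint32_t data_transfer_length;  /* bytes expected from the device */
    void    *data_buffer;           /* where the data should land */
};

/* Build a READ(10) request: the 32-bit LBA goes in CDB bytes 2..5 and the
   16-bit block count in bytes 7..8, both big-endian. */
static void build_read10(struct demo_srb *srb, uint32_t lba,
                         uint16_t blocks, void *buffer)
{
    assert(srb != NULL && buffer != NULL);

    memset(srb, 0, sizeof(*srb));
    srb->cdb[0] = SCSI_OP_READ10;
    srb->cdb[2] = (uint8_t)(lba >> 24);
    srb->cdb[3] = (uint8_t)(lba >> 16);
    srb->cdb[4] = (uint8_t)(lba >> 8);
    srb->cdb[5] = (uint8_t)lba;
    srb->cdb[7] = (uint8_t)(blocks >> 8);
    srb->cdb[8] = (uint8_t)blocks;
    srb->cdb_length = 10;
    srb->data_transfer_length = (uint32_t)blocks * SECTOR_SIZE;
    srb->data_buffer = buffer;
}
```

Calling `build_read10(&srb, 0, 1, sector)` with a 512-byte `sector` buffer describes the first-sector read from Step 5. Handing the result to the I/O routine in Step 6 is where the real work, and the instability mentioned below, begins.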
This technique worked sufficiently well for SCSI transport devices in Windows 2000 through Windows 7, but it could cause stability issues. A similar version of the technique tailored to IDE drives was able to transmit I/O requests, but the requests were not successfully completed. Since this was just a side project at the time, I was not able to devote the time and resources necessary to debug the issues with this technique.
Recently, with the arrival of Windows 8, I decided to take a peek at what has changed in the crash stack. As it turns out, Windows 8 has introduced drastic changes to both the crashdmp.sys driver and the dump port driver itself (a very rare alteration) that break the original technique described above and render many aspects of it unnecessary. The changes in Windows 8 integrate a read capability into the crash dump stack that can be used in a stable way. Before discussing a new technique to use these features, it is necessary to briefly explore some of the important changes to the crash dump stack in Windows 8.
To decrease system startup time, Windows 8 introduces a new feature called “hybrid boot.” Hybrid boot is a startup method that speeds up booting after a shutdown by resuming a hibernated system session instead of initializing it from scratch. It is a hybrid between traditional cold boot and resuming from hibernation.
As opposed to normal hibernation, where all user sessions are hibernated to disk, hybrid boot only hibernates the system session. This means that all system drivers, services, plug-and-play devices, and so on, don’t have to be shut down and restarted, providing a much faster resume operation. If you’d like to learn more about hybrid boot, check out this article on the Building Windows 8 blog [1].
So what does this have to do with the crash dump stack? As previously mentioned, the crash dump stack is responsible not only for writing a crash dump file when a system error occurs, but it also has the task of managing the hibernation file (hiberfil.sys) in conjunction with the kernel and the power manager. Therefore, any changes to how the system hibernates (such as the new hybrid boot feature) will require modifications to the crash dump stack drivers.
Along with extensive changes to the kernel itself and the power manager, the dump port driver and the crashdmp.sys driver were modified to support this new hybrid boot feature. Most importantly, both drivers contain new functions that issue read requests to the boot device when the system is resuming from hibernation. The power manager uses these new read routines when resuming from hibernation to restore hiber context information.
Prior to Windows 8, the hibernation context information was not retrieved from disk in this manner. Rather, the power manager handled the entire process. Integrating this aspect of the new hybrid boot feature at the dump port driver level results in an extremely fast resume operation.
For the purposes of this research and proof of concept driver, this is an important departure from the crash dump stack in prior versions of Windows, which was only capable of writing to disk (either a crash dump file or hiberfil.sys). In Windows 8, it can now read from disk as well, consequently obsoleting the original bypass technique previously described.
So, how do we use the new read feature in the Windows 8 dump port driver for our own purposes? With the new read capabilities in the dump stack, Microsoft had to expose it to driver developers so that things like whole-disk encryption software would work properly. Whole-disk encryption software typically includes a crash dump filter driver so that when a dump file is written to disk, the software has an opportunity to encrypt its contents.
Now that data can also be read by the crash dump mechanism, Microsoft extended their existing crash dump filter callback API to include a read routine callback [2]. Unfortunately, such a callback has no control over what data is actually being read, so using this approach for our purposes would not work.
The other option is to use one of the new read routines directly, whose prototypes are shown below:
Crashdmp.sys!CrashdmpReadRoutine(Type, Offset, Mdl)
Diskdump.sys!DumpRead(Type, Offset, Mdl)
As it turns out, the first function is simply a wrapper around the second one with additional code to update the dump state context, which we don’t care about. Therefore, the better option is to call the second function. Further investigation reveals that it, too, is a wrapper, this time around another internal dump port driver function, DiskDumpIoIssue.
DiskDumpIoIssue kindly handles all of the aspects of issuing I/O requests to the dump miniport, which the original technique had to hack prior to Windows 8:
- Initializing a SCSI_REQUEST_BLOCK (SRB) structure to describe the request
- Mapping an MDL to describe the SRB data buffer, if required
- Calling the dump port internal I/O function, StartIo, which calls the dump miniport HwStartIo routine to actually program the device for the operation
- Polling to wait on result
- Retrying failed/pending requests
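The last two items describe a synchronous, polled I/O pattern, which is what you would expect in an environment where interrupts are disabled. The sketch below shows that general pattern in user-mode C. It is not a reconstruction of DiskDumpIoIssue: the `demo_device` interface, the status values, and the retry and poll limits are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum io_status { IO_PENDING, IO_SUCCESS, IO_ERROR };

/* Hypothetical device interface: start() programs the device for the
   request, poll() reports the current state without blocking. */
struct demo_device {
    void (*start)(void *ctx);
    enum io_status (*poll)(void *ctx);
    void *ctx;
};

/* Issue one request synchronously. Each attempt starts the request and
   polls up to max_polls times; an error or a poll timeout triggers a
   retry, up to max_retries additional attempts. */
static bool issue_sync(const struct demo_device *dev,
                       unsigned max_retries, unsigned max_polls)
{
    assert(dev != NULL && dev->start != NULL && dev->poll != NULL);

    for (unsigned attempt = 0; attempt <= max_retries; attempt++) {
        dev->start(dev->ctx);
        for (unsigned i = 0; i < max_polls; i++) {
            enum io_status status = dev->poll(dev->ctx);
            if (status == IO_SUCCESS)
                return true;
            if (status == IO_ERROR)
                break;              /* give up on this attempt, retry */
        }
    }
    return false;                   /* every attempt failed or timed out */
}

/* Fake disk for demonstration: the first attempt fails outright, later
   attempts report "pending" twice and then succeed. */
struct fake_disk { unsigned starts; unsigned polls_left; };

static void fake_start(void *ctx)
{
    struct fake_disk *d = ctx;
    d->starts++;
    d->polls_left = 3;
}

static enum io_status fake_poll(void *ctx)
{
    struct fake_disk *d = ctx;
    if (d->starts == 1)
        return IO_ERROR;
    if (d->polls_left > 1) {
        d->polls_left--;
        return IO_PENDING;
    }
    return IO_SUCCESS;
}
```

With the fake disk, `issue_sync` succeeds on the second attempt when at least one retry is allowed and fails when no retries are allowed. Having the dump port driver own this loop is what makes the Windows 8 approach so much less fragile than driving StartIo by hand.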
As is the case with the original bypass technique, it is still necessary to do some initialization prior to calling this new function. In addition to initializing the dump stack drivers ourselves, the dump port driver’s device extension must be located and initialized. Since this structure changed significantly in Windows 8, it had to be manually reverse engineered and the initialization techniques changed accordingly.
At this point, there is enough information to piece together the steps required to use the DiskDumpIoIssue function in Windows 8 (some of the concepts mentioned here will be covered in detail shortly):
- Locate and initialize the crash dump miniport driver
- Locate, patch and initialize the crash dump port driver
- Locate the dump port driver’s new read routine, DiskDumpIoIssue
- Disable the normal I/O path
- Call DiskDumpIoIssue with a disk offset and an MDL that describes the buffer to store the result
- Enable the normal I/O path
- Unpatch the dump port driver
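The list above does not say how an internal routine such as DiskDumpIoIssue is located. Assuming the same “magic bytes” scan used in the original technique carries over, a signature search with wildcard positions could look like the user-mode sketch below. The function name `find_signature` and the mask convention are illustrative, and any real byte pattern would have to be derived per driver build through reverse engineering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Search image[0..image_len) for the first occurrence of pattern.
   mask[j] == 0 marks pattern byte j as a wildcard (useful for bytes that
   vary between builds, such as relative call displacements).
   Returns a pointer to the match, or NULL if there is none. */
static const uint8_t *find_signature(const uint8_t *image, size_t image_len,
                                     const uint8_t *pattern,
                                     const uint8_t *mask, size_t pattern_len)
{
    assert(image != NULL && pattern != NULL && mask != NULL);

    if (pattern_len == 0 || pattern_len > image_len)
        return NULL;

    for (size_t i = 0; i + pattern_len <= image_len; i++) {
        size_t j = 0;
        while (j < pattern_len &&
               (mask[j] == 0 || image[i + j] == pattern[j]))
            j++;
        if (j == pattern_len)
            return image + i;       /* every non-wildcard byte matched */
    }
    return NULL;
}
```

In a driver, `image` would point at the dump port driver’s text section in memory. A signature that matches more than once, or not at all, should be treated as a hard failure rather than a guess, since calling the wrong address in crash dump mode leaves no room for recovery.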
As previously mentioned, the system operates in a restricted environment (“crash dump mode”) when a crash is in progress. This environment, set up by the kernel after a bug check has occurred, severely restricts what operations can be performed on the system and what features are available. In effect, the system is reduced to a single processor running one uninterruptible thread at the highest possible interrupt request level (IRQL). Any I/O that might take place is done synchronously and interrupts are disabled. There are documented mechanisms available to replicate this environment, the most useful of which is the interprocessor interrupt (IPI) broadcast. Each of the steps mentioned above that interacts with a driver in the crash dump stack must momentarily “enable” crash dump mode using an IPI broadcast. This applies both when calling the dump port and miniport DriverEntry routines to initialize the drivers and just before sending the I/O request to the dump port driver.
The original bypass technique also left a fundamental issue unaddressed: using two separate driver stacks that operate on the same underlying hardware is problematic at best, since doing so will almost certainly lead to race conditions. Furthermore, arbitrarily initiating I/O through the crash dump stack will trash any I/O already in progress on the device (initiated from the normal I/O path), with outcomes ranging from no visible effect to a deadlocked system. Fortunately, it is possible to send a special I/O control code to the normal I/O path port driver, instructing it to flush and lock its internal queue. This mechanism makes it possible to halt the normal I/O path, after which the system can be placed into restricted crash dump mode and I/O can be issued through the crash dump I/O path.
Another challenge to overcome was the integration of new hibernation features into the dump port driver. Specifically, the dump port driver’s DriverEntry calls an internal function MarkHiberBootPhase, which marks certain memory pages to be included in the hibernation file via nt!PoSetHiberRange.
Unfortunately, there is no way to sidestep this function call from within the dump port driver’s DriverEntry, and attempts to trick the operating system into thinking it is in a hibernation state before the call failed (this can almost be accomplished via nt!ZwPowerInformation with the SystemReserveHiberFile information type).
The issue here is that a special hibernation context structure internal to the kernel must be allocated before any hiber functions (such as nt!PoSetHiberRange) are called, and the only way to trigger this allocation is to actually set the system into a hibernation state by calling nt!NtSetSystemPowerState. This is not an acceptable solution, as we want to make as few changes to the system as possible.
There are two ways to get around this unfortunate restriction. The first option is to not call DriverEntry at all. This means more upfront work before it is possible to send I/O to the dump port driver. It also means the solution is less portable to future operating systems, because of the need to use static structure offsets which are painstakingly cherry-picked from reverse engineering various drivers.
The other option is to simply patch the dump port driver’s DriverEntry function to disable the call to MarkHiberBootPhase. Despite my abhorrence of anything patching/hooking related (I can see all of my teammates cringing in disgust), this turned out to be a simple 0x15-byte patch that could be immediately restored after sending the I/O request. And since the system is manually forced into crash dump mode (single processor, single thread, uninterruptible, synchronous I/O, high IRQL), there are no synchronization issues.
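The save, overwrite, and restore pattern behind such a patch can be sketched in user-mode C as follows. This is only an illustration under stated assumptions: the names `code_patch`, `patch_apply`, and `patch_restore` are hypothetical, filling the range with x86 NOP bytes (0x90) is one plausible way to neutralize a call and not necessarily what the proof of concept does, and a real driver would also have to deal with write-protected code pages, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATCH_MAX 32u    /* upper bound on patch size for this sketch */
#define X86_NOP   0x90u  /* single-byte x86 no-op instruction */

/* Remembers what was overwritten so it can be put back exactly. */
struct code_patch {
    uint8_t *target;            /* first patched byte */
    size_t   length;            /* number of patched bytes */
    uint8_t  saved[PATCH_MAX];  /* original bytes */
};

/* Save `length` bytes at `target`, then overwrite them with NOPs. */
static void patch_apply(struct code_patch *p, uint8_t *target, size_t length)
{
    assert(p != NULL && target != NULL);
    assert(length > 0 && length <= PATCH_MAX);

    p->target = target;
    p->length = length;
    memcpy(p->saved, target, length);
    memset(target, X86_NOP, length);
}

/* Put the original bytes back. */
static void patch_restore(const struct code_patch *p)
{
    assert(p != NULL && p->target != NULL);
    memcpy(p->target, p->saved, p->length);
}
```

For the patch described above, `length` would be 0x15 and `target` the start of the instruction sequence that sets up and calls MarkHiberBootPhase. Keeping the saved bytes next to the target pointer makes the restore step a single copy, which matters when it has to run immediately after the I/O request completes.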