Zach Burlingame
Programming, Computers, and Other Notes on Technology

Windows 8 Kernel Debugging with KDNET and the Realtek RTL8168

Existing Kernel Debugging Methods

Until Windows 8 there were only five ways to kernel debug a Windows machine:

  1. Local Debugging (e.g. SysInternals LiveKD)
  2. Null-modem cable (a.k.a. Serial or COM)
  3. IEEE 1394 (a.k.a. Firewire)
  4. USB 2.0
  5. VirtualKD

Local debugging has serious limitations: you can’t set breakpoints or view registers or stack traces, and you are limited to debugging only those things that happen after the machine is fully booted and a user has logged in.

Serial debugging has been a bread-and-butter method forever, supporting both hardware and virtual machines (via virtualized serial ports), but it’s really, really slow.

Firewire brought to the table a huge increase in speed over serial while offering fairly straightforward hardware requirements. However, Firewire virtualization isn’t possible with VMware (that I’m aware of anyways), and Firewire has steadily been dropped from a lot of OEM PCs in favor of USB 2.0 and, more recently, USB 3.0.

Debugging over USB 2.0 is a bit of a unicorn. It’s technically possible to do it, but it requires a specific port on the USB controller that may or may not be available (e.g. it may be wired internally to a webcam or card reader) and a relatively expensive (~$100 USD) USB 2.0 debug cable such as the NET20DC. In practice, USB 2.0 kernel debugging is a gauntlet best avoided if at all possible.

SysProg’s VirtualKD is an awesome framework. It has its quirks, but it has steadily improved over time. It’s by far the fastest kernel debugging transport in my experience and it’s a snap to install and configure. Its biggest drawbacks are that it only supports virtual machines and that you have to install a driver package (which may not be amenable to the machine owner for various reasons).

Windows 8 Introduces Kernel Debugging over a Network Cable

Serial ports and Firewire were disappearing from a significant portion of PCs (laptops, workstations, and servers alike) and there are significant issues with using the USB 2.0 kernel debug transport. Microsoft needed a new efficient and effective means of supplying kernel debug transports to the IHVs and ISVs of the world. With Windows 8 and its kernel cousin, Server 2012, Microsoft introduced to the public two more kernel debug transports – Network and USB 3.0. Along with these, Microsoft also introduced new logo certification requirements (WHQL Debug Capability) that required system integrators to pass a series of tests to demonstrate that the system was in fact kernel debuggable. There is a great Channel 9 video released by the team here. It is my understanding that to be logo certified, you don’t have to support any one specific debugging transport; you just have to demonstrate one of them. The video states that one could even satisfy this requirement by using a custom hardware interface and debug cable, as long as the hardware and any kernel transport drivers were available to end-users.

Microsoft implemented kernel transport drivers for the two most likely vehicles for providing this: Network and USB 3.0. Network interfaces already are and will continue to be prolific for the foreseeable future. USB 3.0 will continue to increase its market saturation and addresses the biggest issues with USB 2.0:

  • You can use any USB 3.0 port to debug
  • The only cable you need is a relatively inexpensive (~$15 USD) USB 3.0 A-A debug cable (i.e. no Vbus pins).
  • Chipset support is standardized via xHCI which is supported by most hardware shipping today

These are not the NICs you are looking for and other Jedi Knight tricks

I recently received a new (Jan 2013) Dell XPS 15 L521x laptop that I needed to develop driver support for. These machines don’t have serial or Firewire ports and I’m not about to screw around with USB 2.0 debugging. They do, however, have both a gigabit NIC and a USB 3.0 controller. Furthermore, they are logo certified, so they must have passed the WHQL Debug Capability tests, most likely with one (or both) of these methods. As I don’t yet have a USB 3.0 A-A debug cable available (or another machine with USB 3.0 ports for that matter), I set out to use the network transport.

Checking the PCI hardware ID, I could see that the NIC was reporting itself as a Realtek RTL8168. I checked MSDN’s list of supported NICs and saw that the first model listed was the Realtek RTL8168 – yay! I turned off Secure Boot (a requirement to modify boot options on UEFI machines), fired up BCDEDIT, configured the machine for network debugging, copied the key over to the host machine, fired up the debugger, opened the necessary ports on the firewall, rebooted the target machine and … nothing happened.

I tried a few more times and got nothing. I checked my firewall and my BCDEDIT configs: still nothing. I verified that my host machine was set up correctly by firing up a Windows 8 VM with a bridged network connection using the same BCDEDIT commands, and that was working like a charm. I fired up Wireshark to see if the target machine was even sending out packets to try to connect, and it was not. I updated the UEFI firmware and every driver on the system with the latest from Dell: still nothing. I tried setting static DHCP reservations for both machines and using the dhcp=no boot option flag: still nothing. Looking back at Wireshark, I could see that early in the boot process there were IPv6 solicitation packets going out, but nothing for IPv4 (which is the required protocol for kernel debugging) until much later in the boot sequence. Furthermore, these IPv4 packets were only trying to solicit a link-local 169.254.x.x address.
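For reference, the target and host setup described above boils down to a handful of command lines. The following is a minimal Python sketch that only builds those command strings. It assumes the documented `bcdedit /debug on`, `bcdedit /dbgsettings net hostip:… port:… key:…` and `windbg -k net:port=…,key=…` syntax; the function names, host IP, port and key are hypothetical placeholders, not the values I actually used. The last helper flags the self-assigned 169.254.x.x addresses mentioned above.

```python
import ipaddress


def kdnet_target_commands(host_ip, port, key=None):
    """Build the bcdedit commands to run (elevated) on the target machine."""
    # The network transport described here uses IPv4, so reject anything else.
    addr = ipaddress.ip_address(host_ip)
    if addr.version != 4:
        raise ValueError("expected an IPv4 host address")
    if not 1 <= port <= 65535:
        raise ValueError("port out of range")
    settings = f"bcdedit /dbgsettings net hostip:{addr} port:{port}"
    if key is not None:
        # Without an explicit key, bcdedit generates one and prints it.
        settings += f" key:{key}"
    return ["bcdedit /debug on", settings]


def kdnet_host_command(port, key):
    """Build the WinDbg command line to run on the host machine."""
    return f"windbg -k net:port={port},key={key}"


def is_link_local(ip):
    """True for self-assigned addresses such as 169.254.x.x."""
    return ipaddress.ip_address(ip).is_link_local


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for cmd in kdnet_target_commands("192.168.1.10", 50000, "1.2.3.4"):
        print(cmd)
    print(kdnet_host_command(50000, "1.2.3.4"))
    print(is_link_local("169.254.10.20"))
```

If the target only ever shows up in a capture with an address for which `is_link_local` is true, it never got a DHCP lease, which matches what I saw late in the boot sequence.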

I continued poking around for several hours, trying different things and searching the Internet for clues. It seemed like I had landed in a barren wasteland, with seemingly no chatter about anything that sounded like my issue. Then I stumbled across this post on the MSDN forums that had two responses – one from the PM on the team that owns the WHQL Debug Capability test (Matt Rolak) and one from a software design engineer on the Windows debugger team (Joe Ballantyne). Matt’s answer said that one possible cause of the OP’s problem was:

In some cases machines with 4+GB of RAM don’t work for the NIC debugging (but will if you change the system to less than 4GB). It may be that your board passed their certification test with a different bios or with less RAM and worked around the issue.

The MSDN article on this is here: http://msdn.microsoft.com/en-gb/library/windows/hardware/hh830880 but the key part is:

  • System firmware should discover and configure the NIC device such that its resources do not conflict with any other devices that have been BIOS-configured.
  • System firmware should place the NIC’s resources under address windows that are not marked prefetchable

It seems for some MB’s this isn’t adhered to if they have more than 4GB of RAM configured and try to use NIC debugging. The ‘workaround’ was to reduce the RAM during debugging (less than ideal we know but unless the BIOS is updated by the manufacturer there isn’t another workaround we are aware of at this time).
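To make the “prefetchable” requirement a bit more concrete: in PCI configuration space, the low bits of a memory Base Address Register (BAR) record whether the region is prefetchable and whether the BAR is 64-bit, and only a 64-bit BAR can be placed above 4GB. Below is a minimal sketch of decoding those bits. The function name and the BAR values are made up for illustration (they were not read from the XPS 15), and I am not claiming this is how KDNET or the firmware performs its checks.

```python
def decode_memory_bar(bar_low, bar_high=0):
    """Decode a PCI memory BAR from its raw 32-bit register value(s)."""
    if bar_low & 0x1:
        raise ValueError("bit 0 set: this is an I/O BAR, not a memory BAR")
    is_64bit = ((bar_low >> 1) & 0x3) == 0x2  # type field, bits 2:1
    prefetchable = bool(bar_low & 0x8)        # bit 3
    base = bar_low & 0xFFFFFFF0               # low 4 bits are flags
    if is_64bit:
        base |= bar_high << 32                # next BAR holds the upper 32 bits
    return {
        "base": base,
        "is_64bit": is_64bit,
        "prefetchable": prefetchable,
        "above_4gb": base >= (1 << 32),
    }


if __name__ == "__main__":
    # Hypothetical values: a 32-bit non-prefetchable BAR below 4GB,
    # and a 64-bit prefetchable BAR placed at exactly 4GB.
    print(decode_memory_bar(0xF7D00000))
    print(decode_memory_bar(0x0000000C, 0x1))
```

A NIC whose resources decode like the second example would be sitting in a prefetchable window, which is the situation the quoted guidance says firmware should avoid.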

The machine I was dealing with has 8GB of RAM and Matt’s reply was from January of this year, so this could possibly be coming into play in my situation. With Dell’s sleek design of the XPS 15 L521x, however, removing system memory was more than I felt like dealing with right away. It’s not terrible, but you do have to remove six Torx T5 screws and a couple of Phillips screws to get the back panel off to access the system memory, and I wasn’t ready just yet to start taking the machine apart. Reading on, we see from Joe’s answer (emphasis mine):

I suspect that you have an unsupported NIC in the target machine, and that KDNET on the target machine is attempting to use USB3 debugging, and that you have a MB that has USB3 hardware support, and which also “supports” USB3 debugging. […]

What NIC is in the target machine? If it is a new model Realtek 8168 then it is likely NOT supported. (Unfortunately Realtek keeps their PCI device ID unchanged for many different hardware spins of their chips, and so some 8168 chips are supported and some are not.)

You have got. to. be. kidding me. I double-checked the PCI device ID for the NIC and confirmed that it was reporting itself as a Realtek RTL8168. I went back to the Dell support page to check the driver update details again and noticed a key detail – Dell lists the Windows 8 network driver update for this machine’s service tag as a Realtek RTL8111F! I headed over to the Realtek site to see if I could find out specifics about the differences between the RTL8168 and the RTL8111F. The only difference that jumped out at me was the packaging, and neither page discussed chipset support for Microsoft’s kernel debug transport. The RTL8111F is definitely not listed as supported on the MSDN list of supported NICs. This would also explain why I wasn’t seeing any IPv4 packets being sent out by the target machine until late in the boot, likely after the Realtek driver had been loaded.

I found this whole ordeal pretty frustrating. It’s frustrating that Realtek is shipping chips that report the same PCI device ID but have different hardware interfaces, such that the Microsoft driver that works for one doesn’t work at all for the other. Realtek gets away with it for normal use of the NIC because they use the same driver bundle for all the 8168 and 8111 chipsets (along with several others). It’s also frustrating that, given that Microsoft is obviously internally aware of this headache, their MSDN supported Ethernet NICs page doesn’t have an asterisk on Realtek chip support stating these issues.
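To illustrate why the device ID alone doesn’t settle the question, here is a small sketch that parses Windows-style PCI hardware ID strings and compares two of them. Vendor ID 10EC (Realtek) and device ID 8168 are real; the SUBSYS and REV values and the helper names are hypothetical, and I don’t know which revisions correspond to which chips or which of them KDNET supports.

```python
import re

# Matches strings such as PCI\VEN_10EC&DEV_8168&SUBSYS_xxxxxxxx&REV_xx
_HWID = re.compile(
    r"PCI\\VEN_(?P<ven>[0-9A-F]{4})&DEV_(?P<dev>[0-9A-F]{4})"
    r"(?:&SUBSYS_(?P<subsys>[0-9A-F]{8}))?"
    r"(?:&REV_(?P<rev>[0-9A-F]{2}))?",
    re.IGNORECASE,
)


def parse_pci_hardware_id(hwid):
    """Split a PCI hardware ID into vendor, device, subsystem and revision."""
    match = _HWID.match(hwid)
    if match is None:
        raise ValueError(f"not a PCI hardware ID: {hwid!r}")
    return {k: (v.upper() if v else None) for k, v in match.groupdict().items()}


def same_vendor_and_device(a, b):
    """True when two hardware IDs share VEN and DEV, whatever else differs."""
    pa, pb = parse_pci_hardware_id(a), parse_pci_hardware_id(b)
    return (pa["ven"], pa["dev"]) == (pb["ven"], pb["dev"])


if __name__ == "__main__":
    # SUBSYS and REV are made-up values for illustration.
    older = r"PCI\VEN_10EC&DEV_8168&SUBSYS_00000000&REV_01"
    newer = r"PCI\VEN_10EC&DEV_8168&SUBSYS_00000000&REV_02"
    print(same_vendor_and_device(older, newer))
    print(parse_pci_hardware_id(older)["rev"], parse_pci_hardware_id(newer)["rev"])
```

Two chips can match on VEN and DEV and still differ in the remaining fields, so anything that keys only on the device ID (including a quick glance at a supported-NICs list) cannot tell them apart.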

Conclusion

At this point, I’m waiting on a USB 3.0 debug cable and a PCI-E USB 3.0 adapter to arrive so I can approach it from that angle. Based on my experience, if you need to kernel debug hardware and you want to utilize the KDNET network transport protocol, I suggest you look towards the Intel and/or Broadcom chipsets if you get to choose. There are far more chipsets listed as supported from those vendors. If you end up with a machine with a Realtek controller, I still can’t tell you a universal way to determine which chipset you actually have and thus if it’s supported, so be prepared for the possibility of failure.

12 Responses to “Windows 8 Kernel Debugging with KDNET and the Realtek RTL8168”

  • frank perron says:

    First – nice blog….

    RE: I’m waiting on a USB 3.0 debug cable and a PCI-E USB 3.0 adapter

    Which did you order? ( model and vendor ) thanks.
    frank

    • ZachB says:

      The USB 3.0 debug cable I ordered was the SIIG USB 3.0 A-A Debug cable (Part # CB-US0112-S1). The PCI-E USB 3.0 adapter I ordered was the HighPoint RocketU 1022A PCI-Express 2.0 x1 Low Profile USB 3.0 HBA. Be aware that the card requires a 4-pin molex power adapter. That said, I haven’t even had a chance to use it. While I was waiting for everything to come in, I had to get some stuff done, so I ended up physical-to-virtual’ing a machine and debugging it via VirtualKD.

  • pcfist says:

    Hello. Nice to see some interesting information on new kernel debugging techniques!
    Have you received your USB 3.0 debug cable? Any luck getting it to work in Windows 8?
    I have the A-A crossover cable, but it does not seem to work correctly. Debugger does attach to the target, but then it hangs.
    So I’m waiting to see your feedback regarding USB 3.0 debugging experience!

    • ZachB says:

      Unfortunately, I haven’t had a chance to put mine to the test yet. I ended up doing what I needed via a physical-to-virtual created VM and VirtualKD. I haven’t had a need to go back and try it out over USB 3.0 although I’ve been meaning to find time to experiment.

      • pcfist says:

        Well, I agree — there’s usually little time to try new technologies 🙂
        I have good news here — successfully configured and used USB 3.0 debugging connection in Windows 8. Seems to me that the problem was on the machine that ran WinDbg. It didn’t work on a Windows 7 x64 machine, but with Windows 8 x86 the connection worked fine!
        I used the same machine, just ran different operating systems on it. So the difference is either in USB controller driver or in operating system itself. At first glance, the problem seems to be in the operating system, but I’ll try to experiment with different USB 3.0 controller drivers on Windows 7 and see if it will help.

        • ZachB says:

          Excellent, glad to hear someone out there has it working. In theory it’s supposed to be much simpler with Windows 8 with USB 3.0 than it was with USB 2.0 in the past. I’m still hoping to find some time to verify that for myself as well. Thanks again for sharing!

  • Soad[XM] says:

    Hi!
    I’m trying to figure out whether the HighPoint RocketU 1022A PCI-Express adapter supports debug port. And it’s not quite clear to me from your comments. Could you help me with this?
    Thanks!

    • pcfist says:

      Hi Soad[XM],
      According to what I could find on the net, this card is based on NEC µPD720202 chipset, which supports USB debugging function, so you’re good to go.
      BTW if you have the card installed you can check whether it supports USB 3.0 debugging using usbview utility (included in Windows Driver Kit 8). Just select a USB port in usbview and it’ll show if the port is debug capable or not.
      Note that you have to run usbview under Windows 8. You also need to have Windows 8 on *both* PCs for kernel debugging.
      Also note that the USB kernel debugger connection is established at a late stage of the system boot process, so most drivers will already have been loaded by the time a debugger connects to the target. This might be a problem if you need to debug a driver’s initialization process.

      Best Regards,
      Roman

  • Art Rothstein says:

    Why not use msconfig / Boot / Advanced Options / Maximum memory to temporarily trim your memory size to 4 GB? I suppose the OS could choose the upper 4 GB of the physical memory range instead of the lower 4 GB, but this is worth trying.

    Have you successfully used the SIIG or DataPro cable?

    • ZachB says:

      The problem isn’t with the OS having more than 4GB of physical memory but with the BIOS/UEFI address assignment for the device. Therefore, adjusting the OS’s memory limit won’t affect the issue I was having (and didn’t in my attempts). I probably could have tried removing memory to get it down to 4GB but I didn’t. Ultimately, I got another machine with USB 3.0 support but never received the debug cable, so I wrote the drivers using a P2V VM + VirtualKD and then debugged on the physical host via crash dumps until I had it working sufficiently =/
