This time I will write about something I don't understand :-( Or at least not completely.
Some time ago I was debugging a new Intel hardware platform (Adlink ETX-BT, Celeron J1900) and experienced unexpected hangs during boot of the DEBUG version of my WINCE800 image.
The RELEASE version never gave problems and started properly.
After narrowing down the problem, I came across this piece of code inside PCIBUS.DLL (pcicfg.c, around line 800):
...
// Set the bus numbers & secondary latency
// Need to set the subordinate bus as max for now, then write
// actual number after found all downstream busses
(*pSubordinateBus)++;
Info.SecondaryLatency = (Info.SecondaryLatency > 0) ? Info.SecondaryLatency : pBusInfo->SecondaryLatency; // use global bus value if not defined for bridge
BusReg = PCIConfig_Read(Bus, Device, Function, PCIBRIDGE_BUS_NUMBER);
((PBRIDGE_BUS)(&BusReg))->PrimaryBusNumber = (BYTE)Bus;
((PBRIDGE_BUS)(&BusReg))->SecondaryBusNumber = (BYTE)(*pSubordinateBus);
((PBRIDGE_BUS)(&BusReg))->SubordinateBusNumber = 0xFF;
if (Info.SecondaryLatency) ((PBRIDGE_BUS)(&BusReg))->SecondaryLatencyTimer = (BYTE)Info.SecondaryLatency;
PCIConfig_Write(Bus, Device, Function, PCIBRIDGE_BUS_NUMBER, BusReg);
...
While enumerating the PCI bridges, stepping over the yellow-marked source code line on the 2nd PCI bridge instance (Bus 00, Dev 1C, Fun 01, 8086 0F4A), the device "hangs".
If you, however, skip this yellow-marked source code line (i.e. do NOT execute, Set Next Statement on the next source code line), there is no problem and you can continue debugging.
Bridge device | Before re-enumeration | After re-enumeration |
Bus | Dev | Fun | Order | Sub bus | Sec bus | Pri bus | Order | Sub bus | Sec bus | Pri bus |
00 | 1C | 00 | 1 | 1 | 1 | 0 | 2 | 1 | 1 | 0 |
00 | 1C | 01 | 3 | 2 | 2 | 0 | 4 | 2 | 2 | 0 |
00 | 1C | 02 | 5 | 4 | 3 | 0 | 8 | 4 | 3 | 0 |
03 | 00 | 00 | 6 | 4 | 4 | 3 | 7 | 4 | 4 | 3 |
00 | 1C | 03 | 9 | 5 | 5 | 0 | 10 | 5 | 5 | 0 |
What is this code doing?
The BIOS already executed the PCI bridge enumeration and had filled in the SecondaryBusNumber and SubordinateBusNumber.
PCIBUS.DLL in Windows CE actually re-executes the bus enumeration during startup (and finds the same enumeration order).
While recursively enumerating the PCI busses, it needs to pass all PCI configuration accesses through to the secondary bus "under investigation".
Therefore it temporarily writes SubordinateBusNumber = 0xFF, allowing all accesses to flow down to the secondary bus.
So you might expect this not to be a problem, since we are just filling in the same numbers. But in this particular case it is a problem.
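To make the forwarding rule concrete, here is a minimal sketch of how a PCI-to-PCI bridge decides whether to pass a configuration access downstream. It is not taken from PCIBUS.DLL: the type and function names (bus_numbers_t, unpack_bus_reg, pack_bus_reg, bridge_forwards) are made up for illustration, and I assume that PCIBRIDGE_BUS_NUMBER refers to the standard bus-number register at offset 0x18 of the bridge header, with the primary bus in the lowest byte.

```c
#include <stdint.h>

/* Byte fields of the 32-bit bus-number register of a PCI-to-PCI bridge
 * (assumed to be at config offset 0x18; names are illustrative only). */
typedef struct {
    uint8_t primary;            /* bus on the upstream side of the bridge   */
    uint8_t secondary;          /* bus directly behind the bridge           */
    uint8_t subordinate;        /* highest bus number behind the bridge     */
    uint8_t secondary_latency;  /* secondary latency timer                  */
} bus_numbers_t;

/* Split the raw register value into its four byte fields. */
bus_numbers_t unpack_bus_reg(uint32_t reg)
{
    bus_numbers_t b;
    b.primary           = (uint8_t)(reg & 0xFF);
    b.secondary         = (uint8_t)((reg >> 8) & 0xFF);
    b.subordinate       = (uint8_t)((reg >> 16) & 0xFF);
    b.secondary_latency = (uint8_t)((reg >> 24) & 0xFF);
    return b;
}

/* Combine the four byte fields back into the raw register value. */
uint32_t pack_bus_reg(bus_numbers_t b)
{
    return (uint32_t)b.primary
         | ((uint32_t)b.secondary << 8)
         | ((uint32_t)b.subordinate << 16)
         | ((uint32_t)b.secondary_latency << 24);
}

/* A bridge claims a configuration access when the target bus lies in
 * the window [secondary, subordinate]. Returns 1 if claimed, else 0. */
int bridge_forwards(bus_numbers_t b, uint8_t target_bus)
{
    return target_bus >= b.secondary && target_bus <= b.subordinate;
}
```

With the values of bridge 00:1C.01 (secondary 2, subordinate 2), bridge_forwards returns 0 for bus 4. Once the subordinate field is set to 0xFF, the same call returns 1, because the window now covers every bus from 2 upwards.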
Bus | Device | Function | VendorId | DeviceId | Description |
00 | 00 | 00 | 8086 | 0F00 | Host PCI Bridge |
00 | 02 | 00 | 8086 | 0F31 | Display Controller VGA/8514 |
00 | 1C | 00 | 8086 | 0F48 | PCI/PCI Bridge |
00 | 1C | 01 | 8086 | 0F4A | PCI/PCI Bridge |
00 | 1C | 02 | 8086 | 0F4C | PCI/PCI Bridge |
00 | 1C | 03 | 8086 | 0F4E | PCI/PCI Bridge |
00 | 1D | 00 | 8086 | 0F34 | USB |
00 | 1F | 00 | 8086 | 0F1C | PCI/ISA Bridge |
00 | 1F | 03 | 8086 | 0F12 | SMB |
01 | 00 | 00 | 11AB | 6101 | IDE Controller |
02 | 00 | 00 | 11AB | 6101 | IDE Controller |
03 | 00 | 00 | 104C | 8240 | PCI/PCI Bridge |
04 | 05 | 00 | 10EC | 8139 | Ethernet Controller (Realtek) |
04 | 06 | 00 | xxxx | xxxx | Other Bridge type |
05 | 00 | 00 | 8086 | 1539 | Ethernet Controller (Intel) |
I admit that I do not fully understand the problem.
My best guess is that when SubordinateBusNumber == 0xFF is filled in,
the PCI configuration accesses to the PCI-based KITL NIC (Bus 04, Dev 05, Fun 00, 10EC 8139) get lost (redirected) through this bridge (Bus 00, Dev 1C, Fun 01, 8086 0F4A).
KITL then fails, hence the "hang" of the device during debugging. Actually the device is still alive, but the Visual Studio Platform Debugger has lost its connection with the NIC.
This would also explain why there is no problem on a RELEASE build.
But why aren't the PCI configuration accesses lost when enumerating the first bridge (Bus 00, Dev 1C, Fun 00, 8086 0F48) instance then?
And why haven't I seen this problem on other hardware platforms before?
And does the KITL NIC driver need to access the PCI configuration registers in running condition?
Puzzling...
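The guess can at least be checked on paper. The sketch below is only a model, not driver code: it takes the BIOS-assigned secondary and subordinate numbers of the four bridges on bus 00 from the table above and counts how many of them would claim a configuration access to bus 04 (the KITL NIC) while one bridge is temporarily opened to 0xFF. The names bridge_window_t, bios_windows, count_claimers and count_claimers_with_open are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bus window of one bridge on bus 00 (illustrative type). */
typedef struct {
    uint8_t dev, fun;
    uint8_t secondary, subordinate;
} bridge_window_t;

/* BIOS-assigned windows of the bridges on bus 00, copied from the table. */
const bridge_window_t bios_windows[4] = {
    {0x1C, 0, 1, 1},
    {0x1C, 1, 2, 2},
    {0x1C, 2, 3, 4},
    {0x1C, 3, 5, 5},
};

/* Count the bridges whose [secondary, subordinate] window contains bus. */
size_t count_claimers(const bridge_window_t *w, size_t n, uint8_t bus)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (bus >= w[i].secondary && bus <= w[i].subordinate)
            hits++;
    }
    return hits;
}

/* Same count, but with bridge open_idx temporarily set to subordinate 0xFF,
 * as PCIBUS.DLL does while it enumerates behind that bridge. */
size_t count_claimers_with_open(const bridge_window_t *w, size_t n,
                                size_t open_idx, uint8_t bus)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t sub = (i == open_idx) ? 0xFF : w[i].subordinate;
        if (bus >= w[i].secondary && bus <= sub)
            hits++;
    }
    return hits;
}
```

With the BIOS values, count_claimers(bios_windows, 4, 4) gives 1: only bridge 00:1C.02 claims bus 04. With bridge 00:1C.01 opened (open_idx 1) the count becomes 2, which fits the guess. But opening the first bridge (open_idx 0) also gives 2, so this simple model predicts the same overlap there and does not answer why only the second bridge causes trouble.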
Skipping this line of code during boot is annoying, but at least it restored my ability to debug on this hardware.
If anyone has a better explanation for this problem, or a way to fix it, feel free to let me know...
Useful references: