
Wednesday, October 21, 2015

Debugging PCI bus with Windows CE

This time I will write about something I don't understand :-( Or at least not completely.

Some time ago I was debugging a new Intel hardware platform (Adlink ETX-BT, Celeron J1900) and I experienced unexpected hangs during boot of the DEBUG version of my WINCE800 image. The RELEASE version never gave problems and always started properly. After narrowing down the problem, I arrived at this piece of code inside PCIBUS.DLL (pcicfg.c, around line 800):

    ...
    // Set the bus numbers & secondary latency
    // Need to set the subordinate bus as max for now, then write
    // actual number after found all downstream busses
    (*pSubordinateBus)++;
    Info.SecondaryLatency = (Info.SecondaryLatency > 0) ? Info.SecondaryLatency : pBusInfo->SecondaryLatency; // use global bus value if not defined for bridge

    BusReg = PCIConfig_Read(Bus, Device, Function, PCIBRIDGE_BUS_NUMBER);
    ((PBRIDGE_BUS)(&BusReg))->PrimaryBusNumber = (BYTE)Bus;
    ((PBRIDGE_BUS)(&BusReg))->SecondaryBusNumber = (BYTE)(*pSubordinateBus);
    ((PBRIDGE_BUS)(&BusReg))->SubordinateBusNumber = 0xFF;
    if (Info.SecondaryLatency)
        ((PBRIDGE_BUS)(&BusReg))->SecondaryLatencyTimer = (BYTE)Info.SecondaryLatency;
    PCIConfig_Write(Bus, Device, Function, PCIBRIDGE_BUS_NUMBER, BusReg);
    ...

While enumerating the PCI bridges, stepping over the yellow-marked source code line on the 2nd PCI bridge instance (Bus 00, Dev 1C, Fun 01, 8086 0F4A), the device "hangs". If you, however, skip this yellow-marked source code line (i.e. do NOT execute, Set Next Statement on the next source code line), there is no problem and you can continue debugging.

Bridge device    Before re-enumeration              After re-enumeration
Bus Dev Fun      Order  Sub bus  Sec bus  Pri bus   Order  Sub bus  Sec bus  Pri bus
00  1C  00       1      1        1        0         2      1        1        0
00  1C  01       3      2        2        0         4      2        2        0
00  1C  02       5      4        3        0         8      4        3        0
03  00  00       6      4        4        3         7      4        4        3
00  1C  03       9      5        5        0         10     5        5        0

What is this code doing?
The BIOS has already executed the PCI bridge enumeration and filled in SecondaryBusNumber and SubordinateBusNumber for each bridge. PCIBUS.DLL in Windows CE re-executes the bus enumeration during startup (and finds the same enumeration order). While recursively enumerating the PCI busses, it must let all PCI configuration accesses pass through to the secondary bus "under investigation". Therefore it temporarily writes SubordinateBusNumber = 0xFF, so that accesses to any higher bus number flow down to the secondary bus. You might expect this to be harmless, since in the end we are just filling in the same numbers again. But in this particular case it is not.
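To make the forwarding rule concrete, here is a minimal sketch in C. A PCI-to-PCI bridge passes a configuration access downstream when the target bus number lies between its secondary and subordinate bus numbers. The type and function names (bridge_window, bridge_claims_bus, claimants) are made up for this illustration and are not part of PCIBUS.DLL. The sketch only shows that the windows of two bridges overlap while 0xFF is in place; what the hardware does with the overlap is not something it can tell you.

```c
#include <stdint.h>

/* Bus-number fields of a PCI-to-PCI bridge (hypothetical helper type). */
typedef struct {
    uint8_t primary;      /* bus the bridge itself sits on          */
    uint8_t secondary;    /* bus directly behind the bridge         */
    uint8_t subordinate;  /* highest bus number behind the bridge   */
} bridge_window;

/* A bridge forwards a configuration access downstream when the target
   bus lies in the range [secondary, subordinate]. */
int bridge_claims_bus(bridge_window w, uint8_t target_bus)
{
    return target_bus >= w.secondary && target_bus <= w.subordinate;
}

/* Count how many bridges in the array claim target_bus.
   A result above 1 means the bus-number windows overlap. */
int claimants(const bridge_window *bridges, int count, uint8_t target_bus)
{
    int n = 0;
    for (int i = 0; i < count; i++)
        n += bridge_claims_bus(bridges[i], target_bus);
    return n;
}
```

With the BIOS values from the table above (windows 1..1, 2..2, 3..4 and 5..5 on bus 0), bus 04 is claimed by exactly one bridge. As soon as the second bridge gets the window 2..0xFF, bus 04 is claimed by two bridges at once.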

Bus Device Function VendorId DeviceId Description
00 00 00 8086 0F00 Host PCI Bridge
00 02 00 8086 0F31 Display Controller VGA/8514
00 1C 00 8086 0F48 PCI/PCI Bridge
00 1C 01 8086 0F4A PCI/PCI Bridge
00 1C 02 8086 0F4C PCI/PCI Bridge
00 1C 03 8086 0F4E PCI/PCI Bridge
00 1D 00 8086 0F34 USB
00 1F 00 8086 0F1C PCI/ISA Bridge
00 1F 03 8086 0F12 SMB
01 00 00 11AB 6101 IDE controller
02 00 00 11AB 6101 IDE Controller
03 00 00 104C 8240 PCI/PCI Bridge
04 05 00 10EC 8139 Ethernet Controller (Realtek)
04 06 00 xxxx xxxx Other Bridge type
05 00 00 8086 1539 Ethernet Controller (Intel)

I admit that I do not fully understand the problem. My best guess is that once SubordinateBusNumber == 0xFF is written, the PCI configuration accesses to the PCI-based KITL NIC (Bus 04, Dev 05, Fun 00, 10EC 8139) get lost (redirected) through this bridge (Bus 00, Dev 1C, Fun 01, 8086 0F4A). KITL then fails, hence the "hang" of the device during debugging. The device is actually still alive, but the Visual Studio Platform Debugger has lost its connection with the NIC. This would also explain why there is no problem on a RELEASE build.
But then why aren't the PCI configuration accesses lost when enumerating the first bridge instance (Bus 00, Dev 1C, Fun 00, 8086 0F48)? And why haven't I seen this problem on other hardware platforms before? And does the KITL NIC driver need to access the PCI configuration registers while it is running? Puzzling...

At least skipping this line of code (during boot, but annoying) brought back my debugging possibilities on this hardware.

If anyone has a better explanation for this problem, or a solution how to fix it, feel free to let me know...


Posted On Wednesday, October 21, 2015 9:56 AM | Comments (0) | Filed Under [ Windows CE Windows Embedded Compact Microsoft Visual Studio 2012 Visual Studio 2013 Intel PCI Debugging ]

Monday, August 3, 2015

x86 bootloader for WCE8

This blog will reveal how to build the Windows CE bootloader for x86 that shipped with Windows CE 6. This bootloader also works for Windows CE 7 and 8 (CE2013).

Locate the x86 bootloader code in WINCE600\PLATFORM\CEPC\SRC\BOOTLOADER and copy it "as is" to WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER. Making your changes in <YourBsp> keeps the CEPC sources from being "polluted". You can also copy the BOOTLOADER sources to a Windows CE 7 or 8 platform directory. The bootloader works fine for these versions too.

The normal build step is to open a Release Directory in a Build Window and "Build -c".
If you do so on the BOOTLOADER directory, you will get a number of build errors. How to fix this?

First we need to understand what we can expect. The bootloader consists of 2 parts: BSECT and BLDR:

  • BSECT (BootSECTor) is a 512-byte binary image that is always located in sector 0 of a bootable disk. The BIOS recognizes it by checking that the last two bytes are 55 AA and then jumps to it. The code inside BSECT can do 2 things: either jump to the real OS bootloader or jump again to a new BSECT. The latter happens e.g. on a partitioned disk. Each physical disk can contain up to 4 (primary) partitions or volumes. One of them can be marked as the active partition, meaning it again starts with a BSECT sector. This 2nd BSECT will then jump to the OS bootloader.
  • BLDR is the actual Windows CE bootloader, here referred to as biosloader as it uses BIOS functionality for disk, video, COM, keyboard, ... access. Other bootloaders could implement their own driver code to access hardware and so be independent of a BIOS.
We will build both BSECT and BLDR. Contrary to a normal Windows CE build, no sources file is built; instead (DOS) batch files are used.
  • The batch file itself uses an old DOS tool, DEBUG.EXE, to extract (cut and paste) bytes from an EXE or COM executable. It is too old to explain every bit of it here, but you could use any other tool that can cut bytes at an offset + length from a file and paste them into another file. It is mainly used to strip the PE header off an EXE file. You should know that this tool only runs on 32-bit desktop Windows in compatibility mode. And DEBUG.EXE can only be run from a path that does not contain more than 64 characters... (old DOS tool, I said). As I run my Windows CE builds from inside a Virtual Machine that runs Windows 7 32bit, this is not a problem.
  • The batch file also uses an oldish assembler. The package that comes with this blog contains this old Windows 3.1/95 assembler... You could also use the freely available MASM32 package, which is compatible with the Microsoft Assembler.
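If you want to check a freshly built BSECT.IMG before writing it to disk, the boot signature test that the BIOS applies is easy to reproduce. The sketch below assumes the image has already been read into a buffer; the function name has_boot_signature is made up for this example. The signature bytes are 0x55 at offset 510 and 0xAA at offset 511.

```c
#include <stddef.h>
#include <stdint.h>

#define BOOT_SECTOR_SIZE 512u

/* Returns 1 when the buffer holds at least one full sector and the
   last two bytes of that sector are the boot signature 55 AA. */
int has_boot_signature(const uint8_t *sector, size_t len)
{
    if (sector == NULL || len < BOOT_SECTOR_SIZE)
        return 0;
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

A sector that fails this check will not be booted by the BIOS, whatever code it contains.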

Now, let us examine what is located in the BOOTLOADER directory. We are only interested in the BIOSLOADER subdirectory. This directory in turn contains the following directories:

  1. BOOTSECTOR: contains the sources to build the BSECT image.
  2. DISKIMAGES: contains ready to use images. We are not using these, we build our own.
  3. INIPARSER: builds a library used to parse commandline options and some utility functions.
  4. LOADER: contains the sources to build the BLDR bios loader
  5. UTILITIES: sources of some utility programs needed or useful to prepare your bootable disk to boot a Windows CE OS.

Building.
We build for FAT32 filesystem, the most commonly used filesystem supported by all Windows (and Linux) versions.

  1. BSECT.IMG:
    1. Go to WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\BOOTSECTOR\FAT32
    2. Run from Release build window 'build2.bat'. The batch file is an edited version of the original 'build.bat', mainly referring to other paths (assembler and linker).
    3. This builds a FAT32 compatible partition bootloader.
    4. The result is BSECT.IMG, a binary image of 512 bytes.
  2. BLDR:
    1. Go to WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\INIPARSER
    2. Run from Release build window 'build -c'. It is important to build the Release version to fit in 64k memory.
    3. Go to WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\LOADER\FIXED
    4. Run from Release build window 'build -c'. It is important to build the Release version to fit in 64k memory.
    5. Go to WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\LOADER\FIXED\FAT32
    6. Run from Release build window 'makebldr2.bat'. The batch file is an edited version of the original 'build.bat'. It refers to boot2.bib, which in turn is an edited version of the original bib file (RAMSIZE increased to fit in memory).
      BLDR 00001000 00006000 RAMIMAGE ; Size should be evenly divisible by 512b.
    7. The result is BLDR, a binary image of 24k bytes that runs in 64k memory space. 'makebldr2.bat' uses the 'build' tool that can only work in 32bit Windows and cannot be run from a full path with more than 64 characters. If so, copy the contents of FIXED\FAT32 folder to a folder with less characters before executing 'makebldr2.bat'

Sources.
As usual, there are bugs in the shipped source code.

  1. BLDR cannot locate NK.BIN if its first cluster is located above cluster# 65535. When you create a clean formatted disk and copy NK.BIN to the FAT32 filesystem, NK.BIN's first cluster is typically below 65535. The source code only takes the lowest 16bit cluster value into account. However if you delete, move, copy the file often, it might be located above 65535 on larger sized disks with > 64k clusters. If so, BLDR cannot locate NK.BIN anymore, failing to boot.
  2. To fix it, change the following in WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\LOADER\FIXED\FAT32\fat32.h

    //
    // FAT directory entry structure.
    //
    #pragma pack(1)
    #if 0
    // Original
    typedef struct
    {
        UCHAR  FileName[8];
        UCHAR  FileExt[3];
        UCHAR  FATTr;
        UCHAR  FOO[10];
        USHORT LModTime;
        USHORT LModDate;
        USHORT FirstClust;
        ULONG  FileSize;
    } DIRENTRY, *PDIRENTRY;
    #else
    // New, more correct
    typedef struct
    {
        UCHAR  FileName[8];
        UCHAR  FileExt[3];
        UCHAR  FATTr;
        ULONG  CreateTime;
        USHORT CreateDate;
        USHORT LAccessDate;
        USHORT FirstClustH; // high 16bit
        USHORT LModTime;
        USHORT LModDate;
        USHORT FirstClustL; // low 16bit
        ULONG  FileSize;
    } DIRENTRY, *PDIRENTRY;
    #endif
    #pragma pack()

    And change WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\LOADER\FIXED\FAT32\bldr_fat32.c

    if (!bFound || pDirEntry == NULL)
    {
        WARNMSG(MSG_FILE_NOT_FOUND, ("File '%s' not found\n", pFileName));
        return(0);
    }
    else
    {
        // Create 32bit cluster to locate file in larger sized disks with > 64K clusters.
        // Cast before shifting, so the shift is done on a 32bit value whatever the size of int is.
        ULONG FirstClust = ((ULONG)pDirEntry->FirstClustH << 16) | pDirEntry->FirstClustL;
        INFOMSG(MSG_FILE_FOUND, ("Found file '%s' (start=0x%x size=0x%x)\n", pFileName, FirstClust, pDirEntry->FileSize));
    }

  3. While starting, BLDR queries the BIOS to list the available video modes. These video modes are listed in a BIOS table; usually the first table entries are filled consecutively with valid video modes. However, some BIOSes do not list valid video modes in the first entries; instead they are listed further down the table. BLDR did not handle this situation correctly.

    To fix it, change the following in WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\LOADER\FIXED\MAIN\video.c

    // Iterate over available video modes
    for (i = 0; pVideoModeList[i] != 0xFFFF; i++)
    {
        if ((ulVbeStatus = BIOS_VBEModeInfo(&vesaModeInfo, pVideoModeList[i])) != 0x004F)
        {
            WARNMSG(MSG_CANNOT_GET_MODE_INFO, ("Cannot get mode info (status=0x%x)\n", ulVbeStatus));
            //break; // Some BIOS return an error code on this 'pVideoModeList[i]' instance, but continue on the next one. Don't break, but continue search...
        }
        ...
    }
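Going back to the first bug (the FAT32 start cluster): the effect of dropping the high word is easy to see in isolation. The sketch below is not the BLDR code; fat32_first_cluster is a made-up helper that combines the two 16-bit directory entry fields the same way the fix does, using fixed-width types so it does not depend on the size of int.

```c
#include <stdint.h>

/* Combine the high and low 16-bit halves of a FAT32 directory entry's
   first-cluster field into one 32-bit cluster number. The cast makes
   sure the shift happens on a 32-bit value. */
uint32_t fat32_first_cluster(uint16_t high, uint16_t low)
{
    return ((uint32_t)high << 16) | low;
}
```

For a file that starts at cluster 0x00012345, the original code only looked at the low word and would go looking at cluster 0x2345, which belongs to some other file (or to nothing at all).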

WCELDR

From WCE7 onwards, CEPC platform ships with WCELDR, an alternative bootloader for x86 platforms. It has many advantages over BLDR, but cannot work with single partition/volume disk images. This is usually not a problem, it depends mainly on how you format your CE boot disk image.

CeSys.exe

CeSys.exe is a tool to prepare your freshly formatted (FAT32) disk image to be bootable from the first partition/volume. It also copies BSECT to the first sector (MasterBootRecord, MBR). Although the sources suggest it is a DOS tool, you can easily compile it as a modern desktop Windows program. You can find it in WINCE600\PLATFORM\<YourBsp>\SRC\BOOTLOADER\BIOSLOADER\UTILITIES\CESYS\DOS.

Create a bootable BLDR disk

  1. Build BSECT.IMG and BLDR
  2. Format your disk (from desktop Windows) to FAT32 filesystem. Set volume label to empty (blank).
  3. Build CeSys.exe
  4. Run 'CeSys.exe z: bsect.img bldr -f -a' to make your disk CE bootable.
  5. Run 'BinCompress.exe /C YourSplashImage.bmp splash.bmx' from a WCE Release Directory in a Build Window
  6. Copy NK.BIN, boot.ini and splash.bmx to the root of your disk.
  7. Insert disk in your CE device and boot.


Posted On Monday, August 3, 2015 8:48 PM | Comments (0) | Filed Under [ Windows Embedded Compact Embedded Microsoft BootLoader ]

Monday, July 13, 2015

x86 ISR for WCE8

In Windows CE all hardware interrupts are handled by 1 Interrupt Service Routine (ISR) at the lowest level. Its main purpose is to identify the Interrupt Request (IRQ) source and handle some low level hardware clearing and resetting of the Interrupt Controller such that new interrupts can be triggered. Once the IRQ is recognized by the ISR, it is handed over to an Interrupt Service Thread (IST). This has the advantage that the hardware IRQ priority can be overruled by the IST thread priority scheduling. Moreover when IRQs of multiple devices are shared on the same IRQ pin, each device interrupt can be handled separately with its own priority in the IST. Typically the device driver handling the IST is responsible for clearing and resetting the device hardware itself such that a new interrupt can be triggered. (Note that on x86 hardware you typically need to clear and reset on 2 levels: the Interrupt Controller (PIC or APIC) and the Device Controller)

Some time ago I was porting a BSP from WCE6 to WCE8. This didn't involve many changes (most of the time just recompiling the drivers). However, when the new WCE8 NK.BIN image was running, the NIC and USB functionality randomly (but most of the time) didn't work. The NIC (RTL8139 compatible) was recognized, but even a simple ping was not possible, nor did any other NIC functionality work properly. Also 2 of the 8 USB ports didn't work. This device used an Intel ICH7 chipset. Running the same image on an Intel NM10 chipset gave no problem. Running an older WCE6 image never gave any trouble on any chipset.

What was going on?

From the start I had a feeling that the device IRQs had something to do with it, because the NIC, 1 USB UHCI controller (out of 4) and the onboard ACPI controller all shared the same IRQ (reported by the BIOS via ACPI). Here is what I found in the Intel chipset configuration registers and ACPI tables:

Chipset Configuration Register   Interrupt Route Register    LPC PCI Config register
INT A, B, C, D                   PIRQ A, B, C, D, E, F, G, H 60-63:PIRQ[A-D], 68-6B:PIRQ[E-H]
3100 SMBUS B                     3140 INT D -> PIRQ A
3100 SATA SIP B                  3140 INT C -> PIRQ A
3100 SATA PIP A                  3140 INT B -> PIRQ D        PIRQ D -> IRQ15
3100 PCI BRIDGE -                3140 INT A -> PIRQ C        PIRQ C -> IRQ15
3104 AC97 B                      3142 INT D -> PIRQ A
3104 AC97 A                      3142 INT C -> PIRQ A
3104 - -                         3142 INT B -> PIRQ E        PIRQ E -> IRQ 5
3104 LPC BRIDGE -                3142 INT A -> PIRQ B        PIRQ B -> IRQ10
3108 EHCI A                      3144 INT D -> PIRQ A        PIRQ H -> IRQ 9
3108 UHCI3 D                     3144 INT C -> PIRQ C        PIRQ A -> IRQ11
3108 UHCI2 C                     3144 INT B -> PIRQ D        PIRQ C -> IRQ15
3108 UHCI1 B                     3144 INT A -> PIRQ H        PIRQ D -> IRQ15
3108 UHCI0 A                                                 PIRQ H -> IRQ 9
310C PCIEXPRESS #6 -             3146 INT D -> PIRQ D
310C PCIEXPRESS #5 -             3146 INT C -> PIRQ C
310C PCIEXPRESS #4 -             3146 INT B -> PIRQ B
310C PCIEXPRESS #3 -             3146 INT A -> PIRQ A
310C PCIEXPRESS #2 -
310C PCIEXPRESS #1 -

ACPI Control Register            IRQ select
LPC I/F D31:F0 PCI Reg 44        IRQ 9

NIC RTL 8139 compatible          IRQ 9
From the tables we learn that EHCI, UHCI0, ACPI and NIC all shared IRQ9. That couldn't be a coincidence.

Let us take a look at 'WINCE800\platform\<yourbsp>\SRC\x86\COMMON\intr\fwpc.c'. It contains the function static ULONG PeRPISR() which is the ISR we talked about in the introduction. Comparing the source code between WCE6 and WCE8 revealed that the spurious interrupt handling code was changed (near the end of PeRPISR()). In WCE8 the reset of the 8259 PIC interrupt controller is not executed inside the ISR but instead delayed to a separate background thread. If we revert the code back to how it was on WCE6, the problems disappeared again.

Probably the delayed spurious interrupt handling resets the PIC too late (i.e. re-enable of PIC IRQ#), causing other interrupts to be lost. It is also worthwhile to mention that enabling the UHCI0 function in ICH7 caused an unwanted single occurrence spurious interrupt of the ACPI controller, causing this problem. Other Intel chipsets did not show this behaviour...

    if (ulRet == SYSINTR_NOP)
    {
        // If SYSINTR_NOP, IRQ claimed by installed ISR, but no further action required
        PICEnableInterrupt(ucCurrentInterrupt, TRUE);
    }
    else if (ulRet == SYSINTR_UNDEFINED || !NKIsSysIntrValid(ulRet))
    {
        // If SYSINTR_UNDEFINED, ignore
        // If SysIntr was never initialized, ignore
        OALMSGS(OAL_WARN&&OAL_INTR, (L"Spurious interrupt on IRQ %d\r\n", ucCurrentInterrupt));
    #if 0
        // This code is new since CE7, but gives sometimes trouble. When UHCI0 is enabled on ICH7 and its interrupt is shared with ACPI on IRQ9
        // a spurious interrupt occurs (probably because of ACPI, although not enabled).
        // If the same IRQ is shared with other PCI devices (like ethernet), the device will also malfunction (losing interrupts)
        // When the next line is used, it will trigger OALUnknownIRQHandler() in map_refcount.c but it comes too late.
        // Hence many interrupts might already been masked out for a too long period...
        ulRet = OALMarkUnknownIRQ(ucCurrentInterrupt);
    #else
        // Instead it is better to use the pre CE7 code that re-enables interrupt on the PIC immediately.
        ulRet = SYSINTR_NOP;
        PICEnableInterrupt(ucCurrentInterrupt, TRUE);
    #endif
    }


Posted On Monday, July 13, 2015 10:23 PM | Comments (1) | Filed Under [ Windows CE Windows Embedded Compact USB UHCI EHCI APIC ACPI Interrupt IRQ PIC Intel ]

Monday, May 25, 2015

On Demand Paging in Windows CE

Recently I was working on a tool program that allowed me to read/rewrite the complete binary image of a drive (HDD) in Windows CE 8 (Compact Edition 2013). We use this program to restore a complete image to a (bootable) drive as part of the software upgrade of our device/application.

At some part of writing the image to the drive, the program itself crashed with an AV while reading a static const array of integers. Debugging with Visual Studio 2012 showed that at that particular place half of the static const array memory area was not readable (at a 4k page boundary). At first this was very odd, because when I took the code apart and debugged with a separate test program, nothing was wrong.

What was going on?

Remember what I said in the beginning. "At some part of writing the image...". Well, in order to write (or read) unconditionally, unaware of any filesystem, to the drive, the program first had to unmount the filesystem. Next the drive can be safely written with new binary data. However the program itself initially resided on the drive that was going to be overwritten with a complete new image (potentially even with a new filesystem).

In itself this is not a problem. Or so I thought. I assumed that the OS (WCE8 in this case) would read the complete executable into memory before it started the main() code (it is a C++ program).

Enter the "On Demand Paging" feature of Windows CE. To preserve memory, the OS can page out unwanted blocks of memory, which can be read back into memory at a later stage. And instead of reading the complete executable image into memory before executing it, the OS loads only a part of the executable code and static data. If the executable at some point jumps to a code address or accesses statics/constants (defined in .data segment) that are not yet loaded into memory, the OS hits a page fault exception underneath and handles it appropriately, i.e. it reads a new page (4k by default) from disk into memory. However, if the drive is no longer accessible (as the program itself acted as Grim Reaper on the drive), the OS cannot load the missing executable region with the static const array into memory. This happened completely invisibly to the executable and the Visual Studio debugger, probably because a page fault exception is handled gracefully by the OS as a "normal" situation. The end result was that some code and constant regions were no longer readable with valid data, causing the AV in my program.

What can you do to solve this?

  • Make sure all code has been "touched" or run before you actually unmount the drive. I.e. make sure all code and constants are paged into memory.
  • ROMFLAGS in config.bib allows you to disable "On Demand Paging" OS wide. The MSDN states that this gives better realtime performance. You should read this as: if a program is executing realtime code (needs to run deterministically), disabling On Demand Paging avoids unpredictable reading of code into memory prior to executing the code.
  • Play with Pageable flag of C++ linker option /SECTION
  • I disabled On Demand Paging system wide as our applications favor realtime performance. Set ROMFLAGS in config.bib.

    ;
    ; ROMFLAGS is a bitmask of options for the kernel
    ;   ROMFLAGS 0x0001 Disallow Paging
    ;   ROMFLAGS 0x0010 Trust Module only
    ;
    ; We disable Demand Paging system wide as this will improve realtime performance
    ;
    IF IMGTRUSTROMONLY
        ROMFLAGS=11 ; 10
    ELSE
        ROMFLAGS=01 ; 00
    ENDIF
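The first option in the list ("touch" everything before unmounting) can be sketched for constant data as follows. This is only an illustration under assumptions: the helper name touch_pages is made up, the 4k page size is the default mentioned above, and I assume that a read through a volatile pointer is enough to make the OS fault the page in. It does not lock the pages, so the OS is still free to page them out again later, and code pages would need the same treatment.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMAND_PAGE_SIZE 4096u  /* default page size, an assumption */

/* Read one byte in every page covered by [start, start + len), so the
   pager has to bring each page into memory while the drive is still
   mounted. Returns the number of page-sized steps that were read. */
size_t touch_pages(const void *start, size_t len)
{
    const volatile uint8_t *p = (const volatile uint8_t *)start;
    size_t touched = 0;

    if (p == NULL || len == 0)
        return 0;

    for (size_t off = 0; off < len; off += DEMAND_PAGE_SIZE) {
        (void)p[off];           /* volatile read, cannot be optimized away */
        touched++;
    }
    (void)p[len - 1];           /* covers the last page when start is not page aligned */
    return touched;
}
```

You would call something like touch_pages(myTable, sizeof(myTable)) for each static const array right before the unmount. Disabling paging via ROMFLAGS remains the more robust choice.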


    Posted On Monday, May 25, 2015 2:05 PM | Comments (2) | Filed Under [ Windows CE Windows Embedded Compact RTOS Visual Studio 2012 Visual Studio 2013 ]

    Saturday, August 2, 2014

    HPET for x86 BSP (how to build it for WCE8)

    "I needed a timer". That is how we started our series about APIC and ACPI a few blogs ago.

    Well, here it is.
    HPET (High Precision Event Timer) was introduced by Intel in early 2000 to:

  • Replace old style Intel 8253 (1981!) and 8254 timers
  • Support more accurate timers that could be used for multimedia purposes. Hence Microsoft and Intel sometimes refer to HPET as Multimedia timers.

    An HPET chip consists of a 64-bit up-counter (main counter) counting at a frequency of at least 10 MHz, and a set of (at least three, up to 256) comparators. These comparators are 32 or 64 bits wide. The HPET is discoverable via ACPI. The HPET circuit in recent Intel platforms is integrated into the SouthBridge chip (e.g. 82801). All HPET timers should support one-shot interrupt programming, while optionally they can support periodic interrupts. In most Intel SouthBridges I worked with, there are three HPET timers. TIMER0 supports both one-shot and periodic mode, while TIMER1 and TIMER2 are one-shot only.

    Each HPET timer can generate interrupts, both in old-style PIC mode and in APIC mode. However in PIC mode, interrupts cannot freely be chosen. Typically IRQ11 is available and cannot be shared with any other interrupt! Which makes the HPET in PIC mode virtually unusable. In APIC mode however more IRQs are available and can be shared with other interrupt generating devices. (Check the datasheet of your SouthBridge) Because of this higher level of freedom, I created the APIC BSP (see previous posts). The HPET driver code that I present you here uses this APIC mode.

    Hpet.reg
    [HKEY_LOCAL_MACHINE\Drivers\BuiltIn\Hpet]
    "Dll"="Hpet.dll"
    "Prefix"="HPT"
    "Order"=dword:10
    "IsrDll"="giisr.dll"
    "IsrHandler"="ISRHandler"
    "Priority256"=dword:50

    Because HPET does not reside on the PCI bus, but can be found through ACPI as a memory mapped device, you don't need to specify the "Class", "SubClass", "ProgIF" and other PCI related registry keys that you typically find for PCI devices.
    If a driver needs to run its internal thread(s) at a certain priority level, by convention in Windows CE you add the "Priority256" registry key. Through this key you can easily play with the driver's thread priority for better response and timer accuracy. See later.

    Hpet.cpp (Hpet.dll)

    This cpp file contains the complete HPET driver code. The file is part of a folder that you typically integrate in your BSP (\src\drivers\Hpet). It is written as sample (example) code, you most likely want to change this code to your specific needs. There are two sets of #define's that I use to control how the driver works.

  • _TRIGGER_EVENT or _TRIGGER_SEMAPHORE: _TRIGGER_EVENT will let your driver trigger a Windows CE Event when the timer expires, _TRIGGER_SEMAPHORE will trigger a Windows CE counting Semaphore. The latter guarantees that no events get lost in case your application cannot always process the triggers fast enough.
  • _TIMER0 or _TIMER2: both timers will trigger an event or semaphore periodically. _TIMER0 will use a periodic HPET timer interrupt, while _TIMER2 will reprogram a one-shot HPET timer after each interrupt. The one-shot approach is interesting if the frequency you wish to generate is not an even multiple of the HPET main counter frequency. The sample code uses an algorithm to generate a more correct frequency over a longer period (by reducing rounding errors).
    _TIMER1 is not used in the sample source code.

    HPT_Init() will locate the HPET I/O memory space, set up the HPET counter (_TIMER0 or _TIMER2) and install the Interrupt Service Thread (IST). Upon timer expiration, the IST will run and in turn generate a Windows CE Event or Semaphore. In case of _TIMER2 a new one-shot comparator value is calculated and set for the timer.
    The IRQ of the HPET timers is programmed to IRQ22, but you can typically choose from 20-23. The TIMERn_INT_ROUT_CAP bits in the TIMn_CONF register will tell you which IRQs you can choose from.
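Checking whether a given IRQ is allowed for a timer is a single bit test. In the HPET register layout the interrupt routing capability is the upper 32 bits of the timer's configuration and capability register, and bit n set means the timer can be routed to IRQ n. The sketch below assumes you have already read that upper DWORD; the function name hpet_irq_routable and the example mask are made up for illustration.

```c
#include <stdint.h>

/* int_route_cap: upper 32 bits of the timer's configuration and
   capability register. Returns 1 when the timer may be routed to irq. */
int hpet_irq_routable(uint32_t int_route_cap, unsigned irq)
{
    if (irq > 31)
        return 0;
    return (int)((int_route_cap >> irq) & 1u);
}
```

A capability mask of 0x00F00000 would correspond to the "20-23" range mentioned above: IRQ22 passes the test, IRQ11 does not.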

    HPT_IOControl() can be used to set a new HPET counter frequency (actually you configure the counter timeout value in microseconds), start and stop the timer, and request the current HPET counter value. The latter is interesting because the Windows CE QueryPerformanceCounter() and QueryPerformanceFrequency() APIs implement the same functionality, albeit based on other counter implementations.

    HpetDrvIst() contains the IST code.

    DWORD WINAPI HpetDrvIst(LPVOID lpArg)
    {
        psHpetDeviceContext pHwContext = (psHpetDeviceContext)lpArg;
        DWORD mainCount = READDWORD(pHwContext->g_hpet_va, GenCapIDReg + 4); // Main Counter Tick period (femtoseconds, 1E-15 s)
        DWORD i = 0;

        while (1)
        {
            WaitForSingleObject(pHwContext->g_isrEvent, INFINITE);

    #if defined(_TRIGGER_SEMAPHORE)
            LONG p = 0;
            BOOL b = ReleaseSemaphore(pHwContext->g_triggerEvent, 1, &p);
    #elif defined(_TRIGGER_EVENT)
            BOOL b = SetEvent(pHwContext->g_triggerEvent);
    #else
    #error Unknown TRIGGER
    #endif

    #if defined(_TIMER0)
            DWORD currentCount = READDWORD(pHwContext->g_hpet_va, MainCounterReg);
            DWORD comparator = READDWORD(pHwContext->g_hpet_va, Tim0_ComparatorReg + 0);
            SETBIT(pHwContext->g_hpet_va, GenIntStaReg, 0); // clear interrupt on HPET level
            InterruptDone(pHwContext->g_sysIntr);           // clear interrupt on OS level
            _LOGMSG(ZONE_INTERRUPT, (L"%s: HpetDrvIst 0 %06d %08X %08X", pHwContext->g_id, i++, currentCount, comparator));
    #elif defined(_TIMER2)
            DWORD currentCount = READDWORD(pHwContext->g_hpet_va, MainCounterReg);
            DWORD previousComparator = READDWORD(pHwContext->g_hpet_va, Tim2_ComparatorReg + 0);
            pHwContext->g_counter2.QuadPart += pHwContext->g_comparator.QuadPart; // increment virtual counter (higher accuracy)
            DWORD comparator = (DWORD)(pHwContext->g_counter2.QuadPart >> 8);     // "round" to real value
            WRITEDWORD(pHwContext->g_hpet_va, Tim2_ComparatorReg + 0, comparator);
            SETBIT(pHwContext->g_hpet_va, GenIntStaReg, 2); // clear interrupt on HPET level
            InterruptDone(pHwContext->g_sysIntr);           // clear interrupt on OS level
            _LOGMSG(ZONE_INTERRUPT, (L"%s: HpetDrvIst 2 %06d %08X %08X (%08X)", pHwContext->g_id, i++, currentCount, comparator, comparator - previousComparator));
    #else
    #error Unknown TIMER
    #endif
        }
        return 1;
    }
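The ">> 8" in the _TIMER2 branch is the rounding trick: the virtual counter carries 8 fractional bits, so the rounding error of one period is carried over into the next one instead of being thrown away. Here is a standalone sketch of that idea. It is not the driver code: the function names hpet_step_fp8 and hpet_next_comparator are made up, and I assume the step is derived from the tick period in femtoseconds and a timeout in microseconds.

```c
#include <stdint.h>

/* Number of main counter ticks per timeout, as a fixed-point value with
   8 fractional bits. period_fs is the tick period in femtoseconds,
   timeout_us the wanted period in microseconds (1 us = 1e9 fs).
   Note: the intermediate value overflows for timeouts above roughly
   72 seconds, which is fine for a sketch. */
uint64_t hpet_step_fp8(uint32_t period_fs, uint32_t timeout_us)
{
    if (period_fs == 0)
        return 0;
    uint64_t timeout_fs = (uint64_t)timeout_us * 1000000000ULL;
    return (timeout_fs << 8) / period_fs;
}

/* Advance the virtual counter by one step and return the comparator
   value to program: the integer part of the virtual counter. */
uint32_t hpet_next_comparator(uint64_t *acc_fp8, uint64_t step_fp8)
{
    *acc_fp8 += step_fp8;
    return (uint32_t)(*acc_fp8 >> 8);
}
```

With a step of 2.5 ticks (640 in this fixed-point format), the successive comparator values are 2, 5, 7, 10: the individual intervals alternate between 2 and 3 ticks, but the average is exactly 2.5.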

    The following figure shows how the HPET hardware interrupt via ISR -> IST is translated in a Windows CE Event or Semaphore by the HPET driver. The Event or Semaphore can be used to trigger a Windows CE application.


    HpetTest.cpp (HpetTest.exe)

    This cpp file contains sample source how to use the HPET driver from an application. The file is part of a separate (smart device) VS2013 solution. It contains code to measure the generated Event/Semaphore times by means of GetSystemTime() and QueryPerformanceCounter() and QueryPerformanceFrequency() APIs.

    HPET evaluation

    If you scan the internet about HPET, you'll find many remarks about buggy HPET implementations and bad performance.
    Unfortunately that is true. I tested the HPET driver on an Intel ICH7M SBC (release date 2008). When an HPET timer expires on the ICH7M, an interrupt is indeed generated, but right after you clear the interrupt, a few more unwanted interrupts (too soon!) occur as well. I tested and debugged it for a loooong time, but I couldn't get it to work. I concluded that ICH7M's HPET is buggy Intel hardware.
    I tested the HPET driver successfully on a more recent NM10 SBC (release date 2013).

    With the NM10 chipset however, I am not fully convinced about the timer's frequency accuracy. In the long run - on average - all is fine, but occasionally I experienced delays of up to 20 microseconds (which were immediately compensated on the next interrupt). Of course, this was all measured by software, but I still experienced the occasional delay when both the HPET driver IST thread and the application thread ran at CeSetThreadPriority(1). If it is not the hardware, only the kernel can cause this delay. But Windows CE is an RTOS and I have never experienced such long delays with previous versions of Windows CE. I tested and developed this on WCE8, and I am not heavily experienced with it yet. Internet forum threads however mention inaccurate HPET timer implementations as well. At this moment I haven't figured out what is going on here.


    Posted On Saturday, August 2, 2014 10:02 PM | Comments (0) | Filed Under [ Windows CE Windows Embedded Compact RTOS APIC ACPI Visual Studio 2013 HPET ]

    Tuesday, July 1, 2014

    PCI Latency Timer

    In my previous blog I mentioned I was involved in searching for Windows socket data that got corrupted upon reception in a Windows CE 6 executable.

    The short explanation is that the Realtek 100MBit (RTL8139, RTL8101) network interface card simply couldn’t swallow high load bursts of TCP packets (The long explanation took weeks...). The Realtek card couldn’t transfer its data fast enough to the cpu’s main memory for further handling by NDIS and the Windows CE 6 TCP/IP stack. The loss of data showed itself on the wire (WireShark) as TCP RETRANSMISSIONS and when applicable as TCP FAST RETRANSMISSIONS.

    What was happening? Well, the NIC acts as a PCI Bus Master and therefore the PCI Latency Timer in the PCI configuration registers comes into play. The PCI Latency Timer determines how long a PCI Bus Master can stay on the PCI bus before relinquishing access to the bus (in favor of other bus masters requesting access). This value is by convention 32 [0 .. 255] bus clock cycles and is set by the BIOS for each PCI bus master.

    The BIOS of the SBC that we were using however set the value to 0 (zero!). 0 means relinquishing access immediately when another bus master requests access. As a result, TCP packets sometimes got lost under high load. By setting the value to 32 the problem disappeared.

    A value of 32 is a good default and can be given to each PCI bus master present on the bus. Under rare, very high load conditions however, packets still got lost. Therefore I wrote a small piece of code, executed by the Windows CE image at bootup (in oeminit.c), that sets the value to 32 for most PCI bus masters, but to 64 for Ethernet network cards. This fine-tuning completely solved the problem.

    void FixupPciLatencyTimer(void)
    {
        USHORT deviceId;
        USHORT command;
        USHORT devId;
        BYTE latencyTimer;
        BYTE classCode;
        BYTE subClassCode;
        BYTE b, d, f;
        BOOL isBusMaster;

        PCIReadBusData(0x0, 0x1F, 0x0, &deviceId, 0x02, 2);
        RETAILMSG(1, (TEXT("fixupPciLatencyTimer LPC deviceId = 0x%x\r\n"), deviceId));

        for (b = 0; b < 255; ++b) // bus
        {
            for (d = 0; d < 32; ++d) // device
            {
                for (f = 0; f < 8; ++f) // function
                {
                    PCIReadBusData(b, d, f, &devId, 0x02, 2);
                    if (devId != 0xFFFF)
                    {
                        PCIReadBusData(b, d, f, &classCode, 0x0B, 1);
                        PCIReadBusData(b, d, f, &subClassCode, 0x0A, 1);
                        PCIReadBusData(b, d, f, &command, 0x04, 2);
                        isBusMaster = (command & 0x0004) == 0x0004;
                        RETAILMSG(1, (TEXT("%02x %02x %01x %02x %02x : cmd = %04x %s\r\n"), b, d, f, classCode, subClassCode, command, isBusMaster ? TEXT("BusMaster") : TEXT("-")));

                        PCIReadBusData(b, d, f, &latencyTimer, 0x0D, 1);
                        RETAILMSG(1, (TEXT("%02x %02x %01x %02x %02x : latTim = %02x\r\n"), b, d, f, classCode, subClassCode, latencyTimer));

                        if (isBusMaster /*&& latencyTimer == 0x00*/) // It is possible that Latency Time is ReadOnly
                        {
                            if ( classCode == 0x02 &&   // class = 0x02 = network controller
                                 subClassCode == 0x00)  // subclass = 0x00 = ethernet controller
                            {
                                latencyTimer = 64; // good default value. Required for Realtek 100MBit nic cards.
                            }
                            else
                            {
                                latencyTimer = 32; // good default value.
                            }
                            PCIWriteBusData(b, d, f, &latencyTimer, 0x0D, 1); // write
                            PCIReadBusData (b, d, f, &latencyTimer, 0x0D, 1); // read back
                            RETAILMSG(1, (TEXT("%02x %02x %01x %02x %02x : set latTim = %02x\r\n"), b, d, f, classCode, subClassCode, latencyTimer));
                        }
                    }
                }
            }
        }
    }

    The attentive reader however might argue (correctly) that lost TCP packets should never lead to lost or corrupted data. The TCP checksum and retransmission mechanisms should guarantee error-free delivery of data.

    Well, this was not the case. Therefore I believe there is still a bug in the TCP FAST RETRANSMISSION mechanism in the Windows CE 6 TCP/IP stack. When TCP FAST RETRANSMISSIONS were logged in the WireShark traces, we sometimes saw the wrong data being delivered to our application over the Windows socket.

    As I could circumvent the problem by setting the PCI Latency Timer to a reasonable value (64), I stopped debugging it. However, this code will be ported to Windows CE 8 in the future and I will certainly pick it up again then. The Windows CE 8 TCP/IP stack is based on the desktop Windows 7 TCP/IP stack, while the Windows CE 6 TCP/IP stack is based on the desktop Windows XP TCP/IP stack. That is different enough to check whether the behavior also differs when the PCI Latency Timer is set back to 0.

    We also tested the same PCI Latency Timer = 0 on an Intel PRO 100 network card, but that card always behaved correctly. Even with a value of 0, no data packets were lost, nor were TCP (FAST) RETRANSMISSIONS visible on the wire.

    Posted On Tuesday, July 1, 2014 9:29 PM | Comments (0) | Filed Under [ Windows CE Windows Embedded Compact Microsoft NDIS TCP/IP ]

    NDIS Packet Capturing DLL

    Recently I was involved in searching for Windows socket data that got corrupted upon reception in a Windows CE 6 executable. The data was transmitted from a Windows 7 desktop PC.

    At first it was not clear where the problem was located.

  • Was it the Windows 7 (C#) application?
  • Was it the Windows 7 TCP/IP stack?
  • Was it the Windows 7 NDIS?
  • Was it the Windows 7 network interface driver?
  • Was it the Windows 7 PCI interface between the network card and the cpu (main memory)?
  • Was it the Windows 7 network interface card?
  • Was it the Windows 7 PC?

  • Was it the cable or hardware? Was it noise or EMI?

  • Was it the Windows CE 6 ePC (embedded PC)?
  • Was it the Windows CE 6 network interface card?
  • Was it the Windows CE 6 PCI interface between the network card and the cpu (main memory)?
  • Was it the Windows CE 6 network interface driver?
  • Was it Windows CE 6 NDIS?
  • Was it Windows CE 6 TCP/IP stack?
  • Was it Windows CE 6 (C++) application?

    On the Windows 7 PC you can use WireShark to trace all incoming and outgoing traffic. But what can you do on the Windows CE side?

    Well, I found out – after all these years – that Windows CE (since 4.x) has the possibility to capture and trace (Ethernet) traffic that passes through the CE NDIS layer. How do you enable this feature? From the Platform Builder Catalog select the “NDIS Packet Capture DLL” feature. This will add the necessary DLL and registry keys to your image.

    To use it at runtime, simply enter the following commands at the Windows CE command prompt.

    > netlogctl.exe cap_size 20000000
    > netlogctl.exe start
    ...
    > netlogctl.exe stop

    By default, two files, netlog0.cap and netlog1.cap, are written alternately to the “\” root folder. The nice thing is that you can open them afterwards with WireShark for analysis.

    For more information:

    http://msdn.microsoft.com/en-us/library/ee493097.aspx

    Posted On Tuesday, July 1, 2014 9:25 PM | Comments (0) | Filed Under [ Windows CE Windows Embedded Compact Microsoft NDIS TCP/IP ]

    Monday, June 23, 2014

    APIC for x86 BSP (how to build it for WCE8)

    As promised I will talk about how to replace the “old” PIC (Programmable Interrupt Controller) with the “new” APIC (Advanced Programmable Interrupt Controller) in a CEPC (x86) BSP. In my explanation I will refer to the “MyBSP” BSP, your clone of the CEPC BSP. As the APIC is mostly only available on (Intel) x86 platforms, this talk is only valid for x86, not ARM.

    The Windows CE Boot to Kernel startup phase

    There are a few good MSDN links that explain quite a bit about the Windows CE startup phase. This link, for instance, explains the Kernel Startup Sequence in Windows CE 5, but most of it (as many older Windows CE articles) is still valid for Windows CE 8. I summarize briefly what we need to know:

    Location                                                      Component      Function
    WINCE800\platform\common\src\x86\common\startup\startup.asm   BSP (common)   Startup()
    WINCE800\private\winceos\COREOS\nk\ldr\x86\x86start.asm       Kernel         KernelStart()
    WINCE800\private\winceos\COREOS\nk\ldr\x86\x86init.c          Kernel         X86Init()
    WINCE800\private\winceos\COREOS\nk\kernel\x86\sysinit.c       Kernel         NKStartup()
                                                                                 DoNKStartup()
    WINCE800\platform\common\src\x86\common\other\debug.c         BSP (common)   OEMInitDebugSerial()
    WINCE800\platform\common\src\x86\common\startup\oeminit.c     BSP (common)   OEMInit()[1]
    WINCE800\platform\MyBSP\src\x86\common\startup\oeminit.c      BSP (MyBSP)    OEMInit()[1]
    WINCE800\private\winceos\COREOS\nk\kernel\nkinit.c            Kernel         KernelInit()[2]
    ...                                                           ...            ...

    KernelInit() [2] is an interesting function to examine. It calls many other functions that completely initialize the kernel. At the end of this function you find

    #ifdef DEBUG
        g_pNKGlobal->pfnWriteDebugString (TEXT("Scheduling the first thread.\r\n"));
    #endif

    When you see the “Scheduling the first thread” debug message in PlatformBuilder’s Output window, you know the kernel has been fully initialized. This is usually a good sign; messing things up in your OEMInit() [1] MyBSP modifications tends to lead to a system reboot or hang (do I hear a sigh?). Setting a PlatformBuilder debug breakpoint before this point doesn’t always work either (reboot or hang); after this point it will work (except for some low-level OEM interrupt code in your MyBSP) and you can start debugging, even the kernel itself. OEMInit() [1], and the functions called from within it, is where we will do most of our APIC modifications. So let’s have a better look at it.

    First of all you need to move this code from the platform\common folder to the platform\MyBSP folder as we will make changes to it. I suppose you know how to do this, or have a look at the source code that accompanies this blog.
    Now, OEMInit() itself does not contain much code, it calls other init functions. The 2 functions we are interested in are OALIntrInit() and x86InitPICs()

    void OEMInit()
    {
        ...
        // initialize interrupts
        OALIntrInit ();
        
        // initialize PIC
        x86InitPICs();
        ...
    }

    Integrating APIC in MyBSP

    You can find an example MyBSP here. It is based on a CEPC BSP that ships with Windows Embedded Compact Edition 2013 Update 5 (2014 release for Visual Studio 2013 compatibility)

    1. Clone CEPC BSP to MyBsp. Create SampleOSDesign based on MyBSP. (You need to restart VS2013 after you have cloned CEPC in order to make VS2013 aware of the new MyBSP). From the Catalog select the features you wish to add to your OSDesign. If you add LAN Networking drivers and ATAPI PCI drivers, you can test immediately if your APIC changes are OK.
    2. Add the ACPICA library to your MyBSP (as we discussed in a previous post)
      [MyBSP]\src\acpica
    3. Copy the files in the following folders from WINCE800\platform\common to WINCE800\platform\MyBSP
      [common -> MyBSP]\src\inc

      [common -> MyBSP]\src\x86
      [common -> MyBSP]\src\x86\COMMON
      [common -> MyBSP]\src\x86\COMMON\intr
      [common -> MyBSP]\src\x86\COMMON\io
      [common -> MyBSP]\src\x86\COMMON\kitl
      [common -> MyBSP]\src\x86\COMMON\startup
      [common -> MyBSP]\src\x86\COMMON\startup\newtable
      [common -> MyBSP]\src\x86\COMMON\timer

      [common -> MyBSP]\src\common
      [common -> MyBSP]\src\common\intr
      [common -> MyBSP]\src\common\intr\base
      [common -> MyBSP]\src\common\intr\base_refcount
      [common -> MyBSP]\src\common\intr\common

      We do this because a Windows CE BSP for x86 consists of a MyBSP folder + a common folder. However, the APIC changes we will make are mainly located in the common folder. As we don’t want to overwrite/modify the common folder, we copy the files we will modify to our own MyBSP folder, where we have all freedom to make changes without touching the original source code. I tried to make the modifications such that you can choose between PIC or APIC compilation by defining BSP_APIC (or not) in the project’s Environment Property page. BSP_APIC will #define APIC in the sources file.
      Most of the changes in these folders use the APIC define to switch between the old PIC and new APIC.
    4. Add the APIC folder, which contains most of the new APIC code
      [common -> MyBSP]\src\common\apic
      This new code will call code located in
      [MyBSP]\src\acpica
      that we added in step 2.
    5. Edit "WINCE800\platform\common\src\soc\x86_ms_v1\inc\pc.h" such that the INTR_xxx defines reflect your (first) APIC IRQ0-IRQ15 situation. Depending on your motherboard, certain “old PIC” IRQs might be mapped differently. This change is unfortunate (as we are modifying a common header file); you can implement a different mechanism to obtain the INTR_xxx defines (I know a few places where these defines are used, but maybe not all), but this was by far the easiest and safest implementation.
    6. Change Platform.reg to tell the Device Resource Manager we have more IRQs available
      [HKEY_LOCAL_MACHINE\Drivers\Resources\IRQ]
      "Identifier"=dword:1
      "Minimum"=dword:1
      "Space"=dword:F
      "Ranges"="1,3-7,9-0xF"
      "Shared"="1,3-7,9-0xF"
      IF BSP_APIC
      "Space"=dword:17
      "Ranges"="1,3-7,9-0x17"
      "Shared"="1,3-7,9-0x17"
      ENDIF BSP_APIC
    7. Set KITL to polling mode. Why? We are changing from PIC to APIC, and if we make a mistake, KITL in interrupt mode will most likely not work. As long as we are debugging, it is best to switch KITL to polling mode.
      Edit “WINCE800\platform\MyBSP\src\x86\common\kitl\kitl_x86.c” function OALKitlStart() as follows:
      switch (g_pX86Info->KitlTransport & ~KTS_PASSIVE_MODE)
      {
          ...
          case KTS_ETHER:
          case KTS_DEFAULT:
              fRet = InitKitlEtherArgs (&kitlArgs);

              // MyBSP start
              // Override to set poll mode
              kitlArgs.flags |= OAL_KITL_FLAGS_POLL;
              g_pX86Info->ucKitlIrq = OAL_INTR_IRQ_UNDEFINED;
              // MyBSP end
              break;
          default:
              break;
      }
    8. Modify OEMInit() in “WINCE800\platform\MyBSP\src\x86\common\startup\oeminit.c”
      void OEMInit()
      {
          // Set up the debug zones according to the fix-up variable initialOALLogZones
          OALLogSetZones(initialOALLogZones);

          OALMSG(OAL_FUNC, (L"+OEMInit\r\n"));

          // initialize interrupts
          OALIntrInit ();

      #ifndef APIC
          // initialize PIC
          x86InitPICs();
      #endif

          // Initialize PCI bus information
          PCIInitBusInfo ();

          // starts KITL (will be a no-op if KITLDLL doesn't exist)
          KITLIoctl (IOCTL_KITL_STARTUP, NULL, 0, NULL, 0, NULL);

      #ifdef DEBUG
          // Instead of calling OEMWriteDebugString directly, call through exported
          // function pointer. This will allow these messages to be seen if debug
          // message output is redirected to Ethernet or the parallel port. Otherwise,
          // lpWriteDebugStringFunc == OEMWriteDebugString.
          NKOutputDebugString (TEXT("CEPC Firmware Init\r\n"));
      #endif

          // sets the global platform manufacturer name and platform name
          g_oalIoCtlPlatformManufacturer = g_pPlatformManufacturer;
          g_oalIoCtlPlatformName = g_pPlatformName;

          OEMPowerManagerInit();

      #ifndef APIC
          // initialize clock
          InitClock();
      #endif

          // initialize memory (detect extra ram, MTRR/PAT etc.)
          x86InitMemory ();

          // Reserve 128kB memory for Watson Dumps
          dwNKDrWatsonSize = 0;
          if (dwOEMDrWatsonSize != DR_WATSON_SIZE_NOT_FIXEDUP)
          {
              dwNKDrWatsonSize = dwOEMDrWatsonSize;
          }

          x86RebootInit();
          x86InitRomChain();

          OALMpInit ();

          g_pOemGlobal->pfnIsProcessorFeaturePresent = x86ProcessorFeaturePresent;

      #ifdef APIC
          // this assembly code disables any old PIC interrupt
          __asm
          {
              mov al, 0xff
              out 0xa1, al
              out 0x21, al
          }
          // start APIC
          x86InitAPICs();

          // initialize clock
          InitClock();
      #endif

      #ifdef DEBUG
          NKOutputDebugString (TEXT("Firmware Init Done.\r\n"));
      #endif

          OALMSG(OAL_FUNC, (L"-OEMInit\r\n"));
      }

    That’s about it. Compile and debug.
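    To see what the "Ranges" and "Shared" strings in the Platform.reg change above actually cover, the following sketch expands such a list into a bitmask. It is only an illustration: the function name is made up, and this is not how the Device Resource Manager itself parses the registry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Expand a range list such as "1,3-7,9-0x17" into a bitmask (bit n = IRQ n).
   Returns 0 on success, -1 on a malformed list or an IRQ above 31. */
static int irq_ranges_to_mask(const char *text, uint32_t *mask)
{
    const char *p = text;

    assert(text != NULL && mask != NULL);
    *mask = 0;
    while (*p != '\0') {
        char *end;
        unsigned long first = strtoul(p, &end, 0);   /* base 0 accepts 0x.. */
        unsigned long last = first;

        if (end == p)
            return -1;
        p = end;
        if (*p == '-') {                             /* "first-last" form */
            ++p;
            last = strtoul(p, &end, 0);
            if (end == p)
                return -1;
            p = end;
        }
        if (first > last || last > 31)
            return -1;
        for (unsigned long irq = first; irq <= last; ++irq)
            *mask |= (uint32_t)1 << irq;
        if (*p == ',')
            ++p;
        else if (*p != '\0')
            return -1;
    }
    return 0;
}
```

    For "1,3-7,9-0xF" this yields 0xFEFA, and for the APIC variant "1,3-7,9-0x17" it yields 0xFFFEFA: the same legacy IRQs plus IRQ 16 to 23. IRQ 0, 2 and 8 stay excluded in both cases.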

    New APIC code

    In step 4 we briefly mentioned that we need to add the new APIC code

    [common -> MyBSP]\src\common\apic

    Let’s see what that means. This folder contains the source file apic.cpp; its function x86InitAPICs() holds the real meat. This function goes through all the steps required to set up the APIC chipset and hook the interrupts to the correct vectors.

    1. x86InitACPICA(). Remember this function from my previous blog? It initializes the ACPICA library, which we need in a moment to query ACPI for APIC information.
    2. ApicGlobalSystemInterruptInfo() initializes a global data structure that holds all APIC chipset information and records which IRQ source is connected to which APIC INT line.
    3. SetApicEnable() searches for APIC chipsets (via the Intel Low Pin Count (LPC) PCI device) and enables them. Over the years Intel devised a few mechanisms to do this; I suggest you read the source code to learn how. Newer chipsets might require adaptations to this code.
    4. ApicIrqRoutingMapInfo() uses the ACPICA library to learn from the _PRT (PCI Routing Table) objects in ACPI how the PCI interrupts are routed to your APIC chipset(s). I only encountered Intel chipsets with a single APIC chip, but motherboards can have multiple APICs. Although the code is prepared for it, I could not test boards with multiple APICs.
    5. pci_IRQ_mapping() ties the PCI interrupt info from the _PRT tables to the real PCI devices on your motherboard. _PRT in ACPI only tells you to which APIC interrupt a PCI device (behind a PCI bridge) is routed *if* you encounter it; you need to scan the PCI bus yourself to find the devices that are actually present.
    6. Last but not least, you need to HookInterrupt() the APIC IRQs to the interrupt vectors. E.g. an APIC with 24 IRQ inputs lets you hook 24 vectors.
    7. x86UninitACPICA(). Cleans up the (memory) resources used by the ACPICA library.

    8. OALIntrRequestIrqs(). This method is called by PciBus.dll when setting up drivers. In the old PIC mode this method interrogates the BIOS (and queries the PCI configuration registers) to find out how the PCI interrupts are mapped to the 8259s. In the new APIC mode this information comes from ACPICA. This method was originally implemented in WINCE800\platform\common\src\x86\common\io\route.c, where I surrounded it with the #ifndef APIC define.
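    The _PRT lookup described in the steps above can be pictured with a small table. This is a simplified sketch, not the code from apic.cpp: the struct, the function name and the sample values are all made up for illustration. A real _PRT entry also carries a Source field, which is left out here on the assumption that the interrupt is hardwired to a global system interrupt (GSI).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One simplified _PRT entry: which GSI a device's INTx pin is wired to. */
typedef struct {
    uint8_t  device;   /* PCI device number on the bus this _PRT belongs to */
    uint8_t  pin;      /* 0 = INTA# ... 3 = INTD# */
    uint32_t gsi;      /* global system interrupt, i.e. the I/O APIC input */
} PrtEntry;

/* Hypothetical sample data, NOT taken from a real board. */
static const PrtEntry sample_prt[] = {
    { 0x1C, 0, 16 },
    { 0x1C, 1, 17 },
    { 0x1D, 0, 23 },
};

/* Returns the GSI for (device, pin), or -1 if the table has no such entry. */
static int32_t prt_lookup(const PrtEntry *prt, size_t count,
                          uint8_t device, uint8_t pin)
{
    assert(prt != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (prt[i].device == device && prt[i].pin == pin)
            return (int32_t)prt[i].gsi;
    }
    return -1;
}
```

    While scanning the PCI bus, each device that is actually present reports its interrupt pin in configuration space; a lookup like this then yields the APIC input to hook for that device.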

    Each individual step might require a more in-depth explanation. You can find many useful comments in the source code, and otherwise I refer you to the Intel datasheets. An interesting source for APIC programming can be found here: http://www.intel.com/content/dam/doc/specification-update/64-architecture-x2apic-specification.pdf
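    As a taste of what the datasheets describe, here is a sketch of how a redirection table entry for a PCI interrupt could be composed. The bit positions follow the Intel I/O APIC (82093AA) register layout; the function names and the vector number used in the example are assumptions for illustration and are not taken from apic.cpp.

```c
#include <assert.h>
#include <stdint.h>

#define IOAPIC_REDTBL_BASE    0x10u        /* register index of entry 0, low dword */
#define IOAPIC_POLARITY_LOW   (1u << 13)   /* interrupt input is active low */
#define IOAPIC_TRIGGER_LEVEL  (1u << 15)   /* level triggered instead of edge */
#define IOAPIC_MASKED         (1u << 16)   /* set to keep the input disabled */

/* Register index of the low dword of redirection entry `input`;
   the high dword lives at the next index. */
static uint32_t ioapic_redtbl_index(uint32_t input)
{
    return IOAPIC_REDTBL_BASE + 2u * input;
}

/* 64-bit redirection entry for a PCI interrupt: fixed delivery, physical
   destination, active low, level triggered, not masked. */
static uint64_t ioapic_pci_entry(uint8_t vector, uint8_t dest_apic_id)
{
    uint64_t entry;

    assert(vector >= 0x10);                  /* vectors 0x00-0x0F are not usable */
    entry  = vector;                         /* bits 0-7; delivery mode 000 = fixed */
    entry |= IOAPIC_POLARITY_LOW | IOAPIC_TRIGGER_LEVEL;
    entry |= (uint64_t)dest_apic_id << 56;   /* destination field in the high dword */
    return entry;
}
```

    For example, routing APIC input 16 to a (made-up) vector 0x40 on the boot processor would mean writing 0x0000A040 to register index 0x30 and 0 to index 0x31. OR-ing in IOAPIC_MASKED keeps an input disabled until its handler has been hooked.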

    Tips and Tricks.
    • How do you debug an interrupt service routine (ISR) without KITL? KITL itself might use interrupts, or might not yet be available early in the boot sequence.
      • Set KITL to polling mode, or
      • Use the OALMSGS macro. This sends your trace message to the (first) serial port without using interrupts, unlike the normal OALMSG, which sends your traces to KITL once it is up and running. Be sure to disable the OALMSGS macros in your release build, as they are pretty slow and will make your ISR unnecessarily slow.
    • To control the OALMSG debug zones, set the default value in config.bib
        nk.exe:initialOALLogZones 00000000 0x0000000B FIXUPVAR
      • See also “WINCE800\platform\common\src\inc\oal_log.h”
    • Some notes on the ATAPI driver. This driver is responsible for controlling the IDE chipsets that drive your hard disks. The ATAPI driver can work in
      • Legacy mode = PIC mode => fixed IRQ 14 and 15
      • Native mode = APIC mode => variable IRQ according to board manufacturer.
    • By default the ATAPI driver is set to Legacy mode and works with fixed IRQ 14 and 15. However you can switch the driver and hardware easily to Native mode via the Registry. Add the following registry keys to the end of Platform.reg
      IF BSP_APIC

      ; @CESYSGEN IF CE_MODULES_ATAPI
      ; @XIPREGION IF PACKAGE_OEMXIPKERNEL
      ; HIVE BOOT SECTION
      IF BSP_NOIDE !

      ; @CESYSGEN IF ATAPI_ATAPI_PCIO
      [$(PCI_BUS_ROOT)\Template\I82371]
      "ProgIF"=dword:8F
      "LegacyIRQ"=- ; The primary legacy IRQ, remove for native mode
      "ConfigEntry"="NativeConfig" ; PCI configuration entry point
      ; @CESYSGEN ENDIF ATAPI_ATAPI_PCIO

      ; @CESYSGEN IF ATAPI_ATAPI_PCIO
      [$(PCI_BUS_ROOT)\Template\GenericIDE]
      "ProgIF"=dword:8F
      "LegacyIRQ"=- ; The primary legacy IRQ, remove for native mode
      "ConfigEntry"="NativeConfig" ; PCI configuration entry point
      ; @CESYSGEN ENDIF ATAPI_ATAPI_PCIO

      ENDIF BSP_NOIDE !
      ; END HIVE BOOT SECTION
      ; @XIPREGION ENDIF PACKAGE_OEMXIPKERNEL
      ; @CESYSGEN ENDIF CE_MODULES_ATAPI

      ENDIF BSP_APIC

      The ATAPI driver has code to switch automatically from legacy mode to native mode when loaded. To see how it works, set a breakpoint in the NativeConfig() method in WINCE800\public\common\oak\drivers\block\atapi\pcicfg.cpp. This piece of code detects legacy mode and, if the right conditions are met, disables the PCI device so that at a later stage of the PciBus enumeration the device is “placed” again, this time in native mode. Hint: also place a breakpoint in the OALIntrRequestIrqs() method in WINCE800\platform\MyBSP\src\x86\common\apic\apic.cpp to see how the APIC IRQ is requested.
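    The "ProgIF" value 8F in the registry keys above is the programming interface byte of a PCI IDE controller. The sketch below decodes it using the bit assignments of the PCI IDE controller specification; the helper names are made up, and this is not the logic of NativeConfig() itself.

```c
#include <assert.h>
#include <stdint.h>

#define IDE_PRIMARY_NATIVE        0x01u  /* primary channel runs in native mode */
#define IDE_PRIMARY_SWITCHABLE    0x02u  /* primary channel mode can be changed */
#define IDE_SECONDARY_NATIVE      0x04u  /* secondary channel runs in native mode */
#define IDE_SECONDARY_SWITCHABLE  0x08u  /* secondary channel mode can be changed */
#define IDE_BUS_MASTER            0x80u  /* controller supports bus master DMA */

/* True when both channels are in native mode (no fixed IRQ 14/15). */
static int ide_is_native(uint8_t prog_if)
{
    uint8_t both = IDE_PRIMARY_NATIVE | IDE_SECONDARY_NATIVE;
    return (prog_if & both) == both;
}

/* True when each channel is either native already or can be switched. */
static int ide_can_go_native(uint8_t prog_if)
{
    int primary_ok =
        (prog_if & (IDE_PRIMARY_NATIVE | IDE_PRIMARY_SWITCHABLE)) != 0;
    int secondary_ok =
        (prog_if & (IDE_SECONDARY_NATIVE | IDE_SECONDARY_SWITCHABLE)) != 0;
    return primary_ok && secondary_ok;
}

/* ProgIF value to aim for when moving a switchable controller to native mode. */
static uint8_t ide_native_prog_if(uint8_t prog_if)
{
    assert(ide_can_go_native(prog_if));
    return (uint8_t)(prog_if | IDE_PRIMARY_NATIVE | IDE_SECONDARY_NATIVE);
}
```

    A controller reporting 0x8A (bus master, both channels legacy but switchable) would become 0x8F in native mode, which matches the value used in the registry keys.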

    Here you can find the MyBsp WCE8 Board Support Package that includes all the source code required to build a working image based on Intel chipsets with APIC. I have tested it successfully with IDE, USB and NIC. Copy the files into WINCE800\platform\MyBsp and create an OSDesign with it. Remember:

    • Mind the OALMSGS() in APIC_ISR() in WINCE800\platform\MyBSP\src\x86\common\apic\apic.cpp.
    • Mind the nk.exe:initialOALLogZones 00000000 0x0000410F FIXUPVAR in WINCE800\platform\MyBSP\FILES\config.bib
    • Mind WINCE800\platform\common\src\soc\x86_ms_v1\inc\pc.h

    I tested most of the code on Intel ICH4, ICH7 and NM10 chipsets with 1 APIC on board. The code might not work for other chipset/motherboard combinations; if so, feel free to let me know.

    Good Luck!


    Posted On Monday, June 23, 2014 8:39 PM | Comments (7) | Filed Under [ Windows CE Windows Embedded Compact Embedded RTOS Microsoft APIC ACPI ACPICA BSP Visual Studio 2013 ]

    Sunday, April 27, 2014

    Installing the Windows Embedded Compact Edition 2013 Update for Visual Studio 2013

    I wanted to install the Windows Embedded Compact Edition 2013 Update that is compatible with Visual Studio 2013. But I discovered that the install process is poorly described. What follows is what I found out (most of it) myself.

      The Windows Embedded Compact Edition 2013 download link itself has been updated with a more recent image named “Windows Compact Edition 2013 Update 5”. So you just need to re-download through the original link where you could find the original image (year 2012).

      I installed Visual Studio 2013.

      I tried to install Windows Embedded Compact Edition 2013 Update 5, but it refused because an older version of Windows Embedded Compact Edition was already installed. Oh no, not again... Nope, no way around it, so I uninstalled the original Windows Compact Edition 2013.

      Next I tried to install Windows Embedded Compact Edition 2013 Update 5 again. This time it seemed to work.

      After some time (I had been busy for half a day by now downloading, installing, uninstalling, reinstalling, ...) everything seemed to be installed, both Visual Studio 2013 and Windows Embedded Compact Edition 2013 Update 5. I launched Visual Studio 2013 with great hopes, only to discover ... that I could not create a new Platform Builder project. It was simply not there. I tried to load an existing one, but it failed. Grrrr, I launched Visual Studio 2012. And ..., no Platform Builder present anymore (it was there before I started the upgrade). WTF!

    Out of options, I decided to start from scratch (the next day) and again after half a day of "clean work" it finally succeeded.

    Here are the scenarios that worked for me. Luckily I always work with Virtual Machines (Oracle VirtualBox), so grabbing a fresh installation of Windows 7 is never a problem and saves time.

    Virtual machine for Windows 7 + Visual Studio 2012 + Windows Embedded Compact Edition 2013 Original.

    • Clean install Windows 7
    • Clean install Visual Studio 2012
    • Install Visual Studio 2012 Update 4
    • Install Application Builder for Visual Studio 2012
    • Install Windows Embedded Compact Edition 2013 Original

    Virtual machine for Windows 7 + Visual Studio 2013 + Windows Embedded Compact Edition 2013 Update 5.

    • Clean install Windows 7
    • Clean install Visual Studio 2013
    • Install Visual Studio 2013 Update 1
    • Install Application Builder for Visual Studio 2013
    • Install Windows Embedded Compact Edition 2013 Update 5

    And I keep both Virtual Machine images separately.

    Useful references:
  • Download the latest Windows Embedded Compact Edition 2013
  • Download Application Builder for Windows Embedded Compact Edition 2013, both for Visual Studio 2012 and 2013

    Posted On Sunday, April 27, 2014 7:45 PM | Comments (2) | Filed Under [ Windows CE Windows Embedded Compact Microsoft Visual Studio 2012 Smart Device application Visual Studio 2013 ]

    Sunday, March 23, 2014

    The ACPICA library (how to build it for WCE8)

    ACPI (Advanced Configuration and Power Interface) is an open industry specification co-developed by Hewlett-Packard, Intel, Microsoft, Phoenix, and Toshiba.

    Download acpica-win-<date>.zip, acpitests-win-<date>.zip as well as acpica_reference_<x>.pdf and aslcompiler_<y>.pdf from http://acpica.org. Also download ACPISpec50.pdf from http://www.acpi.info to complete your documentation.

    Things to know before building the ACPICA library

    The ACPICA v5.0 package consists of the following (relevant) directory structure:

    Generate
        Msvc9 -> AcpiComponents.sln
        Unix
    Libraries
    Source
        Common
        Compiler
        Components
            Debugger
            Disassemble
            Dispatcher
            Events
            Executer
            Hardware
            Namespace
            Parser
            Resources
            Tables
            Utilities
        Include
        Os_specific
            Service_layers
        Tools
    Tests

    You can use the library in a few ways:

    • Build the Components folder into a C library. This contains the real meat. You typically integrate this library in your kernel.
      • Only use the library to query some info from ACPI tables. That is what I did for the IOAPIC programming.
      • The library can also be set up to work directly with ACPI hardware, i.e. the power management functionality inside the Intel chipset. I didn’t do this.
    • Build the Compiler folder if you are interested in the low level ASL language and AML byte code.
    • Build Tools and Common to create a few executables
      • Test functionality. Mainly intended for the ACPICA developers themselves.
      • Realtime querying of ACPI info (integrated in kernel space) from a user space executable
      • Query and debug ACPI tables. Not all board manufacturers write “bug-free” tables. Linux has a way to “overwrite” (replace, redirect) the onboard tables with new “bug-fixed” tables.

    I only use the library to query specific ACPI tables (like RSDT, MADT) for APIC programming.
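    To make "querying the MADT" a bit more concrete, here is a self-contained sketch that walks a raw MADT image and collects the I/O APIC entries. The field offsets follow the ACPI specification; the struct and function names are made up. With ACPICA itself you would normally obtain the table through AcpiGetTable() and use the structures from the ACPICA headers instead of raw offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MADT_ENTRIES_OFFSET 44u  /* 36-byte ACPI header + local APIC address + flags */
#define MADT_TYPE_IOAPIC    1u   /* interrupt controller structure type: I/O APIC */

typedef struct {
    uint8_t  id;        /* I/O APIC ID */
    uint32_t address;   /* physical address of the I/O APIC registers */
    uint32_t gsi_base;  /* first global system interrupt it serves */
} IoApicInfo;

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Collects up to `max` I/O APIC entries from a raw MADT of `size` bytes.
   Returns the number stored, or -1 if the table looks malformed. */
static int madt_find_ioapics(const uint8_t *madt, size_t size,
                             IoApicInfo *out, int max)
{
    size_t off = MADT_ENTRIES_OFFSET;
    int found = 0;

    assert(madt != NULL && out != NULL);
    if (size < MADT_ENTRIES_OFFSET || memcmp(madt, "APIC", 4) != 0)
        return -1;
    while (off + 2 <= size) {
        uint8_t type = madt[off];
        uint8_t len  = madt[off + 1];

        if (len < 2 || off + len > size)
            return -1;
        if (type == MADT_TYPE_IOAPIC && len >= 12 && found < max) {
            out[found].id       = madt[off + 2];
            out[found].address  = read_le32(madt + off + 4);
            out[found].gsi_base = read_le32(madt + off + 8);
            ++found;
        }
        off += len;   /* every entry starts with its type and length */
    }
    return found;
}
```

    On a board with a single I/O APIC this returns one entry, typically with a GSI base of 0.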

    Open AcpiComponents.sln (VS2008) and start a full compile. All projects should build successfully for desktop Windows. See below for details on creating a successful build.

    The solution builds the library and a few executables. We don’t need the executables, but their source code is interesting. Depending on the project (library or executable) you build, you need to set a few compiler DEFINEs. This is a bit annoying: the executable projects (re)compile the library source code itself with different sets of DEFINEs, so depending on the target, specific pieces of code are omitted or embedded. The following DEFINEs (excerpt from the sources file) worked best for me when compiling the library as a standalone C library.

    CDEFINES=$(CDEFINES) -DACPI_LIBRARY -DACPI_USE_SYSTEM_CLIBRARY -DACPI_SINGLE_THREADED -DACPI_DEBUG_OUTPUT -DACPI_DISASSEMBLER -DACPI_DEBUGGER

    For more information on which compiler DEFINEs to use, have a look at Source\Include\acenv.h

    You can choose to build only the library in Source\Components (that is what you need for the WINCE BSP integration), but you can also try to compile the code in the other directories, although you will most likely just copy source code from it and embed it in your own source code.

    As the acpica_reference_7.pdf document explains, you need to “fill in” some OS-specific glue logic to let the library integrate with your OS environment. Luckily most of this work is already done for desktop Windows; with a few changes it works for Windows CE as well.

    Keep in mind that we will use this library in the Windows CE BSP OAL layer in the early boot phase, where we don’t really have a full OS at our disposal: no threads, no mutexes, nothing whatsoever. There is a FullLibC with most of the CRT functionality (e.g. strstr(), memcpy(), memset(), …), but no memory allocation APIs like malloc() and free().

    The ACPICA library uses ANSI strings, while Windows CE uses UNICODE strings. The library has logging functionality like AcpiOsPrintf(const char* format, ...), but you cannot just redirect it to OALMSG() or NKDebugPrintf(), as that would not compile (char <-> wchar_t).

    • To overcome these problems, I added my own os_malloc() and os_free() APIs that work on top of NKCreateStaticMapping() and NKDeleteStaticMapping()
    • I also added my own simple ‘ANSI char’ os_printf(), which I redirect to OALMSG()
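    The char <-> wchar_t mismatch can be bridged by widening the formatted ANSI text before it is handed to a wide-character log function. The following is only a sketch of that idea for plain ASCII text; the function name is made up and this is not the os_printf() from the source package.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Copies an ASCII string into a wide-character buffer of `dst_count`
   elements, truncating if needed and always terminating the result.
   Returns the number of characters written, excluding the terminator. */
static size_t widen_ascii(const char *src, wchar_t *dst, size_t dst_count)
{
    size_t i = 0;

    assert(src != NULL && dst != NULL);
    if (dst_count == 0)
        return 0;
    while (src[i] != '\0' && i + 1 < dst_count) {
        dst[i] = (wchar_t)(unsigned char)src[i];
        ++i;
    }
    dst[i] = L'\0';
    return i;
}
```

    The widened buffer can then be passed as a string argument to a wide-character log call. A fixed-size buffer on the stack is enough here, which fits an early boot environment without a heap.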

    How did I find the missing APIs? Simply by compiling the library and solving the linker errors one by one.

    I succeeded in NOT making any code changes in the Common, Compiler, Components and Tools folders to build the full package. This was my goal, as it should be possible to copy in a newer version of the library at all times (provided the organization structure of the library doesn’t change). An update of the package is released at regular intervals. I did all my work on the January 2014 release (ACPICA v5.0).

    Steps to build the ACPICA library for Windows CE 8:

    1. Download the ACPI 5.0 package from https://acpica.org/downloads/windows-source. Download both the “Windows Format Source Code and Build Environment” and the “Windows Format Test Suite” package. Unzip both packages into the same folder and copy only the “source” folder into the “platform\<Your BSP>\src\acpica” folder. This will give you the following directory structure:

    C:\WINCE800\platform\<Your BSP>\src\acpica\
        Common
        Compiler
        Components
            Debugger
            Disassemble
            Dispatcher
            Events
            Executer
            Hardware
            Namespace
            Parser
            Resources
            Tables
            Utilities
        Include
        Os_specific
            Service_layers
        Tools

    1. Download my Windows CE BSP source file package for ACPICA and copy them in the same directory structure. It will add the dirs and sources files and some extra source files in the os_specific\service_layers folder
    2. Download the Flex and Bison tools and install them on your PC. The instructions are listed on the https://acpica.org/downloads/windows-source download page. You need them in the next step.
    3. Open Msvc9\AcpiComponents.sln and build the complete solution. You will get many “warning C4001: nonstandard extension 'single line comment' was used” compiler errors, but you can ignore them.
    4. If all projects from the solution were build correctly, copy the generated Yacc and Lex files from “generate\msvc9\AslCompiler<Debug>|<Release>” to “platform\<Your BSP>\src\acpica\compiler”

    AslCompiler.y.h
    AslCompilerDebug.l.c
    AslCompilerDebug.y.c
    AslCompilerDebug.y.h
    DtParser.y.h
    DtParserDebug.l.c
    DtParserDebug.y.c
    DtParserDebug.y.h
    PrParser.y.h
    PrParserDebug.l.c
    PrParserDebug.y.c
    PrParserDebug.y.h

    This step is only required if you plan to use the ACPI compiler and want to compile its sources. I didn’t use it, although I do compile it for completeness. Alternatively you can always regenerate them in your build process. I didn’t do that as these files don’t change for a particular version of the ACPICA library and it would complicate the build process. I do regenerate them (once) from the solution when I download a new version of the ACPICA library and copy them over manually.
    1. Rename file "include\acpi.h" to "include\acpica.h". This is to avoid interference with the existing "acpi.h" file that exists already in your CEPC based BSP include folders
    2. Find and replace #include “acpi.h” into #include “acpica.h” (nearly 200 replacements)

    The following steps are part of my Windows CE BSP source file package for ACPICA mentioned in step 2. If unpatient, go to step 14.

    1. Create sources files to compile the sub-libraries per folder.
    2. Setup os_specific layer
      1. Copy (rename) file “oswinxf.c” to “oswincexf.c”. This is the ACPICA os adaptation layer
      2. Add file “oswincextr.c”. This will contain missing API’s like malloc(), free(), printf(), …
      3. Add files “tools.c”, “common.c” and “compiler.c”. These files hold source code that I copied from the non-library parts of the package (Tools, Compiler, Common folders) and that is handy to have as well.
    3. Add missing os_specific API’s
      1. ACPICA Library (mandatory, Components folder)
      2. Executables (optional, Common, Compiler, Tools folder)
    4. Add os_malloc() and os_free()
    5. Add os_printf()
    6. Add acpica_itf.c. This contains my personal code to extract APIC related stuff from the ACPICA library. More on this in the next blog.
    7. Build

    Things to know when you examine the os_specific\service_layers folder

    Many API functions that are needed for compilation and linking were added, but I left most of them unimplemented, with only a log message. Only when I ran into a problem at runtime did I implement them further.

    C malloc() and free() are redirected to os_malloc() and os_free(). These functions allocate and free memory from a statically allocated 256K memory buffer.
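
To give an idea of what an allocator on top of a static buffer can look like, here is a minimal first-fit sketch. It is not the code from my package: the names sketch_malloc(), sketch_free(), block_hdr and POOL_SIZE are made up for this illustration, and the 8-byte alignment and the coalescing strategy are assumptions. It is also not thread safe.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE (256u * 1024u)   /* 256K, the size mentioned in the text */
#define ALIGNMENT 8u               /* assumed alignment of returned pointers */

/* Header in front of every block, free or used (hypothetical layout). */
typedef struct block_hdr {
    size_t size;                   /* payload size, multiple of ALIGNMENT */
    int    used;                   /* 1 = allocated, 0 = free */
} block_hdr;

/* The union forces 8-byte alignment of the static pool. */
static union { uint64_t align; unsigned char bytes[POOL_SIZE]; } pool;
static int pool_ready = 0;

static block_hdr *first_block(void) { return (block_hdr *)pool.bytes; }

/* Next header in address order, or NULL at the end of the pool. */
static block_hdr *next_block(block_hdr *b)
{
    unsigned char *p = (unsigned char *)(b + 1) + b->size;
    return (p < pool.bytes + POOL_SIZE) ? (block_hdr *)p : NULL;
}

void *sketch_malloc(size_t n)
{
    block_hdr *b;

    if (!pool_ready) {             /* first call: one big free block */
        first_block()->size = POOL_SIZE - sizeof(block_hdr);
        first_block()->used = 0;
        pool_ready = 1;
    }
    if (n == 0 || n > POOL_SIZE) return NULL;
    n = (n + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);   /* round up */

    for (b = first_block(); b != NULL; b = next_block(b)) {
        if (b->used || b->size < n) continue;
        /* Split when the remainder can hold a header plus a minimal payload. */
        if (b->size >= n + sizeof(block_hdr) + ALIGNMENT) {
            block_hdr *rest = (block_hdr *)((unsigned char *)(b + 1) + n);
            rest->size = b->size - n - sizeof(block_hdr);
            rest->used = 0;
            b->size = n;
        }
        b->used = 1;
        return b + 1;              /* payload starts right after the header */
    }
    return NULL;                   /* pool exhausted */
}

void sketch_free(void *p)
{
    block_hdr *b;

    if (p == NULL) return;
    ((block_hdr *)p - 1)->used = 0;

    /* Merge neighbouring free blocks so large requests can succeed again. */
    for (b = first_block(); b != NULL; ) {
        block_hdr *nx = next_block(b);
        if (!b->used && nx != NULL && !nx->used)
            b->size += sizeof(block_hdr) + nx->size;
        else
            b = nx;
    }
}
```

A fixed pool like this keeps early OAL code independent of any OS heap, at the price of a hard upper limit on what the library can allocate.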

    AcpiOsPrintf() is the log function to which all ACPICA logging is redirected. It calls vfprintf() (see oswincexf.c). I provided an implementation of vfprintf() that redirects to my AcpiOs_vsprintf() function (see osprintf.c). This printf()-like function is a minimalistic implementation of the standard C printf(): it supports only the format specifiers and arguments that the ACPICA library actually uses.
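
As an illustration of how small such a formatter can be, here is a sketch of a bounded mini formatter. It is not my AcpiOs_vsprintf(): the names mini_vsnprintf(), mini_snprintf(), put_ch() and put_uint() are hypothetical, and the supported set (%d, %u, %x, %X, %s, %c, %%, without width or precision) is an assumption, not the exact set ACPICA needs.

```c
#include <stdarg.h>
#include <stddef.h>

/* Store one character if there is room (keeping space for the NUL);
 * always count it, so the caller learns the untruncated length. */
static void put_ch(char *buf, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap) buf[*len] = c;
    (*len)++;
}

/* Emit an unsigned value in the given base (10 or 16). */
static void put_uint(char *buf, size_t cap, size_t *len,
                     unsigned long v, unsigned base, int upper)
{
    const char *digits = upper ? "0123456789ABCDEF" : "0123456789abcdef";
    char tmp[sizeof(unsigned long) * 8];
    int i = 0;

    do { tmp[i++] = digits[v % base]; v /= base; } while (v != 0);
    while (i > 0) put_ch(buf, cap, len, tmp[--i]);   /* reverse order */
}

int mini_vsnprintf(char *buf, size_t cap, const char *fmt, va_list ap)
{
    size_t len = 0;

    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') { put_ch(buf, cap, &len, *fmt); continue; }
        fmt++;                               /* conversion character */
        if (*fmt == '\0') {                  /* lone '%' at the very end */
            put_ch(buf, cap, &len, '%');
            break;
        }
        switch (*fmt) {
        case 'd': {
            int v = va_arg(ap, int);
            unsigned long u = (unsigned long)v;
            if (v < 0) { put_ch(buf, cap, &len, '-'); u = 0UL - u; }
            put_uint(buf, cap, &len, u, 10, 0);
            break;
        }
        case 'u': put_uint(buf, cap, &len, va_arg(ap, unsigned), 10, 0); break;
        case 'x': put_uint(buf, cap, &len, va_arg(ap, unsigned), 16, 0); break;
        case 'X': put_uint(buf, cap, &len, va_arg(ap, unsigned), 16, 1); break;
        case 's': {
            const char *s = va_arg(ap, const char *);
            if (s == NULL) s = "(null)";
            while (*s != '\0') put_ch(buf, cap, &len, *s++);
            break;
        }
        case 'c': put_ch(buf, cap, &len, (char)va_arg(ap, int)); break;
        case '%': put_ch(buf, cap, &len, '%'); break;
        default:                             /* unknown: copy it verbatim */
            put_ch(buf, cap, &len, '%');
            put_ch(buf, cap, &len, *fmt);
            break;
        }
    }
    if (cap > 0) buf[len < cap ? len : cap - 1] = '\0';
    return (int)len;                         /* length without truncation */
}

/* Variadic convenience wrapper around mini_vsnprintf(). */
int mini_snprintf(char *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = mini_vsnprintf(buf, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

A call such as mini_snprintf(line, sizeof line, "bus %d dev 0x%X", bus, dev) then fills a buffer that can be handed to whatever debug output the OAL offers.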

    The ACPICA library defines its global variables via the DEFINE_ACPI_GLOBALS and ACPI_INIT_GLOBALS() macros. If you are not careful, you end up defining the global variables more than once. To avoid linker problems, I define the global variables myself instead.
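
The idiom behind this is a common one in C: a shared header declares every global through a macro, and exactly one source file defines a switch before including that header, so that the macro emits storage there and only an extern declaration everywhere else. The sketch below shows the idea in a single file with made-up names (DEFINE_DEMO_GLOBALS, DEMO_INIT_GLOBAL, DemoDebugLevel, DemoTableCount); it is not a copy of the ACPICA headers.

```c
/* In a real tree this #define lives in the ONE .c file that owns the
 * globals. Defining it in a second .c file gives duplicate definitions,
 * which the linker reports as multiply defined symbols. */
#define DEFINE_DEMO_GLOBALS

/* --- contents of a hypothetical shared header, demo_globals.h --- */
#ifdef DEFINE_DEMO_GLOBALS
/* Owning file: the macro expands to a definition with an initializer. */
#define DEMO_INIT_GLOBAL(type, name, value) type name = value
#else
/* Every other file: the macro expands to a plain declaration. */
#define DEMO_INIT_GLOBAL(type, name, value) extern type name
#endif

DEMO_INIT_GLOBAL(int, DemoDebugLevel, 3);
DEMO_INIT_GLOBAL(unsigned, DemoTableCount, 0);
/* --- end of header --- */

/* Any code that includes the header can use the globals. */
int demo_debug_enabled(void)
{
    return DemoDebugLevel > 0;
}
```

Defining the variables by hand in one file, as I did, sidesteps the question of which source file is supposed to set the switch.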

    I had to set WARNISERROR=0 in the os_specific\service_layers sources file to get past a linkage problem I could not fix otherwise (e.g. warning C4273: 'xxx' : inconsistent dll linkage). This is unfortunate and annoying, because compiler warnings are no longer flagged as errors. So keep an eye on the output window when you (re)compile this folder. There will always be 4 warnings (not treated as errors), and there should ONLY be those 4. Any other warning must be treated as an error!

    • warning C4273: 'GetTickCount' : inconsistent dll linkage 
    • warning C4273: 'Sleep' : inconsistent dll linkage
    • warning C4273: 'IsBadReadPtr' : inconsistent dll linkage 
    • warning C4273: 'IsBadWritePtr' : inconsistent dll linkage

    The reason is that the prototypes sneak in via #include <windows.h>, while I had to implement these functions myself (as empty stubs). The prototypes are decorated with __declspec(dllimport), which cannot be used in early OAL code (of course).

    The next step

    There is much more to tell and learn about this library. The best way is to read some of its source code and use it. A great help in understanding ACPI and related topics can be found at OsDev.

    So much for ACPICA integration in a Windows CE CEPC based BSP. The next blog will deal with APIC programming and how to use ACPI for that purpose.


    Posted On Sunday, March 23, 2014 8:26 PM | Comments (3) | Filed Under [ Windows CE Windows Embedded Compact Embedded APIC ACPI ACPICA BSP ]
