Recently I was working on a utility that reads and rewrites the complete binary image of a drive (HDD) under Windows CE 8 (Windows Embedded Compact 2013).
We use this program to restore a complete image to a (bootable) drive as part of the software upgrade of our device/application.
At some point while writing the image to the drive, the program itself crashed with an access violation (AV) while reading a static const array of integers.
Debugging with Visual Studio 2012 showed that, at that point, half of the memory occupied by the static const array was not readable; the split fell on a 4 KB page boundary.
At first this seemed very odd, because when I isolated the code and debugged it in a separate test program, nothing was wrong.
What was going on?
Remember what I said at the beginning: "At some point while writing the image...".
Well, in order to write (or read) the drive as raw data, bypassing any filesystem, the program first had to unmount the filesystem.
After that, the drive can be safely overwritten with new binary data.
However, the program itself initially resided on the drive that was going to be overwritten with a complete new image (potentially even with a new filesystem).
In itself this is not a problem, I thought. I assumed that the OS (WCE8 in this case) would read the complete executable into memory before it started the main() code (it is a C++ program).
Enter the "On Demand Paging" feature of Windows CE. To conserve memory, the OS can page out blocks of memory that are not currently needed and read them back in at a later stage.
Likewise, instead of reading the complete executable image into memory before executing it, the OS initially loads only part of the executable code and static data.
If the executable at some point jumps to a code address or accesses statics/constants (defined in a data section such as .data or .rdata) that are not yet loaded into memory, the underlying OS takes a page fault exception and handles it appropriately, i.e. it reads a new page (4 KB by default) from disk into memory.
However, if the drive is no longer accessible (because the program itself acted as Grim Reaper on the drive), the OS cannot load the missing executable region containing the static const array into memory.
This happened completely invisibly to the executable and the Visual Studio debugger, probably because the OS treats a page fault exception as a "normal" situation and handles it gracefully.
The end result was that some code and constant regions could no longer be read with valid data, causing the AV in my program.
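To make the failure mode concrete, here is a toy model of it in portable C++. It is only a sketch: `BackingStore`, `DemandPagedImage` and `kPageSize` are names invented for this illustration and have nothing to do with the actual Windows CE loader. It only mimics the behavior described above: pages are loaded on first access, and a first access after the "drive" is gone fails.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <vector>

// Toy model only: none of these names are Windows CE APIs.
constexpr std::size_t kPageSize = 4096;  // page size mentioned in the text

// Stands in for the executable file on the drive.
struct BackingStore {
    std::vector<std::uint8_t> bytes;
    bool mounted = true;  // set to false once the volume is unmounted
};

class DemandPagedImage {
public:
    explicit DemandPagedImage(const BackingStore& store) : store_(store) {}

    // Read one byte; the first access to a page "faults" and loads it.
    std::uint8_t read(std::size_t offset) {
        assert(offset < store_.bytes.size() && "offset beyond image");
        const std::size_t page = offset / kPageSize;
        auto it = resident_.find(page);
        if (it == resident_.end()) {
            it = resident_.emplace(page, load_page(page)).first;
        }
        return it->second[offset % kPageSize];
    }

    std::size_t resident_pages() const { return resident_.size(); }

private:
    std::vector<std::uint8_t> load_page(std::size_t page) const {
        if (!store_.mounted) {
            // In the real system this surfaced as an access violation.
            throw std::runtime_error("page not resident and drive is gone");
        }
        const std::size_t begin = page * kPageSize;
        const std::size_t end =
            std::min(begin + kPageSize, store_.bytes.size());
        return std::vector<std::uint8_t>(store_.bytes.begin() + begin,
                                         store_.bytes.begin() + end);
    }

    const BackingStore& store_;
    std::map<std::size_t, std::vector<std::uint8_t>> resident_;
};
```

In this model, reading an offset in a page that was touched before `mounted` is cleared keeps working, while the first read of any other page throws. That matches the half-readable array: the part on an already loaded page was fine, the part on the next page was not.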
What can you do to solve this?
Make sure all code has been "touched" or run before you actually unmount the drive, i.e. make sure all code and constants are paged into memory.
ROMFLAGS in config.bib allows you to disable On Demand Paging OS-wide. MSDN states that this gives better real-time performance. You should read this as:
if a program is executing real-time code (code that needs to run deterministically), disabling On Demand Paging avoids unpredictable delays from reading code into memory just before it executes.
Experiment with the Pageable (P) attribute of the C++ linker option /SECTION; a leading "!" negates an attribute, so "!P" marks a section as not pageable.
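For the first option, touching constant data before the unmount could look roughly like the sketch below. `touch_pages` is a hypothetical helper, not a Windows CE API, and the 4096-byte default is simply the page size mentioned above. It only covers data you can point at; code pages would still have to be executed (or otherwise read) beforehand, and it cannot stop the OS from paging something out again later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not a Windows CE API.
// Reads one byte from every page that overlaps [data, data + size) so the
// OS has to bring each of those pages into memory. Returns the number of
// pages touched. page_size defaults to the 4 KB mentioned in the text.
inline std::size_t touch_pages(const void* data, std::size_t size,
                               std::size_t page_size = 4096) {
    assert(page_size != 0);
    // volatile keeps the compiler from optimizing the reads away.
    const volatile unsigned char* bytes =
        static_cast<const volatile unsigned char*>(data);
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(data);

    std::size_t touched = 0;
    std::size_t i = 0;
    while (i < size) {
        const unsigned char value = bytes[i];  // forces the page in
        (void)value;
        ++touched;
        // Advance to the first byte of the next page.
        i += page_size - (base + i) % page_size;
    }
    return touched;
}
```

A call such as `touch_pages(kTable, sizeof kTable)` (with `kTable` standing for your own static const array) would go right before the code that unmounts the filesystem.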
I disabled On Demand Paging system-wide, as our applications favor real-time performance, by setting the 0x0001 bit of ROMFLAGS in config.bib:
; ROMFLAGS is a bitmask of options for the kernel
; ROMFLAGS 0x0001 Disallow Paging
; ROMFLAGS 0x0010 Trust Module only
; We disable Demand Paging system wide as this will improve realtime performance
ROMFLAGS=11 ; was 10
ROMFLAGS=01 ; was 00
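The arithmetic behind those two values can be checked with a few lines of C++. This is only an illustration of the bitmask: the constant and function names are made up here, and it assumes the ROMFLAGS values are hexadecimal, which is what the 0x0001/0x0010 comments suggest.

```cpp
#include <cstdint>

// Hypothetical names for the two bits listed in the config.bib comments.
constexpr std::uint32_t kDisallowPaging  = 0x0001;
constexpr std::uint32_t kTrustModuleOnly = 0x0010;

constexpr bool paging_disabled(std::uint32_t romflags) {
    return (romflags & kDisallowPaging) != 0;
}

constexpr bool trust_module_only(std::uint32_t romflags) {
    return (romflags & kTrustModuleOnly) != 0;
}

// Sets the "Disallow Paging" bit and leaves every other bit unchanged.
constexpr std::uint32_t with_paging_disabled(std::uint32_t romflags) {
    return romflags | kDisallowPaging;
}

// Assuming hexadecimal values: 10 becomes 11, and 00 becomes 01.
static_assert(with_paging_disabled(0x10) == 0x11, "trust bit is preserved");
static_assert(with_paging_disabled(0x00) == 0x01, "only paging bit is set");
```

So in both variants only the paging bit changes; whether "Trust Module only" is set stays as it was.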