Previously I posted instructions for finding the source of a data abort; see Windows CE: Finding the cause of a Data Abort.  This post walks through those steps to find the source in a real application.  This is specific to Windows CE and later.

I have this data abort:

AKY=00000005 PC=02c138ac(lan91c111.dll+0x000038ac) RA=02c138a8(lan91c111.dll+0x000038a8) BVA=06000000 FSR=00000007

From this, I can see that it is in lan91c111.dll.  Lan91c111.dll is my Ethernet driver.  I was just making changes to it, so I could go back and review my changes for hints.  But let's find it from the Data Abort output.
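If you would rather pull the fields out of that line with code than by eye, here is a minimal sketch in C.  It assumes the line always has exactly the layout shown above; the type DataAbortInfo and the function parse_data_abort are names I made up for this example, not part of Windows CE.

```c
#include <assert.h>
#include <stdio.h>

/* Fields pulled from one data abort line (hypothetical helper type). */
typedef struct {
    char module[64];         /* module name, e.g. "lan91c111.dll" */
    unsigned int pc_offset;  /* offset of the PC within the module */
    unsigned int ra_offset;  /* offset of the RA within the module */
    unsigned int bva;        /* BVA field from the line */
    unsigned int fsr;        /* FSR field from the line */
} DataAbortInfo;

/* Returns 1 if the line matched the layout shown above, 0 otherwise.
 * Assumption: the layout is fixed; other layouts are simply rejected. */
int parse_data_abort(const char *line, DataAbortInfo *info)
{
    unsigned int aky, pc, ra;
    char ra_module[64];
    int fields;

    assert(line != NULL && info != NULL);

    /* %63[^+] reads the module name up to the '+' that precedes the offset. */
    fields = sscanf(line,
        "AKY=%x PC=%x(%63[^+]+0x%x) RA=%x(%63[^+]+0x%x) BVA=%x FSR=%x",
        &aky, &pc, info->module, &info->pc_offset,
        &ra, ra_module, &info->ra_offset,
        &info->bva, &info->fsr);

    /* All nine conversions must succeed for the line to count as parsed. */
    return fields == 9;
}
```

Passing the data abort line above fills in the module name lan91c111.dll, a PC offset of 0x38ac and an RA offset of 0x38a8, which are the numbers used in the steps that follow.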

  1. We can also see that the Return Address (RA) is at offset 0x000038a8 in lan91c111.dll
  2. Subtract 0x1000 to find the Module Offset (MO) of 0x000028a8
  3. We can now look up the Module Offset in lan91c111.map.  Here is a small section of lan91c111.map:

     Address         Publics by Value              Rva+Base     Lib:Object
     0001:00002780       READ_ETH_USHORT_32BIT_MODE 10003780 f   LAN91C111_Init.obj
     0001:00002798       WRITE_ETH_USHORT_32BIT_MODE 10003798 f   LAN91C111_Init.obj
     0001:000027cc       WRITE_ETH_CUSTOMIZE_USHORT 100037cc f   LAN91C111_Init.obj
     0001:00002820       LAN91C_Write16             10003820 f   LAN91C111_Init.obj
     0001:00002884       LAN91C111_MiniportInitialize 10003884 f   LAN91C111_Init.obj
     0001:00002fc4       LAN91C111_MiniportISR      10003fc4 f   LAN91C111_Intr.obj

    Looking at the addresses, we find that the MO is between 00002884 LAN91C111_MiniportInitialize and 00002fc4 LAN91C111_MiniportISR.  That tells us that the Data Abort occurred in LAN91C111_MiniportInitialize.  To calculate the Instruction Offset (IO), subtract the Function Offset (FO) from the Module Offset:  0x000028a8 - 0x00002884 = 0x24.  Starting from the PC (0x000038ac) instead gives 0x000028ac - 0x00002884 = 0x28, which is the instruction examined in step 4.
  4. The file that contains LAN91C111_MiniportInitialize is LAN91C111_Init.c, which we know from the name of the object file that the function is in (LAN91C111_Init.obj).  What we need, though, is the COD file, which contains the C code as comments mixed with the assembly code that was generated when the file was compiled.  The COD files are in the same folder as the OBJ files.  If you don't have the COD files, set WINCECOD=1 and rebuild.

    Looking in the COD file, find the function; in this case LAN91C111_MiniportInitialize.  This is what mine looks like:

    00000    AREA  |.rdata| { |??_C@_1BC@ECFINIDN@?$AA?$CK?$AAp?$AAt?$AAr?$AA?5?$AA?$CF?$AAX?$AA?6?$AA?$AA@| }, DATA, READONLY, SELECTION=2 ; comdat any
    |??_C@_1BC@ECFINIDN@?$AA?$CK?$AAp?$AAt?$AAr?$AA?5?$AA?$CF?$AAX?$AA?6?$AA?$AA@| DCB "*"
     DCB 0x0, "p", 0x0, "t", 0x0, "r", 0x0, " ", 0x0, "%", 0x0, "X"
     DCB 0x0, 0xa, 0x0, 0x0, 0x0   ; `string'
    ; Function compile flags: /Ogsy
      00000    AREA  |.text| { |LAN91C111_MiniportInitialize| }, CODE, ARM, SELECTION=1 ; comdat noduplicate
      00000   |LAN91C111_MiniportInitialize| PROC
    ; 144  : {
      00000   |$L48878|
      00000 e92d47f0  stmdb       sp!, {r4 - r10, lr}
      00004 e24dd054  sub         sp, sp, #0x54
      00008   |$M48876|
      00008 e1a06003  mov         r6, r3
      0000c e1a04002  mov         r4, r2
      00010 e1a07001  mov         r7, r1
    ; 145  :     NDIS_STATUS         Status = NDIS_STATUS_SUCCESS;
    ; 146  :     UINT                ArrayIndex;
    ; 147  :     PMINIPORT_ADAPTER   Adapter;
    ; 148  :     USHORT              temp;
    ; 149  :
    ; 150  :     LPVOID              lpIOBase;
    ; 151  :     BOOL                RetVal;
    ; 152  :  DWORD    *ptr = NULL;
    ; 153  :     WCHAR szFunctionName[]  = L"LAN91C111_MiniportInitialize()";
      00014 e59f1720  ldr         r1, [pc, #0x720]
      00018 e28d0014  add         r0, sp, #0x14
      0001c e3a0203e  mov         r2, #0x3E
      00020 eb000000  bl          memcpy
    ; 154  :
    ; 155  :
    ; 156  :  RETAILMSG( 1, (TEXT("*ptr %X\n"), *ptr ));
      00024 e3a03000  mov         r3, #0
      00028 e5931000  ldr         r1, [r3]
      0002c e59f0704  ldr         r0, [pc, #0x704]
      00030 eb000000  bl          NKDbgPrintfW

    The numbers on the left of the assembly code are offsets from the start of the function (the Instruction Offsets), and we can see that at offset 0x28 we have:

      00028 e5931000  ldr         r1, [r3]
     
    This loads from the address held in r3, which the instruction before it set to 0.  We can see that it is the *ptr in the C code above it:

    ; 156  :  RETAILMSG( 1, (TEXT("*ptr %X\n"), *ptr ));

Now the hard part: why is dereferencing the pointer a problem?  In this case, it is because ptr is NULL, but you may need to get out a debugger to find the cause.  At least we now know where the problem is.

 In some cases, you may need to start with the Program Counter (PC) instead of the Return Address (RA) to find the source of the problem.
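The arithmetic in steps 1 to 3 can be sketched in C as follows.  This is only an illustration: MapEntry, module_offset and find_function are names I made up, the rows have to be copied from the map file by hand, and the 0x1000 is an assumption that holds for this map file (where 0001:00002884 appears as Rva+Base 10003884) but should be checked for your own module.

```c
#include <assert.h>
#include <stddef.h>

/* One "Publics by Value" row from the map file (hypothetical helper type). */
typedef struct {
    unsigned int offset;  /* offset within section 0001 */
    const char *name;     /* public symbol name */
} MapEntry;

/* Assumption: the code section starts 0x1000 into the module. */
#define CODE_SECTION_RVA 0x1000u

/* Step 2: turn the offset printed in the data abort into a Module Offset. */
unsigned int module_offset(unsigned int abort_offset)
{
    return abort_offset - CODE_SECTION_RVA;
}

/* Step 3: return the last entry whose offset is <= mo, or NULL if mo is
 * below the first entry.  The entries must be sorted by ascending offset,
 * which is the order the map file lists them in. */
const MapEntry *find_function(const MapEntry *entries, size_t count,
                              unsigned int mo)
{
    const MapEntry *match = NULL;
    size_t i;

    assert(entries != NULL);

    for (i = 0; i < count; i++) {
        if (entries[i].offset > mo)
            break;              /* passed the function that contains mo */
        match = &entries[i];
    }
    return match;
}
```

With the rows from lan91c111.map above, module_offset(0x38a8) is 0x28a8, find_function returns the LAN91C111_MiniportInitialize row, and subtracting its offset leaves 0x24.  Starting from the PC offset 0x38ac leaves 0x28.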

 Update 10 June 2008

Sometimes the source line shown above the assembly instruction is not really the source of the problem.  This can be because the compiler reorders instructions to suit the CPU instruction pipeline.  In the following real problem that I just had, the Instruction Offset of the data abort was 0x1C:

; 1047 :  BOOL bRet = TRUE; // This will be set to FALSE by an unsuccesful IOCTL call
; 1048 :  *lpBytesReturned= 0; // Make sure this is initially zero.

  00010 e59d7020  ldr         r7, [sp, #0x20]

; 1049 :
; 1050 :  RETAILMSG( 1, (TEXT("XXX_IOControl code\n")));

  00014 e59f0b00  ldr         r0, [pc, #0xB00]
  00018 e3a03000  mov         r3, #0
  0001c e5873000  str         r3, [r7]
  00020 e3a04001  mov         r4, #1
  00024 eb000000  bl          NKDbgPrintfW

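But in this case the actual problem was up a few lines, at source line 1048: the pointer lpBytesReturned is loaded into r7 at offset 0x10, and the str at 0x1C is the store through that pointer, even though it is listed under line 1050.  The application developer had passed in the value of, rather than the pointer to, the data.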
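To make the caller's mistake concrete, here is a small sketch in portable C.  The function sample_io_control is a hypothetical stand-in for a driver's XXX_IOControl that models only the bytes-returned parameter; it is not the code from my driver.  Note that the NULL check only catches a null pointer.  It would not catch an arbitrary value passed where a pointer was expected, which is what happened in the real case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for XXX_IOControl; only the output-size
 * parameter is modeled.  Returns 1 on success, 0 on a rejected call. */
int sample_io_control(uint32_t *bytes_returned)
{
    if (bytes_returned == NULL)
        return 0;            /* reject a null pointer instead of faulting */

    *bytes_returned = 0;     /* same kind of store as source line 1048 */
    return 1;
}

/* Correct caller: passes the address of the variable.
 * The buggy caller passed the variable's value where the pointer belongs,
 * for example sample_io_control((uint32_t *)bytes_returned). */
int call_correctly(void)
{
    uint32_t bytes_returned = 0xFFFFFFFFu;
    int ok = sample_io_control(&bytes_returned);

    return ok && bytes_returned == 0;
}
```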

Copyright © 2008 – Bruce Eitman
All Rights Reserved