Just recently I bumped into a very nasty bug that I had been unfortunate enough to conjure. Alignment of memory has never been my primary concern when working on the PC. As a typical C++ programmer you often don’t have to think about such things. On the PC this is usually “almost never” (when not optimizing, that is) and in a managed environment this truly should become “never”. On ARM, however, “never” becomes “almost never” again.
Having your memory aligned means storing values of different sizes at addresses that are multiples of a certain number. The typical CPU gives you bonus when you have your memory properly aligned but does not kick you when it’s not. ARM, on the other hand, is not that nice.
Issues arise when you try to write a memory block at a random location by interpreting that location has a value of a given type, as is done in the following example.
void *pMyBuffer = malloc( sizeof( int ) * 2 );
*(int *)( (char *)pMyBuffer + 2 ) = 10;
Malloc returns memory aligned at the worst case boundary so writing at the beginning of the memory block would be ok. Writing an int at a two byte offset means you’re writing at a memory address that’s not a multiple of four. Optimized code on ARM does not like this.
The easiest way around this is writing byte by byte when absolutely necessary. When compiling with GCC, it appears that disabling compiler optimizations might also get you around this issue. I think we can all agree that that’s not the best solution to this problem, however.
The problem I recently ran into was related to this issue but it occurred in a managed environment! I was running Mono on Android and was using a Vector3 structure for which I had explicitly specified the memory alignment by telling the run-time to pack the structure at byte boundaries. The issue arose when I was accessing and instance of the structure embedded in another class.
I was initially puzzled, since I had tested the structure with another class and all had worked fine. After rigorous testing, and fixing the issue by specifying the structure at a different location within the class, I finally figured that it must be a memory alignment issue.
What I find somewhat interesting is that I hadn’t specified the packing of the containing class and thus the run-time should still have been able to align my memory correctly in the containing class while still respecting the layout of my Vector3 instance. This is actually something that led me into believing that there is a bug in Mono. Feel free to comment on this post if anyone has any insight on this.
The lesson that we all should learn from this? Sometimes it helps to know what happens under the hood, even in a managed environment.