Alois Kraus

blog



What is wrong with this code?

List<Bitmap> bitmaps = new List<Bitmap>();
private void button1_Click(object sender, EventArgs e)
{
   for (int i = 0; i < 10 * 1000; i++)
   {
       var bmp = Resources.SmallIcon;
       bitmaps.Add(bmp);
   }
}

Update: This issue is fixed with KB 2929755 for Windows 7 up to Windows 8.0 and with KB 2919355 for Windows 8.1 and Server 2012 R2.

This is my first Windows bug that actually got fixed. The latter update was already rolled out via Windows Update in April, so you should have it by now. (End of update)

From a correctness point of view, nothing. You allocate 10,000 bitmaps, each of which is e.g. a 32x32 icon of about 1 KB. That adds up to another 10 MB of memory, which is nothing out of the ordinary. But let your application run for a few hours and Task Manager shows you 1 GB of memory consumption, and boom, you run out of memory. The interesting thing is that you have not used up all available memory; your 32-bit application has run out of address space.

When I execute this on my home machine with Windows 8.1 x64, I get the following numbers:

[Image: memory numbers for the test application]

We allocate 163 MB in private bytes. That is expected. But what about the 1.7 GB of reserved memory? OK, Windows might have a bad day. Let's try it with 5,000 bitmaps.

 

# of Bitmaps   Reserved KB   Reserved KB per Bitmap   64 KB Blocks per Bitmap
10,000         1,709,096     171                      2.7
5,000          986,024       197                      3.1

 

When I do this for zero and one bitmap, I get three 64 KB file mappings for each bitmap of 1 KB size. When all address space is used up, Windows GDI seems to free some memory to leave room for future allocations (that explains the factor of 2.7).

Windows is reserving memory 192 times the size of our bitmaps!
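
The factor of 192 follows directly from the measurement for a single bitmap. A minimal sketch of the arithmetic, assuming three 64 KB mappings per 1 KB bitmap as observed above (the class and method names are made up for illustration):

```csharp
using System;

public static class ReservationEstimate
{
    // Size of one reserved block in KB (the 64 KB blocks seen in the measurements).
    public const long BlockSizeKb = 64;

    // Number of 64 KB file mappings observed per bitmap in the single-bitmap case.
    public const long BlocksPerBitmap = 3;

    // Address space reserved for the given number of bitmaps, in KB,
    // if every bitmap keeps all of its mappings.
    public static long ReservedKb(long bitmapCount)
    {
        return bitmapCount * BlocksPerBitmap * BlockSizeKb;
    }

    // How many times more address space is reserved than the bitmap actually needs.
    public static long OverheadFactor(long bitmapSizeKb)
    {
        return BlocksPerBitmap * BlockSizeKb / bitmapSizeKb;
    }
}
```

With these assumptions, ReservationEstimate.OverheadFactor(1) is 192 and ReservationEstimate.ReservedKb(10000) is 1,920,000 KB (about 1.83 GB), which is already close to the 2 GB of user address space a 32-bit process gets by default. The measured 1,709,096 KB is lower because some memory is freed again once the address space fills up.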

This is true for 64-bit processes on Windows 8/8.1 as well, but there it does not matter because the address space is large enough. The behavior is the same for all editions of Windows 7 x86 and Windows 8.x x86/x64. The only Windows edition where this does not happen is Windows 7 x64, where the GDI implementation is different. When I look at some applications which make use of bitmaps, like Windows Live Writer, I see 80 MB of shareable memory where lots of 64 KB blocks are allocated. This is a real problem in all applications that read many bitmaps out of resources! Interestingly, the most prominent example, VS2013, has no such problems on my machine. It seems that MS is not using their own infrastructure…

Next time your managed application runs out of memory, check your Shareable Reserved bytes count for many 64 KB blocks. If you have many of them, you should check your bitmap usage.

What is this Shareable Memory anyway? Good question. Shareable Memory consists of file mapping objects which are page file backed. Any file mapping that points to a real file on disk is shown as Mapped File with the full file name. That is the reason why there is not much to see for Shareable Memory inside VMMap except the allocation address. This was an interesting bug which took some time with Windbg and a memory dump of a crashed process.

I can recommend the !address extension of Windbg to dump all pages of a process with their allocation mode and export the data into Excel, where it is easy to create pivot charts and find out that the Shareable Memory is all allocated as 64 KB blocks. Once that was known, I created some Windbg scripts to break into the CreateFileMapping and MapViewOfFile methods to get the allocation stacks. Unfortunately, due to the many file mapping objects created by the process, it became so slow that the application was unusable. A faster approach than Windbg was needed.
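
If you prefer code over a pivot chart, the same question ("which region sizes dominate?") can be answered with a small histogram. This is only a sketch: it assumes you have already parsed the !address output into a list of regions yourself, and the MemoryRegion type and its fields are hypothetical names, not part of any debugger API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for one line of parsed !address output.
public sealed class MemoryRegion
{
    public string Type { get; private set; }     // e.g. "MEM_MAPPED", "MEM_PRIVATE", "MEM_IMAGE"
    public long SizeBytes { get; private set; }  // region size in bytes

    public MemoryRegion(string type, long sizeBytes)
    {
        Type = type;
        SizeBytes = sizeBytes;
    }
}

public static class RegionHistogram
{
    // Counts regions of one memory type per region size.
    // Key = region size in bytes, Value = number of regions; most frequent size first.
    public static List<KeyValuePair<long, int>> BySize(IEnumerable<MemoryRegion> regions, string type)
    {
        return regions
            .Where(r => r.Type == type)
            .GroupBy(r => r.SizeBytes)
            .Select(g => new KeyValuePair<long, int>(g.Key, g.Count()))
            .OrderByDescending(p => p.Value)
            .ToList();
    }
}
```

If the first entry returned for "MEM_MAPPED" has a key of 65536 and a very large count, you are looking at the same pattern as described here.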

After a little thinking I thought: why not intercept the calls with a hooking library like EasyHook? It is a great library which lets you intercept nearly any Windows API call and execute managed code in your hook. Now I can get my hands on the mapping address (MapViewOfFile returns the mapped address, UnmapViewOfFile receives the mapping address to unmap) and the allocation size, besides other things. But one thing is missing: call stacks.

Here it is good to know that .NET supports ETW events by now. What is easier than to create an ETW provider in managed code which logs all calls to MapViewOfFile and UnmapViewOfFile as structured events? The kernel knows how to walk the stack very fast. That's what I did, and it worked perfectly. When you know how WPA displays custom ETW events, you can even use it to analyze unbalanced map/unmap calls. The key insight here is that you need to put the mapping address of MapViewOfFile and UnmapViewOfFile into the same column. Now you can group by this field, which gives you one group per mapping address. Then sort by count and look for all groups which have only one mapping call but no unmap. Select all unbalanced mapping calls and filter for them.

Now you can reap the fruits of your grouping by adding the StackTrace column and looking at which methods cause the most unbalanced allocations. Perhaps I can get permission to make the code public so you can use it for other purposes as well. The concept is general. You need to log the allocation and deallocation events into ETW and put the handle (e.g. the address or HANDLE) as the first property into your ETW event. Then you can record all allocation and deallocation events with full call stacks. WPA will display the handle in the first column for your allocation and deallocation calls, which allows further grouping and filtering.
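
To make the idea concrete, here is a minimal sketch of such a provider based on System.Diagnostics.Tracing.EventSource. It is not the code used for this investigation (that is not public); the provider name, event IDs, and method names are made up. The one point it is meant to show is that the mapping address is the first property of both events. Call stacks are not captured by this class; they are collected by the ETW session that records the provider.

```csharp
using System;
using System.Diagnostics.Tracing;

// Hypothetical provider name; choose your own.
[EventSource(Name = "Example-FileMappingTracker")]
public sealed class FileMappingEventSource : EventSource
{
    public static readonly FileMappingEventSource Log = new FileMappingEventSource();

    private FileMappingEventSource() { }

    // Call this from the MapViewOfFile hook.
    // The address is the first property so that it ends up in the first payload column.
    [Event(1, Level = EventLevel.Informational)]
    public void MapView(long address, long size)
    {
        if (IsEnabled()) WriteEvent(1, address, size);
    }

    // Call this from the UnmapViewOfFile hook.
    // Again the address comes first, so map and unmap events can be grouped by the same column.
    [Event(2, Level = EventLevel.Informational)]
    public void UnmapView(long address)
    {
        if (IsEnabled()) WriteEvent(2, address);
    }
}
```

Inside a hook you would then call FileMappingEventSource.Log.MapView(address, size) after the original MapViewOfFile returns, and FileMappingEventSource.Log.UnmapView(address) when UnmapViewOfFile is entered.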

You can skip that part and use wpaexporter to export your events to a CSV file and write your own tool to do more sophisticated analysis. Or you could use the TraceEvent Library and process the ETW events directly. It is up to you.
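
If you go the "own tool" route, the unbalanced-call analysis could look roughly like this. It is a sketch under assumptions: the MappingEvent type is hypothetical, the events are expected in chronological order, and reading them from the CSV file or from the ETW trace is left out. Because an address can be reused after it was unmapped, the sketch replays the events in order instead of just comparing counts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum MappingKind { Map, Unmap }

// Hypothetical in-memory form of one exported event.
public sealed class MappingEvent
{
    public MappingKind Kind { get; private set; }
    public long Address { get; private set; }
    public string Stack { get; private set; }   // call stack as text; ignored for Unmap events

    public MappingEvent(MappingKind kind, long address, string stack)
    {
        Kind = kind;
        Address = address;
        Stack = stack;
    }
}

public static class UnbalancedMappings
{
    // Replays the events in order and returns, per allocating call stack,
    // the number of mappings that were never unmapped.
    public static Dictionary<string, int> CountByStack(IEnumerable<MappingEvent> events)
    {
        // Address -> stack of the Map call that currently owns this address.
        var outstanding = new Dictionary<long, string>();

        foreach (MappingEvent e in events)
        {
            if (e.Kind == MappingKind.Map)
                outstanding[e.Address] = e.Stack;
            else
                outstanding.Remove(e.Address);
        }

        return outstanding.Values
            .GroupBy(stack => stack)
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Sorting the result by count gives you the same picture as the grouped WPA view: the stacks with the most mappings that never got unmapped.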

If you are having a hard time finding leaks related to GDI objects like Bitmap, Brush, Device Context, Font, Metafile, Pen, Region, or Palette, you can hook the calls and have an easy day. This approach is not new and was already described in 2003 in an MSDN Magazine article. That article also lists the method names you might be interested in hooking. The new thing is that it was never so easy to get full managed and unmanaged call stacks with nearly no performance impact. The MSDN article relies on SetWindowsHookEx to inject code into other processes and on dbghelp.dll to walk the stack with the debugging APIs. The combined ETW and EasyHook approach is much more performant and general (full mixed mode stack walking).

Without that it would have been much harder to find the root cause. With ETW tracing you have access to first-class information which is not based on guessing but on exact measurements. That not only allows you to find bugs; you can now also make quantitative predictions of how much address space and private bytes you will get back if you fix the bitmap usage in this or that method. It is kinda cool to go to the devs and tell them: If you fix that method you will free up 300 MB of virtual address space. If we fix the top 5 then we can reduce the address space usage of Shareable Memory by 70%. Most of the time they say: "I do not believe that this innocent method is responsible for such a big amount of allocations. Here is the fix and check again." After measuring the gain of the fix I can compare it with the predicted results and tell them: "Looks like my prediction was correct and we have an exact match."
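
Such a prediction is simple arithmetic once you have the number of unbalanced mappings per method. A minimal sketch, assuming every leaked mapping costs one 64 KB block; the names and the numbers in the usage note are invented for illustration and are not the figures from the real investigation:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeakPrediction
{
    // Assumption: each leaked mapping occupies one 64 KB block of address space.
    public const long BlockBytes = 64 * 1024;

    // Address space you can expect to get back if a method stops leaking this many mappings.
    public static long AddressSpaceBytes(int leakedMappings)
    {
        return leakedMappings * BlockBytes;
    }

    // Fraction (0..1) of all leaked mappings caused by the topN worst methods.
    public static double TopShare(IDictionary<string, int> leaksByMethod, int topN)
    {
        long total = leaksByMethod.Values.Sum(v => (long)v);
        if (total == 0) return 0.0;

        long top = leaksByMethod.Values
            .OrderByDescending(v => v)
            .Take(topN)
            .Sum(v => (long)v);

        return (double)top / total;
    }
}
```

For example, a method with 4,800 unbalanced mappings accounts for 4,800 x 64 KB = 300 MB of address space, which is the kind of number you can hand to the developers before they fix anything.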

This is a real game changer. Now you can work focused on the most important methods which cause a leak based on accurate measurements. Thanks to the low impact ETW tracing it is possible to run all your tests with hooks enabled giving you in the case of leaks accurate data to find the leak without the need to repeat the tests! At least in an ideal world. But since setting the hooks up may cause random crashes it is safer to employ this technique only when a leak was found.

There is one drawback to hooking and ETW stack walking: If you are on Windows 7 stack walking will only work in 32 bit processes which (lucky me) was the case for this problem. The ETW stackwalker in Windows 7 stops at the first dynamically generated stack frame (the assembly code for the hooks) which will give you only a stack which ends in your trace call. Not terribly useful. This has been fixed in Windows 8+ where you can hook with full call stacks also in 64 bit processes.

Since Windows 7 will stay for some time I really would like to get a fix for the Windows 7 ETW infrastructure to get full call stacks in 64 bit processes. For managed code you can NGEN your assemblies which fixes the issue if you do not have much JIT generated code running in your process. The stackwalker of Windows 7 will stop when it encounters the first frame which is not inside any loaded module.

I can only speculate how many 32-bit applications are affected by this issue and run out of memory much earlier than they would need to. At least for regular .NET applications this is a real issue nobody has noticed so far. The 64 KB value is no accident. It is the OS allocation granularity, which is the reason why 64 KB chunks of memory are always reserved although you can map smaller regions (e.g. 4 KB). The OS allocation granularity seems to be used by GDI and passed to the mapping calls explicitly. The called GDI+ method is GdipCreateBitmapFromStream, which expects an IStream interface that is converted to a file mapping region under the hood. Other GDI methods seem not to be affected by this issue, but one can never know. If you have a bitmap-intensive managed 32-bit application (many icons), you should check the amount of Shareable Memory with VMMap when your application has run out of memory.
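
The effect of the allocation granularity on address space can be sketched as a simple round-up. The constant below is an assumption: 64 KB is the usual value Windows reports as dwAllocationGranularity via GetSystemInfo, but the sketch hard-codes it instead of querying the OS, and the class name is made up.

```csharp
using System;

public static class AllocationGranularity
{
    // Assumed allocation granularity in bytes (the usual Windows value; not queried from the OS here).
    public const long Granularity = 64 * 1024;

    // Address space taken by one mapped view: the view size rounded up
    // to the next multiple of the allocation granularity.
    public static long AddressSpaceConsumed(long viewSizeBytes)
    {
        if (viewSizeBytes <= 0)
            throw new ArgumentOutOfRangeException("viewSizeBytes");

        return (viewSizeBytes + Granularity - 1) / Granularity * Granularity;
    }
}
```

A 4 KB view therefore takes a full 64 KB slot of address space, and the same is true for anything up to 64 KB.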

posted on Friday, January 3, 2014 1:19 PM

Feedback

# re: GDI+ Bug in Windows 7 x86 and Windows 8.0/8.1 x86/x64 1/6/2014 7:59 PM ronin4net
nice article :)

however, if u r not able to reproduce ur bitmap issue but have a dump available here is another approach:
- load dump in WinDbg
- foreach Bitmap from managed heap follow the steps described here:
http://social.msdn.microsoft.com/forums/en-US/winforms/thread/cd86ddf7-da4b-4b85-a737-b516dbb13d03
- use ".writemem" to write the bitmap to the disc (start pointer + size can be finally read from GpMemoryBitmap structure)
- use ImageMagick to transform the raw bitmap to e.g. pngs
- (of course some windbg scripting might be necessary to semi-automate these steps assuming u have tons of bitmaps)
- browse the icons in explorer to get a feeling which icons/resources create most of the trouble
(assuming that from the image itself u know the to analyse)
- alternatively u could create a histogram from all icons using a hash on the pixel data to produce a top 10 listing



# re: GDI+ Bug in Windows 7 x86 and Windows 8.0/8.1 x86/x64 1/7/2014 11:50 AM Alois Kraus
Hi ronin yes this is a very good idea if you cannot get the call stacks because one dump is all you have.
