
Avoid common globalization errors in .NET

Last week I was looking for a better RSS reader (err… aggregator). You see, after a certain number of feeds MS Outlook no longer works for me, and I needed a better alternative. In my search I came across a product called RikReader and decided to give it a try. Why? I’ve read some good opinions about it, it’s free, it uses the Windows RSS Platform, and it is close to my heart because it is done entirely in WPF.

I promptly downloaded and installed the package. But instead of the expected welcome screen here is what I get every time I try to run it:

(The error message is in Polish but it says: The input string has incorrect format)

As you can see, the application includes a very nice error reporting window (every modern application should have one!), so Douglas Stockwell, its author, is probably already aware of the problem and hopefully working on a solution. But I think he wouldn’t mind a helping hand.

So what exactly is the problem here? From the stack trace above we can see that it all starts in a custom binding converter called MultiplyConverter. My guess is that this converter expects a number, and when it encounters a String argument it first tries to convert it to a Double value by calling System.Convert.ToDouble. In the end this method throws a FormatException to indicate that the given text doesn’t represent a valid number.

Looks like a pretty obvious bug, so you may wonder why the author hasn’t fixed it already. You say he can’t reproduce it? So what is so special about my system that I always get the same error? The answer is this: it uses different language settings! Should it matter?

Not all of my English-speaking fellow developers might realize this (and those who do sometimes “forget” about it), but not all cultures use the same format to write numbers or dates. You see, in certain languages, like French (or in my case Polish), the decimal separator is not a dot as in English but a comma. So instead of 123.456 we would write 123,456. A subtle difference, but with huge implications (at least for the dumb computer).

Here is a simple example to demonstrate this. The .NET library provides several ways to convert a number to a string:

double number = 123.456;
string convertToString = Convert.ToString(number);
string numberToString = number.ToString();
string stringFormat = String.Format("{0}", number);

What do you think the value of each of these strings will be after running this code? The correct answer is: it depends. On some systems it will be “123.456”, but on others it could just as well be “123,456”. This is because, by default, each of these methods uses the current culture settings to produce the text representation of the numerical value.

Similarly, there are several ways in .NET to parse text to produce a numeric value:
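To see the difference without changing your system settings, you can pass a culture explicitly. The following is a minimal sketch; the FormatDemo class and its Format helper are names made up for this illustration, and it assumes the runtime has the en-US and pl-PL culture data available.

```csharp
using System;
using System.Globalization;

public static class FormatDemo
{
    // Hypothetical helper: formats the value with an explicitly named culture
    // instead of whatever the current culture happens to be.
    public static string Format(double value, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return value.ToString(culture);
    }

    public static void Main()
    {
        double number = 123.456;
        Console.WriteLine(Format(number, "en-US")); // 123.456
        Console.WriteLine(Format(number, "pl-PL")); // 123,456
    }
}
```

The same number produces two different strings, which is exactly what happens implicitly when the code runs on machines with different regional settings.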

string text = "123.456"; 
double convertToDouble = Convert.ToDouble(text); 
double doubleParse = Double.Parse(text); 
double doubleTryParse; 
if (Double.TryParse(text, out doubleTryParse) == false) 
    Console.WriteLine("Can't parse string.");

Is this code correct? It also depends on the culture in which you run it. In some the code executes fine and each double variable gets the correct number. In others the first two methods throw exceptions and the last one returns false to indicate the string is invalid.

What does this all mean for you? If your application depends on storing decimal values in a certain format (for example, it reads some settings from a text file), it can never depend on these methods alone. But not all is lost, and of course .NET provides the means to handle this problem.

Each of these methods has overloads that take an IFormatProvider as an argument. How do we find one? It could be hard to guess at first, but in this case you should use an appropriate instance of the CultureInfo class. So if you expect that the input value you need to parse will be formatted in Polish, use new CultureInfo("pl-PL"). The same applies to output. Of course, sometimes it might be hard to determine which culture you should use. But if you are in the lucky position that your application controls both the input and the output format, then you can just pick any culture and use it for both. Actually, .NET already provides a nice shortcut for this via CultureInfo.InvariantCulture.
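The sketch below tries the same input against a few explicitly chosen cultures. ParseDemo and TryParseWith are hypothetical names used only here, and the results noted in the comments assume the standard culture data for en-US, pl-PL and de-DE. The German case is included as an extra illustration: there the dot is a group separator, so the text is accepted but read as a different number.

```csharp
using System;
using System.Globalization;

public static class ParseDemo
{
    // Hypothetical helper: parses text with the named culture and
    // returns null when that culture rejects the text.
    public static double? TryParseWith(string text, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        // Same styles that Double.Parse(string) uses by default.
        NumberStyles styles = NumberStyles.Float | NumberStyles.AllowThousands;
        double result;
        if (double.TryParse(text, styles, culture, out result))
            return result;
        return null;
    }

    public static void Main()
    {
        foreach (string name in new[] { "en-US", "pl-PL", "de-DE" })
        {
            double? parsed = TryParseWith("123.456", name);
            string shown = parsed.HasValue
                ? parsed.Value.ToString(CultureInfo.InvariantCulture)
                : "rejected";
            // en-US: 123.456
            // pl-PL: rejected
            // de-DE: 123456 (parsed without error, but not the intended value)
            Console.WriteLine(name + ": " + shown);
        }
    }
}
```

A rejected string at least fails loudly. The silently misread value is the more dangerous outcome, because nothing tells you the setting was loaded wrong.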
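As a rough illustration of the "same culture on both ends" idea, the sketch below writes a value and reads it back with one explicitly chosen culture. RoundTripDemo and RoundTrip are made-up names for this example, and it assumes pl-PL culture data is available on the machine.

```csharp
using System;
using System.Globalization;

public static class RoundTripDemo
{
    // Hypothetical helper: writes the value with one culture and
    // reads it back with that very same culture.
    public static double RoundTrip(double value, CultureInfo culture)
    {
        string text = value.ToString("R", culture); // "R" keeps full precision
        return double.Parse(text, culture);
    }

    public static void Main()
    {
        CultureInfo polish = new CultureInfo("pl-PL");

        // Input known to be formatted the Polish way parses fine with pl-PL.
        Console.WriteLine(double.Parse("123,456", polish) == 123.456); // True

        // Any culture works as long as writer and reader agree on it.
        Console.WriteLine(RoundTrip(123.456, polish) == 123.456); // True
        Console.WriteLine(RoundTrip(123.456, CultureInfo.InvariantCulture) == 123.456); // True
    }
}
```

When you control both sides, InvariantCulture is usually the more convenient pick, since it does not depend on any particular regional data or on user overrides.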

I hope that by now you already know how to fix the above examples. Here is the corrected version that uses InvariantCulture:

using System;
using System.Globalization; // CultureInfo, NumberStyles

// Number to string: always written with a dot as the decimal separator.
string convertToString = Convert.ToString(number, CultureInfo.InvariantCulture);
string numberToString = number.ToString(CultureInfo.InvariantCulture);
string stringFormat = String.Format(CultureInfo.InvariantCulture, "{0}", number);

// String to number: always read with a dot as the decimal separator.
double convertToDouble = Convert.ToDouble(text, CultureInfo.InvariantCulture);
double doubleParse = Double.Parse(text, CultureInfo.InvariantCulture);
double doubleTryParse;
if (!Double.TryParse(text, NumberStyles.Any, CultureInfo.InvariantCulture, out doubleTryParse))
    Console.WriteLine("Can't parse string.");

As you see, this change is fairly simple, so I strongly encourage you to review your code and always consider what the proper format of the input or output value should be. The same rules apply to the other fractional numeric types (float and decimal), and also to formatting dates (DateTime).

If you have a large existing application, reviewing it quickly can be a bit cumbersome, but thankfully FxCop contains a set of rules that can help find such spots. Here is the code analysis report obtained by running FxCop on the first version of the above examples:

As you can see, it properly lists all lines with calls to methods that have overloads that take an IFormatProvider (apart from Double.TryParse). With the corrected version of the code all these warnings disappear.

Also worth noting is that all the methods used in the first example are actually shortcuts for passing CultureInfo.CurrentCulture, which (if you haven’t changed it in your code) holds the culture of your operating system. For most purposes, like displaying a value to the user, this works fine and you can leave it as it is. But when reading files or any external data source you should always use exactly the same culture that was used to write it.

Hopefully this information will help you avoid these very common globalization errors, which are quite annoying to many people living in this part of the world. If you need help ensuring that your application runs with no problems on non-English systems, just e-mail me and we can test it together.
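For the other types the pattern looks much the same. Here is a small sketch for DateTime and decimal; the OtherTypesDemo class, its helper methods and the sample values are invented for illustration, and the choice of the round-trip "o" date format is my own assumption about what a stored date might look like.

```csharp
using System;
using System.Globalization;

public static class OtherTypesDemo
{
    // Dates: the round-trip "o" format combined with InvariantCulture
    // gives a text form that does not depend on regional settings.
    public static string WriteDate(DateTime value)
    {
        return value.ToString("o", CultureInfo.InvariantCulture);
    }

    public static DateTime ReadDate(string text)
    {
        return DateTime.ParseExact(text, "o", CultureInfo.InvariantCulture,
                                   DateTimeStyles.RoundtripKind);
    }

    // Decimals follow the same rule as doubles.
    public static string WriteDecimal(decimal value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }

    public static decimal ReadDecimal(string text)
    {
        return decimal.Parse(text, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        DateTime stamp = new DateTime(2008, 3, 15, 14, 30, 0, DateTimeKind.Utc);
        string dateText = WriteDate(stamp);
        Console.WriteLine(dateText);                    // 2008-03-15T14:30:00.0000000Z
        Console.WriteLine(ReadDate(dateText) == stamp); // True

        Console.WriteLine(WriteDecimal(19.99m));           // 19.99
        Console.WriteLine(ReadDecimal("19.99") == 19.99m); // True
    }
}
```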
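One way to keep the two concerns apart is sketched below: the stored form always uses InvariantCulture, while the text shown to the user follows CurrentCulture. The SettingsDemo class, the "zoom" setting and the key=value line format are all hypothetical and only serve the example.

```csharp
using System;
using System.Globalization;

public static class SettingsDemo
{
    // Storage: always written with InvariantCulture, so the file reads
    // the same on every machine.
    public static string Serialize(string key, double value)
    {
        return key + "=" + value.ToString("R", CultureInfo.InvariantCulture);
    }

    // Storage: read back with the same culture that was used for writing.
    public static double Deserialize(string line)
    {
        string valueText = line.Substring(line.IndexOf('=') + 1);
        return double.Parse(valueText, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        string stored = Serialize("zoom", 1.25);
        Console.WriteLine(stored); // zoom=1.25

        double zoom = Deserialize(stored);

        // Display: the user's own culture is the right choice here,
        // so the output of this line varies with regional settings.
        Console.WriteLine(zoom.ToString("N2", CultureInfo.CurrentCulture));
    }
}
```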

You can download the sample project here

This article is part of the GWB Archives. Original Author: Szymon Kobalczyk
