James Michael Hare

...hare-brained ideas from the realm of software development...
Welcome to my blog! I'm a Sr. Software Development Engineer in the Seattle area who has been doing C++/C#/Java development for over 20 years, but I have definitely learned that there is always more to learn!

All thoughts and opinions expressed in my blog and my comments are my own and do not represent the thoughts of my employer.

C#/.NET Fundamentals: Returning Data Immutably in a Mutable World

One of the things I sort-of miss from C++ (it has its good and bad) is the const modifier.  Yes, it’s true that we have a const modifier in C# (as well as readonly), but it’s not quite as robust.

Many times you’ll want to return an internal member of a class but not want it to be directly modifiable by the user of that class.  This article discusses how to present simple types as read-only. 

Note: I’m deliberately avoiding creating read-only views of collections in this particular article, but I will cover that in a follow-up article as there are multiple ways to achieve that as well.  For the purposes of this article we will assume the types are all fairly flat with no collections.

How C++ Lets You Mark Identifiers const:

In C++ you can mark an identifier (even a parameter) of any type as const, and you then cannot call any mutating operations on that identifier.  Now, to be fair, C++ makes you jump through hoops to gain that distinction: you also have to mark every method that does not mutate the type as const:

    // C++ code, sorry for all you C#-ers
    class Queue
    {
        private:
            // ...

        public:
            // const modifier here says Count() will not modify Queue.
            int Count() const;

            // no const modifier, so Pop can modify Queue.
            void Pop();

            // ...
    };

So, if you have two different identifiers (even parameters) of type Queue:

    // can call any methods on nonConstQueue, but only explicitly marked
    // const methods on constQueue.
    void DoSomethingWithTwoQueues(Queue& nonConstQueue, const Queue& constQueue)
    {
        // ...
    }

In this case, nonConstQueue is not declared const, so you can call any Queue method.  However, constQueue is declared const, so you can only call methods that have explicitly been marked const.

Now, in C++ you can cheat this by casting away const-ness or marking members as mutable.  But on the whole this gives you an easy way to present a read-only view of mutable data.

How const/readonly Works in C#:

So, how does this relate to C#?  Well, in C# you can mark identifiers as const, but only if they are compile-time constants.  This means the value must be a constant expression: a numeric, boolean, char, or string literal (or an expression built from such constants), an enum value, or null for reference types.
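
As a rough sketch of that rule (the Limits class and its member names are made up purely for illustration), every const below compiles because its initializer can be evaluated at compile time, while the commented-out line shows the kind of declaration the compiler rejects:

```csharp
using System;

// Hypothetical holder class, used only to illustrate the const rules.
public static class Limits
{
    // Allowed: literals and expressions built only from other constants.
    public const int MaxRetries = 3;
    public const double Timeout = MaxRetries * 1.5;
    public const string Prefix = "host-";
    public const object Nothing = null;   // null is the only const value for a non-string reference type

    // Not allowed: DateTime.Now is evaluated at run time, so this would not compile.
    // public const DateTime Started = DateTime.Now;

    // A run-time value has to be a (static) readonly field instead.
    public static readonly DateTime Started = DateTime.Now;
}
```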

So what about readonly?  In C#, readonly is akin to Java’s final modifier in that it works well for value types (numeric, char, struct) and immutable types (string), but for any mutable reference type it is somewhat ineffective (for a larger discussion of const versus readonly, see here).

That is, the reference is readonly, but what it refers to is not.  So if you had:

    public class Sender
    {
        private readonly List<string> _hosts;

        // ...
    }

This prevents you from saying:

    // cannot change what we refer to, _hosts is readonly.
    _hosts = new List<string>();

But you can mutate the existing list:

    // can do this, we aren't changing the reference (which is readonly)
    _hosts.Clear();

This can lead to confusion for the novice developer.  To be fair, though, I can totally see why C# (and Java) took this route, because C++ const-correctness can be extremely maddening.  Essentially, as mentioned before, you have to mark every method as to whether or not it will mutate the class.

So what are we to do if we want to present truly read-only data?  Well, you have several options.  None of them is perfect, but any of them will give you a reasonable level of read-only protection.

Present a Read-Only Interface

One of the things you can do to make a type harder to modify is have it present a read-only interface.  Note that I’m not talking about the readonly keyword, but an interface that only exposes non-mutating operations.  For example, let’s say you had a POCO like Product:

    public class Product
    {
        public string Name { get; set; }
        public int Id { get; set; }
        public string Category { get; set; }
    }

Any time an instance of this class is exposed, the members can be altered.  Thus even if you had:

    public class CatalogEntry
    {
        private readonly Product _product;
        private readonly double _price;

        // incidentally we could have done this with a private setter
        // instead of backing field, but wanted to illustrate using readonly
        public Product Product { get { return _product; } }
        public double Price { get { return _price; } }

        public CatalogEntry(Product product, double price)
        {
            _product = product;
            _price = price;
        }
    }

There’s nothing to stop you from saying:

    var entry = new CatalogEntry(new Product { Id = 3, Name = "Widget", Category = "Theoretical" }, 3.14);

    // allowed, _product is readonly, but _product.Name is not!
    entry.Product.Name = "Ooops";

So how can we mitigate this?  Well, we could provide a read-only interface for our product such as:

    // create an interface without mutators (interface members are
    // implicitly public, so they take no access modifier).
    public interface IReadOnlyProduct
    {
        string Name { get; }
        int Id { get; }
        string Category { get; }
    }

    // POCO class implements the read-only interface and adds mutators
    public class Product : IReadOnlyProduct
    {
        public string Name { get; set; }
        public int Id { get; set; }
        public string Category { get; set; }
    }

Now, with this read-only interface, classes that want to use a Product but keep it from being altered can expose only the IReadOnlyProduct interface.

    public class CatalogEntry
    {
        private readonly Product _product;
        private readonly double _price;

        // incidentally we could have done this with a private setter
        // instead of backing field, but wanted to illustrate using readonly
        public IReadOnlyProduct Product { get { return _product; } }
        public double Price { get { return _price; } }

        public CatalogEntry(Product product, double price)
        {
            _product = product;
            _price = price;
        }
    }

Now, if you attempt to directly modify Product, you will get a compiler error because there are no setters exposed.

    var entry = new CatalogEntry(new Product { Id = 3, Name = "Widget", Category = "Theoretical" }, 3.14);

    // Now, this is a compiler error because IReadOnlyProduct does not expose a Name setter.
    entry.Product.Name = "Ooops";

So is this perfect?  Not really.  The main problems with this approach are that you have to create read-only interfaces for everything you want to protect, and that nothing prevents a user of your class from casting the interface back to Product and then modifying the values.  As such, it’s probably not the best approach.
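
To make the casting loophole concrete, here is a minimal sketch.  It repeats the types from above so that it stands alone, trims CatalogEntry down to just the product, and wraps the demonstration in a CastDemo class, which is a hypothetical name introduced only for this example:

```csharp
public interface IReadOnlyProduct
{
    string Name { get; }
    int Id { get; }
    string Category { get; }
}

public class Product : IReadOnlyProduct
{
    public string Name { get; set; }
    public int Id { get; set; }
    public string Category { get; set; }
}

// Trimmed-down CatalogEntry (price omitted to keep the sketch short).
public class CatalogEntry
{
    private readonly Product _product;

    public IReadOnlyProduct Product { get { return _product; } }

    public CatalogEntry(Product product) { _product = product; }
}

// Hypothetical demo wrapper, not part of the design being discussed.
public static class CastDemo
{
    public static string Run()
    {
        var entry = new CatalogEntry(new Product { Id = 3, Name = "Widget", Category = "Theoretical" });

        // entry.Product.Name = "Ooops";   // compiler error: the interface has no setter

        // ...but a caller who knows the concrete type can cast and mutate anyway.
        var sneaky = (Product)entry.Product;
        sneaky.Name = "Ooops";

        // The entry's own product has changed, because both refer to the same object.
        return entry.Product.Name;
    }
}
```

Calling CastDemo.Run() returns "Ooops": the interface narrows what the compiler lets you do through that reference, but it does not protect the underlying object.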

Present a Struct

One of the other options we have is to make the type a struct.  Remember that struct types are value types and thus any time you pass them around (which includes returning them from property gets) you pass them by value, which makes a full copy.  Since you are passing a copy, if the user chooses to modify the copy that is their business, but it will not affect your copy.

    // now a struct, so the type is passed by value.
    public struct Product
    {
        public string Name { get; set; }
        public int Id { get; set; }
        public string Category { get; set; }
    }

Now with our CatalogEntry, we have:

    public class CatalogEntry
    {
        // making private setters to illustrate can do either way
        public Product Product { get; private set; }
        public double Price { get; private set; }

        public CatalogEntry(Product product, double price)
        {
            Product = product;
            Price = price;
        }
    }

Now, if we attempt to alter the Product returned:

    var entry = new CatalogEntry(new Product {Id = 3, Name = "Widget", Category = "Theoretical"}, 3.14);

    // compiler gives us an error saying cannot set Product Name because "not a variable".
    entry.Product.Name = "Ooops";

We get a nice compiler error saying we can’t modify the Product’s Name property, because the Product is returned by value and the result is a temporary copy, not a variable that can be modified.

Now, you can do this:

    var entry = new CatalogEntry(new Product {Id = 3, Name = "Widget", Category = "Theoretical"}, 3.14);

    // you can do this...
    var product = entry.Product;

    // but you're modifying a copy and it doesn't affect original.
    product.Name = "Ooops";

But this is rather benign, because at that point product is a copy of entry’s Product and thus the original is unaltered.

So, a struct gives you somewhat better protection, but it’s not a panacea.  The thing to remember is that a struct is not a class, and there are major differences between the two (I have an entry with a table of differences here).  One of the major points is that a struct is always passed by value, which can be heavy if your type has more than a few properties.

Also, this method does not protect you if you are exposing a mutable reference type as a property inside the struct.  This is because the copy of the struct is shallow: only the reference is copied, not what it refers to.  This is fine for immutable types (like string), but if you expose a mutable reference type, that can still be changed.
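
Here is a small sketch of that shallow-copy problem.  The Supplier, SuppliedProduct, SuppliedEntry, and ShallowCopyDemo names are all hypothetical, invented for this illustration rather than taken from the examples above:

```csharp
// Hypothetical mutable reference type held inside the struct.
public class Supplier
{
    public string Name { get; set; }
}

// Hypothetical struct that mixes an immutable member (string)
// with a mutable reference-type member (Supplier).
public struct SuppliedProduct
{
    public string Name { get; set; }
    public Supplier Supplier { get; set; }
}

public class SuppliedEntry
{
    public SuppliedProduct Product { get; private set; }

    public SuppliedEntry(SuppliedProduct product) { Product = product; }
}

public static class ShallowCopyDemo
{
    public static string Run()
    {
        var entry = new SuppliedEntry(new SuppliedProduct
        {
            Name = "Widget",
            Supplier = new Supplier { Name = "Acme" }
        });

        // The getter hands back a copy of the struct, but the copy's Supplier
        // reference still points at the same Supplier object as the original.
        var copy = entry.Product;
        copy.Name = "Ooops";               // changes only the copy
        copy.Supplier.Name = "Ooops Inc."; // changes the object shared with the original

        return entry.Product.Name + " / " + entry.Product.Supplier.Name;
    }
}
```

ShallowCopyDemo.Run() returns "Widget / Ooops Inc.": the string member of the original is untouched, but the shared Supplier has been altered through the copy.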

Present an Immutable Type

Another way to create a read-only type is to make the type immutable.  That is, once the type has received its initial value, it cannot be changed.  In C#, strings (among others) are immutable.  Any operation you perform that would change the string actually returns a new string instead.  This method can be used either in place of using a struct or in addition to it.
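
The string behavior is easy to check with a tiny sketch (StringDemo is just a hypothetical wrapper name for this example):

```csharp
public static class StringDemo
{
    public static string Run()
    {
        string original = "Widget";

        // ToUpperInvariant() does not touch 'original'; it builds and returns a new string.
        string shouted = original.ToUpperInvariant();

        return original + " -> " + shouted;
    }
}
```

StringDemo.Run() returns "Widget -> WIDGET", showing that the original string is unchanged after the "modifying" call.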

So, how do you create an immutable type?  Basically you would take in all the information necessary in the constructor of the type, and then only expose non-mutating methods and property gets:

    public class Product
    {
        // all properties have private sets so only the class itself can change them.
        public string Name { get; private set; }
        public int Id { get; private set; }
        public string Category { get; private set; }

        // constructor takes (or calculates) all it needs and sets
        public Product(int id, string name, string category)
        {
            Id = id;
            Name = name;
            Category = category;
        }
    }

So now, using the same CatalogEntry as before, we would have:

    var entry = new CatalogEntry(new Product(3, "Widget", "Theoretical"), 3.14);

    // can't do this, Name doesn't have a set exposed.
    entry.Product.Name = "Ooops";

Notice, this seems very similar to exposing a read-only interface.  The difference being that this cannot be cast to anything to get at the private members.  Since the type itself protects the sets, they cannot be accessed and you will get a compiler error.

So what’s the down-side to this?  Well, you have to specify pretty much everything in the constructor, so if your class contains a lot of properties, you can have a very messy constructor.  Plus, because we have to pass all values in the constructor, we can’t use object initializers for the properties.  To me this makes the code less readable. 

For example, in the code snippet above is “Widget” the name, or the category?  Same with “Theoretical”?  You can get around this by using named parameters in C#.  C# named parameters allow you to pass a parameter by its name instead of by position.  Now, that’s not to say you can’t put a parameter in the right position and name it either, but it does allow that flexibility.

The nice part of this is you can use it to remove ambiguity:

    var entry = new CatalogEntry(new Product(3, name: "Widget", category: "Theoretical"), 3.14);

Is it perfect?  No, but oftentimes an immutable type can be your best protection.

Summary

So, we’ve seen three of several ways to create read-only data in C#.  They all have their pros and cons and some may be more applicable in some situations than others.  In summary:

  • Read-Only Interface
    • Pros: can return read-only view of data while allowing underlying type to be mutable.
    • Cons: can get around read-only view by casting to the actual type, also requires two artifacts (class and interface) per “type”.
  • Struct
    • Pros: great for small data types if a struct is applicable, original can’t be modified if returned by value (property get, return from method).
    • Cons: a struct can have some gotchas that you should understand, may be less performant if size of struct is large.
  • Immutable
    • Pros: cannot modify underlying values by design.
    • Cons: must pass all values (or calculate them) in constructor, can’t use object initializers.

Until C# gives us a way to return a constant form of a mutable reference type, these are a few of the tools at our disposal.

 


 

Posted on Thursday, October 28, 2010 6:21 PM | Filed Under [ My Blog C# Software .NET Fundamentals ]
