Functional Programming with C#/F#

After all the hype around C# 3.0, which will bring LINQ, lambda expressions and many other things, I thought it would be useful to take a deeper look at functional programming languages like F#. Quite a few people at Microsoft are fond of the functional programming style, and it influenced the design of C# 3.0. Many of the language architects there have a Haskell background, which could explain the renaissance of these “old” concepts. So what’s the deal with the functional approach? It basically forces you to think in a twisted way: the problem is solved directly by calling a function which calls another function, and so on. Let’s assume you want the list of all members of all types contained in the assemblies that are currently loaded into your AppDomain (looking at the problem this way is the twisted part). In F# you could simply write:

// Assign allMembers the result of our "calculation"
let allMembers = System.AppDomain.CurrentDomain.GetAssemblies()
|> Array.to_list |> List.map (fun a -> a.GetTypes()) |> Array.concat
|> Array.to_list |> List.map (fun ty -> ty.GetMembers()) |> Array.concat
// Print them out
do Array.foreach allMembers ( fun memberInfo -> printf "\n%s" memberInfo.Name)

If you want to understand what this little code fragment really does and how it can be rewritten, I recommend the post from Robert. A more comprehensible version in C# would look like this:

public List<MemberInfo> GetAllMembers()
{
  Assembly[] allAssemblies = AppDomain.CurrentDomain.GetAssemblies();

  List<Type> allTypes = new List<Type>();
  foreach (Assembly assembly in allAssemblies)
  {
    allTypes.AddRange(assembly.GetTypes());
  }

  List<MemberInfo> allMembers = new List<MemberInfo>();
  foreach (Type t in allTypes)
  {
    allMembers.AddRange(t.GetMembers());
  }

  return allMembers;
}

That is many more lines to write, but in a form that (most) people can understand more quickly. Let’s rewrite it in a more functional way with C# 2.0, which already provides some functional features today (anonymous delegates and iterators, i.e. coroutines via yield):

public List<MemberInfo> GetFunctionMembers()
{
  List<MemberInfo> allMembers = new List<MemberInfo>();

  Array.ForEach(AppDomain.CurrentDomain.GetAssemblies(),
                delegate(Assembly assembly)
                {
                  Array.ForEach(assembly.GetTypes(), delegate(Type type)
                  {
                    allMembers.AddRange(type.GetMembers());
                  });
                });

  return allMembers;
}

Anonymous delegates and the new static Array functions are very nifty features of .NET 2.0 that already allow a compact functional programming style today. The same constructs (ForEach, …) are also available as instance functions on generic Lists.
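
As a small illustration of the List<T> instance functions, here is a minimal sketch that combines FindAll, ConvertAll and ForEach with anonymous delegates. The class and method names (ListHelpersSketch, LongNamesUpper, Print) are made up for this example and do not come from the original samples.

```csharp
using System;
using System.Collections.Generic;

public static class ListHelpersSketch
{
    // Returns the upper-cased names that are longer than minLength characters.
    public static List<string> LongNamesUpper(List<string> names, int minLength)
    {
        // List<T>.FindAll is the instance counterpart of Array.FindAll
        List<string> longNames = names.FindAll(delegate(string name)
        {
            return name.Length > minLength;
        });

        // List<T>.ConvertAll is the instance counterpart of Array.ConvertAll
        return longNames.ConvertAll<string>(delegate(string name)
        {
            return name.ToUpperInvariant();
        });
    }

    // Prints every element using List<T>.ForEach
    public static void Print(List<string> items)
    {
        items.ForEach(delegate(string item) { Console.WriteLine(item); });
    }
}
```

Calling ListHelpersSketch.Print(ListHelpersSketch.LongNamesUpper(names, 4)) would filter, convert and print the list in one expression, much like the pipeline in the F# sample.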

New Array (generic list) Helper Functions Of .NET 2.0

The following table contains the new static Array helper functions, most of which are also available as instance functions of generic lists.

Array Function/Delegate	Explanation	F# Counterpart
delegate void Action<T>( T obj )	Action delegate used by some Array helper functions. Does not return anything.	–
delegate bool Predicate<T>( T obj )	Predicate delegate used by some Array helper functions. Returns true if the input element matches the condition.	–
delegate int Comparison<T>(T x, T y)	Comparison delegate used by Array.Sort. Returns a negative value if x < y, 0 if x = y and a positive value if x > y.	–
ReadOnlyCollection<T> AsReadOnly<T> (T[] array)	Returns a read-only collection wrapper around the input array.	–
ConstrainedCopy ( sourceArray, sourceIndex, destinationArray, destinationIndex, length )	Copies a range of elements from one array into another array and guarantees that all changes are undone if the copy does not complete successfully.	–
TOutput[] ConvertAll<TInput,TOutput> ( TInput[] array, Converter<TInput,TOutput> converter )	Applies a conversion function to all elements of the input array and returns the converted values as a new array.	Array.map
bool Exists<T> ( T[] array, Predicate<T> match )	Returns true if the Predicate delegate returns true for at least one array element.	Array.exists
T Find<T> ( T[] array, Predicate<T> match )	Returns the first element for which the Predicate delegate returns true.	Array.find
T[] FindAll<T> ( T[] array, Predicate<T> match )	Returns an array with all elements for which the Predicate delegate returns true.	Array.filter
T FindLast<T> ( T[] array, Predicate<T> match )	Returns the last element for which the Predicate delegate returns true.	–
void ForEach<T> ( T[] array, Action<T> action )	Executes the Action delegate for every element of the array.	Array.iter
void Resize<T> ( ref T[] array, int newSize )	Finally you can resize an array! But keep in mind that a new array of the specified size is created and the elements are copied into the new one.	–
void Sort<T> ( T[] array, Comparison<T> comparison )	Sorts the array according to the results of the supplied Comparison delegate.	Array.sort
bool TrueForAll<T> ( T[] array, Predicate<T> match )	Returns true if the Predicate delegate returns true for all array elements.	Array.for_all

As you can see, you do not have to wait for C# 3.0 to program in a (nearly) functional style: .NET 2.0 already allows it. Of course F# has many more functions to offer, but the list in C# is growing.
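
To make a few of the table entries concrete, the following sketch chains some of the static Array helpers. The class ArrayHelpersSketch and its members are hypothetical names chosen for this illustration; only the Array methods and delegate types themselves come from the framework.

```csharp
using System;

public static class ArrayHelpersSketch
{
    // A reusable predicate: true for even numbers.
    public static readonly Predicate<int> IsEven =
        delegate(int n) { return n % 2 == 0; };

    // Array.FindAll (F#: Array.filter) followed by Array.ConvertAll (F#: Array.map).
    public static int[] SquaresOfEven(int[] values)
    {
        int[] even = Array.FindAll<int>(values, IsEven);
        return Array.ConvertAll<int, int>(even, delegate(int n) { return n * n; });
    }

    // Array.Sort with a Comparison<T> that orders the elements descending.
    // The input array is cloned so that the caller's array stays untouched.
    public static int[] SortedDescending(int[] values)
    {
        int[] copy = (int[])values.Clone();
        Array.Sort<int>(copy, delegate(int x, int y) { return y.CompareTo(x); });
        return copy;
    }
}
```

The same predicate can be handed to Array.Exists, Array.TrueForAll, Array.Find or Array.FindLast, which is where the reuse of small function objects starts to pay off.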

C# Function/Delegate Call Performance
This functional style in C# with delegates and anonymous methods is nice. But how does it perform?
I did some measurements to compare how much one integer add operation (plus one integer array access) is slowed down when we take the function call overhead into account. The numbers were taken on my P4 with 3.0 GHz.
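
A measurement of this kind could be set up roughly as in the sketch below. This is not the benchmark code behind the numbers in this article; the names (CallOverheadSketch, AddFunc, SumDirect, SumViaDelegate) are assumptions for illustration, and a serious measurement would also need warm-up runs and many repetitions.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class CallOverheadSketch
{
    public delegate int AddFunc(int a, int b);

    // The work under test: one integer add. Inlining is switched off so that
    // the call overhead is really part of the measurement.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Sums the array with a direct (not inlined) call per element.
    public static int SumDirect(int[] data, out long elapsedTicks)
    {
        Stopwatch watch = Stopwatch.StartNew();
        int sum = 0;
        for (int i = 0; i < data.Length; i++)
        {
            sum = Add(sum, data[i]);
        }
        watch.Stop();
        elapsedTicks = watch.ElapsedTicks;
        return sum;
    }

    // Sums the array with one delegate invocation per element.
    public static int SumViaDelegate(int[] data, out long elapsedTicks)
    {
        AddFunc add = new AddFunc(Add);
        Stopwatch watch = Stopwatch.StartNew();
        int sum = 0;
        for (int i = 0; i < data.Length; i++)
        {
            sum = add(sum, data[i]);
        }
        watch.Stop();
        elapsedTicks = watch.ElapsedTicks;
        return sum;
    }
}
```

Comparing the two elapsedTicks values over a large array gives a first impression of the delegate overhead; the absolute numbers depend heavily on the machine and the runtime version.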

The blue part of each bar in this diagram is the time needed for our actual work (the integer add) and the red part is the associated overhead. As we can see, a delegate call is only 20% slower than a normal (not inlined) function call, which is quite impressive. This is what I would expect after reading Eric Gunnerson’s very good article about this topic. What is surprising is that there is no difference between a virtual function call and a not inlined function call. The penalty we pay is that the JIT compiler is not able to inline a virtual function call, whereas my “normal” function would have been inlined if I had not prevented it by adding the

[MethodImpl(MethodImplOptions.NoInlining)]

attribute above the function. By the way, the generated IL code for a “normal” instance method call and a virtual function call is the same: a “callvirt” instruction. Why does the C# compiler emit the virtual call opcode for these calls? Answer: runtime safety. The C# team decided to use callvirt for instance method calls because it adds a null check on the object we are calling into. If the object is null, a NullReferenceException is thrown at the call site. With a plain “call” instruction no exception would be thrown at that point, but the code would fail somewhere inside the function as soon as the (still null) this pointer is needed, e.g. to dereference a member variable.
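
The effect of this null check can be observed with a small experiment. In the sketch below (Greeter and CallvirtSketch are hypothetical names for this illustration) the method Hello never touches this, yet calling it on a null reference still fails at the call site.

```csharp
using System;

public class Greeter
{
    // Does not use 'this' at all, so the body itself could run without an instance.
    public string Hello()
    {
        return "hello";
    }
}

public static class CallvirtSketch
{
    // Returns true if calling Hello() on the given reference throws a
    // NullReferenceException, false if the call succeeds.
    public static bool CallThrows(Greeter greeter)
    {
        try
        {
            greeter.Hello();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

CallvirtSketch.CallThrows(null) reports the exception even though Hello is not virtual, which is the behavior described above.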

Caching The C# Way

A very practical example of caching is the following scenario. Suppose you have a costly function that generates an output value from some input value(s) and takes a long time to calculate or create it (e.g. a database connection, where the connection string is the input). You want to speed things up for some colleagues who open the database connection on every database server access just because it is so easy to do. To make caching happen we need a wrapper around the costly function which returns the already created database connection if it has been opened before. Since I deal with the Enterprise Library very often in my blog, I use the Caching Application Block here, of course ;-).

To cache a function of the form SqlConnection CreateConnection(string connString) you need to write only one additional line of code:

Function<string, SqlConnection> CachedConnection =
    FunctionCacher.EntlibFuncCacher<string, SqlConnection>(CreateConnection);

SqlConnection cachedConn = CachedConnection("connection String");

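using System;
using System.Collections.Generic;
using System.Text;
using Microsoft.Practices.EnterpriseLibrary.Caching;
using Microsoft.Practices.EnterpriseLibrary.Common.Configuration;

// Delegate type used by the cacher. Declared here on the assumption that it
// is not already defined elsewhere in your project.
public delegate TReturn Function<TParam, TReturn>(TParam arg);

public static class FunctionCacher
{
  public static Function<TParam, TReturn> EntlibFuncCacher<TParam, TReturn>(
      Function<TParam, TReturn> func)
  {
    CacheManagerFactory factory =
        new CacheManagerFactory(ConfigurationSourceFactory.Create());
    CacheManager cache = factory.CreateDefault();

    return delegate(TParam arg)
    {
      TReturn result = default(TReturn);

      // Note: hash codes are not guaranteed to be unique. Two different
      // arguments with the same hash code would share one cache entry.
      string key = arg.GetHashCode().ToString();

      if (cache.Contains(key))
      {
        // Already computed: return the cached value
        result = (TReturn)cache.GetData(key);
      }
      else
      {
        // Not cached yet: do the costly call and remember the result
        result = func(arg);
        cache.Add(key, result);
      }

      return result;
    };
  }
}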

The code above wraps the original delegate in a new one which first checks whether the result of the function has been computed before. If so, the cached value is returned; otherwise the costly operation is performed and its result is put into the cache and returned to the caller. This caching function was inspired by Sriram Krishnan’s very good article “Lisp is sin”. Sriram uses a simple hash table in his function, which is very fast but will sooner or later overflow your memory. The Enterprise Library Caching Application Block allows us to upgrade to a real caching solution.
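
For comparison, the simple hash table approach could look roughly like the following sketch. It is not Sriram’s code: the class DictionaryCacherSketch is a made-up name, and the framework delegate Converter<TInput,TOutput> stands in for the Function delegate used above. The argument itself serves as the key, so hash collisions are resolved by the dictionary, but nothing is ever evicted.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryCacherSketch
{
    // Wraps func so that each distinct argument is computed only once.
    // The cache grows without bound, and a null argument is not supported
    // because Dictionary does not accept null keys.
    public static Converter<TParam, TReturn> Cache<TParam, TReturn>(
        Converter<TParam, TReturn> func)
    {
        Dictionary<TParam, TReturn> cache = new Dictionary<TParam, TReturn>();
        object sync = new object();

        return delegate(TParam arg)
        {
            // Dictionary is not thread safe, so lookups and inserts are serialized.
            lock (sync)
            {
                TReturn result;
                if (!cache.TryGetValue(arg, out result))
                {
                    result = func(arg);      // costly call, done once per argument
                    cache.Add(arg, result);
                }
                return result;
            }
        };
    }
}
```

Holding the lock during the costly call keeps the sketch short, but it also serializes all callers; whether that is acceptable depends on how expensive func is.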

Caching the F# Way

Let’s see if the F# version can do it better:

#r @"<YourPathToit>\Microsoft.Practices.EnterpriseLibrary.Common.dll";;
#r @"<YourPathToit>\Microsoft.Practices.EnterpriseLibrary.Caching.dll";;
open System
open Microsoft.Practices.EnterpriseLibrary.Common.Configuration
open Microsoft.Practices.EnterpriseLibrary.Caching

let CacheFunction f =
   let source = ConfigurationSourceFactory.Create()   in // Get configuration
   let CacheFactory = new CacheManagerFactory(source) in // Create Cache Factory
   let Cache = CacheFactory.CreateDefault()           in // Get configured default cache from Entlib

   // define a new function that takes an arbitrary type as input argument
   fun (x) ->
   (  
       let strKey = sprintf "%d" (hash x) in // get (hopefully) unique hash value
       if( Cache.Contains( strKey )) then    // check if the cache does already contain it
       (
         unbox(Cache.GetData(strKey))        // unbox (static cast to return type of f(x)) and return the cached object
       )
       else                                  // otherwise calculate result of original function and store it in cache
       (
         let result = f x in                 // do lengthy calculation
         Cache.Add(strKey,result);           // store result in cache
         result                              // return calculated result
       )
   )

////////////////////////////
// Usage of CacheFunction
////////////////////////////
let TestFunc (x:string) = String.Format("New Value: {0}",x) // This is our function we want to cache
let CachedFunc = CacheFunction TestFunc                     // create cached function

The F# version is a bit shorter and contains more comments than the C# version. Apart from this it looks similar, except for the nicer anonymous function definition, which does not result in the creation of an anonymous delegate. It is educational to see what Reflector makes of our F# version when it translates it back to C# code.

CacheFunction creates the local variables as expected and passes them to the F# version of an anonymous delegate class, which does not derive from MulticastDelegate.

public static FastFunc<A, B> CacheFunction<A, B>(FastFunc<A, B> f)
{
 IConfigurationSource source1 = ConfigurationSourceFactory.Create();
 CacheManager manager1 = new CacheManagerFactory(source1).CreateDefault();
 return new File1.CacheFunction@14<A, B>(f, manager1);
}

Each F# function object receives the captured local variables through its constructor, so it can access the variables of the outer function.

public CacheFunction@14(FastFunc<A, B> f0, CacheManager Cache0)
{
 this.f0 = f0;
 this.Cache0 = Cache0;
}

The Invoke method contains the actual caching logic which we wrote inside our anonymous function.

public override B Invoke(A x)
{
 // calculate hash code of x and stringify it
 string text1 = Pervasives.sprintf<FastFunc<int, string>>((PrintfPrimitives.Format4<FastFunc<int, string>, Unit, string, string>) 
 new PrintfPrimitives.Format4<FastFunc<int, string>, Unit, string, string>("%d", (FastFunc<object, FastFunc<int, string>>) 
 new File1.strKey@16())).Invoke(Pervasives.hash<A>(x));
 if (this.Cache0.Contains(text1))
 {
 return Pervasives.unbox<B>(this.Cache0.GetData(text1)); // return cached value
 }
 B local1 = this.f0.Invoke(x); // call original function
 this.Cache0.Add(text1, (object) local1); // add it to the cache
 return local1; // return calculated result
}

What I found interesting is that the F# unbox function is nothing other than our good old cast. Things are falling into place at last.

public static A Microsoft.FSharp.MLLib.Pervasives.unbox<A>(object x)
{
 return (A) x;
}

Curried Functions F# / C#

In functional languages it is common to create, for a function f(x,y,z,…), some helper functions g(x=0,y,z,…) which call f with a fixed argument (x=0 in the case of g). This feature reminds me strongly of default function arguments in C++. Of course there is more to it, since in a functional language the parameter itself can be a function which is exchanged at runtime, e.g. to support different sorting algorithms quite easily.

In F# it is easy to define a function calc which adds its input arguments (thanks for the comments, DeeJ, I am still learning). The trick is that an argument can be anything, even another function, as long as it returns the correct type. You can try this code sample with fsi.exe, the F# interpreter.

> let calc x y = x + y;;
Now we can pass the partially applied function (calc 2), which adds 2 to its argument, to List.map:
> List.map (calc 2) [1;2;3];;
Executing this gives the desired result:
val it : int list = [3; 4; 5]
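
The same idea can be approximated in C# 2.0 with an anonymous delegate that captures the first argument. This is only a sketch: the names CurrySketch, CalcWith and AddToAll are invented for this illustration, and the framework delegate Converter<int,int> plays the role of the one-argument function.

```csharp
using System;

public static class CurrySketch
{
    public static int Calc(int x, int y)
    {
        return x + y;
    }

    // Fixes the first argument of Calc and returns a one-argument function,
    // similar to (calc 2) in the F# sample.
    public static Converter<int, int> CalcWith(int x)
    {
        return delegate(int y) { return Calc(x, y); };
    }

    // Applies CalcWith(x) to every element, similar to List.map (calc x).
    public static int[] AddToAll(int x, int[] values)
    {
        return Array.ConvertAll<int, int>(values, CalcWith(x));
    }
}
```

CurrySketch.AddToAll(2, new int[] { 1, 2, 3 }) corresponds to the F# call shown above. The C# version needs an explicit helper for each argument that is fixed, whereas F# functions support this out of the box.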

Another example of function currying is the Array.map function, which is the same as the Array.ConvertAll function. Let’s compare the two:

Expected Output:
SOME
LOWERCASE
WORDS

F# version:

let someArray = [|"some"; "lowercase"; "words"|]
let ToUpperArray(inArr) = inArr |> Array.map (fun y -> y.ToUpper())
let bigArray = ToUpperArray(someArray)
do bigArray |> Array.foreach (fun x -> printf "\n%s" x)

Lists are immutable in F# and are defined with [“aaa”; “bb”]. This notation is rather confusing for beginners, since I would expect it to define an array. Mutable arrays are defined with the [| |] notation, which allows us to use them in the old-fashioned way. Thanks to Don for pointing this out. To declare an anonymous (unnamed) function we use the fun keyword, here to supply a conversion function for the array. In the next line we call our ToUpperArray function, which executes the anonymous function on all elements of the array. After that we are ready to print the converted array to the console.

C# version:

static public string[] ToUpperArray(string[] arr)
{
  return Array.ConvertAll<string, string>(arr, delegate(string input)
  {
    return input.ToUpper();
  });
}

static public void DoArrayStuff()
{
  string[] someArray = { "some", "lowercase", "words" };
  string[] upperCaseArray = ToUpperArray(someArray);

  // print them out
  Array.ForEach<string>(upperCaseArray, delegate(string input)
  {
    Console.WriteLine("{0}", input);
  });
}

Even with the anonymous delegate syntax we have much more overhead when we need to define a new function in C#. This time the F# version wins the prize for the shorter and more concise (aka readable) code. Speed is another thing I have not measured yet, because it is not fair to compare a released product against a research language. Please note that anonymous delegates may create a new delegate instance each time the method that defines them runs, in particular when they capture local variables. If e.g. the ToUpperArray method is called very often with small arrays, the cost of creating the delegate instance must be considered. Normally it is a better idea to create the delegate the old-fashioned way, as a member variable of your class.
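
Creating the delegate once as a member could look like the sketch below. The class name UpperCaser and the field name ToUpper are assumptions for this illustration. Whether it saves anything compared to the inline version depends on the compiler, which may already cache anonymous delegates that capture nothing, so measure before relying on it.

```csharp
using System;

public static class UpperCaser
{
    // Created once when the class is initialized and reused by every call.
    private static readonly Converter<string, string> ToUpper =
        delegate(string input) { return input.ToUpper(); };

    public static string[] ToUpperArray(string[] arr)
    {
        // No delegate allocation here: the cached instance is passed along.
        return Array.ConvertAll<string, string>(arr, ToUpper);
    }
}
```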

Coroutines in C#

This extension of the traditional function is, in its most general form, perhaps a bit too much for most programmers, since it violates nearly every programming principle such as encapsulation and information hiding. C# 2.0 provides this feature through the yield keyword, although only in a very limited form, as an enumerator. Note: in C# you must write yield return xxx; because yield is only a contextual keyword. That way no pre-C# 2.0 code was broken if somebody had used yield as an identifier. In a foreach loop the following things happen:

Get the enumerator via the IEnumerable<T> interface of the input object.
Call MoveNext on the enumerator and, if it returns true, store the current value of T in a local variable.
The code inside the foreach loop can now use the populated local variable. This step and the previous one repeat until MoveNext returns false.
Call Dispose on the enumerator.

The following sample illustrates this behavior:
Expected Output:
Int: 1
Int: 2
Int: 3

public IEnumerable<int> EnumInt()
{
  yield return 1;
  yield return 2;
  yield return 3;
}

public void IterateWithForEach()
{
  foreach (int i in EnumInt())
  {
    Console.WriteLine("Int: {0}", i);
  }
}

public void IterateByHand()
{
  IEnumerator<int> iterator = EnumInt().GetEnumerator();
  try
  {
    while (iterator.MoveNext())
    {
      int i = iterator.Current;
      Console.WriteLine("Int: {0}", i);
    }
  }
  finally
  {
    // foreach disposes the enumerator even if the loop body throws
    iterator.Dispose();
  }
}

The yield return statement is executed n times, and each time the function is re-entered in the same state in which it was left at the previous yield return statement. This powerful feature allows you to write gotos in a much nicer way ;-). You can do anything you like in your function. The most useful things I can imagine are

Smart collections where you supply helper enumerators that enumerate e.g. in alphabetical order plus a filter over your string array.
Start asynchronous operations on each element of a collection and return the delegate that will finish its work later.
Usage in algorithms
…
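
The first idea in the list, a helper enumerator that sorts and filters, might be sketched as follows. SmartEnumerators and SortedWithPrefix are hypothetical names; the sketch sorts a copy of the array so the caller’s data stays untouched.

```csharp
using System;
using System.Collections.Generic;

public static class SmartEnumerators
{
    // Enumerates, in alphabetical (ordinal) order, the words that start with prefix.
    // Nothing is sorted or filtered until the caller starts to enumerate.
    public static IEnumerable<string> SortedWithPrefix(string[] words, string prefix)
    {
        string[] sorted = (string[])words.Clone();
        Array.Sort<string>(sorted, StringComparer.Ordinal);

        foreach (string word in sorted)
        {
            if (word.StartsWith(prefix, StringComparison.Ordinal))
            {
                yield return word;  // hand out one match, resume here on the next MoveNext
            }
        }
    }
}
```

A consumer simply writes foreach (string w in SmartEnumerators.SortedWithPrefix(words, "a")) and never sees the sorting or filtering logic.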

Coroutines C# Example

Google has created a small but powerful web service called Google API where you can query for search results or check phrases for spelling errors. This is an excellent opportunity to put our new knowledge about coroutines to work. The scenario is, for example, a word processor which underlines misspelled words in red. Since spell checking is expensive we want to do it asynchronously, so the user can continue working without any interruption.

Put an array of phrases into a spell checker.
Get the corrected phrases back, not necessarily in the order they were given.

To achieve the “not necessarily in the order they were given” part we use asynchronous delegate invocation to call the Google web service for each phrase and collect the responses as they arrive. Each corrected phrase is put, along with the original phrase, into a list that can be queried by an external caller in a synchronous way. We block only until a response from Google has been collected and then return it immediately.

The consumer of our SpellChecker can call us synchronously:

public void Spellchecker()
{
  string[] phrases = { "seperate pece", "no hype", "micosoft" };

  foreach (KeyValuePair<string, string> result in GetCorrectedPhrases(phrases))
  {
    Console.WriteLine("Phrase: {0} becomes {1}", result.Key, result.Value);
  }
}

Expected Output:
Phrase: no hype becomes no hype
Phrase: seperate pece becomes separate peace
Phrase: micosoft becomes microsoft

The infrastructure behind this real-world example is a little longer:

GoogleSearchService myGoogle =
    new GoogleSearchService();  // .NET powered by Google ;-)

public KeyValuePair<string, string> GetCorrectSpelling(string phrase)
{
  // Do the Google spell check synchronously
  string newphrase = myGoogle.doSpellingSuggestion(licenseKey, phrase);
  if (String.IsNullOrEmpty(newphrase))
    newphrase = phrase;  // no suggestion: keep the original phrase

  // Key is the original phrase, Value is the googled phrase
  return new KeyValuePair<string, string>(phrase, newphrase);
}

delegate KeyValuePair<string, string> SpellingCheck(string phrase);

IEnumerable<KeyValuePair<string, string>> GetCorrectedPhrases(string[] phrases)
{
  SpellingCheck checker = GetCorrectSpelling;  // Create delegate instance
  int PendingSpells = 0;  // +1 for each request, -1 for each completed request

  // Results are collected as pairs where key is the old phrase and value the
  // checked one
  List<KeyValuePair<string, string>> results =
      new List<KeyValuePair<string, string>>();

  // Start spell checking for all phrases at once -> very scalable
  foreach (string phrase in phrases)
  {
    System.Threading.Interlocked.Increment(ref PendingSpells);

    // Invoke spell checker delegate asynchronously
    checker.BeginInvoke(
        phrase,
        delegate(IAsyncResult ar)  // put callback into our function too
        {
          // Retrieve the delegate
          SpellingCheck caller = (SpellingCheck)ar.AsyncState;
          try
          {
            // Call EndInvoke to retrieve the result
            KeyValuePair<string, string> checkedPair = caller.EndInvoke(ar);
            lock (checker)
            {
              results.Add(checkedPair);
            }
          }
          finally
          {
            // Count the request as completed even if EndInvoke throws
            System.Threading.Interlocked.Decrement(ref PendingSpells);
          }
        },
        checker);
  }

  // Collect data from the asynchronously called delegates in the order they finish
  int pending;
  do
  {
    // Read the counter first: a callback adds its result before it decrements
    // the counter, so when pending is 0 the snapshot below is complete
    pending = System.Threading.Interlocked.CompareExchange(ref PendingSpells, 0, 0);

    // The callbacks modify the collection from other threads, so we must
    // synchronize here
    KeyValuePair<string, string>[] finished;
    lock (checker)
    {
      finished = results.ToArray();
      results.Clear();  // remove the results we are about to return
    }

    // Yield outside the lock so that the callbacks are not blocked by the consumer
    foreach (KeyValuePair<string, string> result in finished)
    {
      yield return result;  // return available result
    }

    if (pending != 0)
      System.Threading.Thread.Sleep(50);  // wait until the next phrase comes from Google
  } while (pending != 0);
}

This little example shows the true power of anonymous delegates in combination with asynchronous I/O, and the resulting scalable solution is the kind that can be found in true enterprise applications on the .NET platform. I hope Microsoft will forgive me for using the Google API web service for this demo ;-).

Conclusions

Many concepts that originated in functional programming languages are already alive in C# 2.0, which enables a feature-rich way of programming. F# has some interesting ideas (type inference, increased thread safety, …) which will hopefully soon be available to all C# developers. Not all concepts of functional languages should be explored by Mort and Elvis, since most of them are fairly complex to understand: functional languages are very picky (just like Unix) about who becomes their friend and who does not. If Einstein writes an F# program, then only a true Einstein can maintain it. When the outsourced Mort and Elvis change his code, I am sure they will screw it up. My first steps with F# (I am still a beginner) were a little frustrating, since the F# compiler did not give me useful error messages. But once I was more into the syntax it became easier. I hope you enjoyed this C#-biased introduction to the world of functional programming languages.

This article is part of the GWB Archives. Original Author: Alois Kraus
