Alois Kraus

blog

  Home  |   Contact  |   Syndication    |   Login
  106 Posts | 8 Stories | 294 Comments | 162 Trackbacks

News



Article Categories

Archives

Post Categories

Image Galleries

Programming

One of the action points for Enterprise Library 4 is to improve the performance of the TextFormatter. I like performance problems and found some interesting results I would like to share here. TextFormatter does take a template where the placeholders are expanded for every log message. The template does look like this

Timestamp: {timestamp}
Message: {message}
Category: {category}
Priority: {priority}
EventId: {eventid}
Severity: {severity}
Title:{title}
Machine: {machine}
Application Domain: {appDomain}
Process Id: {processId}
Process Name: {processName}
Win32 Thread Id: {win32ThreadId}
Thread Name: {threadName}
Extended Properties: {dictionary({key} - {value}
)}

TextFormatter does use StringBuilder to create from the template a mutable string and does then call StringBuilder.Replace for each token it could contain. It does basically a String.IndexOf to find if something has to be replaced and then allocates a new string which contains enough space for the token to be inserted. This process is repeated until all gaps are filled. This results in searches for tokens which are not present in the current template and some memory allocation/copy overhead because the string to replace will never have the same size.

But the most expensive things are the TokenFunctions which do expand e.g. the Extended Properties collection at the end of the example template. A quick look into the source shows where the problem is:

public virtual void Format(StringBuilder messageBuilder, LogEntry log)

    {

       ...

            string messageString = messageBuilder.ToString();

       ...

 

The performance optimization to use StringBuilder in the Formatter has cornered the TokenFunction formatter because it cannot to a simple replace of a fixed string and needs to parse the template. But StringBuilder gives you no access to the not yet constructed string and you need to use therefore this solution.

Armed with this knowledge I came up with an alternate approach to format a text template which does not suffer from the deficiencies of StringBuilder.Replace. The trick is to simply use StringBuilder.Append. To be able to do this I split the template apart in its constant (marked in orange) parts and use for each dynamic content a retriever function. The tuple (string, callback) is put into an object (MicroFormatter) from which we can create a list that allows us to build up the expanded template step by step.  Registering template fill callbacks does eliminate all string searches and (nearly) all buffer reallocations. The cool thing is that this is also possible for the TokenFunctions which can now directly format the parsed template.

public class MicroFormatter

    {

        /// <summary>

        /// Constant prefix which is part of the message template

        /// </summary>

        public string Prefix { get; set; }

 

        /// <summary>

        /// Gets the dynamic content of the LogEntry

        /// </summary>

        public Func<LogEntry, string> Retriever { get; set; }

 

        /// <summary>

        /// TokenFunction formatter wrapper function which does need a string template as parameter which is then formatted.

        /// The formatted string is then returned

        /// </summary>

        public Func<LogEntry, string, string> TokenFuncRetriever { get; set; }

 

        /// <summary>

        /// Template which is expanded by the token e.g. "dictionary({key} - {value}" for the DictionaryFormatter

        /// </summary>

        public string TokenFuncTemplate { get; set; }

 

        public MicroFormatter()

        {

        }

 

        public MicroFormatter(string prefix, string tokenFuncTemplate, Func<LogEntry, string, string> tokenFuncRetriver)

        {

            Prefix = prefix;

            TokenFuncTemplate = tokenFuncTemplate;

            TokenFuncRetriever = tokenFuncRetriver;

        }

 

        public MicroFormatter(string prefix, Func<LogEntry, string> retriever)

        {

            Prefix = prefix;

            Retriever = retriever;

        }

 

        /// <summary>

        /// Appends to a StringBuilder instance the static and dynamic contents of the parsed template

        /// </summary>

 

           public void Format(LogEntry entry, StringBuilder template)

 

        {

            if (Prefix != null) // Append constant template part

            {

                template.Append(Prefix);

            }

 

            if (Retriever != null) // Add dynamic content of LogEntry

            {

                template.Append(Retriever(entry));

            }

            else if (TokenFuncRetriever != null) // Or add dynamic content of TokenFunction

            {

                template.Append(TokenFuncRetriever(entry, TokenFuncTemplate));

            }

        }

    }

 

This allows us to generate a easy Format Method:

    // Combined list of microformatters which allow an easy StringBuilder.Append instead of

    // a much slower StringBuilder.Replace which does first search (many times pointless) and

    // then replaces the string which needs to reallocate and copy the buffer

    List<MicroFormatter> myFormatters = new List<MicroFormatter>();

 

    public string Format(LogEntry entry)

    {

        // allocate string builder can be dynamically sized

        StringBuilder sb = new StringBuilder(500);

 

        // Call formatters in sequence they were entered to build the spitted

        // template piece by piece together again.

        foreach (MicroFormatter fmt in myFormatters)

        {

           fmt.Format(entry, sb);

        }

 

        return sb.ToString();

    }

 

The red lines are the important changes which give us much better performance. This simple solution does improve the formatting speed over 14 times! That should be enough for everyone ;-) Another costly part is the creation of the LogEntry itself which does call a private method in its default ctor (CollectIntrinsicProperties) which can cost quite some time. It would be nice if the P&P team would make this function virtual so one has a chance to optimize things a bit. This would be beneficial if somebody does want to filter LogEntry derived objects which can be filtered out without needing any of the other properties. That would allow to defer the costly data collection when it is really ensured that the entry will be logged.

 

Performance

All Performance Tests were performed with an Intel Core Duo 6600 @ 2.40 GHz with 3 GB RAM, EntLib 3.1, VS2008 RTM and the message template which is shown at the start of the article.

 

Test Case (executed 100K times) Execution Frequency in 1/s Speed Gain (x factor)
Create LogEntry    22 000 -
Format LogEntry TextFormatter     5 300 1
Format LogEntry SmartFormatter   75 000 14

 

More Code

The complete code of my SmartFormatter is shown below:

namespace Logging

{

    /// <summary>

    /// Really fast formatter with equivalent features as TextFormatter of the Enterprise Library

    /// </summary>

    public class SmartFormatter : ILogFormatter

    {

        protected const char TokenStartChar = '{';

 

        protected const string timeStampToken = "{timestamp}";

        protected const string messageToken = "{message}";

        protected const string categoryToken = "{category}";

        protected const string priorityToken = "{priority}";

        protected const string eventIdToken = "{eventid}";

        protected const string severityToken = "{severity}";

        protected const string titleToken = "{title}";

        protected const string errorMessagesToke = "{errorMessages}";

 

        protected const string machineToken = "{machine}";

        protected const string appDomainNameToken = "{appDomain}";

        protected const string processIdToken = "{processId}";

        protected const string processNameToken = "{processName}";

        protected const string threadNameToken = "{threadName}";

        protected const string win32ThreadIdToken = "{win32ThreadId}";

        protected const string activityidToken = "{activity}";

 

        protected const string NewLineToken = "{newline}";

        protected const string TabToken = "{tab}";

 

        protected const string DictionaryStartToken = "{dictionary(";

        protected const string KeyValueStartToken = "{keyvalue(";

        protected const string TimeStampStartToken = "{timestamp(";

        protected const string ReflectedPropertyStartToken = "{property(";

        protected const string EndToken = ")}";

 

        /// <summary>

        /// Key is the Token template e.g. "{timestamp}" and value is a callback function that retrieves the requested string from

        /// a LogEntry object.

        /// </summary>

        protected internal Dictionary<string, Func<LogEntry, string>> myTokenToFunc = new Dictionary<string, Func<LogEntry, string>>();

 

        /// <summary>

        /// Key is the Token Function start e.g. "{dictionary(" and value is a callback function that formats from a

        /// configured template the contents of a LogEntry object.

        /// </summary>

        protected internal Dictionary<string, Func<LogEntry, string, string>> myTokenFormatterToFunc = new Dictionary<string, Func<LogEntry, string, string>>();

 

        // Combined list of MicroFormatters which allow an easy StringBuilder.Append instead of

        // a much slower StringBuilder.Replace which does first search (many times pointless) and

        // then replaces the string which needs to reallocate and copy the buffer

        List<MicroFormatter> myFormatters = new List<MicroFormatter>();

 

        /// <summary>

        /// For brevity we use the C# 3.0 lambda syntax

        /// </summary>

        protected internal virtual void AddWellknownTokens()

        {

            myTokenToFunc.Add(timeStampToken, (LogEntry ent) => { return ent.TimeStampString; });

            myTokenToFunc.Add(messageToken, (LogEntry ent) => { return ent.Message; });

            myTokenToFunc.Add(categoryToken, FormatCategoriesCollection);

            myTokenToFunc.Add(priorityToken, (LogEntry ent) => { return ent.Priority.ToString(); });

            myTokenToFunc.Add(eventIdToken, (LogEntry ent) => { return ent.EventId.ToString(); });

            myTokenToFunc.Add(severityToken, (LogEntry ent) => { return ent.Severity.ToString(); });

            myTokenToFunc.Add(titleToken, (LogEntry ent) => { return ent.Title; });

            myTokenToFunc.Add(errorMessagesToke, (LogEntry ent) => { return ent.ErrorMessages; });

            myTokenToFunc.Add(machineToken, (LogEntry ent) => { return ent.MachineName; });

            myTokenToFunc.Add(appDomainNameToken, (LogEntry ent) => { return ent.AppDomainName; });

            myTokenToFunc.Add(processIdToken, (LogEntry ent) => { return ent.ProcessId; });

            myTokenToFunc.Add(processNameToken, (LogEntry ent) => { return ent.ProcessName; });

            myTokenToFunc.Add(threadNameToken, (LogEntry ent) => { return ent.ManagedThreadName; });

            myTokenToFunc.Add(win32ThreadIdToken, (LogEntry ent) => { return ent.Win32ThreadId; });

            myTokenToFunc.Add(activityidToken, (LogEntry ent) => { return ent.ActivityId.ToString(); });

            myTokenToFunc.Add(NewLineToken, (LogEntry ent) => { return Environment.NewLine; });

            myTokenToFunc.Add(TabToken, (LogEntry ent) => {return "\t"; });

 

            // register token functions

            myTokenFormatterToFunc.Add(DictionaryStartToken, (LogEntry ent, string template) =>

            {

                DictionaryToken token = new DictionaryToken();

                return token.FormatToken(template, ent);

            });

 

            myTokenFormatterToFunc.Add(KeyValueStartToken, (LogEntry ent, string template) =>

                {

                    KeyValueToken token = new KeyValueToken();

                    return token.FormatToken(template, ent);

                });

 

            myTokenFormatterToFunc.Add(TimeStampStartToken, (LogEntry ent, string template) =>

                {

                    TimeStampToken token = new TimeStampToken();

                    return token.FormatToken(template, ent);

                });

 

            myTokenFormatterToFunc.Add(ReflectedPropertyStartToken, (LogEntry ent, string template) =>

                {

                    ReflectedPropertyToken token = new ReflectedPropertyToken();

                    return token.FormatToken(template, ent);

                });

        }

 

        public SmartFormatter()

        {}

 

        public SmartFormatter(string template)

        {

            AddWellknownTokens();

            BuildFormatterList(template);

        }

 

        public string Format(LogEntry entry)

        {

            // allocate string builder can be dynamically sized

            StringBuilder sb = new StringBuilder(500);

 

            // Call formatters in sequence they were entered to build the splitted

            // template piece by piece together again.

            foreach (MicroFormatter fmt in myFormatters)

            {

                fmt.Format(entry, sb);

            }

 

            return sb.ToString();

        }

 

        /// <summary>

        /// Returns the template in between the parentheses for a token function.

        /// Expecting tokens in this format: {keyvalue(myKey1)}.

        /// </summary>

        /// <param name="startPos">Start index to search for the next token function.</param>

        /// <param name="message">Message template containing tokens.</param>

        /// <returns>Inner template of the function.</returns>

        protected virtual string GetInnerTemplate(string tokenFuncStart, int startPos, string message)

        {

            int tokenStartPos = message.IndexOf(tokenFuncStart, startPos) + tokenFuncStart.Length;

            int endPos = message.IndexOf(EndToken, tokenStartPos);

            return message.Substring(tokenStartPos, endPos - tokenStartPos);

        }

 

        /// <summary>

        /// Parse the template and create the list of microformatters. This way we get an ordered list

        /// of things we really have to do instead of trying to blindly format tokens in a template which

        /// may not be present

        /// </summary>

        /// <param name="template"></param>

        internal virtual protected void BuildFormatterList(string template)

        {

            MicroFormatter fmt = new MicroFormatter();

            List<char> curNonTokenString = new List<char>();

            bool isTokenChar = false;

 

            // Go through template and build list of Microformatters

            for(int i = 0; i<template.Length; i++)

            {

                isTokenChar = false;

                if (template[i] == TokenStartChar)

                {

                    // Transform all template variables

                    foreach (string token in myTokenToFunc.Keys)

                    {

                        if (template.IndexOf(token, i) == i)

                        {

                            i += token.Length-1;

                            MicroFormatter formatter = new MicroFormatter();

                            formatter.Prefix = new string(curNonTokenString.ToArray());

                            // Get for this token the callback function which retrieves

                            // from a LogEntry object the resulting dynamic string

                            formatter.Retriever = myTokenToFunc[token];

                            myFormatters.Add(formatter);

                            isTokenChar = true;

                            curNonTokenString.Clear();

                            break;

                        }

                    }

 

                    // Transform all Token functions in the message template

                    foreach (string tokenStart in myTokenFormatterToFunc.Keys)

                    {

                        if (template.IndexOf(tokenStart, i) == i)

                        {

                            string tokenTemplate = GetInnerTemplate(tokenStart,i, template);

                            i += tokenStart.Length + tokenTemplate.Length + EndToken.Length - 1;

 

                            // Create formatter for this token function and store the token function

                            // template inside the formatter

                            MicroFormatter formatter = new MicroFormatter(new string(curNonTokenString.ToArray()),

                                                                          tokenTemplate,

                                                                          myTokenFormatterToFunc[tokenStart]);

                            myFormatters.Add(formatter);

                            isTokenChar = true;

                            curNonTokenString.Clear();

                            break;

                        }

                    }

                }

 

                // if the current template character does not belong to a token variable or function

                // store it later as constant template part in the next MicroFormatter instance

                if (!isTokenChar)

                {

                    curNonTokenString.Add(template[i]);

                }

            }

 

            // the rest of the template is constant

            if (curNonTokenString.Count != 0)

            {

                myFormatters.Add( new MicroFormatter(new string(curNonTokenString.ToArray()), null) );

            }

        }

 

        public static string FormatCategoriesCollection(LogEntry ent)

        {

            StringBuilder categoriesListBuilder = new StringBuilder();

            int i = 0;

            foreach (String category in ent.Categories)

            {

                categoriesListBuilder.Append(category);

                if (++i < ent.Categories.Count)

                {

                    categoriesListBuilder.Append(", ");

                }

            }

            return categoriesListBuilder.ToString();

        }

   }

}

posted on Monday, January 14, 2008 12:18 PM

Feedback

# re: Really Fast Formatting with Enterprise Library 2/29/2008 3:28 AM Bill Menees
Great work. I hope these optimizations make it into EL4. I'd also really like to see LogEntry.CollectIntrinsicProperties optimized and made virtual.

# re: Really Fast Formatting with Enterprise Library 5/5/2008 7:42 PM Laurent
Hello Alois,
great post,
I am trying to implement my ouwn formattter as well, I am suprised that your formatter class does not use any attribute like:
[ConfigurationElementType(typeof(CustomFormatterData))]
How do you then reference the formatter in the config file?

Many thanks for your answer.

Laurent

# re: Really Fast Formatting with Enterprise Library 5/6/2008 12:50 AM Alois Kraus
Hi Laurent,

the answer is easy. I did not include the Configuration classes because the pasted code was already quite big. I will post the source code at the CodePlex EntlibContrib project for real world usage.


Yours,
Alois Kraus


# re: Really Fast Formatting with Enterprise Library 8/27/2008 9:25 AM Roman
Hey Alois,
I was wondering if u had a chance to submit this code to EntlibContrib. I am about to use EntLib 3.1 for 2.0 project and would love to utilize your brilliant changes. Let me know what's your policy for reuse of this code.

Thanks.

# re: Really Fast Formatting with Enterprise Library 10/4/2011 2:54 AM Tar
errorMessagesToke

that is all :)

Post A Comment
Title:
Name:
Email:
Comment:
Verification: