September 12, 2009

"File not found" CryptographicException

The Data Protection API (in Windows 2000 or later) provides methods to securely store secret information (e.g., a cached password) on a local computer. Only the logged-in user can decrypt the protected data (if they also have the key that was used to protect it). In .NET, these APIs are exposed through the ProtectedData class.

On one of our XP test systems, a call to ProtectedData.Protect unexpectedly failed with a CryptographicException that had a very puzzling exception message:

System.Security.Cryptography.CryptographicException: The system cannot
    find the file specified. 
  at System.Security.Cryptography.ProtectedData.Protect(Byte[] userData,
    Byte[] optionalEntropy, DataProtectionScope scope)

There is no information provided on which file is required or why it's missing, or even why protecting data (in memory) requires file system access in the first place. Searching the internet for the error message turned up just a few other programmers who were also experiencing it, but no solutions.

Since the ProtectedData class is just a thin wrapper around the Win32 CryptProtectData function, I searched for the underlying Win32 error code, and found the answer: the crypto methods read the HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer\User Shell Folders registry key; if there are missing values, the methods will fail. (This makes me think that the methods are incorrectly reading the "User Shell Folders" key, instead of calling SHGetFolderPath, as Raymond recommends, but I haven't been able to test that yet.)

Posted by Bradley Grainger at 1:56 PM

August 19, 2009

Entity Framework Performance Tip - Be a Minimalist

Only get the data you need! This applies not only to Entity Framework query performance, but to just about every situation where code interacts with data. It is a simple rule to follow, yet I see it broken all the time. Breaking this rule is one of the top three reasons I've seen for slow Entity Framework query performance. I may discuss the other two in later posts ...

Bad

private static IEnumerable<string> GetNames()
{
    using (var dc = new MyDataContext())
    {
        // This is bad news!
        // This will return a lot of columns
        //  that won't even be used.
        var userNames = from u in dc.UserSet
                        select u;

        // SQL Server Profiler will show you that all the
        //  columns are fetched when you call "ToList()".
        return userNames.ToList().Select(u => u.Name);
    }
}


Good

private static IEnumerable<string> GetNames()
{
    using (var dc = new MyDataContext())
    {
        // Just get the Name.
        // Don't waste database and network resources!
        var userNames = from u in dc.UserSet
                        select u.Name;

        return userNames.ToList();
    }
}
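
When more than one column is needed, the same idea extends to projecting into a small type. The following is only a sketch, not code from the project above: User, NameAndEmail, and GetNamesAndEmails are hypothetical names, and an in-memory list passed through AsQueryable() stands in for dc.UserSet so that the example runs without a database. Against a real Entity Framework context, a projection like this should mean that only the selected columns are requested, which SQL Server Profiler can confirm.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with more columns than the caller needs.
public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public string Biography { get; set; }
}

// Hypothetical result type carrying only the two values the caller uses.
public class NameAndEmail
{
    public string Name { get; set; }
    public string Email { get; set; }
}

public static class MinimalQueryExample
{
    public static List<NameAndEmail> GetNamesAndEmails(IQueryable<User> users)
    {
        // Project before materializing, so that only Name and Email are requested.
        var query = from u in users
                    select new NameAndEmail { Name = u.Name, Email = u.Email };

        return query.ToList();
    }
}
```

The important detail is that the projection happens before ToList(); calling ToList() first and projecting afterwards would repeat the mistake in the "Bad" example.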

Posted by Bill Simpkins at 8:11 PM

June 5, 2009

Using If-Modified-Since in HTTP Requests

Conditionally requesting the download of a web page only if it has been modified after a given time seems like it should be as simple as setting the IfModifiedSince property and making the request:

HttpWebRequest request = (HttpWebRequest) WebRequest.Create(@"http://code.logos.com/blog/");

request.IfModifiedSince = new DateTime(2009, 6, 3);

using (HttpWebResponse response = (HttpWebResponse) request.GetResponse())

{

    if (response.StatusCode == HttpStatusCode.NotModified)

    {

        // page wasn't modified; use cached version

    }

}

But of course it’s not that simple (as some others have noticed).

The designers of HttpWebRequest decided that some particular HTTP status codes would cause a WebException to be thrown. (As far as I can tell, this list is undocumented, but 304 “Not Modified” is one of them.) This is a vexing exception, because the situation is hardly exceptional. In fact, because it can only happen if IfModifiedSince is explicitly set (or if request.Headers were modified), one could argue that it’s quite expected and intentional. To avoid duplicating logic in the try block (for handling 200 “OK”) and in the catch block (for handling 304 “Not Modified”), I wrote a utility method that swallows any WebException thrown due to a ProtocolError (e.g., an “invalid” HTTP status code):

public static class HttpWebRequestUtility

{

    /// <summary>

    /// Gets the <see cref="HttpWebResponse"/> from an Internet resource.

    /// </summary>

    /// <param name="request">The request.</param>

    /// <returns>A <see cref="HttpWebResponse"/> that contains the response from the Internet resource.</returns>

    /// <remarks>This method does not throw a <see cref="WebException"/> for "error" HTTP status codes; the caller should

    /// check the <see cref="HttpWebResponse.StatusCode"/> property to determine how to handle the response.</remarks>

    public static HttpWebResponse GetHttpResponse(this HttpWebRequest request)

    {

        try

        {

            return (HttpWebResponse) request.GetResponse();

        }

        catch (WebException ex)

        {

            // only handle protocol errors that have valid responses

            if (ex.Response == null || ex.Status != WebExceptionStatus.ProtocolError)

                throw;

 

            return (HttpWebResponse) ex.Response;

        }

    }

}

The code to consume this reads very similarly to the first snippet in this post; you just have to remember that normal errors (e.g., 404 “Not Found”) are reported through a valid HttpWebResponse, so its StatusCode property must be checked before acting on the response.
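
As a rough illustration of the consuming side, the sketch below separates the decision ("what does this status code mean for my cache?") from the request itself. CacheDecision, ConditionalGetExample, Decide, and Fetch are names made up for this example, and GetResponseNoThrow simply repeats the catch logic of GetHttpResponse so that the block compiles on its own. Treating every status other than 200 and 304 as an error is an assumption; a real client may want to handle redirects or other codes differently.

```csharp
using System;
using System.Net;

// Hypothetical result of a conditional GET.
public enum CacheDecision { UseCachedCopy, UseNewContent, ReportError }

public static class ConditionalGetExample
{
    // Maps a status code to what the caller should do next.
    public static CacheDecision Decide(HttpStatusCode status)
    {
        if (status == HttpStatusCode.NotModified)
            return CacheDecision.UseCachedCopy;
        if (status == HttpStatusCode.OK)
            return CacheDecision.UseNewContent;

        // e.g., 404 "Not Found" arrives here as a normal response, not an exception
        return CacheDecision.ReportError;
    }

    // Issues the conditional request and classifies the response.
    public static CacheDecision Fetch(string url, DateTime lastFetched)
    {
        HttpWebRequest request = (HttpWebRequest) WebRequest.Create(url);
        request.IfModifiedSince = lastFetched;

        using (HttpWebResponse response = GetResponseNoThrow(request))
            return Decide(response.StatusCode);
    }

    // Same logic as GetHttpResponse in the post, repeated to keep this sketch self-contained.
    static HttpWebResponse GetResponseNoThrow(HttpWebRequest request)
    {
        try
        {
            return (HttpWebResponse) request.GetResponse();
        }
        catch (WebException ex)
        {
            // only handle protocol errors that have valid responses
            if (ex.Response == null || ex.Status != WebExceptionStatus.ProtocolError)
                throw;

            return (HttpWebResponse) ex.Response;
        }
    }
}
```

Network failures (DNS errors, timeouts, and so on) still surface as WebException, which seems appropriate because there is no response to inspect in those cases.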

Posted by Bradley Grainger at 11:36 AM

June 1, 2009

Enumerable.Sum never returns null

I wrote the following code recently, and was surprised when ReSharper warned me that the condition is always true:

IEnumerable<int?> values = // get some values

int? sum = values.Sum();

if (sum.HasValue) { /* this code is always executed */ }

It’s even more surprising when you consider the following difference:

int?[] values = new int?[] { 1, null };

int? sum1 = values.Sum(); // returns 1

int? sum2 = values[0] + values[1]; // returns null

Here, sum1 is 1, but sum2 is null.

Since Sum<int?> never returns null (not even for an empty sequence, or a sequence containing all nulls), it’s odd that its return type is int?, implying that null is a possible return value. Anders explains that this return type is to keep the pattern of T Sum<T>(IEnumerable<T>) for nullable types.

But what if you want Sum to return null if the sequence contains a null? This is easy to simulate using Aggregate, as C# already propagates nulls properly when using the addition operator:

public static class EnumerableUtility

{

    public static int? NullableSum(this IEnumerable<int?> values)

    {

        return values.Aggregate((int?) 0, (sum, value) => sum + value);

    }

}

The initial value of 0 is specified to force the sum of an empty list to be zero; you could change it to default(int?) to make an empty list sum to null. A possible optimisation would be to rewrite it with a foreach loop that returns null as soon as the first null in the sequence is found.

Update: My very smart coworker points out that changing the initial aggregate value to default(int?) makes the function return null for any input. (This is probably a good reason to include a full unit test suite with every blog post…) A custom enumerator (or test of values.Any() first) could be used if returning null as the sum of an empty sequence is desired.
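
The foreach version mentioned above might look like the following sketch. NullableSumExample and its nullIfEmpty parameter are names invented for this illustration. The checked addition is also an assumption: it mirrors Enumerable.Sum, which throws on overflow, whereas the Aggregate version uses whatever overflow checking the compiler is configured for.

```csharp
using System.Collections.Generic;

public static class NullableSumExample
{
    // Returns null as soon as a null element is seen; otherwise returns the sum.
    // An empty sequence sums to 0, or to null when nullIfEmpty is true.
    public static int? NullableSum(IEnumerable<int?> values, bool nullIfEmpty)
    {
        int sum = 0;
        bool sawAnyValue = false;

        foreach (int? value in values)
        {
            // stop enumerating at the first null
            if (!value.HasValue)
                return null;

            sum = checked(sum + value.Value);
            sawAnyValue = true;
        }

        if (!sawAnyValue && nullIfEmpty)
            return null;

        return sum;
    }
}
```

Tracking whether any element was seen avoids a separate call to values.Any(), which would enumerate the start of the sequence a second time.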

Posted by Bradley Grainger at 4:55 PM

May 6, 2009

WrappingStream Implementation

In a previous post, I mentioned that a certain problem could be solved by creating “an implementation of Stream that wraps another stream”. “Anonymous” asked recently, “Can you send me the code of your straightforward solution that is wrapper the class MemoryStream?”. Yes, but first I want to give another example of where such a class is useful.

The first time I used BinaryReader, I wrote code similar to the following:

public void DoSomething(Stream stream)

{

    // read a simple byte

    int value = stream.ReadByte();

 

    // simplify more complex reading by using a BinaryReader

    using (BinaryReader reader = new BinaryReader(stream))

    {

        value = reader.ReadInt32();

        value = reader.ReadInt32();

    }

 

    // back to simple reading

    value = stream.ReadByte();

}

Experienced users of BinaryReader will see the problem here: the BinaryReader class takes ownership of the Stream with which it’s constructed, and disposes it in Dispose. (This somewhat-important detail is only briefly mentioned in the documentation for the Close method.) Thus the last call to ReadByte fails; we have also closed the Stream even though our caller probably expects it to still be open.

While the easy answer is to simply not Close/Dispose the BinaryReader (it holds no unmanaged resources, and so nothing “bad” happens if it’s not disposed), I feel guilty every time I don’t dispose an IDisposable object. Moreover, tools like the .NET Memory Profiler have a profiling mode that lists all non-disposed IDisposable objects (to help find resource leaks); leaving this BinaryReader undisposed generates a false positive and makes it harder to find the real leaks.

The WrappingStream class can be of use in this scenario by providing an implementation of Stream that the BinaryReader can own and Dispose without affecting the real stream:

using (WrappingStream wrapper = new WrappingStream(stream))

using (BinaryReader reader = new BinaryReader(wrapper))

{

    value = reader.ReadInt32();

    value = reader.ReadInt32();

}

 

// 'stream' is still valid here

Here, at long last, is the code for WrappingStream:

    /// <summary>

    /// A <see cref="Stream"/> that wraps another stream. The major feature of <see cref="WrappingStream"/> is that it does not dispose the

    /// underlying stream when it is disposed; this is useful when using classes such as <see cref="BinaryReader"/> and

    /// <see cref="System.Security.Cryptography.CryptoStream"/> that take ownership of the stream passed to their constructors.

    /// </summary>

    public class WrappingStream : Stream

    {

        /// <summary>

        /// Initializes a new instance of the <see cref="WrappingStream"/> class.

        /// </summary>

        /// <param name="streamBase">The wrapped stream.</param>

        public WrappingStream(Stream streamBase)

        {

            // check parameters

            if (streamBase == null)

                throw new ArgumentNullException("streamBase");

 

            m_streamBase = streamBase;

        }

 

        /// <summary>

        /// Gets a value indicating whether the current stream supports reading.

        /// </summary>

        /// <returns><c>true</c> if the stream supports reading; otherwise, <c>false</c>.</returns>

        public override bool CanRead

        {

            get { return m_streamBase == null ? false : m_streamBase.CanRead; }

        }

 

        /// <summary>

        /// Gets a value indicating whether the current stream supports seeking.

        /// </summary>

        /// <returns><c>true</c> if the stream supports seeking; otherwise, <c>false</c>.</returns>

        public override bool CanSeek

        {

            get { return m_streamBase == null ? false : m_streamBase.CanSeek; }

        }

 

        /// <summary>

        /// Gets a value indicating whether the current stream supports writing.

        /// </summary>

        /// <returns><c>true</c> if the stream supports writing; otherwise, <c>false</c>.</returns>

        public override bool CanWrite

        {

            get { return m_streamBase == null ? false : m_streamBase.CanWrite; }

        }

 

        /// <summary>

        /// Gets the length in bytes of the stream.

        /// </summary>

        public override long Length

        {

            get { ThrowIfDisposed();  return m_streamBase.Length; }

        }

 

        /// <summary>

        /// Gets or sets the position within the current stream.

        /// </summary>

        public override long Position

        {

            get { ThrowIfDisposed(); return m_streamBase.Position; }

            set { ThrowIfDisposed(); m_streamBase.Position = value; }

        }

 

        /// <summary>

        /// Begins an asynchronous read operation.

        /// </summary>

        public override IAsyncResult BeginRead(byte[] buffer, int offset, int count, AsyncCallback callback, object state)

        {

            ThrowIfDisposed();

            return m_streamBase.BeginRead(buffer, offset, count, callback, state);

        }

 

        /// <summary>

        /// Begins an asynchronous write operation.

        /// </summary>

        public override IAsyncResult BeginWrite(byte[] buffer, int offset, int count, AsyncCallback callback, object state)

        {

            ThrowIfDisposed();

            return m_streamBase.BeginWrite(buffer, offset, count, callback, state);

        }

 

        /// <summary>

        /// Waits for the pending asynchronous read to complete.

        /// </summary>

        public override int EndRead(IAsyncResult asyncResult)

        {

            ThrowIfDisposed();

            return m_streamBase.EndRead(asyncResult);

        }

 

        /// <summary>

        /// Ends an asynchronous write operation.

        /// </summary>

        public override void EndWrite(IAsyncResult asyncResult)

        {

            ThrowIfDisposed();

            m_streamBase.EndWrite(asyncResult);

        }

 

        /// <summary>

        /// Clears all buffers for this stream and causes any buffered data to be written to the underlying device.

        /// </summary>

        public override void Flush()

        {

            ThrowIfDisposed();

            m_streamBase.Flush();

        }

 

        /// <summary>

        /// Reads a sequence of bytes from the current stream and advances the position

        /// within the stream by the number of bytes read.

        /// </summary>

        public override int Read(byte[] buffer, int offset, int count)

        {

            ThrowIfDisposed();

            return m_streamBase.Read(buffer, offset, count);

        }

 

        /// <summary>

        /// Reads a byte from the stream and advances the position within the stream by one byte, or returns -1 if at the end of the stream.

        /// </summary>

        public override int ReadByte()

        {

            ThrowIfDisposed();

            return m_streamBase.ReadByte();

        }

 

        /// <summary>

        /// Sets the position within the current stream.

        /// </summary>

        /// <param name="offset">A byte offset relative to the <paramref name="origin"/> parameter.</param>

        /// <param name="origin">A value of type <see cref="T:System.IO.SeekOrigin"/> indicating the reference point used to obtain the new position.</param>

        /// <returns>The new position within the current stream.</returns>

        public override long Seek(long offset, SeekOrigin origin)

        {

            ThrowIfDisposed();

            return m_streamBase.Seek(offset, origin);

        }

 

        /// <summary>

        /// Sets the length of the current stream.

        /// </summary>

        /// <param name="value">The desired length of the current stream in bytes.</param>

        public override void SetLength(long value)

        {

            ThrowIfDisposed();

            m_streamBase.SetLength(value);

        }

 

        /// <summary>

        /// Writes a sequence of bytes to the current stream and advances the current position

        /// within this stream by the number of bytes written.

        /// </summary>

        public override void Write(byte[] buffer, int offset, int count)

        {

            ThrowIfDisposed();

            m_streamBase.Write(buffer, offset, count);

        }

 

        /// <summary>

        /// Writes a byte to the current position in the stream and advances the position within the stream by one byte.

        /// </summary>

        public override void WriteByte(byte value)

        {

            ThrowIfDisposed();

            m_streamBase.WriteByte(value);

        }

 

        /// <summary>

        /// Gets the wrapped stream.

        /// </summary>

        /// <value>The wrapped stream.</value>

        protected Stream WrappedStream

        {

            get { return m_streamBase; }

        }

 

        /// <summary>

        /// Releases the unmanaged resources used by the <see cref="WrappingStream"/> and optionally releases the managed resources.

        /// </summary>

        /// <param name="disposing">true to release both managed and unmanaged resources; false to release only unmanaged resources.</param>

        protected override void Dispose(bool disposing)

        {

            // doesn't close the base stream, but just prevents access to it through this WrappingStream

            if (disposing)

                m_streamBase = null;

 

            base.Dispose(disposing);

        }

 

        private void ThrowIfDisposed()

        {

            // throws an ObjectDisposedException if this object has been disposed

            if (m_streamBase == null)

                throw new ObjectDisposedException(GetType().Name);

        }

 

        Stream m_streamBase;

    }

You’ll note that this class isn’t sealed; that’s because it can be a useful base class for more specialised wrappers, some of which I hope to cover in future posts.
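
For readers on .NET Framework 4.5 or later, BinaryReader itself gained a constructor overload with a leaveOpen parameter, which covers the specific BinaryReader scenario without a wrapper. The sketch below (BinaryReaderOwnershipExample is a made-up name) shows both behaviours side by side: with leaveOpen set to false the stream is closed along with the reader, as described earlier in this post, and with true it stays usable. Not every stream-owning class necessarily offers such an overload, so a wrapper like WrappingStream can still be useful elsewhere.

```csharp
using System.IO;
using System.Text;

public static class BinaryReaderOwnershipExample
{
    // Reads one Int32 through a BinaryReader, disposes the reader, and reports
    // whether the underlying stream can still be read afterwards.
    public static bool IsReadableAfterReaderDisposed(Stream stream, bool leaveOpen)
    {
        using (BinaryReader reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen))
        {
            reader.ReadInt32();
        }

        // For a MemoryStream, CanRead becomes false once the stream is closed.
        return stream.CanRead;
    }
}
```

Calling this with a MemoryStream of at least four bytes returns false when leaveOpen is false and true when it is true.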

Posted by Bradley Grainger at 10:12 AM

April 14, 2009

DateTime and ISO8601

The ISO 8601 standard for representing dates and times defines a (large) number of string formats for serializing dates. One of the more common formats in use (certainly here at Logos) uses the “extended format”, specifies the full date and time (but only in whole, not fractional, seconds) and always uses UTC. A string in this format looks like “2009-04-14T16:19:58Z”.

The standard date and time format strings for .NET don’t include a pattern that uses this precise format. The round-trip date/time pattern (specified by “o”) includes fractional seconds (e.g., “2009-04-14T16:19:58.0785018Z”), whereas the universal sortable date/time pattern (specified by “u”) has a space instead of a ‘T’ between the date and the time (e.g., “2009-04-14 16:19:58Z”).

We wrote the following utility methods to parse and render DateTimes in our preferred format:

/// <summary>

/// Provides methods for manipulating dates.

/// </summary>

public static class DateTimeUtility

{

    /// <summary>

    /// Converts the specified ISO 8601 representation of a date and time

    /// to its DateTime equivalent.

    /// </summary>

    /// <param name="value">The ISO 8601 string representation to parse.</param>

    /// <returns>The DateTime equivalent.</returns>

    public static DateTime ParseIso8601(string value)

    {

        return DateTime.ParseExact(value,

            Iso8601Format, CultureInfo.InvariantCulture,

            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);

    }

 

    /// <summary>

    /// Formats the date in the standard ISO 8601 format.

    /// </summary>

    /// <param name="value">The date to format.</param>

    /// <returns>The formatted date.</returns>

    public static string ToIso8601(this DateTime value)

    {

        return value.ToUniversalTime().ToString(Iso8601Format, CultureInfo.InvariantCulture);

    }

 

    /// <summary>

    /// The ISO 8601 format string.

    /// </summary>

    public const string Iso8601Format = "yyyy'-'MM'-'dd'T'HH':'mm':'ss'Z'";

}
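
A short round trip shows what these methods produce. The sketch below repeats the format string and the two calls under hypothetical names (Iso8601RoundTripExample, ToText, FromText) so that it compiles on its own; it is meant as an illustration of the behaviour, not as a replacement for the class above.

```csharp
using System;
using System.Globalization;

public static class Iso8601RoundTripExample
{
    // Same pattern as DateTimeUtility.Iso8601Format.
    const string Pattern = "yyyy'-'MM'-'dd'T'HH':'mm':'ss'Z'";

    // Renders the value as UTC with whole seconds; fractional seconds are dropped.
    public static string ToText(DateTime value)
    {
        return value.ToUniversalTime().ToString(Pattern, CultureInfo.InvariantCulture);
    }

    // Parses the text and returns a DateTime whose Kind is Utc.
    public static DateTime FromText(string text)
    {
        return DateTime.ParseExact(text, Pattern, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }
}
```

One thing to watch for: ToUniversalTime treats a DateTime whose Kind is Unspecified as local time, so values that are already in UTC should be created with DateTimeKind.Utc before being formatted.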

Posted by Bradley Grainger at 9:24 AM

April 8, 2009

Creating Mixins with T4 in Visual Studio

One of the (few) features I really miss from C++ is the ability to use multiple inheritance to create mixins, particularly with the Curiously Recurring Template Pattern.

Scott Hanselman blogged about T4 (Text Template Transformation Toolkit) a few months back, but only recently it hit me that T4 could be combined with partial classes to create a poor man's mixin system in C# 3.0.

For a simple example, let's consider a class that supports IEquatable<T>. The logic that’s specific to computing equality for this class will be in the Equals and GetHashCode methods:

public class TestObject : IEquatable<TestObject>

{

    public TestObject(int value)

    {

        Value = value;

    }

 

    public bool Equals(TestObject other)

    {

        return other != null && Value == other.Value;

    }

 

    public override int GetHashCode()

    {

        return Value;

    }

 

    public int Value { get; set; }

}

To round out this class, we really should implement Equals(object) and overload the equality operators. This is exactly where a mixin would be useful, because that code is exactly the same in every implementation of an equatable object.

Firstly, prepare TestObject to support mixins by making it a partial class:

public partial class TestObject : IEquatable<TestObject>

Secondly, create the mixin source: the templated methods of the equatable implementations. Add a new item to the project, and set the name to “EquatableClass.tt”. (This implementation will be for classes only; a separate mixin would need to be created for structs since they can’t ever be null.)

Open the Properties window for this new file, and clear the Custom Tool property. This prevents an output file being generated for this template.

Fill in the EquatableClass.tt file with this code, a templated version of standard equality methods.

namespace <#= NamespaceName #>

{

    partial class <#= ClassName #>

    {

        public override bool Equals(object other)

        {

            return Equals(other as <#= ClassName #>);

        }

 

        public static bool operator==(<#= ClassName #> left, <#= ClassName #> right)

        {

            if (ReferenceEquals(left, right))

                return true;

            else if (ReferenceEquals(left, null) || ReferenceEquals(right, null))

                return false;

            else

                return left.Equals(right);

        }

 

        public static bool operator!=(<#= ClassName #> left, <#= ClassName #> right)

        {

            if (ReferenceEquals(left, right))

                return false;

            else if (ReferenceEquals(left, null) || ReferenceEquals(right, null))

                return true;

            else

                return !left.Equals(right);

        }

    }

}

<#+

    public void SetClassName(string namespaceName, string className)

    {

        NamespaceName = namespaceName;

        ClassName = className;

    }

 

    string NamespaceName;

    string ClassName;

#>

This template (when included in another template) outputs the standard equality methods and also provides a way for the including template to customize the output.

Lastly, write the including template. Add another file to the project, this time named “TestObjectEquatable.tt”. Paste the following code into that file:

<#@template language="C#"#>

<#@output extension="g.cs" #>

<# SetClassName("Mixins", "TestObject"); #>

<#@include file="EquatableClass.tt"#>

These four lines have the following effects:

  1. Instructs the T4 engine that code in this template is written in C#.
  2. Sets the extension of the file to “.g.cs”; this clearly identifies the output as generated code.
  3. Calls the SetClassName method defined in the first template so that the right namespace and class name are used in the generated code.
  4. Includes the first template, causing all the code it defines to be generated.

When the project is built, the TestObjectEquatable.tt template will be processed to generate a partial class containing the supporting equality methods. The C# compiler will compile this with the primary partial class, giving a complete implementation of the standard equality methods.
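
To make the result concrete, the following shows the hand-written partial class next to roughly what the generated file should contain for the namespace "Mixins" and the class "TestObject". The second half is written out by hand here from the template text, so the exact whitespace and file layout of real T4 output may differ.

```csharp
using System;

namespace Mixins
{
    // Hand-written half: the logic specific to this class.
    public partial class TestObject : IEquatable<TestObject>
    {
        public TestObject(int value)
        {
            Value = value;
        }

        public bool Equals(TestObject other)
        {
            return other != null && Value == other.Value;
        }

        public override int GetHashCode()
        {
            return Value;
        }

        public int Value { get; set; }
    }

    // Generated half: approximately what the template expands to.
    partial class TestObject
    {
        public override bool Equals(object other)
        {
            return Equals(other as TestObject);
        }

        public static bool operator==(TestObject left, TestObject right)
        {
            if (ReferenceEquals(left, right))
                return true;
            else if (ReferenceEquals(left, null) || ReferenceEquals(right, null))
                return false;
            else
                return left.Equals(right);
        }

        public static bool operator!=(TestObject left, TestObject right)
        {
            if (ReferenceEquals(left, right))
                return false;
            else if (ReferenceEquals(left, null) || ReferenceEquals(right, null))
                return true;
            else
                return !left.Equals(right);
        }
    }
}
```

Note that the generated operators use ReferenceEquals for their null checks, so the "other != null" test inside Equals(TestObject) resolves to the generated operator without causing infinite recursion.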

Now that the template is defined, only the four lines in the second template need to be copied to add equatable methods to a new class, instead of the 25 lines of boilerplate code that the template generates.

One significant drawback with this approach is that a template is only reprocessed when it changes; the template engine isn't aware that TestObjectEquatable.tt depends on EquatableClass.tt, and that it should be reprocessed if the latter changes. For this reason, it’s probably best to keep the included template rather simple; it would be better to have the generated methods delegate to composed objects or static methods on utility classes rather than including a lot of complicated logic in the template itself.

To learn more about T4, I highly recommend Oleg Sych’s blog, which contains many tutorials, examples, and information about his T4 Toolbox project.

Posted by Bradley Grainger at 8:51 AM

March 17, 2009

ReadOnlyObservableCollection anti-pattern

I recently fixed a bug that was a result of a subtle misuse of the ReadOnlyObservableCollection<T> class. The code in question was structured as follows: The Source class has a collection of Items. Clients should be able to observe but not modify the collection, so it’s exposed as a ReadOnlyObservableCollection. The Client class adds an event handler to this collection, and removes it at the appropriate time (to prevent a possible memory leak).

class Source

{

    Source()

    {

        m_collection = new ObservableCollection<int>();

    }

 

    public ReadOnlyObservableCollection<int> Items

    {

        get

        {

            // this is the bug

            return new ReadOnlyObservableCollection<int>(m_collection);

        }

    }

 

    readonly ObservableCollection<int> m_collection;

}

 

class Client

{

    // obtain Source instance from somewhere

    Source m_source;

 

    void Subscribe()

    {

        ((INotifyCollectionChanged) m_source.Items).CollectionChanged += SourceItems_CollectionChanged;

    }

 

    void Unsubscribe()

    {

        ((INotifyCollectionChanged) m_source.Items).CollectionChanged -= SourceItems_CollectionChanged;

    }

 

    void SourceItems_CollectionChanged(object sender, NotifyCollectionChangedEventArgs e) { }

}

The bug is that the Source.Items property creates a new ReadOnlyObservableCollection instance each time it’s accessed, which means that the calls to add and remove the event handler happen on two different objects; the event handler is never actually removed.

One solution is for the Client class to cache the exact INotifyCollectionChanged object it subscribed to, so it can be sure to unsubscribe from the same object. The other (and better) solution is for the Source class to create and hold a ReadOnlyObservableCollection, and return the same object each time the Items property is accessed:

public class Source

{

    Source()

    {

        m_collection = new ObservableCollection<int>();

        m_collectionReadOnly = new ReadOnlyObservableCollection<int>(m_collection);

    }

 

    public ReadOnlyObservableCollection<int> Items

    {

        get { return m_collectionReadOnly; }

    }

 

    readonly ObservableCollection<int> m_collection;

    readonly ReadOnlyObservableCollection<int> m_collectionReadOnly;

}

This solves the initial bug and also has the benefit of allowing all clients to share the same ReadOnlyObservableCollection instance (instead of creating one per client).
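
The effect of the fix can be checked with a small experiment. In the sketch below, CachedSource and CountingClient are hypothetical stand-ins for the Source and Client classes above; an Add method and a change counter have been added purely so that the behaviour is observable from outside.

```csharp
using System.Collections.ObjectModel;
using System.Collections.Specialized;

// Stand-in for the corrected Source class: Items always returns the same wrapper.
public class CachedSource
{
    public CachedSource()
    {
        m_collection = new ObservableCollection<int>();
        m_collectionReadOnly = new ReadOnlyObservableCollection<int>(m_collection);
    }

    public ReadOnlyObservableCollection<int> Items
    {
        get { return m_collectionReadOnly; }
    }

    // Added for this example so that changes can be triggered.
    public void Add(int value)
    {
        m_collection.Add(value);
    }

    readonly ObservableCollection<int> m_collection;
    readonly ReadOnlyObservableCollection<int> m_collectionReadOnly;
}

// Stand-in for the Client class: counts the change notifications it receives.
public class CountingClient
{
    public CountingClient(CachedSource source)
    {
        m_source = source;
    }

    public int ChangeCount { get; private set; }

    public void Subscribe()
    {
        ((INotifyCollectionChanged) m_source.Items).CollectionChanged += SourceItems_CollectionChanged;
    }

    public void Unsubscribe()
    {
        ((INotifyCollectionChanged) m_source.Items).CollectionChanged -= SourceItems_CollectionChanged;
    }

    void SourceItems_CollectionChanged(object sender, NotifyCollectionChangedEventArgs e)
    {
        ChangeCount++;
    }

    readonly CachedSource m_source;
}
```

After Subscribe, each call to Add increments ChangeCount; after Unsubscribe, further calls leave it unchanged. With the original property that created a new wrapper on every access, the count would keep rising after Unsubscribe, because the handler would be removed from a different object than the one it was added to.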

Posted by Bradley Grainger at 9:02 AM

December 4, 2008

Profiling Lock Contention

The Improving .NET Application Performance and Scalability talk at PDC previewed the new lock contention mode in the Visual Studio 2010 profiler. I haven't seen a detailed list of capabilities, but starting at 57:15 in the video you can see that it appears to identify synchronization objects that cause threads to block, reports which locks have the highest contention, and records the total amount of time spent waiting. This sounds like a great way to identify why a concurrent app may be experiencing sub-linear speed-up.

There don't appear to be any tools available right now that collect this information, so I wrote a TimedMonitor class to help identify the locks in my code that could be causing problems. To use it, you have to change the objects you're locking on to instances of the TimedMonitor class. The Lock() method acquires a lock (using Monitor.Enter) and returns an IDisposable struct that releases the lock. (Returning a struct incurs no heap allocation, and the Dispose method gets inlined by the JIT, making this approach efficient.)

// Instead of:

object myLock = new object();

lock (myLock) { /* do work */ }

// Use:

TimedMonitor timedLock = new TimedMonitor();

using (timedLock.Lock()) { /* do work */ }

Console.WriteLine("Lock was entered {0} times; total wait time: {1}.",

    timedLock.EnterCount, timedLock.WaitTime);

The full code for TimedMonitor is at the end of this post. It uses a conditional compilation symbol named "ENABLE_TIMEDMONITOR". If this is not defined, the timing code isn't compiled into the assembly, the JITter completely inlines the Lock method, and the performance of TimedMonitor is on par with the C# lock keyword. Otherwise, the timing code is executed; TimedMonitor.Lock() is about 30x slower than lock on my Core 2 Duo (which is fine for most cases, where this overhead is negligible compared to the work done inside the lock).

One final thing to note is that this code does not account for the case where thread A calls Monitor.Pulse, and thread B (which had called Monitor.Wait) wakes up and immediately has to block waiting for thread A to release the lock (so that thread B can re-acquire it before it returns from Monitor.Wait). The time B spends blocked will not be reported by TimedMonitor.

Update: The original code P/Invoked to QueryPerformanceCounter; I've since updated it to just use a Stopwatch (which simplifies the code and adds little additional overhead).

public class TimedMonitor

{

    /// <summary>

    /// Initializes a new instance of the <see cref="TimedMonitor"/> class.

    /// </summary>

    public TimedMonitor()

    {

        // use a private lock so that clients must go through Lock()

        m_lock = new object();

#if ENABLE_TIMEDMONITOR

        m_sw = new Stopwatch();

#else

        m_sw = null;

        m_enterCount = 0;

#endif

    }

 

    /// <summary>

    /// Acquires an exclusive lock and records how long the acquisition took.

    /// </summary>

    /// <returns>An object that can be disposed to release the lock.</returns>

    public LockHolder Lock()

    {

#if ENABLE_TIMEDMONITOR

        m_sw.Start();

#endif

        Monitor.Enter(m_lock);

#if ENABLE_TIMEDMONITOR

        m_sw.Stop();

        m_enterCount++;

#endif

        return new LockHolder(m_lock);

    }

 

    // TODO: Implement Enter() and Exit() as separate methods for more

    //   complex locks that can't be wrapped in a 'using' block.

 

    // TODO: Implement Pulse, PulseAll, and Wait.

 

    /// <summary>

    /// Gets the number of times the lock was acquired.

    /// </summary>

    /// <value>The number of times the lock was acquired.</value>

    public long EnterCount

    {

        get { return m_enterCount; }

    }

 

    /// <summary>

    /// Gets the total time spent waiting to acquire the lock.

    /// </summary>

    /// <value>The total time spent waiting to acquire the lock.</value>

    public TimeSpan WaitTime

    {

        get { return m_sw == null ? TimeSpan.Zero : m_sw.Elapsed;  }

    }

 

    /// <summary>

    /// Releases the lock acquired by <see cref="TimedMonitor.Lock"/>.

    /// </summary>

    public struct LockHolder : IDisposable

    {

        /// <summary>

        /// Releases the lock.

        /// </summary>

        public void Dispose()

        {

            Monitor.Exit(m_lock);

        }

 

        internal LockHolder(object objLock)

        {

            m_lock = objLock;

        }

 

        readonly object m_lock;

    }

 

    readonly object m_lock;

    readonly Stopwatch m_sw;

    long m_enterCount;

}
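
One thing to keep in mind with the listing above is that all callers share a single Stopwatch, and Start() is called before the lock is held, so overlapping waits from several threads may be under-counted. The following is a minimal alternative sketch, not the implementation from this post: it times each acquisition separately and accumulates the ticks with Interlocked. The name ContentionTimer and its Run method are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: measures how long callers wait to enter a lock.
public sealed class ContentionTimer
{
    // Runs the supplied action inside the lock and records the time spent waiting to enter.
    public void Run(Action criticalSection)
    {
        long start = Stopwatch.GetTimestamp();
        lock (m_lock)
        {
            // each caller measures its own wait, so overlapping waits are all counted
            Interlocked.Add(ref m_waitTicks, Stopwatch.GetTimestamp() - start);
            Interlocked.Increment(ref m_enterCount);
            criticalSection();
        }
    }

    // The number of times the lock was acquired.
    public long EnterCount
    {
        get { return Interlocked.Read(ref m_enterCount); }
    }

    // The total time all callers spent waiting to acquire the lock.
    public TimeSpan WaitTime
    {
        get
        {
            double seconds = (double) Interlocked.Read(ref m_waitTicks) / Stopwatch.Frequency;
            return TimeSpan.FromSeconds(seconds);
        }
    }

    readonly object m_lock = new object();
    long m_waitTicks;
    long m_enterCount;
}
```

Because the waits of different threads are summed, WaitTime in this sketch can exceed the wall-clock time of the run; that is a different (per-thread) measure than a single shared Stopwatch gives.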

Posted by Bradley Grainger at 7:34 PM | Comments (2) | TrackBack

November 20, 2008

Events and Threads (Part 4)

A common pattern for raising an event in C# is the following code:

EventHandler handler = MyEvent;

if (handler != null)

    handler(this, EventArgs.Empty);

A warning associated with this example (that we have helped spread) is that the code is subject to potentially unsafe JIT optimizations. The ultimate source for this seems to be Juval Lowy's book, Programming .NET Components, 2nd Edition (p250):

By copying the delegate to a temporary variable, you keep a copy of the original state of the delegate, irrespective of any thread context switches. Unfortunately, however, the JIT compiler may optimize the code, eliminate the temporary variable, and use the original delegate directly. That puts you back where you started, susceptible to the race condition.

The subject may have been first raised by a post on GrantRi's weblog about AMD64 JIT optimisations. He wrote about Hashtable, but his logic is applicable to this example. He stated that the AMD64 JIT could legally ignore a local (handler) in the code above, and simply perform two reads of a field (MyEvent) instead. This could allow another thread to set the field to null, causing handler to appear to be null inside the body of the if statement—a logical contradiction (and a serious bug).

However, this blog post was written in 2004 (and the book in early 2005), before CLR 2.0 introduced a much stronger memory model with guarantees that eliminate this bug. As per an MSDN article on memory models, CLR 2.0's memory model rules include:

  1. All the rules that are contained in the ECMA model, in particular the three fundamental memory model rules as well as the ECMA rules for volatile.
  2. Reads and writes cannot be introduced.

Joe Duffy also describes the problem (for fields, not for events specifically) and states that the memory model prevents it, in Concurrent Programming on Windows, pp517-8:

As an example of when a load might be introduced, consider this code:

MyObject mo = ...;

int f = mo.field;

if (f == 0)

{

    // do something

    Console.WriteLine(f);

}

If the period of time between the initial read of mo.field into variable f and the subsequent use of f in the Console.WriteLine was long enough, a compiler may decide it would be more efficient to reread mo.field twice. ... Doing this would be a problem if mo is a heap object and threads are writing concurrently to mo.field. The if-block may contain code that assumes the value read into f remained 0, and the introduction of reads could break this assumption. In addition to prohibiting this for volatile variables, the .NET memory model prohibits it for ordinary variables referring to GC heap memory too.

Since reads can't be introduced for fields in CLR 2.0, there's no need to use a non-inlineable helper method to raise the event; the JIT won't perform aggressive optimisations that violate the memory model. The typical code that's given for raising events is correct. (But if you do happen to be programming for a non-Microsoft CLR that only follows the ECMA rules, and your code is running on a processor with a notoriously weak hardware memory model, such as the Intel Itanium, you would need to protect this code appropriately.)
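
To put the two approaches side by side, here is a small self-contained sketch. The Publisher class and its method names are made up for illustration. RaiseMyEvent is the typical pattern discussed above; RaiseMyEventConservatively shows what a non-inlineable helper might look like if you were targeting a runtime that only promises the ECMA rules.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class Publisher
{
    public event EventHandler MyEvent;

    // The typical pattern: read the field once into a local, then test and invoke the local.
    public void RaiseMyEvent()
    {
        EventHandler handler = MyEvent;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }

    // Conservative variant: the field is read once, to pass it as an argument.
    public void RaiseMyEventConservatively()
    {
        Raise(MyEvent, this, EventArgs.Empty);
    }

    // NoInlining keeps the JIT from folding this method back into its caller,
    // so 'handler' here is always the value that was passed in.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Raise(EventHandler handler, object sender, EventArgs e)
    {
        if (handler != null)
            handler(sender, e);
    }
}
```

Both methods are safe to call when there are no subscribers, because the null test is made against the copied value rather than the field.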

Posted by Bradley Grainger at 4:42 PM | Comments (0) | TrackBack

November 3, 2008

Tuples in .NET

It was announced at PDC that .NET 4.0 will include a Tuple implementation. Several examples of Tuple have been posted on the web; we have also written our own, which is reproduced below.

It would be nice if the BCL team would post an official Tuple design document, so that .NET 3.5 developers can ensure that any implementations they create are easy to migrate to .NET 4.0. Since that's not currently available, and the .NET 4.0 CTP doesn't have a public Tuple type, I examined the CTP bits with Reflector. I was gratified to see that our overall design matches the current CTP: Tuple<> is an immutable struct implementing IEquatable<Tuple<>>, the values are named .First and .Second, and there's a static Tuple.Create method to create new instances.

The .NET 4.0 implementation is more complex than ours (for example, due to the need to interop with IronPython, IronRuby, and F#); they also implement IComparable<Tuple<>>, whereas we have (for now) chosen to implement CompareTo as an extension method.

Update: The beta documentation for Tuple shows a different design: Tuple is an immutable class with properties named .Item1 and .Item2; IEquatable<> has been removed and replaced with IStructuralEquatable.
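
For comparison, here is a short sketch against the System.Tuple type as it shipped in .NET 4.0 (a class with Item1 and Item2 that implements IStructuralEquatable). It needs .NET 4.0 or later and does not use the struct from this post; the class name SystemTupleDemo is just for illustration.

```csharp
using System;
using System.Collections;

static class SystemTupleDemo
{
    static void Main()
    {
        Tuple<string, int> a = Tuple.Create("one", 1);
        Tuple<string, int> b = Tuple.Create("one", 1);

        // items are exposed as Item1 and Item2 (not First and Second)
        Console.WriteLine(a.Item1 + ", " + a.Item2);

        // Tuple is a class, so a and b are distinct objects, but Equals compares the items
        Console.WriteLine(a.Equals(b));                  // True
        Console.WriteLine(object.ReferenceEquals(a, b)); // False

        // IStructuralEquatable lets the caller supply the comparer used for each item
        IStructuralEquatable structural = a;
        Console.WriteLine(structural.Equals(Tuple.Create("ONE", 1), StringComparer.OrdinalIgnoreCase));
    }
}
```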

Our Tuple<T1, T2> struct is as follows. Tuples of different arity are constructed very similarly.

/// <summary>

/// A tuple comprising two items.

/// </summary>

/// <typeparam name="T1">The type of the first item in the tuple.</typeparam>

/// <typeparam name="T2">The type of the second item in the tuple.</typeparam>

[DebuggerDisplay("First={m_t1}, Second={m_t2}")]

public struct Tuple<T1, T2> : IEquatable<Tuple<T1, T2>>

{

    /// <summary>

    /// Initializes a new instance of the <see cref="Tuple{T1,T2}"/> struct.

    /// </summary>

    /// <param name="first">The first item in the tuple.</param>

    /// <param name="second">The second item in the tuple.</param>

    public Tuple(T1 first, T2 second)

    {

        m_t1 = first;

        m_t2 = second;

    }

 

    /// <summary>

    /// Gets the first item in the tuple.

    /// </summary>

    /// <value>The first item in the tuple.</value>

    public T1 First

    {

        get { return m_t1; }

    }

 

    /// <summary>

    /// Gets the second item in the tuple.

    /// </summary>

    /// <value>The second item in the tuple.</value>

    public T2 Second

    {

        get { return m_t2; }

    }

 

    /// <summary>

    /// Indicates whether the current tuple is equal to another tuple.

    /// </summary>

    /// <param name="other">A tuple to compare with this tuple.</param>

    /// <returns>true if the current tuple is equal to the <paramref name="other"/> parameter; otherwise, false.</returns>

    public bool Equals(Tuple<T1, T2> other)

    {

        return EqualityComparer<T1>.Default.Equals(m_t1, other.m_t1) &&

            EqualityComparer<T2>.Default.Equals(m_t2, other.m_t2);

    }

 

    /// <summary>

    /// Determines whether the specified <see cref="Object"/> is equal to the current <see cref="Object"/>.

    /// </summary>

    /// <param name="obj">The <see cref="Object"/> to compare with the current <see cref="Object"/>.</param>

    /// <returns>

    /// true if the specified <see cref="Object"/> is equal to the current <see cref="Object"/>; otherwise, false.

    /// </returns>

    public override bool Equals(object obj)

    {

        return obj is Tuple<T1, T2> && Equals((Tuple<T1, T2>) obj);

    }

 

    /// <summary>

    /// Returns a hash code for this tuple.

    /// </summary>

    /// <returns>A hash code for the current <see cref="Object"/>.</returns>

    public override int GetHashCode()

    {

        return EqualityComparer<T1>.Default.GetHashCode(m_t1) ^

            EqualityComparer<T2>.Default.GetHashCode(m_t2);

    }

 

    /// <summary>

    /// Compares two tuples for equality.

    /// </summary>

    /// <param name="left">The first tuple.</param>

    /// <param name="right">The second tuple.</param>

    /// <returns><c>true</c> if the tuples are equal; false otherwise.</returns>

    public static bool operator ==(Tuple<T1, T2> left, Tuple<T1, T2> right)

    {

        return left.Equals(right);

    }

 

    /// <summary>

    /// Compares two tuples for inequality.

    /// </summary>

    /// <param name="left">The first tuple.</param>

    /// <param name="right">The second tuple.</param>

    /// <returns><c>true</c> if the tuples are not equal; false otherwise.</returns>

    public static bool operator !=(Tuple<T1, T2> left, Tuple<T1, T2> right)

    {

        return !left.Equals(right);

    }

 

    /// <summary>

    /// Returns a <see cref="String"/> that represents the current <see cref="Object"/>.

    /// </summary>

    /// <returns>A <see cref="String"/> that represents the current <see cref="Object"/>.</returns>

    public override string ToString()

    {

        return string.Format(CultureInfo.InvariantCulture, "({0},{1})", m_t1, m_t2);

    }

 

    readonly T1 m_t1;

    readonly T2 m_t2;

}

We also have a static Tuple class (with no generic parameters itself) that provides some helper methods.

/// <summary>

/// Methods for creating and manipulating Tuples.

/// </summary>

public static class Tuple

{

    /// <summary>

    /// Creates a new <see cref="Tuple{T1,T2}"/>.

    /// </summary>

    /// <param name="first">The first item in the tuple.</param>

    /// <param name="second">The second item in the tuple.</param>

    /// <returns>A new tuple consisting of the specified two items.</returns>

    public static Tuple<T1, T2> Create<T1, T2>(T1 first, T2 second)

    {

        return new Tuple<T1, T2>(first, second);

    }

 

    /// <summary>

    /// Compares the specified tuples by comparing their first component, and then (if that is equal)

    /// the second component.

    /// </summary>

    /// <param name="left">The left tuple.</param>

    /// <param name="right">The right tuple.</param>

    /// <returns>A 32-bit signed integer that indicates the relative order of the tuples being compared.</returns>

    public static int CompareTo<T1, T2>(this Tuple<T1, T2> left, Tuple<T1, T2> right)

    {

        int result = Comparer<T1>.Default.Compare(left.First, right.First);

        return result == 0 ? Comparer<T2>.Default.Compare(left.Second, right.Second) : result;

    }

}

Posted by Bradley Grainger at 8:41 AM | Comments (0) | TrackBack

October 22, 2008

How to Reverse a Unicode String in C#

Perhaps due to the lack of a built-in String.Reverse method in the .NET Framework, it's very common for implementations of such a method to be posted.

Unfortunately, most of these implementations do not handle characters outside Unicode's Basic Multilingual Plane correctly. These supplementary characters have code points between U+10000 and U+10FFFF and so cannot be represented with one 16-bit char. In UTF-16 (which is how .NET strings are encoded), these Unicode characters are represented as two C# chars, a high surrogate followed by a low surrogate. When the string is reversed, the order of these two chars has to be preserved.
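
To make the problem concrete, the following sketch reverses a string the naive way (Array.Reverse on the char array) and checks what happens to a supplementary character. The sample character, U+1D11E MUSICAL SYMBOL G CLEF, and the class name are arbitrary choices for illustration.

```csharp
using System;

static class SurrogateDemo
{
    static void Main()
    {
        // 'a', U+1D11E (encoded as the surrogate pair D834 DD1E), 'b'
        string s = "a\U0001D11Eb";
        Console.WriteLine(s.Length);                   // 4 (UTF-16 code units, not characters)
        Console.WriteLine(char.IsHighSurrogate(s[1])); // True
        Console.WriteLine(char.IsLowSurrogate(s[2]));  // True

        // naive reversal: swaps the two halves of the surrogate pair
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        string naive = new string(chars);

        // the low surrogate now comes first, so this is no longer a valid pair
        Console.WriteLine(char.IsSurrogatePair(naive, 1)); // False
    }
}
```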

Here's our method that reverses a string while handling surrogate code units correctly:

/// <summary>

/// Reverses the specified string.

/// </summary>

/// <param name="input">The string to reverse.</param>

/// <returns>The input string, reversed.</returns>

/// <remarks>This method correctly reverses strings containing supplementary characters

/// (which are encoded with two surrogate code units).</remarks>

public static string Reverse(this string input)

{

    if (input == null)

        throw new ArgumentNullException("input");

 

    // allocate a buffer to hold the output

    char[] output = new char[input.Length];

    for (int outputIndex = 0, inputIndex = input.Length - 1; outputIndex < input.Length; outputIndex++, inputIndex--)

    {

        // check for surrogate pair

        if (input[inputIndex] >= 0xDC00 && input[inputIndex] <= 0xDFFF &&

            inputIndex > 0 && input[inputIndex - 1] >= 0xD800 && input[inputIndex - 1] <= 0xDBFF)

        {

            // preserve the order of the surrogate pair code units

            output[outputIndex + 1] = input[inputIndex];

            output[outputIndex] = input[inputIndex - 1];

            outputIndex++;

            inputIndex--;

        }

        else

        {

            output[outputIndex] = input[inputIndex];

        }

    }

 

    return new string(output);

}

Posted by Bradley Grainger at 9:07 PM | Comments (4) | TrackBack

Detecting Bindings that should be OneTime

In WPF, a Binding's source can be any .NET object; the target of the Binding will be updated when the specified property on that source changes. This works best when the source property is a DependencyProperty, or when the source object implements INotifyPropertyChanged; these objects have built-in support for property value changed notifications. In other cases, the ComponentModel infrastructure (as exposed by the PropertyDescriptor class) stores the source object in a global table in order to track clients who wish to be notified when a property value changes.

Binding to a regular property of a regular .NET object (that doesn't implement INotifyPropertyChanged) has two drawbacks:

  1. It may be needlessly inefficient. If, for example, the source object is not implementing INotifyPropertyChanged because it's immutable, creating and attaching value changed handlers is unnecessary overhead.
  2. It can cause a memory leak.

Both these problems can be eliminated by setting the Mode of the Binding to OneTime, but in a large application, determining all the bindings that could be OneTime is not an easy task. Some spelunking (with .NET Memory Profiler and .NET Reflector) showed that the (internal) ReflectTypeDescriptionProvider class has a static Hashtable containing all objects that have had value changed handlers added. A common reason for objects to end up in that Hashtable is their participation in a WPF binding, so enumerating this Hashtable at runtime can help track down bindings that may need to be changed. (And if an object is never removed from this hashtable, that may be a sign of a memory leak.)
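
For reference, this is roughly what a OneTime binding looks like when it is created in code rather than XAML. It is a sketch only: the helper name BindNameOnce and the "Name" property path are assumptions for illustration, and the source can be any object with a readable property of that name.

```csharp
using System.Windows.Controls;
using System.Windows.Data;

static class OneTimeBindingExample
{
    // Binds target.Text to source.Name, reading the value once and never listening for changes.
    public static void BindNameOnce(TextBlock target, object source)
    {
        Binding binding = new Binding("Name")
        {
            Source = source,
            Mode = BindingMode.OneTime,
        };
        target.SetBinding(TextBlock.TextProperty, binding);
    }
}
```

The XAML equivalent is the Mode=OneTime setting inside the {Binding ...} markup extension.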

The following method uses reflection to dump the contents of the ReflectTypeDescriptionProvider._propertyCache hashtable for diagnostic purposes (the definition of the ReflectPropertyDescriptorInfo class is given later):

private static ReadOnlyCollection<ReflectPropertyDescriptorInfo> GetReflectPropertyDescriptorInfo()

{

    List<ReflectPropertyDescriptorInfo> listInfo = new List<ReflectPropertyDescriptorInfo>();

 

    // get the ReflectTypeDescriptionProvider._propertyCache field

    Type typeRtdp = typeof(PropertyDescriptor).Module.

        GetType("System.ComponentModel.ReflectTypeDescriptionProvider");

    FieldInfo propertyCacheFieldInfo = typeRtdp.GetField("_propertyCache",

        BindingFlags.Static | BindingFlags.NonPublic);

    Hashtable propertyCache = (Hashtable) propertyCacheFieldInfo.GetValue(null);

 

    if (propertyCache != null)

    {

        // try to make a copy of the hashtable as quickly as possible (this object can be accessed by other threads)

        DictionaryEntry[] entries = new DictionaryEntry[propertyCache.Count];

        propertyCache.CopyTo(entries, 0);

 

        FieldInfo valueChangedHandlersFieldInfo = typeof(PropertyDescriptor).GetField("valueChangedHandlers",

            BindingFlags.Instance | BindingFlags.NonPublic);

 

        // count the "value changed" handlers for each type

        foreach (DictionaryEntry entry in entries)

        {

            PropertyDescriptor[] pds = (PropertyDescriptor[]) entry.Value;

            if (pds != null)

            {

                foreach (PropertyDescriptor pd in pds)

                {

                    Hashtable valueChangedHandlers = (Hashtable) valueChangedHandlersFieldInfo.GetValue(pd);

                    if (valueChangedHandlers != null && valueChangedHandlers.Count != 0)

                        listInfo.Add(new ReflectPropertyDescriptorInfo(entry.Key.ToString(), pd.Name,

                            valueChangedHandlers.Count));

                }

            }

        }

    }

 

    listInfo.Sort();

    return listInfo.AsReadOnly();

}

The following code implements a window that displays all the properties that were found. It can be used by adding it to a WPF application and creating a special diagnostic button or keystroke that opens the window. You can open two windows and compare the lists side-by-side, or use the Refresh button to regenerate the list (after interacting with your application's UI) to see if any properties have been added or removed.

ReflectPropertyDescriptorWindow.xaml:

<Window x:Class="OneTimeBinding.ReflectPropertyDescriptorWindow"

    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"

    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"

    xmlns:src="clr-namespace:OneTimeBinding"

    Title=".NET Properties used in Binding Paths" Height="450" Width="450" WindowStartupLocation="CenterScreen"

    DataContext="{Binding RelativeSource={RelativeSource Self}}">

 

    <Window.Resources>

        <ResourceDictionary>

            <DataTemplate DataType="{x:Type src:ReflectPropertyDescriptorInfo}">

                <StackPanel Orientation="Horizontal">

                    <TextBlock Text="{Binding TypeName, Mode=OneTime}"/>

                    <TextBlock>.</TextBlock>

                    <TextBlock FontWeight="Bold" Text="{Binding PropertyName, Mode=OneTime}"/>

                    <TextBlock Text="{Binding DisplayHandlerCount, Mode=OneTime}"/>

                </StackPanel>

            </DataTemplate>

        </ResourceDictionary>

    </Window.Resources>

 

    <DockPanel>

        <Button DockPanel.Dock="Top" Margin="4"

            Click="RefreshButton_Click">_Refresh</Button>

        <ScrollViewer Margin="4">

            <ItemsControl ItemsSource="{Binding ReflectProperties}"/>

        </ScrollViewer>

    </DockPanel>

</Window>

 

ReflectPropertyDescriptorWindow.xaml.cs:

public partial class ReflectPropertyDescriptorWindow : Window

{

    public ReflectPropertyDescriptorWindow()

    {

        InitializeComponent();

        ReflectProperties = GetReflectPropertyDescriptorInfo();

    }

 

    public static readonly DependencyProperty ReflectPropertiesProperty =

        DependencyProperty.Register("ReflectProperties", typeof(ReadOnlyCollection<ReflectPropertyDescriptorInfo>),

        typeof(ReflectPropertyDescriptorWindow), new PropertyMetadata());

 

    public ReadOnlyCollection<ReflectPropertyDescriptorInfo> ReflectProperties

    {

        get { return (ReadOnlyCollection<ReflectPropertyDescriptorInfo>) GetValue(ReflectPropertiesProperty); }

        set { SetValue(ReflectPropertiesProperty, value); }

    }

 

    private void RefreshButton_Click(object sender, RoutedEventArgs e)

    {

        ReflectProperties = GetReflectPropertyDescriptorInfo();

    }

 

    private static ReadOnlyCollection<ReflectPropertyDescriptorInfo> GetReflectPropertyDescriptorInfo()

    {

        // as shown above

    }

}

And finally, the definition of the immutable ReflectPropertyDescriptorInfo object, which is used as the source of a OneTime binding in the UI:

public sealed class ReflectPropertyDescriptorInfo : IEquatable<ReflectPropertyDescriptorInfo>,

    IComparable<ReflectPropertyDescriptorInfo>

{

    public ReflectPropertyDescriptorInfo(string typeName, string propertyName, int handlerCount)

    {

        m_typeName = typeName;

        m_propertyName = propertyName;

        m_handlerCount = handlerCount;

    }

 

    public string TypeName

    {

        get { return m_typeName; }

    }

 

    public string PropertyName

    {

        get { return m_propertyName; }

    }

 

    public int HandlerCount

    {

        get { return m_handlerCount; }

    }

 

    public string DisplayHandlerCount

    {

        get { return m_handlerCount == 1 ? "" : string.Format(CultureInfo.InvariantCulture,

            " ({0:n0} handlers)", m_handlerCount); }

    }

 

    public int CompareTo(ReflectPropertyDescriptorInfo other)

    {

        if (object.ReferenceEquals(other, null))

            return 1;

 

        int compareResult = m_typeName.CompareTo(other.m_typeName);

        if (compareResult == 0)

            compareResult = m_propertyName.CompareTo(other.m_propertyName);

        if (compareResult == 0)

            compareResult = m_handlerCount.CompareTo(other.m_handlerCount);

        return compareResult;

    }

 

    // Implementations of Equals, GetHashCode, operators, etc. elided for brevity

 

    readonly string m_typeName;

    readonly string m_propertyName;

    readonly int m_handlerCount;

}

Posted by Bradley Grainger at 6:57 PM | Comments (1) | TrackBack

October 6, 2008

Using "Background Processing Mode" from C#

The Windows Vista kernel added support for I/O and memory priorities. These allow background work (such as search indexing or virus scanning) to reduce its impact on foreground applications beyond what is possible simply by using a low thread CPU priority. According to the SetThreadPriority documentation, "For threads that perform background work such as file I/O, network I/O, or data processing, it is not sufficient to adjust the CPU scheduling priority; even an idle CPU priority thread can easily interfere with system responsiveness when it uses the disk and memory."

Applications can opt in to low I/O and memory priority by passing new flags to SetThreadPriority (THREAD_MODE_BACKGROUND_BEGIN and THREAD_MODE_BACKGROUND_END) or SetPriorityClass (PROCESS_MODE_BACKGROUND_BEGIN and PROCESS_MODE_BACKGROUND_END).

These new priority levels aren't exposed through the .NET Framework, but can be accessed by using P/Invoke. First, declare the constants and functions from the Windows API:

internal static class Win32

{

    public const int THREAD_MODE_BACKGROUND_BEGIN = 0x00010000;

    public const int THREAD_MODE_BACKGROUND_END = 0x00020000;

}

 

internal static class NativeMethods

{

    [DllImport("Kernel32.dll", ExactSpelling = true)]

    public static extern IntPtr GetCurrentThread();

 

    [DllImport("Kernel32.dll", ExactSpelling = true)]

    [return: MarshalAs(UnmanagedType.Bool)]

    public static extern bool SetThreadPriority(IntPtr hThread, int nPriority);

}

Second, write a C# wrapper for those functions. I use Thread.BeginThreadAffinity to notify the runtime (strictly speaking, the CLR host) that the code that's being executed depends on the identity of the underlying OS thread. The return type, Scope, has been covered already on this blog.

public static class ThreadUtility

{

    /// <summary>

    /// Puts the current thread into background processing mode.

    /// </summary>

    /// <returns>A Scope that must be disposed to leave background processing mode.</returns>

    [SecurityPermission(SecurityAction.Demand, Flags=SecurityPermissionFlag.ControlThread)]

    public static Scope EnterBackgroundProcessingMode()

    {

        Thread.BeginThreadAffinity();

        IntPtr hThread = NativeMethods.GetCurrentThread();

        if (IsWindowsVista() && NativeMethods.SetThreadPriority(hThread,

            Win32.THREAD_MODE_BACKGROUND_BEGIN))

        {

            // OS supports background processing; return Scope that exits this mode

            return Scope.Create(() =>

            {

                NativeMethods.SetThreadPriority(hThread, Win32.THREAD_MODE_BACKGROUND_END);

                Thread.EndThreadAffinity();

            });

        }

 

        // OS doesn't support background processing mode (or setting it failed)

        Thread.EndThreadAffinity();

        return Scope.Empty;

    }

 

    // Returns true if the current OS is Windows Vista (or Server 2008) or higher.

    private static bool IsWindowsVista()

    {

        OperatingSystem os = Environment.OSVersion;

        return os.Platform == PlatformID.Win32NT && os.Version >= new Version(6, 0);

    }

}

Third, use the wrapper like so:

using (ThreadUtility.EnterBackgroundProcessingMode())

{

    PerformSomeBackgroundWork();

}

Note that this only has an effect on Windows Vista, Windows Server 2008, and later; you'd also want to lower Thread.Priority during the background work if your application runs on earlier operating systems.

And while this does appear to work quite nicely in testing, it could actually be dangerous in production code. It's possible that this could have unpredictable and hazardous interactions with the garbage collector, the finalizer thread, or other components of the .NET Runtime. Furthermore, if a CLR host ever multiplexes many managed threads to one OS thread, changing the priority of the OS thread would be too heavy-handed. Perhaps a future version of the framework will expose background processing mode to managed threads in a safe way; until then it's probably best to consider advanced native threading features (background mode, fibers, CPU affinity, etc.) to be off limits to managed code.
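
For the fallback mentioned above (lowering Thread.Priority on earlier operating systems), a minimal sketch using only the managed API might look like the following. The BackgroundWork class and RunAtLowPriority method are hypothetical names, and this changes CPU scheduling priority only, not I/O or memory priority.

```csharp
using System;
using System.Threading;

static class BackgroundWork
{
    // Runs 'work' on the current thread at the lowest managed priority,
    // restoring the original priority afterwards (even if 'work' throws).
    public static void RunAtLowPriority(Action work)
    {
        Thread thread = Thread.CurrentThread;
        ThreadPriority original = thread.Priority;
        try
        {
            thread.Priority = ThreadPriority.Lowest;
            work();
        }
        finally
        {
            thread.Priority = original;
        }
    }
}
```

It would be used in the same place as the using block shown earlier, for example BackgroundWork.RunAtLowPriority(PerformSomeBackgroundWork).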

Posted by Bradley Grainger at 8:17 AM | Comments (0) | TrackBack

September 19, 2008

Getting the file path of an assembly

I can never remember how to get the file path of a .NET Framework assembly, so I thought I'd post it here once and for all.

string strProgramPath = typeof(Program).Assembly.Location;

That is all.

Posted by Ed Ball at 2:15 PM | Comments (0) | TrackBack

August 25, 2008

Event subscription using weak references

In my previous post, I never really explained why it can be important to unsubscribe from events.

It is often the case that the "subject" (the object with the event) has a longer lifetime than the "observer" (the object that subscribes to the event). When we are no longer using the observer, we would like it to be garbage collected; however, if the observer is still subscribed to an event on the subject, the associated event handler holds a strong reference to the observer, so the observer will not be garbage collected until the subject also becomes garbage, or until the observer unsubscribes.
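
The effect is easy to demonstrate. In the sketch below (all names are illustrative), the only strong reference to the observer after CreateObserver returns is the one held by the subject's event handler list, yet the observer survives a full garbage collection.

```csharp
using System;
using System.Runtime.CompilerServices;

public sealed class Subject
{
    public event EventHandler Notify;

    public void RaiseNotify()
    {
        EventHandler handler = Notify;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

public sealed class Observer
{
    public Observer(Subject subject)
    {
        // the delegate created here holds a strong reference to 'this'
        subject.Notify += Subject_Notify;
    }

    private void Subject_Notify(object sender, EventArgs e)
    {
    }
}

public static class LeakDemo
{
    // Creates an observer and returns only a weak reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateObserver(Subject subject)
    {
        return new WeakReference(new Observer(subject));
    }

    public static bool ObserverSurvivesCollection()
    {
        Subject subject = new Subject();
        WeakReference weakObserver = CreateObserver(subject);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        bool alive = weakObserver.IsAlive;
        GC.KeepAlive(subject); // the subject (and so its handler list) is still reachable here
        return alive;
    }
}
```

ObserverSurvivesCollection returns true: the observer stays reachable through the subject's handler list. Without the subscription in the Observer constructor, the observer would normally be collected.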

However, it is sometimes impossible (or very inconvenient) to determine the lifetime of an object, and so it isn't possible to know when it would be safe to unsubscribe from the event. In that case, it would be really nice if we could subscribe to the event with a “weak delegate” – an event handler that would allow the target to be garbage collected.

Greg Schechter has the best article that I could find on this subject, and I encourage you to read it for more information. (His “containee” is our “subject” and his “container” is our “observer”.)

By using the EventInfo class from the previous post, we have created a method that allows a “weak subscription” to any event. Here's an example, similar to the examples from that post.

public sealed class Observer

{

    public Observer(Subject subject)

    {

        Subject.NotifyEvent.WeakSubscribe(subject, this,

            (t, s, e) => t.UpdateSubject(subject));

    }

 

    private void UpdateSubject(Subject subject)

    {

        // ...

    }

}

Note that the Observer class is not disposable. Also note that the delegate passed to WeakSubscribe does not hold a strong reference to the Observer. The WeakSubscribe method is passed a strong reference to the Observer via “this”, but it only ends up holding a weak reference to that object.

For reasons that will (hopefully) become clear, we define WeakSubscribe as an extension method; in this form, it only works with events that use the simple EventHandler delegate:

public static Scope WeakSubscribe<TSource, TTarget>(

    this EventInfo<TSource, EventHandler> info,

    TSource source, TTarget target, Action<TTarget, object, EventArgs> action)

        where TTarget : class

{

    WeakReference weakTarget = new WeakReference(target, false);

 

    EventHandler handler = null;

    handler =

        (s, e) =>

        {

            TTarget t = (TTarget) weakTarget.Target;

            if (t != null)

                action(t, s, e);

            else

                info.RemoveHandler(source, handler);

        };

    return info.Subscribe(source, handler);

}

Note how the actual event handler only calls the supplied delegate if the weak reference to the target is still valid. If it is not, the handler is removed, since it is no longer necessary.

This method also returns a Scope that unsubscribes from the event, for cases where the client wants to unsubscribe before the target is garbage collected.
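
The same idea can be tried out without EventInfo or Scope. The following self-contained sketch hand-rolls a weak subscription inside an observer's constructor; the class and member names are illustrative, and it is a simplified stand-in rather than the WeakSubscribe method itself.

```csharp
using System;

public sealed class Subject
{
    public event EventHandler Notify;

    public void RaiseNotify()
    {
        EventHandler handler = Notify;
        if (handler != null)
            handler(this, EventArgs.Empty);
    }
}

public sealed class Observer
{
    public Observer(Subject subject)
    {
        WeakReference weakThis = new WeakReference(this);

        EventHandler handler = null;
        handler =
            (s, e) =>
            {
                // the lambda only uses locals, never 'this', so it holds no strong reference to the observer
                Observer target = (Observer) weakThis.Target;
                if (target != null)
                    target.UpdateCount++;
                else
                    subject.Notify -= handler;
            };
        subject.Notify += handler;
    }

    // The number of notifications received so far.
    public int UpdateCount { get; private set; }
}
```

The important detail is that nothing inside the lambda refers to "this" (directly or through an instance member); if it did, the compiler-generated closure would capture the observer strongly and the weak reference would serve no purpose.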

We can support WeakSubscribe on events that use EventHandler<T> with just a few minor changes:

public static Scope WeakSubscribe<TSource, TTarget, TEventArgs>(

    this EventInfo<TSource, EventHandler<TEventArgs>> info,

    TSource source, TTarget target,

    Action<TTarget, object, TEventArgs> action)

        where TTarget : class

        where TEventArgs : EventArgs

{

    WeakReference weakTarget = new WeakReference(target, false);

 

    EventHandler<TEventArgs> handler = null;

    handler =

        (s, e) =>

        {

            TTarget t = (TTarget) weakTarget.Target;

            if (t != null)

                action(t, s, e);

            else

                info.RemoveHandler(source, handler);

        };

    return info.Subscribe(source, handler);

}

Finally, we can support WeakSubscribe on events that use any standard event handler (e.g. CancelEventHandler) by using DelegateUtility.Cast:

public static Scope WeakSubscribe<TSource, TTarget, TEventArgs, TEventHandler>(

    this EventInfo<TSource, TEventHandler> info,

    TSource source, TTarget target,

    Action<TTarget, object, TEventArgs> action)

        where TTarget : class

        where TEventHandler : class

        where TEventArgs : EventArgs

{

    WeakReference weakTarget = new WeakReference(target, false);

 

    TEventHandler handler = null;

    Action<object, TEventArgs> fn =

        (arg1, arg2) =>

        {

            TTarget t = (TTarget) weakTarget.Target;

            if (t != null)

                action(t, arg1, arg2);

            else

                info.RemoveHandler(source, handler);

        };

    handler = DelegateUtility.Cast<TEventHandler>(fn);

    return info.Subscribe(source, handler);

}

All told, these methods make it very easy to “weakly” subscribe to events from any source. We welcome any comments, questions, or criticisms, as always!

Posted by Ed Ball at 9:32 AM | Comments (1) | TrackBack

August 21, 2008

Unsubscribing from C# events

Unsubscribing from C# events can be a pain. It isn't so bad when your event handler is a simple static or instance method, because those methods are still available when you're ready to unsubscribe.

public sealed class Subject

{

    public event EventHandler Notify;

 

    // ...

}

 

public sealed class Observer : IDisposable

{

    public Observer(Subject subject)

    {

        m_subject = subject;

        m_subject.Notify += Subject_Notify;

    }

 

    public void Dispose()

    {

        m_subject.Notify -= Subject_Notify;

    }

 

    private void Subject_Notify(object sender, EventArgs e)

    {

        // ...

    }

 

    readonly Subject m_subject;

}

When you subscribe to an event using an anonymous delegate, however, things get trickier. You've got to keep that delegate around so that you can remove it from the event.

public sealed class Observer : IDisposable

{

    public Observer(Subject subject)

    {

        m_subject = subject;

        m_handler = delegate { UpdateSubject(); };

        m_subject.Notify += m_handler;

    }

 

    public void Dispose()

    {

        m_subject.Notify -= m_handler;

    }

 

    private void UpdateSubject()

    {

        // ...

    }

 

    readonly Subject m_subject;

    readonly EventHandler m_handler;

}

The Scope class can be a convenient way to encapsulate the lifetime of an event subscription, though it is a bit awkward, because you have to declare a local variable for the event handler.

public sealed class Observer : IDisposable

{

    public Observer(Subject subject)

    {

        EventHandler handler = delegate { UpdateSubject(subject); };

        subject.Notify += handler;

        m_scope = Scope.Create(() => subject.Notify -= handler);

    }

 

    public void Dispose()

    {

        m_scope.Dispose();

    }

 

    private void UpdateSubject(Subject subject)

    {

        // ...

    }

 

    readonly Scope m_scope;

}

We've written an EventInfo class that encapsulates the add/remove behavior of any event and makes it easy to create a Scope that unsubscribes from an event.

public sealed class EventInfo<TSource, TEventHandler>

{

    public EventInfo(Action<TSource, TEventHandler> fnAddHandler,

        Action<TSource, TEventHandler> fnRemoveHandler)

    {

        m_fnAddHandler = fnAddHandler;

        m_fnRemoveHandler = fnRemoveHandler;

    }

 

    public void AddHandler(TSource source, TEventHandler handler)

    {

        m_fnAddHandler(source, handler);

    }

 

    public void RemoveHandler(TSource source, TEventHandler handler)

    {

        m_fnRemoveHandler(source, handler);

    }

 

    public Scope Subscribe(TSource source, TEventHandler handler)

    {

        AddHandler(source, handler);

        return Scope.Create(() => RemoveHandler(source, handler));

    }

 

    readonly Action<TSource, TEventHandler> m_fnAddHandler;

    readonly Action<TSource, TEventHandler> m_fnRemoveHandler;

}

Here's how EventInfo would be used:

public sealed class Subject

{

    public event EventHandler Notify;

 

    public static readonly EventInfo<Subject, EventHandler> NotifyEvent =

        new EventInfo<Subject, EventHandler>(

            (s, eh) => s.Notify += eh, (s, eh) => s.Notify -= eh);

 

    // ...

}

 

public sealed class Observer : IDisposable

{

    public Observer(Subject subject)

    {

        m_scope = Subject.NotifyEvent.Subscribe(subject,

            delegate { UpdateSubject(subject); });

    }

 

    public void Dispose()

    {

        m_scope.Dispose();

    }

 

    private void UpdateSubject(Subject subject)

    {

        // ...

    }

 

    readonly Scope m_scope;

}

Any class with events can expose static read-only EventInfo fields to help clients with this pattern, but a client is certainly capable of creating EventInfo instances on its own, since the EventInfo doesn't manage the process of raising the event at all.

If you're using the Subscribe pattern a lot with a single event, the EventInfo class is probably worth using. However, the real reason we created EventInfo was to make it easier to subscribe to events with weak references, which will be the subject of my next post.

Posted by Ed Ball at 10:18 AM | Comments (0) | TrackBack

August 20, 2008

Leverage using blocks with Scope

Making sure that cleanup code is called even in the face of an exception is usually the job of try-finally blocks.

public class Command

{

    // ...

 

    public void Execute()

    {

        try

        {

            IsWorking = true;

            ExecuteCore();

        }

        finally

        {

            IsWorking = false;

        }

    }

}

The standard way to provide cleanup code for a class is to implement IDisposable, which provides a Dispose method that does the cleanup. Because writing try-finally blocks by hand is tedious and easy to get wrong, C# provides the using statement, which calls Dispose for you when the block exits. But what about cleanup code that isn't in a Dispose method? Can the using statement help us with that? Of course; all you need is a specialized type that implements IDisposable. For efficiency, you can even use a struct.

public void Execute()

{

    using (new IsWorkingScope(this))

    {

        IsWorking = true;

        ExecuteCore();

    }

}

 

private struct IsWorkingScope : IDisposable

{

    public IsWorkingScope(Command command)

    {

        m_command = command;

    }

 

    public void Dispose()

    {

        m_command.IsWorking = false;

    }

 

    readonly Command m_command;

}

Of course, defining a specialized type isn't very convenient. Sometimes you want to define your cleanup code as a lambda or an anonymous delegate. For this, we use the Scope class.

public void Execute()

{

    using (Scope.Create(() => IsWorking = false))

    {

        IsWorking = true;

        ExecuteCore();

    }

}

The implementation of Scope is very straightforward; the following is the minimal implementation that we started with. We decided to use a static Create method because we thought it looked better than using a constructor.

public sealed class Scope : IDisposable

{

    public static Scope Create(Action fnDispose)

    {

        return new Scope(fnDispose);

    }

 

    public void Dispose()

    {

        if (m_fnDispose != null)

        {

            m_fnDispose();

            m_fnDispose = null;

        }

    }

 

    private Scope(Action fnDispose)

    {

        m_fnDispose = fnDispose;

    }

 

    Action m_fnDispose;

}

We added a Cancel method when we realized that it is sometimes useful to conditionally not execute the cleanup code.

public void Cancel()

{

    m_fnDispose = null;

}

The Transfer method is a useful way of returning a Scope that would otherwise be disposed by an enclosing using block.

public Scope Transfer()

{

    Scope scope = new Scope(m_fnDispose);

    m_fnDispose = null;

    return scope;

}

Finally, the Empty static field makes it easy to return a Scope that does nothing when disposed.

public static readonly Scope Empty = new Scope(null);

There are those that would consider Scope to be a misuse of the dispose pattern, but we have found it extremely useful.
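
As an illustration of where Cancel helps, here is a minimal sketch of a "roll back unless we succeed" pattern. It repeats the Scope class from this post (with Cancel included) so that it compiles on its own; the ScopeDemo class, its SaveWithRollback method, and the log list are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Scope as described in this post, with the Cancel method included.
public sealed class Scope : IDisposable
{
    public static Scope Create(Action fnDispose)
    {
        return new Scope(fnDispose);
    }

    public void Cancel()
    {
        m_fnDispose = null;
    }

    public void Dispose()
    {
        if (m_fnDispose != null)
        {
            m_fnDispose();
            m_fnDispose = null;
        }
    }

    private Scope(Action fnDispose)
    {
        m_fnDispose = fnDispose;
    }

    Action m_fnDispose;
}

// Hypothetical example: run the cleanup only if the work does not complete.
public static class ScopeDemo
{
    public static List<string> SaveWithRollback(bool fail)
    {
        List<string> log = new List<string>();
        try
        {
            using (Scope rollback = Scope.Create(() => log.Add("rollback")))
            {
                log.Add("begin");
                if (fail)
                    throw new InvalidOperationException("save failed");
                log.Add("commit");
                rollback.Cancel(); // success: skip the cleanup code
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

When the work succeeds, Cancel clears the delegate and Dispose does nothing; when it throws, the using block still disposes the Scope on the way out, so the rollback runs before the exception reaches the catch block.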

Posted by Ed Ball at 9:52 AM | Comments (4) | TrackBack

August 19, 2008

Image Format Error when Loading from a Stream

The Microsoft Windows Imaging Component (WIC) is “an extensible framework for encoding, decoding, and manipulating images”. It's also the core of WPF’s System.Windows.Media.Imaging classes; this meant that a curious exception I got when using BitmapSource eventually led me to discover a possible bug in IWICImagingFactory::CreateDecoderFromStream.

My code was loading a large number of images from disk. The files contained a header with some image metadata, followed immediately by a regular Windows bitmap (in the ubiquitous BMP file format). The code would read the header from the stream, then load the bitmap from the rest of the stream, as follows:

using (Stream stream = new FileStream(filename, FileMode.Open))

{

    // read header

    stream.Read(header, 0, header.Length);

    // etc.

 

    BitmapImage bitmap = new BitmapImage();

    bitmap.BeginInit();

    bitmap.CacheOption = BitmapCacheOption.OnLoad;

    bitmap.StreamSource = stream;

    bitmap.EndInit();

    return bitmap;

}

Some images would fail to load, with EndInit throwing a mysterious System.IO.FileFormatException: “The image format is unrecognized”. The InnerException was System.Runtime.InteropServices.COMException (0x88982F07), with an HRESULT of a WIC error code: WINCODEC_ERR_UNKNOWNIMAGEFORMAT.

My first thought was that the images were somehow corrupted, but further investigation showed that the files loaded without errors if the header preceding the bitmap was removed from the file, or if the bitmap data following the header was first copied to a new MemoryStream before being loaded. I observed the same behaviour with IWICImagingFactory::CreateDecoderFromStream when I rewrote the test harness as a C++ COM application: if the IStream containing the image contained any data preceding the bitmap data, an error HRESULT would sometimes be returned.

It appears that, in certain circumstances, CreateDecoderFromStream assumes that the bitmap data begins at the stream's origin, and absolute offsets within the stream are used when seeking; thus, the image data must begin at offset 0 within the stream. As a workaround, you can copy the image data to a new MemoryStream (but note that this may increase memory usage). The solution I chose was to write a thin Stream wrapper class that handles calls to Position, Seek, Length, etc. and adjusts the offsets so that the image now appears to start at offset 0; all other calls are passed straight through to the underlying FileStream. This allows WIC and WPF to load all the images without having to make an unnecessary copy of the bitmap, or having to change the legacy file format.
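
The wrapper itself isn't reproduced in this post, so the following is only a rough sketch of the idea, not the class actually used. OffsetStream and its members are hypothetical names; the sketch assumes a seekable, read-only underlying stream and leaves ownership of that stream with the caller.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: exposes the part of a seekable stream that starts
// at 'origin' as if it began at offset 0. Read-only, for brevity.
public sealed class OffsetStream : Stream
{
    public OffsetStream(Stream inner, long origin)
    {
        if (inner == null)
            throw new ArgumentNullException("inner");
        if (!inner.CanSeek)
            throw new ArgumentException("Stream must be seekable.", "inner");
        if (origin < 0 || origin > inner.Length)
            throw new ArgumentOutOfRangeException("origin");

        m_inner = inner;
        m_origin = origin;
        m_inner.Position = origin;
    }

    public override bool CanRead { get { return m_inner.CanRead; } }
    public override bool CanSeek { get { return true; } }
    public override bool CanWrite { get { return false; } }

    // Length and Position are reported relative to the new origin.
    public override long Length { get { return m_inner.Length - m_origin; } }

    public override long Position
    {
        get { return m_inner.Position - m_origin; }
        set { Seek(value, SeekOrigin.Begin); }
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        // Work out the target relative to the wrapper, then translate it.
        long target;
        if (origin == SeekOrigin.Begin)
            target = offset;
        else if (origin == SeekOrigin.Current)
            target = Position + offset;
        else
            target = Length + offset;

        if (target < 0)
            throw new IOException("Cannot seek before the start of the stream.");

        m_inner.Position = m_origin + target;
        return target;
    }

    // Reads pass straight through to the underlying stream.
    public override int Read(byte[] buffer, int offset, int count)
    {
        return m_inner.Read(buffer, offset, count);
    }

    public override void Flush()
    {
        m_inner.Flush();
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }

    readonly Stream m_inner;
    readonly long m_origin;
}
```

With something like this, the loading code above would read the header, then set bitmap.StreamSource to new OffsetStream(stream, stream.Position) instead of the raw FileStream. The wrapper deliberately does not dispose the underlying stream, since the enclosing using block already owns it.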

Posted by Bradley Grainger at 6:33 PM | Comments (1) | TrackBack

July 28, 2008

Casting delegates

One of the annoying things about delegates in .NET is that delegates with exactly the same parameters and return type are not compatible. Specifically, you cannot cast a delegate to a delegate of another type even if they have the same parameters and return type.

Predicate<int> isPositive = n => n > 0;

Func<int, bool> isPositive2 = (Func<int, bool>) isPositive; // COMPILER ERROR

This problem is mitigated somewhat in .NET 3.5, which defines generic delegates that take arbitrary parameters and return types and encourages their use: Action, Action<T>, Action<T1, T2>, ..., Func<TR>, Func<T, TR>, Func<T1, T2, TR>, ...

However, all of the "old" delegates still exist and are in use: AsyncCallback, Comparison<T>, and Predicate<T>, to name a few.

The biggest source of delegate types is event handlers. There's plain old EventHandler and the newer EventHandler<T>, but there are still lots of non-generic event handlers like CancelEventHandler. Neither WPF nor Windows Forms uses EventHandler<T>, so they are chock full of unique delegate types that take an object and some EventArgs-derived class.

Usually this doesn't present a problem, but occasionally you'd like to convert between compatible delegates. If both types are known at compile-time, you can just use a lambda:

Predicate<int> isPositive = n => n > 0;

Func<int, bool> isPositive2 = n => isPositive(n);

But sometimes, the types aren't known at compile-time. We primarily find this to be the case when trying to write generic utility code that can work with arbitrary event handlers. Fortunately, it is possible to cast between arbitrary delegate types, though it isn't as efficient as you might like – DelegateUtility.Cast:

public static class DelegateUtility

{

    public static T Cast<T>(Delegate source) where T : class

    {

        return Cast(source, typeof(T)) as T;

    }

 

    public static Delegate Cast(Delegate source, Type type)

    {

        if (source == null)

            return null;

 

        Delegate[] delegates = source.GetInvocationList();

        if (delegates.Length == 1)

            return Delegate.CreateDelegate(type,

                delegates[0].Target, delegates[0].Method);

 

        Delegate[] delegatesDest = new Delegate[delegates.Length];

        for (int nDelegate = 0; nDelegate < delegates.Length; nDelegate++)

            delegatesDest[nDelegate] = Delegate.CreateDelegate(type,

                delegates[nDelegate].Target, delegates[nDelegate].Method);

        return Delegate.Combine(delegatesDest);

    }

}

There is a generic version and a non-generic version. Note that the null case is handled first, followed by the single-invocation case, followed by the rare multiple-invocation case.

It is quite straightforward to use. (We'd have made it an extension method, but converting delegates isn't really a common enough need to justify it.)

CancelEventHandler handler = (source, e) => e.Cancel = OnCancel();

EventHandler<CancelEventArgs> handler2 =

    DelegateUtility.Cast<EventHandler<CancelEventArgs>>(handler);

The types used by the two delegate types must be exactly the same for DelegateUtility.Cast to work. Supporting compatible types is left as an exercise for the reader; we certainly haven't needed it.
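
To show the multiple-invocation case in action, here is a small sketch. It uses a condensed copy of Cast (one loop instead of the separate single-invocation branch) so that it compiles on its own; CastDemo, StringHandler, and the calls list are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Condensed copy of DelegateUtility.Cast from this post.
public static class DelegateUtility
{
    public static T Cast<T>(Delegate source) where T : class
    {
        if (source == null)
            return null;

        // Re-create each entry of the invocation list as the target type.
        Delegate[] delegates = source.GetInvocationList();
        Delegate[] converted = new Delegate[delegates.Length];
        for (int i = 0; i < delegates.Length; i++)
            converted[i] = Delegate.CreateDelegate(typeof(T),
                delegates[i].Target, delegates[i].Method);
        return Delegate.Combine(converted) as T;
    }
}

public static class CastDemo
{
    // A custom delegate type with the same signature as Action<string>.
    public delegate void StringHandler(string value);

    public static List<string> Run()
    {
        List<string> calls = new List<string>();
        Action<string> first = s => calls.Add("first:" + s);
        Action<string> second = s => calls.Add("second:" + s);
        Action<string> both = first + second;

        // Convert the multicast delegate and invoke it through the new type.
        StringHandler cast = DelegateUtility.Cast<StringHandler>(both);
        cast("x");
        return calls;
    }
}
```

Both handlers run, in their original order, when the converted delegate is invoked, because each entry in the invocation list is converted separately and then recombined.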

Posted by Ed Ball at 4:00 PM | Comments (0) | TrackBack

July 24, 2008

.NET Regular Expressions and Unicode

A fundamental limitation of .NET regular expressions when it comes to processing Unicode text is that the regex engine apparently operates on UTF-16 code units (i.e., the 16-bit value(s) that are used to encode a single Unicode character), not code points (the values between 0 and 0x10FFFF that are assigned to characters encoded in the Unicode standard).

This limitation can be inferred from the list of named blocks for character classes, which claims to be based on Unicode 4.0 but only goes up to FFF0–FFFF, IsSpecials. (Unicode 4.0 defines many blocks of supplementary characters, starting with Linear B Syllabary, U+010000..U+01007F.) It can be confirmed by verifying the return values of the following code snippet:

// search a string containing two Linear B letters for a letter

bool isMatch = Regex.IsMatch("\U00010000\U00010001", @"\p{L}");

// isMatch should be true, but is actually false

 

// search a string containing two Linear B letters for a surrogate code point

isMatch = Regex.IsMatch("\U00010000\U00010001", @"\p{Cs}");

// isMatch is true

The fundamental problem here is that \p{L} doesn't match sequences of chars that encode a character that’s defined as a letter by Unicode. Ideally, \p{L} would match supplementary characters, and \p{Cs} would match nothing because regular expressions would operate on characters, not code units (and there’s no such thing as a surrogate character).

Because of this problem, none of the 46,982 supplementary characters encoded in Unicode 5.1 can be matched by specifying Unicode properties. Furthermore, many other regular expression language elements (such as the period and quantifiers) do not correctly handle these characters, which are encoded with two UTF-16 chars.
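
The period behavior can be checked directly. The sketch below is only an illustration; SurrogateDemo and its members are hypothetical names. It also shows that a pattern can keep a surrogate pair together by spelling out the two code-unit ranges explicitly, although that does nothing to make \p{L} or other Unicode properties work for supplementary characters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SurrogateDemo
{
    // U+10000, the first Linear B syllable; UTF-16 encodes it as D800 DC00.
    public const string LinearB = "\U00010000";

    // Number of UTF-16 code units consumed by a single period.
    public static int PeriodMatchLength()
    {
        return Regex.Match(LinearB, ".").Length;
    }

    // True if 'text' is exactly one well-formed surrogate pair,
    // i.e. one supplementary character.
    public static bool IsSingleSupplementaryCharacter(string text)
    {
        return Regex.IsMatch(text, @"^[\uD800-\uDBFF][\uDC00-\uDFFF]$");
    }
}
```

Here the string holds one character but two chars, and the period consumes only the first of them (the high surrogate), which is the code-unit behavior described above.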

It's unfortunate that the implementation details of UTF-16 encoding leak out into what is otherwise an excellent regular expression engine. I don't know of any .NET-based workaround for this issue; with native code, this problem can be solved by using the regular expression engine of ICU.

Update (25 July): I've filed a suggestion on Microsoft Connect, asking that the regex engine be extended to process Unicode characters, not UTF-16 code units.

Update 2 (25 July): Michael Kaplan also blogged about regular expressions and Unicode today, and included a link to Unicode Technical Standard #18: Unicode Regular Expressions. The essence of my Microsoft Connect suggestion is that .NET regular expressions be improved to have “Basic Unicode Support” as defined by that document.

Posted by Bradley Grainger at 11:34 AM | Comments (0) | TrackBack

June 10, 2008

Salsa20 Implementation in C#

Salsa20 is a stream cipher submitted to eSTREAM, the ECRYPT Stream Cipher Project, by Daniel Bernstein. (Salsa20/12, a version of the algorithm that uses fewer rounds, was one of four software implementations to be included in the final eSTREAM portfolio.) The algorithm can use either 128-bit or 256-bit keys, and is designed to be secure and efficient. For more information, see the Wikipedia article and the algorithm homepage.

There is a .NET port of this algorithm in the Bouncy Castle Crypto Library. Being a port from a Java library, however, that version doesn't interoperate with the System.Security.Cryptography APIs.

The code attached to this post implements Salsa20 using a subclass of SymmetricAlgorithm (with the actual encryption class implementing ICryptoTransform), so it can be used with CryptoStream and other .NET cryptography classes.
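
To show what that interoperability buys, here is a sketch of code that works with any SymmetricAlgorithm through CryptoStream. The attached Salsa20 class isn't reproduced here, so nothing below is specific to it; CryptoStreamDemo, Transform, and RoundTrip are hypothetical names, and the usage note substitutes the framework's Aes class as a stand-in.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class CryptoStreamDemo
{
    // Runs 'input' through any ICryptoTransform by way of CryptoStream.
    public static byte[] Transform(ICryptoTransform transform, byte[] input)
    {
        using (MemoryStream output = new MemoryStream())
        {
            // Disposing the CryptoStream flushes the final block.
            using (CryptoStream crypto = new CryptoStream(output, transform,
                CryptoStreamMode.Write))
            {
                crypto.Write(input, 0, input.Length);
            }
            return output.ToArray();
        }
    }

    // Encrypts and then decrypts 'text' with whatever algorithm is supplied.
    public static string RoundTrip(SymmetricAlgorithm algorithm, string text)
    {
        algorithm.GenerateKey();
        algorithm.GenerateIV();

        byte[] plain = Encoding.UTF8.GetBytes(text);
        byte[] cipher;
        using (ICryptoTransform encryptor = algorithm.CreateEncryptor())
            cipher = Transform(encryptor, plain);
        using (ICryptoTransform decryptor = algorithm.CreateDecryptor())
            return Encoding.UTF8.GetString(Transform(decryptor, cipher));
    }
}
```

For example, using (SymmetricAlgorithm algorithm = Aes.Create()) followed by CryptoStreamDemo.RoundTrip(algorithm, "hello") returns the original text; any other SymmetricAlgorithm subclass could presumably be passed in the same way.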

The focus is not on efficiency (for that, one should probably use a hand-coded SSE2 implementation), but on being a straightforward port to C# from the reference implementation in C. There is also a suite of tests (that use the eSTREAM test vectors) to verify the correctness of the implementation.

Like the reference C implementation, this code is in the public domain. Download it here: Salsa20.cs, Salsa20Tests.cs.

Posted by Bradley Grainger at 7:00 AM | Comments (0) | TrackBack

June 9, 2008

Implementing Clone

Now, before you get too excited, I'm not suggesting that you implement the all-but-deprecated ICloneable interface. Rather, this post is about how best to implement a method that duplicates an object – such a method is commonly named "Clone".

If you like, you can skip the long-winded commentary below and start reading the recommendations at the bottom.

Some would argue that Copy is a better name than Clone, since it avoids the “smell” of ICloneable, but I think Clone is more discoverable. Obviously the patterns of this post can be used regardless of whether you like Clone, Copy, Duplicate, or Replicate.

Any Clone method should document its semantics if they aren't obvious; in particular, it should be clear whether a “shallow” or a “deep” clone will be used. (A “deep” clone clones its “children”; a “shallow” clone simply copies the references of its children.)

In some circumstances, it may be useful to add a parameter to Clone that can change the behavior. For example, proposals to improve the ICloneable interface included a parameter that would indicate whether a shallow or deep clone is desired. Adding a parameter to the Clone method should be a simple extension to the patterns described below.

The simplest kind of clonable class is a sealed class, because we don't need to support derived classes. Even in this simple case, the cleanest approach is to delegate the cloning to a private “copy constructor”:

public sealed class Vector

{

    public Vector(int length)

    {

        m_array = new int[length];

    }

 

    public int Length

    {

        get { return m_array.Length; }

    }

 

    public int this[int index]

    {

        get { return m_array[index]; }

        set { m_array[index] = value; }

    }

 

    public Vector Clone()

    {

        return new Vector(this);

    }

 

    private Vector(Vector v)

    {

        m_array = (int[]) v.m_array.Clone();

    }

 

    int[] m_array;

}

Now, suppose we wanted Vector to be an abstract class so that derived classes could decide how the items are stored. Furthermore, we determine that the length needs to be cached in the base class for performance reasons:

public abstract class Vector

{

    protected Vector(int length)

    {

        m_length = length;

    }

 

    protected Vector(Vector v)

    {

        m_length = v.m_length;

    }

 

    public int Length

    {

        get { return m_length; }

    }

 

    public abstract int this[int index] { get; set; }

 

    public abstract Vector Clone();

 

    int m_length;

}

The protected “copy constructor” is provided to simplify the Clone override:

public sealed class ArrayVector : Vector

{

    public ArrayVector(int length)

        : base(length)

    {

        m_array = new int[length];

    }

 

    private ArrayVector(ArrayVector v)

        : base(v)

    {

        m_array = (int[]) v.m_array.Clone();

    }

 

    public override int this[int index]

    {

        get { return m_array[index]; }

        set { m_array[index] = value; }

    }

 

    public override Vector Clone()

    {

        return new ArrayVector(this);

    }

 

    int[] m_array;

}

In the ArrayVector implementation above, I'd like the overridden Clone method to be more type-safe – that is, I'd like it to return an ArrayVector instead of a Vector. Unfortunately, while some languages (e.g. C++) allow overrides to return a more-derived class than the method they override, C# does not. Not to be deterred in my quest for type safety, my preferred pattern is to use CloneCore as the overridable and define Clone as a separate non-overridable method in each concrete class.

public abstract class Vector

{

    // ...

    protected abstract Vector CloneCore();

    // ...

}

 

public sealed class ArrayVector : Vector

{

    // ...

    public ArrayVector Clone()

    {

        return (ArrayVector) CloneCore();

    }

 

    protected override Vector CloneCore()

    {

        return new ArrayVector(this);

    }

    // ...

}

Incidentally, if you really want to implement ICloneable:

public class Vector : ICloneable

{

    // ...

    object ICloneable.Clone()

    {

        return Clone();

    }

    // ...

}

Okay, enough commentary. Here are the recommendations:

Recommendations

There are three parts to a clonable class: (1) the copy constructor, (2) the CloneCore method, and (3) the Clone method.

The copy constructor is always defined, and does all of the copying, being sure to call the base class copy constructor if available. It is always protected (unless the class is sealed, in which case it is private).

The CloneCore and Clone methods are only implemented on non-abstract classes, and always have the same definitions. (Exception: if the root class is abstract, CloneCore is an abstract method on that class.) The CloneCore method uses the copy constructor to clone the instance. The Clone method calls CloneCore and casts the result to the correct type. If the Clone method hides a base class method, use the “new” keyword.

Clear? No? Let's try sample code. Consider the clonable classes Base and Derived, where Derived derives from Base. The Base class should always have a protected copy constructor:

protected Base(Base x)

{

    // copy members from x

}

If Base is abstract, it only has an abstract CloneCore:

protected abstract Base CloneCore();

If Base is not abstract, it defines both Clone and CloneCore:

public Base Clone()

{

    return (Base) CloneCore();

}

 

protected virtual Base CloneCore()

{

    return new Base(this);

}

Similarly, the Derived class should always have a copy constructor:

protected Derived(Derived x)

    : base(x)

{

    // copy members from x

}

But if Derived is sealed, it will need to be private:

private Derived(Derived x)

    : base(x)

{

    // copy members from x

}

If Derived is abstract, it is done. If Derived is not abstract, it defines both Clone and CloneCore:

public new Derived Clone()

{

    return (Derived) CloneCore();

}

 

protected override Base CloneCore()

{

    return new Derived(this);

}

If no ancestors have defined a Clone method (e.g. Base is abstract), you'll have to omit the “new” keyword from the Derived Clone method.
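
Putting the pieces together, here is one complete, compilable instance of the pattern with a non-abstract base class and a sealed derived class. Shape and Circle and their members are hypothetical names chosen for this sketch.

```csharp
using System;

public class Shape
{
    public Shape(string name)
    {
        m_name = name;
    }

    // Copy constructor: does all of the copying for this level.
    protected Shape(Shape x)
    {
        m_name = x.m_name;
    }

    public string Name
    {
        get { return m_name; }
    }

    public Shape Clone()
    {
        return (Shape) CloneCore();
    }

    protected virtual Shape CloneCore()
    {
        return new Shape(this);
    }

    readonly string m_name;
}

public sealed class Circle : Shape
{
    public Circle(string name, double radius)
        : base(name)
    {
        m_radius = radius;
    }

    // Private because the class is sealed.
    private Circle(Circle x)
        : base(x)
    {
        m_radius = x.m_radius;
    }

    public double Radius
    {
        get { return m_radius; }
    }

    // Hides Shape.Clone so that callers get the more specific type.
    public new Circle Clone()
    {
        return (Circle) CloneCore();
    }

    protected override Shape CloneCore()
    {
        return new Circle(this);
    }

    readonly double m_radius;
}
```

Calling Clone through a Shape reference to a Circle still produces a Circle, because Shape.Clone dispatches to the overridden CloneCore; calling it through a Circle reference returns a Circle without a cast at the call site.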

Whew. It feels like I wrote too much, but I'll just publish and move on. Hope this is useful!

Posted by Ed Ball at 9:15 AM | Comments (0) | TrackBack

May 29, 2008

Events and Threads (Part 3)

We've discussed reasonable mechanisms for subscribing to events and for raising events, but we skirted the issue of "thread-safe" events until now.

What is a thread-safe event? A good definition would be "an event that may be subscribed, unsubscribed, and/or raised simultaneously on arbitrary threads." In that case, what must we do to create a thread-safe event?

Certainly it must be true that if you add an event handler, it is added, and if you remove an event handler, it is removed. As discussed earlier, the default implementation of the add and remove methods accomplishes this by locking the object, but I'd recommend using your own lock:

public event EventHandler Click

{

    add

    {

        lock (m_lockClick)

            m_click += value;

    }

    remove

    {

        lock (m_lockClick)

            m_click -= value;

    }

}

 

EventHandler m_click;

object m_lockClick = new object();

It is also certain that a thread-safe event must not throw a null reference exception when raising the event. The problem is that another thread could remove the last event handler at any moment, which sets the event delegate to null. In the following naïve implementation, m_click could become null after the check but before the call:

private void RaiseClick()

{

    if (m_click != null)

        m_click(this, EventArgs.Empty);

}

The most common solution is to make a copy of the event delegate before calling it:

private void RaiseClick()

{

    EventHandler handler = m_click;

    if (handler != null)

        handler(this, EventArgs.Empty);

}

However, I learned from Juval Lowy's book that aggressive compiler inlining could theoretically eliminate the copy, which would bring us back to the same problem. His solution is to write a non-inlined method that raises the event, something like this:

private void RaiseClick()

{

    RaiseEvent(m_click);

}

 

[MethodImpl(MethodImplOptions.NoInlining)]

private void RaiseEvent(EventHandler handler)

{

    if (handler != null)

        handler(this, EventArgs.Empty);

}

Another good solution is to add a do-nothing event handler; follow the link for an explanation of that approach.

Of course, the most "correct" solution is probably to use the lock that's already there:

private void RaiseClick()

{

    EventHandler handler;

    lock (m_lockClick)

        handler = m_click;

    if (handler != null)

        handler(this, EventArgs.Empty);

}

Perhaps the last solution helped you think of another aspect of thread-safe events that isn't discussed very often. A problem common to all of these solutions is that a subscriber's event handler may be called even after it has been unsubscribed!

I found this behavior very surprising when I was writing thread-safe objects with events. For example, the Dispose method of one object might unsubscribe from an event of another object, assuming that the event handler won't be called again; but, in fact, that event handler might actually be called after the object has been disposed, which can obviously cause problems.
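
The effect is easy to reproduce without any threads at all, by performing the interleaved steps by hand. The sketch below is a single-threaded simulation of the race, not a real multithreaded test; StaleHandlerDemo and its local variables are hypothetical names.

```csharp
using System;

public static class StaleHandlerDemo
{
    // Returns how many times the handler ran after it had been unsubscribed.
    public static int Run()
    {
        int calls = 0;
        EventHandler click = null;
        EventHandler subscriber = delegate { calls++; };
        click += subscriber;

        // "Raising thread": takes its copy of the event delegate...
        EventHandler handler = click;

        // "Other thread": unsubscribes before the copy is invoked.
        click -= subscriber;

        // "Raising thread": ...and invokes the copy, which still refers
        // to the subscriber because delegates are immutable.
        if (handler != null)
            handler(null, EventArgs.Empty);

        return calls;
    }
}
```

Run returns 1: the subscriber is invoked once even though it was removed before the call, which is exactly the window that the copy-then-invoke solutions leave open.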

If you want to guarantee that an event handler won't be called after it is unsubscribed, as well as guarantee that an event handler can't be unsubscribed until the event is done being raised, the most direct solution is to call the event handler from within the lock:

private void RaiseClick()

{

    lock (m_lockClick)

    {

        if (m_click != null)

            m_click(this, EventArgs.Empty);

    }

}

This is a bit hair-raising, of course, because you're calling arbitrary code from within a lock, which is a good recipe for deadlock. I don't have enough experience with this pattern to know how common a problem that might be.

One final note about thread-safe events – make sure that your clients understand that their event handler will be invoked on an arbitrary thread, so that they know to dispatch to their UI thread if necessary.

I wish I had more solid conclusions as regards thread-safe events, but I'm still working through these issues. Hopefully I've at least given you some things to think about when you're considering adding events to a thread-safe class – it might be easier to just avoid them altogether.

Posted by Ed Ball at 2:46 PM | Comments (7) | TrackBack

May 23, 2008

Events and Threads (Part 2)

It's time to continue our discussion of events and threads. You'll note in the last post that I didn't say much about "thread safe" events, because it's not clear what that would mean, particularly as regards the raising of an event. You won't see much in this post about "thread safe" events, either, though I do hope to get to that eventually.

We've already talked about adding and removing an event handler, so it's only natural that we would now talk about raising the event. The most commonly discussed problem that we face when raising an event in C# is that the event delegate is null if there are no subscribers.

public event EventHandler Click;

 

private void RaiseClick()

{

    // throws NullReferenceException if no subscribers

    Click(this, EventArgs.Empty);

}

In fact, I touched on this subject back in March, where I noted that assigning a do-nothing event handler to the event delegate avoids that problem entirely, though it does add a bit of inefficiency.

public event EventHandler Click = delegate { };

 

private void RaiseClick()

{

    // never throws NullReferenceException

    Click(this, EventArgs.Empty);

}

If your class has thread affinity, you must only raise the event from the UI thread, so you can safely do a null check without worrying about another thread removing the last event handler between the check and the call.

public event EventHandler Click;

 

private void RaiseClick()

{

    VerifyAccess();

 

    if (Click != null)

        Click(this, EventArgs.Empty);

}

If your class is thread-compatible, it must be assumed that you only raise an event from the thread that is currently accessing your instance, so, again, you can safely do a null check without worrying about other threads.

public event EventHandler Click;

 

private void RaiseClick()

{

    if (Click != null)

        Click(this, EventArgs.Empty);

}

But what if you want to raise an event in response to background work on a worker thread? In the case of a thread-affined class, there is usually a way to submit work to the UI thread, allowing you to raise the event from the UI thread. In WPF, you can use the Dispatcher for the UI thread.

public event EventHandler Click;

 

private void RaiseClick()

{

    Dispatcher.Invoke(DispatcherPriority.Send, new SendOrPostCallback(

        delegate

        {

            if (Click != null)

                Click(this, EventArgs.Empty);

        }), null);

}

In Windows Forms or WPF, you can use the SynchronizationContext of the UI thread.

public event EventHandler Click;

 

private void RaiseClick()

{

    m_context.Send(

        delegate

        {

            if (Click != null)

                Click(this, EventArgs.Empty);

        }, null);

}

 

SynchronizationContext m_context = SynchronizationContext.Current;

Raising an event in response to background work on a worker thread for a thread-compatible class is more interesting, because subscribers to the event will be called on an arbitrary thread. Therefore, for all intents and purposes, the event must be thread-safe, because it could be subscribed or unsubscribed on one thread and raised on another thread at the same time.

Which means that it's time to talk about thread-safe events, but I think I'll save that discussion for a future post.

Posted by Ed Ball at 1:15 PM | Comments (0) | TrackBack

May 9, 2008

Events and Threads (Part 1)

Once upon a time, I mentioned that I'd like to blog about thread-safety as it relates to events, so I figured I'd better get moving on that.

There are so many issues with .NET events and threads that it's hard to know where to begin, but let's start with the adding and removing of event handlers.

Unless documentation specifies otherwise, one must assume that adding and removing an event handler falls under the same thread safety requirements as any other method of the class. So, if the class has thread affinity (Windows Forms controls, WPF elements, etc.), assume that events can only be added and removed from the UI thread. If the class is thread-compatible (most non-UI classes in .NET), assume that events can be added and removed from any thread, but no two threads can add or remove events (or call any other method, for that matter) at the same time.

When authoring an event, if you allow C# to implement the add and remove methods (by not including your own), the default implementation attempts to be thread-safe by locking "this" before adding or removing the handler from the event delegate. In other words, these two events are implemented the same way:

public event EventHandler Event1;

 

public event EventHandler Event2

{

    add { lock (this) m_event2 += value; }

    remove { lock (this) m_event2 -= value; }

}

private EventHandler m_event2;

If your event has thread affinity or is thread-compatible, the lock is unnecessary overhead, so you're better off with a lock-free implementation:

public event EventHandler Event3
{
    add { m_event3 += value; }
    remove { m_event3 -= value; }
}
private EventHandler m_event3;

Better yet, if your event has thread affinity, make sure that the caller is on the UI thread.

public event EventHandler Event4
{
    add { VerifyAccess(); m_event4 += value; }
    remove { VerifyAccess(); m_event4 -= value; }
}
private EventHandler m_event4;

Furthermore, locking "this" is not recommended (see the MSDN documentation on the lock statement and on MethodImplOptions.Synchronized), so you might consider always implementing your own add and remove methods anyway.

While we're on the subject of adding and removing event handlers, if your class has more than a few events, consider using the EventHandlerList class to manage all of the event handlers, or manage the event handlers in a similar way with your own collection. This will save memory when many of the events have no subscribers. The EventHandlerList class is not thread-safe, which makes it most suitable for thread-affined and thread-compatible events.
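
Here's a rough sketch of the EventHandlerList approach. The Widget class, its Opened and Closed events, and the key fields are hypothetical names chosen for illustration; each event simply forwards add and remove to a shared list, keyed by a static object.

```csharp
using System;
using System.ComponentModel;

// Hypothetical class for illustration. EventHandlerList is not thread-safe,
// so this suits thread-affined or thread-compatible events only.
public class Widget
{
    public event EventHandler Opened
    {
        add { m_events.AddHandler(s_openedKey, value); }
        remove { m_events.RemoveHandler(s_openedKey, value); }
    }

    public event EventHandler Closed
    {
        add { m_events.AddHandler(s_closedKey, value); }
        remove { m_events.RemoveHandler(s_closedKey, value); }
    }

    public void RaiseOpened()
    {
        Raise(s_openedKey);
    }

    public void RaiseClosed()
    {
        Raise(s_closedKey);
    }

    private void Raise(object key)
    {
        // The indexer returns null when the event has no subscribers.
        EventHandler handler = (EventHandler) m_events[key];
        if (handler != null)
            handler(this, EventArgs.Empty);
    }

    // One static key per event; the list only stores entries for events
    // that have actually been subscribed to.
    private static readonly object s_openedKey = new object();
    private static readonly object s_closedKey = new object();

    private readonly EventHandlerList m_events = new EventHandlerList();
}
```

The per-instance cost is one list reference rather than one delegate field per event, which is where the memory saving comes from when most events go unused.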

There's obviously much more to discuss, not the least of which is a discussion of what it would mean for an event to be entirely thread-safe; hopefully part 2 won't be so long in coming!

Posted by Ed Ball at 10:25 AM | Comments (4) | TrackBack

April 8, 2008

Finalizers called from partially constructed objects

Did you know that finalizers are called from partially constructed objects? I certainly didn't. If an exception is thrown from a class constructor, that object is considered “partially constructed” – and its finalizer is still run when the object is garbage collected. Chris Brumme mentioned this four years ago when he helped us understand that it’s hard to implement Finalize properly: “Your Finalize method must tolerate partially constructed instances.”

A coworker discovered this fact when he was unit testing a class that called Debug.Fail in its finalizer to make sure that its instances were being disposed properly. He passed an invalid argument to the constructor to verify that an exception would be thrown – but then found that the call to Debug.Fail in the finalizer was causing tests to fail.

We couldn't figure out a good way to determine whether an object is partially constructed, so we just had to hack around the problem. Any better ideas for detecting undisposed objects?
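
One possible workaround, offered as a sketch rather than a complete answer: set a flag as the last statement of the constructor, and have the debug-only finalizer complain only when that flag is set. TrackedResource and its members are hypothetical names. The obvious limitation is that a derived class's constructor can still throw after this constructor has finished, in which case the flag is already set.

```csharp
using System;
using System.Diagnostics;

// Hypothetical class for illustration.
public class TrackedResource : IDisposable
{
    public TrackedResource(string name)
    {
        if (name == null)
            throw new ArgumentNullException("name");

        m_name = name;

        // Last statement: only reached if construction succeeded.
        m_constructed = true;
    }

    public string Name
    {
        get { return m_name; }
    }

    public bool IsFullyConstructed
    {
        get { return m_constructed; }
    }

    public void Dispose()
    {
        GC.SuppressFinalize(this);
    }

#if DEBUG
    ~TrackedResource()
    {
        // Ignore partially constructed instances; report the rest.
        if (m_constructed)
            Debug.Fail("Not disposed: " + m_name);
    }
#endif

    private readonly string m_name;
    private readonly bool m_constructed;
}
```

Another variation on the same idea would be to catch the exception inside the constructor, call GC.SuppressFinalize(this), and rethrow.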

Posted by Ed Ball at 3:49 PM | Comments (1) | TrackBack

April 5, 2008

“Memory leak” with BitmapImage and MemoryStream

The code snippet below has a small “memory leak”:

BitmapImage bitmap = new BitmapImage();

byte[] buffer = GetHugeByteArray(); // from some external source
using (MemoryStream stream = new MemoryStream(buffer, false))
{
    bitmap.BeginInit();
    bitmap.CacheOption = BitmapCacheOption.OnLoad;
    bitmap.StreamSource = stream;
    bitmap.EndInit();
    bitmap.Freeze();
}

// use bitmap...

The BitmapImage keeps a reference to the source stream (presumably so that you can read the StreamSource property at any time), so it keeps the MemoryStream object alive. Unfortunately, even though MemoryStream.Dispose has been invoked, it doesn't release the byte array that the memory stream wraps. So, in this case, bitmap is referencing stream, which is referencing buffer, which may be taking up a lot of space on the large object heap. Note that there isn't a true memory leak; when there are no more references to bitmap, all these objects will (eventually) be garbage collected. But since bitmap has already made its own private copy of the image (for rendering), it seems rather wasteful to have the now-unnecessary original copy of the bitmap still in memory.

The solution here is fairly straightforward: create an implementation of Stream that wraps another stream (in this example, the MemoryStream). The Dispose method of this wrapper class needs to release the wrapped stream, so that it can be garbage collected. Once the BitmapImage is initialised with this wrapper stream, the wrapper stream can be disposed, releasing the underlying stream, and allowing the large byte array itself to be freed.
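
A minimal sketch of such a wrapper might look like the following. The class name WrappingStream and its members are assumptions made for this example; the important part is that Dispose sets the reference to the wrapped stream to null.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: forwards every call to the wrapped stream and drops
// its reference to that stream when disposed.
public sealed class WrappingStream : Stream
{
    public WrappingStream(Stream streamBase)
    {
        if (streamBase == null)
            throw new ArgumentNullException("streamBase");
        m_streamBase = streamBase;
    }

    public override bool CanRead { get { return m_streamBase != null && m_streamBase.CanRead; } }
    public override bool CanSeek { get { return m_streamBase != null && m_streamBase.CanSeek; } }
    public override bool CanWrite { get { return m_streamBase != null && m_streamBase.CanWrite; } }
    public override long Length { get { return WrappedStream.Length; } }

    public override long Position
    {
        get { return WrappedStream.Position; }
        set { WrappedStream.Position = value; }
    }

    public override void Flush() { WrappedStream.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { return WrappedStream.Read(buffer, offset, count); }
    public override long Seek(long offset, SeekOrigin origin) { return WrappedStream.Seek(offset, origin); }
    public override void SetLength(long value) { WrappedStream.SetLength(value); }
    public override void Write(byte[] buffer, int offset, int count) { WrappedStream.Write(buffer, offset, count); }

    protected override void Dispose(bool disposing)
    {
        if (disposing && m_streamBase != null)
        {
            m_streamBase.Dispose();
            m_streamBase = null; // the wrapped stream is no longer reachable from here
        }
        base.Dispose(disposing);
    }

    // Throws once the wrapper has been disposed.
    private Stream WrappedStream
    {
        get
        {
            if (m_streamBase == null)
                throw new ObjectDisposedException(GetType().Name);
            return m_streamBase;
        }
    }

    private Stream m_streamBase;
}
```

In the snippet at the top of this post, the using statement would then create new WrappingStream(new MemoryStream(buffer, false)) instead of the bare MemoryStream. The byte array can only be collected once nothing else (including the local buffer variable) still refers to it.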

Posted by Bradley Grainger at 12:40 PM | Comments (5) | TrackBack

April 1, 2008

Handling the PropertyChanged event

When handling the PropertyChanged event of an object that implements INotifyPropertyChanged, one is tempted to write a handler like this:

void Source_PropertyChanged(object sender, PropertyChangedEventArgs e)
{
    if (e.PropertyName == "Text")
        UpdateText();
    else if (e.PropertyName == "Icon")
        UpdateIcon();
}

Strictly speaking, this isn't entirely correct. According to the documentation for the PropertyChanged event, “the PropertyChanged event can indicate all properties on the object have changed by using either null or string.Empty as the property name in the PropertyChangedEventArgs.” So, you're forced to write something like this:

void Source_PropertyChanged(object sender, PropertyChangedEventArgs e)
{
    bool bAllChanged = string.IsNullOrEmpty(e.PropertyName);
    if (bAllChanged || e.PropertyName == "Text")
        UpdateText();
    if (bAllChanged || e.PropertyName == "Icon")
        UpdateIcon();
}

To make checking for this situation a little bit easier, and to increase the likelihood that we remember to do it, we put together a nice little extension method on PropertyChangedEventArgs called HasChanged:

public static bool HasChanged(this PropertyChangedEventArgs e,
    string strPropertyName)
{
    string strEventPropertyName = e.PropertyName;
    return string.IsNullOrEmpty(strEventPropertyName) ||
        strPropertyName == strEventPropertyName;
}

Now, proper handling of the PropertyChanged event is as easy as this:

void Source_PropertyChanged(object sender, PropertyChangedEventArgs e)
{
    if (e.HasChanged("Text"))
        UpdateText();
    if (e.HasChanged("Icon"))
        UpdateIcon();
}

Just remember not to use "else"...

Posted by Ed Ball at 4:27 PM | Comments (2) | TrackBack

March 28, 2008

Degrees of Thread Safety

I'd like to talk about .NET events and thread safety, but I first need to clarify what we mean by “thread safety.” After all, the most important consideration as regards thread safety is how thread-safe your code has to be. If you can get away with being thread-hostile, you should do so. There's no sense in worrying about other threads in a single-threaded console application, for example.

If you are writing components, it is critically important that you document your level of thread-safety. Having a consistent vocabulary for describing thread safety thus becomes important.

Here is our preferred taxonomy for describing degrees of thread safety, adapted from this article, and from Joshua Bloch's Effective Java Programming Language Guide.

Immutable

Instances of this class appear constant to their clients; no external synchronization is necessary. Examples include String and Regex.

Most immutable objects need no internal synchronization; however, “lazy loading” will generally require some form of synchronization in order to maintain thread safety.
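
As an illustration of that last point, here is a small sketch of an immutable type whose cached value is computed lazily under a lock. Vector2 and its members are hypothetical names, and a plain lock is only one of several ways to synchronize the lazy step.

```csharp
using System;

// Hypothetical immutable type: its observable state never changes, but the
// cached Length is computed on first use, so that step is synchronized.
public sealed class Vector2
{
    public Vector2(double x, double y)
    {
        m_x = x;
        m_y = y;
    }

    public double X { get { return m_x; } }
    public double Y { get { return m_y; } }

    public double Length
    {
        get
        {
            lock (m_lock)
            {
                if (!m_hasLength)
                {
                    m_length = Math.Sqrt(m_x * m_x + m_y * m_y);
                    m_hasLength = true;
                }
                return m_length;
            }
        }
    }

    private readonly double m_x;
    private readonly double m_y;
    private readonly object m_lock = new object();

    // Mutable internally, but invisible to clients.
    private double m_length;
    private bool m_hasLength;
}
```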

Thread-safe

Instances of this class are mutable, but all methods contain sufficient internal synchronization that instances may be used concurrently without the need for external synchronization. Examples include many of the System.Threading types, including ManualResetEvent and Timer.

Since most concurrent applications require more coarsely-grained synchronization than the per-method-call synchronization of a thread-safe class, and since thread-safety causes unnecessary overhead for single-threaded operations, truly thread-safe classes are often not needed.

Conditionally Thread-safe

Like thread-safe, except that the class (or an associated class) contains methods that must be invoked in sequence without interference from other threads. To eliminate the possibility of interference, the client must obtain an appropriate lock for the duration of the sequence.

The object returned by Hashtable.Synchronized, for example, is documented as such: “Synchronized supports multiple writing threads, provided that no threads are reading the Hashtable. The synchronized wrapper does not provide thread-safe access in the case of one or more readers and one or more writers.”
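
To make the “sequence of methods” idea concrete, here is a sketch of a read-then-write sequence against a synchronized Hashtable. HitCounter and Increment are hypothetical names; the point is only that the client holds SyncRoot across the whole sequence rather than relying on per-call synchronization.

```csharp
using System.Collections;

// Hypothetical helper for illustration.
public static class HitCounter
{
    // Each individual call on the synchronized wrapper is safe, but the
    // read-then-write sequence below is not atomic by itself, so the
    // caller holds SyncRoot for the duration of the sequence.
    public static int Increment(Hashtable synchronizedTable, string key)
    {
        lock (synchronizedTable.SyncRoot)
        {
            object current = synchronizedTable[key];
            int next = (current == null ? 0 : (int) current) + 1;
            synchronizedTable[key] = next;
            return next;
        }
    }
}
```

Without the outer lock, two threads could both read the same old value and one increment would be lost.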

Thread-safe for multiple readers

Many classes are conditionally thread-safe if they are not being mutated. Examples include List<T> and Dictionary<TKey, TValue>.

Readers can only presume thread-safety on an instance of this type if they can be certain that there will be no writers on any thread. If there is any possibility of a writer, both readers and writers must use the same external synchronization.

It is easy to assume that a class is thread-safe for multiple readers, but unsynchronized “lazy loading” can easily cause a class to violate this degree of thread safety. Worse, a future “upgrade” of the class could introduce such a thing as an optimization. If a class is thread-safe for multiple readers, its documentation must say so.

Thread-safe for a subset of operations

Some classes are mostly thread-compatible or thread-affined (see below), except for a few methods that are thread-safe. For example, the BeginInvoke method of Control can safely be called from any thread, even though Control otherwise has thread affinity.

Thread-safe except for Dispose

It can also be convenient to declare a class thread-safe except for the Dispose method. See Thread-safe disposable objects for details.

Freezable

A “freezable” object has a method (often named Freeze) that causes the object to become immutable and thus thread-safe. Until that point, the object is mutable and generally thread-compatible or even thread-hostile. Examples include classes that derive from the Freezable class of WPF, including Brush, Pen, and Transform.

Thread-compatible

Instances of this class can safely be used concurrently by surrounding each method invocation (and in some cases, each sequence of method invocations) by external synchronization. Most classes in the .NET Framework fall into this category and are documented as such: “Any public static members of this type are thread safe. Any instance members are not guaranteed to be thread safe.”
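
A sketch of what that external synchronization might look like around a List<T>; SharedLog and its members are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper showing external synchronization around a
// thread-compatible List<T>.
public sealed class SharedLog
{
    public void Add(string message)
    {
        lock (m_lock)
            m_messages.Add(message);
    }

    public string[] Snapshot()
    {
        lock (m_lock)
            return m_messages.ToArray();
    }

    // A sequence of calls (check the count, read, remove) shares one lock.
    public bool TryRemoveFirst(out string message)
    {
        lock (m_lock)
        {
            if (m_messages.Count == 0)
            {
                message = null;
                return false;
            }

            message = m_messages[0];
            m_messages.RemoveAt(0);
            return true;
        }
    }

    private readonly object m_lock = new object();
    private readonly List<string> m_messages = new List<string>();
}
```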

Thread-affined

A thread-affined class has thread affinity; that is, instances of that class may only be used from one thread, generally the thread with which the instance was created. Most user interface components (Windows Forms, WPF, etc.) have affinity to the “user interface thread” and can thus only be used from that thread. Methods of thread-affined classes should assert that they are being called on the proper thread.
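
Such an assertion could be sketched as follows. ThreadAffinedCounter is a hypothetical class; the VerifyAccess name is borrowed from WPF's DispatcherObject, which throws InvalidOperationException in the same situation.

```csharp
using System;
using System.Threading;

// Hypothetical thread-affined class for illustration.
public class ThreadAffinedCounter
{
    public ThreadAffinedCounter()
    {
        // Remember the thread that created this instance.
        m_ownerThread = Thread.CurrentThread;
    }

    public int Increment()
    {
        VerifyAccess();
        return ++m_count;
    }

    protected void VerifyAccess()
    {
        if (Thread.CurrentThread != m_ownerThread)
            throw new InvalidOperationException(
                "This object must be used on the thread that created it.");
    }

    private readonly Thread m_ownerThread;
    private int m_count;
}
```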

Thread-hostile

This class is not safe for concurrent use by multiple threads, even if all method invocations are surrounded by external synchronization. By this definition, thread-affined classes are also thread-hostile. In general, the only classes in the .NET Framework that are “thread hostile” are those with thread affinity. Thread-hostile behavior also includes:

Modifying static data

Modifying static data that affect other threads is thread-hostile behavior, for example, setting the value of System.Console.OutputEncoding.

Posted by Ed Ball at 2:06 PM | Comments (0) | TrackBack

March 14, 2008

EmptyIfNull

Another great use for the fact that C# extension methods work on null references is a method we call EmptyIfNull:

public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> seq)
{
    return seq ?? Enumerable.Empty<T>();
}

This extension method simply returns the specified enumerable, unless it is null, in which case it returns an empty enumerable. It requires slightly less typing than using the null-coalescing operator.

For example, suppose the GetNames method normally returns a collection of strings, but could return null if there are none. Either of these statements will write each of the strings to the console:

(GetNames() ?? Enumerable.Empty<string>()).
    ToList().ForEach(Console.WriteLine);

GetNames().EmptyIfNull().ToList().ForEach(Console.WriteLine);

I like the EmptyIfNull syntax better.

Posted by Ed Ball at 11:07 AM | Comments (0) | TrackBack

March 7, 2008

Assigning to C# events

When raising an event in C#, you first have to make sure it isn't null, because it will be null if there are no subscribers.

public event EventHandler Click;

public void RaiseClick()
{
    // throws NullReferenceException if no subscribers
//    Click(this, EventArgs.Empty);

    // could throw NullReferenceException if unsubscribed on another thread
//    if (Click != null)
//        Click(this, EventArgs.Empty);

    // never throws NullReferenceException
    EventHandler handler = Click;
    if (handler != null)
        handler(this, EventArgs.Empty);
}

A few weeks ago, Andy Clymer demonstrated a fun way to avoid that boilerplate null-checking code by simply adding a do-nothing event handler.

public event EventHandler Click = delegate { };

public void RaiseClick()
{
    // never throws NullReferenceException
    Click(this, EventArgs.Empty);
}

Call me crazy, but somehow I didn't realize that I could assign to events! In fact, you can assign to events anywhere within the declaring class; you'll note that I do so twice in my previous post.

In the constructor, I use a similar pattern to always call a virtual method as the first event handler:

Disposing = delegate { OnDisposing(); };

And in the Dispose method, I assign the event to null to unsubscribe all of my subscribers.

Disposing = null;

I should mention that I'm not claiming that any of these patterns make events "thread safe" by any stretch; they only help to avoid throwing a NullReferenceException. In fact, I hope to talk more about the thread safety of events in a future post.

(You'll note that the name of my event-raising method starts with "Raise" rather than "On" -- I agree with Sebastien Lambla that raising events from OnXxx methods is an anti-pattern.)

Posted by Ed Ball at 8:49 AM | Comments (4) | TrackBack

March 5, 2008

Thread-safe disposable objects

Most disposable objects are not thread-safe. After all, calling Dispose on one thread while other threads are accessing the object is bound to cause problems.

It is possible to make a disposable object entirely thread-safe, of course, but you'd need to take a lock inside your Dispose method and inside every other property or method call that uses disposable state to prevent that state from being disposed while you're using it (or about to use it). Within that lock, you could safely check to see if your object is disposed and throw an ObjectDisposedException if it is.
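
For illustration, an entirely thread-safe disposable object along those lines might be sketched like this. SafeWriter is a hypothetical class that wraps a TextWriter; the same lock guards both Dispose and every use of the disposable state.

```csharp
using System;
using System.IO;

// Hypothetical class for illustration.
public sealed class SafeWriter : IDisposable
{
    public SafeWriter(TextWriter writer)
    {
        if (writer == null)
            throw new ArgumentNullException("writer");
        m_writer = writer;
    }

    public void WriteLine(string text)
    {
        lock (m_lock)
        {
            // Checked inside the lock, so Dispose cannot run between the
            // check and the use.
            if (m_writer == null)
                throw new ObjectDisposedException(GetType().Name);
            m_writer.WriteLine(text);
        }
    }

    public void Dispose()
    {
        lock (m_lock)
        {
            if (m_writer != null)
            {
                m_writer.Dispose();
                m_writer = null;
            }
        }
    }

    private readonly object m_lock = new object();
    private TextWriter m_writer;
}
```

Every call pays for the lock, which is exactly the overhead discussed next.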

To avoid the overhead of those locks, even our otherwise thread-safe objects have a disclaimer about the Dispose method: “This class is thread-safe except for Dispose, which is thread-compatible. Using an instance during or after its disposal results in unpredictable behavior.”

Whether a disposable object is entirely thread-safe or not, there are still obstacles when using disposable objects in multiple threads. Some disposable objects start background work as part of their job description; if the object is disposed while the background work is still running, that thread is likely to access disposed state and cause havoc. Similarly, clients of disposable objects may be doing work in background threads that use the object; if the client disposes the object while that background work is running, that thread is likely to access the object while it is being disposed, or after it is entirely disposed.

One solution to this problem is to catch ObjectDisposedException in your background work. If you get such an exception, you simply abandon the work on the assumption that it is no longer needed. Keep in mind that you'd have to make the disposable object entirely thread-safe for this to work, as described above; otherwise the object might be in the middle of a method call when it is disposed.

Our preferred solution is to cancel and wait for background work before disposing the object. (The mechanisms for canceling and waiting for background work are outside the scope of this post, though we may discuss them in a future post; needless to say, Thread.Abort is not a viable solution.)

In the case of the disposable object starting background work, it simply has to cancel and wait for background work as the first thing that it does in the Dispose method, before marking itself as being disposed or disposing any state.

In the case of the client with background work that is using the disposable object, the client should cancel and wait for that work before calling the Dispose method on that object.
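
Purely as an illustration of the "cancel, wait, then dispose" ordering (and not necessarily the mechanism we use), here is a sketch of a disposable object that owns its background work. Poller and its members are hypothetical names; cancellation is signaled with a ManualResetEvent and waiting is done with Thread.Join.

```csharp
using System;
using System.Threading;

// Hypothetical class for illustration. Dispose itself is thread-compatible:
// it must not be called from two threads at once.
public sealed class Poller : IDisposable
{
    public Poller()
    {
        m_thread = new Thread(Run);
        m_thread.IsBackground = true;
        m_thread.Start();
    }

    public int Iterations
    {
        get { return Interlocked.CompareExchange(ref m_iterations, 0, 0); }
    }

    public void Dispose()
    {
        if (m_thread == null)
            return;

        // 1. Ask the background work to stop.
        m_cancel.Set();

        // 2. Wait for it to finish.
        m_thread.Join();
        m_thread = null;

        // 3. Only now dispose state that the background work was using.
        m_cancel.Close();
    }

    private void Run()
    {
        // WaitOne doubles as the polling delay and the cancellation check.
        while (!m_cancel.WaitOne(10, false))
            Interlocked.Increment(ref m_iterations);
    }

    private readonly ManualResetEvent m_cancel = new ManualResetEvent(false);
    private Thread m_thread;
    private int m_iterations;
}
```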

There's another scenario that's somewhat common in our code base -- what if the client that is running the background work isn't the owner of the disposable object? How can the client cancel its background work before the object is disposed when it isn't responsible for calling Dispose on that object?

We solve this problem by implementing a Disposing event on the disposable object. The Disposing event is raised by the Dispose method of the disposable object before any actual disposing takes place. The client can cancel and wait for background work inside its event handler for the Disposing event.

public abstract class DisposableService : IDisposable
{
    public event EventHandler Disposing;

    public void Dispose()
    {
        if (Interlocked.Exchange(ref m_nDisposing, 1) != 0)
            return;

        Disposing(this, EventArgs.Empty);
        Disposing = null;

        m_bDisposed = true;

        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected DisposableService()
    {
        Disposing = delegate { OnDisposing(); };
    }

    protected void VerifyNotDisposed()
    {
        if (m_bDisposed)
            throw new ObjectDisposedException(GetType().Name);
    }

    protected virtual void Dispose(bool bDisposing)
    {
    }

    protected virtual void OnDisposing()
    {
    }

#if DEBUG
    ~DisposableService()
    {
        Debug.Fail("Not disposed: " + GetType().Name);
    }
#endif

    int m_nDisposing;
    volatile bool m_bDisposed;
}

(In some cases where we have clients that depend on a disposable service, the client actually calls Dispose on itself in its handler for the Disposing event of that disposable service!)

All of this canceling and waiting inside Dispose methods and event handlers seems like a recipe for deadlock, but we haven't found a better solution to the problem of using disposable objects from multiple threads, and it has been working pretty well for us so far.

(For more on this subject, see Joe Duffy's recent post.)

Update: Inspired by Neil's comment, I've added a bit more thread safety to Dispose. Now, even if multiple threads call Dispose at the “same time,” the Disposing event won't be raised twice.

Posted by Ed Ball at 8:33 AM | Comments (6) | TrackBack

February 27, 2008

Disposable value types

Way back when, Ian Griffiths wrote a disposable class called TimedLock that made it easy to use a C# using statement to release the lock. Eric Gunnerson advised him to use a struct instead of a class because a value type avoids a heap allocation. Some time later, Phil Haack verified that the using statement doesn't box the value type in order to call Dispose (as long as IDisposable is implemented implicitly, not explicitly).

Disposable value types do make for a nice optimization in C#, but like many optimizations, they should only be used when performance demands it, because they have a few disadvantages.

Most importantly, there's nothing to keep the Dispose logic from being called twice, perhaps even accidentally. Chris Lyon touches on this when he explains why GCHandle doesn't implement IDisposable. If you make a copy of a disposable value type before you call Dispose, there is no way to avoid executing the same logic if Dispose is called on each copy. This could cause an unmanaged resource to be freed twice, or cleanup code to be executed twice, etc. A class doesn't have this problem because it is only copied by reference; once Dispose is called, it can make sure that subsequent calls to Dispose do nothing.
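
The copy problem is easy to demonstrate. In this sketch, Lease is a hypothetical disposable struct whose static ReleaseCount stands in for freeing an unmanaged resource.

```csharp
using System;

// Hypothetical disposable value type for illustration.
public struct Lease : IDisposable
{
    public Lease(int id)
    {
        m_id = id;
        m_disposed = false;
    }

    public int Id
    {
        get { return m_id; }
    }

    public void Dispose()
    {
        // This guard only protects the copy that Dispose is called on.
        if (m_disposed)
            return;

        m_disposed = true;
        ReleaseCount++; // stands in for freeing an unmanaged resource
    }

    public static int ReleaseCount;

    private int m_id;
    private bool m_disposed;
}

public static class LeaseDemo
{
    public static int Run()
    {
        Lease.ReleaseCount = 0;

        Lease first = new Lease(1);
        Lease second = first;   // copies the fields, including m_disposed == false

        first.Dispose();        // releases the resource
        first.Dispose();        // no-op: this copy knows it has been disposed
        second.Dispose();       // releases again: this copy does not

        return Lease.ReleaseCount;   // 2
    }
}
```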

Another disadvantage of disposable value types is that C++/CLI doesn't support them very well. In C++/CLI, you can't define disposable value types at all; compiler errors prohibit you from defining a destructor or a Dispose method on a value type. You can use a disposable value type defined in C#, but you can't dispose one without boxing it, which causes a heap allocation, thus subverting the purpose of the optimization.

Also, consider this: If the creation or disposal logic of the value type happens to require a heap allocation (creating a delegate, for example), it is unlikely that the optimization will come to much, since the whole point of a disposable struct is to avoid heap allocation.

In summary, I'm not using disposable value types with the reckless abandon that I once did. But if performance measurements ever show that I'm spending too much time allocating a disposable class that's used in this way, I know that I can try making it a struct.

Posted by Ed Ball at 9:50 AM | Comments (1) | TrackBack

February 19, 2008

Disposed objects

There are two rules about disposed objects. First, calling Dispose on an already disposed object should do nothing. Second, calling any other property or method on an already disposed object should throw an ObjectDisposedException.

Either way, you'll need a way to determine whether an instance of your disposable class has been disposed. The easiest and most obvious way is to have a simple Boolean member that is set to true when the object is disposed.

More often than not, we find that setting a field to null that wouldn't otherwise be null works just as well. Commonly, that field needs to be disposed, so we use a utility method that can be used to dispose a field and set it to null – unless it is already null. Using this method ensures that calling Dispose a second time won't throw a NullReferenceException.

public static class DisposableUtility
{
    public static void Dispose<T>(ref T obj) where T : class, IDisposable
    {
        if (obj != null)
        {
            obj.Dispose();
            obj = null;
        }
    }
}

An easy way to throw ObjectDisposedException when your object has already been disposed is to define a private (or protected) method that throws the exception if necessary; just call that method at the beginning of every property or method definition of your class (besides Dispose, of course).

sealed class FileReader : IDisposable
{
    public FileReader(string strFilePath)
    {
        m_stream = new FileStream(strFilePath, FileMode.Open);
    }

    public object ReadNext()
    {
        VerifyNotDisposed();

        object obj = null;
        // ...
        return obj;
    }

    public void Dispose()
    {
        DisposableUtility.Dispose(ref m_stream);
    }

    private void VerifyNotDisposed()
    {
        if (m_stream == null)
            throw new ObjectDisposedException(GetType().Name);
    }

    Stream m_stream;
}

Nothing too complicated, but it's easy to forget the two rules of disposed objects. Get in the habit of verifying both rules when you write your unit tests.

Posted by Ed Ball at 8:57 AM | Comments (0) | TrackBack

February 5, 2008

GetOrAddValue

Jon Skeet recently posted his DictionaryUtility.GetOrCreate method.

Here's our version (sans parameter validation):

public static TValue GetOrAddValue<TKey, TValue>(
    this IDictionary<TKey, TValue> dict, TKey key)
    where TValue : new()
{
    TValue value;
    if (dict.TryGetValue(key, out value))
        return value;
    value = new TValue();
    dict.Add(key, value);
    return value;
}

public static TValue GetOrAddValue<TKey, TValue>(
    this IDictionary<TKey, TValue> dict, TKey key,
    Func<TValue> generator)
{
    TValue value;
    if (dict.TryGetValue(key, out value))
        return value;
    value = generator();
    dict.Add(key, value);
    return value;
}

First note that we have an overload that takes a Func<TValue>. This enables us to call GetOrAddValue even for types that don't support a parameterless constructor. We pass a Func<TValue> rather than a TValue to avoid needlessly creating empty instances when they already exist in the dictionary.

Also, since we're using the looked-up value if it exists, we call TryGetValue instead of checking ContainsKey.
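
As a usage sketch, here is the Func<TValue> overload grouping words by their first letter. The overload is repeated so the example compiles by itself; DictionaryUtility, GetOrAddValueDemo, and GroupByFirstLetter are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryUtility
{
    // Same shape as the Func<TValue> overload in this post, repeated here
    // so that the sketch is self-contained.
    public static TValue GetOrAddValue<TKey, TValue>(
        this IDictionary<TKey, TValue> dict, TKey key,
        Func<TValue> generator)
    {
        TValue value;
        if (dict.TryGetValue(key, out value))
            return value;
        value = generator();
        dict.Add(key, value);
        return value;
    }
}

public static class GetOrAddValueDemo
{
    // Groups non-empty words by their first letter.
    public static Dictionary<char, List<string>> GroupByFirstLetter(
        IEnumerable<string> words)
    {
        Dictionary<char, List<string>> groups =
            new Dictionary<char, List<string>>();

        // The generator only runs the first time a letter is seen, so no
        // throwaway lists are created for letters already in the dictionary.
        foreach (string word in words)
            groups.GetOrAddValue(word[0], () => new List<string>()).Add(word);

        return groups;
    }
}
```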

Posted by Jacob Carpenter at 5:23 PM | Comments (0) | TrackBack

February 4, 2008

Another extension method: DisposeAfter

Ed previously blogged about our null-propagating extension method. This extension method is one example of a pattern that I find very interesting:

public static ? InvokeOnSelf<T>(this T self, Func<T, ?> toInvoke)

Another interesting example of this pattern is DisposeAfter:

public static T DisposeAfter<TDisposable, T>(this TDisposable d,
    Func<TDisposable, T> fn) where TDisposable : IDisposable

DisposeAfter is an extension method constrained to types which implement IDisposable. It lets you pass in work to be done before calling Dispose, and returns the result of that work.

Why is this useful?

Let's look at a simple case of reading a row of data from an IDataReader:

int id = 0;
string name = null;

using (IDataReader reader = command.ExecuteReader())
{
    if (reader.Read())
    {
        id = reader.GetInt32(0);
        name = reader.GetString(1);
    }
}

// do some work with id and name

This is one case where using an anonymous type instance to encapsulate the read data could be useful. Doing so would enable us to know if data was read; we could simply return null when it wasn't. It would also somewhat simplify the method by reducing the number of local variables.

There's a big problem though: if we declare the anonymous type instance inside the using block, its scope will be constrained. But we cannot declare the anonymous type separately from its initialization.

We want to say something like:

var result; // error CS0818: Implicitly-typed local variables must be initialized

using (IDataReader reader = command.ExecuteReader())
{
    if (reader.Read())
        result = new { Id = reader.GetInt32(0), Name = reader.GetString(1) };
}

// do some work with result

But that won't compile.

Enter DisposeAfter:

var result = command.ExecuteReader().DisposeAfter(r =>
    r.Read() ? new { Id = r.GetInt32(0), Name = r.GetString(1) } : null);

// do some work with result

There, that's nice! Since DisposeAfter is a generic method, the return value is strongly typed. And it doesn't matter to the compiler that we're returning an anonymous type instance; it can still infer the type of result.

Looking at the implementation of DisposeAfter we see that all of the passed in work is still done within a using statement, so objects will properly be disposed (even in the face of troublesome exceptions):

public static T DisposeAfter<TDisposable, T>(this TDisposable d,
    Func<TDisposable, T> fn) where TDisposable : IDisposable
{
    using (d)
        return fn(d);
}

Let me know in the comments if you find this method useful. Hopefully you'll also start to think about other circumstances where passing a delegate (or perhaps an expression tree) to an object via an extension method could be useful.

Posted by Jacob Carpenter at 9:12 AM | Comments (4) | TrackBack

February 1, 2008

The Dispose Pattern

The "Dispose pattern" describes the proper way for a type to implement Dispose and/or Finalize. It applies in different ways to different .NET languages. In C++/CLI, for example, the Dispose pattern is automatically implemented by managed types with destructors and/or finalizers. This post is primarily aimed at C# developers.

The Dispose method of the IDisposable interface is used to free the resources used by an object, which may include unmanaged resources that need to be released, managed resources that need to be disposed, and other miscellaneous cleanup.

The Finalize method of Object (that is, the "finalizer") is similar, but is called automatically by the garbage collector after the object becomes unreachable and before its memory is reclaimed. It should only be overridden to release unmanaged resources, since managed resources may already have been finalized and shouldn't be accessed. (In C#, you use the "destructor syntax" to create a finalizer.)

First things first: you don't need a finalizer. Finalizers can only be used to release unmanaged resources. Your clients should be calling Dispose, which will release those resources. If you need to release those resources even if the client forgets to call Dispose, you should wrap each resource in a separate finalizable class derived from SafeHandle or CriticalHandle or CriticalFinalizerObject. Feel free to implement a simple finalizer for debugging purposes, but consider removing it from shipping code.

#if DEBUG
    ~Program()
    {
        Console.WriteLine("Program finalized");
    }
#endif
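
For the SafeHandle route mentioned above, a sketch might look like the following. SafeNativeHandle and NativeResourceOwner are hypothetical names, and the example assumes a Win32 handle that is released with CloseHandle; the owning class needs no finalizer because the handle class carries its own.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical wrapper for a Win32 handle released with CloseHandle.
public sealed class SafeNativeHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public SafeNativeHandle(IntPtr existingHandle, bool ownsHandle)
        : base(ownsHandle)
    {
        SetHandle(existingHandle);
    }

    // Called by the runtime for valid, owned handles, either from Dispose
    // or from SafeHandle's critical finalizer.
    protected override bool ReleaseHandle()
    {
        return CloseHandle(handle);
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool CloseHandle(IntPtr hObject);
}

// Hypothetical owner: disposes the handle deterministically and relies on
// SafeNativeHandle if the client forgets to call Dispose.
public sealed class NativeResourceOwner : IDisposable
{
    public NativeResourceOwner(SafeNativeHandle handle)
    {
        if (handle == null)
            throw new ArgumentNullException("handle");
        m_handle = handle;
    }

    public void Dispose()
    {
        m_handle.Dispose();
    }

    private readonly SafeNativeHandle m_handle;
}
```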

If you're really, really sure that you need to ship with a finalizer, be sure to understand and obey all of the rules. Otherwise, with finalizers out of the way, we can discuss a simpler set of rules for implementing the Dispose pattern.

The simplest case is a sealed class whose base class doesn't already implement IDisposable. Just implement IDisposable and do your cleanup in the Dispose method.

public sealed class SimpleSealed : IDisposable
{
    public void Dispose()
    {
        // clean up
    }
}

If your class isn't sealed (and its base class doesn't already implement IDisposable), you need to provide the proper mechanism for derived classes to add their own cleanup. Specifically, you need to add a virtual method, also called Dispose, that takes a Boolean argument that indicates whether it is being called from Dispose or from the finalizer. (If a derived class needed a finalizer, it would add one that called Dispose(false). The call to GC.SuppressFinalize would prevent it from being called if the object was already disposed.)

public class Simple : IDisposable
{
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            // clean up
        }
    }
}
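
To illustrate the parenthetical above about derived classes that need a finalizer, here is a sketch. SimpleBase mirrors the shape of Simple so the example compiles alone, and DerivedWithFinalizer is a hypothetical class that owns a block of unmanaged memory.

```csharp
using System;
using System.Runtime.InteropServices;

// Stand-in for a base class that follows the Dispose pattern.
public class SimpleBase : IDisposable
{
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
    }
}

// Hypothetical derived class that owns unmanaged memory.
public class DerivedWithFinalizer : SimpleBase
{
    public DerivedWithFinalizer(int size)
    {
        m_memory = Marshal.AllocHGlobal(size);
    }

    ~DerivedWithFinalizer()
    {
        // Only reached if Dispose() was never called, because Dispose()
        // calls GC.SuppressFinalize.
        Dispose(false);
    }

    public bool IsReleased
    {
        get { return m_memory == IntPtr.Zero; }
    }

    protected override void Dispose(bool disposing)
    {
        try
        {
            // Unmanaged cleanup runs on both paths; managed cleanup would
            // go inside an "if (disposing)" block.
            if (m_memory != IntPtr.Zero)
            {
                Marshal.FreeHGlobal(m_memory);
                m_memory = IntPtr.Zero;
            }
        }
        finally
        {
            base.Dispose(disposing);
        }
    }

    private IntPtr m_memory;
}
```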

(If your class is abstract, you may be tempted to make Dispose(bool) abstract as well. Unfortunately, that breaks the pattern enough to cause derived classes implemented in C++/CLI to enter an infinite loop when they are disposed.)

If your base class already implements IDisposable, it ideally conforms to the Dispose pattern, which means it has a virtual Dispose(bool) that should be overridden to do cleanup.

public class DerivedFromSimple : Simple
{
    protected override void Dispose(bool disposing)
    {
        try
        {
            if (disposing)
            {
                // clean up
            }
        }
        finally
        {
            base.Dispose(disposing);
        }
    }
}

If your base class implements IDisposable but doesn't conform to the Dispose pattern, you should introduce the Dispose pattern for your derived classes.

public class VirtualDispose : IDisposable
{
    public virtual void Dispose()
    {
        // clean up
    }
}

public class DerivedFromVirtualDispose : VirtualDispose
{
    public sealed override void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            try
            {
                // clean up
            }
            finally
            {
                base.Dispose();
            }
        }
    }
}

(If the Dispose method of the base class isn't virtual, you won't need the "sealed override", but you'll need to reimplement IDisposable. If the Dispose method of the base class is abstract, you won't be able to call "base.Dispose()".)
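The non-virtual case might look something like the following sketch. The class names and the two flag properties are hypothetical, chosen only for illustration. Listing IDisposable again on the derived class is what reimplements the interface.

```csharp
using System;

// Hypothetical base class whose Dispose method is not virtual.
public class NonVirtualDispose : IDisposable
{
    public bool BaseCleanedUp { get; private set; }

    public void Dispose()
    {
        // base class clean up
        BaseCleanedUp = true;
    }
}

// Listing IDisposable again reimplements the interface, so a call made
// through an IDisposable reference reaches the new Dispose method.
public class DerivedFromNonVirtualDispose : NonVirtualDispose, IDisposable
{
    public bool DerivedCleanedUp { get; private set; }

    public new void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposing)
        {
            try
            {
                // derived class clean up
                DerivedCleanedUp = true;
            }
            finally
            {
                base.Dispose();
            }
        }
    }
}
```

Note that a call made through a variable typed as NonVirtualDispose still binds to the base method and skips the derived cleanup; only calls through IDisposable (including using statements on an IDisposable reference) or through the derived type reach the new method.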

Finally, if your base class doesn't conform to the Dispose pattern, but your class is sealed, you can keep it simple.

public sealed class SealedFromVirtualDispose : VirtualDispose
{
    public override void Dispose()
    {
        try
        {
            // clean up
        }
        finally
        {
            base.Dispose();
        }
    }
}

(If the Dispose method of the base class isn't virtual, you won't be able to "override", so you'll have to reimplement IDisposable.)

I'd still like to talk about ensuring that Dispose can be called multiple times, when to throw ObjectDisposedException, the thread safety of this pattern, and disposable value types (structs), but those topics will have to wait for future posts.

Posted by Ed Ball at 9:24 AM | Comments (1) | TrackBack

January 22, 2008

Using Process.Start to link to the Internet

The easiest way to navigate the user's default Internet browser to a specified URL is to call System.Diagnostics.Process.Start(string).

Process.Start("http://www.logos.com/");

The MSDN documentation seems inaccurate on a few points. It suggests that this method should only be called from an STA thread, which isn't true – it will create a new STA thread and run from there if necessary. Worse, a comment in the example suggests that this method doesn't work with a URL to launch an Internet browser – obviously, it works just fine.

We have found that, in some circumstances, with some browsers, Process.Start will raise an exception, even though it might actually succeed, so we wrap the call in a try/catch block and hope for the best.

try
{
    Process.Start("http://www.logos.com/");
}
catch (Exception)
{
    // ignored: the browser may have launched even though an exception was raised
}

Interestingly, when used to launch the default Internet browser in this way, Process.Start "waits" until the Internet browser is displayed before returning. I must use quotation marks here, because when Process.Start is called from a UI thread (e.g., the main STA thread of a WPF application), it pumps messages while it waits, which means that the user can still interact with the application, and events like mouse clicks and keystrokes will be processed by the application while it is "waiting". So, make sure that you don't do any work after calling Process.Start that could fail due to user activity in the meantime. Process.Start is probably being called from an event handler, so the best thing to do after calling Process.Start is nothing at all.

One thing that bothers me about most applications that link to the Internet is that the Internet browser can take a while to appear. You click on a link, but nothing happens, and you wonder if you actually clicked the link, so you click again, and ultimately end up opening the Web site twice. In a WPF application, a nice way to deal with this problem is to set Mouse.OverrideCursor to Cursors.AppStarting while Process.Start is running. (Fortunately, since Process.Start pumps messages, it doesn't "hang" the application while you wait.)

try
{
    Mouse.OverrideCursor = Cursors.AppStarting;
    Process.Start("http://www.logos.com/");
}
catch (Exception)
{
    // ignored: the browser may have launched even though an exception was raised
}
finally
{
    Mouse.OverrideCursor = null;
}

Wrap that up in a utility method and you're good to go.

Posted by Ed Ball at 9:09 AM | Comments (2) | TrackBack

January 14, 2008

Null-propagating extension method

Brad Wilson recently discovered that C# extension methods work on null references. We also discovered this interesting feature while developing a null-propagating extension method that we call IfNotNull:

public static class IfNotNullExtensionMethod
{
    public static U IfNotNull<T, U>(this T t, Func<T, U> fn)
    {
        return t != null ? fn(t) : default(U);
    }
}

Since this is an extension method on T, where T has no constraints, it is always available to any class or struct instance, much like Equals and GetHashCode. Therefore, we put it in its own namespace so that it only appears when requested.

You'll notice that the IfNotNull method is only an obscure way to call the specified delegate on the instance, except that we check to see whether the instance is null, and don't call the delegate in that case, returning null (or the default value in the case of a struct).
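To make the "default value" case concrete, here is a small self-contained sketch. The IfNotNullDemo class and its two helper methods are hypothetical and exist only to show what comes back when the instance is null; the extension method itself is repeated so the sample compiles on its own.

```csharp
using System;

public static class IfNotNullExtensionMethod
{
    public static U IfNotNull<T, U>(this T t, Func<T, U> fn)
    {
        return t != null ? fn(t) : default(U);
    }
}

// Hypothetical helpers, for illustration only.
public static class IfNotNullDemo
{
    // U is int (a struct), so a null string yields default(int), which is 0.
    public static int LengthOf(string text)
    {
        return text.IfNotNull(s => s.Length);
    }

    // U is string (a class), so a null string yields null.
    public static string UpperOf(string text)
    {
        return text.IfNotNull(s => s.ToUpperInvariant());
    }
}
```

Here LengthOf(null) returns 0 rather than throwing, and UpperOf(null) returns null. If a caller needs to tell "null instance" apart from "delegate returned 0", a nullable return type such as int? would be one way to preserve that distinction.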

How do we use this method? Well, suppose you have an item with a Parent property and you want to get the Name of its parent.

string parentName = item.Parent.Name;

But suppose that item might be null, or the item's parent might be null, and you'd like the parentName variable to be null in either case.

string parentName = null;
if (item != null)
{
    Item parent = item.Parent;
    if (parent != null)
        parentName = parent.Name;
}

Or, more concisely:

Item parent = item == null ? null : item.Parent;
string parentName = parent == null ? null : parent.Name;

It would be nice to eliminate the temporary 'parent' variable, but we'd have to call the Parent property twice:

string parentName = item == null ? null : item.Parent == null ? null : item.Parent.Name;

The IfNotNull extension method allows us to eliminate the temporary 'parent' variable without calling the Parent property twice:

string parentName = item.IfNotNull(x => x.Parent).IfNotNull(x => x.Name);

The use of lambda expressions makes using IfNotNull the least efficient of all of the possibilities, so only use it when you believe the increased code clarity outweighs the performance loss.

Some programmers may legitimately feel that it doesn't improve code clarity at all. This was the best we could come up with in lieu of the ?. operator proposed by Miral, which would allow this:

string parentName = item?.Parent?.Name;

Update: Renamed the extension method from "To" to "IfNotNull"; the new name makes the semantics of the method clearer. We still have a To extension method, but it's even simpler:

public static U To<T, U>(this T t, Func<T, U> fn)
{
    return fn(t);
}

What possible use could this method have? Eliminating temporary variables. Here's a simple example:

IList<int> list = BuildListOfNumbers();
int nFirstOrDefault = list.Count == 0 ? 0 : list[0];

becomes

int nFirstOrDefault =
    BuildListOfNumbers().To(list => list.Count == 0 ? 0 : list[0]);

For what it's worth.

Posted by Ed Ball at 8:30 AM | Comments (1) | TrackBack