Empty for a reason

So far I have found Option<T> for reference types to be the best way of avoiding null reference exceptions while keeping my code very readable. I use it when parsing strings, retrieving values from dictionaries, getting properties from object via reflection and more. Option<T> allows one to write very clean code without any risk of null reference exceptions. To illustrate this point, consider the following code:

string r = null;
if (someObject != null) {
    var value = someObject.GetPropertyValue<string>("P");
    int i;
    if (value != null && int.TryParse(value, out i)) {
        dict.TryGetValue(i, out r);
    }
}

which, using Option<T> and associated LINQ support can be rewritten as

var r = from o in someObject.AsOption()
        from s in o.GetPropertyValue<string>("P")
        from i in s.ParseAsInt()
        from v in i.Lookup(dict)
        select v;

on the first line we have an empty Option if someObject is null. The first from extracts the actual object if the Option is not empty, the second one retrieves a property value if the property exists and if it is of type string and if the value is not null, then we try to parse the string and get an integer value if the parsing is successful, finally we look up a dictionary and again get a value only if the dictionary holds the key. So all in all we have six different reasons to end up with an empty result. The main goal of Option is to avoid null reference exceptions, but quite often, in particular while debugging, one is actually interested in why the result is empty, is it because the dictionary didn’t contain the key? because the string didn’t represent an integer? etc. As the code stands there is no way of knowing the actual reason without redoing the whole calculation one step at a time, typically by stepping through the code one step at a time.

An empty result is empty for one  reason and only one reason. So the idea is that whenever a step returns an empty option the reason (a string) for returning empty is stored in the option. Here is the implementation of the methods used in the above example, you will see that each time an empty Option is returned, the reason for doing so is specified:

public static Option<T> AsOption<T>(this T value) {
    if (value is ValueType) return new Option<T>(value);
    if ((object)value == null) return Option.Empty<T>("value is null");
    return new Option<T>(value);
}

public static Option<T> AsOption<T>(this T value, string reason) {
    if (value is ValueType) return new Option<T>(value);
    if ((object)value == null) return Option.Empty<T>(reason);
    return new Option<T>(value);
}

public static Option<R> GetPropertyValue<R>(this object value, string propertyName) {
    return from v in value.AsOption()
           from pi in v.GetPropertyInfo(propertyName)
           from cast in pi.GetValue(v).AsOption().As<R>()
           select cast;
}

public static Option<PropertyInfo> GetPropertyInfo<T>(this T value, string propertyName) {
    return from v in value.AsOption()
           from p in v.GetType()
                      .GetProperty(propertyName)
                      .AsOption("Property " + propertyName + " not found in " + v.GetType().Name)
           select p;
}

public Option<R> As<R>() {
    return this.HasValue && value is R
        ? ((R)(object)value).AsOption()
        : Option.Empty<R>("Cannot cast " + typeof(T).Name + " to " + typeof(R).Name);
}

public static Option<int> ParseAsInt(this string source) {
    int value;
    return int.TryParse(source, out value)
        ? value.AsOption()
        : Option.Empty<int>("Input string cannot be parsed:" + source);
}

public static IOption<V> Lookup<K, V>(this K key, Dictionary<K,V> dict) {
    V value;
    return dict.TryGetValue(key, out value)
        ? new Option<V>(value)
        : Option.Empty<V>("Key not found:" + key.ToString());
}

So now, if the result of our query is empty, we can retrieve the reason for it being empty from the Option instance in the form of a string. Note that this was achieved without making a single modification to our code.

Let’s now have a look at how the other version of the code would have to be modified to achieve the same thing:

string reason = "null someObject";
string r = null;
if (someObject != null){
    reason = "Unable to retrieve property P in " + someObject.GetType().Name;
    var value = someObject.GetPropertyValue<string>("P").Value;
    int i;
    if (value != null) {
        reason = "Unable to parse " + value + " to an int";
        if (int.TryParse(value, out i)){
            reason = "Key not found:" + i.ToString();
            dict.TryGetValue(i, out r);
        }
    }
}

where I have avoided using else branches to keep the code compact and slightly more readable, but clearly there is no contest between the two versions.

So far I have found this useful mostly for logging and debugging purposes. But I wonder whether a strongly typed version of this could be used for other purposes…

Combining IEnumerable with Option/Maybe in LINQ queries

I have found that Option<T>/Maybe<T> (same as Nullable, but can be used with reference types and LINQ) to be incredibly useful, especially with LINQ support. Usefulness of Option/Maybe is well documented, so is also the LINQ support for Maybe<T> or Option<T>. What I want to briefly address here is the interplay between Option<T> and IEnumerable<T>. Essentially about flattening IEnumerable<Option<T>> to just IEnumerable<T> and Option<IEnumerable<T>> to IEnumerable<T>:

    IEnumerable<Option<T>> => IEnumerable<T>
    Option<IEnumerable<T>> => IEnumerable<T>

In the first case I want to avoid having to deal with an enumeration of holes, which in general is meaningless, while in the second an empty Option<IEnumerable<T>> can just be expressed as an empty IEnumerable<T>.

Taking concrete examples, this is what I would like to be able to write. Let’s first define a few helper methods to make our examples a little bit more realistic:

public static Option<V> GetValue<K, V>(this Dictionary<K, V> that, K key) {
    V value;
    return that.TryGetValue(key, out value)
        ? new Option<V>(value)
        : Option.Empty<V>();
}

public static Option<double> div(int n, int d) {
    return d == 0
        ? Option.Empty<double>()
        : new Option<double>(1.0n/d);
}

public static Option<double> inv(int n) {
    return n == 0
        ? Option.Empty<double>()
        : new Option<double>(1.0/n);
}

 
The simplest case consists of only selecting Options that have a value in a standard IEnumerable query:

IEnumerable<double> invs = from i in new[] { 0, 1, 2 }
                           select inv(i);

 
which contains just two elements. The same can be done with an Option sub query:

IEnumerable<double> invs = from i in new[] { 0, 1, 2 }
                           select from v in inv(i)
                                  select v;

 
A more complex example:

IEnumerable<double> res = from i in new[] {1, 2}
                          from j in new[] {0, 2}
                          select div(i, j);

 
For the other operation we use a dictionary of IEnumerable to illustrate the implementation:

var d = new Dictionary<int, IEnumerable<int>>() {
    {1, new[] {1, 2, 3}}
};

IEnumerable<int> res = from v in d.GetValue(1)
                       from w in d.GetValue(1)
                       select v.Concat(w);

 
The following example returns an empty enumeration:

IEnumerable<int> res = from v in d.GetValue(1)
                       from w in d.GetValue(0)
                       select v.Concat(w);

 
Queries can also be nested:

IEnumerable<int> res = from e in from v in d.GetValue(1)
                                 select v
                       select e*2;

 
Here are the overloads I defined to support these examples:

public static IEnumerable<R> Select<T,R>(this IEnumerable<T> source, Func<T, Option<R>> selector) {
    foreach (var s in source) {
        var r= selector(s);
        if (r.HasValue) yield return r.Value;
    }
}

public static IEnumerable<R> SelectMany<T,C,R>(this IEnumerable<T> source, Func<T, IEnumerable<C>> collectionSelector, Func<T, C, Option<R>> resultSelector) {
    foreach (var s in source) {
        foreach (var r in collectionSelector(s)) {
            var result= resultSelector(s, r);
            if (result.HasValue) yield return result.Value;
        }
    }
}

public static IEnumerable<R> SelectMany<T,O,R>(this Option<T> that, Func<T, Option<O>> optionSelector, Func<T, O, IEnumerable<R>> selector) {
    if (that.HasValue) {
        var result = optionSelector(that.Value);
        if (result.HasValue) return selector(that.Value, result.Value);
    }
    return Enumerable.Empty<R>();
}

public static IEnumerable<R> Select<T,R>(this Option<T> that, Func<T, IEnumerable<R>> selector) {
    return that.HasValue
        ? selector(that.Value)
        : Enumerable.Empty<R>();
}

 
For the record here are the LINQ extension methods for Option:

public static Option<R> SelectMany<T, O, R>(this Option<T> that, Func<T, Option<O>> optionSelector, Func<T, O, R> selector) {
    if (that.HasValue) {
        var option = optionSelector(that.Value);
        if (option.HasValue) {
            return new Option<R>(selector(that.Value, option.Value));
        }
    }
    return Option.Empty<R>();
}

public static Option<R> SelectMany<T, O, R>(this Option<T> that, Func<T, Option<O>> optionSelector, Func<T, O, Option<R>> selector) {
    if (that.HasValue) {
        var option = optionSelector(that.Value);
        if (option.HasValue) return selector(that.Value, option.Value);
    }
    return Option.Empty<R>();
}

public static Option<R> SelectMany<T, R>(this Option<T> that, Func<T, Option<R>> selector) {
    if (that.HasValue) {
        var selected = selector(that.Value);
        if (selected.HasValue) return selected;
    }
    return Option.Empty<R>();
}

public static Option<R> Select<T, R>(this Option<T> that, Func<T, R> selector) {
    return that.HasValue
        ? new Option<R>(selector(that.Value))
        : Option.Empty<R>();
}

There are be other ways of combining Option and IEnumerable, but quite often I ended up confusing LINQ as it was unable to make its choice between two or more overloads. So far, I have only used the first two methods in production code, and they have nicely demonstrated their worth in terms of readability and conciseness.

Awelon Blue

Thoughts on Programming Experience Design

Joe Duffy's Blog

Adventures in the High-tech Underbelly

Design Matters

Furniture design blog from Design Matters author George Walker

Shavings

A Blog for Woodworkers by Gary Rogowski

Paul Sellers' Blog

– A Lifestyle Woodworker –

randallnatomagan

Woodworking, life and all things between