Exiting the monad with eval in LINQ

LINQ to object queries are lazy, but this is actually rarely what I need, so in most cases I add a ToList() or a ToArray() in my code before using the result of the query. Since this requires the addition of brackets around the query, the readability of the code suffers and often I skip LINQ and use chained extension methods instead:

var x = (from i in numbers
         where i < 0
         select i*2).ToArray();
var x = numbers.Where(i => i < 0).Select(i => i*2).ToArray();

The same happens with all methods that exit the monad, e.g. First/Last/Single, Any/All/Contains, Aggregate, Average/Min/Max/Sum, Count, etc. So wouldn’t it be nice to have an extra clause in LINQ that would let one apply any of these method to the result of the query?

var x = from i in numbers
        where i < 0
        select i*2
        eval ToArray();

I am using the eval keyword here as it does actually result in the evaluation of the query. In some cases we might want to include a lambda expression:

var x = from i in numbers
        where i < 0 select i*2 eval First(j => j > 8);

which isn’t very pretty but makes it clear that the last bit is not actually part of the query. The alternative would be to add new keywords to LINQ, e.g. first, last, any, etc. But this approach would result in a proliferation of keywords. The advantage of the above approach is that any existing method can be used.

Intermediate approaches might be possible, for example:

var x = from i in numbers
        where i < 0 select i*2 eval First j where j > 8;

where the where keyword could be used whenever the lambda is a predicate? Similarly for Sum/Max/Min/Average

var x = from i in numbers
        where i < 0
        select i*2
        eval Sum j for j*j;

though this one doesn’t look too nice.

Never mind, I have created a fork of Roslyn on Codeplex to try out this idea, … to be continued once the changes has been committed.

[24 hours later] I committed the code in a fork named AddEvalClauseToLinq. It was nice and easy to modify the syntax, but things got trickier when translating the query. I managed to get the example code (see below) to build and execute, but I would have to spend much more time on the compiler to do a less hacky job.

Here is the code I managed to compile:

var ints = new[] { 1, 2, 3 };
Func<int, bool> Trace = i => { Console.Write("<" + i + ">"); return true; };

Console.WriteLine("x = from i in ints where Trace(i) select i*2;");
var x = from i in ints
        where Trace(i)
        select i*2;
Console.WriteLine("type: " + x.GetType().Name);
foreach (var i in x) Console.Write(i + " ");
foreach (var i in x) Console.Write(i + " ");

Console.WriteLine("y = from i in ints where Trace(i) select i*2 eval ToList;");
var y = from i in ints
        where Trace(i)
        select i * 2
        eval ToList();
Console.WriteLine("type: " + y.GetType().Name);
foreach (var i in y) Console.Write(i + " ");
foreach (var i in y) Console.Write(i + " ");

Console.WriteLine("z = from i in ints where Trace(i) select i*2 eval First;");
var z = from i in ints
        where Trace(i)
        select i * 2
        eval First();
Console.WriteLine("type: " + z.GetType().Name);
Console.WriteLine("Value: " + z);

Console.WriteLine("s = from i in ints where Trace(i) select i*2 eval Sum(j => j*j);");
var s = from i in ints
        where Trace(i)
        select i * 2
        eval Sum(j => j*j);  
Console.WriteLine("type: " + s.GetType().Name);
Console.WriteLine("Value: " + s);

and here is the console output:

x = from i in ints where Trace(i) select i*2;
type: WhereSelectArrayIterator`2
<1>2 <2>4 <3>6
<1>2 <2>4 <3>6

y = from i in ints where Trace(i) select i*2 eval ToList;
<1><2><3>type: List`1
2 4 6
2 4 6

z = from i in ints where Trace(i) select i*2 eval First;
<1>type: Int32
Value: 2

m = from i in ints where Trace(i) select i*2 eval Sum(t => t*t);
<1><2><3>type: Int32
Value: 56

I think that the console output says it all, types are as expected and only the first query without eval is executed multiple times.

My next Roslyn hack I would love to try would be to get better syntaxic support for tuples, just as in

var x = from (i,j) in listOfTuples
        where i < 0
        select j*2;


var (i,j) = Tuple.Create(1,2);

depending on how hard it turns out to be.

[EDIT] Something very similar has been proposed #3571

Join-Calculus library hosted on Codeplex

Just a quick note to mention that the Join-Calculus library is now hosted on Codeplex.

Combining IEnumerable with Option/Maybe in LINQ queries

I have found that Option<T>/Maybe<T> (same as Nullable, but can be used with reference types and LINQ) to be incredibly useful, especially with LINQ support. Usefulness of Option/Maybe is well documented, so is also the LINQ support for Maybe<T> or Option<T>. What I want to briefly address here is the interplay between Option<T> and IEnumerable<T>. Essentially about flattening IEnumerable<Option<T>> to just IEnumerable<T> and Option<IEnumerable<T>> to IEnumerable<T>:

    IEnumerable<Option<T>> => IEnumerable<T>
    Option<IEnumerable<T>> => IEnumerable<T>

In the first case I want to avoid having to deal with an enumeration of holes, which in general is meaningless, while in the second an empty Option<IEnumerable<T>> can just be expressed as an empty IEnumerable<T>.

Taking concrete examples, this is what I would like to be able to write. Let’s first define a few helper methods to make our examples a little bit more realistic:

public static Option<V> GetValue<K, V>(this Dictionary<K, V> that, K key) {
    V value;
    return that.TryGetValue(key, out value)
        ? new Option<V>(value)
        : Option.Empty<V>();

public static Option<double> div(int n, int d) {
    return d == 0
        ? Option.Empty<double>()
        : new Option<double>(1.0n/d);

public static Option<double> inv(int n) {
    return n == 0
        ? Option.Empty<double>()
        : new Option<double>(1.0/n);

The simplest case consists of only selecting Options that have a value in a standard IEnumerable query:

IEnumerable<double> invs = from i in new[] { 0, 1, 2 }
                           select inv(i);

which contains just two elements. The same can be done with an Option sub query:

IEnumerable<double> invs = from i in new[] { 0, 1, 2 }
                           select from v in inv(i)
                                  select v;

A more complex example:

IEnumerable<double> res = from i in new[] {1, 2}
                          from j in new[] {0, 2}
                          select div(i, j);

For the other operation we use a dictionary of IEnumerable to illustrate the implementation:

var d = new Dictionary<int, IEnumerable<int>>() {
    {1, new[] {1, 2, 3}}

IEnumerable<int> res = from v in d.GetValue(1)
                       from w in d.GetValue(1)
                       select v.Concat(w);

The following example returns an empty enumeration:

IEnumerable<int> res = from v in d.GetValue(1)
                       from w in d.GetValue(0)
                       select v.Concat(w);

Queries can also be nested:

IEnumerable<int> res = from e in from v in d.GetValue(1)
                                 select v
                       select e*2;

Here are the overloads I defined to support these examples:

public static IEnumerable<R> Select<T,R>(this IEnumerable<T> source, Func<T, Option<R>> selector) {
    foreach (var s in source) {
        var r= selector(s);
        if (r.HasValue) yield return r.Value;

public static IEnumerable<R> SelectMany<T,C,R>(this IEnumerable<T> source, Func<T, IEnumerable<C>> collectionSelector, Func<T, C, Option<R>> resultSelector) {
    foreach (var s in source) {
        foreach (var r in collectionSelector(s)) {
            var result= resultSelector(s, r);
            if (result.HasValue) yield return result.Value;

public static IEnumerable<R> SelectMany<T,O,R>(this Option<T> that, Func<T, Option<O>> optionSelector, Func<T, O, IEnumerable<R>> selector) {
    if (that.HasValue) {
        var result = optionSelector(that.Value);
        if (result.HasValue) return selector(that.Value, result.Value);
    return Enumerable.Empty<R>();

public static IEnumerable<R> Select<T,R>(this Option<T> that, Func<T, IEnumerable<R>> selector) {
    return that.HasValue
        ? selector(that.Value)
        : Enumerable.Empty<R>();

For the record here are the LINQ extension methods for Option:

public static Option<R> SelectMany<T, O, R>(this Option<T> that, Func<T, Option<O>> optionSelector, Func<T, O, R> selector) {
    if (that.HasValue) {
        var option = optionSelector(that.Value);
        if (option.HasValue) {
            return new Option<R>(selector(that.Value, option.Value));
    return Option.Empty<R>();

public static Option<R> SelectMany<T, O, R>(this Option<T> that, Func<T, Option<O>> optionSelector, Func<T, O, Option<R>> selector) {
    if (that.HasValue) {
        var option = optionSelector(that.Value);
        if (option.HasValue) return selector(that.Value, option.Value);
    return Option.Empty<R>();

public static Option<R> SelectMany<T, R>(this Option<T> that, Func<T, Option<R>> selector) {
    if (that.HasValue) {
        var selected = selector(that.Value);
        if (selected.HasValue) return selected;
    return Option.Empty<R>();

public static Option<R> Select<T, R>(this Option<T> that, Func<T, R> selector) {
    return that.HasValue
        ? new Option<R>(selector(that.Value))
        : Option.Empty<R>();

There are be other ways of combining Option and IEnumerable, but quite often I ended up confusing LINQ as it was unable to make its choice between two or more overloads. So far, I have only used the first two methods in production code, and they have nicely demonstrated their worth in terms of readability and conciseness.

Runtime scopes


The project I am working on at the moment contains approximately 1000 interfaces and 4000 classes. We use Autofac for dependency injection, but not all the project uses dependency injection. Within the solution these types are grouped by folder/namespaces and this helps find related types. The only real organization is the use of one Autofac scope to manage sessions. So the types are grouped in two scopes, the global scope and the session scope.

I was expecting the scope separation between classes to be clearly visible on a diagram that represents the dependencies between types. In fact, producing such a diagram turns out to be much harder than I thought. This is a topic I will discuss further in another post. But I am hoping that a well organized diagram would clearly show a natural grouping of classes based on their dependencies.

My goal here is to somewhat formalize this grouping of classes. For this I am proposing to make use of the concept of scoping which plays an important role in many areas of computer science.


Scopes are so pervasive that we use them pretty much unconsciously. In C# we have for example the following scopes:

  • statement block, delineated by curly brackets
  • using statement scope
  • method scope
  • class scope
  • namespace scope

Some characteristics and properties of scopes:

  • Always
  • Nested  or side-by-side
  • Non crossing
  • Isolation
    • Outer scopes have no access to the content of inner scopes
    • Inner scopes might or might not have access to the content of outer scopes
    • It is also good practice for inner scopes not to modify the state of outer scopes
    • Sharing
      • Content of outer scopes might be available to inner scopes
      • Entry/exit
        • Data passed in on entry
        • Data returned on exit
        • Creation on entry
        • Disposal on exit
        • Restoring state on exit
        • Lifetime of object inside scope

Generally speaking, scope is considered to be a source code level concept. Scopes are enforced and resolved at compile time. Some aspects are also important at runtime, e.g. call stack management for method calls, disposal of objects when leaving a using scope. But I should emphasize that scopes are always bound to the source code. So the source code of an inner scope is defined within the code of the outer scope. This characteristic represents a strong constraint and limits the usefulness of lexical scopes.

Runtime scope

By contrast, a runtime scope defines the scoping of types declaratively rather than lexically. As such types defined in different namespaces or assemblies can live in the same scope. Note that runtime scopes are limited to classes and interfaces. Moreover I will only discuss runtime scopes in the context of (constructor) dependency injection. I will stick to Autofac as I am most familiar with it and as it already provides some support for scopes.

Declarative scoping

Every interface must define a scope (for the sake of simplicity lets limit ourselves, for now, to a single scope per interface) using a generic marking interface: interface IScoped<TScope> where TScope : IScope.

Scopes themselves are defined as interfaces that derive from the IScope interface.

Finally, scopes can be either singletons (only one can exist within a given scope at any time) or concurrent. I will consider defining other semantics as required, e.g. one-shot singleton or contiguous singletons. The marking interfaces are ISingleScope and IConcurrentScope.

Naturally scopes themselves are scoped.

With Autofac, one registers an implementation of an interface with a class non-declaratively, e.g. by calling the RegisterType method of the builder instance. This has two disadvantages, firstly as the sharing of the class is defined at the registration site rather than declaratively with the interface or class one has to repeatedly navigate between to the registration site and the interface\class implementations. It also makes the analysis of type dependencies by a tool much harder to perform unless one can use Roslyn. Going for a declarative approach solves these two problems.

Sharing, that is whether an instance is shared or not, or in other words whether an instance is a singleton or not can be defined declaratively using either marking interfaces or attributes. In my first version I use marking interfaces: ISingleton (the easy one) and the more dubious IManifold (please, please do suggest a better name).


Let’s have a look at the semantics of runtime scopes.

  1. A class or scope in TScope can only be resolved in a TScope or a sub-scope of TScope.
  2. All injected arguments of all constructors of a class (i.e. all its dependencies) in TScope must be in either TScope or a super-scope of TScope.

Condition (1) means that the content of inner scopes is not available to outer scopes, but the content of outer scopes is visible to inner scopes. This last condition could easily be restricted so as to fully isolate the inner scopes. Condition (2) states that all dependencies must come from the same scope or from an outer scope. Here again we could restrict this to only allow dependencies from the same scope.

New scopes are created by existing scopes. A scope can only create a direct inner scope.

Disposal of scopes

Ever scope can be closed, i.e. disposed of. Thanks to Autofac, every class in the scope will also be disposed of if it implements the IDisposable interface.

Returned results

Just as functions return one or more values on exit, a scope can return one or more values when it is closed. The current mechanism is to instantiate a class in its outer scope using Autofac, so that this shared instance can then be retrieved by the outer scope. I am not really convinced by this mechanism and it doesn’t work well with concurrent scopes. I am investigating other solutions, e.g. continuation scopes or have the closed scope expose the result.

A simple expression parser

In my previous post I gave a brief introduction to my parser generator and its grammar definition language REBNF. Here I will give a complete example of a grammar for parsing simple arithmetic expressions:

namespace -> "Example.Expressions"; 

goal = Expressions; 
Expressions = Expression/";"...; 
Expression : Term | Terms; 
Terms = Term\AddOp...; 
AddOp = "+" | "-"; 
Term : Factor | Factors; Factors = Factor\MulOp...; 
MulOp = "*" | "/"; 
Factor : constant | SubExpression; 
SubExpression = "(" Expression ")"; 
decimalDigit :: ['0'-'9']; 
constant == decimalDigit+; 
newLine == U+000D | U+000A | U+000D U+000A | U+2028 | U+2029; 
whitespaceChar :: [' ', U+0009, U+000B, U+000C]; 
whitespace == whitespaceChar+;

ignore -> whitespace; 
ignore -> newLine; 

This grammar contains also lexical definitions for three terminal tokens: constant, newLine and whitespace. The last two lines of the grammar are options which indicate that the whitespace and newLine tokens should be ignored by the parser. The option on the first line defines the namespace to use for the generated classes.

A class is generated for every terminal, here is the one for the constant token:

public partial class constant : ITerminal, IFactor {
    public constant(IToken token) {
        Token = token;
    public IToken Token { get; private set; } }

All terminals implement the following ITerminal interface:

public interface ITerminal {
    IToken Token { get; }

So every terminal class exposes the token returned by the lexer. An empty interface for every alternative rule is also generated. These are IExpression, ITerm and IFactor. IExpression is inherited by the ITerm interface and implemented by the Terms class. Similarly for ITerm is inherited by the IFactor interface and implemented by the Factors class. Finally IFactor is implemented by the constant and SubExpression classes. So both constant and SubExpression classes implement all three interfaces.

The big advantage of this grammar is that an expression which consists of a constant only will contain just one instance of the constant class, rather than, as in more usual approaches, of a hierarchy of expression, term, factor and constant tree nodes.

I will now demonstrate how this can be used to implement an Evaluate function for an expression. Firstly I extend the IExpression interface (all generated classes and interfaces are marked as partial) with the following method:

public partial interface IExpression {
    double Evaluate();

Now all classes which implement this interface will have to implement this method, so we have the following:

public partial class Terms : ZsK.G5.Syntax.IExpression {
    public double Evaluate() {
        var result = Terms.First().Evaluate();
        foreach (var pair in Terms.Skip(1).Zip(AddOps, (factor, addOp) => new { op = addOp, val = factor.Evaluate() })) {
            if (pair.op.Minus == null) {
                result += pair.val;
            } else {
                result -= pair.val;
        return result;
public partial class Factors {
    public double Evaluate() {
        var result = Factors.First().Evaluate();
        foreach (var pair in Factors.Skip(1).Zip(MulOps, (factor, mulOp) => new { op = mulOp, val = factor.Evaluate() })) {
            if (pair.op.Times == null) {
                result /= pair.val;
            } else {
                result *= pair.val;
        return result;
public partial class constant : ITerminal, IFactor {
    public double Evaluate() {
        return int.Parse(this.Token.Value);
public partial class SubExpression : IFactor {
    public double Evaluate() {
        return this.Expression.Evaluate();

The processing on the lists of terms and factors could be improved by using better list processing primitives. Otherwise the code consists mostly in mapping lexical terms to actual values and operations. I think that this example illustrates nicely the advantages of mapping alternative rules to interfaces.

Walking the syntax tree

There are two ways one can walk the syntax tree, in the first case one’s code gets called at every step of the walk, in the other one, one can override specific method of a generated class. Let me start with the first one:

class Walker : WalkContext {
    public override void OnEnter(object node) {
    public override void OnExit(object node) {
    public override void OnLeaf(object leaf) {

The following implementation of the OnLeaf method lets you reconstrcut the original source:

public override void OnLeaf(object leaf) {
    if (leaf != null) {
        IToken t = (leaf as dynamic).Token;
        if (t.Next != null) {
            if (t.Next.Symbol.Name == "whitespace" || t.Next.Symbol.Name == "newLine") {

Remember that we directed the grammar to ignore all whitespaces and new lines, so these tokens are not referenced anywhere in the syntax tree, however there are still stored in singly linked list of tokens as illustrated in the code. The following Walker class is also generated where I have only expanded just one of the generated method for illustration purposes:

public class Walker {
    public virtual void OnEnter(object node) {...}
    public virtual void OnExit(object node) {...}
    public virtual void OnLeaf(global::ZsK.G5.Interfaces.ITerminal terminal) {...}
    public virtual void Walk(AddOp rule) {...}
    public virtual void Walk(constant item) {...}
    public virtual void Walk(Expressions rule) {
        int Expression_i = 0;
        if ((rule.Expression != null)) {
            for (Expression_i = 0; (Expression_i < (rule.Expression.Count - 0)); Expression_i = (Expression_i + 1)) {
    public virtual void Walk(Factors rule) {...}
    public virtual void Walk(goal rule) {...}
    public virtual void Walk(IExpression item) {...}
    public virtual void Walk(IFactor item) {...}
    public virtual void Walk(ITerm item) {...}
    public virtual void Walk(Minus item) {...}
    public virtual void Walk(MulOp rule) {...}
    public virtual void Walk(newLine item) {...}
    public virtual void Walk(Plus item) {...}
    public virtual void Walk(RoundBra item) {...}
    public virtual void Walk(RoundKet item) {...}
    public virtual void Walk(SemiColon item) {...}
    public virtual void Walk(SubExpression rule) {...}
    public virtual void Walk(Terms rule) {...}
    public virtual void Walk(Times item) {...}
    public virtual void Walk(UpSlash item) {...}
    public virtual void Walk(whitespace item) {...}

This approach offers more flexibility than the first option.

I am in the process of writing the C# 4.0 grammar in REBNF. This is a really useful exercise (hopefully it will be more than that) as it is showing up already plenty of problems with the current form of REBNF. So changes are likely to happen in the next version of REBNF. Rev-ing the grammar itself is usually a rather painful process, and I have usually been waiting until I have collected enough annoying issues in the generator itself to move forward. I should specify that the current REBNF parser was generated using the previous version of REBNF, which in turn was generated using its preceding one, etc. The current REBNF stands at version 5.

Not quite sure what the next post on this topic will cover, I might discuss the C# grammar, or give more details about REBNF, or even add some functionality to this post’s expression parser example.

Introduction to REBNF

My previous post was about a year old idea, this post is about a 20 years old one! It is basically a parser and scanner generator. Originally written in Turbo Pascal, then in Delphi Pascal, I eventually ported it to C# 2.0 and 3.0. This project has spent most of its life in a dormant state, and was painfully revived every couple of years.

This scanner and parser generator has the following features:

  • Combines scanner and parser definitions in a single grammar definition
  • Maps grammar rules to C# classes
  • Maps alternatives to C# interfaces
  • EBNF style grammar is restricted to ease mapping to classes
  • Supports repetition with separators
  • Syntax tree is strongly typed
  • Syntax tree contains all source tokens
  • Normally ignored tokens such as whitespaces and comments are also available in the syntax tree
  • Two types of syntax tree walk mechanisms are generated
  • Generates LR1 parsers

Read more of this post

Traced execution

I will present in this post a proof of concept of an idea I got nearly a year ago while trying to remote the execution of some code and got tired of the limitations of C# in general and of Linq.Expressions in particular. Run-time reflection in C# 4.0 is limited to type and Linq.Expressions, here I demonstrate a form of reflection that can be performed on a small subset of C# statements.

Read more of this post


FruitPlate Difficile de recommander les demenagements Guigard… non seulement il ont cause beaucoup de degats, mais en plus je n’ai pas encore ete indemnise… une annee apres le demenagement, une habitude ? De plus toutes nos affaires ont une terrible odeur de moisi qui persiste jusqu’a ce jours. Voir egalement ici.

Piecewise immutable programming

Copyright © 2011 Gabriel Zs. K. Horvath

Although not a programming language expert at all, I will try to muster all my limited knowledge to briefly discuss software transactions in terms of programming model. Be aware that this is all very tentative…

On the one hand the level of isolation and immutability is very much reminiscent of functional programming, on the other hand the mutability of the versioned variables is essentially imperative programming. To better understand this relationship let’s see, very tentatively, how my model can be used to code in a “pure” imperative way or in a more functional one.

Read more of this post

Model specification

Copyright © 2011 Gabriel Zs. K. Horvath

This is my first attempt at giving a more formal definition of my software transactions model. Undoubtedly, a post which will be frequently updated.

Consider a system composed of the following:

  • A single transaction manager holding a unique global version number
  • Transactions
  • Threads
  • Semi-mutable versioned variables managed by the transaction manager
  • Non-versioned transaction-local immutable variables
  • External resources, these can be either read from or written to
  • A garbage collector

The program consists in the execution of transactions in one or more concurrent threads.

Read more of this post

Awelon Blue

Thoughts on Programming Experience Design

Joe Duffy's Blog

Adventures in the High-tech Underbelly

Design Matters

Furniture design blog from Design Matters author George Walker


A Blog for Woodworkers by Gary Rogowski

Paul Sellers' Blog

– A Lifestyle Woodworker –


Woodworking, life and all things between