Exiting the monad with eval in LINQ
November 5, 2014
LINQ to Objects queries are lazy, but this is rarely what I need, so in most cases I add a ToList() or a ToArray() before using the result of the query. Since this requires wrapping the query in parentheses, readability suffers, and I often skip the query syntax and use chained extension methods instead:
var x = (from i in numbers where i < 0 select i*2).ToArray();
var x = numbers.Where(i => i < 0).Select(i => i*2).ToArray();
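To make the laziness concrete, here is a minimal sketch that counts how often the where predicate runs when the result is consumed twice. The class name, the sample data and the counter are made up for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LazyVersusEager
{
    // Returns how many times the where-predicate ran after consuming the result twice.
    public static int PredicateCalls(bool materialize)
    {
        var numbers = new[] { -3, -1, 2, 5 };   // hypothetical sample data
        int calls = 0;
        Func<int, bool> isNegative = i => { calls++; return i < 0; };

        var query = from i in numbers where isNegative(i) select i * 2;

        // Either keep the lazy query or take a snapshot of it with ToArray().
        IEnumerable<int> source = materialize ? query.ToArray() : query;

        int first = source.Sum();    // lazy: runs the predicate over all 4 items
        int second = source.Sum();   // lazy: runs it over all 4 items again
        return calls;
    }

    static void Main()
    {
        Console.WriteLine(PredicateCalls(materialize: false)); // 8
        Console.WriteLine(PredicateCalls(materialize: true));  // 4
    }
}
```

With ToArray() the predicate runs once per element, however many times the result is read afterwards.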
The same happens with all the methods that exit the monad, e.g. First/Last/Single, Any/All/Contains, Aggregate, Average/Min/Max/Sum, Count, etc. So wouldn’t it be nice to have an extra clause in LINQ that would let one apply any of these methods to the result of the query?
var x = from i in numbers where i < 0 select i*2 eval ToArray();
I am using the eval keyword here as it does actually result in the evaluation of the query. In some cases we might want to include a lambda expression:
var x = from i in numbers where i < 0 select i*2 eval First(j => j > 8);
which isn’t very pretty but makes it clear that the last bit is not actually part of the query. The alternative would be to add new keywords to LINQ, e.g. first, last, any, etc. But this approach would result in a proliferation of keywords. The advantage of the above approach is that any existing method can be used.
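One way to read the proposed clause is as sugar for calling the named method on the finished query. The sketch below spells that out in today's C#. The class name, the data and the predicate j < -4 are hypothetical (a predicate such as j > 8 could never match doubled negative numbers), and this is only my reading of the intended meaning, not what a modified compiler necessarily emits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EvalDesugaring
{
    static readonly int[] Numbers = { -1, -3, 4, -6 };   // hypothetical sample data

    // Stand-in for: from i in Numbers where i < 0 select i*2 eval ToArray();
    public static int[] EvalToArray()
    {
        IEnumerable<int> query = from i in Numbers where i < 0 select i * 2;
        return query.ToArray();
    }

    // Stand-in for: from i in Numbers where i < 0 select i*2 eval First(j => j < -4);
    public static int EvalFirst()
    {
        IEnumerable<int> query = from i in Numbers where i < 0 select i * 2;
        return query.First(j => j < -4);
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", EvalToArray())); // -2 -6 -12
        Console.WriteLine(EvalFirst());                     // -6
    }
}
```

Under this reading any existing method that takes the sequence as its first argument can follow eval, which is why no new keywords would be needed.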
Intermediate approaches might be possible, for example:
var x = from i in numbers where i < 0 select i*2 eval First j where j > 8;
where the where keyword could be used whenever the lambda is a predicate. Similarly for Sum/Max/Min/Average:
var x = from i in numbers where i < 0 select i*2 eval Sum j for j*j;
though this one doesn’t look too nice.
Never mind, I have created a fork of Roslyn on Codeplex to try out this idea, … to be continued once the changes have been committed.
[24 hours later] I committed the code in a fork named AddEvalClauseToLinq. It was nice and easy to modify the syntax, but things got trickier when translating the query. I managed to get the example code (see below) to build and execute, but I would have to spend much more time on the compiler to do a less hacky job.
Here is the code I managed to compile:
var ints = new[] { 1, 2, 3 };
Func<int, bool> Trace = i => { Console.Write("<" + i + ">"); return true; };

Console.WriteLine("x = from i in ints where Trace(i) select i*2;");
var x = from i in ints where Trace(i) select i*2;
Console.WriteLine("type: " + x.GetType().Name);
foreach (var i in x) Console.Write(i + " ");
Console.WriteLine();
foreach (var i in x) Console.Write(i + " ");
Console.WriteLine();
Console.WriteLine();

Console.WriteLine("y = from i in ints where Trace(i) select i*2 eval ToList;");
var y = from i in ints where Trace(i) select i * 2 eval ToList();
Console.WriteLine("type: " + y.GetType().Name);
foreach (var i in y) Console.Write(i + " ");
Console.WriteLine();
foreach (var i in y) Console.Write(i + " ");
Console.WriteLine();
Console.WriteLine();

Console.WriteLine("z = from i in ints where Trace(i) select i*2 eval First;");
var z = from i in ints where Trace(i) select i * 2 eval First();
Console.WriteLine("type: " + z.GetType().Name);
Console.WriteLine("Value: " + z);
Console.WriteLine();
Console.WriteLine();

Console.WriteLine("s = from i in ints where Trace(i) select i*2 eval Sum(j => j*j);");
var s = from i in ints where Trace(i) select i * 2 eval Sum(j => j*j);
Console.WriteLine("type: " + s.GetType().Name);
Console.WriteLine("Value: " + s);
and here is the console output:
x = from i in ints where Trace(i) select i*2;
type: WhereSelectArrayIterator`2
<1>2 <2>4 <3>6
<1>2 <2>4 <3>6

y = from i in ints where Trace(i) select i*2 eval ToList;
<1><2><3>type: List`1
2 4 6
2 4 6

z = from i in ints where Trace(i) select i*2 eval First;
<1>type: Int32
Value: 2


m = from i in ints where Trace(i) select i*2 eval Sum(t => t*t);
<1><2><3>type: Int32
Value: 56
I think the console output says it all: the types are as expected, and only the first query, the one without eval, is executed multiple times.
The next Roslyn hack I would love to try is better syntactic support for tuples, as in
var x = from (i,j) in listOfTuples where i < 0 select j*2;
or,
var (i,j) = Tuple.Create(1,2);
depending on how hard it turns out to be.
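For comparison, C# 7.0 later added tuple literals, named tuple elements and deconstruction, which covers the second form directly and gets close to the first. As far as I know, query expressions still do not accept a deconstructing range variable such as from (i, j) in ..., so the sketch below uses named elements instead. The class name and the data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TupleSketch
{
    // Approximates: from (i,j) in listOfTuples where i < 0 select j*2
    public static int[] DoubledSecondWhereFirstNegative()
    {
        var listOfTuples = new List<(int i, int j)> { (-1, 10), (2, 20), (-3, 30) };
        var x = from t in listOfTuples where t.i < 0 select t.j * 2;
        return x.ToArray();
    }

    // Deconstruction into two new locals, using a value tuple literal
    // rather than Tuple.Create.
    public static int SumOfDeconstructed()
    {
        var (i, j) = (1, 2);
        return i + j;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", DoubledSecondWhereFirstNegative())); // 20 60
        Console.WriteLine(SumOfDeconstructed());                                // 3
    }
}
```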
[EDIT] Something very similar has been proposed: #3571