Tuesday, September 30, 2014

Should you avoid closures and LINQ?

You probably know that C# offers anonymous methods and lambda expressions (which are almost but not quite identical to each other). You can create them with the delegate keyword and the => operator, respectively. They are often a handy tool, and they are hard to avoid if you want to use certain library functions (such as List<T>.Sort()) or LINQ.
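For readers who haven't seen both forms side by side, here is a minimal sketch that passes first an anonymous method and then a lambda to List<T>.Sort(). The class and method names (SyntaxSketch, SortBothWays) are invented for this illustration only.

```csharp
using System;
using System.Collections.Generic;

public static class SyntaxSketch
{
    // Hypothetical helper: sorts the same list twice, once per syntax.
    public static List<int> SortBothWays()
    {
        List<int> numbers = new List<int> { 3, 1, 2 };

        // Anonymous method, created with the delegate keyword (ascending order).
        numbers.Sort(delegate(int a, int b) { return a.CompareTo(b); });

        // Lambda expression, created with the => operator (descending order).
        numbers.Sort((a, b) => b.CompareTo(a));

        return numbers;
    }
}
```

Both calls hand a Comparison<int> delegate to Sort(); only the syntax used to write it differs.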
Do anonymous methods and lambdas cause memory leaks? The answer is: it depends. The C# compiler actually has two very different ways of handling them. To understand the difference, consider the following small chunk of code:
int result = 0;
void Update()
{
    for (int i = 0; i < 100; i++)
    {
        System.Func<int, int> myFunc = (p) => p * p;
        result += myFunc(i);
    }
}
As you can see, the snippet seems to create a delegate myFunc 100 times each frame, using it each time to perform a calculation. But Mono only allocates heap memory the first time the Update() method is called (52 bytes on my system), and doesn't do any further heap allocations in subsequent frames. What's going on? Using a code reflector (as I'll explain in the next blog post), one can see that the C# compiler simply replaces myFunc with a static field of type System.Func<int, int> in the class that contains Update(). This field gets a name that is weird but also revealing: f__am$cache1 (it may differ somewhat on your system). In other words, the delegate is allocated only once and then cached.
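Written out by hand, the caching described above might look roughly like the following sketch. It is an approximation, not decompiled output: the names CachedDelegateSketch, cachedSquare, Square and SumOfSquares are made up, and the code the compiler really generates will differ in its details.

```csharp
using System;

public static class CachedDelegateSketch
{
    // Hypothetical stand-in for the compiler-generated static cache field.
    private static Func<int, int> cachedSquare;

    // Stand-in for the lambda body; it uses nothing but its own parameter.
    private static int Square(int p)
    {
        return p * p;
    }

    public static int SumOfSquares(int count)
    {
        int result = 0;
        for (int i = 0; i < count; i++)
        {
            // The delegate object is created on first use only, then reused.
            if (cachedSquare == null)
            {
                cachedSquare = new Func<int, int>(Square);
            }
            Func<int, int> myFunc = cachedSquare;
            result += myFunc(i);
        }
        return result;
    }
}
```

Calling SumOfSquares(100) returns 328350 however often it is called, and only the very first pass through the loop creates a delegate object.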
Now let's make a minor change to the definition of the delegate:
        System.Func<int, int> myFunc = (p) => p * i++;
By substituting 'i++' for 'p', we've turned something that could be called a 'locally defined function' into a true closure. Closures are a pillar of functional programming. They tie functions to data - more precisely, to non-local variables that were defined outside of the function. In the case of myFunc, 'p' is a local variable but 'i' is non-local, as it belongs to the scope of the Update() method. The C# compiler now has to convert myFunc into something that can access, and even modify, non-local variables. It achieves this by declaring (behind the scenes) an entirely new class that represents the reference environment in which myFunc was created. Because the delegate has to be bound to an object of this class, it can no longer be cached in a static field. Heap allocations now happen each time we pass through the for-loop, and we suddenly have a huge memory leak (2.6 KB per frame on my computer).
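To make the hidden class more tangible, here is a hand-written approximation of the closure version. The names ClosureSketch, LoopEnvironment, Multiply and Run are invented for this sketch. In it, the environment object is created once per call, since 'i' is a single variable shared by all iterations, and a fresh delegate is bound to it on every pass. What a given compiler version actually emits may differ.

```csharp
using System;

public static class ClosureSketch
{
    // Hypothetical stand-in for the compiler-generated environment class.
    private sealed class LoopEnvironment
    {
        // The captured loop variable now lives on the heap.
        public int i;

        // Stand-in for the lambda body: reads and modifies the captured i.
        public int Multiply(int p)
        {
            return p * i++;
        }
    }

    public static int Run(int limit)
    {
        int result = 0;
        LoopEnvironment env = new LoopEnvironment();
        for (env.i = 0; env.i < limit; env.i++)
        {
            // The delegate is tied to env, so a static cache field is not an option.
            Func<int, int> myFunc = new Func<int, int>(env.Multiply);
            result += myFunc(env.i);
        }
        return result;
    }
}
```

Note that the lambda increments 'i' as well, so with a limit of 100 the loop body runs only 50 times (for i = 0, 2, 4, ..., 98), and Run(100) returns 161700.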
Of course, the chief reason why lambda expressions and several other language features were introduced in C# 3.0 is LINQ. If closures can lead to memory leaks, is it safe to use LINQ in your game? I may be the wrong person to ask, as I have always avoided LINQ like the plague. Parts of LINQ apparently will not work on operating systems that don't support just-in-time compilation, such as iOS. But from a memory aspect, LINQ is bad news anyway. An incredibly basic expression like the following:
using System.Collections.Generic;
using System.Linq;

int[] array = { 1, 2, 3, 6, 7, 8 };

void Update()
{
    IEnumerable<int> elements = from element in array
                    orderby element descending
                    where element > 2
                    select element;
}
... allocates 68 bytes on my system in every frame (28 via Enumerable.OrderByDescending() and 40 via Enumerable.Where())! The culprit here isn't even closures but extension methods to IEnumerable: LINQ has to create intermediary arrays to arrive at the final result, and doesn't have a system in place for recycling them afterwards. That said, I am not an expert on LINQ and I do not know if there are components of it that can be used safely within a real-time environment.
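For comparison, here is one possible LINQ-free way to get the same elements, sketched under a few assumptions: the class name FilteredDescending, the reusable buffer field and the Collect() method are all hypothetical, and I have not measured this version. Whether List<T>.Sort() itself allocates depends on the runtime, so it is worth profiling before relying on it.

```csharp
using System;
using System.Collections.Generic;

public class FilteredDescending
{
    private readonly int[] array = { 1, 2, 3, 6, 7, 8 };

    // Reused across calls instead of building a new sequence every frame.
    private readonly List<int> buffer = new List<int>(6);

    // Non-capturing comparison, created once and stored in a static field.
    private static readonly Comparison<int> Descending = (a, b) => b.CompareTo(a);

    public List<int> Collect()
    {
        buffer.Clear();
        for (int i = 0; i < array.Length; i++)
        {
            // Same filter as the 'where' clause in the query.
            if (array[i] > 2)
            {
                buffer.Add(array[i]);
            }
        }
        // Same ordering as the 'orderby ... descending' clause.
        buffer.Sort(Descending);
        return buffer;
    }
}
```

Collect() hands back the same List<int> instance on every call, here holding 8, 7, 6, 3. The caller must not keep a reference to it across frames and expect the contents to stay put.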

