String Analysis in F # Generated IL

I wrote some small string parsing functions in F # to better understand F # and see how to deal with them. I am trying to go through a string and look for specific characters through recursion.

The logic really works, but the generated release build IL code (optimization enabled) really looks weird in my opinion. Therefore, I think there is a better way to write this material in the performance of F #.

Here is what part of the parsing functions looks like this:

let eatTag (input : string) index =
    let len = input.Length
    let nothing = 0, null, TagType.Open

    // more functions used in the same way
    // ...

    let rec findName i =
        if i >= len then nothing
        else
            let chr = input.[i]
            if isWhitespace chr then
                findName (i+1)
            elif chr = '/' then
                getName (i+1) (i+1) true
            else getName (i+1) i false

    let rec findStart i =
        if i >= len then nothing
        elif input.[i] = '<' then findName (i+1)
        else findStart (i+1)

    findStart index

Here's what the generated IL code looks like for the findStart function:

 // loop start
    IL_0000: nop
    IL_0001: ldarg.2
    IL_0002: ldarg.1
    IL_0003: blt.s IL_000e

    IL_0005: ldc.i4.0
    IL_0006: ldnull
    IL_0007: ldc.i4.0
    IL_0008: newobj instance void class [mscorlib]System.Tuple`3<int32, string, valuetype TagType>::.ctor(!0, !1, !2)
    IL_000d: ret

    IL_000e: ldarg.0
    IL_000f: ldarg.2
    IL_0010: call instance char [mscorlib]System.String::get_Chars(int32)
    IL_0015: ldc.i4.s 60
    IL_0017: bne.un.s IL_0024

    IL_0019: ldarg.0
    IL_001a: ldarg.1
    IL_001b: ldarg.2
    IL_001c: ldc.i4.1
    IL_001d: add
    IL_001e: call class [mscorlib]System.Tuple`3<int32, string, valuetype TagType> findName@70(string, int32, int32)
    IL_0023: ret

    IL_0024: ldarg.0
    IL_0025: ldarg.1
    IL_0026: ldarg.2
    IL_0027: ldc.i4.1
    IL_0028: add
    IL_0029: starg.s i
    IL_002b: starg.s len
    IL_002d: starg.s input
    IL_002f: br.s IL_0000
// end loop

The following code is displayed in the C # view (ILSpy) for this function - and this is especially important why I think I'm doing something wrong here. Obviously, the arguments to the function are somehow assigned to themselves ...?!

internal static Tuple<int, string, TagType> findStart@80(string input, int len, int i)
{
        while (i < len)
        {
                if (input[i] == '<')
                {
                        return findName@70(input, len, i + 1);
                }
                string arg_2D_0 = input;
                int arg_2B_0 = len;
                i++;
                len = arg_2B_0;
                input = arg_2D_0;
        }
        return new Tuple<int, string, TagType>(0, null, TagType.Open);
}

, . , , : -)

+3
3

.

"" . ( while(true) { }).

, "" , , , . , , , , .

+7

, . , "" , . , :

  • - , "" . , .

  • .tail call IL - "" , (, , let rec foo () = ... and bar () = ...). ( ), , .tail call .NET .

  • - , ( , )

, ( ), : :

let rec foo x = 
  if condition then 
    let x' = calculateNewArgument x // Run some computation
    foo x'                          // (Tail-)recursively calls itself
  else 
    calculateResult x               // Final calculation in the branch that returns

, :

let foo x = 
  let mutable x = x
  while condition do                // Check condition using current argument value
    x <- calculateNewArgument x     // Instead of recursion, run next iteration
  calculateResult x                 // Final calculation in the branch that returns
+5

Basically, instead of creating a chain like

findstart(findstart(findstart(findstart.....

the compiler is converted to a loop that eliminates function calls.

This is Tail Elimination , a fairly standard optimization of functional programming. This works because the overhead of re-evaluating function arguments is lower than calling a new function that generates a new frame stack.

+2
source

All Articles