This article is reproduced from stackoverflow.com.
Tags: .net, c#, linq, performance

Counting elements in an array - Performance

Published on 2020-03-29 21:00:21

I have this array: var arr = new int[] { 1, 1, 0, -1, -1 }; and I need to count the number of positive, negative, and zero values. I did it with a foreach loop and with LINQ, and I tried to compare the performance of the two methods using the Stopwatch class. Here is my code:

        int pos = 0, neg = 0, zeros = 0;
        int p = 0, n = 0, z = 0;


        Stopwatch sw = new Stopwatch();

        sw.Start();

        pos = arr.Sum(e => e > 0 ? 1 : 0);
        neg = arr.Sum(e => e < 0 ? 1 : 0);
        zeros = arr.Sum(e => e == 0 ? 1 : 0);

        sw.Stop();


        Stopwatch sw2 = new Stopwatch();

        sw2.Start();

        foreach (var item in arr)
        {
            if (item > 0) p++;
            else if (item < 0) n++;
            else z++;
        }

        sw2.Stop();

        Console.WriteLine("Elapsed={0}", sw.Elapsed); //Elapsed=00:00:00.0008311
        Console.WriteLine("Elapsed2={0}", sw2.Elapsed); //Elapsed2=00:00:00.0000028

The results showed that the foreach loop was a lot faster (28 ticks, roughly 2.8 µs) than the LINQ approach (8,311 ticks, roughly 831 µs). So my question is: why is there such a big difference in performance?

I even tried three separate foreach loops, one to count the negatives, one to count the positives, and a third to count the zeros, and the performance was still better than the LINQ version (roughly like the sketch below)!
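For reference, the three-loop version looks roughly like this (a sketch; the variable names are only illustrative):

        // Three separate foreach loops: the array is scanned once per counter
        int p3 = 0, n3 = 0, z3 = 0;
        foreach (var item in arr) { if (item > 0) p3++; }
        foreach (var item in arr) { if (item < 0) n3++; }
        foreach (var item in arr) { if (item == 0) z3++; }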

Thanks in advance for your help!

Questioner: ilyes
Dmitry Bychenko 2020-01-31 18:46

Let's run this horse race in a slightly different manner:

private static string UnderTest(int size) {
  int pos = 0, neg = 0, zeros = 0;
  int p = 0, n = 0, z = 0;

  Random random = new Random(0);
  int[] arr = Enumerable
    .Range(0, size)
    .Select(x => random.Next(-1, 2))
    .ToArray();
  GC.Collect(GC.MaxGeneration);

  Stopwatch sw = new Stopwatch();
  sw.Start();

  // Three LINQ loops (expected to be about three times slower)
  pos = arr.Sum(e => e > 0 ? 1 : 0);    
  neg = arr.Sum(e => e < 0 ? 1 : 0);
  zeros = arr.Sum(e => e == 0 ? 1 : 0);

  sw.Stop();

  Stopwatch sw2 = new Stopwatch();
  sw2.Start();

  // Just 1 loop 
  foreach (var item in arr) {
    if (item > 0) p++;
    else if (item < 0) n++;
    else z++;
  }

  sw2.Stop();

  return $"{sw.Elapsed} vs. {sw2.Elapsed} ratio: {((double)sw.Elapsed.Ticks) / sw2.Elapsed.Ticks:f3}";
} 

Races for different array sizes:

  int[] loops = new int[] {
    1000,
    10_000,
    100_000,
    1_000_000,
    10_000_000,
    100_000_000,
    1000,         // <- 1000 again
  };

  string report = string.Join(Environment.NewLine, loops
    .Select(loop => $"loops: {loop,10} : {UnderTest(loop)}"));

  Console.Write(report);

Outcome:

loops:       1000 : 00:00:00.0006471 vs. 00:00:00.0000114 ratio: 56.763 // <- Warming up
loops:      10000 : 00:00:00.0003195 vs. 00:00:00.0001074 ratio: 2.975
loops:     100000 : 00:00:00.0037131 vs. 00:00:00.0010910 ratio: 3.403
loops:    1000000 : 00:00:00.0351574 vs. 00:00:00.0118858 ratio: 2.958
loops:   10000000 : 00:00:00.3729617 vs. 00:00:00.1198276 ratio: 3.112
loops:  100000000 : 00:00:03.7002508 vs. 00:00:01.1808595 ratio: 3.134
loops:       1000 : 00:00:00.0000322 vs. 00:00:00.0000099 ratio: 3.253 // <- Expected

What's going on: we have three LINQ-based loops (three separate calls to Sum), so LINQ is roughly three times slower than the single foreach loop (no surprise there). However, there is a huge outlier on the very first run, when the system loads the assembly, JIT-compiles the IL code, and so on. That is the so-called warm-up effect; note that the final 1000-element run, which repeats the first size after warm-up, is back to the expected ratio of about 3.
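If you want to keep the simple Stopwatch-based measurement but exclude that warm-up cost, one option (a minimal sketch, not part of the original answer; it assumes arr is already populated) is to execute both code paths once, unmeasured, before starting the stopwatches:

// Sketch: run both paths once, unmeasured, so that assembly loading and
// JIT compilation happen before any timing starts.
int warmup = arr.Sum(e => e > 0 ? 1 : 0);               // JIT-compiles Enumerable.Sum and the lambda
foreach (var item in arr) { if (item > 0) warmup++; }   // JIT-compiles the loop body

Stopwatch sw = Stopwatch.StartNew();
int pos = arr.Sum(e => e > 0 ? 1 : 0);
sw.Stop();

Stopwatch sw2 = Stopwatch.StartNew();
int p = 0;
foreach (var item in arr) { if (item > 0) p++; }
sw2.Stop();

Console.WriteLine($"LINQ: {sw.Elapsed}  foreach: {sw2.Elapsed}");

With the warm-up pass in place, even the first measurement should stay close to the roughly 3x ratio seen in the later rows of the table.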