The Innocent Dropdown That Was Eating Our Garbage Collector

There’s a particular kind of bug I love. The kind that’s been sitting in production for months, where two or three good engineers have already had a swing at it, and the verdict is always the same shrug: “the database must be slow during work hours.”

This is the story of one of those bugs.

The symptom

Legacy MVC5 application. A dossier search form, used all day every day by the backoffice team. Open the page, get a coffee, come back. Painful, but only between roughly 9 a.m. and 6 p.m. — outside work hours, the page was fine.

The team had spent weeks on it. Every query plan had been inspected. Every index reviewed. Every Include() audited for over-fetching. The DBA had been consulted. Nothing.

When it landed on my desk, I refused to play the guessing game. Guessing is how you spend two weeks adding indexes to columns that were never the problem. I opened my profiler and hit record.

What the profiler actually said

The hot path wasn’t a query. It wasn’t EF. It wasn’t the network.

It was a dropdown.

One of those innocent little <select> controls at the top of the form, populated with about 1,500 dossiers. The previous developer had built it the most direct way imaginable: take a string template, do string.Replace to inject the id, do another string.Replace to inject a custom color based on the dossier’s kind, then += the result onto the growing HTML string. Repeat 1,500 times. Per request.

It looked like this, give or take:

string html = "<select>";
foreach (var d in dossiers)
{
    string option = template
        .Replace("__ID__", d.Id.ToString())
        .Replace("__COLOR__", ColorFor(d.Kind))
        .Replace("__LABEL__", d.Label);
    html += option;
}
html += "</select>";

Every Replace returns a brand new string. Every += allocates a new string and copies the old one. For 1,500 options that’s something like 6,000 throwaway strings per render, many of them large enough and long-lived enough to survive Gen 0 and get promoted.

Off-hours, with a couple of users, the GC absorbed it. During work hours, with thirty users hammering the search form, the heap couldn’t drain fast enough. Allocations piled up, objects got promoted to Gen 2, and Gen 2 collections — which are the expensive ones — started firing constantly. The page wasn’t slow because of SQL. It was slow because the GC was spending most of its time mopping up after this one method.

When old people tell you not to construct large strings with concatenation, listen.

The fix

The minimum fix is boring: build into a StringBuilder instead of +=, and write the values directly with Append instead of substituting them with Replace. That alone kills the quadratic copy and the throwaway intermediate strings — Gen 2 collections drop to nearly zero, and you didn’t add a single dependency.

var sb = new StringBuilder(dossiers.Length * 80);
sb.Append("<select>");
foreach (var d in dossiers)
{
    sb.Append("<option value=\"").Append(d.Id)
      .Append("\" style=\"color:").Append(ColorFor(d.Kind))
      .Append(";\">").Append(d.Label)
      .Append("</option>");
}
sb.Append("</select>");

I shipped a different fix. I rebuilt the dropdown with HtmlAgilityPack: a real DOM, attributes set directly, the document serialized once. It stops you worrying about a label that happens to contain a quote or an angle bracket silently producing broken HTML — for a control rendered on every page load by every user, the correctness was worth the dependency.

Here is what the repro prints on my laptop, 1500 options, 1000 iterations, 16 concurrent renders:

[SLOW: string concat + Replace in loop]
  elapsed     :    28914 ms
  allocated   : 203,627.4 MB
  GC Gen0/1/2 : 18083 / 14659 / 14593

[FAST  : StringBuilder, no Replace]
  elapsed     :      120 ms
  allocated   :    546.3 MB
  GC Gen0/1/2 :   49 /   44 /   39

[OK: HtmlAgilityPack]
  elapsed     :     3138 ms
  allocated   :  2,470.4 MB
  GC Gen0/1/2 :  390 /  198 /    6

A few things worth pausing on. The pathological version allocates 200 GB over the run and triggers fourteen thousand Gen 2 collections — that’s the GC working full-time, which is exactly what you felt as page slowness. StringBuilder is the runaway winner on raw throughput: 220× faster and 370× less allocated than the original, and faster than the DOM library too. HtmlAgilityPack allocates more bytes but funnels them through short-lived DOM nodes that die in Gen 0, so its Gen 2 count is the lowest of the three — only 6.

So which is “best” depends on which axis you care about. On every axis, both are an enormous improvement over the original.

Reproducing it

You can reproduce this on whatever you’ve got — Windows, macOS, Linux. The .NET diagnostic tools ship cross-platform and they’re all you need:

  • dotnet-counters — live view of alloc-rate, gen-2-gc-count, and % time in GC. This alone would have caught the bug.
  • dotnet-trace — sampling profiler. Export to speedscope, open the JSON at speedscope.app, and String.Replace lights up like a Christmas tree.
  • dotnet-gcdump — heap snapshot, perfect for proving Gen 2 promotion.
  • PerfView (Windows) or BenchmarkDotNet (anywhere) for deeper analysis.
  • And of course, a commercial profiler like dotTrace or ANTS if you prefer a GUI — they all show the same thing.

I put together a minimal repro in the companion repo: three versions of the same dropdown render — the pathological original, the StringBuilder fix, and the HtmlAgilityPack version — with allocation and GC counters printed at the end. Run it with dotnet run -c Release. The gap between the first row and the other two is impossible to miss; which of the two fixes wins on which metric is its own little surprise.

The lesson, again

The team had been looking under the database streetlight because that’s where bugs usually are. They weren’t wrong to look there. They were wrong to keep looking there after the first day.

A profiler doesn’t have opinions. It doesn’t know which code is “supposed to” be slow. It just tells you where the time went. Spend the ten minutes to attach one, and it will save you the two weeks of guessing.

Don’t summon magic. Profile.