<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Latency on 0x3F</title>
    <link>https://0x3f.blog/tags/latency/</link>
    <description>Recent content in Latency on 0x3F</description>
    <generator>Hugo -- 0.152.2</generator>
    <language>en-us</language>
    <lastBuildDate>Tue, 03 Mar 2026 10:00:00 +0100</lastBuildDate>
    <atom:link href="https://0x3f.blog/tags/latency/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>First Things First: Coordinated Omission</title>
      <link>https://0x3f.blog/posts/first-things-first-coordinated-omission/</link>
      <pubDate>Tue, 03 Mar 2026 10:00:00 +0100</pubDate>
      <guid>https://0x3f.blog/posts/first-things-first-coordinated-omission/</guid>
      <description>Same service, same pause pattern, different client model — p99 jumps from 1 ms to 195 ms. The measurement method itself lies.</description>
      <content:encoded><![CDATA[<h2 id="p99--1-ms--flip-one-switch--p99--195-ms">p99 = 1 ms — flip one switch — p99 = 195 ms</h2>
<p>Same service. Same pause pattern. Same nominal target rate. One change in the client model — p99 jumps 182×. Not a system failure. A measurement failure.</p>
<p>Design can lie. The environment can lie. Fix both — the benchmark looks solid, the percentiles look clean. Too clean. The measurement method itself can lie — a systematic omission baked into how the test collects data.</p>
<p>All code in this post: clone, build, run. Numbers below were measured on dual Xeon E5-2697 v2 — run the companion code on your hardware for your own results. Different hardware, different numbers — that&rsquo;s half the lesson.</p>
<p><em>Convention: charts use milliseconds; tables reproduce raw simulation output. Histograms are approximate visualizations of the recorded latency distribution — the percentile tables are the authoritative data.</em></p>
<hr>
<h2 id="send-wait-measure-repeat">Send, wait, measure, repeat</h2>
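<div class="highlight"><pre data-lang="csharp"><code>// Requires: using System.Diagnostics;
public static LatencyReport Run(SimulatedService service, int ratePerSec, int durationSec)
{
    // No pacing: ratePerSec only fixes the request count so both clients record the same volume.
    int totalRequests = ratePerSec * durationSec;
    var recorder = new LatencyRecorder();

    for (int i = 0; i &lt; totalRequests; i&#43;&#43;)
    {
        long start = Stopwatch.GetTimestamp();            // clock starts at the actual send
        service.Process();                                // blocks until the response arrives
        long elapsed = Stopwatch.GetTimestamp() - start;  // elapsed is in Stopwatch ticks
        recorder.Record(elapsed);
    }

    return recorder.GetReport();
}</code></pre></div>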
<p><small>Closed-loop client — full source in companion code.</small></p>
<p>Send a request. Wait for the response. Measure the elapsed time. Send the next one. The client and the service take turns — a lockstep conversation where neither moves without the other. This pattern has a name: <strong>closed-loop</strong>.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> Most load test frameworks default to it. Most dashboards assume it.</p>
<p>What does your test do when the system slows down?</p>
<hr>
<h2 id="the-comfortable-picture">The comfortable picture</h2>
<p>The system under test: a simulated service with ~1 ms baseline latency (calibrated SpinWait) and a 200 ms pause every 500th request — modeling GC, compaction, or any periodic maintenance event. Target rate: 450 req/sec over 30 seconds (13,500 total). Average service time: (499 × 1 ms + 1 × 200 ms) / 500 = 1.4 ms. At 450 req/sec the service needs 630 ms of work per second — ~63% utilization, with headroom to spare. The pauses are the problem, not the capacity.</p>
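<p>The utilization arithmetic above can be checked in a few lines. A minimal sketch: <code>UtilizationCheck</code> and its method names are illustrative and do not come from the companion code.</p>
<div class="highlight"><pre data-lang="csharp"><code>using System;

public static class UtilizationCheck
{
    // Mean service time (ms) when every pauseEvery-th request takes pauseMs instead of baselineMs.
    public static double MeanServiceMs(double baselineMs, double pauseMs, int pauseEvery) =&gt;
        ((pauseEvery - 1) * baselineMs &#43; pauseMs) / pauseEvery;

    // Fraction of each second the service is busy at the given arrival rate.
    public static double Utilization(double meanServiceMs, int ratePerSec) =&gt;
        meanServiceMs * ratePerSec / 1000.0;

    public static void Main()
    {
        double mean = MeanServiceMs(baselineMs: 1.0, pauseMs: 200.0, pauseEvery: 500);
        double busy = Utilization(mean, ratePerSec: 450);
        Console.WriteLine($&#34;mean service time: {mean:F3} ms&#34;);
        Console.WriteLine($&#34;utilization: {busy * 100:F1} %&#34;);
    }
}</code></pre></div>
<p><small>Utilization check, an illustrative sketch. With the post&rsquo;s parameters it gives a mean of about 1.4 ms and a utilization of about 63%.</small></p>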
<p>The closed-loop client has no rate limiter and no inter-request delay: <code>totalRequests</code> is just a count (rate × duration) chosen to match the open-loop client&rsquo;s output volume. The effective rate is whatever the service delivers. During normal processing (~1 ms per request) that is roughly 1,000 req/sec, more than double the 450 req/sec target. During a 200 ms pause it is zero. The arrival rate follows the system. When the system slows, the test slows with it.</p>
<div class="chart-container">
  <canvas id="chart-4bc91d3463c210fa54b820e8639d3096"></canvas>
</div>
<script>
  (function() {
    var ctx = document.getElementById('chart-4bc91d3463c210fa54b820e8639d3096').getContext('2d');
    new Chart(ctx, 
{
  type: 'bar',
  data: {
    labels: ['0–2', '2–50', '50–100', '100–150', '150–200', '200+'],
    datasets: [{
      label: 'Request count',
      data: [13473, 0, 0, 0, 0, 27],
      backgroundColor: '#89b4fa',
      borderColor: '#89b4fa',
      borderWidth: 1
    }]
  },
  options: {
    plugins: {
      title: { display: true, text: 'Closed-loop latency distribution — 13,500 requests' },
      subtitle: { display: true, text: '~1ms baseline, 200ms pause every 500 requests' },
      legend: { display: false }
    },
    scales: {
      x: { title: { display: true, text: 'Latency bucket (ms)' } },
      y: { title: { display: true, text: 'Request count' } }
    }
  }
}
);
  })();
</script>

<div class="highlight"><pre data-lang=""><code>| Metric | Closed-loop  |
|--------|-------------:|
| Count  |       13,500 |
| p50    |      1.00 ms |
| p90    |      1.00 ms |
| p99    |      1.07 ms |
| p99.9  |    200.15 ms |
| max    |    200.28 ms |</code></pre></div>
<p>The dashboard looks clean. 99th percentile: 1 ms. Only p99.9 shows any trouble — and that&rsquo;s 27 requests out of 13,500, the ones that directly hit a pause. Every other request: ~1 ms, tight distribution, no tail. You read the numbers and move on.</p>
<p>The dashboard maps what the test recorded — not what users experienced.</p>
<p>Hume (1739): no finite set of observations guarantees the next. A thousand closed-loop measurements say p99 = 1 ms. The thousand-and-first doesn&rsquo;t have to agree. Induction from data that systematically omits the worst moments is induction from a sample that excludes its own counterexamples.</p>
<hr>
<h2 id="flip-one-switch">Flip one switch</h2>
<p>Same service. Same pause injector. Same nominal target rate. One change: the client sends on a fixed schedule, regardless of whether the previous request came back.</p>
<div class="highlight"><pre data-lang="csharp"><code>// Requires: using System.Diagnostics; using System.Threading;
public static LatencyReport Run(SimulatedService service, int ratePerSec, int durationSec)
{
    var recorder = new LatencyRecorder();
    long intervalTicks = Stopwatch.Frequency / ratePerSec;   // schedule step in Stopwatch ticks
    long deadline = Stopwatch.GetTimestamp() &#43; (long)durationSec * Stopwatch.Frequency;
    long nextSend = Stopwatch.GetTimestamp();

    while (Stopwatch.GetTimestamp() &lt; deadline)
    {
        long intendedStart = nextSend;   // when this request was scheduled to go out
        nextSend &#43;= intervalTicks;       // the schedule advances even when the client runs late

        service.Process();

        long now = Stopwatch.GetTimestamp();
        long latency = now - intendedStart;  // ← intended, not actual
        recorder.Record(latency);

        // On schedule: spin until the next slot. Behind schedule: fall through and catch up.
        while (Stopwatch.GetTimestamp() &lt; nextSend)
            Thread.SpinWait(10);
    }

    return recorder.GetReport();
}</code></pre></div>
<p><small>Open-loop client — full source in companion code. Note: <code>intervalTicks</code> uses integer division, introducing sub-microsecond step quantization at 450 req/sec — negligible for this demonstration.</small></p>
<p>The line that matters: <code>now - intendedStart</code> instead of <code>now - actualStart</code>. The pacing loop around it only keeps the schedule; the subtraction decides what gets recorded. The user&rsquo;s clock starts when they click, not when the server gets around to processing their request. When the service pauses, requests that should have been sent during the pause pile up, and each one is measured from when it was <em>supposed</em> to start, because that&rsquo;s when the user started waiting.</p>
<div class="chart-container">
  <canvas id="chart-b8b70701f9ac8d7fb985049ac934642f"></canvas>
</div>
<script>
  (function() {
    var ctx = document.getElementById('chart-b8b70701f9ac8d7fb985049ac934642f').getContext('2d');
    new Chart(ctx, 
{
  type: 'bar',
  data: {
    labels: ['0–2', '2–50', '50–100', '100–150', '150–200', '200+'],
    datasets: [{
      label: 'Request count',
      data: [9100, 900, 1100, 1100, 1273, 27],
      backgroundColor: '#f38ba8',
      borderColor: '#f38ba8',
      borderWidth: 1
    }]
  },
  options: {
    plugins: {
      title: { display: true, text: 'Open-loop latency distribution — 13,500 requests' },
      subtitle: { display: true, text: 'Same service, same target rate — bimodal distribution' },
      legend: { display: false }
    },
    scales: {
      x: { title: { display: true, text: 'Latency bucket (ms)' } },
      y: { title: { display: true, text: 'Request count' } }
    }
  }
}
);
  })();
</script>

<p>Bimodal. A peak at ~1 ms and a wide spread from 2 ms to 200 ms. Two different experiences on the same chart.</p>
<div class="chart-container">
  <canvas id="chart-0bccaa457ae3fd03e19b00bddc547d51"></canvas>
</div>
<script>
  (function() {
    var ctx = document.getElementById('chart-0bccaa457ae3fd03e19b00bddc547d51').getContext('2d');
    new Chart(ctx, 
{
  type: 'bar',
  data: {
    labels: ['p50', 'p90', 'p99', 'p99.9', 'max'],
    datasets: [
      {
        label: 'Closed-loop',
        data: [1.00, 1.00, 1.07, 200.15, 200.28],
        backgroundColor: '#89b4fa',
        borderColor: '#89b4fa',
        borderWidth: 1
      },
      {
        label: 'Open-loop',
        data: [1.00, 137.89, 194.64, 200.15, 200.41],
        backgroundColor: '#f38ba8',
        borderColor: '#f38ba8',
        borderWidth: 1
      }
    ]
  },
  options: {
    plugins: {
      title: { display: true, text: 'Closed-loop vs open-loop — percentile comparison' },
      subtitle: { display: true, text: 'Same service, same target rate, same pauses — different measurement' },
      legend: { display: true }
    },
    scales: {
      x: { title: { display: true, text: 'Percentile' } },
      y: {
        type: 'logarithmic',
        title: { display: true, text: 'Latency (ms) — log scale' },
        min: 0.5,
        max: 500
      }
    }
  }
}
);
  })();
</script>

<div class="highlight"><pre data-lang=""><code>| Metric | Closed-loop  |    Open-loop |     Ratio |
|--------|-------------:|-------------:|----------:|
| Count  |       13,500 |       13,500 |           |
| p50    |      1.00 ms |      1.00 ms |      1.0x |
| p90    |      1.00 ms |    137.89 ms |    137.9x |
| p99    |      1.07 ms |    194.64 ms |    182.4x |
| p99.9  |    200.15 ms |    200.15 ms |      1.0x |
| max    |    200.28 ms |    200.41 ms |      1.0x |</code></pre></div>
<p><small>Ratios computed from raw data before rounding to displayed precision.</small></p>
<p>Same system. Same load. Same pause. One variable: whether the test waits for a response before sending the next request.</p>
<p>Closed-loop p99 = 1 ms. Open-loop p99 = 195 ms. <strong>182× on this workload.</strong></p>
<hr>
<h2 id="the-mechanism--coordinated-omission">The mechanism — coordinated omission</h2>
<p>During a 200 ms pause, the closed-loop client waits. While waiting, it sends no new requests: it moves in lockstep with the system, slowing down exactly when the system slows down. 200 ms × 450 req/sec = 90 requests that <em>should have</em> been sent but weren&rsquo;t. They don&rsquo;t appear in the histogram. They don&rsquo;t exist in the data. The dashboard stays clean.</p>
<p>The open-loop client doesn&rsquo;t coordinate. It tracks what the schedule <em>should have been</em>. After the pause resolves:</p>
<ul>
<li>Request N+1: intended at T+2 ms, completed at T+201 ms → latency = <strong>199 ms</strong></li>
<li>Request N+2: intended at T+4 ms, completed at T+202 ms → latency = <strong>198 ms</strong></li>
<li>Request N+3: intended at T+7 ms, completed at T+203 ms → latency = <strong>196 ms</strong></li>
<li>&hellip;catch-up continues for ~160 requests until the schedule recovers</li>
</ul>
<p>Each pause contaminates ~160 subsequent requests with elevated latency. 27 pauses × ~160 requests = ~4,300 requests — roughly a third of all traffic — experiencing latency between 2 ms and 200 ms. That&rsquo;s why the open-loop p90 is 138 ms: the top 10% of requests (1,350 out of 13,500) fall squarely in that contaminated range.</p>
<p>The closed-loop client sees 27 bad requests. The open-loop client sees 4,300. Same service. Same pauses.</p>
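<p>The catch-up can be reproduced without any timers. The sketch below is an idealized virtual-time model, assuming exactly 1 ms of service time, exactly 200 ms per pause, and a single blocking client. <code>CatchUpModel</code> and its members are illustrative names, not the companion code.</p>
<div class="highlight"><pre data-lang="csharp"><code>using System;
using System.Collections.Generic;
using System.Linq;

public static class CatchUpModel
{
    // Service model from the post: every pauseEvery-th request takes pauseMs, all others baselineMs.
    static double ServiceMs(int index, double baselineMs, double pauseMs, int pauseEvery) =&gt;
        (index &#43; 1) % pauseEvery == 0 ? pauseMs : baselineMs;

    // Closed-loop view: the clock starts at the actual send, so latency equals service time.
    public static List&lt;double&gt; ClosedLoop(int total, double baselineMs = 1.0,
        double pauseMs = 200.0, int pauseEvery = 500)
    {
        var latencies = new List&lt;double&gt;(total);
        for (int i = 0; i &lt; total; i&#43;&#43;)
            latencies.Add(ServiceMs(i, baselineMs, pauseMs, pauseEvery));
        return latencies;
    }

    // Open-loop view: the clock starts at the intended send time on a fixed schedule.
    public static List&lt;double&gt; OpenLoop(int total, double intervalMs, double baselineMs = 1.0,
        double pauseMs = 200.0, int pauseEvery = 500)
    {
        var latencies = new List&lt;double&gt;(total);
        double freeAt = 0.0; // virtual time at which the blocking client can send again
        for (int i = 0; i &lt; total; i&#43;&#43;)
        {
            double intended = i * intervalMs;
            double actual = Math.Max(intended, freeAt); // late if the previous call is still running
            freeAt = actual &#43; ServiceMs(i, baselineMs, pauseMs, pauseEvery);
            latencies.Add(freeAt - intended);
        }
        return latencies;
    }

    // Nearest-rank percentile computed on a sorted copy of the data.
    public static double Percentile(List&lt;double&gt; values, double percentile)
    {
        var sorted = values.OrderBy(v =&gt; v).ToList();
        int rank = (int)Math.Ceiling(percentile / 100.0 * sorted.Count);
        return sorted[Math.Clamp(rank, 1, sorted.Count) - 1];
    }

    public static void Main()
    {
        const int total = 13_500;
        var closed = ClosedLoop(total);
        var open = OpenLoop(total, intervalMs: 1000.0 / 450);
        Console.WriteLine($&#34;closed-loop requests above 1.5 ms: {closed.Count(v =&gt; v &gt; 1.5)}&#34;);
        Console.WriteLine($&#34;open-loop requests above 1.5 ms: {open.Count(v =&gt; v &gt; 1.5)}&#34;);
        Console.WriteLine($&#34;open-loop p90: {Percentile(open, 90):F2} ms, p99: {Percentile(open, 99):F2} ms&#34;);
    }
}</code></pre></div>
<p><small>Virtual-time catch-up model, an illustrative sketch. In this idealized model the closed-loop view has 27 requests above 1.5 ms and the open-loop view has 4,239, with p90 and p99 near 138 ms and 194 ms. Measured runs carry timer and scheduling noise, so they will not match exactly.</small></p>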
<p>The worse the failure, the more requests the closed-loop client skips, and the cleaner the dashboard. Visibility is inversely related to severity: the number of omitted measurements grows linearly with pause length. At 450 req/sec, a 200 ms pause omits 90 measurements. A 2-second pause omits 900. A 10-second GC stop-the-world omits 4,500. The worst event your system can produce is the one your test is least likely to record.</p>
<p>Gil Tene named this <strong>Coordinated Omission</strong> — the test coordinates with the system&rsquo;s failures, omitting measurements precisely when they would be most damning.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>Baudrillard (1981): the third phase of the simulacrum — the image masks the <em>absence</em> of reality. The closed-loop benchmark doesn&rsquo;t distort measurements. It masks their nonexistence. Those 90 requests during the pause aren&rsquo;t poorly measured. They don&rsquo;t exist. The dashboard is a simulacrum — it doesn&rsquo;t lie about the system. It replaces it.</p>
<hr>
<h2 id="how-to-stop-coordinating">How to stop coordinating</h2>
<table>
  <thead>
      <tr>
          <th>Property</th>
          <th>Closed-loop</th>
          <th>Open-loop</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Request timing</td>
          <td>After previous response</td>
          <td>Fixed schedule, independent of response</td>
      </tr>
      <tr>
          <td>What it measures</td>
          <td>Response time of sent requests (omits unsent)</td>
          <td>Response time from intended start (incl. queuing)</td>
      </tr>
      <tr>
          <td>During a pause</td>
          <td>Stops sending → omits measurements</td>
          <td>Tracks intended schedule → captures queuing</td>
      </tr>
      <tr>
          <td>p99 under pauses</td>
          <td>Looks clean (only direct hits visible)</td>
          <td>Shows full impact (queued requests visible)</td>
      </tr>
      <tr>
          <td>Best for</td>
          <td>Throughput measurement, saturation testing</td>
          <td>Latency measurement, SLA validation</td>
      </tr>
  </tbody>
</table>
<p>Four rules for latency measurement:</p>
<ol>
<li>
<p><strong>Open-loop by default for latency load tests.</strong> Closed-loop is still useful for throughput and saturation testing — finding the breaking point. But if your SLAs are latency percentiles, you need open-loop. Closed-loop tells you the system <em>can</em> handle the load; open-loop tells you what users <em>experience</em> while it does.<sup id="fnref1:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
</li>
<li>
<p><strong>Measure from intended time, not actual time.</strong> <code>latency = now - intendedStart</code>, not <code>now - actualStart</code>. The user&rsquo;s clock starts when they click, not when the server gets around to reading their request.</p>
</li>
<li>
<p><strong>Record the full tail.</strong> p50 and p99 are not enough. Report p99.9 and max. In the closed-loop run above, p99 read 1.07 ms and only p99.9 and max hinted at the pauses; a dashboard that stops at p99 would have shown nothing at all.</p>
</li>
<li>
<p><strong>Use histograms that can handle it.</strong> HdrHistogram<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup> records values across a wide dynamic range with configurable precision — from sub-millisecond to multi-second latencies in the same histogram. Fixed-bucket histograms clip the tail.</p>
</li>
</ol>
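<p>When only closed-loop data exists, the omitted samples can be approximated after the fact. The sketch below back-fills linearly decreasing values for every recorded latency that exceeds the expected send interval. It follows the idea behind HdrHistogram&rsquo;s expected-interval recording but is written from scratch here; <code>ExpectedIntervalBackFill</code> and <code>Correct</code> are hypothetical names, and the library&rsquo;s own API is the better choice in real code.</p>
<div class="highlight"><pre data-lang="csharp"><code>using System;
using System.Collections.Generic;

public static class ExpectedIntervalBackFill
{
    // For each recorded value larger than the expected interval, also emit the values that
    // requests sent on schedule during the stall would have seen: v - i, v - 2i, ... down to i.
    public static List&lt;long&gt; Correct(IEnumerable&lt;long&gt; recorded, long expectedInterval)
    {
        if (expectedInterval &lt;= 0)
            throw new ArgumentOutOfRangeException(nameof(expectedInterval));

        var corrected = new List&lt;long&gt;();
        foreach (long value in recorded)
        {
            corrected.Add(value);
            for (long missing = value - expectedInterval;
                 missing &gt;= expectedInterval;
                 missing -= expectedInterval)
            {
                corrected.Add(missing);
            }
        }
        return corrected;
    }

    public static void Main()
    {
        // Values in microseconds: two normal requests and one that hit a 200 ms pause.
        var recorded = new long[] { 1_000, 1_000, 200_000 };
        long expectedInterval = 1_000_000 / 450; // 2,222 us at 450 req/sec
        List&lt;long&gt; corrected = Correct(recorded, expectedInterval);
        Console.WriteLine($&#34;recorded: {recorded.Length}, after back-fill: {corrected.Count}&#34;);
    }
}</code></pre></div>
<p><small>Expected-interval back-fill, an illustrative sketch. The single 200 ms sample becomes 90 samples, the same 90 requests that 200 ms × 450 req/sec says were never sent. Back-filling is an estimate; it does not replace measuring from the intended start.</small></p>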
<h3 id="tools-that-get-it-right">Tools that get it right</h3>
<table>
  <thead>
      <tr>
          <th>Tool</th>
          <th>Open-loop</th>
          <th>CO correction</th>
          <th>Notes</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>wrk2<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></td>
          <td>Yes</td>
          <td>Built-in</td>
          <td>Constant-rate HTTP benchmark, HdrHistogram output</td>
      </tr>
      <tr>
          <td>Gatling</td>
          <td>Yes</td>
          <td>Configurable</td>
          <td>Open-loop mode available, reports percentiles</td>
      </tr>
      <tr>
          <td>k6</td>
          <td>Partial</td>
          <td>Manual</td>
          <td>Constant-rate via scenarios, no auto-correction</td>
      </tr>
      <tr>
          <td>Custom (this post)</td>
          <td>Yes</td>
          <td>By design</td>
          <td><code>intendedStart</code> tracking, HdrHistogram.NET</td>
      </tr>
  </tbody>
</table>
<p>Capabilities and defaults vary by tool version and configuration; verify settings in your release.</p>
<h3 id="run-it-yourself">Run it yourself</h3>
<div class="highlight"><pre data-lang="bash"><code>git clone https://github.com/0x3f-blog/companion-code.git
cd companion-code/first-things-first/coordinated-omission
dotnet run -c Release</code></pre></div>
<hr>
<h2 id="benchmark-environment">Benchmark environment</h2>
<table>
  <thead>
      <tr>
          <th>Component</th>
          <th>Value</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>CPU</td>
          <td>2× Intel Xeon E5-2697 v2 @ 2.70 GHz (24 cores / 48 threads)</td>
      </tr>
      <tr>
          <td>RAM</td>
          <td>~115 GB DDR3-1866 (quad-channel per socket)</td>
      </tr>
      <tr>
          <td>OS</td>
          <td>Fedora Linux 42 (kernel 6.17)</td>
      </tr>
      <tr>
          <td>Runtime</td>
          <td>.NET 9.0.11 (RyuJIT AVX)</td>
      </tr>
      <tr>
          <td>SDK</td>
          <td>.NET SDK 10.0.102</td>
      </tr>
      <tr>
          <td>HdrHistogram</td>
          <td>HdrHistogram.NET 2.5.0</td>
      </tr>
      <tr>
          <td>Simulation</td>
          <td>450 req/sec, 30 sec, 200 ms pause every 500 requests</td>
      </tr>
  </tbody>
</table>
<p>Not BenchmarkDotNet — this is a custom in-process simulation. SpinWait calibrated at startup for ~1 ms baseline on current hardware (binary search, 50 samples, median). Fresh <code>SimulatedService</code> instance per client — no counter contamination.</p>
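<p>A service of this shape can be sketched in a few lines. This is not the companion code&rsquo;s <code>SimulatedService</code>: <code>SketchService</code> is a hypothetical stand-in that busy-waits against a <code>Stopwatch</code> deadline instead of using a calibrated SpinWait count. It does show why a fresh instance per client matters: the pause counter lives in the instance.</p>
<div class="highlight"><pre data-lang="csharp"><code>using System;
using System.Diagnostics;
using System.Threading;

public sealed class SketchService
{
    private readonly long _baselineTicks;
    private readonly long _pauseTicks;
    private readonly int _pauseEvery;
    private int _count; // per instance, so each client starts its own pause cycle

    public SketchService(double baselineMs = 1.0, double pauseMs = 200.0, int pauseEvery = 500)
    {
        _baselineTicks = (long)(baselineMs * Stopwatch.Frequency / 1000.0);
        _pauseTicks = (long)(pauseMs * Stopwatch.Frequency / 1000.0);
        _pauseEvery = pauseEvery;
    }

    public int Processed =&gt; _count;

    // Busy-waits for the baseline time, or for the pause time on every pauseEvery-th call.
    public void Process()
    {
        _count&#43;&#43;;
        long work = _count % _pauseEvery == 0 ? _pauseTicks : _baselineTicks;
        long until = Stopwatch.GetTimestamp() &#43; work;
        while (Stopwatch.GetTimestamp() &lt; until)
            Thread.SpinWait(10);
    }
}

public static class SketchServiceDemo
{
    public static void Main()
    {
        // Shortened parameters so the demo finishes quickly: 20 ms pause on every 5th request.
        var service = new SketchService(baselineMs: 1.0, pauseMs: 20.0, pauseEvery: 5);
        for (int i = 1; i &lt;= 5; i&#43;&#43;)
        {
            long start = Stopwatch.GetTimestamp();
            service.Process();
            double ms = (Stopwatch.GetTimestamp() - start) * 1000.0 / Stopwatch.Frequency;
            Console.WriteLine($&#34;request {i}: {ms:F2} ms&#34;);
        }
    }
}</code></pre></div>
<p><small>Stand-in service, an illustrative sketch. Sharing one instance between two clients would carry the request count over, so the second client&rsquo;s pauses would land at different positions.</small></p>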
<p><strong>Limitations:</strong> In-process simulation — no HTTP, no network stack, no kernel-level queuing. The open-loop client is single-threaded and blocks on <code>Process()</code>, so it tracks the intended schedule rather than dispatching concurrently (a real open-loop system like wrk2 or Gatling sends requests asynchronously). These simplifications isolate the coordinated omission mechanism from transport noise — the measurement effect is the same, but absolute numbers would differ in a networked setup.</p>
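<p>For contrast, a client that dispatches concurrently might look like the sketch below. It is an assumption-laden illustration, not how wrk2 or Gatling are implemented: <code>ConcurrentOpenLoop</code> is a hypothetical name, the request is an arbitrary <code>Func&lt;Task&gt;</code>, and thread-pool scheduling adds noise of its own to every sample.</p>
<div class="highlight"><pre data-lang="csharp"><code>using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrentOpenLoop
{
    // Dispatches requests on a fixed schedule without waiting for earlier responses.
    // Returns one latency per request, in Stopwatch ticks, measured from the intended send time.
    public static long[] Run(Func&lt;Task&gt; sendRequest, int ratePerSec, int totalRequests)
    {
        var latencyTicks = new long[totalRequests];
        var inFlight = new Task[totalRequests];
        long intervalTicks = Stopwatch.Frequency / ratePerSec;
        long origin = Stopwatch.GetTimestamp();

        for (int i = 0; i &lt; totalRequests; i&#43;&#43;)
        {
            long intendedStart = origin &#43; i * intervalTicks;
            while (Stopwatch.GetTimestamp() &lt; intendedStart)
                Thread.SpinWait(10); // hold the schedule; never wait on a response here

            int slot = i; // copy: the loop variable is shared across iterations
            inFlight[slot] = Task.Run(async () =&gt;
            {
                await sendRequest();
                latencyTicks[slot] = Stopwatch.GetTimestamp() - intendedStart;
            });
        }

        Task.WaitAll(inFlight); // collect stragglers before reporting
        return latencyTicks;
    }

    public static void Main()
    {
        // Stand-in for a real request: an asynchronous 5 ms delay.
        long[] ticks = Run(() =&gt; Task.Delay(5), ratePerSec: 450, totalRequests: 90);
        double maxMs = 0;
        foreach (long t in ticks)
            maxMs = Math.Max(maxMs, t * 1000.0 / Stopwatch.Frequency);
        Console.WriteLine($&#34;{ticks.Length} requests, max latency {maxMs:F2} ms&#34;);
    }
}</code></pre></div>
<p><small>Concurrent dispatch, an illustrative sketch. Because no send waits on a response, a stalled request delays only itself; the requests scheduled behind it still leave on time.</small></p>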
<hr>
<p>Popper (1934): a meaningful test must be capable of producing a negative result. The closed-loop client cannot falsify the hypothesis &ldquo;the system is healthy&rdquo; — it hides the counterexamples. Measurements that would disprove it don&rsquo;t exist. Open-loop is the falsification instrument: it doesn&rsquo;t ask the system whether it&rsquo;s ready. It measures regardless.</p>
<p>Each layer of deception sits closer to you. Design — visible in the code. Environment — visible in the configuration. The method of collection — buried in an assumption you never questioned. Data collected correctly. But what do the data mean?</p>
<p>A metric that looks better the worse the system performs isn&rsquo;t a metric. It&rsquo;s anesthesia.</p>
<hr>
<h2 id="further-reading">Further reading</h2>
<ul>
<li>Gil Tene, <a href="https://www.youtube.com/watch?v=lJ8ydIuPFeU">How NOT to Measure Latency</a> (Strange Loop 2015) — the definitive talk on coordinated omission, open vs closed loop, and why percentile measurements lie.<sup id="fnref1:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></li>
<li>Gil Tene, <a href="https://www.infoq.com/presentations/latency-response-time/">How NOT to Measure Latency</a> (QCon San Francisco 2015) — recorded version of the talk, more on why averages and even p99 are insufficient without the full distribution.<sup id="fnref:5"><a href="#fn:5" class="footnote-ref" role="doc-noteref">5</a></sup></li>
<li>Schroeder, Wierman, Harchol-Balter, <a href="https://www.usenix.org/conference/nsdi-06/open-versus-closed-cautionary-tale">Open Versus Closed: A Cautionary Tale</a> (NSDI 2006) — the formal paper showing that open-loop and closed-loop produce fundamentally different results.<sup id="fnref2:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></li>
<li>Dean &amp; Barroso, <a href="https://dl.acm.org/doi/10.1145/2408776.2408794">The Tail at Scale</a> (CACM 2013) — why tail latency matters in distributed systems, fan-out amplification.<sup id="fnref:6"><a href="#fn:6" class="footnote-ref" role="doc-noteref">6</a></sup></li>
<li>Ousterhout, <a href="https://dl.acm.org/doi/10.1145/3213770">Always Measure One Level Deeper</a> (CACM 2018) — the general principle: measure the layer below where you think the problem is.<sup id="fnref:7"><a href="#fn:7" class="footnote-ref" role="doc-noteref">7</a></sup></li>
<li><a href="https://hdrhistogram.github.io/HdrHistogram/">HdrHistogram</a> — high dynamic range histogram for latency recording, with coordinated omission correction. Ports: Java, C#, C, Go, Rust, JavaScript, Python, Erlang.<sup id="fnref1:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></li>
<li>Gil Tene, <a href="https://github.com/giltene/wrk2">wrk2</a> — constant-rate HTTP benchmark with built-in coordinated omission correction and HdrHistogram output.<sup id="fnref1:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></li>
<li>Brendan Gregg, <a href="https://www.brendangregg.com/activebenchmarking.html">Active Benchmarking</a> — methodology and anti-patterns for honest measurement.<sup id="fnref:8"><a href="#fn:8" class="footnote-ref" role="doc-noteref">8</a></sup></li>
<li>Martin Thompson, <a href="https://mechanical-sympathy.blogspot.com/">Mechanical Sympathy</a> — latency-focused systems programming, false sharing, memory access patterns.<sup id="fnref:9"><a href="#fn:9" class="footnote-ref" role="doc-noteref">9</a></sup></li>
<li>Andrey Akinshin, <em>Pro .NET Benchmarking</em> (Apress, 2019) — comprehensive guide to .NET measurement, including percentile pitfalls.<sup id="fnref:10"><a href="#fn:10" class="footnote-ref" role="doc-noteref">10</a></sup></li>
</ul>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Schroeder, Wierman, Harchol-Balter, <a href="https://www.usenix.org/conference/nsdi-06/open-versus-closed-cautionary-tale">Open Versus Closed: A Cautionary Tale</a>, NSDI 2006. The formal demonstration that open-loop and closed-loop benchmarks produce fundamentally different performance characteristics — even on the same system under the same nominal load.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref2:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Gil Tene, <a href="https://www.youtube.com/watch?v=lJ8ydIuPFeU">How NOT to Measure Latency</a>, Strange Loop 2015. Defines coordinated omission, demonstrates the mechanism, introduces HdrHistogram. The single most important talk on latency measurement.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p><a href="https://hdrhistogram.github.io/HdrHistogram/">HdrHistogram</a> by Gil Tene. Records values across a configurable dynamic range (e.g., 1 microsecond to 1 hour) with uniform precision at any percentile level. .NET port: <a href="https://www.nuget.org/packages/HdrHistogram/">HdrHistogram.NET</a> on NuGet.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>Gil Tene, <a href="https://github.com/giltene/wrk2">wrk2</a>. A fork of wrk that maintains a constant request rate (open-loop) and records latency from intended send time. The output includes full HdrHistogram percentile data — no coordinated omission by construction.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:5">
<p>Gil Tene, <a href="https://www.infoq.com/presentations/latency-response-time/">How NOT to Measure Latency</a>, QCon San Francisco 2015. Why the mean is useless, why p99 isn&rsquo;t enough, why you need the full distribution.&#160;<a href="#fnref:5" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:6">
<p>Dean &amp; Barroso, <a href="https://dl.acm.org/doi/10.1145/2408776.2408794">The Tail at Scale</a>, CACM 2013. In a fan-out architecture, the probability of hitting at least one slow backend grows with the number of backends. Tail latency isn&rsquo;t a statistics curiosity — it&rsquo;s the dominant user experience at scale.&#160;<a href="#fnref:6" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:7">
<p>Ousterhout, <a href="https://dl.acm.org/doi/10.1145/3213770">Always Measure One Level Deeper</a>, CACM 2018. The general principle: if the numbers don&rsquo;t make sense, measure the layer below. Coordinated omission is a measurement-layer problem — you have to look at <em>how</em> the test records latency, not just <em>what</em> it reports.&#160;<a href="#fnref:7" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:8">
<p>Brendan Gregg, <a href="https://www.brendangregg.com/activebenchmarking.html">Active Benchmarking</a>. Methodology for honest benchmarking: verify work done, eliminate perturbation, report confidence. Includes a section on coordinated omission as a common anti-pattern.&#160;<a href="#fnref:8" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:9">
<p>Martin Thompson, <a href="https://mechanical-sympathy.blogspot.com/">Mechanical Sympathy</a>. Blog series on latency-sensitive systems programming — false sharing, memory access patterns, lock-free data structures. Context for understanding why sub-millisecond measurement matters.&#160;<a href="#fnref:9" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:10">
<p>Andrey Akinshin, <em>Pro .NET Benchmarking</em> (Apress, 2019). The BenchmarkDotNet author&rsquo;s comprehensive treatment of measurement in .NET — warmup, outliers, statistics, environment control, percentile reporting.&#160;<a href="#fnref:10" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>]]></content:encoded>
    </item>
  </channel>
</rss>
