I was looking into an unexpected cache hit rate drop in Claude Code and ended up finding a small client-side bug with outsized impact.
The short version: the CLI computes a cache-related value, threads it through the request path, and then ignores it when building the billing header for API key requests. OAuth users appear to avoid the issue because that header path is disabled for them.
The signal
The interesting header looked like this:
x-anthropic-billing-header: cc_version=2.1.72.0ed; cc_entrypoint=cli; cch=0a04a;
The cch field is clearly intended to carry cache-related metadata. If that value is wrong, constant, or disconnected from the actual request shape, any downstream logic that depends on it becomes unreliable.
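For orientation, the value is a semicolon-delimited list of key=value pairs. A small parser (illustrative only, not something the CLI ships) makes the fields explicit:

```javascript
// Parse a billing-header value like the one above into a plain object.
// Illustrative helper; not part of the CLI.
function parseBillingHeader(value) {
  const fields = {};
  for (const part of value.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq > 0) fields[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
  }
  return fields;
}

const parsed = parseBillingHeader(
  "cc_version=2.1.72.0ed; cc_entrypoint=cli; cch=0a04a;"
);
// parsed.cch === "0a04a", parsed.cc_entrypoint === "cli"
```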
That made this a good debugging target.
What the client is supposed to do
From the CLI code path, the flow is roughly:
- Normalize the request messages
- Derive a short hash from the user message content
- Pass that hash into the header builder
- Send the header with the outgoing request
In simplified form, it looks like this:
function NP1(message, version) {
  // `prefix` and `sha256` are defined elsewhere in the minified bundle
  const sample = [4, 7, 20].map((i) => message[i] || "0").join("");
  const input = `${prefix}${sample}${version}`;
  return sha256(input).slice(0, 3);
}

function eu7(payload) {
  const message = extractUserMessage(payload);
  return NP1(message, VERSION);
}
So far, nothing looks wrong. The client computes a content-derived value and carries it forward.
Where the bug is
The bug shows up in the header builder. The computed hash is passed in as an argument and even folded into the cc_version suffix, but the dedicated cch field never uses it:
function tH8(cch) {
  if (!BA9()) return "";
  const version = `${CLI_VERSION}.${cch}`;
  const entrypoint = process.env.CLAUDE_CODE_ENTRYPOINT ?? "unknown";
  const cachePart = " cch=00000;"; // hardcoded
  const workload = qK8();
  const workloadPart = workload ? ` cc_workload=${workload};` : "";
  return `x-anthropic-billing-header: cc_version=${version}; cc_entrypoint=${entrypoint};${cachePart}${workloadPart}`;
}
That is the entire bug.
The function receives the computed value and then emits cch=00000 in the dedicated field for every request on that path.
Why this matters
The important part is not the exact hashing algorithm. The important part is that the client is generating a signal and then replacing it with a constant.
From an engineering perspective, that creates two problems:
- The request path looks dynamic in code review because a value is computed and passed around
- The actual wire-level behavior is static because the emitted header never changes
If the backend uses cch to segment, attribute, or improve cache behavior, a constant value destroys that signal.
I cannot prove backend cache internals from the client alone, so the cache impact is an inference. But it is a strong one: when a request classifier collapses many distinct requests into the same bucket, you should expect degraded optimization quality somewhere downstream.
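A toy simulation makes the collapse concrete. The two emitters below are stand-ins for the buggy and fixed behavior, applied to a handful of hypothetical per-request hash values:

```javascript
// Stand-ins for the buggy and corrected cch emission.
const buggyField = (_cch) => " cch=00000;";  // ignores its argument
const fixedField = (cch) => ` cch=${cch};`;  // propagates it

// Hypothetical per-request hashes (one duplicate on purpose).
const perRequest = ["0a04a", "91bc2", "f00d1", "0a04a", "7e7e7"];
const distinctBuckets = (emit) => new Set(perRequest.map(emit)).size;

console.log(distinctBuckets(buggyField)); // 1 — every request identical on the wire
console.log(distinctBuckets(fixedField)); // 4 — distinct content stays distinguishable
```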
Why API key users are affected
Claude Code has two authentication paths:
- Custom API key via ANTHROPIC_API_KEY
- OAuth / Claude Max-style login
The attribution header is only emitted on the API key path. The relevant gate looks like this:
function BA9() {
  if (disabledByEnv(process.env.CLAUDE_CODE_ATTRIBUTION_HEADER)) return false;
  return featureFlag("tengu_attribution_header", true);
}
In practice, OAuth users appear to bypass this header entirely, which explains the asymmetry:
- API key users send the broken cch
- OAuth users do not send this header path, so they avoid the bug
That kind of auth-path divergence is exactly where performance regressions like this hide.
The minimal fix
The implementation fix is one line:
function tH8(cch) {
  if (!BA9()) return "";
  const version = `${CLI_VERSION}.${cch}`;
  const entrypoint = process.env.CLAUDE_CODE_ENTRYPOINT ?? "unknown";
  const cachePart = ` cch=${cch};`;
  const workload = qK8();
  const workloadPart = workload ? ` cc_workload=${workload};` : "";
  return `x-anthropic-billing-header: cc_version=${version}; cc_entrypoint=${entrypoint};${cachePart}${workloadPart}`;
}
This is not a complicated bug. It is a propagation bug: compute the right value, pass the right value, drop the right value at the last step.
The tests that should exist
This is also the kind of issue that is easy to catch with cheap tests:
1. Unit test the header builder
it("uses the provided cch value", () => {
expect(tH8("abc12")).toContain("cch=abc12;");
});
2. Integration test both auth modes
For the same logical request:
- API key mode should emit the attribution header with the expected cch
- OAuth mode should either omit the header or behave consistently by design
3. Telemetry split by auth path
If cache hit rate or billing-header metrics are aggregated across all users, this bug gets blurred out. Breaking those metrics down by auth mode would make the regression obvious much earlier.
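In miniature, that split looks like this; the event records and field names are invented for illustration:

```javascript
// Invented sample telemetry: cache-hit flag plus the auth mode of each request.
const events = [
  { auth: "api-key", cacheHit: false },
  { auth: "api-key", cacheHit: false },
  { auth: "api-key", cacheHit: true },
  { auth: "oauth",   cacheHit: true },
  { auth: "oauth",   cacheHit: true },
  { auth: "oauth",   cacheHit: false },
];

// Hit rate per auth mode; a single global average would blur the asymmetry.
function hitRateByAuth(rows) {
  const acc = {};
  for (const { auth, cacheHit } of rows) {
    acc[auth] ??= { hits: 0, total: 0 };
    acc[auth].total += 1;
    if (cacheHit) acc[auth].hits += 1;
  }
  return Object.fromEntries(
    Object.entries(acc).map(([mode, v]) => [mode, v.hits / v.total])
  );
}

console.log(hitRateByAuth(events)); // { 'api-key': 0.33…, oauth: 0.66… }
```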
Takeaway
This is a good example of why performance debugging often starts with boring plumbing.
There was no crash. No exception. No obvious broken feature. The code even looked mostly correct at first glance because the hash was being computed and passed through the stack.
But the wire output told a different story.
When a system computes a value for routing, caching, attribution, or billing, the only thing that really matters is what gets serialized onto the request. Everything before that is just intent.
