The Metric as a Weapon
June 2026
Series: The Architecture of Ambiguity, Part I
Or: Why Systems Resist Accountability
I. The Invoice of the State
There is a distinct kind of clarity that arrives in the mail once a year. It is a tax statement. It does not arrive as an abstraction or a philosophical debate about the social contract; it arrives as a physical, numerical fact. The number at the bottom of the page is precise. The financial pain it induces is equally precise. What is entirely imprecise is what was purchased.
The initial reaction to a significant tax back-payment is usually visceral, but the true discomfort runs deeper than the amount. It is the vast, untraversable distance between the concrete pain of the single payment and the diffuse, collective, untraceable return. The arithmetic of the statement is flawless. Steuerklasse III, the Splittingtarif for the married couple, the Progressionsvorbehalt for any income earned across the border — every instrument functions exactly as designed. The machinery of extraction is perfectly calibrated. The math is not the source of the discomfort.
The discomfort emerges when you place two things in proximity.
First, there is the payment itself, extracted directly from labour. It is earned income, taxed at the highest marginal bracket, representing thousands of hours of engineered solutions, negotiated compromises, accumulated stress, and the quiet hollowing-out of evenings that never quite belonged to anyone.
Second, adjacent to this, within walking distance in the same town, sits an entirely different arrangement. There is the man holding twelve undeveloped plots of land, producing nothing, simply waiting for zoning constraints and scarcity to drive the price toward three hundred and fifty euros a square metre. There are the inherited apartment blocks, purchased for almost nothing by a generation that had the demographic luck to arrive after the rubble was cleared and before the reconstruction boom peaked. There is the neighbour’s daughter who has never had to convert a single hour of her time into money, because time arrived already converted into capital. Meanwhile, a family of four remains confined to eighty square metres.
Observe the legal form in which the assets above are typically held. The GmbH, the holding structure, the patrimonial company: they enter the same tax year on different arithmetic. The entity subtracts the cost of its operation before taxation begins. The individual pays tax first, and lives from what is left. The asymmetry is not the rate. It is the denominator.
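The point about the denominator can be made concrete with a small sketch. The flat rate and the income and cost figures below are assumptions chosen only for illustration; they are not taken from any actual tax schedule, and the function names are hypothetical. Both parties face the same rate and the same costs. Only the base to which the rate is applied differs.

```python
# Illustrative sketch only. The rate, income, and cost figures are assumptions
# and do not correspond to any real tax schedule.

def individual_disposable(gross: float, living_costs: float, rate: float) -> float:
    """The individual is taxed on gross income, then pays costs from the remainder."""
    tax = gross * rate
    return gross - tax - living_costs


def entity_retained(gross: float, operating_costs: float, rate: float) -> float:
    """The entity subtracts its operating costs first, then is taxed on what is left."""
    taxable_base = max(gross - operating_costs, 0.0)
    tax = taxable_base * rate
    return gross - operating_costs - tax


if __name__ == "__main__":
    gross, costs, rate = 100_000.0, 40_000.0, 0.25  # assumed figures
    print("individual keeps:", individual_disposable(gross, costs, rate))
    print("entity keeps:    ", entity_retained(gross, costs, rate))
```

Under these assumed figures the individual keeps 35,000 and the entity 45,000, at an identical nominal rate. The difference is the rate multiplied by the costs the entity was allowed to subtract first.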
This is not a moral argument. It is an architectural observation.
The tax system is not designed to equalise these situations. It is designed to extract from labour with maximum efficiency, and to treat accumulated, passive capital with relative gentleness. You can debate whether that is ethically correct — that is a question of values. But the fact that it operates this way is a question of design. And design questions have architects.
The architecture of the state’s revenue tells you what the state values. The architecture of its expenditure is where the system truly obscures itself. This essay is not a libertarian complaint about whether the state takes too much. It is an inquiry into why the state cannot tell you, precisely, what it bought with what it took — and why that inability is not an administrative oversight, but a structural feature.
The discomfort is not the height of the invoice. It is the opacity of the receipt.
II. The Illusion of the Ledger
Consider a subsidy payment. It is correctly calculated, efficiently disbursed, and flows directly into the pocket of the exact person it was not meant to help.
The state has two legitimate, fundamental functions. The first is investive: to increase the total utility of the system. This means building the roads, funding the schools, maintaining the legal infrastructure, and ensuring the security that allows a society to function. The second is protective: to cast a safety net for those who fall out of the primary system through no fault of their own.
These are not the same function. They do not use the same tools, they do not operate on the same time horizons, and conflating them is the origin of most policy incoherence.
The German housing benefit — Wohngeld — is the clearest illustration of this confusion. Every year, billions are disbursed to help low-income families afford rent. It is a demand subsidy dropped onto a fundamentally inelastic market. The economic consequence is thoroughly documented and entirely unremarkable: in highly inelastic metropolitan markets where supply cannot expand to meet subsidised demand, the financial transfer largely accrues to the supplier through rent inflation. The landlord receives the margin. The family confined to eighty square metres remains in eighty square metres, but at a slightly higher rent, underwritten by the state.
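The mechanism can be sketched with the standard textbook incidence approximation, in which the share of a per-unit demand subsidy captured by suppliers depends on the relative elasticities of supply and demand. This is a partial-equilibrium simplification, not a model of the German rental market; the elasticity values and the subsidy amount are assumptions for illustration, and the function names are hypothetical.

```python
# Textbook partial-equilibrium approximation of subsidy incidence.
# Elasticities are passed as non-negative magnitudes. All numbers are assumed.

def landlord_share(supply_elasticity: float, demand_elasticity: float) -> float:
    """Share of a per-unit demand subsidy that reaches suppliers as higher prices."""
    if supply_elasticity < 0 or demand_elasticity < 0:
        raise ValueError("pass elasticities as non-negative magnitudes")
    total = supply_elasticity + demand_elasticity
    if total == 0:
        raise ValueError("incidence is undefined when both elasticities are zero")
    return demand_elasticity / total


def rent_increase(subsidy: float, supply_elasticity: float, demand_elasticity: float) -> float:
    """Approximate rise in rent caused by a per-unit subsidy of the given size."""
    return subsidy * landlord_share(supply_elasticity, demand_elasticity)


if __name__ == "__main__":
    subsidy = 100.0  # assumed monthly subsidy per household
    for supply in (0.0, 0.5, 2.0):  # from fully inelastic to fairly elastic supply
        print(supply, rent_increase(subsidy, supply, demand_elasticity=0.5))
```

When supply cannot respond at all, the approximation assigns the entire subsidy to the landlord. The more supply can expand, the more of the transfer stays with the tenant.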
The alternative is obvious. Directing the same capital toward supply expansion — new construction, zoning reform, brownfield development — would address the structural cause. But building a new apartment block takes five to eight years. It provides no immediate relief to the current generation of renters. Politicians are elected by current voters in four-year electoral cycles. Choosing to subsidise demand instead of expanding supply is not stupidity. It is highly rational behaviour inside a broken incentive structure.
Here the investive-protective confusion becomes visible in the architecture. A measure framed as a protective instrument (housing support) functions in reality as a structural subsidy to capital. Meanwhile, the investive alternative (building housing) is deferred indefinitely because it does not produce a visible, claimable effect within the political term. The protective instrument is loud and dated; the investive instrument is silent and slow. The dashboard preferred by every cabinet is the one that shows the loud instrument.
Welfare economics has long identified this problem without solving it. Utilitarian optimisation (the drive to maximise aggregate welfare, measured in GDP-compatible units) runs cleanly on the rails of the measurable. But as Amartya Sen and Martha Nussbaum have argued through the Capability Approach, utility is the wrong metric. The question is not “how much do people have?” but “what can people do and be?” A family living in eighty square metres with access to green space, functioning schools, and job security experiences a fundamentally different reality from a family in eighty square metres without them. No purely financial income metric captures that delta.
The problem is not that policymakers are uneducated or unaware of this distinction. The problem is that the Capability Approach produces no cleanly optimisable target. You cannot run a coalition agreement on the promise of human flourishing. You can, however, run one on housing benefit disbursed. The first is contestable on every axis; the second produces a number that can be reported on Thursday.
The illusion of the ledger is the belief that because the number is precise, the effect is understood. We optimise what we can measure, and ignore what we cannot, until the measurement itself becomes a proxy for reality. The proxy then displaces the thing it was meant to represent — and we govern the proxy.
A number that is precisely wrong is more dangerous than an estimate that is honestly uncertain.
III. The Metric as Power
Picture two economists sitting in the same room, looking at the exact same dataset, and reaching diametrically opposite conclusions. This does not happen because one is lying. It happens because their key performance indicators were set in different rooms, by different people, three years earlier.
Goodhart’s Law is usually quoted as a cautionary tale about measurement: when a measure becomes a target, it ceases to be a good measure. Treated only as a statistical quirk, it loses its teeth. The more accurate reading is that Goodhart’s Law is an observation about power.
The reason a measure fails when it becomes a target is not that measurement inherently corrupts. It is that optimising toward a specific target changes behaviour in ways that decouple the target from the underlying reality it was meant to represent. Gross Domestic Product is the canonical example. It measures economic activity. A multi-car pileup on the highway increases it: repair costs, hospital bills, legal fees — all counted as growth. A parent who reduces their working hours to care for a sick child decreases it. Unpaid care work does not appear at all.
The metric is not technically wrong. It measures exactly what it claims to measure: the market value of recorded production. The problem lies in what it was selected to measure, and by whom.
Selecting the metric is the political act that precedes all other political acts. It decides, in advance, which programmes will be classified as successful and which will be condemned as failures. A government that measures poverty strictly by an income threshold defends a very different set of policies than one that measures poverty by capability deprivation. Neither government is lying about its success. Both are operating inside a framework that was built long before they arrived — a framework implicitly or explicitly designed to favour certain outcomes.
This is why the fight over the metric is the real election campaign. The visible campaign — the candidates, the televised debates, the manifestos — determines who sits in the executive chair. The invisible campaign determines what the executive is even trying to optimise. Which numbers appear on the dashboard of the chancellery? Which indices does the central bank watch? Which KPIs does the federal audit office use to judge efficiency? Those decisions are made earlier, quieter, and by fewer people. They survive elections that the politicians do not.
The distinction between “we are optimising” and “we are choosing” is not semantic. Optimisation is presented as a technical exercise: neutral, evidence-based, mathematically inevitable in the service of an agreed-upon objective. Choice, however, is political. It is contested, value-laden, and requires justification. By framing political decisions as optimisation problems, the actor removes them from democratic contestation. The metric becomes the authority. And the authority becomes unchallengeable because it is numerical.
Scenario models — integrated assessment tools that project outcomes rather than dictate targets — are a partial corrective. They do not pretend to optimise. They make trade-offs legible. They say: if we pass law X, here is what you gain on the economic dimension, here is what you lose on the environmental dimension, and here is the uncertainty range around both.
This approach is intellectually honest. It is also, for that precise reason, politically unpopular. A trade-off is a choice. A choice requires accountability. And accountability is precisely what the system is designed to defer.
This is not a measurement problem. It is power politics dressed in mathematical notation.
If the choice of the metric is an exercise of power — and it is — then the minimum an honest politics could do is to publicly name the measure of its own failure before it acts.
IV. The Architecture of Failure
A blueprint, in any working discipline of construction, has the property that what is missing is as informative as what is drawn. A wall without a load specification is not a stylistic omission. It is a defect. The blueprint is read both for what it shows and for what its silences imply.
A particular kind of organisation has built a parallel discipline for what is missing after a failure. Military units run After-Action Reviews. Software infrastructure teams conduct blameless post-mortems. Aviation safety boards investigate incidents without naming pilots. These are not procedural rituals. They are an applied methodology — and they share a single, severe principle.
You cannot improve a system by finding someone to blame. You improve it by finding the design flaw that made the failure probable, regardless of who was on duty. Blame collapses the analysis at exactly the moment when it should expand. The named culprit becomes the disposable explanation. The architecture survives intact and produces the same failure again, with a different name on it.
This discipline has never been imported into democratic politics. The reason is not that it is inapplicable. It is that it is too applicable.
Consider the German housing situation.
The 2021 coalition agreement contained a quantified commitment: four hundred thousand new housing units per year. The figure was stated, signed, photographed, distributed. It was not a slogan. It was a specification. Anyone who voted for that coalition could open the agreement, find the number, and hold it against the subsequent reality.
By 2024, completions had fallen to roughly 252,000, already a year-on-year decline of nearly fifteen per cent. By 2026, the working projection from the construction industry was below 200,000, with the more pessimistic estimates landing closer to 175,000. The gap was not marginal. It was structural: some thirty-seven per cent below the target in 2024, and fifty per cent or more on the 2026 projections, with the trajectory deteriorating rather than recovering. The most ambitious public housing commitment in a generation produced a result indistinguishable from no commitment at all.
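The size of the delta can be checked directly from the figures quoted above. The sketch below uses only the target from the coalition agreement and the three completion figures named in the text; the labels and the function name are illustrative.

```python
# Shortfall against the stated target, using only figures quoted in the text.

TARGET_UNITS_PER_YEAR = 400_000

OBSERVED = {
    "2024 completions": 252_000,
    "2026 industry projection": 200_000,
    "2026 pessimistic estimate": 175_000,
}


def shortfall_share(target: int, actual: int) -> float:
    """Fraction of the target that was not delivered."""
    return (target - actual) / target


if __name__ == "__main__":
    for label, units in OBSERVED.items():
        share = shortfall_share(TARGET_UNITS_PER_YEAR, units)
        print(f"{label}: {units:,} units, {share:.1%} below target")
```

The calculation adds nothing to the argument except a check: a promise stated as a number can be audited by anyone holding the number.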
The usual analytical response to a delta of this magnitude is well-rehearsed. Insufficient budget. Lack of political will. Too much bureaucracy. Construction industry capacity constraints. Interest rates. Each of these explanations is technically correct in the sense that none can be straightforwardly falsified. Each is also useless, because each is the failure restated at one level of abstraction higher. Bureaucracy is not a cause. It is a name for a class of friction that itself has causes. Lack of political will is what it looks like when a system’s incentives consistently push actors away from a stated objective. Pointing to these descriptions concludes the analysis at the exact moment the analysis should begin.
The other approach asks a different question. Which structural mechanisms make this outcome probable regardless of who is in charge?
Begin with planning law. The German Baugesetzbuch grants extensive participation and objection rights to existing property owners and neighbours over new construction in their vicinity. These rights are individually defensible — they protect citizens from arbitrary state action. In aggregate, they generate a system in which the holders of existing supply hold legal veto leverage over the introduction of new supply. The architecture of the law privileges the incumbent. New supply is, by construction, harder to permit than no new supply.
Continue with the tax treatment of undeveloped land. A plot held without development incurs negligible carrying cost. It is taxed by area, not by potential, not by use, not by social opportunity foregone. The economic signal sent by the system to the owner of an undeveloped plot is unambiguous: holding is free, developing is costly, waiting will appreciate the asset further. The man with twelve plots is not behaving irrationally. He is responding correctly to the price signal the state has set.
Continue with municipal approval processes. The average lag between building permit issuance and final completion for a typical multi-family development now frequently approaches three years, severely bottlenecking any supply-side response. Add the years of zoning revision, public objection, and administrative review that precede the permission itself, and the total cycle from initial proposal to completed building reliably exceeds the lifespan of any single coalition agreement. Whatever a federal government commits to in year zero will, by the time it could possibly produce a building, be governed by entirely different actors who can credibly disclaim the original commitment.
Continue with construction cost regulation. The combined effect of energy efficiency standards, accessibility requirements, fire codes, and acoustic norms — each individually defensible — produces a cost-per-square-metre floor for new construction that is consistently above what the social housing funding envelopes can finance. The state regulates social housing into existence on one balance sheet and prices it out of existence on another.
None of these are policy failures in the conventional sense. No specific politician is responsible. No specific decision is to blame. They are design decisions, taken at various points over decades, whose compounding effect is the outcome we observe. The architecture, not any one architect, produces the result.
The root cause does not name a culprit. It names a mechanism.
The architectural change that would follow, therefore, is not more political will. It is something specific. A Bodenwertsteuer — a land value tax that makes holding undeveloped land economically irrational, by levying carrying cost in proportion to potential rather than current use. A streamlined federal approval authority for designated priority construction corridors, compressing the multi-year permitting cycle by removing veto layers in defined zones. A formal separation of construction cost regulation from social subsidy levels, so that the regulatory floor cannot exceed the financing ceiling. These proposals are contestable. They are meant to be. But they are grounded in the cause, not floating above it.
Consider what changes for the man with twelve undeveloped plots. He would receive a recurring invoice for the housing he is not building, indexed to what that housing could plausibly be worth. The free appreciation he currently collects would be measured, at every billing date, against the rising cost of waiting. At some carrying-cost threshold, which is what the choice of tax rate determines, holding ceases to dominate developing, and the rational owner sells, builds, or accepts the loss. The change in his behaviour would not be moral. It would be arithmetic. That is what structural consequence means in the precise sense: the difference between a declaration that he ought to build and a price signal that makes not building the more expensive of his options.
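The threshold can be sketched under strong simplifying assumptions: the owner compares the annual appreciation of the plot, net of a land value tax, with the annual return available from selling and investing elsewhere. The rates below are invented for illustration, the function names are hypothetical, and a real decision would also involve development costs, risk, and transaction taxes that this sketch ignores.

```python
# Simplified decision rule for a landowner facing a land value tax.
# All rates are annual fractions of land value and are assumed for illustration.

def holding_dominates(appreciation_rate: float,
                      land_tax_rate: float,
                      alternative_return: float) -> bool:
    """True if holding the plot still beats selling and investing the proceeds."""
    return appreciation_rate - land_tax_rate > alternative_return


def breakeven_tax_rate(appreciation_rate: float, alternative_return: float) -> float:
    """Lowest tax rate at which holding no longer beats the alternative."""
    return max(appreciation_rate - alternative_return, 0.0)


if __name__ == "__main__":
    appreciation, alternative = 0.06, 0.03  # assumed annual rates
    print("breakeven tax rate:", breakeven_tax_rate(appreciation, alternative))
    for tax in (0.0, 0.02, 0.04):
        print(tax, holding_dominates(appreciation, tax, alternative))
```

With a tax rate of zero the owner holds. Above the breakeven rate, waiting becomes the more expensive option, which is the whole of the behavioural change the proposal relies on.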
What has been done in the preceding paragraphs is not innovation. It is the application, to a political failure, of a method any post-mortem in a serious engineering organisation would apply as a matter of course. Five movements. The precise statement of what was promised. The precise statement of what occurred. The quantified delta between the two. The structural mechanisms that make the delta probable independent of personnel. The architectural change that would remove those mechanisms.
These five movements go by different names in the disciplines that use them: After-Action Review, Failure Mode Analysis, blameless post-mortem. The names vary. The shape does not.
Promise. Reality. Delta. Root Cause. Structural Consequence.
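The five movements can be written down as a small record type, which makes the discipline visible: the delta is computed rather than asserted, and a review is incomplete until both mechanisms and architectural changes are named. The class and field names below are hypothetical, and the entries are condensed from the housing case above; this is one possible sketch, not an established template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostMortem:
    """Promise, reality, delta, root cause, structural consequence."""
    promise: int                                # quantified commitment
    reality: int                                # what actually occurred
    unit: str
    root_causes: tuple[str, ...]                # mechanisms, never names of people
    structural_consequences: tuple[str, ...]    # changes that remove the mechanisms

    @property
    def delta(self) -> int:
        """Reality minus promise; negative means the commitment was missed."""
        return self.reality - self.promise

    def is_complete(self) -> bool:
        """A review that names no mechanism or no change has stopped too early."""
        return bool(self.root_causes) and bool(self.structural_consequences)


# Entries condensed from the housing example in the text.
housing = PostMortem(
    promise=400_000,
    reality=252_000,
    unit="housing units completed per year (2024)",
    root_causes=(
        "objection rights give holders of existing supply veto leverage",
        "undeveloped land carries negligible holding cost",
        "approval and build cycle outlasts a coalition term",
        "regulatory cost floor exceeds social housing funding",
    ),
    structural_consequences=(
        "land value tax (Bodenwertsteuer)",
        "streamlined approval authority for priority corridors",
        "cost regulation separated from subsidy levels",
    ),
)

if __name__ == "__main__":
    print(housing.delta, housing.unit, "| complete:", housing.is_complete())
```

A record with an empty list of root causes would still state the promise and the delta, which is roughly where most political commentary stops.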
The framework is not complicated. It is unfamiliar in politics not because it is intellectually difficult, but because step four, the question of which structural mechanisms produce the outcome, inevitably identifies who benefits from the current design. The veto rights in planning law are valuable to incumbent property owners. The carrying-cost-free treatment of land is valuable to land bankers. The long approval cycles are valuable to municipalities that wish to retain control. The regulatory cost floor is valuable to the owners of the existing housing stock, whose price is supported by the artificial scarcity of new supply. Step five then requires those beneficiaries to accept loss.
There is no version of this method that is politically comfortable for the people who would have to implement it. That is not a flaw in the method. It is the explanation for its absence.
Naming culprits is psychologically satisfying and structurally useless. Naming design flaws is uncomfortable and potentially effective. The choice between the two is not analytical. It is political.
V. Ambiguity as a Product
The footage is now familiar enough to be a genre. A politician in 2022, in opposition, condemns a policy as falsch und unverantwortlich (“wrong and irresponsible”). The same politician in 2026, as chancellor, defends a policy of recognisably the same shape, describing it as so leidlich (“tolerably”) functional. The cut between the two clips writes itself.
Friedrich Merz on the Tankrabatt is the clean case. Merz, who in 2022 led the CDU/CSU as the principal opposition force in the Bundestag and by 2026 was Federal Chancellor, condemned the traffic-light coalition’s fuel tax rebate as a wasteful subsidy whose primary beneficiaries were oil companies rather than commuters. Four years later, with the price of diesel pushed above two euros per litre by the Iran conflict, he reinstated a comparable mechanism. The defence offered was minimal. It was not a recantation. It was not an account of the delta. It was a shrug, lightly verbalised.
The easy reading of this sequence is hypocrisy. It is not wrong, but it is not useful. Hypocrisy is a character judgment. It explains nothing about why the sequence recurs, across parties, across countries, across decades, with near-perfect regularity. If the only person who changed his stated position on entry to office were Merz, the pattern would be a personal flaw. The pattern is the pattern.
The more interesting reading begins with a steelman. Why would a responsible, thoughtful politician choose ambiguity? There are at least three defensible reasons.
First, epistemic humility. Complex systems respond to interventions non-linearly. A policy that seems clearly correct at the planning stage often produces unintended effects in deployment. A politician who commits to a precise metric and a precise timeline before implementation is not demonstrating courage. He is demonstrating overconfidence in his model of the world. Some deliberate flexibility is rational. The architect who refuses to specify load tolerances in advance is irresponsible; the architect who refuses to revise them when the soil report contradicts the original assumption is also irresponsible. There is a defensible middle.
Second, coalition management. Democratic governance, particularly under proportional representation, requires assembling enough support to act. A precise public commitment forecloses negotiation. Ambiguity preserves the space in which compromise is possible. This is not evasion; it is the mechanics of pluralism. A coalition agreement that pretended to specify exactly what each ministry would do over four years would be either a fiction or a tyranny.
Third, asymmetric information between roles. Opposition is a diagnostic position: you observe what is failing and you say so. Government is an executive position: you inherit the failing system with all its constraints, dependencies, and irreversibilities, and you must act inside it. The same policy looks different from inside than from outside. Some apparent position-changes are genuine updates on new information. The opposition politician criticising the fuel rebate does not have to explain what he would do if the price of diesel were 2.20 euros and the next election were six months away. The chancellor does.
All three of these defences are real. None of them survives the application of the framework from the previous chapter.
Epistemic humility does not require the refusal to specify what would count as failure. It requires the opposite. Because I am uncertain about outcomes, I will state in advance what would cause me to reverse course. Humility and measurability are not in conflict. Humility that refuses measurement is not epistemic. It is protective.
Coalition management does not require the abandonment of accountability after the decision is made. It requires flexibility before. Once the coalition has agreed to act, the agreement can be specific. The ambiguity that persists after commitment is not coalition management. It is shared liability diffusion — the deliberate spreading of consequence across so many actors that no single one can be held to account.
Asymmetric information between roles explains a position change. It does not excuse the absence of any mechanism requiring the actor to explain it. The politician who switches from critic to defender is not required, anywhere in the system, to account for the delta. He is not required to state what new information has arrived. He is not required to specify what his critic-self got wrong, or what his defender-self now believes the trade-off to be. The absence of any such requirement is the design flaw. So leidlich is not an analysis. It is a sound.
What all three defences share is the shape of their justification. They justify ambiguity in the moment of action. None of them justifies the structural absence of accountability after the action. The system is designed without a reckoning mechanism — not because reckoning is impossible, but because reckoning is costly to the actors who would have to design it into the system.
A second worked variant is the Gebäudeenergiegesetz. Germany passed an ambitious heating law in 2023, establishing a strict sixty-five per cent renewable requirement for new systems. It then immediately suspended that requirement for existing buildings pending municipal heat plans, allowing the continued installation of fossil boilers provided they transition to fifteen per cent bio-oil or green gas by 2029. By May 2026, a new modernisation draft proposed striking the rigid sixty-five per cent rule entirely. Bio-oil, in practice. Grain in the burner. The same molecules of bio-diesel and hydrogen-based e-fuels that are urgently required in steelmaking, cement production, and aviation, where battery substitution is physically impossible, are nominally allocated to twenty-year-old boilers in uninsulated houses to preserve the illusion of a transition without imposing its upfront costs. France, facing the same fuel-price shock, moved in the opposite direction: the prohibition on new gas heating systems in new buildings was advanced to 2027. The justification was structural rather than thermal: to break the dependency on Russia, the Gulf States, and the United States.
The two responses to the same situation are not differentiated by intelligence. They are differentiated by which incumbent interests are organised at the table, and which costs are politically tolerable to impose. Neither government is operating outside its incentive structure. Both are operating inside it. The German government, more so, because its incentive structure is more crowded with veto players.
Ambiguity, on this reading, is not a failure of political will. It is the equilibrium state of a system whose architects had strong incentives to make accountability optional. The actors are not deviating from the design. They are running the design as written.
Even the most honest defence of ambiguity fails at the same point: it cannot survive the question of what counts as failure. Once that question is forbidden — silently, structurally, not by rule — ambiguity stops being a tool. It becomes the product.
VI. The Honest Hmpf
It is late. The analysis is complete. The building stands as it stood. Nothing has changed.
There is a particular kind of frustration that arrives at the end of a long, careful analysis. It is not the frustration of confusion — that one resolves with information. It is the frustration of clarity. You can see the structure, you can name its components, you can trace the failure to its source, and the structure remains unchanged. The diagnosis is precise. The treatment is unavailable.
The Hmpf at the end of such an evening is an honest response. It is what a systems thinker produces when he has done the work, reached the correct conclusion, and found that the conclusion does not, by itself, move anything. It is shorter than a sentence and more accurate than most of them.
This essay is not a call to action. The people who would need to act on its conclusions are not the people who will read it. The structural incentives that produce the failures described in the preceding chapters are more durable than any argument against them. That is not pessimism. It is an accurate reading of how systems persist. Systems do not collapse because someone has finally explained them. They persist because their persistence is the equilibrium their architecture produces.
But the Hmpf is not the last word. It is not resignation — it is the honest acknowledgment of a gap. The gap between understanding a system and changing it. Between naming the architecture and rebuilding it. Between an accurate diagnosis and an available treatment.
There is a smaller act available, and this essay is already performing it.
The act of precise observation, carefully written and publicly placed, does not change the structure. But it changes the conditions under which the structure can be questioned. A system that depends on ambiguity is most stable when its ambiguity goes unnamed. When the mechanism is described — clearly, specifically, without ideological loading and without rhetorical disguise — it becomes harder to maintain. Not impossible. But harder. Each precise description is a small subtraction from the deniability that the architecture relies on.
Writing does not change the world. It changes the conditions under which change becomes thinkable. This is a modest claim. It is also, in the long accounting of how ideas move through systems, not a small one. The metric is chosen long before the policy is debated; the language is chosen long before the metric. A text that refuses ambiguity at the language layer makes ambiguity slightly more expensive at the layers above.
The architect who has documented a structural failure has done something specific. He has not repaired the building. He has produced a record. The record is admissible later, when the failure can no longer be denied and the question of how it was permitted to occur becomes a question the system has to answer.
The Hmpf, then, is the beginning of the sentence. The essay is its continuation. There is no ending that resolves the gap. There is only the choice to stop there, or to keep writing.
Sources & Further Reading
Measurement, Metrics & Power
- Jerry Z. Muller: The Tyranny of Metrics (Princeton University Press, 2018) – The authoritative treatment of how quantification displaces judgment in public and private institutions. The empirical breadth is what gives it authority; the theoretical argument was made forty years earlier.
- Charles Goodhart: Problems of Monetary Management: The U.K. Experience, in Papers in Monetary Economics (Reserve Bank of Australia, 1975) – The original source, rarely read in full. The law named after it is considerably more interesting than the summarised version in circulation.
Welfare Economics & the Capability Approach
- Amartya Sen: Development as Freedom (Oxford University Press, 1999) – The foundational argument against GDP as a welfare measure, and for capability deprivation as the correct unit of analysis.
- Martha C. Nussbaum: Creating Capabilities: The Human Development Approach (Harvard University Press, 2011) – The applied and political version of the framework: what it would mean to govern by it, and why it remains institutionally marginal.
Systemic Failure & Accountability
- Sidney Dekker: The Field Guide to Understanding ‘Human Error’ (Ashgate, 2002) – The practitioner’s manual for replacing blame with system analysis. The intellectual substrate for every blameless post-mortem in use today.
- Karl E. Weick & Kathleen M. Sutcliffe: Managing the Unexpected (Wiley, 2015) – On how high-reliability organisations sustain accountability without collapsing into blame. The contrast with democratic governance is instructive.
- Philip Tetlock: Expert Political Judgment (Princeton University Press, 2005) – On the measurability of political predictions and the institutional costs of maintaining unfalsifiable forecasts. The empirical basis for demanding that political actors specify their failure conditions in advance.
Land, Housing & the Architecture of Scarcity
- Statistisches Bundesamt: Baugenehmigungen und Baufertigstellungen, annual publication – The primary source for completion figures and permit-to-completion lags cited in Chapter IV.
- Fred Harrison: The Power in the Land (Shepheard-Walwyn, 1983) – The political economy of land value taxation; older than it should need to be. The theoretical basis for the Bodenwertsteuer argument has not changed. The political resistance to it has not changed either.