Your Performance System Is Measuring the Wrong Things — And the 2025-2026 Data Proves It
Manager engagement just hit 22%. Half of all projects fail to return more value than they cost. Trust in immediate managers has dropped 37% in three years. These are not isolated data points from different problems. They are the same problem showing up in different departments. Organizations are measuring outputs while the system conditions that produce those outputs continue to degrade. The surveys report symptoms. Leadership responds to symptoms. The conditions get worse. Repeat.
This is not an argument about culture or leadership philosophy. It is a structural diagnosis backed by specific numbers from Gallup, PMI, and DDI research released in 2025 and 2026. The mechanism is visible once you stop treating every negative metric as a motivation problem and start asking what organizational decisions produced the conditions being measured.
Here is what the data actually says, what it means for how you run your business, and what you can do about it in the next 90 days.
Manager Engagement Just Posted Its Largest Single-Year Drop on Record
Gallup's 2026 data shows manager engagement at 22%, down from 27% in 2024. That five-point drop in a single year is the largest year-over-year decline on record. Global employee engagement simultaneously hit 20%, the lowest level since 2020. Gallup attributes both declines to structural drivers: organizational flattening and expanding team spans.
Before you accept the framing that this is an engagement problem, trace the causal chain. Managers account for 70% of variance in team engagement. When the management layer degrades, team performance degrades. When team performance degrades, business results degrade. The $10 trillion global productivity cost that analysts attach to low engagement is a downstream consequence of ignoring what is happening to the structural conditions managers operate in. It starts with a delayering decision in a budget meeting, not a cultural failure.
Here is the mechanism. An organization decides to flatten. Layers are removed. Spans grow. A manager who was overseeing eight people is now overseeing twelve, sometimes more, without a corresponding reduction in the scope of what that role requires. Cognitive load increases. Relational bandwidth shrinks. The quality of one-on-ones, feedback, and goal communication all degrade because there is less time and less capacity per person. Then the annual engagement survey runs, the scores drop, and the response is a leadership development program or a communication campaign.
That response misses the cause entirely.
The structural overload condition was created by a design decision. The survey named it a sentiment problem. Those are not the same thing, and treating them as the same thing is why engagement scores keep declining despite years of investment in culture initiatives.
There are two operational actions that actually address the cause:
- Map team spans for every manager across the last 36 months. Any manager whose span increased more than 25% without a documented reduction in role scope has a structural overload problem, not an engagement problem. Label it accurately before you decide how to respond.
- Audit how managers were selected. In organizations achieving 79% manager engagement, selection quality is the primary lever. That number versus the current 22% global average represents a gap that no training program closes after the fact. It closes at the point of selection.
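The span check in the first action can be sketched in a few lines. This is a minimal illustration, not a prescribed tool: the ManagerSpan record, its field names, and the sample managers are hypothetical, and the only rule taken from the text is the 25% threshold combined with the absence of a documented scope reduction.

```python
from dataclasses import dataclass

# Threshold taken from the text: a span increase of more than 25%.
SPAN_INCREASE_THRESHOLD = 0.25


@dataclass
class ManagerSpan:
    """Hypothetical record for one manager; field names are illustrative."""
    name: str
    span_36_months_ago: int   # direct reports 36 months ago
    span_today: int           # direct reports today
    scope_reduced: bool       # True if a documented reduction in role scope exists


def span_increase(m: ManagerSpan) -> float:
    """Relative change in span over the 36-month audit window."""
    return (m.span_today - m.span_36_months_ago) / m.span_36_months_ago


def is_structurally_overloaded(m: ManagerSpan) -> bool:
    """Rule from the text: more than 25% growth with no documented scope reduction."""
    if m.span_36_months_ago <= 0:
        # No baseline to compare against (for example, a new manager).
        return False
    return span_increase(m) > SPAN_INCREASE_THRESHOLD and not m.scope_reduced


# Made-up sample data for illustration only.
managers = [
    ManagerSpan("A", 8, 12, scope_reduced=False),   # +50%, no relief
    ManagerSpan("B", 8, 12, scope_reduced=True),    # +50%, scope was reduced
    ManagerSpan("C", 10, 11, scope_reduced=False),  # +10%, under threshold
]

overloaded = [m.name for m in managers if is_structurally_overloaded(m)]
print(overloaded)
```

Only manager A is flagged here. The eight-to-twelve case from earlier is a 50% increase, well past the threshold, but manager B escapes the label because the scope reduction was documented.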
One additional data point worth holding: fewer than 47% of employees strongly agree they know what is expected of them in their role. That is a role clarity and goal-communication failure. It has a direct, measurable fix. It is not a cultural sentiment issue. It is a management infrastructure problem that shows up in engagement scores and gets misread as a morale problem.
The pattern here is consistent. When organizational design decisions get made without accounting for cognitive and relational load on managers, engagement metrics decline. The decline gets attributed to leadership culture. The actual cause, the structural design, goes unaddressed. The cycle continues.
The Project Failure Math Nobody Wants to Run
PMI's 2025 Project Success Report surveyed more than 5,800 respondents across industries. The headline finding: only 50% of projects successfully deliver value exceeding their time, cost, and effort investment. Another 37% partially deliver. 13% fail outright.
Read that again. The default outcome for any project you launch is a coin flip on whether it returns more than it costs. That is not a pessimistic interpretation. It is what the data says across a sample of more than 5,800 respondents.
The most common reason cited for this failure, by 35% of respondents, is not poor strategy. It is the disconnect between planning and execution. The strategy was sound. The handoff from planning to actual delivery is where value evaporates.
Here is why that happens structurally. Most governance processes measure schedule and budget. Did the project ship on time? Did it come in under budget? Those are delivery metrics, not value metrics. An initiative can be on time, under budget, and still fail to address the problem it was scoped to solve, because the problem changed between kickoff and delivery and nobody was measuring strategic alignment as a live variable. They were measuring the optics of delivery.
High-performing organizations use 44% more metrics than average performers, and specifically include strategic alignment and outcome measures in their review cadence. That gap in measurement is the execution gap. You cannot close an execution deficit by improving project management discipline if the measurement system is only tracking schedule and budget.
There is a compounding factor here. The PMI data also shows that only 25% of project training investment goes toward business acumen, versus 46% toward technical skills. The people managing your most complex initiatives are being trained primarily on methodology, not on the business judgment that determines whether the initiative is still solving the right problem. As transformation complexity grows, the workforce managing it is underequipped on the dimension that predicts whether the portfolio delivers value.
Two actions that change the outcome:
- Add one strategic alignment question to every active initiative review: does this project still address the problem it was originally scoped to solve? This is not a quarterly question. It belongs in every governance meeting. If the answer has drifted, that drift is costing you before delivery, not after.
- Rebalance training investment toward business acumen. If your project managers can execute a methodology flawlessly but cannot articulate whether the initiative is still strategically relevant, you have a judgment gap that no process improvement closes.
The broader implication for portfolio management: if your governance process cannot explain why a given project is in the 50% that succeed rather than the 37% that partially deliver or the 13% that fail outright, you do not have an execution system. You have a reporting system. Reporting systems tell you what happened. Execution systems change what happens.
Trust in Managers Has Collapsed — and the Cause Is Not Who You Think
DDI's Global Leadership Forecast 2025 puts trust in immediate managers at 29%. That is a 37% decline from 2022. In three years, organizations have lost more than a third of the trust the management layer held.
That timeline matters. 2022 to 2025 is precisely the period when most large organizations executed significant flattening initiatives, expanded management spans, and simultaneously increased the strategic demands placed on managers, expecting them to drive transformation while absorbing the administrative load of the layers that were removed beneath them. The trust collapse is not a coincidence. It is the lagging measurement of those structural decisions.
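To make the size of that drop concrete, the 29% and 37% figures can be combined to back out the starting point. This is a quick arithmetic sketch under one assumption: that the 37% is a relative decline rather than a drop in percentage points. The baseline it produces is implied by the two quoted numbers, not quoted from the report.

```python
# Figures from the text: trust stands at 29% after a 37% decline since 2022.
current_trust = 0.29
relative_decline = 0.37

# Assumption: the 37% is a relative decline, so current = baseline * (1 - decline).
implied_baseline = current_trust / (1 - relative_decline)
points_lost = implied_baseline - current_trust

print(f"Implied 2022 baseline: {implied_baseline:.0%}")    # prints 46%
print(f"Percentage points lost: {points_lost * 100:.0f}")  # prints 17
```

Read this way, trust went from a little under half of employees to under a third, a loss of roughly 17 percentage points.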
The DDI data layers on top of this: 71% of leaders report increased stress since taking their current roles. 40% are considering leaving for better wellbeing. Only 49% of key roles can be filled from existing internal talent. 83% of HR leaders say organizations will need materially different capabilities within five years.
Work through the compounding effect of those numbers together. You have a management layer where nearly three-quarters are operating under elevated stress, four in ten are considering exit, internal succession covers fewer than half of critical roles, and trust from the people they lead has dropped by more than a third. That is not a talent development problem. That is a structural resilience problem. A succession crisis running in slow motion.
The organizations that identify this as a leadership quality problem are looking at lagging indicators and calling them causes. Trust fell because structural conditions were degraded. Stress increased because scope expanded while support was removed. Exit intent rose because the conditions produced it. Diagnosing the symptom and responding with a wellbeing program does not address what created the symptom.
Two structural responses worth running now:
- Run an internal role fillability audit. If more than 50% of critical roles have no credible internal successor within 18 months, you are one departure from an operational crisis. That is not a hypothetical risk. It is a current condition that requires action before the departure happens, not after.
- Audit what was removed from management roles during the last flattening. If scope expanded without a corresponding reduction in administrative burden, the structural capacity that makes good management possible was removed. Restoring it comes before any training investment. Training a manager to develop trust while leaving the structural overload in place is working against yourself.
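The fillability audit in the first response reduces to a simple share calculation. The sketch below is illustrative only: the CriticalRole record, the role titles, and the readiness estimates are hypothetical, while the 18-month window and the 50% threshold come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the text.
READINESS_WINDOW_MONTHS = 18
UNFILLABLE_SHARE_THRESHOLD = 0.50


@dataclass
class CriticalRole:
    """Hypothetical record; None means no credible internal successor exists."""
    title: str
    successor_ready_in_months: Optional[int]


def has_credible_successor(role: CriticalRole) -> bool:
    """A successor counts only if they can be ready within the 18-month window."""
    return (role.successor_ready_in_months is not None
            and role.successor_ready_in_months <= READINESS_WINDOW_MONTHS)


def unfillable_share(roles: List[CriticalRole]) -> float:
    """Share of critical roles with no credible successor inside the window."""
    if not roles:
        return 0.0
    uncovered = sum(1 for r in roles if not has_credible_successor(r))
    return uncovered / len(roles)


# Made-up sample data for illustration only.
roles = [
    CriticalRole("Head of Operations", 6),
    CriticalRole("Finance Lead", None),
    CriticalRole("Engineering Director", 24),  # successor exists, but not within 18 months
    CriticalRole("Sales Director", 12),
    CriticalRole("Plant Manager", None),
]

share = unfillable_share(roles)
at_risk = share > UNFILLABLE_SHARE_THRESHOLD
print(f"{share:.0%} of critical roles lack a ready successor; at risk: {at_risk}")
```

In this sample three of five roles are uncovered, which puts the organization over the 50% line. Note that a successor who needs 24 months counts as uncovered, because the test is readiness within the window, not the existence of a name on a chart.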
The pattern across all three of these data sets is the same: leadership burnout, trust collapse, and engagement decline are lagging indicators of structural decisions made 18 to 36 months earlier. The organizations that run structural audits before the lagging indicators arrive do not face the same crisis. They see the mechanism in time to address it at the cause rather than the symptom.
The next sections cover how the measurement systems themselves accelerate this failure — specifically how Goodhart's Law is destroying your KPI system, and why your behavior change programs are producing no measurable change.
Goodhart's Law Is Running Your Performance System Into the Ground
In 1975, economist Charles Goodhart made an observation that, in its now-common paraphrase, should have ended the blind faith in KPIs permanently: when a measure becomes a target, it ceases to be a good measure. Fifty years later, organizations still build performance systems as if this law doesn't apply to them. It does. And the cost is visible in the numbers if you're willing to look.
Here's what Goodhart's Law looks like at organizational scale. You set call volume as the primary sales metric. Reps hit the number. Conversion stays flat or drops. You add a conversion rate metric. Reps start cherry-picking high-probability leads and avoiding difficult accounts. Pipeline health degrades. You add pipeline coverage as a third metric. Reps start logging speculative conversations as qualified opportunities. Forecast accuracy collapses. At no point did anyone break a rule. At every point, the measurement system created a narrower and narrower optimization target, and rational people optimized for it.
This is not a motivation problem or a culture problem. It is an incentive architecture problem. MIT Sloan's Spring 2026 issue put it directly: narrow optimization creates gaming behaviors and ethical lapses without requiring individual bad intent. The people running the system don't need to be dishonest for the system to produce dishonest outputs. The measurement design does it for them.
The failure mode of balanced scorecards — and they do fail — is not their complexity. It's their static nature. Once targets are set and communicated, they create optimization pressure that distorts the behavior the metric was supposed to observe. The metric was a proxy for a real outcome. Once people know the proxy is what's being evaluated, they optimize for the proxy and the real outcome becomes secondary.
You can audit for this in your own organization right now. Look at your top three performance metrics. For each one, ask: are people hitting the number without producing the underlying outcome the number was supposed to represent? If your customer satisfaction score is up but repeat purchase rate is flat, the score has stopped tracking the outcome, and gaming is the most likely reason. If your project completion rate is high but, as in PMI's data, only about half of those projects deliver value exceeding their cost, completion is the wrong metric and it's being treated as the right one. That gap between the metric and the outcome it was meant to represent is the Goodhart signature.
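The audit question can be expressed as a simple comparison between how much the metric moved and how much the outcome moved. This is a rough sketch, not a validated detector: the function names and both tolerance values are assumptions chosen for illustration, and it assumes higher is better for both the metric and the outcome.

```python
def relative_change(before: float, after: float) -> float:
    """Relative movement between two periods; assumes a nonzero starting value."""
    return (after - before) / before


def goodhart_signature(metric_before: float, metric_after: float,
                       outcome_before: float, outcome_after: float,
                       min_metric_gain: float = 0.05,
                       max_outcome_gain: float = 0.01) -> bool:
    """True when the metric improved meaningfully while the outcome stayed flat or fell.

    Both tolerances are hypothetical defaults; set them to match the noise
    level of your own measures.
    """
    metric_gain = relative_change(metric_before, metric_after)
    outcome_gain = relative_change(outcome_before, outcome_after)
    return metric_gain >= min_metric_gain and outcome_gain <= max_outcome_gain


# Satisfaction score up 10%, repeat purchase rate flat: the signature is present.
csat_flag = goodhart_signature(4.0, 4.4, 0.30, 0.30)

# Call volume up 20%, qualified pipeline up 20%: metric and outcome moved together.
calls_flag = goodhart_signature(1000, 1200, 50, 60)

print(csat_flag, calls_flag)
```

A flag is a prompt to investigate, not proof of gaming. The outcome can also lag the metric for legitimate reasons, so the comparison window matters as much as the thresholds.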
The fix is structural, not motivational. Pairs of metrics are significantly harder to game simultaneously than individual metrics. If you measure both call volume and deal quality scores assigned post-close, optimizing one against the other requires actually doing the job rather than working around it. MIT Sloan's framework proposes adapting techniques from AI model training — where Goodhart's Law creates model collapse if not explicitly designed around — to human performance systems. The core mechanism is the same: you need second-order measures that make gaming the primary measure costly rather than costless.
There's a second structural fix that most organizations skip entirely: rotating metrics on a fixed cycle and auditing explicitly for gaming signatures before the next cycle begins. Performance measurement systems don't degrade because people get worse. They degrade because measurement creates increasingly narrow optimization pressure over time, and that pressure compounds. The organizations that sustain measurement validity do so by treating metric design as a live operational problem, not a one-time setup task.
Why Communication Campaigns Don't Change Behavior
Every year, organizations spend substantial training budgets on behavior change programs built around communication campaigns. Town halls. Values workshops. Manager training sessions on the new strategic priorities. Email cascades explaining the why behind the what. And every year, the behavior they were trying to change stays approximately the same.
The Journal of Change Management's 2026 literature review covering 25 years of research makes the mechanism explicit: change programs consistently fail because they treat readiness, resistance, and engagement as fixed individual traits rather than dynamic processes shaped by context. Put plainly — you're trying to change a person when the problem is the system they're operating inside.
Here's why this matters mechanically. A communication campaign operates on the assumption that if people understand what you want and why you want it, they'll change their behavior to match. This assumption is wrong in a specific and measurable way. Understanding and behavior change are not the same causal chain. A person can fully understand why you want them to document client interactions in the CRM and still not do it, not because they're resistant or disengaged, but because the moment they're deciding whether to document — right after a client call, three minutes before their next meeting — there is nothing in that environment pulling them toward documentation and several things pulling them away from it.
The research term for this is decision context. Behavior change happens at the moment of choice, not at the moment of communication. HBR's February 2026 analysis of the 4T model makes the mechanism concrete: target one specific behavior, build the theory from actual data diagnostics rather than assumptions, deliver the intervention at the real decision moment, and test rigorously. The operative word is timely. An intervention that operates outside the actual decision context produces no change because it's not present when the decision is being made.
This is why nudge design outperforms training programs on behavior change outcomes. A nudge at the moment of choice — a prompt in the system at the point of data entry, a structured question built into the meeting template, a default option set toward the desired behavior — operates where the decision actually happens. A training session on why documentation matters operates hours or days before the decision happens and has no presence at the decision point.
The organizational cost of misunderstanding this mechanism is substantial. Consider what behavior change programs actually cost: design, delivery, facilitator time, employee time, follow-up measurement. Now consider that the evidence base says traditional training campaigns produce no measurable behavior change when they operate outside the actual decision context. That's not a marginal return problem. That's resource allocation to a method that the research has disconfirmed.
The 2026 literature review adds another layer that most change management programs get wrong: the same employee who blocks change in one context will drive it in another. Labeling people as change-resistant based on a survey response at a single point in time is a category error. Resistance is contextual and relational. It shifts with the clarity of direction being given, the quality of management in the immediate environment, and the degree of agency the person has over how the change affects their work. Fix those conditions and the resistance frequently disappears — not because the person changed, but because the system conditions that produced the resistance changed.
The practical implication: before you run another change initiative, identify your highest-resistance group and examine what's actually true about their context. Are they receiving conflicting direction from different parts of the organization? Has their agency over their work been reduced without explanation? Are the outcomes of the change unclear to them in ways that affect their daily decisions? If yes to any of those, fixing the conditions is the intervention. Sending them through another communication cascade will produce the same result you got last time.
Performance Theater vs. Real Accountability
Most organizations have performance processes. Very few have accountability systems. The difference is specific and it's costing you in ways that show up in your execution numbers.
Performance theater looks like this: quarterly reviews where managers rate direct reports on a five-point scale, stack-ranked against a forced distribution, discussed in a calibration session, and filed. Annual goals set in January that haven't been revisited since March because the business shifted. Project governance meetings where status is reported as green when the underlying delivery is at risk, because nobody wants to be the person who called the project red. KPI dashboards that get presented to leadership showing numbers that hit the target while the business outcome the target was supposed to represent is flat or declining.
None of this is accountability. It is documentation of process. The distinction matters because process documentation and accountability produce different behaviors and different outcomes.
Real accountability has three characteristics that performance theater lacks. First, it is tied to outcomes, not activities. The question is not whether the project was delivered on time and on budget. PMI's data shows only half of all projects deliver value exceeding their investment, and hitting schedule and cost targets does not tell you which half a project is in. The question is whether the outcome the project was scoped to produce actually occurred. If your governance process can't answer that question for every active initiative, you have a reporting system, not an accountability system.
Second, real accountability requires the gap between commitment and result to be visible and named without political cost. In most organizations, the person who surfaces a problem early enough to fix it gets less reward than the person who keeps the project status green until the problem becomes undeniable. That incentive structure is the direct cause of late-stage project failures that could have been course-corrected three months earlier. If the accountability system punishes transparency, you will get opacity. The system produces the behavior.
Third, real accountability is specific about who decided what, not just what happened. Post-mortems that identify process failures without identifying the decision points that produced them don't prevent recurrence. They just give you a story about what went wrong. The mechanism you need to close is the one where a specific person made a specific call with specific information and the call was wrong. That's the learning. Everything else is narrative.
Three audits that test for this:
- Audit your last three failed or underperforming initiatives. Identify the specific decision point where failure became the likely trajectory. Who made that decision, with what information, against what criteria?
- If your performance reviews contain more ratings than documented commitments with follow-up dates, you have a documentation process, not an accountability process.
- For every active KPI, identify what business outcome it was designed to represent. If the KPI is green and the underlying outcome is flat, the metric has been compromised. Act on the outcome, not the metric.
The accountability gap closes when consequences — positive and negative — are tied to outcomes rather than process compliance. That requires leadership to be willing to name the gap between what was committed and what was delivered, in specific terms, without softening it into process language. Most organizations aren't willing to do that consistently. The ones that are show up differently in execution data, retention of high performers, and the speed at which problems surface versus the speed at which they compound.
The structure you build around measurement, behavior change, and accountability either produces the outcomes you want or it doesn't. The data says most structures don't. The question is whether you're willing to audit the structure or keep attributing the results to the people inside it.
What Business Performance Engineering Actually Means
Every data point in this issue points to the same structural failure. Manager engagement dropped to 22% because spans expanded without anyone accounting for cognitive load. Half of all projects fail to deliver value because planning-to-execution alignment is a kickoff checkbox, not a live variable. Trust in managers fell 37% in three years because the system stripped capacity from the management layer and then asked the survey why people were losing confidence. KPIs get gamed because static targets create optimization pressure that narrows behavior without anyone intending it.
None of these are attitude problems. None of them are fixed by communication campaigns, leadership training retreats, or another round of pulse surveys. They are engineering problems. And that distinction matters because engineering problems have structural solutions.
Business Performance Engineering (BPE) is the practice of measuring and designing the conditions that produce outcomes, not the outcomes themselves. The output is a lagging indicator. By the time it shows up in your numbers, the structural failure that caused it is already 18 to 36 months old. What BPE does is move the measurement point upstream, into the system conditions: span of control, role clarity, metric design, decision context, pipeline depth, resource alignment. If those conditions are sound, the outputs follow. If they are degraded, no amount of output-focused management recovers them without fixing the underlying structure first.
This is not a rebranding of management consulting. It is a rejection of the assumption that human performance is primarily a motivation and culture problem. The Gallup data does not show that the 78% of managers who are not engaged stopped caring. It shows that the system produced conditions that make engagement structurally difficult: spans that grew without a matching scope reduction, selection criteria that never prioritized management quality, and measurement systems that report on sentiment instead of load. The DDI data on 40% of leaders considering exit is the downstream consequence of structural decisions made during the last round of organizational flattening. The MIT Sloan work on Goodhart's Law is the explanation for why your KPI system is producing numbers that no longer reflect the behavior you are trying to observe.
BPE names all of this as engineering territory. Which means it is diagnosable, designable, and fixable. Not with a framework to admire. With specific structural changes made in a specific sequence.
Three Structural Repairs You Can Make Right Now
The research summarized in this issue is not ambiguous about where the leverage is. The following three repairs are directly tied to the data. Each one addresses a structural failure, not a symptom. None of them require a budget approval for a new program.
Repair 1: Audit Manager Spans and Restore Structural Capacity
The Gallup finding is specific: manager engagement dropped from 27% to 22% in a single year, the largest single-year decline on record, and the structural driver is organizational flattening that expanded team spans without reducing role scope. The DDI data confirms the consequence: trust in immediate managers fell to 29%, a 37% decline over three years, during the same period that spans expanded and strategic demands on managers increased.
The repair starts with a span audit, not a leadership development program. Pull the team sizes for every manager in your organization today versus 36 months ago. Any manager whose span increased more than 25% without a documented reduction in administrative or scope load has a structural overload condition. That condition is what the engagement survey is reporting. It is not what the engagement survey says it is reporting.
Once you have identified the overloaded managers, the question is not how to re-engage them. The question is what was removed from the role when scope expanded, and whether you can restore it. That might mean reducing administrative requirements. It might mean eliminating standing meetings that consume management capacity without producing decisions. It might mean narrowing the manager's accountability scope to match the expanded relational load of a larger team. The point is structural. Training a structurally overloaded manager on engagement techniques produces no durable change. Fixing the overload condition does.
Organizations that reach 79% manager engagement — the benchmark for best-practice organizations in the Gallup data — get there primarily through selection quality and structural support, not through culture programs. If you cannot point to both of those as active, current investments, your engagement number will not move.
Repair 2: Add Strategic Alignment as a Live Variable in Every Initiative Review
The PMI data is a direct indictment of how most organizations run project governance. Across more than 5,800 respondents, only 50% of projects deliver value exceeding their time, cost, and effort investment. The top barrier, cited by 35% of respondents, is the disconnect between planning and execution. The planning-to-execution disconnect does not happen because strategies are bad. It happens because alignment is assessed once at kickoff and then removed from the measurement system.
The repair is a single question added to every active initiative review: does this project still address the problem it was scoped to solve? That is it. Not a new governance layer. Not a new reporting template. One question, every review cycle, for every active initiative.
The reason this works is that the PMI data shows high-performing organizations use 44% more metrics than average specifically because they include strategic alignment and outcome measures alongside schedule and budget. Schedule and budget measure delivery. Strategic alignment measures value. Most organizations are running governance processes that can tell them whether a project shipped on time and on budget, but cannot tell them whether it was still worth doing at the time it shipped. The execution gap is the gap between those two measurements.
Alongside this, look at where your training spend is going. The PMI data shows only 25% of training investment goes to business acumen versus 46% to technical skills. The project managers running your portfolio are already trained on technical capability. What they are short on is business judgment. The execution gap is a judgment gap. Reallocating even 10 percentage points of training spend toward business acumen is likely to close more of the execution gap than adopting another project management methodology.
Repair 3: Audit Your Top Three KPIs for Gaming Signatures
The MIT Sloan analysis of Goodhart's Law is the structural explanation for why most performance systems degrade over time without anyone making a deliberate decision to game them. When a measure becomes a target, optimization pressure narrows behavior toward the metric and away from the underlying outcome. This happens at scale, without individual bad intent, because that is how incentive structures work.
The audit is straightforward. For each of your top three performance metrics, ask one question: are people hitting the number without producing the underlying outcome the number was designed to measure? Call volume up but qualified pipeline flat. Customer satisfaction scores high but repeat purchase rate declining. Project completion rates strong but value delivery at 50%. If you can find the gap between the metric and the outcome, the metric has been Goodharted.
The structural fix is a second-order measure for each primary KPI. Pairs of metrics are significantly harder to game simultaneously than individual metrics because optimizing for one tends to reveal the gap in the other. If call volume is your primary metric, add qualified opportunities created as the paired measure. Now the rep who runs up call volume without generating qualified opportunities is visible in the system, not rewarded by it.
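The call volume pairing described above can be sketched as a two-condition check. Everything specific here is an assumption for illustration: the rep names and figures are made up, and the rule itself (at or above median call volume, with an opportunities-per-call rate below half the team median) is one hypothetical way to define the pairing, not a method taken from the MIT Sloan framework.

```python
from statistics import median

# Made-up figures for one period: rep -> (calls made, qualified opportunities created).
reps = {
    "rep_a": (400, 20),
    "rep_b": (650, 6),
    "rep_c": (380, 18),
    "rep_d": (600, 27),
}


def flag_volume_without_quality(reps, ratio_floor=0.5):
    """Return reps at or above median call volume whose opportunities-per-call
    rate falls below ratio_floor times the team median rate.

    ratio_floor is a hypothetical tolerance; reps with zero calls are skipped.
    """
    active = {name: v for name, v in reps.items() if v[0] > 0}
    if not active:
        return []
    median_calls = median(calls for calls, _ in active.values())
    median_rate = median(opps / calls for calls, opps in active.values())
    flagged = []
    for name, (calls, opps) in active.items():
        rate = opps / calls
        if calls >= median_calls and rate < ratio_floor * median_rate:
            flagged.append(name)
    return flagged


flagged = flag_volume_without_quality(reps)
print(flagged)
```

Here rep_b leads the team on call volume and is the only one flagged, because those calls convert at a fraction of the team rate. rep_d has similar volume and is not flagged. That is the point of the pair: high volume is only a problem when the second measure does not move with it.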
The MIT Sloan framework recommends rotating metrics deliberately and auditing for gaming signatures on a fixed cycle, not just when performance problems become visible. By the time gaming is visible in your output numbers, the distortion is already 12 months old. The audit catches it upstream.
What the Next 12 Months Look Like — With and Without the Fix
If you run the structural repairs, here is what the 12-month picture looks like. Manager spans get mapped, overloaded managers get structural relief, and engagement begins to recover because the condition driving disengagement has been addressed. Selection criteria get audited, and future manager appointments weight relationship quality and coaching capability rather than technical expertise alone. Project governance adds strategic alignment as a live variable, and the portfolio mix starts to shift toward the 50% of projects that deliver value rather than the 37% that partially deliver and the 13% that fail outright. KPI audits surface two or three metrics that have been Goodharted, second-order measures get added, and the gaming signatures disappear because the incentive architecture no longer rewards them. Leadership pipeline depth gets assessed, and the organizations with fewer than 50% of critical roles fillable internally start building internal succession rather than discovering the gap the day a critical role turns over.
None of this is dramatic. It is incremental structural repair that compounds over 12 months into a measurably more functional operating system.
If you do not run the repairs, the 12-month picture is also predictable. Manager engagement continues to decline because the structural conditions driving it remain in place. Another round of engagement surveys reports the same numbers with slightly different framing. Another cohort of projects ships on time and on budget and fails to deliver value because strategic alignment was a kickoff conversation. KPIs continue to get hit in ways that do not move the underlying business. One or two critical leaders in the 40% who are considering exit actually leave, and you discover in real time how few internal successors are ready. The training budget gets spent on technical skills that do not close the execution gap. And the structural decisions that are producing all of this remain invisible because the measurement system is still pointed at outputs.
The data in this issue is not a prediction. It is a description of what is already happening in most organizations. The question is whether you are measuring the system that produces your outputs, or just the outputs.
What to Do Next
If any of this maps to what you are seeing in your organization, the starting point is not a new initiative. It is an audit. Within 30 days, run your last three completed projects against the PMI 50% baseline and determine whether each one actually delivered value exceeding its cost, not just whether it shipped. Within 60 days, map your top five managers' spans against 36 months ago and document what structural support was removed when spans expanded. Within 90 days, pick one high-impact behavior tied to your most critical performance gap and apply a targeted intervention at the actual decision moment rather than a training program delivered outside of context.
That sequence is not a framework. It is the specific diagnostic and repair sequence that the research in this issue points to directly. If you want to work through it with someone who has run it inside operating businesses — not consulted on it from the outside — that is what this work is about. Reach out directly. Operator to operator.
