The Margin Expansion Playbook
What actually predicts operating margin improvement? A machine learning analysis of 140 companies over six years.
Executive Summary
- Profit margin, trailing P/E, and revenue growth are the three strongest predictors of forward operating margin improvement across our 140-company panel. Companies entering a period with low current profitability and strong top-line momentum are the most likely to expand margins over the following year.
- The lever ranking shifts by sector. In Industrials, current operating margin is the dominant predictor. In Technology, valuation relative to revenue matters most. In Financials, price-to-earnings and beta lead. A one-size-fits-all playbook misses these differences.
- Directional accuracy of 56% in walk-forward cross-validation means the model is modestly better than chance at predicting whether a company’s margins will improve or deteriorate. This is consistent with the inherent noise in financial data, but the SHAP decomposition still reveals which characteristics carry the most signal.
The question
When a PE operating partner walks into a new portfolio company on day one, they face a prioritization problem. There are dozens of potential initiatives: pricing optimization, procurement consolidation, org restructuring, technology investment, go-to-market changes. Time and capital are finite. The standard approach is to run a diagnostic, draw on pattern recognition from prior deals, and make a judgment call.
We wanted to test whether the data could inform that judgment call more systematically. Specifically: across a broad panel of public companies, which observable characteristics at entry are most associated with operating margin improvement over the following year? And does the answer change depending on the sector?
Data and methodology
Our dataset is a panel of 140 public companies across seven sectors (Consumer, Energy, Financials, Healthcare, Industrials, Real Estate, and Technology) observed annually from 2019 through 2025. The raw data comes from two sources: SEC 10-K filings processed through our NLP pipeline, and financial data from Yahoo Finance. After merging and computing forward-looking targets, the effective sample is 355 company-year observations where we have both entry characteristics and a known forward margin outcome.
For each observation, we compute 44 features spanning several categories:
- Fundamental metrics: operating margin, profit margin, revenue growth, EV/EBITDA, EV/Revenue, P/E ratios, ROE, debt-to-equity, beta
- Lagged values: 1-year, 2-year, and 3-year lags of key metrics, capturing trajectory
- Rolling statistics: 2-year and 3-year moving averages and standard deviations, measuring stability
- Growth rates: year-over-year percentage changes and acceleration (change in growth rate)
- Sector-relative metrics: each company’s metrics relative to its sector median, controlling for industry norms
- Scale proxies: log-transformed market cap, enterprise value, and revenue
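The feature families above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the pipeline used for the analysis: the function names are hypothetical, and treating "sector-relative" as a difference from the sector median (rather than a ratio) is an assumption.

```python
import math
from statistics import mean, median

def lag(series, k):
    """Value k periods earlier; None where history is too short."""
    return [series[i - k] if i >= k else None for i in range(len(series))]

def rolling(series, window, fn):
    """Apply fn to each trailing window; None until the window is full."""
    return [fn(series[i - window + 1:i + 1]) if i >= window - 1 else None
            for i in range(len(series))]

def yoy_change(series):
    """Year-over-year percentage change; None for the first year."""
    return [None if i == 0 or series[i - 1] == 0
            else (series[i] - series[i - 1]) / abs(series[i - 1])
            for i in range(len(series))]

def acceleration(growth):
    """Change in growth rate between consecutive years."""
    return [None if i == 0 or growth[i] is None or growth[i - 1] is None
            else growth[i] - growth[i - 1]
            for i in range(len(growth))]

def sector_relative(value_by_company, sector_of):
    """Each company's value minus its sector median (assumed definition)."""
    by_sector = {}
    for company, value in value_by_company.items():
        by_sector.setdefault(sector_of[company], []).append(value)
    medians = {s: median(v) for s, v in by_sector.items()}
    return {c: v - medians[sector_of[c]] for c, v in value_by_company.items()}

def log_scale(value):
    """Natural log of a positive scale measure such as revenue."""
    return math.log(value)

# Hypothetical annual revenue for one company
revenue = [100.0, 110.0, 132.0]
growth = yoy_change(revenue)            # None, then roughly 0.10 and 0.20
accel = acceleration(growth)            # None, None, then roughly 0.10
avg_2y = rolling(revenue, 2, mean)      # None, 105.0, 121.0
```

Leaving early-history values as None, rather than filling them, keeps the missingness explicit for whatever model consumes the features.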
The prediction target is the forward 1-year change in operating margin, expressed as a decimal fraction. A value of +0.02 means operating margin improved by 2 percentage points (200 basis points) over the following year.
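As a minimal sketch of the target construction, assuming margins are stored as decimals keyed by fiscal year (the function names and sample values are hypothetical):

```python
def forward_margin_change(margin_by_year):
    """Map each year to next year's operating margin minus this year's.

    Years with no following observation are dropped, because the
    forward outcome is unknown.
    """
    return {
        year: margin_by_year[year + 1] - margin
        for year, margin in margin_by_year.items()
        if year + 1 in margin_by_year
    }

def to_bps(change):
    """Convert a decimal margin change to basis points (0.01 = 100 bps)."""
    return change * 10_000

# Hypothetical operating margins for one company
margins = {2021: 0.10, 2022: 0.12, 2023: 0.11}
target = forward_margin_change(margins)
# 2021 maps to roughly +0.02 (about 200 bps); 2023 has no forward value
```

Dropping the final year of each company's history is one reason a usable sample is smaller than the raw company-year count.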
We chose XGBoost (gradient-boosted decision trees) as the model because it handles non-linear relationships and feature interactions naturally. Margin expansion is not a linear phenomenon: a company with 5% margins and strong revenue growth behaves very differently from a company with 30% margins and the same growth rate. Tree-based models capture these conditional patterns without requiring us to specify them in advance.
To guard against overfitting, we use walk-forward cross-validation. The model trains only on data from earlier years and tests on the following year. This mimics the real-world constraint: you can only use information available at the time of the investment decision.
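The splitting logic can be sketched as below. It assumes an expanding training window (every earlier year is used), which is one common reading of "trains only on data from earlier years"; the years shown are placeholders rather than the actual fold boundaries.

```python
def walk_forward_splits(years, min_train_years=1):
    """Yield (train_years, test_year) pairs in chronological order.

    Each fold trains on every year strictly before the test year,
    so no future information reaches the training set.
    """
    ordered = sorted(set(years))
    for i in range(min_train_years, len(ordered)):
        yield ordered[:i], ordered[i]

# Placeholder years, chosen only to illustrate the fold structure
folds = list(walk_forward_splits([2021, 2022, 2023, 2024]))
for train_years, test_year in folds:
    print(f"train on {train_years}, test on {test_year}")
```

Four distinct years yield three folds under this scheme, and each test year is strictly later than every year in its training set.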
For interpretation, we use SHAP (SHapley Additive exPlanations), a method rooted in cooperative game theory that assigns each feature a contribution to each individual prediction. Unlike simple feature importance from tree splits, SHAP values are additive (for each observation, the contributions plus a baseline expected value sum to the model’s prediction), directional (they tell you whether a feature pushed the prediction up or down), and consistent across observations.
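The additivity property can be illustrated with a brute-force Shapley calculation on a toy model. This is a teaching sketch, not the analysis code: `toy_model` is a hypothetical two-feature rule, and a single baseline row stands in for the background distribution that a SHAP library would average over.

```python
from itertools import permutations

def shapley_values(model, x, baseline):
    """Exact Shapley values for one prediction.

    Features not yet "revealed" take their baseline value. Each
    feature's marginal contribution is averaged over all orderings.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(f):
    """Hypothetical rule: low margins revert up, helped by strong growth."""
    prediction = -0.2 * f["profit_margin"]
    if f["revenue_growth"] > 0.10 and f["profit_margin"] < 0.10:
        prediction += 0.02  # interaction: growth only helps low-margin firms
    return prediction

x = {"profit_margin": 0.05, "revenue_growth": 0.20}
baseline = {"profit_margin": 0.15, "revenue_growth": 0.05}
phi = shapley_values(toy_model, x, baseline)

# Additivity: baseline prediction plus contributions equals the prediction
reconstructed = toy_model(baseline) + sum(phi.values())
```

Here the interaction bonus is split between the two features, which is how Shapley attribution handles effects that no single feature produces alone. Enumerating every ordering is only feasible for a handful of features; tree-specific algorithms exist to make this tractable at 44 features.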
The baseline: margin expansion varies widely by sector
Before introducing the predictive model, it is worth understanding the raw landscape. Forward margin change is not uniformly distributed across sectors.
| Sector | N | Median (bps) | IQR (bps) | Hit Rate |
|---|---|---|---|---|
| Real Estate | 21 | +86 | -67 / +160 | 71% |
| Technology | 57 | +20 | -35 / +169 | 56% |
| Consumer | 58 | +8 | -25 / +73 | 53% |
| Financials | 39 | +0 | -103 / +188 | 49% |
| Industrials | 95 | -2 | -162 / +128 | 46% |
| Healthcare | 49 | -4 | -148 / +89 | 43% |
| Energy | 34 | -74 | -336 / +56 | 29% |
Hit rate = percentage of company-years with positive margin expansion. IQR = interquartile range (25th to 75th percentile).
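For readers reproducing the table columns, a minimal sketch of the per-sector summary follows. The sample values are invented, and the quartile interpolation method is an assumption, since different conventions give slightly different IQR bounds.

```python
from statistics import median, quantiles

def sector_summary(changes_bps):
    """Median, quartiles, and hit rate for a list of forward margin changes."""
    q1, _, q3 = quantiles(changes_bps, n=4, method="inclusive")
    hit_rate = sum(change > 0 for change in changes_bps) / len(changes_bps)
    return {
        "n": len(changes_bps),
        "median": median(changes_bps),
        "q1": q1,
        "q3": q3,
        "hit_rate": hit_rate,
    }

# Invented forward margin changes, in basis points
summary = sector_summary([-120, -30, 10, 40, 150])
```

Note that a change of exactly zero counts as a miss under this definition, since the hit rate requires strictly positive expansion.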
A few observations stand out. Real Estate has the highest median margin improvement (+86 bps) and a 71% hit rate, though the sample is small. Technology follows at +20 bps with a 56% hit rate. Energy is the most challenging sector for margin expansion, with a median of -74 bps and only 29% of company-years showing improvement. Industrials sits near zero, essentially a coin flip.
These baselines matter for operating partners. If you are deploying a margin expansion playbook in Energy, the structural headwinds are real. In Technology or Real Estate, the tailwind is already there.
Can we predict margin expansion?
Walk-Forward Cross-Validation Results
| Metric | Value | Notes |
|---|---|---|
| Directional Accuracy | 56% | vs. 50% baseline (coin flip) |
| Mean Absolute Error | 4.32% | average prediction error, in percentage points of margin |
| Out-of-Sample R² | -0.136 | negative = worse than predicting the mean; common in noisy financial data |
| Features | 44 | spanning fundamentals, growth, scale, sector |
| CV Folds | 3 | walk-forward (train on past, test on future) |
A word on how to read these results. The out-of-sample R-squared is negative, which means that on pure point-estimate accuracy, the model does worse than simply predicting the mean. This is common in financial prediction tasks where the signal-to-noise ratio is low. Margins are affected by macroeconomic shocks, management changes, and competitive dynamics that no model of observable entry characteristics can fully capture.
The more relevant metric for an operating partner is directional accuracy: does the model correctly predict whether margins go up or down? At 56%, it is modestly better than chance. That is not a trading signal, but it tells us the features do carry information. The value of this exercise is not in the point predictions. It is in the decomposition: which features carry that information?
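A small worked example shows how a model can beat a coin flip on direction while still posting a negative R-squared. The numbers below are invented for illustration, and R-squared is measured against the mean of the test outcomes, which is one of several conventions for out-of-sample R-squared.

```python
def directional_accuracy(actual, predicted):
    """Share of observations where predicted and actual signs agree."""
    hits = sum((a > 0) == (p > 0) for a, p in zip(actual, predicted))
    return hits / len(actual)

def mean_absolute_error(actual, predicted):
    """Average absolute gap between prediction and outcome."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r_squared(actual, predicted):
    """1 minus the ratio of squared error to variance around the mean."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Invented margin changes (decimals): direction mostly right, magnitude too large
actual = [0.01, -0.01, 0.02, -0.02]
predicted = [0.05, -0.05, 0.06, 0.03]

accuracy = directional_accuracy(actual, predicted)   # 3 of 4 signs match
error = mean_absolute_error(actual, predicted)
fit = r_squared(actual, predicted)                   # negative
```

In this toy case the predictions get three of four directions right but overshoot the magnitudes, so squared error exceeds the variance of the outcomes and R-squared falls below zero.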
The lever ranking
This is the core finding. We computed SHAP values for every feature across all 355 observations, then ranked features by their average absolute contribution to the model’s margin expansion prediction.
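The ranking step itself is simple once per-observation SHAP values exist. The sketch below assumes they are available as one dict per observation; the values are invented and the function name is hypothetical.

```python
def rank_by_mean_abs_shap(shap_rows):
    """Rank features by average absolute SHAP value across observations.

    shap_rows: list of dicts mapping feature name -> SHAP value,
    one dict per observation.
    """
    features = shap_rows[0].keys()
    importance = {
        feature: sum(abs(row[feature]) for row in shap_rows) / len(shap_rows)
        for feature in features
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Invented SHAP values for two observations
rows = [
    {"profit_margin": 0.03, "revenue_growth": 0.01},
    {"profit_margin": -0.02, "revenue_growth": 0.01},
]
ranking = rank_by_mean_abs_shap(rows)
```

Taking absolute values before averaging matters: in this example profit margin pushes predictions in both directions, so its signed average would understate its influence and rank it below revenue growth.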
The three strongest predictors of forward margin expansion are:
Profit margin (current level). This is the single most informative feature. Companies with lower current profitability have more room to expand, and the model picks up on this mean-reversion tendency. Conversely, companies already operating at high margins face a gravitational pull downward.
Trailing P/E ratio. Valuation encodes the market’s expectations. A low P/E relative to the company’s other characteristics suggests the market is skeptical about earnings improvement, and when that improvement materializes, margins tend to be the mechanism. High-P/E companies, where improvement is already priced in, show less forward margin expansion.
Revenue growth. Top-line momentum matters. Companies growing revenue faster tend to see operating leverage kick in: fixed costs get spread across a larger revenue base, and pricing power often accompanies demand strength. This is the classic “growth-driven margin expansion” that PE operators target.
After these three, the ranking includes EV/Revenue (a cleaner valuation multiple for comparing across capital structures), beta (market sensitivity, which proxies for cyclicality and risk), and operating margin (the starting point for the metric we are trying to predict).
The beeswarm plot adds directionality. Look at Profit Margin (top row): blue dots (low profit margin) cluster to the right (positive SHAP, pushing toward margin expansion), while red dots (high profit margin) cluster to the left. This is the mean-reversion story: low-margin companies expand, high-margin companies compress.
For Revenue Growth, the pattern is the opposite: red dots (high revenue growth) push right. Growth begets margin expansion.
It depends on the sector
The global lever ranking is a useful starting point, but it obscures real differences across industries. We retrained the XGBoost model separately within each sector that has at least 30 observations (which excludes Real Estate) and recomputed SHAP values. The heatmap below shows the top features for each sector, with darker cells indicating stronger predictive importance.
Top 3 Predictors by Sector
- Consumer: Operating Margin, EV/EBITDA vs Sector, Market Cap
- Energy: Operating Margin, Profit Margin, Log Revenue
- Financials: Trailing P/E, EV/EBITDA, Market Cap
- Healthcare: Trailing P/E, Operating Margin, EV/EBITDA
- Industrials: Operating Margin, Profit Margin, Trailing P/E
- Technology: EV/Revenue, Log Revenue, Beta
Several sector-level patterns are worth highlighting:
Industrials is dominated by current operating margin. This aligns with the classic PE playbook for industrial portfolio companies: the starting margin level tells you the most about where margins are headed. Companies with compressed margins in Industrials tend to revert upward, often through procurement rationalization and overhead reduction.
Technology is a different story. Here, EV/Revenue and company scale (log revenue) are the top predictors. This makes sense: in tech, margin expansion is less about cost cutting and more about revenue scale. As software and platform businesses grow past fixed-cost thresholds, margins expand naturally. Valuation relative to revenue captures whether that scale premium is already priced in.
Energy is driven by current operating margin and profit margin, reflecting the cyclical nature of the sector. Margin expansion in energy is heavily influenced by commodity prices and cost discipline, both of which are captured in current profitability levels.
Financials shows trailing P/E and beta as the dominant predictors. For financial institutions, margin expansion is closely tied to rate environments and risk positioning. Companies with higher beta (more market sensitivity) and lower P/E (lower expectations) show the strongest forward margin improvement.
What this means for operating partners
The findings point to a few practical takeaways for PE value creation planning.
First, know your sector’s primary lever. The data consistently shows that the margin expansion playbook is not universal. An operating partner going into an industrial business should focus diagnostic effort on understanding the current margin structure and where the reversion opportunity sits. In a technology business, the priority shifts to revenue trajectory and scale economics. Running the same playbook across sectors leaves value on the table.
Second, current profitability is the strongest baseline signal. Across the full panel, low current margins are the single best predictor of forward margin improvement. This is partly mechanical (mean reversion), but it also reflects a real opportunity: underearning companies have more operational improvement potential. The implication for deal screening is straightforward: companies with margins well below their sector median deserve a closer look.
Third, revenue growth is the most underrated margin lever. It is tempting to frame PE value creation as a cost story. The data suggests that top-line momentum is at least as important. Companies growing revenue tend to see operating leverage work in their favor. The 100-day plan should allocate resources to growth initiatives (pricing, market expansion, product velocity) alongside the traditional cost-out efforts.
Fourth, valuation metrics carry information about margin trajectory. Low P/E and low EV/Revenue are not just “cheapness” indicators. They encode the market’s skepticism about future earnings improvement. When the model uses these features to predict margin expansion, it is finding companies where the gap between current perception and future reality is widest.
Limitations
This analysis uses public company data. The companies in our panel are materially larger than typical PE targets, which means the specific magnitudes may not transfer directly to mid-market deals. The structural relationships (which levers matter within which sectors) are more likely to generalize than the point estimates.
The sample of 355 usable observations, while adequate for the global model, gets thin at the sector level. Sector-specific SHAP rankings should be read as directional indicators, not precise measurements. We excluded sectors with fewer than 30 observations from the sector-level analysis.
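Applied to the observation counts in the sector baseline table, the 30-observation threshold can be sketched as follows (the function and constant names are hypothetical):

```python
MIN_OBSERVATIONS = 30  # threshold stated for the sector-level analysis

# Observation counts taken from the sector baseline table
sector_counts = {
    "Real Estate": 21,
    "Technology": 57,
    "Consumer": 58,
    "Financials": 39,
    "Industrials": 95,
    "Healthcare": 49,
    "Energy": 34,
}

def eligible_sectors(counts, min_obs=MIN_OBSERVATIONS):
    """Sectors with enough observations for a sector-specific model."""
    return sorted(sector for sector, n in counts.items() if n >= min_obs)

def excluded_sectors(counts, min_obs=MIN_OBSERVATIONS):
    """Sectors dropped from the sector-level analysis."""
    return sorted(sector for sector, n in counts.items() if n < min_obs)

kept = eligible_sectors(sector_counts)
dropped = excluded_sectors(sector_counts)
```

With these counts, only Real Estate falls below the threshold, which is why it does not appear in the sector-level predictor list. Energy and Financials clear the bar narrowly, so their rankings warrant the most caution.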
Walk-forward cross-validation prevents lookahead bias, but the negative out-of-sample R-squared is an honest acknowledgment that margin expansion is hard to predict from observable entry characteristics alone. The value of this exercise is in identifying which characteristics carry the most signal, not in building a prediction tool.
Finally, these are associations, not causal estimates. We observe that low current margins predict future improvement, but the model cannot tell us whether that improvement happens because of operational changes, competitive dynamics, or macro tailwinds. The operating partner still needs judgment about why the margin opportunity exists and how to capture it. The data tells you where to look.
For complete documentation of data sources, feature engineering, model specifications, and scoring methodology, see our Assumptions & Methodology page.