MLB Payroll & Performance Analysis

K-Means Clustering

Five Distinct Spending Profiles

Teams were clustered using payroll share and WAR share across 12 positional groups. Each cluster reflects a distinct organizational philosophy toward roster construction.
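The clustering step can be sketched roughly as follows. This is a minimal illustration, not the code behind the analysis: it uses three hypothetical positional groups instead of 12, made-up team totals, and a plain Lloyd's-algorithm k-means written with the standard library. The function names (`shares`, `k_means`) are assumptions introduced here.

```python
import random

def shares(values):
    """Convert raw per-position totals into fractions of the team total."""
    total = sum(values)
    return [v / total for v in values]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical teams: (payroll in $M by group, WAR by group).
teams = {
    "Team A": ([60, 20, 20], [15, 8, 7]),
    "Team B": ([58, 22, 20], [14, 9, 7]),
    "Team C": ([20, 40, 40], [6, 12, 12]),
    "Team D": ([22, 38, 40], [7, 11, 12]),
}

# Feature vector = payroll shares followed by WAR shares.
features = [shares(pay) + shares(war) for pay, war in teams.values()]
labels = k_means(features, k=2)
for name, label in zip(teams, labels):
    print(name, "-> cluster", label)
```

Using shares rather than raw dollars keeps a high-payroll and a low-payroll team in the same cluster when they divide their budgets the same way.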

Cluster 1

Rotation Horses, DH-Heavy

BAL · BOS · DET · MIN · NYY · SEA · TEX · TOR · WSH

$135.17M Avg Payroll

32.4 Avg WAR

0.496 Avg Win %

Cluster 2

Sporadic Offensive Spending

COL · LAA

$129.12M Avg Payroll

26.1 Avg WAR

0.454 Avg Win %

Cluster 3

Balanced Lineups, Bullpen Emphasis

ARI · CHC · CLE · HOU · LAD · MIA · MIL · NYM · OAK · SD · TB

$115.02M Avg Payroll

34.9 Avg WAR

0.524 Avg Win %

Cluster 4

Relief-Dependent Outlier

PIT

$73.12M Avg Payroll

21.8 Avg WAR

0.450 Avg Win %

Cluster 5

Up the Middle Value Finders

ATL · CHW · CIN · KC · PHI · SF · STL

$124.56M Avg Payroll

29.9 Avg WAR

0.486 Avg Win %

What makes these groups unique?

Examine the key difference between teams in clusters 1 vs 3 here.

The Pirates are in a league of their own.
The Angels and Rockies spend sporadically on offense, which is not conducive to a winning strategy.

Year-by-Year Breakdown

Cluster Assignments by Season

Track how each franchise shifted spending philosophy across the 2015–2024 sample. Scroll horizontally to view all seasons.

MLB Team Cluster Assignments (2015–2024) Interactive

Sticking to the same spending strategy does not make a team more successful. A two-sample t-test found no significant difference in mean winning percentage between teams that keep a fairly constant spending strategy and teams that switch strategies frequently. Teams that used one spending strategy for 7 or more years of the sample were labeled "Stable Strategists", while all other teams were labeled "Fluid Movers". Find more on the statistical test here.

Payroll is Still King Here

Change in Ownership Team Strategic Shifts

NYM

New York Mets

$133.41M Avg payroll before sale

0.506 Avg win % before sale

$212.02M Avg payroll after sale

0.523 Avg win % after sale

2020 New Owner

Overview

Steve Cohen has been the poster boy for altering team strategy with egregious spending.

MIA

Miami Marlins

$79.46M Avg payroll before sale

0.464 Avg win % before sale

$64.88M Avg payroll after sale

0.428 Avg win % after sale

2017 New Owner

Overview

Ownership has failed to commit to a spending strategy, resulting in poor performance.

SEA

Seattle Mariners

$132.23M Avg payroll before sale

0.469 Avg win % before sale

$117.51M Avg payroll after sale

0.510 Avg win % after sale

2016 New Owner

Overview

The Mariners are spending less, and it's working. They are bucking the trend by focusing on youth.

KC

Kansas City Royals

$129.22M Avg payroll before sale

0.485 Avg win % before sale

$87.68M Avg payroll after sale

0.434 Avg win % after sale

2019 New Owner

Overview

Ownership is operating cohesively with team strategy in mind, increasing spending steadily.

Delta between Payroll % & WAR %

Cluster Spending Efficiency

Interactive chart shows which clusters tend to find value or overpay among positions.

Proportion of WAR vs Proportion of Payroll Represented Interactive
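The delta plotted here can be computed roughly as below. This is a sketch under stated assumptions: the sign convention (WAR share minus payroll share, so positive means value found and negative means overpay), the function name `share_delta`, and the three-group sample numbers are all hypothetical and not taken from the analysis.

```python
def share_delta(payroll_by_position, war_by_position):
    """Return WAR share minus payroll share for each positional group."""
    total_payroll = sum(payroll_by_position.values())
    total_war = sum(war_by_position.values())
    return {
        position: war_by_position[position] / total_war
        - payroll_by_position[position] / total_payroll
        for position in payroll_by_position
    }

# Hypothetical cluster totals: payroll in $M and WAR by positional group.
payroll = {"SP": 50.0, "RP": 30.0, "C": 20.0}
war = {"SP": 10.0, "RP": 2.0, "C": 8.0}

deltas = share_delta(payroll, war)
for position, delta in deltas.items():
    # Positive: the group returns more WAR share than it costs in payroll share.
    print(f"{position}: {delta:+.2f}")
```

In this made-up example the bullpen takes 30% of payroll but returns 10% of WAR, the kind of gap the chart would flag as a potential overpay.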

Cluster 1

Rotation Horses, DH-Heavy

While these teams spend the most on starting pitching, they extract good value from other positional groups, resulting in well-built rosters.

Cluster 2

Sporadic Offensive Spending

Random spending across positions has left LAA and COL with no clear vision. Minimal spending on starting pitching has kept them behind, despite extracting value well from those pitchers.

Cluster 3

Balanced Lineups, Bullpen Emphasis

The balanced approach has yielded success, but potential bullpen overpay can hinder these teams from reaching their ceilings.

Cluster 4

Relief-Dependent Outlier

The differences in payroll allocation and WAR composition signify an inability to identify big league talent when handing out contracts.

Cluster 5

Up the Middle Value Finders

An interesting combination of strong spending with a focus on sourcing value up the middle of the field.

Tableau Visualizations

Spending & Performance Dashboards

Interactive dashboards exploring positional payroll allocation, WAR efficiency, and win correlations across all 30 franchises.

Positional Group WAR by Win Correlations Tableau

Learn more about which positions most directly impact WAR totals and winning percentage. Filter the chart to see winning percentage by WAR for each position; point size is determined by spend on that position. The results may not be what you expect; read more here.

Positional Return on Investment Tableau

Filter this chart to see which teams returned the highest WAR totals at each position, given the amount of money each team spent there. WAR is the most significant predictor of winning percentage; read more about that conclusion and positional spending here.

WAR per Million ($) Spent Tableau

While WAR per dollar was found to be insignificant in predicting winning percentage, it still adds context about which teams get the most out of their players. The size of each bubble represents a team's WAR per Million value, while the color signifies its average yearly payroll. Houston, Tampa Bay, and Oakland are all very interesting. Oakland has a higher WAR per Million mark than Tampa Bay, with only $6M less in average payroll, yet Oakland's average winning percentage is 0.473 and Tampa Bay's is 0.547. This difference could be attributed to divisional strength over time, nuances of roster construction, or even luck. Regardless, two very similar profiles yielding very different results is quite interesting.

Houston is the best at extracting WAR from its players, with the highest WAR per Million figure at 1.387, and it does so with a top-8 average payroll. The Astros are the best example of spending when needed while also implementing player development systems to create sustained success. This speaks to Houston's ability to understand winning baseball from the front office to the farm system. It is no surprise that they reached the World Series four times over the course of this sample.
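As a small illustration of the metric itself, WAR per Million is WAR divided by payroll expressed in millions of dollars. The sketch below uses hypothetical inputs; the aggregation window behind the 1.387 figure is not stated here, so the example does not attempt to reproduce it.

```python
def war_per_million(total_war, total_payroll_dollars):
    """WAR produced per $1M of payroll."""
    return total_war / (total_payroll_dollars / 1_000_000)

# Hypothetical team: 40 WAR on a $100M payroll.
example = war_per_million(40.0, 100_000_000)
print(f"{example:.3f} WAR per $1M")
```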

Metric Explainer

WAR — Wins Above Replacement

Wins Above Replacement (WAR) is the performance metric used in this analysis to assess the quality of play a team is getting from its players. Essentially, WAR measures how much more impactful a given player is than a "replacement level player" at the same position. A "replacement level player" is a hypothetical player that could be brought up from the minor leagues to play in MLB at any point and have a net zero impact on the outcome of the game. The higher a player's WAR, the more valuable they are.

There is a plethora of statistics to choose from when measuring player output, but WAR is the one that applies to both pitchers and position players. This is crucial, as the cluster analysis relies on pitchers and hitters being compared on the same scale.

Hypothesis Test

Spending Strategy Win Percentage Test

When looking at team cluster assignments by year, two types of teams presented themselves: Stable Strategists and Fluid Movers. Stable Strategists were teams identified as having used the same spending strategy for 7 or more years of the sample. Fluid Movers used any one strategy for 6 years at most. Fanbases often believe their team needs to completely abandon its spending strategy to be successful. However, is it really beneficial for teams to switch strategies frequently?
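The labeling rule can be sketched as follows. One assumption is made explicit: "7 years or more" is read as 7 total seasons in the same cluster, not necessarily consecutive. The function name and the sample season lists are hypothetical.

```python
from collections import Counter

def classify_strategy(cluster_by_season, threshold=7):
    """Label a team by how many seasons it spent in its most common cluster."""
    most_common_count = Counter(cluster_by_season).most_common(1)[0][1]
    if most_common_count >= threshold:
        return "Stable Strategist"
    return "Fluid Mover"

# Hypothetical 10-season cluster histories (2015-2024).
steady_team = [3, 3, 3, 3, 1, 3, 3, 3, 5, 3]   # 8 seasons in cluster 3
shifting_team = [1, 1, 5, 5, 3, 1, 5, 1, 3, 1]  # at most 5 seasons in one cluster

print(classify_strategy(steady_team))
print(classify_strategy(shifting_team))
```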

To answer this question, a t-test between two independent samples was run. The null hypothesis is as follows: H₀: μ_{winpct, stable} = μ_{winpct, fluid}. Mean winning percentages for Stable Strategists and Fluid Movers were calculated to be 0.498 and 0.501, respectively. These means were used in tandem with the calculated variances for each subset to produce a t statistic of 0.124616 and a p-value of 0.9017. Because this p-value is far above the 0.05 significance level, the null hypothesis cannot be rejected. Essentially, there is no significant difference in mean winning percentage between teams that keep consistent spending strategies and teams that switch them frequently.
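A two-sample t statistic of this kind can be sketched as below. Several assumptions apply: the unequal-variance (Welch) form is used because the text does not say whether variances were pooled, the sample values are invented, and the p-value uses a normal approximation from the standard library instead of the exact t distribution, which is only reasonable when the samples are large.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)  # sample variances
    se_squared = va / na + vb / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se_squared)
    df = se_squared ** 2 / (
        (va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1)
    )
    return t, df

def approx_two_sided_p(t):
    """Two-sided p-value using a normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))

# Hypothetical winning percentages for each group.
stable = [0.52, 0.48, 0.50, 0.47, 0.53]
fluid = [0.51, 0.49, 0.50, 0.46, 0.55]

t_stat, dof = welch_t(stable, fluid)
print(f"t = {t_stat:.4f}, df = {dof:.1f}, approx p = {approx_two_sided_p(t_stat):.4f}")
```

Plugging the reported statistic of 0.124616 into `approx_two_sided_p` gives roughly 0.90, in line with the reported p-value of 0.9017.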

Hypothesis Test

Starting Pitching WAR Origination Test

Teams in clusters 1 and 3 generate the highest WAR across positions on average. Both include teams that consistently found themselves in the playoffs throughout the sample. The most intriguing difference to examine is the approach to starting pitching. Teams in cluster 1 tend to focus spending on starting pitching, while teams in cluster 3 usually build out more balanced lineups. To determine whether one strategy was more effective than the other, a two-way factorial test was created. The hypotheses were as follows: H₀: μ_1. = ... = μ_a. (no difference in means across levels of the first factor), H₀: μ_.1 = ... = μ_.b (no difference in means across levels of the second factor), and H₀: there is no interaction between the two factors.

In this case, factor 1 was "Cluster" and factor 2 was "Payroll_Group". The goal of the test is to establish whether average starting pitching WAR differed between teams based on cluster, based on payroll group, and based on the interaction between the two. After running the test, the p-values for Cluster and for the interaction between Cluster and Payroll_Group came back large, so those null hypotheses cannot be rejected. Essentially, there is no significant difference in starting pitching WAR between teams in different clusters. Payroll_Group, however, has a small associated p-value, so its null hypothesis is rejected. This means teams that spend more on starting pitching generate more WAR at the position regardless of cluster. So, while cluster strategy is important, ultimately, money talks.
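For reference, the three null hypotheses map onto the terms of a standard two-way factorial model. The notation below is the usual textbook form and is an assumption; the exact model specification that was fit is not stated here.

```latex
% Y_{ijk}: starting pitching WAR for observation k in cluster i and payroll group j
Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,a,\quad j = 1,\dots,b

% Main effect of Cluster:        H_0 : \alpha_1 = \dots = \alpha_a = 0
% Main effect of Payroll_Group:  H_0 : \beta_1 = \dots = \beta_b = 0
% Interaction:                   H_0 : (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
```

Under this reading, the large p-values apply to the α and (αβ) terms, and the small p-value applies to the β term.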

Regression

Parameter Impacts on Win Percentage

To measure the impact of each parameter on win percentage, a robust multiple linear regression model was created. The robust function was used to improve the accuracy of the standard errors used in the test statistic and p-value calculations. Because each team season appears in the dataset multiple times, once per positional group, this was the appropriate way to ensure a more accurate test. The parameters used were payroll, WAR, WAR per Dollar, and position. Most notably, the WAR parameter came back significant with the lowest p-value, as well as a larger coefficient than payroll and WAR per Dollar. This means WAR was the most significant determinant of winning percentage. This makes sense, as teams in cluster 3 had both the highest average winning percentage and the highest average WAR total.

In addition to WAR being the most significant predictor, WAR per Dollar had a p value of 0.2779, which is very large. This means it does not contribute significantly to winning percentage. In context, the efficiency with which teams extract WAR from players is not nearly as important as the total amount of WAR a team can tally throughout a season.

The position parameter was broken down into each individual positional group's impact on winning percentage in comparison to Starting Pitchers, which was used as the reference level. This type of comparison seeks to add color to the question "how does spending on this position compare to starting pitching in terms of win percentage impact?" Positive coefficients here suggest that, after controlling for WAR and payroll, teams whose rows reflect position player spending win more relative to starting pitcher rows, supporting the idea that balanced roster construction matters. The positional groups with the most positive coefficients were Catchers, Designated Hitters, Left Fielders, and Utility Players. Relief Pitchers had the lowest coefficient, suggesting that additional spending on the bullpen after committing substantial spending to starting pitching is less efficient than allocating those dollars to position players.
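The reference-level comparison relies on dummy coding of the position variable, which can be sketched as below. The position codes and the `dummy_code` helper are hypothetical; the point is only that Starting Pitchers receive no indicator column, so every positional coefficient is read relative to them.

```python
def dummy_code(position, levels, reference="SP"):
    """One 0/1 indicator per non-reference level; the reference row is all zeros."""
    return {
        f"is_{level}": int(position == level)
        for level in levels
        if level != reference
    }

# Hypothetical subset of positional groups.
levels = ["SP", "RP", "C", "DH"]

print(dummy_code("SP", levels))  # reference level: every indicator is 0
print(dummy_code("C", levels))   # only is_C is 1
```

With this coding, the fitted coefficient on `is_C` estimates how a catcher row differs from a starting pitcher row once the other predictors are held fixed.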

Regression

Standardized Positional Coefficients

To determine which positions were most efficient at impacting winning percentage, a regression was run in which total payroll is controlled. This structure explicitly displays which positions have the greatest impact on winning percentage with each marginal increase in WAR. The results of this regression are fascinating. To start, after controlling for WAR (isolating payroll's sole impact on winning percentage), payroll was rendered insignificant. Accumulating WAR is much more crucial than spending excessively in an attempt to buy the best players. Constructing a roster efficiently to extract maximum WAR matters most when trying to win games. This is illustrated perfectly by cluster 3. Teams in cluster 3 spend on average $20M less in payroll each year than teams in cluster 1 and $9.5M less than teams in cluster 5, yet their average winning percentage is 2.8 and 3.8 percentage points higher, respectively.
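The cluster 3 comparison can be checked directly against the cluster summary figures reported earlier on this page. The short sketch below only restates that arithmetic; the dictionary layout and function name are illustrative.

```python
# Average payroll ($M) and average win % taken from the cluster summary cards.
cluster_cards = {
    1: {"avg_payroll_m": 135.17, "avg_win_pct": 0.496},
    3: {"avg_payroll_m": 115.02, "avg_win_pct": 0.524},
    5: {"avg_payroll_m": 124.56, "avg_win_pct": 0.486},
}

def compare(base, other):
    """Return (payroll saved by base in $M, win % gap in percentage points)."""
    payroll_saved = (
        cluster_cards[other]["avg_payroll_m"] - cluster_cards[base]["avg_payroll_m"]
    )
    win_gap_points = (
        cluster_cards[base]["avg_win_pct"] - cluster_cards[other]["avg_win_pct"]
    ) * 100
    return payroll_saved, win_gap_points

for other in (1, 5):
    saved, gap = compare(3, other)
    print(f"Cluster 3 vs {other}: ${saved:.2f}M less payroll, {gap:.1f} points higher win %")
```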

To compare each position's impact on team success, the coefficients were standardized, giving a clearer look at which positional groups are most central to winning. Starting Pitchers had the highest coefficient by far, 0.316. Relief Pitchers followed at 0.219. This is not surprising, as pitchers hold the ball for most of the game, and they make up the largest part of each roster. An increase of one win above replacement from either of these groups is fairly easy to achieve and is the reason for the clear effect on winning percentage. Of the position players, corner positions were rated highly. The three greatest coefficients among hitters were Right Fielders (0.186), Third Basemen (0.156), and First Basemen (0.141). This is interesting, as these positions are not usually deemed "crucial" when thinking about roster construction. However, high offensive output is usually associated with players in these spots, which would mean greater WAR output.

On the other hand, positions usually associated with more defensive importance had fairly low coefficients. Center Fielders (0.108), Shortstops (0.101), and Catchers (0.081) are often the defensive anchors of each team. The low coefficients suggest that defensive play is less impactful on a daily basis than offensive play. Players at these up-the-middle positions are compensated by WAR for playing more difficult positions, and their WAR totals can climb higher thanks to a defensive boost. However, this is not enough to outweigh the offensive impact of players in the corners. Utility players have a higher coefficient (0.149). The model rewards strong defensive play at a variety of positions combined with a consistent offensive presence.
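Standardizing a coefficient is commonly done by rescaling it with the standard deviations of the predictor and the response, so that it reads as "standard deviations of winning percentage per standard deviation of WAR". The sketch below shows that idea on a single hypothetical predictor; it is an assumption that the analysis used this particular form, and the data are invented.

```python
from statistics import mean, stdev

def slope(x, y):
    """Ordinary least squares slope for a single predictor."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum(
        (a - mx) ** 2 for a in x
    )

def standardized_coefficient(raw_coefficient, x, y):
    """Rescale a raw slope to standard-deviation units."""
    return raw_coefficient * stdev(x) / stdev(y)

# Hypothetical positional WAR totals and winning percentages.
war = [8.0, 10.0, 12.0, 15.0, 9.0]
win_pct = [0.470, 0.495, 0.520, 0.560, 0.480]

raw = slope(war, win_pct)
print(f"raw slope: {raw:.4f}, standardized: {standardized_coefficient(raw, war, win_pct):.3f}")

# Rescaling the predictor changes the raw slope but not the standardized one.
war_tenths = [w * 10 for w in war]
raw_tenths = slope(war_tenths, win_pct)
print(f"raw slope (rescaled units): {raw_tenths:.5f}")
```

This unit independence is what makes coefficients for different positional groups comparable on one scale.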

Methodology

Data Collection Process

Individual player WAR totals were recorded in CSV files organized by team, year, and player type (pitchers or hitters). Each CSV contained player Name, Position, WAR Total, and Salary for the given season. WAR data was pulled from Baseball Reference, and salary data from Spotrac. There were 60 CSVs for each season in the sample: one batting CSV and one pitching CSV for each team. A master file for each season was created, merging player data with team win totals, which were aggregated in a "wins" CSV file for each season. Additionally, a master payroll breakdown file spanning all seasons was created by merging data from the CSVs using R. Once each season had a master file, the season files were merged with the overall payroll file, resulting in the final master CSV. This file brings together team wins, payroll by positional group, positional group WAR, and team winning percentage. The overall master file was the basis for visualization creation and analysis.
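The merge described above was done in R; the sketch below shows the same idea in Python with only the standard library, for readers who want to see the shape of the pipeline. The column names, the `Losses` column, the team code, and the inline sample rows are all assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical player-level file (one per team and player type in the real data).
players_csv = """Team,Name,Position,WAR,Salary
AAA,Player One,SP,3.0,10000000
AAA,Player Two,SP,1.5,5000000
AAA,Player Three,C,2.0,4000000
"""

# Hypothetical season "wins" file.
wins_csv = """Team,Wins,Losses
AAA,90,72
"""

def build_master(players_file, wins_file):
    """Aggregate WAR and salary by team and position, then attach win %."""
    wins = {row["Team"]: row for row in csv.DictReader(wins_file)}
    totals = defaultdict(lambda: {"WAR": 0.0, "Salary": 0.0})
    for row in csv.DictReader(players_file):
        key = (row["Team"], row["Position"])
        totals[key]["WAR"] += float(row["WAR"])
        totals[key]["Salary"] += float(row["Salary"])
    master = []
    for (team, position), agg in sorted(totals.items()):
        won = int(wins[team]["Wins"])
        lost = int(wins[team]["Losses"])
        master.append({
            "Team": team,
            "Position": position,
            "WAR": agg["WAR"],
            "Salary": agg["Salary"],
            "WinPct": won / (won + lost),
        })
    return master

master = build_master(io.StringIO(players_csv), io.StringIO(wins_csv))
for row in master:
    print(row)
```

Each output row is one team and positional group, which matches the one-row-per-position structure that the regression section relies on.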
