Spring Training Stats That Matter

As stat geeks, we are quick to tell less nerdy baseball fans that spring training stats mean nothing. Between the tiny sample sizes, the varying level of competition, and the experimenting with new pitches, mechanics and stances, there is a ton of noise clouding the data. Even though those explanations seem obvious, studies have still been performed to determine whether spring stats have any significance. Sure enough, they have confirmed that spring training stats have limited value.

Several years ago, John Dewan of Baseball Info Solutions determined that hitters with more than 100 career at-bats and 36 spring training at-bats whose spring slugging percentage exceeds their career average by 200 points or more will often experience a power spike during the regular season. Although this study received lots of press and the theory became one of the few that people still cite today, it’s flawed. The study used slugging percentage, which includes singles, rather than isolated power, which only credits extra-base hits. If Michael Bourn, he of the .358 career slugging percentage, hit .500 in the spring with a .575 slugging percentage, he would meet Dewan’s power breakout criteria. Of course, that slugging percentage would be composed almost completely of singles, so he should not be expected to enjoy a power surge.
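To make the Bourn example concrete, here is a minimal Python sketch. The function names are hypothetical, the at-bat minimums are left out for brevity, and isolated power is computed with the standard definition of slugging percentage minus batting average.

```python
def isolated_power(slg: float, avg: float) -> float:
    """ISO = SLG - AVG: extra bases per at-bat, with singles stripped out."""
    return slg - avg


def meets_slg_rule(spring_slg: float, career_slg: float,
                   threshold: float = 0.200) -> bool:
    """True when spring SLG beats career SLG by 200 points or more.

    Only the slugging comparison is modeled here; the career and spring
    at-bat minimums from the study are not checked.
    """
    return spring_slg - career_slg >= threshold


# Hypothetical spring line from the example above.
career_slg = 0.358
spring_avg, spring_slg = 0.500, 0.575

flagged = meets_slg_rule(spring_slg, career_slg)
spring_iso = isolated_power(spring_slg, spring_avg)

print(flagged)               # the slugging rule flags him
print(round(spring_iso, 3))  # yet almost none of that slugging is extra bases
```

The slugging rule fires because the gap is 217 points, while the spring isolated power in this made-up line is only .075.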

Other studies have simply looked at surface stats like ERA. However, I am not aware of any that have examined whether peripheral stats in spring training have any significance. When boldly predicting that Francisco Liriano would be a top 10 pitcher this year, I hypothesized that a pitcher’s spring strikeout and walk rates actually do mean something and may foreshadow a breakout or disappointing season. I also guessed that exceptionally strong springs were more significant than poor ones. So I decided to construct a study to test these two hypotheses.

The Study**:

I looked at 749 starting pitchers from 2007-2011 who threw at least 10 innings in the spring and 40 innings during the regular season. The pitchers also had to have a Marcel projection to serve as a control, so we can account for the fact that Clayton Kershaw will likely have both a high spring strikeout rate and a high regular season strikeout rate. I chose to focus on K% and BB% because that eliminates BABIP luck, and changes in these metrics are typically the drivers of a breakout or disappointing season.

The Results:

First, the correlations for K% between the three sets of stats:

             Season K%   Marcel K%   Spring K%
Season K%    1
Marcel K%    0.7211      1
Spring K%    0.4971      0.4489      1

Not surprisingly, the correlation between Marcel and Season is much higher (0.72) than the correlation between Spring and Season (0.50). Since R-squared in a simple one-variable regression is just the square of the correlation, the R-squared for predicting the Season K% using Marcel K% is 0.72^2, or 0.52. What we want to do is figure out if using any piece of a pitcher’s spring K% in conjunction with Marcel can increase that number. Running a multiple regression determined that indeed we can. The equation is:

K% = -2% + 0.90*(Marcel K%) + 0.18*(Spring K%)

The R-squared jumps to 0.56 (about 0.75 correlation) and the p-value was 0.000 (effectively zero after rounding). Success!
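As a sanity check, the R-squared values can be approximately recovered from the K% correlations alone. This is a minimal sketch that assumes an ordinary least squares fit with an intercept, where the two-predictor R-squared follows directly from the three pairwise correlations. The function and variable names are hypothetical, and this is not necessarily how the numbers were originally produced.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """R-squared of an OLS regression of y on two predictors.

    r_y1, r_y2: correlation of each predictor with the outcome.
    r_12: correlation between the two predictors.
    """
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


# Pairwise K% correlations reported in the table.
r_season_marcel = 0.7211
r_season_spring = 0.4971
r_marcel_spring = 0.4489

r2_marcel_only = r_season_marcel ** 2
r2_combined = r_squared_two_predictors(
    r_season_marcel, r_season_spring, r_marcel_spring
)

print(round(r2_marcel_only, 2))  # Marcel alone
print(round(r2_combined, 2))     # Marcel plus spring K%
```

Rounded to two decimals, this gives 0.52 for Marcel alone and 0.56 for the combination, in line with the figures quoted in the text.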

Next are the correlations for BB% between the three sets of stats:

              Season BB%   Marcel BB%   Spring BB%
Season BB%    1
Marcel BB%    0.608        1
Spring BB%    0.3792       0.368        1

Once again, Marcel is much better at predicting seasonal BB% than Spring is. Interestingly, BB% appears more difficult to project than K%, as both Marcel and Spring had lower correlations than in the K% table. The R-squared for predicting the Season BB% using Marcel BB% is 0.61^2, or 0.37. As with K%, the idea now is to determine whether factoring some of a pitcher’s spring BB% into his Marcel projection increases that R-squared. Yes we can! The equation is:

BB% = 0.87*(Marcel BB%) + 0.12*(Spring BB%)

The R-squared improves to 0.40, with another p-value of 0.000. So now we also find that a pitcher’s spring BB% does actually carry some significance.
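To show how the two regression equations would be applied, here is a minimal Python sketch. It assumes the rates are expressed in percentage points (20.0 means 20%), which is how the -2% intercept in the K% equation reads. The function names and the example pitcher are hypothetical.

```python
def project_k_pct(marcel_k: float, spring_k: float) -> float:
    """Blend Marcel and spring K% using the K% equation from the article.

    Inputs and output are in percentage points (e.g. 20.0 for 20%).
    """
    return -2.0 + 0.90 * marcel_k + 0.18 * spring_k


def project_bb_pct(marcel_bb: float, spring_bb: float) -> float:
    """Blend Marcel and spring BB% using the BB% equation from the article."""
    return 0.87 * marcel_bb + 0.12 * spring_bb


# Made-up pitcher: projected for 20% K and 8% BB by Marcel,
# but posting a 30% K rate and 5% BB rate in the spring.
k_projection = project_k_pct(marcel_k=20.0, spring_k=30.0)
bb_projection = project_bb_pct(marcel_bb=8.0, spring_bb=5.0)

print(round(k_projection, 2))
print(round(bb_projection, 2))
```

For this invented pitcher, a spring strikeout rate ten points above his projection moves the K% estimate from 20% to only about 21.4%, which shows how heavily the equations still lean on Marcel.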

Aside from trying to determine whether spring training K% and BB% rates mean anything for the upcoming season, I felt like really strong performances meant more than poor ones. You cannot fluke your way into striking out a high percentage of hitters, but pitchers work on new pitches or their mechanics in the spring all the time, which can easily explain a weak performance.

Unfortunately, when I tested this, both ends of the spectrum carried the same weight. In fact, the poor performances were actually a smidge more significant than the strong performances. In other words, good and bad springs should be treated the same.

Last, I figured I would also test spring ERA one more time to see if we can glean anything from it. As expected, the results proved that it’s all noise.


The Conclusions:

-Spring K% and BB% actually do mean something and may help identify breakout and bust performers for the upcoming season
-Good and bad springs carry the same level of significance and they should therefore be treated equally
-Spring ERA is completely useless

On Wednesday, I will identify which pitchers have posted the largest increases/decreases in their K% during the spring as compared to their projections. Then on Thursday I will do the same for BB%.

**A very special thanks to the amazing Matt Swartz for actually running the numbers, providing me with the results and explaining how to interpret them. You rock, Matt! This also means that any questions about the math or the study construction should be directed to him.

Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year. He produces player projections using his own forecasting system and is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. His projections helped him win the inaugural 2013 Tout Wars mixed draft league. Follow Mike on Twitter @MikePodhorzer and contact him via email.

Chris
12 years ago

I guess this question is for Matt – any goodness of fit tests performed to test the i.i.d. assumption?

Matt Swartz
12 years ago
Reply to  Chris

Are you concerned about normality?

12 years ago
Reply to  Matt Swartz

Less interested in the normality of the errors, and more interested in the assumption that the observational data used is i.i.d. – more specifically, interested in statistics which test that assumption.