Taking Spring Training a Little too Seriously: Hitter Edition
This article looks at the predictive power of last year’s spring training statistics for hitters. It applies lessons from last year to highlight a few movers and shakers so far this spring (based on spring games through 3/13).
A number of past studies have found that spring training statistics matter, but only a little. This makes intuitive sense: players generally have plenty of incentive to try during spring training, but it is rare for anyone to receive over 80 plate appearances or 20 innings pitched, each a tiny proportion of a full season.
To test this finding, I looked at 2023 spring training statistics for hitters using FanGraphs’ shiny new spring training leaderboards. I conducted two tests. First, I looked at how well regressed 2022 regular season statistics predicted 2023 regular season statistics. Second, I considered 2023 spring training statistics together with 2022 regular season statistics (plus a regression amount, typically around 240 PA for wRC+, and less for component statistics) to see whether this improved the accuracy of projecting 2023 regular season statistics. I focused on wRC+, a single comprehensive measure of offensive talent. wRC+ is important for fantasy too, as it’s one big factor in determining how much playing time a hitter receives, although it is not the only factor, of course.
My sample included 511 hitters who had at least one plate appearance in each of the 2022 regular season, 2023 spring training, and the 2023 regular season. I used Baseball-Reference’s OppQual (Opponent Quality) Index to account for the level of competition a hitter faced, applying my own major league equivalencies to each Opponent Quality level, and I adjusted the statistics for the spring training scoring environment. They were not adjusted for particular spring parks, however. Note also that so far this spring training, league average home runs per batted ball event is around 4.2%, slightly down from around 4.5% during 2023 spring training. To evaluate the forecasts, I used root mean square error (RMSE) weighted by 2023 regular season plate appearances, following industry norms. RMSE measures the typical deviation of a projection from the observed result (lower is better).
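For readers who want to try this kind of evaluation themselves, here is a minimal sketch of a plate-appearance-weighted RMSE. The function name `weighted_rmse` and the three-hitter inputs are made up for illustration; they are not the data or the code behind the results in this article.

```python
import math


def weighted_rmse(projected, actual, weights):
    """RMSE in which each hitter's squared error is weighted (here, by PA)."""
    if not (len(projected) == len(actual) == len(weights)):
        raise ValueError("inputs must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sq_error = sum(
        w * (p - a) ** 2 for p, a, w in zip(projected, actual, weights)
    )
    return math.sqrt(weighted_sq_error / total_weight)


# Hypothetical values for three hitters (not from the study).
projected_wrc = [110.0, 95.0, 130.0]  # projected 2023 wRC+
actual_wrc = [100.0, 105.0, 130.0]    # observed 2023 wRC+
pa_2023 = [600, 200, 400]             # 2023 regular season PA (the weights)

print(round(weighted_rmse(projected_wrc, actual_wrc, pa_2023), 2))  # prints 8.16
```

Weighting by plate appearances means a 10-point miss on a 600 PA regular counts three times as much as the same miss on a 200 PA part-timer.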
The RMSE of the 2023 wRC+ forecast based on regressed 2022 wRC+ was 24.32 wRC+ points. Adding in spring training statistics improved the RMSE to 23.95 wRC+ points, a 1.5 percent reduction and no trivial feat in the world of projecting. Accounting for 2023 spring training statistics, together with 2022 statistics and regression, similarly improved 2023 projections for key component statistics, each indexed so that 100 is league average: BB%+ forecast RMSE improved from 26.27 in the base model (2022 statistics only) to 25.78 in the full model (2022 plus spring training statistics); K%+ forecast RMSE improved from 17.73 to 17.22; and ISO+ forecast RMSE improved from 29.34 to 28.69. Further, arbitrarily penalizing the spring training statistics by weighting them less heavily made the projections worse. The optimal forecasting strategy was to weight them in proportion to the number of plate appearances a player received, the same way a forecaster would generally weight regular season statistics.
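To put the four improvements on a common scale, the short sketch below converts each pair of RMSE values quoted above into a percent reduction. The RMSE figures come from this article; the helper name `pct_reduction` and the dictionary layout are just illustrative choices.

```python
def pct_reduction(base_rmse, full_rmse):
    """Percent reduction in RMSE when moving from the base to the full model."""
    return 100.0 * (base_rmse - full_rmse) / base_rmse


# (base model RMSE, full model RMSE) as quoted in the article.
rmse_pairs = {
    "wRC+": (24.32, 23.95),
    "BB%+": (26.27, 25.78),
    "K%+": (17.73, 17.22),
    "ISO+": (29.34, 28.69),
}

reductions = {
    stat: round(pct_reduction(base, full), 1)
    for stat, (base, full) in rmse_pairs.items()
}

for stat, pct in reductions.items():
    print(f"{stat}: {pct}% lower RMSE with spring training included")
# wRC+: 1.5%, BB%+: 1.9%, K%+: 2.9%, ISO+: 2.2%
```

By this measure, the component statistics improved by somewhat more than wRC+ itself, with K%+ showing the largest relative gain.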
Spring training statistics offer predictive value, then, especially when adjusted for opponent quality, but it is important not to overreact. First, although they improve predictive accuracy in general, there are plenty of individual cases where they may lead you astray, for instance when a player is toying with a new pitch or swing, which is quite common during spring. Second, and more importantly, 80 plate appearances offer limited insight at best, regardless of whether they occur during spring or during the regular season. Full-time veterans may have 1,800 plate appearances accounted for in their projections heading into spring training. Against that, 80 PA is not going to move the needle much, and most players receive significantly fewer than 80 PA anyway. For a small handful of players, however, particularly members of the most recent draft class, 80 PA may indeed move the needle.
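The arithmetic behind that point can be sketched with a deliberately simplified blend: a PA-weighted average of prior performance, spring performance, and a league-average regression amount. This is an assumption-laden toy, not my full projection system (it ignores aging, season weighting, and the opponent-quality adjustment, which is assumed to be already baked into the spring wRC+). The function name `blend_projection` and all player values are hypothetical; only the 240 PA regression amount, the 80 PA spring, and the 1,800 PA veteran come from the text above.

```python
def blend_projection(prior_wrc, prior_pa, spring_wrc=100.0, spring_pa=0.0,
                     regression_pa=240.0, league_wrc=100.0):
    """PA-weighted blend of prior stats, spring stats, and league average."""
    total_pa = prior_pa + spring_pa + regression_pa
    weighted_sum = (prior_wrc * prior_pa
                    + spring_wrc * spring_pa
                    + league_wrc * regression_pa)
    return weighted_sum / total_pa


def spring_shift(prior_wrc, prior_pa, spring_wrc, spring_pa):
    """Change in projected wRC+ caused by adding the spring line."""
    before = blend_projection(prior_wrc, prior_pa)
    after = blend_projection(prior_wrc, prior_pa, spring_wrc, spring_pa)
    return after - before


# Hypothetical veteran: 120 wRC+ over 1,800 prior PA, then a 150 wRC+ spring.
veteran_shift = spring_shift(120.0, 1800.0, 150.0, 80.0)

# Hypothetical recent draftee: same rates, but only 100 prior PA on record.
draftee_shift = spring_shift(120.0, 100.0, 150.0, 80.0)

print(f"veteran: {veteran_shift:+.1f} wRC+")
print(f"draftee: {draftee_shift:+.1f} wRC+")
```

Under these assumptions, the same hot 80 PA spring moves the veteran by roughly one point and the draftee by several, simply because spring makes up a much larger share of the draftee's record.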
In any case, you may be wondering who has improved their wRC+ projection the most so far this spring. To answer this, I followed the same approach I did for the 2023 tests, adjusting spring training statistics to account for opponent quality and league environment. I first ran a MARCEL-style traditional 2024 projection, leveraging my own regression amounts, aging curves, and major league equivalencies to account for each player’s past performances across the minor and major leagues (note that, deviating from MARCEL, I do not project all rookies the same). Then, I added spring training statistics into the traditional projection to see the biggest movers and shakers. Nobody has moved their wRC+ projection by more than 10 points yet in 2024, and plus or minus 10 can be thought of as a soft bound on how far a projection will have moved by the time spring training is done.
In perhaps the least surprising twist in history, Wyatt Langford has improved his projection the most so far this spring, increasing it by 7 wRC+ points. Generally, this section will just show projected wRC+ change, so readers can roughly “apply” the changes to projection systems of their choice, but I will make an exception for Langford: I now have his 2024 projection at a 123 wRC+, slightly higher than where Steamer has him. He is exactly the sort of player whose projection can improve quickly given his lack of professional data before this spring: spring comprises a larger portion of his overall professional body of work than it does for veteran players.
Other notable names in the top 10 improvers are Chase DeLauter, with a gain of 6 wRC+, and Colton Cowser and Oneil Cruz, with gains of 4 each. On the flip side, Enrique Bradfield’s wRC+ projection has fallen by 5, and Mike Trout and Jarred Kelenic have seen their wRC+ projections fall by 3 each (apparently Alex Anthopoulos has noticed this as well). With a 39 K% and a 77 wRC+ across 33 PA, Trout hasn’t been good against roughly Double-A quality competition (according to Baseball-Reference), but his projection falls only very slightly given his large prior body of work. The slight dip is a testament to the limits of how much spring training can tell you. You should not really be worried about Trout, who remains one of baseball’s best bats, even if his 2024 wRC+ projection is 141 instead of 144 (141 is where I have him after accounting for his spring so far). Even Javier Báez, who has had a terrifying spring with 10 strikeouts across 22 PA, only sees his wRC+ projection drop 2 points (you can definitely worry about him, though, even without giving weight to his rough spring).
In the past, I have not taken spring training seriously, aside from the occasional velocity boost or new pitch reveal. I’ll still pay it little attention given how little it affects the big picture for the vast majority of players. However, moving forward, I will allow myself to keep checking box scores vigorously during March, to give current happenings just a tiny bit of weight, and to experience fleeting bursts of despair and joy. You might be wondering who the biggest spring gainers were last year, following this same approach: that’d be Corbin Carroll…and Mike Brosseau. Baseball keeps you humble.
Very cool stuff, love seeing the impact on RMSE. As you said, not nothing. I’ve certainly been guilty of similar thinking about spring stats and looking mostly at new pitches and velo bumps. Since I didn’t see a full leaderboard, can I request what Volpe looks like after his spring stats show he isn’t doomed to carry a low BABIP forever?
Thank you! Volpe is basically unchanged: he’s been above average, but closer to average once you account for the fact that his OppQual has been about Triple-A level per BR. I do agree that he’s due for some positive BABIP regression!
Thanks for the response, much appreciated. Keep it up with the great content!
Yup Excellent research. Thanks.