Adding Complexity Doesn’t Make Spring Stats More Predictive

Alternate Titles
What Do They Call Doing The Same Thing Over And Over Again And Expecting Different Results?
How Blake Spent Four Hours On A Monday And A Poll If He Should Smash Computer
Do Spring Stats Matter? The Answer May Surprise You
Five Reasons Your Dog Can Identify A Power Breakout
No Signal In Spring Power Noise

Back around 2005 or 2006, John Dewan, founder of STATS Inc. and co-founder of Baseball Info Solutions, made a very fantasy-relevant discovery: He could predict power breakouts with a 60 percent success rate based on spring training statistics.

His methodology was easy to understand: find a player whose spring slugging percentage is 200 points higher than his career mark (minimum 200 career plate appearances and 40 spring plate appearances). It also gave those drafting late a quick way to identify breakout candidates.
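As a rough illustration, the rule as described reduces to a three-part filter. The sketch below is one reading of it, not Dewan's own code: the `Hitter` fields, the function name, and the choice to treat "200 points higher" as at least .200 are all assumptions, and both sample hitters are made up.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    """Hypothetical record; field names are illustrative only."""
    name: str
    career_pa: int
    career_slg: float
    spring_pa: int
    spring_slg: float


def meets_dewan_rule(h: Hitter, min_jump: float = 0.200,
                     min_career_pa: int = 200, min_spring_pa: int = 40) -> bool:
    """True if spring slugging beats the career mark by at least min_jump.

    Assumption: "200 points higher" is read as >= .200. Rounding to three
    decimals keeps floating-point noise from rejecting a borderline case.
    """
    jump = round(h.spring_slg - h.career_slg, 3)
    return (h.career_pa >= min_career_pa
            and h.spring_pa >= min_spring_pa
            and jump >= min_jump)


# Made-up hitters, purely for illustration.
hot_spring = Hitter("Hitter A", career_pa=1200, career_slg=0.420,
                    spring_pa=55, spring_slg=0.650)
too_few_pa = Hitter("Hitter B", career_pa=150, career_slg=0.380,
                    spring_pa=60, spring_slg=0.700)

print(meets_dewan_rule(hot_spring))   # qualifies: .230 jump, enough PA
print(meets_dewan_rule(too_few_pa))   # fails the career PA minimum
```

The thresholds are keyword arguments so the same filter can be rerun with a different jump size or sample minimum.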

Unfortunately, it doesn’t really work.

In the spring of 2013, Ben Lindbergh and Jon Shepherd of Baseball Prospectus tested the so-called Dewan Rule and more or less put it to rest:

Even after adjusting for league slugging percentage (which, again, the Dewan Rule doesn’t specify as a necessary step), the results revealed nothing of use. Of the 218 Dewan Rule batters, 112 improved (51.4 percent). Before their hot springs, the group as a whole slugged 24 points higher than the league. In the seasons after their hot springs, the group slugged 22 points higher than the league.

Basically, the Dewan Rule barely broke even.

Shortly before that article was written, I had taken my own dive into the Dewan Rule over at Beyond the Box Score, hoping to add greater complexity, a possibly necessary trade-off for improved efficacy. My idea was fairly simple: since Baseball Reference now provides a “quality of opposition” indicator for spring statistics, perhaps I could weed out players whose slugging surges came from feasting on inferior competition. While the QOO metric treats a Wade Davis the same as a Clayton Kershaw, I thought it might provide some additional value since it does separate a Clayton Kershaw from a Jack Leathersich from a Jimmie Sherfy.

What I did last year was find batters who met the Dewan criteria (a 200-point slugging jump in spring), who also showed that 200-point slugging bump over their prior year slugging (to account for players who had already improved but not for a long enough time to move their career slugging number), and had an average quality of opposition of at least 9.0 (which works out to 50 percent Triple-A pitching, 50 percent MLB pitching, a Quad-A level of sorts).
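To make the 9.0 cutoff concrete, here is a minimal sketch. It assumes the opposition-quality scale assigns 10 to a major-league pitcher and 8 to a Triple-A pitcher, which is what the 50/50 description above implies; the function and parameter names are hypothetical. The Jason Castro inputs are his spring 2013 figures from the table that follows, and the 8.5 rating is invented to show a rejection.

```python
from statistics import mean

# Assumed scale values: 10 = MLB pitcher, 8 = Triple-A pitcher.
MLB, TRIPLE_A = 10, 8


def average_opp_quality(ratings):
    """Mean opposition-quality rating over the pitchers a hitter faced."""
    return mean(ratings)


def qual_adjusted_dewan(spring_slg, career_slg, prior_slg, opp_qual,
                        min_jump=0.200, min_opp_qual=9.0):
    """Spring slugging must clear both baselines by min_jump, against
    opposition averaging at least min_opp_qual."""
    return (round(spring_slg - career_slg, 3) >= min_jump
            and round(spring_slg - prior_slg, 3) >= min_jump
            and opp_qual >= min_opp_qual)


# Half MLB arms, half Triple-A arms averages out to the 9.0 cutoff.
print(average_opp_quality([MLB] * 20 + [TRIPLE_A] * 20))

# Jason Castro, spring 2013: .829 spring, .352 career, .401 in 2012, 9.2 OppQual.
print(qual_adjusted_dewan(0.829, 0.352, 0.401, 9.2))
# The same slugging line against a hypothetical 8.5 OppQual is filtered out.
print(qual_adjusted_dewan(0.829, 0.352, 0.401, 8.5))
```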

That list had 20 players. Unfortunately, only 11 saw gains while eight took a step back (one didn’t qualify), and the group as a whole improved by just .020 of slugging over its 2012 numbers and .015 over its career numbers.

| Player | 2012 Slg | Career Slg | Spring Slg | OppQual | 2013 Slg | 2012-2013 Slg Gain | Career-2013 Slg Gain |
|---|---|---|---|---|---|---|---|
| Hank Conger | 0.167 | 0.330 | 0.694 | 9.0 | 0.403 | 0.236 | 0.073 |
| Don Kelly | 0.248 | 0.344 | 0.696 | 9.0 | 0.343 | 0.095 | -0.001 |
| Jason Castro | 0.401 | 0.352 | 0.829 | 9.2 | 0.485 | 0.084 | 0.133 |
| Franklin Gutierrez | 0.420 | 0.384 | 0.725 | 9.2 | 0.503 | 0.083 | 0.119 |
| Brandon Belt | 0.421 | 0.418 | 0.901 | 9.3 | 0.481 | 0.060 | 0.063 |
| Derek Norris | 0.349 | 0.349 | 0.838 | 9.0 | 0.409 | 0.060 | 0.060 |
| Justin Smoak | 0.364 | 0.377 | 0.782 | 9.2 | 0.412 | 0.048 | 0.035 |
| Raul Ibanez | 0.453 | 0.470 | 0.673 | 9.1 | 0.487 | 0.034 | 0.017 |
| Collin Cowgill | 0.317 | 0.311 | 0.565 | 9.1 | 0.349 | 0.032 | 0.038 |
| Brandon Crawford | 0.349 | 0.333 | 0.610 | 9.2 | 0.363 | 0.014 | 0.030 |
| Bryce Harper | 0.477 | 0.477 | 0.734 | 9.0 | 0.486 | 0.009 | 0.009 |
| Craig Gentry | 0.392 | 0.355 | 0.618 | 9.0 | 0.386 | -0.006 | 0.031 |
| Alex Gordon | 0.455 | 0.439 | 0.778 | 9.3 | 0.422 | -0.033 | -0.017 |
| Wilin Rosario | 0.530 | 0.522 | 0.733 | 9.0 | 0.486 | -0.044 | -0.036 |
| Elvis Andrus | 0.378 | 0.353 | 0.600 | 9.0 | 0.331 | -0.047 | -0.022 |
| Mike Moustakas | 0.412 | 0.395 | 0.739 | 9.2 | 0.364 | -0.048 | -0.031 |
| Kevin Youkilis | 0.409 | 0.482 | 0.750 | 9.1 | 0.343 | -0.066 | -0.139 |
| Dexter Fowler | 0.474 | 0.427 | 0.843 | 9.3 | 0.407 | -0.067 | -0.020 |
| Lorenzo Cain | 0.419 | 0.412 | 0.712 | 9.0 | 0.348 | -0.071 | -0.064 |
| AVERAGE | 0.391 | 0.396 | 0.727 | 9.116 | 0.411 | 0.020 | 0.015 |
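The 11-to-8 split and the .020 average gain can be re-derived from the 2012-2013 gain column above. This is only a quick arithmetic check; the variable names are arbitrary.

```python
# 2012-to-2013 slugging changes for the 19 qualifiers, copied from the table above.
gains = [0.236, 0.095, 0.084, 0.083, 0.060, 0.060, 0.048, 0.034, 0.032,
         0.014, 0.009, -0.006, -0.033, -0.044, -0.047, -0.048, -0.066,
         -0.067, -0.071]

improved = sum(1 for g in gains if g > 0)
declined = sum(1 for g in gains if g < 0)
hit_rate = improved / len(gains)
mean_gain = sum(gains) / len(gains)

print(f"{improved} up, {declined} down, hit rate {hit_rate:.1%}")
# 11 up, 8 down, hit rate 57.9%
print(f"mean gain {mean_gain:+.3f}")
# mean gain +0.020
```

A hit rate near 58 percent sounds respectable until it is set next to the average gain, which is about 20 points of slugging.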

When I sat down to write this article, I expected to highlight how much this QOO wrinkle added and give myself the ol’ Barry Horowitz self-pat-on-the-back. That’s because this method identified Jason Castro and Brandon Belt as 2013 breakout candidates, and it was right in both cases – Castro had a slugging jump of .084 and Belt saw a .060 increase. Unfortunately, that was just confirmation bias, and I was only remembering a pair of players it worked for.

Lorenzo Cain? Not even close. Mike Moustakas? Sorry. Elvis Andrus? Ha!

And this is kind of the issue with “breakout identifiers,” because it’s really easy to only remember the ones that worked. Maybe that’s not a bad thing – a sleeper who you correctly identify surely helps more than a sleeper you drop in May hurts – but it doesn’t make the method any more reliable than picking players to improve at random.
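One way to put a number on "at random" is a simple binomial check. The sketch below assumes, purely for illustration, that any given hitter has a 50 percent chance of improving his slugging from one season to the next; that base rate is an assumption, not something measured here, and `prob_at_least` is a made-up helper name.

```python
from math import comb


def prob_at_least(hits: int, n: int, p: float = 0.5) -> float:
    """P(X >= hits) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(hits, n + 1))


# 11 of the 19 qualifying hitters improved their slugging in 2013.
print(round(prob_at_least(11, 19), 3))
# 0.324
```

Under that coin-flip assumption, a list of 19 randomly chosen hitters would produce 11 or more improvers roughly a third of the time, so the 11-to-8 split is not much evidence of anything.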

One thing that occurred to me after trying to use the model last season was that small-sample BABIPs can wreak havoc in spring, so maybe isolated slugging (ISO) was a better breakout identifier than slugging percentage. I went back and re-ran the “Qual-Adjusted Dewan Rule” for the spring of 2013, using an ISO jump of .070 in place of the 200-point slugging jump Dewan had used.
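For reference, ISO is slugging percentage minus batting average, which works out to extra bases per at-bat, so singles drop out entirely. The sketch below shows why that matters in a small sample; both stat lines are invented and the function names are arbitrary.

```python
def slugging(singles, doubles, triples, homers, at_bats):
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


def batting_average(singles, doubles, triples, homers, at_bats):
    """Hits per at-bat."""
    return (singles + doubles + triples + homers) / at_bats


def isolated_slugging(singles, doubles, triples, homers, at_bats):
    """ISO = SLG - AVG, i.e. extra bases per at-bat; singles cancel out."""
    return (slugging(singles, doubles, triples, homers, at_bats)
            - batting_average(singles, doubles, triples, homers, at_bats))


# Two made-up 50-at-bat springs: identical extra-base hits, different singles luck.
lucky = dict(singles=10, doubles=4, triples=0, homers=3, at_bats=50)
unlucky = dict(singles=4, doubles=4, triples=0, homers=3, at_bats=50)

for label, line in (("lucky", lucky), ("unlucky", unlucky)):
    print(label, round(slugging(**line), 3), round(isolated_slugging(**line), 3))
```

Six extra singles falling in are worth 120 points of slugging over 50 at-bats, while ISO is the same .260 for both lines.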

That gave us a pool of 44 players. Of those, 21 saw an increase in ISO in 2013, 22 did not, and one didn’t qualify. On average, players in this group gained just .006 of isolated slugging.

| Player | Age | Tm | OppQual | PA | Spring ISO | Career ISO | Spring-Career | 2012 ISO | Spring-2012 | 2013 ISO | 2013-2012 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Marlon Byrd | 34 | NYM | 9.1 | 56 | 0.212 | 0.135 | 0.077 | 0.035 | 0.177 | 0.220 | 0.185 |
| Franklin Gutierrez | 29 | SEA | 9.2 | 46 | 0.450 | 0.128 | 0.322 | 0.160 | 0.290 | 0.255 | 0.095 |
| Collin Cowgill | 26 | NYM | 9.1 | 70 | 0.259 | 0.056 | 0.203 | 0.048 | 0.211 | 0.138 | 0.090 |
| Nate Schierholtz | 28 | CHC | 9.2 | 69 | 0.242 | 0.139 | 0.103 | 0.150 | 0.092 | 0.219 | 0.069 |
| Jason Castro | 25 | HOU | 9.2 | 45 | 0.488 | 0.117 | 0.371 | 0.144 | 0.344 | 0.209 | 0.065 |
| Don Kelly | 32 | DET | 9.0 | 54 | 0.392 | 0.112 | 0.280 | 0.062 | 0.330 | 0.121 | 0.059 |
| Justin Upton | 24 | ATL | 9.1 | 75 | 0.300 | 0.197 | 0.103 | 0.150 | 0.150 | 0.201 | 0.051 |
| Rick Ankiel | 32 | HOU | 9.0 | 52 | 0.432 | 0.178 | 0.254 | 0.183 | 0.249 | 0.234 | 0.051 |
| Will Venable | 29 | SD | 9.1 | 63 | 0.236 | 0.162 | 0.074 | 0.165 | 0.071 | 0.216 | 0.051 |
| Brandon Belt | 24 | SF | 9.3 | 74 | 0.464 | 0.159 | 0.305 | 0.146 | 0.318 | 0.192 | 0.046 |
| Lucas Duda | 26 | NYM | 9.3 | 63 | 0.322 | 0.171 | 0.151 | 0.150 | 0.172 | 0.192 | 0.042 |
| Josh Donaldson | 26 | OAK | 9.1 | 62 | 0.241 | 0.154 | 0.087 | 0.157 | 0.084 | 0.198 | 0.041 |
| Brett Wallace | 25 | HOU | 9.2 | 58 | 0.264 | 0.127 | 0.137 | 0.171 | 0.093 | 0.210 | 0.039 |
| Raul Ibanez | 40 | SEA | 9.1 | 55 | 0.346 | 0.192 | 0.154 | 0.213 | 0.133 | 0.245 | 0.032 |
| Emilio Bonifacio | 27 | TOR | 9.1 | 67 | 0.172 | 0.076 | 0.096 | 0.058 | 0.114 | 0.088 | 0.030 |
| Justin Smoak | 25 | SEA | 9.2 | 62 | 0.364 | 0.154 | 0.210 | 0.147 | 0.217 | 0.174 | 0.027 |
| Gaby Sanchez | 28 | PIT | 9.1 | 53 | 0.349 | 0.161 | 0.188 | 0.124 | 0.225 | 0.148 | 0.024 |
| Craig Gentry | 28 | TEX | 9.0 | 63 | 0.273 | 0.076 | 0.197 | 0.088 | 0.185 | 0.106 | 0.018 |
| Gerardo Parra | 25 | ARI | 9.0 | 61 | 0.264 | 0.120 | 0.144 | 0.119 | 0.145 | 0.135 | 0.016 |
| Derek Norris | 23 | OAK | 9.0 | 45 | 0.460 | 0.148 | 0.312 | 0.148 | 0.312 | 0.163 | 0.015 |
| Brandon Crawford | 25 | SF | 9.2 | 66 | 0.237 | 0.098 | 0.139 | 0.101 | 0.136 | 0.115 | 0.014 |
| Alfonso Soriano | 36 | CHC | 9.4 | 59 | 0.322 | 0.232 | 0.090 | 0.237 | 0.085 | 0.234 | -0.003 |
| Alex Gordon | 28 | KC | 9.3 | 80 | 0.347 | 0.170 | 0.177 | 0.161 | 0.186 | 0.157 | -0.004 |
| Kevin Frandsen | 30 | PHI | 9.0 | 65 | 0.229 | 0.097 | 0.132 | 0.113 | 0.116 | 0.107 | -0.006 |
| Yoenis Cespedes | 26 | OAK | 9.1 | 62 | 0.315 | 0.213 | 0.102 | 0.213 | 0.102 | 0.202 | -0.011 |
| John Buck | 31 | NYM | 9.2 | 42 | 0.257 | 0.170 | 0.087 | 0.155 | 0.102 | 0.143 | -0.012 |
| Devin Mesoraco | 24 | CIN | 9.1 | 45 | 0.225 | 0.148 | 0.077 | 0.140 | 0.085 | 0.124 | -0.016 |
| Andre Ethier | 30 | LAD | 9.1 | 56 | 0.306 | 0.186 | 0.120 | 0.176 | 0.130 | 0.151 | -0.025 |
| Dayan Viciedo | 23 | CWS | 9.3 | 63 | 0.262 | 0.173 | 0.089 | 0.189 | 0.073 | 0.161 | -0.028 |
| Dexter Fowler | 26 | COL | 9.3 | 55 | 0.431 | 0.156 | 0.275 | 0.174 | 0.257 | 0.144 | -0.030 |
| Elvis Andrus | 23 | TEX | 9.0 | 54 | 0.200 | 0.078 | 0.122 | 0.092 | 0.108 | 0.060 | -0.032 |
| Yonder Alonso | 25 | SD | 9.0 | 66 | 0.286 | 0.130 | 0.156 | 0.120 | 0.166 | 0.087 | -0.033 |
| Elliot Johnson | 28 | KC | 9.0 | 60 | 0.189 | 0.115 | 0.074 | 0.108 | 0.081 | 0.074 | -0.034 |
| Jed Lowrie | 28 | OAK | 9.1 | 57 | 0.270 | 0.167 | 0.103 | 0.194 | 0.076 | 0.156 | -0.038 |
| Mike Moustakas | 23 | KC | 9.2 | 75 | 0.333 | 0.145 | 0.188 | 0.170 | 0.163 | 0.131 | -0.039 |
| Joey Votto | 28 | CIN | 9.0 | 59 | 0.326 | 0.237 | 0.089 | 0.230 | 0.096 | 0.186 | -0.044 |
| Jay Bruce | 25 | CIN | 9.3 | 57 | 0.346 | 0.228 | 0.118 | 0.262 | 0.084 | 0.216 | -0.046 |
| Kevin Youkilis | 33 | NYY | 9.1 | 53 | 0.479 | 0.199 | 0.280 | 0.174 | 0.305 | 0.124 | -0.050 |
| Todd Frazier | 26 | CIN | 9.3 | 57 | 0.327 | 0.221 | 0.106 | 0.225 | 0.102 | 0.173 | -0.052 |
| Lorenzo Cain | 26 | KC | 9.0 | 69 | 0.254 | 0.131 | 0.123 | 0.153 | 0.101 | 0.097 | -0.056 |
| Wilin Rosario | 23 | COL | 9.0 | 49 | 0.355 | 0.260 | 0.095 | 0.260 | 0.095 | 0.194 | -0.066 |
| Josh Reddick | 25 | OAK | 9.2 | 55 | 0.354 | 0.201 | 0.153 | 0.221 | 0.133 | 0.153 | -0.068 |
| Luis Cruz | 28 | LAD | 9.1 | 52 | 0.320 | 0.101 | 0.219 | 0.134 | 0.186 | 0.034 | -0.100 |
| AVERAGE | | | 9.1 | 59.3 | 0.314 | 0.154 | 0.160 | 0.154 | 0.160 | 0.160 | 0.006 |

And once again, there are multiple hits that could have led me to believe there was something here – Marlon Byrd, Castro, Belt, Will Venable, Josh Donaldson and Nate Schierholtz were among the names this version of the breakout predictor would have identified, but it also thought Josh Reddick, Cain, Moustakas, and Wilin Rosario were all in line for big gains that never materialized.

To review, we’ve taken Dewan’s relatively simple model, one that was proven not to work, and tried to make two key improvements – accounting for the quality of opposition, and trying to strip out some BABIP luck by using ISO instead of slugging. And still, there seems to be little in the way of predictive power. If anyone can think of ways to further improve the potential for using spring stats to predict power breakouts, I’m all ears and will gladly try again.

And in the event you still value the “hits” that such a “breakout predictor” may find, here is a list of players who would qualify based on spring stats so far this year (minimum 40 spring plate appearances, 200 career plate appearances, and a .070 gain in isolated slugging over their career and 2013 marks):

| Player | Age | Tm | OppQual | PA | Spring ISO | 2013 ISO | Career ISO | Spring-2013 | Spring-Career |
|---|---|---|---|---|---|---|---|---|---|
| Chris Heisey | 28 | CIN | 9.1 | 46 | 0.543 | 0.178 | 0.179 | 0.365 | 0.364 |
| Brad Miller | 23 | SEA | 9.0 | 54 | 0.500 | 0.153 | 0.153 | 0.347 | 0.347 |
| Jordan Danks | 26 | CWS | 9.1 | 45 | 0.405 | 0.138 | 0.115 | 0.267 | 0.290 |
| Carlos Gomez | 27 | MIL | 9.1 | 45 | 0.342 | 0.222 | 0.151 | 0.120 | 0.191 |
| Jose Bautista | 32 | TOR | 9.0 | 55 | 0.422 | 0.239 | 0.233 | 0.183 | 0.189 |
| Brandon Phillips | 32 | CIN | 9.1 | 48 | 0.334 | 0.135 | 0.158 | 0.199 | 0.176 |
| Conor Gillaspie | 25 | CWS | 9.0 | 46 | 0.296 | 0.145 | 0.140 | 0.151 | 0.156 |
| Nolan Arenado | 22 | COL | 9.2 | 45 | 0.293 | 0.138 | 0.138 | 0.155 | 0.155 |
| Hunter Pence | 30 | SF | 9.2 | 54 | 0.346 | 0.200 | 0.191 | 0.146 | 0.155 |
| Mike Carp | 27 | BOS | 9.1 | 40 | 0.325 | 0.227 | 0.177 | 0.098 | 0.148 |
| Jed Lowrie | 29 | OAK | 9.4 | 47 | 0.309 | 0.156 | 0.163 | 0.153 | 0.146 |
| Junior Lake | 23 | CHC | 9.1 | 43 | 0.275 | 0.144 | 0.144 | 0.131 | 0.131 |
| Anthony Rizzo | 23 | CHC | 9.2 | 41 | 0.290 | 0.186 | 0.174 | 0.104 | 0.116 |
| Mike Aviles | 32 | CLE | 9.0 | 42 | 0.231 | 0.116 | 0.128 | 0.115 | 0.103 |
| Dan Uggla | 33 | ATL | 9.0 | 57 | 0.311 | 0.183 | 0.212 | 0.128 | 0.099 |
| Ian Kinsler | 31 | DET | 9.1 | 56 | 0.271 | 0.136 | 0.181 | 0.135 | 0.090 |
| Skip Schumaker | 33 | CIN | 9.2 | 40 | 0.177 | 0.069 | 0.087 | 0.108 | 0.090 |
| Martin Prado | 29 | ARI | 9.0 | 41 | 0.225 | 0.135 | 0.139 | 0.090 | 0.086 |
| Jimmy Rollins | 34 | PHI | 9.1 | 44 | 0.236 | 0.096 | 0.157 | 0.140 | 0.079 |





Blake Murphy is a freelance sportswriter based out of Toronto. Formerly of the Score, he's the managing editor at Raptors Republic and frequently pops up at Sportsnet, Vice, and around here. Follow him on Twitter @BlakeMurphyODC.

19 Comments
David
11 years ago

Other alternate titles:

“50 plate appearances don’t mean more just because it’s Spring Training”
“Filtering data out of an already small sample, and why that doesn’t help”
“Spring Training as a real world case study of the fallacy of the Primacy Effect”

Jason B
11 years ago
Reply to  David

On a related note and currently hot topic:

“The recency effect: How college basketball coaches ride one good weekend in March to better jobs and untold riches”

ONE SINGLE DAMN WEEKEND! TWO GAMES!

!!

My mind: boggled. Annually.