Adding Complexity Doesn’t Make Spring Stats More Predictive

by Blake Murphy

March 25, 2014

Alternate Titles
What Do They Call Doing The Same Thing Over And Over Again And Expecting Different Results?
How Blake Spent Four Hours On A Monday And A Poll If He Should Smash Computer
Do Spring Stats Matter? The Answer May Surprise You
Five Reasons Your Dog Can Identify A Power Breakout
No Signal In Spring Power Noise

Back around 2005 or 2006, John Dewan, founder of STATS Inc. and co-founder of Baseball Info Solutions, made a very fantasy-relevant discovery: He could predict power breakouts with a 60 percent success rate based on spring training statistics.

His methodology was easy to understand: find a player whose spring slugging percentage is 200 points higher than his career mark (minimum 200 career plate appearances and 40 spring plate appearances). It also made it easy for those drafting late to identify breakout candidates.
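As a rough sketch, the rule as stated reduces to a three-part check. The `SpringLine` record and its field names are hypothetical, and the sample values are made up for illustration. Comparing in whole slugging points is a choice made here to avoid floating-point trouble right at the 200-point threshold.

```python
from dataclasses import dataclass


@dataclass
class SpringLine:
    """Hypothetical record; field names are illustrative, not from any data source."""
    career_slg: float
    spring_slg: float
    career_pa: int
    spring_pa: int


def meets_dewan_rule(p: SpringLine, min_jump_points: int = 200,
                     min_career_pa: int = 200, min_spring_pa: int = 40) -> bool:
    """Spring slugging at least 200 points above career, with both PA minimums met."""
    jump_points = round((p.spring_slg - p.career_slg) * 1000)
    return (p.career_pa >= min_career_pa
            and p.spring_pa >= min_spring_pa
            and jump_points >= min_jump_points)


# Made-up examples: a qualifier, and a hot spring in too few plate appearances.
print(meets_dewan_rule(SpringLine(0.400, 0.650, 1200, 55)))  # True
print(meets_dewan_rule(SpringLine(0.400, 0.650, 1200, 25)))  # False
```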

Unfortunately, it doesn’t really work.

In the spring of 2013, Ben Lindbergh and Jon Shepherd of Baseball Prospectus tested the so-called Dewan Rule and more or less put it to rest:

Even after adjusting for league slugging percentage (which, again, the Dewan Rule doesn’t specify as a necessary step), the results revealed nothing of use. Of the 218 Dewan Rule batters, 112 improved (51.4 percent). Before their hot springs, the group as a whole slugged 24 points higher than the league. In the seasons after their hot springs, the group slugged 22 points higher than the league.

Basically, the Dewan Rule barely broke even.

Shortly before that article was written, I had taken my own dive into the Dewan Rule over at Beyond the Box Score, hoping to add greater complexity, a possibly necessary trade-off for improved efficacy. My idea was fairly simple: since Baseball Reference now provides a “quality of opposition” indicator for spring statistics, perhaps I could weed out players who were experiencing slugging surges due to feasting on inferior competition. While the QOO metric treats a Wade Davis the same as a Clayton Kershaw, I thought it might provide some additional value since it does happen to separate a Clayton Kershaw from a Jack Leathersich from a Jimmie Sherfy.

What I did last year was find batters who met the Dewan criteria (a 200-point slugging jump in spring), who also showed that 200-point bump over their prior-year slugging (to account for players who had already improved but not for long enough to move their career slugging number), and who faced an average quality of opposition of at least 9.0 (which works out to 50 percent Triple-A pitching, 50 percent MLB pitching, a Quad-A level of sorts).
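The 9.0 cutoff and the two added conditions can be sketched as follows. The level scores (10 for MLB, 8 for Triple-A) are an assumption chosen to match the 50/50 description above, and the function names are hypothetical. The Jason Castro figures are the ones reported later in this article; the second call uses a made-up opposition score.

```python
# Assumed level scores, chosen so a 50/50 MLB/Triple-A mix averages 9.0.
LEVEL_SCORE = {"MLB": 10.0, "AAA": 8.0}


def average_opp_quality(pa_by_level: dict) -> float:
    """Plate-appearance-weighted average of the opposing pitchers' level scores."""
    total_pa = sum(pa_by_level.values())
    return sum(LEVEL_SCORE[lvl] * pa for lvl, pa in pa_by_level.items()) / total_pa


def meets_qual_adjusted_rule(spring_slg, career_slg, prior_slg, opp_qual,
                             min_jump_points=200, min_opp_qual=9.0) -> bool:
    """200-point jump over both career and prior-year slugging, with OppQual >= 9.0.

    The plate-appearance minimums from the original rule are omitted for brevity.
    """
    over_career = round((spring_slg - career_slg) * 1000)
    over_prior = round((spring_slg - prior_slg) * 1000)
    return (over_career >= min_jump_points
            and over_prior >= min_jump_points
            and opp_qual >= min_opp_qual)


print(average_opp_quality({"MLB": 25, "AAA": 25}))         # 9.0
# Jason Castro, spring 2013: .829 spring, .352 career, .401 in 2012, OppQual 9.2.
print(meets_qual_adjusted_rule(0.829, 0.352, 0.401, 9.2))  # True
# Same slugging line against weaker (made-up) competition.
print(meets_qual_adjusted_rule(0.829, 0.352, 0.401, 8.4))  # False
```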

That list had 20 players. Unfortunately, only 11 saw gains while eight took a step back (one didn’t qualify), and the group as a whole improved by only .020 in slugging over their 2012 numbers and .015 over their career numbers.

Player	2012 Slg	Career Slg	Spring Slg	OppQual	2013 Slg	2012-2013 Slg Gain	Career-2013 Slg Gain
Hank Conger	0.167	0.33	0.694	9	0.403	0.236	0.073
Don Kelly	0.248	0.344	0.696	9	0.343	0.095	-0.001
Jason Castro	0.401	0.352	0.829	9.2	0.485	0.084	0.133
Franklin Gutierrez	0.42	0.384	0.725	9.2	0.503	0.083	0.119
Brandon Belt	0.421	0.418	0.901	9.3	0.481	0.060	0.063
Derek Norris	0.349	0.349	0.838	9	0.409	0.060	0.06
Justin Smoak	0.364	0.377	0.782	9.2	0.412	0.048	0.035
Raul Ibanez	0.453	0.47	0.673	9.1	0.487	0.034	0.017
Collin Cowgill	0.317	0.311	0.565	9.1	0.349	0.032	0.038
Brandon Crawford	0.349	0.333	0.61	9.2	0.363	0.014	0.03
Bryce Harper	0.477	0.477	0.734	9	0.486	0.009	0.009
Craig Gentry	0.392	0.355	0.618	9	0.386	-0.006	0.031
Alex Gordon	0.455	0.439	0.778	9.3	0.422	-0.033	-0.017
Wilin Rosario	0.53	0.522	0.733	9	0.486	-0.044	-0.036
Elvis Andrus	0.378	0.353	0.6	9	0.331	-0.047	-0.022
Mike Moustakas	0.412	0.395	0.739	9.2	0.364	-0.048	-0.031
Kevin Youkilis	0.409	0.482	0.75	9.1	0.343	-0.066	-0.139
Dexter Fowler	0.474	0.427	0.843	9.3	0.407	-0.067	-0.02
Lorenzo Cain	0.419	0.412	0.712	9	0.348	-0.071	-0.064
AVERAGE	0.391	0.396	0.727	9.116	0.411	0.020	0.015
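The hit count and the AVERAGE row can be re-derived from the 2012-2013 gain column. This is a minimal sketch with the values copied from the table above; the median is an extra summary added here, not something from the original analysis.

```python
from statistics import mean, median

# 2012-2013 slugging gains, copied from the table above (19 qualifying players).
gains = [0.236, 0.095, 0.084, 0.083, 0.060, 0.060, 0.048, 0.034, 0.032, 0.014,
         0.009, -0.006, -0.033, -0.044, -0.047, -0.048, -0.066, -0.067, -0.071]

improved = sum(g > 0 for g in gains)
declined = sum(g < 0 for g in gains)

print(improved, declined)        # 11 8
print(round(mean(gains), 3))     # 0.02
print(round(median(gains), 3))   # 0.014
```

The median gain comes out smaller than the mean because one large jump (Conger's .236) pulls the average up.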

When I sat down to write this article, I expected to highlight how much this QOO wrinkle added and give myself the ol’ Barry Horowitz self-pat-on-the-back. That’s because this method identified Jason Castro and Brandon Belt as 2013 breakout candidates, and it was right in both cases – Castro had a slugging jump of .084 and Belt saw a .060 increase. Unfortunately, that was just confirmation bias, and I was only remembering a pair of players it worked for.

Lorenzo Cain? Not even close. Mike Moustakas? Sorry. Elvis Andrus? Ha!

And this is kind of the issue with “breakout identifiers,” because it’s really easy to only remember the ones that worked. Maybe that’s not a bad thing – a sleeper who you correctly identify surely helps more than a sleeper you drop in May hurts – but it doesn’t make the method any more reliable than picking players to improve at random.

One thing that occurred to me after trying to use the model last season was that small-sample BABIPs can wreak havoc in spring, so maybe isolated slugging (ISO) was a better breakout identifier than slugging percentage. I went back and re-ran the “Qual-Adjusted Dewan Rule” using an ISO jump of .070 in the spring of 2013 in place of the 200-point slugging jump Dewan had used.
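For reference, ISO is slugging percentage minus batting average, so singles cancel out and only extra bases count. The sketch below uses two made-up spring lines with identical extra-base hits that differ only in how many singles fell in; the function name is hypothetical.

```python
def slash_components(ab: int, singles: int, doubles: int, triples: int, hr: int):
    """Return (AVG, SLG, ISO) from counting stats, where ISO = SLG - AVG."""
    hits = singles + doubles + triples + hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = hits / ab
    slg = total_bases / ab
    return avg, slg, slg - avg


# Made-up spring lines: same extra-base hits, but five more singles fall in for B.
avg_a, slg_a, iso_a = slash_components(ab=50, singles=10, doubles=3, triples=0, hr=2)
avg_b, slg_b, iso_b = slash_components(ab=50, singles=15, doubles=3, triples=0, hr=2)

print(f"A: SLG {slg_a:.3f}, ISO {iso_a:.3f}")  # A: SLG 0.480, ISO 0.180
print(f"B: SLG {slg_b:.3f}, ISO {iso_b:.3f}")  # B: SLG 0.580, ISO 0.180
```

Five extra singles move slugging by 100 points while ISO stays put. Doubles and triples are still balls in play, so ISO removes only part of the BABIP noise.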

That gave us a pool of 44 players. Of those, 21 saw an increase in ISO in 2013, 22 did not, and one didn’t qualify. On average, players in this group gained just .006 of isolated slugging.

Player	Age	Tm	OppQual	PA	Spring ISO	Career ISO	Spring-Career	2012 ISO	Spring-2012	2013 ISO	2013-2012
Marlon Byrd	34	NYM	9.1	56	0.212	0.135	0.077	0.035	0.177	0.22	0.185
Franklin Gutierrez	29	SEA	9.2	46	0.45	0.128	0.322	0.16	0.29	0.255	0.095
Collin Cowgill	26	NYM	9.1	70	0.259	0.056	0.203	0.048	0.211	0.138	0.09
Nate Schierholtz	28	CHC	9.2	69	0.242	0.139	0.103	0.15	0.092	0.219	0.069
Jason Castro	25	HOU	9.2	45	0.488	0.117	0.371	0.144	0.344	0.209	0.065
Don Kelly	32	DET	9	54	0.392	0.112	0.28	0.062	0.33	0.121	0.059
Justin Upton	24	ATL	9.1	75	0.3	0.197	0.103	0.15	0.15	0.201	0.051
Rick Ankiel	32	HOU	9	52	0.432	0.178	0.254	0.183	0.249	0.234	0.051
Will Venable	29	SD	9.1	63	0.236	0.162	0.074	0.165	0.071	0.216	0.051
Brandon Belt	24	SF	9.3	74	0.464	0.159	0.305	0.146	0.318	0.192	0.046
Lucas Duda	26	NYM	9.3	63	0.322	0.171	0.151	0.15	0.172	0.192	0.042
Josh Donaldson	26	OAK	9.1	62	0.241	0.154	0.087	0.157	0.084	0.198	0.041
Brett Wallace	25	HOU	9.2	58	0.264	0.127	0.137	0.171	0.093	0.21	0.039
Raul Ibanez	40	SEA	9.1	55	0.346	0.192	0.154	0.213	0.133	0.245	0.032
Emilio Bonifacio	27	TOR	9.1	67	0.172	0.076	0.096	0.058	0.114	0.088	0.03
Justin Smoak	25	SEA	9.2	62	0.364	0.154	0.21	0.147	0.217	0.174	0.027
Gaby Sanchez	28	PIT	9.1	53	0.349	0.161	0.188	0.124	0.225	0.148	0.024
Craig Gentry	28	TEX	9	63	0.273	0.076	0.197	0.088	0.185	0.106	0.018
Gerardo Parra	25	ARI	9	61	0.264	0.12	0.144	0.119	0.145	0.135	0.016
Derek Norris	23	OAK	9	45	0.46	0.148	0.312	0.148	0.312	0.163	0.015
Brandon Crawford	25	SF	9.2	66	0.237	0.098	0.139	0.101	0.136	0.115	0.014
Alfonso Soriano	36	CHC	9.4	59	0.322	0.232	0.09	0.237	0.085	0.234	-0.003
Alex Gordon	28	KC	9.3	80	0.347	0.17	0.177	0.161	0.186	0.157	-0.004
Kevin Frandsen	30	PHI	9	65	0.229	0.097	0.132	0.113	0.116	0.107	-0.006
Yoenis Cespedes	26	OAK	9.1	62	0.315	0.213	0.102	0.213	0.102	0.202	-0.011
John Buck	31	NYM	9.2	42	0.257	0.17	0.087	0.155	0.102	0.143	-0.012
Devin Mesoraco	24	CIN	9.1	45	0.225	0.148	0.077	0.14	0.085	0.124	-0.016
Andre Ethier	30	LAD	9.1	56	0.306	0.186	0.12	0.176	0.13	0.151	-0.025
Dayan Viciedo	23	CWS	9.3	63	0.262	0.173	0.089	0.189	0.073	0.161	-0.028
Dexter Fowler	26	COL	9.3	55	0.431	0.156	0.275	0.174	0.257	0.144	-0.03
Elvis Andrus	23	TEX	9	54	0.2	0.078	0.122	0.092	0.108	0.06	-0.032
Yonder Alonso	25	SD	9	66	0.286	0.13	0.156	0.12	0.166	0.087	-0.033
Elliot Johnson	28	KC	9	60	0.189	0.115	0.074	0.108	0.081	0.074	-0.034
Jed Lowrie	28	OAK	9.1	57	0.27	0.167	0.103	0.194	0.076	0.156	-0.038
Mike Moustakas	23	KC	9.2	75	0.333	0.145	0.188	0.17	0.163	0.131	-0.039
Joey Votto	28	CIN	9	59	0.326	0.237	0.089	0.23	0.096	0.186	-0.044
Jay Bruce	25	CIN	9.3	57	0.346	0.228	0.118	0.262	0.084	0.216	-0.046
Kevin Youkilis	33	NYY	9.1	53	0.479	0.199	0.28	0.174	0.305	0.124	-0.05
Todd Frazier	26	CIN	9.3	57	0.327	0.221	0.106	0.225	0.102	0.173	-0.052
Lorenzo Cain	26	KC	9	69	0.254	0.131	0.123	0.153	0.101	0.097	-0.056
Wilin Rosario	23	COL	9	49	0.355	0.26	0.095	0.26	0.095	0.194	-0.066
Josh Reddick	25	OAK	9.2	55	0.354	0.201	0.153	0.221	0.133	0.153	-0.068
Luis Cruz	28	LAD	9.1	52	0.32	0.101	0.219	0.134	0.186	0.034	-0.1
AVERAGE			9.1	59.3	0.314	0.154	0.16	0.154	0.16	0.16	0.006

And once again, there are multiple hits that could have led me to believe there was something here: Marlon Byrd, Castro, Belt, Will Venable, Josh Donaldson and Nate Schierholtz were among the names this version of the breakout predictor would have identified. But it also thought Josh Reddick, Cain, Moustakas and Wilin Rosario were all in line for big gains that never materialized.

To review, we’ve taken Dewan’s relatively simple model, one that was proven not to work, and tried to make two key improvements – accounting for the quality of opposition, and trying to strip out some BABIP luck by using ISO instead of slugging. And still, there seems to be little in the way of predictive power. If anyone can think of ways to further improve the potential for using spring stats to predict power breakouts, I’m all ears and will gladly try again.
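One way to put a number on "little predictive power" is an exact one-sided sign test against a coin flip. This is a sketch, not part of the original analysis: it assumes each player's improve-or-decline outcome is an independent 50/50 proposition under the null, which is a simplification. The counts are the three reported above, and `p_at_least` is a hypothetical helper name.

```python
from math import comb


def p_at_least(successes: int, n: int, p: float = 0.5) -> float:
    """Exact binomial tail: probability of `successes` or more in n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(successes, n + 1))


# (label, players who improved, players who qualified), as reported in the article.
samples = [
    ("Dewan Rule, Baseball Prospectus test", 112, 218),
    ("Qual-adjusted slugging, 2013", 11, 19),
    ("Qual-adjusted ISO, 2013", 21, 43),
]

for label, improved, n in samples:
    print(f"{label}: {improved}/{n} improved, p = {p_at_least(improved, n):.3f}")
```

For the 19-player group the tail probability works out to about 0.32, and all three samples land above 0.3. Under these assumptions, a coin would produce hit rates like these routinely.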

And in the event you still value the “hits” that such a “breakout predictor” may find, here is a list of players who would qualify based on spring stats so far this year (minimum 40 spring plate appearances, 200 career plate appearances, and a .070 gain in isolated slugging over their career and 2013 marks):

Player	Age	Tm	OppQual	PA	Spring ISO	2013 ISO	Career ISO	Spring-2013	Spring-Career
Chris Heisey	28	CIN	9.1	46	0.543	0.178	0.179	0.365	0.364
Brad Miller	23	SEA	9	54	0.5	0.153	0.153	0.347	0.347
Jordan Danks	26	CWS	9.1	45	0.405	0.138	0.115	0.267	0.29
Carlos Gomez	27	MIL	9.1	45	0.342	0.222	0.151	0.12	0.191
Jose Bautista	32	TOR	9	55	0.422	0.239	0.233	0.183	0.189
Brandon Phillips	32	CIN	9.1	48	0.334	0.135	0.158	0.199	0.176
Conor Gillaspie	25	CWS	9	46	0.296	0.145	0.14	0.151	0.156
Nolan Arenado	22	COL	9.2	45	0.293	0.138	0.138	0.155	0.155
Hunter Pence	30	SF	9.2	54	0.346	0.2	0.191	0.146	0.155
Mike Carp	27	BOS	9.1	40	0.325	0.227	0.177	0.098	0.148
Jed Lowrie	29	OAK	9.4	47	0.309	0.156	0.163	0.153	0.146
Junior Lake	23	CHC	9.1	43	0.275	0.144	0.144	0.131	0.131
Anthony Rizzo	23	CHC	9.2	41	0.29	0.186	0.174	0.104	0.116
Mike Aviles	32	CLE	9	42	0.231	0.116	0.128	0.115	0.103
Dan Uggla	33	ATL	9	57	0.311	0.183	0.212	0.128	0.099
Ian Kinsler	31	DET	9.1	56	0.271	0.136	0.181	0.135	0.09
Skip Schumaker	33	CIN	9.2	40	0.177	0.069	0.087	0.108	0.09
Martin Prado	29	ARI	9	41	0.225	0.135	0.139	0.09	0.086
Jimmy Rollins	34	PHI	9.1	44	0.236	0.096	0.157	0.14	0.079

19 Comments

David

11 years ago

Other alternate titles:

“50 plate appearances don’t mean more just because it’s Spring Training”
“Filtering data out of an already small sample, and why that doesn’t help”
“Spring Training as a real world case study of the fallacy of the Primacy Effect”

Jason B

Reply to David

On a related note and currently hot topic:

“The recency effect: How college basketball coaches ride one good weekend in March to better jobs and untold riches”

ONE SINGLE DAMN WEEKEND! TWO GAMES!

My mind: boggled. Annually.