Using the Stuff Metric as an Injury Identification Tool

Introduction

Before I came to RotoGraphs, I wrote a lot on my own site and in the FanGraphs community section. My first foray into baseball analysis was developing a metric to try to quantify “Stuff”. A New York Times article by John Branch in October 2015 discussed the elusive definition of the pitching term “stuff”. Talk of “plus stuff” and feelings of “all the stuff being there” was scattered throughout the article.

Despite interesting commentary on the ability of pitchers to overpower hitters, there was no true definition of the nastiness of a pitcher’s stuff. My favourite quote from the article is that stuff is “both meaningful and meaningless. There are no synonyms. Like pornography, stuff is defined mostly by example. And only pitchers have stuff. Hitters do not have stuff” (Branch, 2015).

My colleague Daanish Mulla (@danmmulla8 on Twitter) and I put together an analysis of what we viewed as an operational definition of Stuff and submitted it to the FanGraphs community site. After receiving some feedback, we tweaked our equation and came up with the Stuff equation that I used for the 2016 season to evaluate the Toronto Blue Jays pitching staff over at Baseball Prospectus Toronto (see http://toronto.locals.baseballprospectus.com/author/mikesonne/ for the history of “Stuff” reports).

So, without further ado, let’s look at the Stuff Metric.

Stuff to evaluate pitcher performance

The motivation for the Stuff Metric was to try to account for the timing and spatial characteristics of a pitch that a hitter has to deal with. Assuming there are minimal meaningful visual cues that distinguish different pitch types, the Stuff metric looks at the peak velocity (how quickly a pitch gets on you as a hitter), the amount of space a hitter has to cover across the pitcher’s arsenal (the distance between a rising fastball and a dropping curveball), and the change in speed between the fastest and slowest pitches in the pitcher’s repertoire.

After listening to the FanGraphs community chime in on what they viewed as “stuff”, I also incorporated a switch statement, which classified a pitcher’s strategy as either change of speed (Clayton Kershaw’s 74 mph curveball) or high-velocity breaking pitch (Noah Syndergaard’s 91 mph slider). This left us with three components: Peak Velocity (mph), Break Distance, meaning the separation between pitches (inches), and either Change in Velocity (% change) or the velocity of the breaking pitch (mph).

Each of these values was converted to a z-score to allow for comparison between values, and then multiplied by either fastball usage or breaking pitch usage. Adding these values together gives us the “Stuff” metric. Here’s an example, plus the formula for determining Stuff.

Figure 1. Depiction of how “Stuff” is calculated. For more detail, visit http://www.fangraphs.com/community/get-nasty-quantifying-a-pitchers-stuff/.
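
The description above (z-scores, weighted by pitch usage, then summed) can be sketched roughly in code. This is only an illustration under stated assumptions, not the published formula: the function names, the made-up league sample, the choice to weight peak velocity by fastball usage and the other components by breaking pitch usage, and the rule that the switch keeps whichever of the two candidate z-scores is larger are all hypothetical.

```python
from statistics import mean, pstdev

def z_score(value, sample):
    """Standardize a value against a league-wide sample."""
    return (value - mean(sample)) / pstdev(sample)

def stuff_sketch(pitcher, league):
    """Hypothetical Stuff score: a usage-weighted sum of z-scores.

    Assumptions (not confirmed by the article): peak velocity is weighted
    by fastball usage, the other components by breaking pitch usage, and
    the switch keeps the larger of the two candidate z-scores.
    """
    z_velo = z_score(pitcher["peak_velo"], league["peak_velo"])
    z_break = z_score(pitcher["break_dist"], league["break_dist"])
    z_change = z_score(pitcher["change_pct"], league["change_pct"])
    z_brk_velo = z_score(pitcher["brk_velo"], league["brk_velo"])

    # Switch: change-of-speed strategy vs. high-velocity breaking pitch.
    third = max(z_change, z_brk_velo)

    return (z_velo * pitcher["fb_usage"]
            + z_break * pitcher["brk_usage"]
            + third * pitcher["brk_usage"])

# Made-up league sample, for illustration only (not real PITCHf/x data).
LEAGUE = {
    "peak_velo": [90.0, 92.0, 94.0],    # mph
    "break_dist": [16.0, 19.0, 22.0],   # inches
    "change_pct": [10.0, 14.0, 18.0],   # % change, fastest to slowest pitch
    "brk_velo": [78.0, 82.0, 86.0],     # mph
}

power_arm = {"peak_velo": 94.0, "break_dist": 22.0, "change_pct": 14.0,
             "brk_velo": 86.0, "fb_usage": 0.6, "brk_usage": 0.4}
print(round(stuff_sketch(power_arm, LEAGUE), 2))
```

Under these assumptions a pitcher who is exactly league average on every component scores 0, and the score rises as velocity, break distance, or the switch component move above average.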

Performance

Quite frankly, a metric is only good if it can be used to successfully evaluate performance. And, you have to test the metric to make sure it actually can evaluate performance! To test out the Stuff Metric, I examined all starting pitchers from 2008 to 2016, who were qualified for the ERA title in their respective seasons. This resulted in a test sample of 792 pitcher-seasons.

To evaluate the metric, I compared each pitcher’s Stuff against their ERA, FIP, GB%, WAR, K/9, Batting Average Against, and Baseball Prospectus’s new Deserved Run Average statistic (Judge et al., 2015). ***Thank you, “I’m Your Huckleberry” – Adding in SwStr%, Z-Swing %, O-Swing %, O-Contact %, and Z-Contact %.***


For ease of graphing, I converted Stuff to percentiles, and plotted the average (and standard error) of Batting Average Against at each 10% increment of Stuff.
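
A minimal sketch of that binning step, assuming a simple rank-based percentile and ten equal-width bins. The helper names are hypothetical and the exact percentile definition used for the graph is not stated in the article.

```python
from math import sqrt
from statistics import mean, stdev

def percentile_ranks(values):
    """Percentile rank (0-100) of each value within the sample."""
    n = len(values)
    return [100.0 * sum(v <= x for v in values) / n for x in values]

def decile_summary(stuff, baa):
    """Mean and standard error of batting average against per 10% Stuff bin."""
    bins = {}
    for pct, avg in zip(percentile_ranks(stuff), baa):
        # Bin 0 covers (0, 10], bin 9 covers (90, 100].
        idx = min(int((pct - 1e-9) // 10), 9)
        bins.setdefault(idx, []).append(avg)
    summary = {}
    for idx, vals in sorted(bins.items()):
        # Standard error needs at least two observations in the bin.
        se = stdev(vals) / sqrt(len(vals)) if len(vals) > 1 else float("nan")
        summary[idx] = (mean(vals), se)
    return summary
```

Each entry of the returned dictionary holds the bin's mean and standard error, which is what gets plotted against the Stuff percentile.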

Table 1. Pearson correlation (r) and explained variance (r²) between Stuff and various outcome metrics.
Stuff and Outcome Correlations
Outcome metric            Stuff (r)   Stuff (r²)
Batting Average Against   -0.36       0.13
WAR                        0.36       0.13
K/9                        0.50       0.25
ERA                       -0.26       0.07
FIP                       -0.38       0.14
GB%                        0.13       0.02
DRA                       -0.37       0.14
SwStr%                     0.42       0.18
Z-Swing%                   0.14       0.02
O-Swing%                   0.08       0.01
O-Contact%                -0.41       0.17
Z-Contact%                -0.32       0.10

The results were promising, particularly for K/9. There was an r² of 0.25, meaning Stuff accounted for 25% of the variance in strikeouts per 9 innings for qualified starting pitchers. There were moderate relationships between Stuff and FIP, WAR, and Batting Average Against as well, indicating that those with better Stuff had lower FIP, higher WAR, and lower batting averages against. As per an update, better Stuff also had moderate relationships with higher swinging strike rates (r = 0.42) and with lower in-zone and out-of-zone contact rates (r = -0.32 and r = -0.41).
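
For readers who want to reproduce this kind of comparison on their own data, here is a small sketch of the Pearson correlation and the r² conversion. The function names are hypothetical; the r values in the loop are the ones reported in Table 1.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def variance_explained(r):
    """Share of variance accounted for by a linear relationship (r squared)."""
    return r ** 2

# Correlations reported in Table 1.
for metric, r in [("K/9", 0.50), ("FIP", -0.38), ("SwStr%", 0.42)]:
    print(f"{metric}: r = {r:+.2f}, r^2 = {variance_explained(r):.2f}")
```

The sign of r gives the direction of the relationship (lower FIP with better Stuff), while r² ignores direction and only describes how much of the spread is accounted for.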

The Best Stuff in Baseball (Top 30)
Rank  Year  Name               Stuff  Stuff (%)  Peak Velocity (mph)  Change of Speed (%)  Breaking Peak Velocity (mph)  Break Distance (in)
1     2015  Jake Arrieta       2.24   100%       94.6                 15%                  90.3                          22.58
2     2010  Stephen Strasburg  2.23   100%       97.6                 16%                  82.3                          21.78
3     2007  Ubaldo Jimenez     2.19   100%       96.7                 22%                  86.5                          19.45
4     2014  Jake Arrieta       2.08   100%       93.5                 15%                  89.3                          23.03
5     2016  Noah Syndergaard   1.98   100%       97.9                 8%                   90.8                          11.87
6     2014  Collin McHugh      1.97   100%       91.4                 20%                  85.7                          22.65
7     2016  Clayton Kershaw    1.97   100%       93                   22%                  87.8                          22.05
8     2012  James Shields      1.95   100%       92.1                 13%                  89.5                          20.95
9     2015  Clayton Kershaw    1.92   100%       93.6                 21%                  88                            21.52
10    2010  Justin Verlander   1.88   100%       95.5                 17%                  79.6                          22.17
11    2007  Felix Hernandez    1.87   99%        96.3                 14%                  89.1                          17.75
12    2011  Justin Verlander   1.85   99%        95                   16%                  79.4                          22.37
13    2016  Stephen Strasburg  1.85   99%        94.9                 14%                  89.1                          20.31
14    2013  Stephen Strasburg  1.81   99%        95.3                 16%                  79.9                          22.63
15    2012  Justin Verlander   1.78   99%        94.7                 16%                  84.6                          21.52
16    2012  Stephen Strasburg  1.78   99%        95.8                 16%                  80.5                          21.58
17    2015  Noah Syndergaard   1.76   99%        97                   16%                  81.8                          17.17
18    2009  Chris Carpenter    1.74   99%        93.2                 20%                  87.5                          21.87
19    2011  Matt Garza         1.74   99%        93.8                 19%                  85.7                          20.98
20    2014  Joe Kelly          1.73   99%        95.1                 17%                  79.2                          22.7
21    2014  Zack Wheeler       1.73   99%        94.7                 17%                  88.8                          22.26
22    2009  Justin Verlander   1.67   99%        95.6                 16%                  80.1                          20.81
23    2010  David Price        1.67   98%        95.3                 19%                  77.4                          21.78
24    2007  Justin Germano     1.66   98%        88                   22%                  68.6                          22.02
25    2014  Yordano Ventura    1.66   98%        96.6                 14%                  82.8                          18.51
26    2014  Stephen Strasburg  1.66   98%        94.8                 16%                  79.9                          22.36
27    2015  Mike Foltynewicz   1.64   98%        95.1                 20%                  84                            19.95
28    2016  Yordano Ventura    1.63   98%        96.2                 14%                  82.9                          19.46
29    2013  Justin Verlander   1.63   98%        94                   16%                  85.9                          20.37
30    2016  Justin Verlander   1.63   98%        93.6                 16%                  87.9                          21.31
SOURCE: PITCHf/x

Stuff for Injury Evaluation

In the 2016 season, I passionately covered the Toronto Blue Jays for Baseball Prospectus Toronto. Every 2 weeks, I calculated the Jays’ Stuff and examined trends throughout the season. At one point in the season, Marco Estrada saw a steep drop-off in his Stuff. In my article about Stuff that week, I suggested that Marco was showing signs that led me to believe he was injured (Figure 3).

Figure 3. Marco Estrada’s Stuff against time during the 2016 season, taken from http://toronto.locals.baseballprospectus.com/2016/10/03/the-final-regular-season-blue-jays-stuff-report/.

Shortly after, I got a call from Brendan Kennedy, the Blue Jays beat writer for the Toronto Star. We discussed what we thought was happening with Estrada, and he published an article in The Star on September 13th (Kennedy, 2016). The two of us looked like geniuses when, two days later, the news came out that Estrada had been pitching through a herniated disc in his lower back. I wanted to actually test this: if a pitcher sees a drop-off in their Stuff, are they more likely to go on the disabled list?

The first thing I did was examine when Stuff became stable. Looking at just the coefficient of variation of Stuff, variability appeared to come down to a reasonable level once a pitcher had thrown 6 innings at the Major League level. If a pitcher threw at least 6 innings in a 2-week period, their data was included in this analysis. I removed the All-Star week period for this analysis, FYI.
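
As a rough sketch of that inclusion rule, assuming each 2-week period is stored as a dictionary with hypothetical "innings" and "stuff" keys, and taking the coefficient of variation as the sample standard deviation divided by the absolute mean:

```python
from statistics import mean, stdev

MIN_INNINGS = 6.0  # stability threshold described in the text

def coefficient_of_variation(values):
    """Sample standard deviation divided by the absolute mean."""
    return stdev(values) / abs(mean(values))

def eligible_periods(periods):
    """Keep only 2-week periods with at least MIN_INNINGS innings pitched."""
    return [p for p in periods if p["innings"] >= MIN_INNINGS]

# Made-up periods for one pitcher, for illustration only.
periods = [
    {"innings": 12.0, "stuff": 1.0},
    {"innings": 4.1, "stuff": 0.3},   # too few innings, dropped
    {"innings": 7.0, "stuff": 1.2},
    {"innings": 13.2, "stuff": 0.8},
]
kept = eligible_periods(periods)
print(len(kept), round(coefficient_of_variation([p["stuff"] for p in kept]), 2))
```

One caveat with this design: because Stuff is built from z-scores, a pitcher whose average Stuff sits near zero will produce a very large coefficient of variation, so the plain standard deviation may be the safer measure in that case.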

Using Jeff Zimmerman’s compiled DL data from 2016, I examined the proportion of 2-week periods in which a pitcher’s Stuff fell more than 2 standard deviations (SD) below their average, and compared that against whether or not they were hurt in the 2016 season. The results were not statistically significant (p = 0.22) and had a minuscule effect size (d = 0.03).
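
One way to read that test is sketched below, assuming the threshold is two sample standard deviations below the pitcher's own season mean. The function name and the example numbers are hypothetical.

```python
from statistics import mean, stdev

def drop_proportion(stuff_by_period, n_sd=2.0):
    """Share of periods more than n_sd SDs below the pitcher's own mean."""
    mu = mean(stuff_by_period)
    sd = stdev(stuff_by_period)
    threshold = mu - n_sd * sd
    flagged = [s for s in stuff_by_period if s < threshold]
    return len(flagged) / len(stuff_by_period)

# Made-up season: nine steady periods and one sharp drop.
season = [1.0] * 9 + [0.0]
print(drop_proportion(season))
```

With only a dozen or so 2-week periods in a season, a single value can sit at most a few standard deviations from the mean, so very few periods ever get flagged. That may be part of why this version of the test had little power.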

Looking at this from a slightly different perspective, I compared the pitchers who had the most variability in their Stuff across the season against those who had the least. Those who fell below the median for Stuff variability were less likely to appear on the DL than those who were more variable (21.7% of those below the median Stuff variability were on the DL during the season, compared to 33.9% of those above the median variability; p < 0.05, d = 0.34).
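
The median split can be sketched as follows. This assumes variability is measured as the sample standard deviation of a pitcher's 2-week Stuff values and that pitchers sitting exactly at the median go into the higher-variability group; both choices, and the names used, are hypothetical.

```python
from statistics import median, stdev

def dl_rate(group):
    """Fraction of pitchers in the group who appeared on the DL."""
    if not group:
        return float("nan")
    return sum(p["on_dl"] for p in group) / len(group)

def dl_rate_by_variability(pitchers):
    """Split pitchers at the median Stuff SD; return (low, high) DL rates."""
    sds = [stdev(p["stuff"]) for p in pitchers]
    cut = median(sds)
    low = [p for p, sd in zip(pitchers, sds) if sd < cut]
    high = [p for p, sd in zip(pitchers, sds) if sd >= cut]
    return dl_rate(low), dl_rate(high)

# Made-up pitchers, for illustration only.
pitchers = [
    {"stuff": [1.0, 1.1, 1.0], "on_dl": False},
    {"stuff": [1.0, 1.2, 1.0], "on_dl": False},
    {"stuff": [1.0, 2.0, 0.5], "on_dl": True},
    {"stuff": [0.0, 2.0, 1.0], "on_dl": False},
]
print(dl_rate_by_variability(pitchers))
```

The two rates returned here correspond to the 21.7% and 33.9% figures in the text; a significance test and effect size would be computed on top of them.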

Obviously, this needs to be examined over a longer period of time, but the trends in Stuff throughout a season can offer a window into changes in pitching strategy, and possibly, signs that something might not be physically right with a pitcher.

Conclusion

So, that’s the Stuff metric, and how I’ve used it to study both performance and injury risk in MLB pitchers. During the 2017 season, I hope to bring some more insight into injury trends in pitchers using this metric, as well as introduce the new workload metric I’ve created, Fatigue Units. Until next time.

 

References:

Branch, J. (2015). The Mysteries of Pitching, and All That ‘Stuff’. Posted online, October 3, 2015. http://www.nytimes.com/2015/10/04/sports/baseball/the-mysteries-of-pitching-and-all-that-stuff.html

Judge, J., Pavlidis, H., & Turkenkopf, D. (2015). Introducing Deserved Run Average (DRA) And All Its Friends. Published April 29, 2015. http://www.baseballprospectus.com/article.php?articleid=26195

Kennedy, B. (2016). There are Worrying Trends for Blue Jays Starter Marco Estrada. Published September 13, 2016. https://www.thestar.com/sports/bluejays/2016/09/13/there-are-worrying-trends-for-blue-jays-starter-marco-estrada.html

Sarris, E. (2015). The Best Changeups of the Year by Shape and Speed. Posted online, November 9, 2015. http://www.fangraphs.com/blogs/the-best-changeups-of-the-year-by-shape-and-speed/

Sonne, M.W., & Mulla, D. (2015). Get Nasty: Quantifying a Pitcher’s “Stuff”. Published November 14, 2015. http://www.fangraphs.com/community/get-nasty-quantifying-a-pitchers-stuff/

Sonne, M.W., & Mulla, D. (2015). Revisiting the “Stuff” Metric. Published December 21, 2015. http://www.mikesonne.ca/baseball/22/





Ergonomist (CCPE) and Injury Prevention researcher. I like science and baseball - the order depends on the day. Twitter: @DrMikeSonne

I'm Your Huckleberry

I love your stuff metric. Absolutely love it.

I do have one complaint about it, though (in terms of quantifying a pitcher’s nastiness, not in terms of injury prediction): it really only accounts for two pitches. Have you considered a way to tweak it to account for more pitches?

I’m considering a project of taking each offspeed pitch, calculating “Stuff” as if it was the only offspeed pitch in the pitcher’s repertoire, and then weighting them by how often he threw each offspeed pitch. What would be problems with this approach? Would it be too fastball heavy?

Also do you know the correlation between Stuff and metrics like SwStr%, Whiffs/Swing, or Z-Contact%?