Modeling SwStr% and GB% Using Velocity and Movement
This year, I’ve been caught up on pitching. I investigated the nuance inherent to swinging strikes, indirectly made a case for completely abandoning the sinker with this piece comparing pitch type outcomes, and (maybe) identified the keys to unlocking pitcher BABIP and HR/FB.
Here, I’ve modeled swinging strike and ground ball rates using only pitch velocity and movement. Surely, this work can be improved; my quantitative tool set, while fairly robust compared to the layman’s, is meager compared to that of the professional or even hobbyist statistician. Regardless, I think it’s pretty cool, and I hope it adds to the conversation constructively.
Mostly, this serves to satiate my own curiosity. Unfortunately, it may be denser than I expected — few answers are ever quite as simple as you hope them to be, I guess.
Existing Research
I linked to several of my own pieces above. Dan Lependorf wrote about estimating ground ball rates in 2013 at the Hardball Times, although its conclusions have an anecdotal slant. (It thinks about velocity and movement but doesn’t take the requisite steps to bridge the logic.)
Per the Athletic’s (and formerly FanGraphs’) Eno Sarris and an assist from Baseball Prospectus’ Harry Pavlidis, “sizzling power curves get the whiffs” (also, “there was a decent, if not robust, relationship between horizontal movement and strikeouts”). Pavlidis’ three-part series on change-up composition can be found here, by the way; it describes how the change-ups that draw whiffs are not the same as those that induce grounders (“a fastball with plus velocity and a sizable gap (10+ MPH) between the heater and the changeup make for more missed bats with the offspeed pitch, while a smaller gap helps the pitcher to induce more ground balls”).
Eno also talked about sliders here (did you know Eno likes pitching?), where he notes that drop and velocity are most important for incurring whiffs and grounders, although the relationships are not super strong. He includes a regression model (unbeknownst to me before engaging in this endeavor) very similar to those that follow. I’m not surprised by this overlap; intense baseball fans understand the importance of these variables.
Jonah Pemstein did some high-quality analysis of spin rates here; it’s somewhat tangential, as spin rate evidently affects movement, which (as I hope to demonstrate) affects whiff rate and batted ball outcomes.
And, lastly, as I write this, Eno is compiling lists of 2018’s best pitches by pitch type (FS) (CH) (CU), for your reference.
Background/“Methodology”
The following modeling efforts use PITCHf/x data from 2014 through 2017 rolled up to the year-name-pitch-role level, such that each observation appears as follows:
- Year, Name, Pitch Type, Role (SP/RP)*…
- 2017, Clayton Kershaw, slider, SP, …
- 2017, Clayton Kershaw, curve, SP, …
- etc.
(*The data is parsed by pitcher role in light of how “stuff” plays up differently in relief.)
Based on a cursory inspection of Jonah and Sean Dolinar’s long-needed update on reliability, I set a cutoff point of 250 pitches to omit egregiously small pitch counts without excessively whittling down my sample (n = 3,652).
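For anyone who wants to replicate the roll-up, here is a minimal sketch of the aggregation step in plain Python. It is not the code behind the results in this post: the field names (`velo`, `h_mov`, `v_mov`, `whiff`) and the sample rows are hypothetical, and treating the cutoff as "at least 250 pitches" is an assumption.

```python
from collections import defaultdict

MIN_PITCHES = 250  # cutoff described in the text (assumed to mean "at least 250")


def roll_up(pitches, min_pitches=MIN_PITCHES):
    """Aggregate pitch-level rows to (year, name, pitch_type, role) observations."""
    groups = defaultdict(lambda: {"n": 0, "whiffs": 0, "velo": 0.0, "h_mov": 0.0, "v_mov": 0.0})
    for p in pitches:
        g = groups[(p["year"], p["name"], p["pitch_type"], p["role"])]
        g["n"] += 1
        g["whiffs"] += 1 if p["whiff"] else 0
        for field in ("velo", "h_mov", "v_mov"):
            g[field] += p[field]

    rows = {}
    for key, g in groups.items():
        n = g["n"]
        if n < min_pitches:
            continue  # drop small samples
        rows[key] = {
            "count": n,
            "swstr": g["whiffs"] / n,  # swinging strikes per pitch
            "velo": g["velo"] / n,
            "h_mov": g["h_mov"] / n,
            "v_mov": g["v_mov"] / n,
        }
    return rows


# Synthetic rows, purely illustrative: 300 sliders (kept) and 100 curves (dropped).
sample = [
    {"year": 2017, "name": "Pitcher A", "pitch_type": "slider", "role": "SP",
     "velo": 88.0, "h_mov": 2.0, "v_mov": 1.0, "whiff": i % 5 == 0}
    for i in range(300)
] + [
    {"year": 2017, "name": "Pitcher A", "pitch_type": "curve", "role": "SP",
     "velo": 78.0, "h_mov": 5.0, "v_mov": -6.0, "whiff": i % 10 == 0}
    for i in range(100)
]
rolled = roll_up(sample)
```

Ground ball rate would need a different denominator (balls in play rather than pitches), so it is left out of this sketch.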
I originally intended to specify separate models for each pitch type but, instead, concluded that formal pitch type designations shouldn’t matter: a pitch is a pitch, and its movement and velocity are its own. It doesn’t matter what it’s called, and it seems to me labels might only serve to limit or dilute the analysis.* How all of a pitch’s components exist and interact should affect its outcomes and nothing else. (*I confirmed with Eno that some pitches are classified simply on grip rather than flight properties, which seems to me a nonideal data quirk for this exercise.)
I settled on the following independent variables for my model:
- veloi: velocity (mph)
- h-movi: horizontal movement (in.)
- v-movi: vertical movement (in.)
- The interaction between velocity * horizontal movement
- The interaction between velocity * vertical movement
- The interaction between vertical movement * horizontal movement
- veloFB: primary fastball velocity
  - FB denotes the fastball thrown most often by that pitcher in a given season, chosen among whichever ones he throws: four-seam, sinker, cutter (for these purposes).
  - I chose this approach to avoid instances in which a pitcher throws, say, only one true fastball but mostly relies on junk (2017 Alex Claudio being the most salient, immediately memorable example). Alternate approaches included (1) velocity of fastest pitch (regardless of how often it’s thrown) and (2) weighted-average velocity of all fastball types (which, frankly, might be better, but I don’t know).
  - Note that Eno looked at velocity differential in his aforementioned 2015 piece on sliders.
- The interaction of velocity * primary fastball velocity
… where i represents the pitch type in question.
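To make the three candidate definitions of fastball velocity concrete, here is a small sketch. The function names, the pitch-type labels, and the example arsenal are hypothetical (the numbers are invented, not any real pitcher's); only the first function corresponds to the definition the model actually uses.

```python
FASTBALL_TYPES = ("fourseam", "sinker", "cutter")  # fastballs considered, per the text


def primary_fastball_velo(arsenal):
    """Velocity of the most-used fastball (the definition used for veloFB).

    arsenal maps pitch type -> (pitch count, average velocity in mph).
    """
    fastballs = {t: cv for t, cv in arsenal.items() if t in FASTBALL_TYPES}
    if not fastballs:
        return None
    most_used = max(fastballs, key=lambda t: fastballs[t][0])
    return fastballs[most_used][1]


def fastest_pitch_velo(arsenal):
    """Alternative (1): velocity of the fastest pitch, regardless of usage."""
    return max(velo for _, velo in arsenal.values())


def weighted_fastball_velo(arsenal):
    """Alternative (2): usage-weighted average velocity across all fastball types."""
    fastballs = [cv for t, cv in arsenal.items() if t in FASTBALL_TYPES]
    total = sum(count for count, _ in fastballs)
    if total == 0:
        return None
    return sum(count * velo for count, velo in fastballs) / total


# Invented junk-baller: mostly change-ups, a sinker, and a rarely used four-seamer.
example = {"change": (900, 79.0), "sinker": (400, 86.5), "fourseam": (50, 87.5)}
primary = primary_fastball_velo(example)
fastest = fastest_pitch_velo(example)
weighted = weighted_fastball_velo(example)
```

In this made-up case the three definitions land within about a mile per hour of each other; they would diverge more for a pitcher whose rarely thrown fastball is much harder than his main one.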
I considered running a second iteration of the model with additional fastball specs and interactions but feared over-fitting the model without adding much explanatory power to it (which turned out to be true).
LET’S GET TO THE GOODS.
Here’s the model for swinging strike rate (SwStr%)…
Variable | Value | Note |
---|---|---|
veloi | -0.06060 | *** |
H-movi | -0.04816 | *** |
V-movi | +0.03860 | *** |
veloi * H-movi | +0.00057 | *** |
veloi * V-movi | -0.00044 | *** |
H-movi * V-movi | +0.00002 | |
veloFB | -0.04188 | *** |
veloi * veloFB | +0.00061 | *** |
constant | +4.37107 | *** |
adjusted R2 | 0.532 |
* 90% confidence
** 95% confidence
*** 99% confidence
… and ground ball rate (GB%).
Variable | Value | Note |
---|---|---|
veloi | -0.04619 | *** |
H-movi | -0.03640 | *** |
V-movi | +0.03128 | *** |
veloi * H-movi | +0.00030 | *** |
veloi * V-movi | -0.00069 | *** |
H-movi * V-movi | +0.00041 | *** |
veloFB | -0.05894 | *** |
veloi * veloFB | +0.00065 | *** |
constant | +4.80310 | *** |
adjusted R2 | 0.523 |
* 90% confidence
** 95% confidence
*** 99% confidence
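If you want to plug in your own numbers, the two tables translate into the sketch below. A few assumptions to flag: the outputs appear to be on a 0-to-1 scale (a typical four-seamer comes out around 0.08 for whiffs), movement is in inches using whatever sign convention the source data uses (not spelled out here), and the example inputs are made up. Also note that the coefficients are rounded to five decimals; because the velocity-by-fastball-velocity term multiplies a number in the thousands, that rounding alone can move an estimate by a few percentage points, so treat the output as approximate.

```python
# Coefficients copied from the two tables above (rounded to five decimals).
SWSTR_COEFS = {
    "velo": -0.06060, "h_mov": -0.04816, "v_mov": 0.03860,
    "velo*h_mov": 0.00057, "velo*v_mov": -0.00044, "h_mov*v_mov": 0.00002,
    "velo_fb": -0.04188, "velo*velo_fb": 0.00061, "constant": 4.37107,
}
GB_COEFS = {
    "velo": -0.04619, "h_mov": -0.03640, "v_mov": 0.03128,
    "velo*h_mov": 0.00030, "velo*v_mov": -0.00069, "h_mov*v_mov": 0.00041,
    "velo_fb": -0.05894, "velo*velo_fb": 0.00065, "constant": 4.80310,
}


def expected_rate(coefs, velo, h_mov, v_mov, velo_fb):
    """Linear prediction: main effects, the four interactions, and the constant."""
    return (
        coefs["constant"]
        + coefs["velo"] * velo
        + coefs["h_mov"] * h_mov
        + coefs["v_mov"] * v_mov
        + coefs["velo*h_mov"] * velo * h_mov
        + coefs["velo*v_mov"] * velo * v_mov
        + coefs["h_mov*v_mov"] * h_mov * v_mov
        + coefs["velo_fb"] * velo_fb
        + coefs["velo*velo_fb"] * velo * velo_fb
    )


# Hypothetical four-seamer that is also the pitcher's primary fastball.
x_swstr = expected_rate(SWSTR_COEFS, velo=93.0, h_mov=-4.0, v_mov=9.0, velo_fb=93.0)
x_gb = expected_rate(GB_COEFS, velo=93.0, h_mov=-4.0, v_mov=9.0, velo_fb=93.0)
```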
The following table summarizes the mean whiff and ground ball rates before and after the modeling:
Pitch Type | SwStr% | xSwStr% | GB% | xGB% |
---|---|---|---|---|
Fourseam | 8.8% | 8.4% | 35.8% | 37.8% |
Sinker | 5.9% | 7.3% | 55.0% | 51.8% |
Change | 17.4% | 16.6% | 50.6% | 48.9% |
Slider | 17.4% | 16.2% | 45.7% | 47.5% |
Curve | 13.4% | 14.4% | 51.6% | 51.3% |
Splitter | 18.7% | 15.6% | 54.0% | 51.9% |
Average | 11.2% | 11.2% | 45.2% | 45.2% |
Note the discrepancies by pitch type. Given the somewhat subjective (and oftentimes confusing or conflicting) nature of classifying pitches, it’s not entirely surprising to me the regression, blind to pitch classifications, estimates somewhat discrepant rates for each pitch type. The model was pretty forgiving on ground balls: regressed outcomes ranged from 21% to 86%, whereas actual outcomes ranged from 12% to 85%. On the whiffs side, it was less so: regressed outcomes ranged from -3% to 27% compared to actual outcomes of 1% to 35%. But the model must be broken if it produces a negative whiff rate! I disagree; I blame Joe Saunders for throwing such a spectacularly bad sinker (and Jamey Carroll, too, for his equally miserable four-seamer) in 2014. They probably deserved to give back strikes for throwing those pitches so often.
Interpretation
It’s deliberately difficult to disentangle interpretations of the model coefficients. You shouldn’t evaluate the effects of velocity or movement in isolation because the model specification prohibits it. Every interaction has to be considered jointly. For example, in looking at only primary pitch velocity, fastball velocity (which can also be the primary pitch, by the way), and their interaction (aka their differential), more velocity and higher differentials are better. In this instance, the model rewards 100 mph fastballs most heavily, which we know is fundamentally untrue, given the superiority of off-speed and breaking pitches regarding swinging strikes. That’s why it’s important to look at other interactions such as those of primary pitch velocity and movement, which heavily punish velocity to adjust for the effectiveness of non-fastball offerings. You would need some kind of four-dimensional visualization to understand the full extent of how every interaction cooperates with one another in regard to both whiffs and ground balls. Don’t try to assess any specific variable or interaction in a vacuum.
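One way to see why no coefficient can be read alone: with interactions in the model, the "effect" of one more mph is not the velocity coefficient by itself but that coefficient plus each interaction coefficient times the other variable. The sketch below computes that slope for the SwStr% model, holding fastball velocity fixed. The function name and example inputs are hypothetical, the movement sign convention is whatever the source data uses, and the rounded coefficients make the exact values rough.

```python
# SwStr% coefficients that involve pitch velocity, from the table above.
B_VELO = -0.06060
B_VELO_HMOV = 0.00057
B_VELO_VMOV = -0.00044
B_VELO_VELOFB = 0.00061


def velo_slope(h_mov, v_mov, velo_fb):
    """Change in predicted SwStr% per extra mph on the pitch itself,
    holding movement and primary fastball velocity fixed."""
    return (
        B_VELO
        + B_VELO_HMOV * h_mov
        + B_VELO_VMOV * v_mov
        + B_VELO_VELOFB * velo_fb
    )


# Two made-up movement profiles behind the same 93 mph primary fastball.
slope_a = velo_slope(h_mov=-4.0, v_mov=9.0, velo_fb=93.0)
slope_b = velo_slope(h_mov=5.0, v_mov=-6.0, velo_fb=93.0)
```

With these illustrative inputs the slope is negative for the first profile and slightly positive for the second, which is the point: the same extra mph helps or hurts depending on everything else in the equation.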
There’s one insignificant variable, specific to the whiff rate model: the interaction between horizontal and vertical movement. Again, given all the other interactions, it doesn’t mean movement isn’t important — it’s just that this particular interaction effectively lends nothing in the way of explanatory power.
Notes
Correlation matrices suggested that, indeed, velocity and vertical movement are moderately correlated with one another. Variance inflation factors (VIFs) seemed to confirm my suspicions; the variance of the estimated coefficients might’ve ranged anywhere from 50% to 350% wider than in the absence of the multicollinearity. However, interactions also exacerbate multicollinearity. I can’t, in good faith, argue in favor of removing one or more pitch characteristics to appease this issue (although I’m open to being convinced otherwise). Even if velocity affects movement (and/or vice versa), no two pitches are consistently exactly alike. Velocity and movement interact differently for every pitch and pitcher. It would do us a disservice to assume velocity or movement is a non-factor because the other correlates strongly with it.
I considered specifying the model a number of different ways. I considered removing the interactions to simplify the model and prevent over-fitting, but I think omitting the interactions strips the model of depth; I want to know not only how velocity and horizontal movement and vertical movement affect whiff rates separately but also how velocity and a specific type of movement affect whiff rates in tandem. I considered accounting for velocity differentials using actual differentials (i.e. A minus B), again for simplicity, but I wanted to maintain each pitch’s distinct average velocity. (It might, however, benefit from simply using a pitcher’s average velocity as the differential velocity for all of his pitch types.) I considered combining each movement variable into a single vector (Pythagoras would be so proud of me), but it would have reduced four quadrants of movement to one, which is too much condensing for my tastes. I considered year fixed effects, but they added virtually no value to the model despite small incremental changes to whiff rates over recent years. I considered a nonlinear approach, and I considered completely different statistical approaches (partial least squares or principal component analysis).
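For reference, a VIF is computed from how well one predictor is explained by all the others: VIF = 1 / (1 - R²), where R² comes from regressing that predictor on the rest. The sketch below just converts between the two. Reading "50% to 350% wider" as VIFs of roughly 1.5 to 4.5 is an assumption about the phrasing above, not something taken from the model output.

```python
def vif(r_squared):
    """Variance inflation factor for a predictor whose regression on the
    other predictors has the given R-squared (must be below 1)."""
    return 1.0 / (1.0 - r_squared)


def r_squared_from_vif(v):
    """Invert the formula: the R-squared implied by a given VIF."""
    return 1.0 - 1.0 / v


# Assumed reading of "50% to 350% wider": VIFs of 1.5 and 4.5.
low_r2 = r_squared_from_vif(1.5)   # about one-third
high_r2 = r_squared_from_vif(4.5)  # about 0.78
```

Under that reading, the most affected predictor would have a bit more than three-quarters of its variance explained by the other terms in the model.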
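The single-vector idea is easy to illustrate, along with what it throws away. In the sketch below (hypothetical function name, made-up movement values), four pitches that break in four different directions all collapse to the same magnitude.

```python
import math


def movement_magnitude(h_mov, v_mov):
    """Total movement in inches, ignoring direction (Pythagorean length)."""
    return math.hypot(h_mov, v_mov)


# Same amount of movement in each of the four quadrants (inches, made up).
quadrants = [(6.0, 8.0), (-6.0, 8.0), (-6.0, -8.0), (6.0, -8.0)]
magnitudes = [movement_magnitude(h, v) for h, v in quadrants]
```

Every entry in `magnitudes` is the same, so a model fed only this value could not tell rise from drop or arm-side from glove-side break.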
This model, explaining more than half the variance in whiff and ground ball rates, is clearly missing some important variables — namely, pitch location (i.e. command). Two pitches thrown completely identically will produce different outcomes when spotted perfectly on the outside corner or grooved down the middle. Release point matters (see Eno’s change-up piece linked earlier), as does sequencing, and quality of opponent, and pitcher handedness, and more. Some of these things I omitted by choice, others from a lack of resources and certainly not out of malice.
Ultimately, I settled on this. As stated previously, I expect this to only further the ongoing discussion about how to model outcomes using actual physical pitch characteristics. (Give me your feedback!)
2018 Estimates (They’re Kind of Like “Arsenal Scores”)
Probably the most critical concern to clear up: among pitches thrown at least 250 times, applying the coefficients from the SwStr% version of the model (which spans 2014-17) to 2018 inputs actually produces a higher adjusted R2 this year (0.552) than the model did on its own sample (0.532). It produces a much lower adjusted R2 (0.198) in 2018 for the GB% version of the model. The denominator is fundamentally different for ground balls: it depends not on the number of pitches but, rather, on the number of balls in play, which happen about once every six pitches. That makes the effective sample sizes dramatically smaller, especially at this point in the season, when most pitchers are just cresting the 250-pitch mark with some of their offerings. Alas, the small samples destabilize the “xGB%” estimates quite a bit, but I will still include them here for posterity.
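A rough way to size up the denominator problem is to treat each rate as a simple binomial proportion, which is an assumption made here for illustration rather than part of the model. Using the sample averages from the summary table (11.2% SwStr%, 45.2% GB%) and one ball in play per six pitches:

```python
import math


def binomial_se(p, n):
    """Standard error of an observed proportion p over n independent trials."""
    return math.sqrt(p * (1.0 - p) / n)


PITCHES = 250                # a pitch just clearing the cutoff
BALLS_IN_PLAY = PITCHES / 6  # "about once every six pitches"

swstr_se = binomial_se(0.112, PITCHES)     # whiffs are per pitch
gb_se = binomial_se(0.452, BALLS_IN_PLAY)  # grounders are per ball in play
```

Under these assumptions the ground ball rate for a 250-pitch sample is several times noisier than the whiff rate: roughly 8 percentage points of standard error versus roughly 2.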
FanGraphs primarily relies on Pitch Info data, which power much of Brooks Baseball. This means it inherently differs from PITCHf/x data in some aspects. However, to my knowledge, Brooks Baseball also relies on some raw PITCHf/x data as well. For example, as of June 13, Max Scherzer has an 18.8% swinging strike rate according to Baseball Prospectus and Brooks Baseball but a 17.9% rate at FanGraphs, whether you use the “Plate Discipline” or “Pitch Info Plate Discipline” tabs.
Alas, the results presented below more closely mirror those you will see at the Baseball Prospectus leaderboard previously linked, which err on the side of higher whiff rates. (The league-wide whiff rate is 11.0% there compared to 10.7% at FanGraphs, as of my writing this.) Same goes for ball-in-play metrics: the ground ball rates will probably look a little funky if you’re accustomed to relying on FanGraphs’ data. (I know I am.) All told, the general sentiment remains unchanged: the best estimated swinging strike (xSwStr%) and ground ball (xGB%) rates indicate what might be considered the filthiest arsenals using only velocity and movement. Remember, these estimates are far from absolute. They omit many variables (as I mentioned earlier), including command/control, which obviously plays a huge factor in how a pitcher’s pitch mix plays up.
Results are presented at the pitcher level — there are thousands of individual pitches to account for — but I’m more than happy to answer questions about specific pitches in the comments, time and sanity permitting. That was kind of the initial goal anyway: to find the best pitch(es). Or! Use the equation coefficients for yourself!
Name | Count | SwStr% | xSwStr% | diff | GB% | xGB% | diff |
---|---|---|---|---|---|---|---|
Luis Severino | 1,286 | 13.4% | 16.9% | -3.5% | 46.4% | 45.6% | 0.8% |
Garrett Richards | 1,156 | 12.7% | 15.9% | -3.1% | 52.3% | 55.2% | -2.9% |
Shohei Ohtani | 804 | 15.9% | 15.2% | 0.7% | 39.0% | 40.7% | -1.8% |
Noah Syndergaard | 1,041 | 15.5% | 14.4% | 1.1% | 50.3% | 47.1% | 3.2% |
Walker Buehler | 711 | 10.8% | 13.7% | -2.9% | 55.6% | 43.7% | 11.9% |
Chris Archer | 1,233 | 13.2% | 13.6% | -0.4% | 44.2% | 44.1% | 0.1% |
Jacob deGrom | 1,229 | 16.4% | 13.5% | 2.9% | 44.8% | 43.5% | 1.3% |
Blake Snell | 1,366 | 13.8% | 13.5% | 0.3% | 40.4% | 40.5% | -0.1% |
Jon Gray | 1,204 | 13.6% | 13.4% | 0.2% | 46.6% | 43.3% | 3.3% |
German Marquez | 1,170 | 10.3% | 13.4% | -3.0% | 45.5% | 44.5% | 1.0% |
Mike Foltynewicz | 1,302 | 10.8% | 13.3% | -2.6% | 41.5% | 42.8% | -1.3% |
Gerrit Cole | 1,339 | 15.1% | 13.3% | 1.8% | 33.9% | 43.8% | -9.9% |
Reynaldo Lopez | 1,118 | 10.0% | 13.3% | -3.3% | 35.1% | 40.1% | -5.0% |
Marco Estrada | 1,131 | 11.2% | 13.3% | -2.1% | 27.4% | 26.2% | 1.1% |
Michael Fulmer | 1,221 | 11.1% | 13.1% | -2.1% | 48.6% | 47.6% | 0.9% |
Joe Musgrove | 251 | 10.4% | 13.0% | -2.7% | 48.2% | 45.4% | 2.8% |
Daniel Gossett | 389 | 8.7% | 12.9% | -4.2% | 42.0% | 40.0% | 2.0% |
Trevor Bauer | 1,421 | 13.6% | 12.9% | 0.7% | 46.1% | 46.8% | -0.8% |
Domingo German | 539 | 15.4% | 12.8% | 2.6% | 42.2% | 50.8% | -8.5% |
Lance McCullers | 1,260 | 12.8% | 12.8% | 0.0% | 56.3% | 56.5% | -0.2% |
Miles Mikolas | 1,139 | 10.4% | 12.7% | -2.3% | 51.5% | 47.4% | 4.0% |
Stephen Strasburg | 1,269 | 12.8% | 12.7% | 0.1% | 45.2% | 45.9% | -0.7% |
Jameson Taillon | 1,125 | 10.7% | 12.6% | -1.9% | 51.5% | 45.8% | 5.7% |
Yu Darvish | 739 | 11.4% | 12.5% | -1.1% | 39.0% | 42.6% | -3.6% |
Masahiro Tanaka | 1,041 | 14.7% | 12.4% | 2.3% | 47.2% | 45.9% | 1.2% |
Charlie Morton | 1,224 | 12.9% | 12.4% | 0.6% | 52.7% | 51.3% | 1.5% |
Max Scherzer | 1,369 | 18.8% | 12.3% | 6.5% | 35.2% | 44.9% | -9.7% |
Vincent Velasquez | 1,159 | 12.7% | 12.2% | 0.4% | 40.6% | 44.2% | -3.6% |
Carlos Carrasco | 1,160 | 14.7% | 12.2% | 2.5% | 43.5% | 49.4% | -5.9% |
Luis Castillo | 1,286 | 15.3% | 12.2% | 3.1% | 46.4% | 52.5% | -6.0% |
Brandon Woodruff | 270 | 9.3% | 12.2% | -2.9% | 53.8% | 43.9% | 9.9% |
Fernando Romero | 648 | 12.0% | 12.1% | 0.0% | 46.4% | 55.9% | -9.5% |
Zach Eflin | 570 | 10.2% | 12.0% | -1.9% | 33.0% | 45.2% | -12.1% |
Michael Wacha | 1,258 | 10.8% | 12.0% | -1.2% | 44.8% | 38.1% | 6.6% |
David Hess | 451 | 10.4% | 12.0% | -1.6% | 39.4% | 35.2% | 4.1% |
Jacob Faria | 803 | 9.3% | 12.0% | -2.7% | 34.0% | 35.9% | -1.8% |
Jaime Barria | 596 | 13.3% | 11.9% | 1.3% | 40.0% | 36.7% | 3.3% |
Luke Weaver | 1,186 | 9.9% | 11.9% | -2.0% | 42.3% | 41.7% | 0.6% |
Jordan Lyles | 550 | 9.8% | 11.8% | -2.0% | 46.1% | 50.7% | -4.6% |
Anibal Sanchez | 423 | 9.5% | 11.8% | -2.4% | 51.4% | 38.4% | 12.9% |
Robbie Ray | 505 | 14.7% | 11.8% | 2.9% | 38.3% | 41.0% | -2.7% |
Ross Stripling | 527 | 12.3% | 11.7% | 0.6% | 54.7% | 41.3% | 13.3% |
Jose Urena | 1,205 | 9.7% | 11.7% | -1.9% | 51.4% | 46.7% | 4.6% |
Tyler Anderson | 1,123 | 12.3% | 11.7% | 0.6% | 34.3% | 35.1% | -0.8% |
Zack Wheeler | 1,035 | 11.3% | 11.6% | -0.3% | 46.7% | 39.9% | 6.8% |
Blaine Hardy | 421 | 9.0% | 11.6% | -2.6% | 37.1% | 37.7% | -0.6% |
Chad Kuhl | 1,213 | 10.4% | 11.6% | -1.2% | 37.6% | 45.9% | -8.3% |
Tyson Ross | 1,287 | 10.3% | 11.6% | -1.3% | 45.5% | 44.7% | 0.8% |
Kevin Gausman | 1,276 | 13.7% | 11.6% | 2.1% | 49.2% | 41.2% | 7.9% |
Sonny Gray | 1,068 | 10.1% | 11.6% | -1.5% | 46.5% | 45.9% | 0.6% |
James Paxton | 1,341 | 14.8% | 11.5% | 3.2% | 37.2% | 43.1% | -5.9% |
Chris Sale | 1,407 | 16.5% | 11.5% | 4.9% | 42.9% | 52.2% | -9.4% |
Jack Flaherty | 673 | 12.5% | 11.5% | 1.0% | 42.5% | 43.1% | -0.6% |
Matt Boyd | 1,121 | 10.4% | 11.5% | -1.1% | 30.6% | 38.4% | -7.8% |
Corey Kluber | 1,248 | 11.5% | 11.5% | 0.1% | 50.0% | 47.3% | 2.7% |
Danny Duffy | 1,388 | 10.5% | 11.5% | -1.0% | 31.3% | 38.9% | -7.7% |
Jordan Montgomery | 453 | 11.0% | 11.4% | -0.4% | 45.7% | 38.4% | 7.3% |
Mike Minor | 1,127 | 10.9% | 11.4% | -0.5% | 38.7% | 37.9% | 0.8% |
Marcus Stroman | 651 | 9.8% | 11.4% | -1.6% | 60.5% | 51.9% | 8.6% |
Kenta Maeda | 784 | 15.7% | 11.4% | 4.3% | 37.0% | 40.8% | -3.9% |
Nick Kingham | 552 | 12.7% | 11.3% | 1.4% | 39.0% | 42.9% | -3.9% |
Tyler Chatwood | 1,089 | 8.4% | 11.3% | -2.8% | 53.6% | 42.5% | 11.1% |
Kyle Freeland | 1,144 | 9.8% | 11.2% | -1.4% | 49.5% | 45.9% | 3.6% |
Wade LeBlanc | 537 | 10.1% | 11.2% | -1.2% | 33.6% | 45.3% | -11.7% |
Nick Pivetta | 1,157 | 12.6% | 11.2% | 1.4% | 42.0% | 42.3% | -0.4% |
Francisco Liriano | 913 | 11.2% | 11.2% | 0.0% | 45.2% | 46.1% | -0.9% |
Johnny Cueto | 477 | 10.9% | 11.2% | -0.3% | 45.7% | 44.3% | 1.4% |
Brett Anderson | 253 | 8.7% | 11.2% | -2.5% | 56.9% | 47.8% | 9.1% |
Clayton Kershaw | 737 | 12.2% | 11.1% | 1.1% | 45.5% | 35.9% | 9.6% |
Clay Buchholz | 332 | 11.1% | 11.1% | 0.0% | 32.4% | 40.6% | -8.3% |
Frankie Montas | 276 | 8.0% | 11.1% | -3.2% | 41.3% | 50.3% | -9.0% |
Nick Tropeano | 813 | 12.2% | 11.1% | 1.1% | 36.2% | 43.4% | -7.2% |
Justin Verlander | 1,429 | 14.1% | 11.1% | 3.1% | 30.4% | 37.0% | -6.6% |
Bryan Mitchell | 519 | 5.8% | 11.1% | -5.3% | 52.5% | 44.5% | 8.0% |
Daniel Mengden | 1,187 | 9.0% | 11.1% | -2.1% | 40.0% | 37.4% | 2.6% |
Matt Wisler | 270 | 10.0% | 11.1% | -1.1% | 26.9% | 38.2% | -11.3% |
John Gant | 264 | 13.3% | 11.0% | 2.2% | 41.5% | 43.1% | -1.6% |
Mike Clevinger | 1,349 | 11.7% | 11.0% | 0.7% | 44.1% | 37.0% | 7.0% |
Lance Lynn | 1,165 | 10.3% | 11.0% | -0.7% | 50.6% | 48.4% | 2.1% |
Patrick Corbin | 1,242 | 14.7% | 11.0% | 3.7% | 46.3% | 43.0% | 3.3% |
Carson Fulmer | 633 | 6.8% | 11.0% | -4.2% | 33.7% | 39.2% | -5.5% |
Chase Anderson | 1,059 | 9.6% | 11.0% | -1.3% | 35.6% | 38.1% | -2.5% |
Carlos Martinez | 852 | 10.7% | 10.9% | -0.2% | 55.5% | 50.3% | 5.2% |
Jose Berrios | 1,133 | 13.2% | 10.9% | 2.4% | 40.6% | 43.7% | -3.1% |
Chad Bettis | 1,181 | 9.9% | 10.8% | -0.9% | 47.0% | 43.9% | 3.1% |
Homer Bailey | 1,036 | 8.1% | 10.8% | -2.7% | 41.0% | 42.9% | -1.9% |
Dylan Covey | 464 | 9.5% | 10.8% | -1.4% | 61.7% | 47.9% | 13.8% |
Jordan Zimmermann | 502 | 10.8% | 10.8% | -0.1% | 30.9% | 41.7% | -10.8% |
Mike Soroka | 254 | 12.2% | 10.8% | 1.4% | 42.9% | 52.5% | -9.7% |
Jeremy Hellickson | 610 | 10.2% | 10.8% | -0.6% | 48.3% | 42.9% | 5.4% |
Derek Holland | 1,080 | 9.3% | 10.8% | -1.5% | 38.6% | 42.1% | -3.5% |
Jake Junis | 1,248 | 11.4% | 10.8% | 0.6% | 40.9% | 44.1% | -3.3% |
James Shields | 1,245 | 10.1% | 10.8% | -0.6% | 39.5% | 46.0% | -6.5% |
Andrew Suarez | 763 | 8.5% | 10.7% | -2.2% | 49.3% | 42.4% | 6.8% |
Trevor Williams | 1,147 | 7.8% | 10.7% | -2.9% | 41.8% | 43.2% | -1.5% |
Adam Plutko | 264 | 8.7% | 10.7% | -2.0% | 30.9% | 32.2% | -1.3% |
Eduardo Rodriguez | 1,210 | 12.9% | 10.7% | 2.2% | 41.6% | 46.5% | -4.9% |
Yonny Chirinos | 354 | 11.6% | 10.7% | 0.9% | 44.9% | 50.5% | -5.6% |
Sal Romano | 1,134 | 6.7% | 10.7% | -4.0% | 46.3% | 47.0% | -0.7% |
Trevor Cahill | 746 | 13.4% | 10.6% | 2.8% | 60.3% | 54.6% | 5.8% |
David Price | 1,114 | 10.1% | 10.6% | -0.5% | 41.2% | 45.4% | -4.2% |
Aaron Nola | 1,261 | 11.9% | 10.5% | 1.4% | 54.2% | 48.1% | 6.1% |
Kyle Gibson | 1,254 | 12.2% | 10.5% | 1.7% | 49.2% | 43.9% | 5.4% |
Hyun-jin Ryu | 462 | 11.3% | 10.5% | 0.8% | 55.9% | 42.4% | 13.5% |
Alex Wood | 1,110 | 12.1% | 10.5% | 1.6% | 46.6% | 48.0% | -1.4% |
Lucas Giolito | 1,084 | 8.4% | 10.5% | -2.1% | 41.4% | 39.5% | 1.9% |
Matt Harvey | 864 | 8.1% | 10.5% | -2.4% | 42.7% | 40.8% | 1.8% |
Trevor Richards | 541 | 10.5% | 10.5% | 0.1% | 40.5% | 40.4% | 0.1% |
Luis Perdomo | 306 | 10.1% | 10.5% | -0.3% | 40.8% | 56.7% | -15.9% |
Andrew Heaney | 931 | 12.9% | 10.4% | 2.4% | 41.1% | 43.3% | -2.2% |
Dylan Bundy | 1,201 | 15.1% | 10.4% | 4.6% | 36.1% | 35.1% | 1.0% |
Joe Biagini | 362 | 8.6% | 10.4% | -1.8% | 50.8% | 40.3% | 10.4% |
Jake Odorizzi | 1,160 | 13.1% | 10.3% | 2.8% | 26.1% | 36.7% | -10.6% |
Dillon Peters | 437 | 7.6% | 10.3% | -2.7% | 45.0% | 43.3% | 1.7% |
Brandon McCarthy | 1,059 | 7.6% | 10.3% | -2.7% | 49.8% | 48.0% | 1.8% |
Sean Manaea | 1,167 | 10.0% | 10.3% | -0.3% | 44.5% | 50.7% | -6.2% |
Jason Hammel | 1,239 | 9.8% | 10.3% | -0.4% | 38.2% | 43.6% | -5.5% |
Chris Stratton | 1,174 | 9.4% | 10.3% | -0.9% | 39.1% | 39.9% | -0.8% |
Caleb Smith | 1,164 | 13.6% | 10.2% | 3.3% | 29.1% | 41.1% | -12.0% |
Chris Tillman | 520 | 5.4% | 10.2% | -4.8% | 42.3% | 39.8% | 2.5% |
Steven Matz | 1,030 | 8.4% | 10.2% | -1.8% | 53.2% | 48.4% | 4.8% |
Zack Godley | 1,193 | 11.3% | 10.2% | 1.1% | 51.8% | 53.4% | -1.7% |
Jarlin Garcia | 507 | 8.3% | 10.2% | -1.9% | 38.5% | 46.6% | -8.0% |
Brandon Finnegan | 392 | 7.4% | 10.2% | -2.8% | 40.5% | 40.1% | 0.4% |
Marco Gonzales | 1,199 | 9.2% | 10.1% | -1.0% | 46.6% | 46.8% | -0.2% |
Matthew Koch | 877 | 7.9% | 10.1% | -2.3% | 42.6% | 41.4% | 1.3% |
Junior Guerra | 967 | 11.4% | 10.1% | 1.2% | 39.5% | 38.3% | 1.2% |
Kendall Graveman | 624 | 8.0% | 10.1% | -2.1% | 56.0% | 47.8% | 8.2% |
Jeff Samardzija | 665 | 9.3% | 10.1% | -0.7% | 33.9% | 39.8% | -5.9% |
Matt Moore | 989 | 10.9% | 10.1% | 0.9% | 39.8% | 42.6% | -2.8% |
CC Sabathia | 940 | 10.6% | 10.1% | 0.6% | 43.5% | 42.1% | 1.3% |
Wei-Yin Chen | 596 | 8.6% | 10.0% | -1.5% | 34.3% | 35.6% | -1.3% |
Steven Brault | 435 | 9.4% | 10.0% | -0.6% | 46.8% | 47.3% | -0.5% |
Dan Straily | 736 | 12.0% | 10.0% | 2.0% | 32.8% | 38.5% | -5.7% |
Tyler Mahle | 1,216 | 11.3% | 10.0% | 1.4% | 38.4% | 39.8% | -1.4% |
Ty Blach | 981 | 6.9% | 9.9% | -3.0% | 55.2% | 47.5% | 7.8% |
Andrew Triggs | 719 | 10.6% | 9.9% | 0.7% | 47.9% | 56.5% | -8.7% |
Jon Lester | 1,236 | 10.1% | 9.9% | 0.2% | 39.1% | 40.6% | -1.4% |
Sean Newcomb | 1,175 | 11.5% | 9.9% | 1.6% | 51.1% | 39.9% | 11.2% |
Dallas Keuchel | 1,371 | 9.3% | 9.9% | -0.6% | 56.4% | 45.5% | 10.9% |
Elieser Hernandez | 293 | 10.2% | 9.8% | 0.4% | 32.8% | 34.6% | -1.8% |
Brent Suter | 1,026 | 10.6% | 9.8% | 0.8% | 33.5% | 35.5% | -2.0% |
Ian Kennedy | 1,212 | 9.0% | 9.8% | -0.8% | 30.3% | 37.4% | -7.1% |
Andrew Cashner | 1,282 | 8.5% | 9.8% | -1.3% | 39.6% | 42.9% | -3.2% |
Kyle Hendricks | 1,101 | 9.6% | 9.8% | -0.2% | 48.9% | 40.8% | 8.1% |
Jhoulys Chacin | 1,165 | 8.9% | 9.8% | -0.8% | 40.5% | 42.0% | -1.5% |
Eric Skoglund | 791 | 9.0% | 9.7% | -0.8% | 43.7% | 42.4% | 1.3% |
Mike Leake | 1,201 | 8.7% | 9.7% | -0.9% | 47.7% | 53.0% | -5.2% |
Hector Santiago | 607 | 8.9% | 9.7% | -0.8% | 37.9% | 40.8% | -2.9% |
Martin Perez | 417 | 6.2% | 9.6% | -3.4% | 47.8% | 46.6% | 1.3% |
J.A. Happ | 1,290 | 12.1% | 9.6% | 2.5% | 46.7% | 39.1% | 7.6% |
Zack Greinke | 1,203 | 12.3% | 9.6% | 2.7% | 41.3% | 39.7% | 1.5% |
Ben Lively | 419 | 8.6% | 9.6% | -1.0% | 30.0% | 35.9% | -5.9% |
Miguel Gonzalez | 257 | 7.8% | 9.6% | -1.8% | 38.2% | 40.5% | -2.4% |
Joey Lucchesi | 650 | 12.0% | 9.5% | 2.5% | 43.5% | 44.0% | -0.6% |
Jaime Garcia | 904 | 9.5% | 9.5% | 0.0% | 42.2% | 45.2% | -2.9% |
Tyler Skaggs | 1,273 | 11.5% | 9.5% | 2.1% | 47.2% | 39.8% | 7.4% |
Julio Teheran | 1,144 | 11.0% | 9.4% | 1.6% | 39.1% | 42.7% | -3.6% |
Mike Fiers | 1,071 | 9.0% | 9.4% | -0.5% | 39.9% | 36.0% | 3.9% |
Felix Hernandez | 1,281 | 8.4% | 9.4% | -1.0% | 44.0% | 48.7% | -4.7% |
Jason Vargas | 536 | 11.6% | 9.4% | 2.2% | 35.3% | 41.0% | -5.7% |
Rick Porcello | 1,307 | 9.9% | 9.4% | 0.6% | 48.5% | 46.6% | 2.0% |
Jake Arrieta | 1,082 | 7.9% | 9.3% | -1.4% | 56.1% | 49.5% | 6.6% |
Tanner Roark | 1,233 | 10.3% | 9.3% | 1.0% | 47.5% | 39.7% | 7.8% |
Jose Quintana | 1,098 | 9.4% | 9.2% | 0.1% | 44.5% | 40.2% | 4.3% |
Eric Lauer | 664 | 8.0% | 9.2% | -1.2% | 30.6% | 35.5% | -5.0% |
Drew Pomeranz | 725 | 8.1% | 9.1% | -1.0% | 40.5% | 45.7% | -5.2% |
Alex Cobb | 909 | 7.5% | 9.0% | -1.6% | 51.0% | 45.3% | 5.7% |
Ivan Nova | 1,025 | 10.3% | 9.0% | 1.3% | 52.1% | 52.1% | 0.0% |
Samuel Gaviglio | 358 | 10.6% | 8.9% | 1.8% | 56.5% | 48.1% | 8.5% |
Aaron Sanchez | 1,231 | 10.6% | 8.8% | 1.7% | 51.6% | 48.5% | 3.1% |
Zach Davies | 730 | 8.5% | 8.8% | -0.3% | 48.5% | 46.6% | 1.9% |
Clayton Richard | 1,281 | 10.1% | 8.6% | 1.5% | 58.0% | 57.8% | 0.2% |
Gio Gonzalez | 1,217 | 10.5% | 8.3% | 2.2% | 53.6% | 46.3% | 7.3% |
Doug Fister | 1,112 | 5.7% | 8.3% | -2.6% | 50.9% | 48.2% | 2.7% |
Adam Wainwright | 350 | 6.3% | 8.2% | -1.9% | 54.5% | 42.4% | 12.2% |
Bartolo Colon | 1,012 | 5.6% | 7.4% | -1.7% | 46.6% | 48.0% | -1.4% |
Rich Hill | 430 | 8.1% | 7.3% | 0.9% | 32.0% | 35.2% | -3.2% |
Cole Hamels | 1,287 | 12.9% | 7.1% | 5.8% | 42.8% | 44.0% | -1.2% |
Josh Tomlin | 503 | 9.7% | 6.9% | 2.8% | 27.7% | 39.6% | -11.9% |
Data current as of June 10.
Holy Luis Severino.
Garrett Richards is the only pitcher with top-10 xSwStr% and xGB% rates. Maybe if he wasn’t averaging more than one wild pitch per game…
Other names who fall in the top-20 of each: Lance McCullers and… Domingo German.
Double top-30 guys: Charlie Morton, Carlos Carrasco, Luis Castillo.
These lists generally pass the smell test, but it’s obviously worth wondering why guys like Scherzer, Corey Kluber, etc. don’t grade out better here. Don’t ask me! I’m not a computer. But, as aforementioned, using only velocity and movement to “predict” these kinds of things can be limiting. Scherzer’s 12.3% xSwStr% doesn’t mean his whiff rate will fall one-third by year’s end. Take it all with a grain of salt. If anything, I’d consider interpreting the numbers as if they were arsenal scores — but, like, peripherals-based rather than outcomes-based arsenal scores.
And remember, the xGB% estimates are really noisy.
Takeaways
Is this model predictive? I don’t know. It’s hard enough for a pitcher to repeat his own pitch, let alone for another to try to replicate it. The average values used in this model paint a picture of a pitcher’s “typical” performance. It essentially assumes he’s static, which offends my sabermetric sensibilities. But I don’t know how else to approach this, really. It takes a few hundred pitches for any specific pitch type to become statistically reliable. I don’t know how theoretically defensible it is to cherry-pick velocity and movement from April and extrapolate it. Baseball players are not robots.
Fact of the matter is I don’t know how else to use these results other than, hey, this guy’s xSwStr% is way lower than his actual SwStr% for his curve, so maybe it’s overperforming a little bit. In this scenario, I’ll willingly (ignorantly) conflate descriptiveness and predictiveness. But also, admittedly, I have no idea how worthwhile that is. Maybe the only purpose of this is to simply exist as having been pursued. Mostly, I’m using it as a way to reconsider interesting names I had otherwise disregarded in some capacity (Fernando Romero, Domingo German, German Marquez, Reynaldo Lopez) and to keep the faith in others (Luis Castillo, maybe).
Aaron Nola and Kyle Hendricks look extremely mediocre here.
But they are command artists who make a living off of called strikes anyway: https://www.fangraphs.com/fantasy/aaron-nolas-sinker-and-the-called-strike/