Reconciling Pitcher (x)BABIP and Hard Contact Allowed
This is a long one. I appreciate your patience in advance.
Mike Podhorzer, a few sporadic others and I (but primarily Mike) have carried the torch on developing ‘expected’ metrics, such as xBABIP (expected batting average on balls in play), xHR/FB (expected home run-to-fly ball ratio) and xK% (expected strikeout rate), all of which, along with the rest, can be found here. For the uninitiated, these xMetrics help describe how a hitter or pitcher should have performed based on various measurements of the events that unfolded, and they typically are more predictive of future performance than the original metric. They’re not perfect, but, like other advanced metrics, they give us a better understanding of player performance and ability.
Each metric — xHR/FB, xK%, etc. — has formulas for both hitters and pitchers, with the hitter metrics typically having stronger correlations than those for pitchers. Unfortunately, pitcher xBABIP has always eluded us. It’s inappropriate to repurpose hitter xBABIP for pitchers, but it’s because the model coefficients (weights) would be different, not because the theory underpinning the model is flawed.
That’s the problem, though: hard hits, line drives and infield fly balls all should affect a pitcher’s BABIP allowed. Our intuition begs it to be true. Yet there’s a resounding lack of evidence to support it. The correlation between BABIP and hard-hit rate (Hard%), line drive rate (LD%) and infield fly ball rate (IFFB%), among others, borders on nonexistent:
- LD%: Line drive rate correlates very weakly with BABIP, producing a 0.12 R2. It’s better than nothing, but, unfortunately, LD% has virtually no correlation from year to year, something that rplunkett97 documented here. In other words, line drives do affect pitcher BABIP, but they’re worthless to use in a predictive manner, which is typically the reason why we use these metrics.
- IFFB%: Infield fly ball rate does not correlate with BABIP, producing a 0.05 R2. Using IFFB% in a vacuum ignores how frequently a pitcher allows fly balls in the first place; converting it to pop-up rate (PU%, hypothetically) by multiplying IFFB% by overall fly ball rate (FB%) generates a metric that correlates very weakly with BABIP, producing a 0.10 R2. It should have a much stronger correlation, given pop-ups are basically free outs.
- Hard%: Hard-hit rate does not correlate with BABIP, producing a 0.03 R2. This will be the impetus for the rest of this post.
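The pop-up conversion in the IFFB% bullet is simple arithmetic, and each R2 quoted above is a squared correlation. Here is a minimal sketch of both, assuming rates are stored as fractions. The five pitcher rows are made-up numbers for illustration only (not real data), and the function names `popup_rate` and `r_squared` are hypothetical.

```python
import numpy as np

def popup_rate(iffb_pct, fb_pct):
    """Pop-ups per batted ball: IFFB% (pop-ups per fly ball) times FB%."""
    return np.asarray(iffb_pct) * np.asarray(fb_pct)

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length arrays."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

# Hypothetical pitchers (illustrative values only, not real data)
iffb = np.array([0.08, 0.12, 0.10, 0.15, 0.06])
fb = np.array([0.30, 0.40, 0.35, 0.45, 0.28])
babip = np.array([0.310, 0.285, 0.300, 0.270, 0.315])

pu = popup_rate(iffb, fb)
print(np.round(pu, 4))
print(round(r_squared(pu, babip), 3))
```

With real leaderboard data in place of the toy arrays, the same two functions would reproduce the kind of comparison described in the bullets.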
All together, these metrics do very little to tell us anything meaningful about pitcher BABIP allowed. Line drives result in hits almost 70% of the time and infield flies, for all intents and purposes, are automatic outs. Yet we struggle to demonstrate any kind of meaningful relationship between them and pitcher BABIP. Likewise, you’d think a pitcher allowing frequent hard contact should struggle more than one who doesn’t, yet no dice.
Enter Statcast. It has revolutionized, or at least begun to revolutionize, how we consume player data. (You may or may not be aware of the alleged Launch Angle Revolution in progress.) There are so many tools at our disposal (exit velocity, launch angle, barrels and so on) that we can basically predict a batted ball’s hit probability based on the results of previous batted balls with similar exit velocities and launch angles. (Statcast’s expected wOBA, or xwOBA, tells us not only the probability of a hit but also the probable value of that hit.)
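To make the "similar exit velocities and launch angles" idea concrete, here is a rough sketch that buckets batted balls and takes the share of hits in each bucket. This is an illustration of the general idea, not Statcast's actual model; the bucket widths, the function names and the seven batted balls are all assumptions invented for the example.

```python
from collections import defaultdict

def bin_key(exit_velo, launch_angle, ev_width=5.0, la_width=10.0):
    """Map a batted ball to a coarse (exit velocity, launch angle) bucket."""
    return (int(exit_velo // ev_width), int(launch_angle // la_width))

def fit_hit_probabilities(batted_balls):
    """batted_balls: iterable of (exit_velo_mph, launch_angle_deg, was_hit)."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for ev, la, was_hit in batted_balls:
        key = bin_key(ev, la)
        total[key] += 1
        hits[key] += int(was_hit)
    # Share of past batted balls in each bucket that went for hits
    return {key: hits[key] / total[key] for key in total}

# Hypothetical history (illustrative values only, not real data)
history = [
    (101.2, 12.0, True), (103.8, 14.5, True), (100.4, 18.0, False),
    (84.0, 55.0, False), (82.5, 58.0, False),
    (96.0, -8.0, False), (97.5, -4.0, True),
]
probs = fit_hit_probabilities(history)

# Estimated hit probability for a new 102 mph, 15-degree batted ball
print(probs.get(bin_key(102.0, 15.0)))
```

Averaging those per-ball probabilities over everything a pitcher allows gives an xBA-style number, which is the sense in which such a metric "credits" the pitcher for launch angle as well as exit velocity.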
The granularity of this data has seemingly demystified many lingering sabermetric arguments. Yet… yet. I’m still struggling to grapple with the enigma that is Robbie Ray. (Yeah, it’s [yet] another post about Robbie Ray and BABIP.) Somehow, it always comes back to Ray.
Ray got BABIP’d to death last year, allowing a historically bad .352 mark in a full season’s work. Now, his BABIP is 75 points lower than last season (50 points lower than his career mark), and Statcast validates it: his .206 expected batting average (xBA) stays true to his actual .203 batting average allowed, even though he allows hard contact at a rate that almost paces the league (as sorted by 95 MPH+ percentage). Statcast, with its robust and granular data, accounting for exit velocities and launch angles and all that, defends Ray’s performance.
I want to defend it, too, but I’m still not totally sold. It’s really hard for me to swallow the fact that it’s somehow OK that he allows as much hard contact as he does. Statcast’s leaderboard, as previously linked, helpfully provides many of its cornerstone metrics in shorthand as percentages or averages. I’ll use a snippet from Tony Blengino’s contact survivors piece a month ago to further my cause:
Well, first off, let’s not get overly excited. Ray has been extremely lucky across all BIP types, with his Unadjusted Contact Score well below his adjusted mark on flies (92 vs. 119), liners (62 vs. 110), grounders (104 vs. 132) and overall (81 vs. 115). Hitters are batting only .500 AVG-.690 SLG on liners, compared to the MLB average of .650 AVG-.869 SLG. Not a lot of skill in that; Ray’s average liner authority of 96.3 mph is second highest of the above hurlers.
I wanted to strip away all of the extra context and focus exclusively on exit velocity. Such focus assumes that pitchers don’t necessarily have control over where hitters put the ball in play. I think this is more truthful than not. Data-driven pitchers probably understand how their offerings fare by hitter handedness, pitch location in or out of the zone, etc., but they can’t reasonably coax a particular batted ball type, let alone a particular batted ball type in a particular direction, at any given moment. Statcast’s xwOBA and xBA credit the pitcher for these phenomena, credit I don’t think pitchers necessarily deserve in full.
Statcast’s leaderboard simplifies its robust database, but its shorthand representations made it easy for me to run a quick regression of BABIP on the percentage of batted ball events (BBE) hit 95+ MPH (among pitchers in 2017 who have allowed at least 190 BBE). Nothing revolutionary, just putting what I perceive to be the collective intuition into code. The correlation was weak bordering on moderate, registering a 0.16 R2; incorporating barrel percentage (barrels per BBE, or Brls/BBE) improves the R2 to 0.19, which is encroaching upon legitimate “moderate correlation” territory. (There’s a moderate-to-strong correlation between 95+ MPH% and Brls/BBE.) It’s not a lot, but it’s the best I’ve seen. The correlations are weaker for 2016, which suggests to me MLBAM’s tools and technologies have improved (unsurprisingly).
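For anyone who wants to try this at home, here is a minimal sketch of the two specifications using ordinary least squares in NumPy. It is not the exact code behind the numbers above. The ten rows are copied from the table in this post purely to keep the example self-contained, so the R2 values it prints will not match the 0.16 and 0.19 from the full sample; the helper name `fit_ols` is hypothetical.

```python
import numpy as np

# Ten rows copied from the table in this post: (95 MPH+%, Brls/BBE, BABIP)
rows = [
    (0.314, 0.043, 0.217),  # Ervin Santana
    (0.424, 0.070, 0.275),  # Robbie Ray
    (0.388, 0.116, 0.263),  # Hector Santiago
    (0.262, 0.021, 0.254),  # Alex Wood
    (0.324, 0.051, 0.288),  # Corey Kluber
    (0.305, 0.062, 0.292),  # Chris Sale
    (0.421, 0.100, 0.336),  # Trevor Bauer
    (0.198, 0.017, 0.272),  # Brandon McCarthy
    (0.431, 0.102, 0.347),  # Matt Moore
    (0.368, 0.087, 0.371),  # Kevin Gausman
]
data = np.array(rows)
hard, barrels, babip = data[:, 0], data[:, 1], data[:, 2]

def fit_ols(features, target):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    X = np.column_stack([np.ones(len(target))] + list(features))
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    fitted = X @ coef
    ss_res = np.sum((target - fitted) ** 2)
    ss_tot = np.sum((target - target.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

coef1, r2_1 = fit_ols([hard], babip)           # 95+ MPH% only
coef2, r2_2 = fit_ols([hard, barrels], babip)  # 95+ MPH% and Brls/BBE
print(f"95+ MPH% only:         R^2 = {r2_1:.3f}")
print(f"95+ MPH% and Brls/BBE: R^2 = {r2_2:.3f}")
```

A pitcher's xBABIP under either specification is then just the fitted value: the intercept plus each coefficient times his rate.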
Accordingly, I calculated new expected BABIPs (hey, xBABIP!) for each pitcher based on both specifications. The former (95+ MPH% only) for Ray: .325 xBABIP. The latter (95+ MPH% and Brls/BBE): .332 xBABIP. These are deliberately simplified models, but they fundamentally contradict what Statcast is telling me, telling us. Ray has the 7th-largest discrepancy between his BABIP and xBABIP; the other six have BABIPs lower than .230, so it’s understandable why these models doubt them (looking at you, Ervin Santana).
I fully support having more data rather than less; that goes without saying. But I think we lost the forest for the trees. This isn’t to say these results override Statcast’s ‘expected’ metrics. It’s just that these results are tangible and support what I think many of us intuit about pitchers who allow hard contact: that if we see it on one side of the coin (for hitters), we should somehow see it manifest on the other side as well. It can be theoretically divisive, though, depending on whether you think, like I do, that pitchers have less control over where batted balls go upon contact than we may assume. The easy but probably most honest answer is that the truth is likely somewhere in the middle, blurred by variance.
So, that’s it. It’s not a big revelation; it just makes me feel a little better about believing that contact quality allowed actually means something. Although I had defended Ray and his inflated BABIP previously, I have had a hard time defending his low BABIP (and remarkably high strand rate) now.
All that said, I can’t leave you hanging without a table. Here’s your table of BABIPs versus xBABIPs.
Player | 95 MPH+% | Brls/BBE | BABIP | xBABIP 1 | xBABIP 2 | avg | diff |
---|---|---|---|---|---|---|---|
Ervin Santana | .314 | .043 | .217 | .289 | .296 | .292 | .075 |
Ian Kennedy | .327 | .102 | .215 | .293 | .277 | .285 | .070 |
Ariel Miranda | .316 | .072 | .220 | .289 | .285 | .287 | .067 |
Dallas Keuchel | .304 | .052 | .222 | .286 | .288 | .287 | .065 |
Lance Lynn | .303 | .077 | .220 | .285 | .277 | .281 | .061 |
Max Scherzer | .299 | .054 | .226 | .284 | .285 | .284 | .058 |
Robbie Ray | .424 | .070 | .275 | .325 | .332 | .329 | .054 |
Matt Harvey | .349 | .087 | .252 | .300 | .293 | .297 | .045 |
Hector Santiago | .388 | .116 | .263 | .313 | .298 | .306 | .043 |
Jake Odorizzi | .346 | .083 | .256 | .299 | .293 | .296 | .040 |
Jeremy Hellickson | .312 | .070 | .248 | .288 | .284 | .286 | .038 |
Ivan Nova | .358 | .063 | .267 | .303 | .306 | .305 | .038 |
Antonio Senzatela | .342 | .063 | .264 | .298 | .300 | .299 | .035 |
Carlos Martinez | .347 | .063 | .266 | .300 | .302 | .301 | .035 |
Ubaldo Jimenez | .376 | .089 | .272 | .309 | .304 | .307 | .035 |
Mike Leake | .356 | .052 | .272 | .303 | .310 | .306 | .034 |
Dylan Bundy | .368 | .084 | .271 | .307 | .302 | .304 | .033 |
Sean Manaea | .391 | .047 | .289 | .314 | .327 | .321 | .032 |
John Lackey | .367 | .079 | .274 | .306 | .304 | .305 | .031 |
Sonny Gray | .388 | .058 | .287 | .313 | .321 | .317 | .030 |
Nick Martinez | .311 | .078 | .254 | .288 | .280 | .284 | .030 |
Jesse Chavez | .401 | .089 | .286 | .318 | .314 | .316 | .030 |
Yu Darvish | .327 | .064 | .263 | .293 | .293 | .293 | .030 |
Jose Urena | .290 | .089 | .245 | .281 | .267 | .274 | .029 |
Clayton Kershaw | .283 | .068 | .248 | .279 | .272 | .275 | .027 |
Dan Straily | .266 | .067 | .243 | .273 | .265 | .269 | .026 |
Gio Gonzalez | .300 | .055 | .259 | .284 | .285 | .285 | .026 |
Alex Wood | .262 | .021 | .254 | .272 | .282 | .277 | .023 |
Mike Pelfrey | .347 | .063 | .278 | .300 | .302 | .301 | .023 |
Mike Fiers | .328 | .077 | .269 | .293 | .288 | .291 | .022 |
Alex Cobb | .343 | .051 | .282 | .298 | .305 | .302 | .020 |
CC Sabathia | .339 | .047 | .282 | .297 | .305 | .301 | .019 |
Tim Adleman | .315 | .062 | .270 | .289 | .288 | .289 | .019 |
Matt Shoemaker | .344 | .094 | .276 | .299 | .288 | .293 | .017 |
Edinson Volquez | .327 | .054 | .278 | .293 | .297 | .295 | .017 |
Ricky Nolasco | .401 | .100 | .298 | .318 | .310 | .314 | .016 |
Derek Holland | .411 | .091 | .307 | .321 | .318 | .319 | .012 |
Jose Berrios | .276 | .061 | .262 | .276 | .272 | .274 | .012 |
Jaime Garcia | .339 | .052 | .288 | .297 | .303 | .300 | .012 |
Taijuan Walker | .355 | .048 | .295 | .302 | .311 | .307 | .012 |
Andrew Cashner | .313 | .034 | .282 | .288 | .299 | .294 | .012 |
Kyle Freeland | .331 | .046 | .287 | .294 | .302 | .298 | .011 |
Miguel Gonzalez | .371 | .068 | .298 | .308 | .310 | .309 | .011 |
Julio Teheran | .295 | .070 | .269 | .283 | .277 | .280 | .011 |
JC Ramirez | .354 | .066 | .294 | .302 | .304 | .303 | .009 |
Matt Garza | .336 | .052 | .290 | .296 | .301 | .299 | .009 |
Erasmo Ramirez | .367 | .085 | .296 | .306 | .301 | .304 | .008 |
Tyler Chatwood | .292 | .050 | .275 | .282 | .283 | .282 | .007 |
Jason Vargas | .288 | .040 | .276 | .280 | .286 | .283 | .007 |
Luis Severino | .354 | .061 | .297 | .302 | .306 | .304 | .007 |
Corey Kluber | .324 | .051 | .288 | .292 | .297 | .294 | .006 |
Adalberto Mejia | .322 | .080 | .282 | .291 | .284 | .288 | .006 |
R.A. Dickey | .275 | .060 | .269 | .276 | .272 | .274 | .005 |
Jordan Montgomery | .317 | .075 | .282 | .290 | .284 | .287 | .005 |
Andrew Triggs | .338 | .057 | .294 | .297 | .300 | .299 | .005 |
Johnny Cueto | .355 | .084 | .295 | .302 | .297 | .300 | .005 |
Stephen Strasburg | .318 | .073 | .285 | .290 | .285 | .288 | .003 |
Trevor Williams | .310 | .058 | .285 | .287 | .288 | .288 | .003 |
Jose Quintana | .348 | .055 | .301 | .300 | .305 | .303 | .002 |
Carlos Carrasco | .333 | .082 | .290 | .295 | .288 | .292 | .002 |
Zack Greinke | .274 | .069 | .271 | .276 | .268 | .272 | .001 |
Jharel Cotton | .304 | .043 | .288 | .286 | .291 | .288 | .000 |
Chase Anderson | .259 | .040 | .272 | .271 | .273 | .272 | .000 |
Michael Fulmer | .271 | .042 | .277 | .275 | .278 | .276 | -.001 |
Mike Foltynewicz | .328 | .066 | .294 | .293 | .292 | .293 | -.001 |
Aaron Nola | .306 | .045 | .290 | .286 | .291 | .289 | -.001 |
Marcus Stroman | .379 | .067 | .315 | .310 | .314 | .312 | -.003 |
Gerrit Cole | .344 | .087 | .298 | .299 | .291 | .295 | -.003 |
Yovani Gallardo | .339 | .055 | .304 | .297 | .302 | .299 | -.005 |
Michael Pineda | .332 | .061 | .302 | .295 | .296 | .295 | -.007 |
Chris Sale | .305 | .062 | .292 | .286 | .284 | .285 | -.007 |
James Paxton | .286 | .023 | .293 | .280 | .292 | .286 | -.007 |
Jhoulys Chacin | .296 | .063 | .289 | .283 | .280 | .281 | -.008 |
Jake Arrieta | .320 | .055 | .300 | .291 | .293 | .292 | -.008 |
Justin Verlander | .373 | .082 | .316 | .308 | .305 | .307 | -.009 |
Jacob deGrom | .298 | .066 | .292 | .284 | .280 | .282 | -.010 |
Chris Archer | .387 | .052 | .329 | .313 | .323 | .318 | -.011 |
Masahiro Tanaka | .373 | .097 | .316 | .308 | .299 | .304 | -.012 |
Tanner Roark | .350 | .056 | .316 | .301 | .306 | .303 | -.013 |
Ty Blach | .297 | .046 | .298 | .283 | .287 | .285 | -.013 |
Zach Davies | .335 | .064 | .310 | .296 | .296 | .296 | -.014 |
Joe Biagini | .318 | .047 | .307 | .290 | .296 | .293 | -.014 |
Chad Kuhl | .368 | .054 | .325 | .307 | .314 | .311 | -.014 |
German Marquez | .362 | .068 | .320 | .305 | .306 | .305 | -.015 |
Trevor Bauer | .421 | .100 | .336 | .324 | .319 | .321 | -.015 |
Lance McCullers | .300 | .042 | .303 | .284 | .290 | .287 | -.016 |
Jordan Zimmermann | .348 | .085 | .313 | .300 | .293 | .297 | -.016 |
Kenta Maeda | .257 | .036 | .289 | .270 | .274 | .272 | -.017 |
Mike Montgomery | .261 | .035 | .292 | .271 | .276 | .274 | -.018 |
Brandon McCarthy | .198 | .017 | .272 | .250 | .256 | .253 | -.019 |
Robert Gsellman | .371 | .060 | .331 | .308 | .313 | .310 | -.021 |
Jerad Eickhoff | .364 | .057 | .329 | .305 | .311 | .308 | -.021 |
Jesse Hahn | .348 | .048 | .326 | .300 | .308 | .304 | -.022 |
Matt Moore | .431 | .102 | .347 | .327 | .322 | .325 | -.022 |
Drew Pomeranz | .354 | .071 | .325 | .302 | .302 | .302 | -.023 |
Scott Feldman | .256 | .063 | .292 | .270 | .263 | .266 | -.026 |
Kyle Gibson | .382 | .088 | .335 | .311 | .307 | .309 | -.026 |
Bronson Arroyo | .309 | .102 | .305 | .287 | .270 | .278 | -.027 |
Joe Ross | .356 | .069 | .330 | .303 | .303 | .303 | -.027 |
Matt Cain | .347 | .053 | .330 | .300 | .306 | .303 | -.027 |
Jason Hammel | .337 | .083 | .321 | .296 | .289 | .293 | -.028 |
Marco Estrada | .348 | .086 | .325 | .300 | .293 | .296 | -.029 |
Hyun-Jin Ryu | .318 | .073 | .317 | .290 | .285 | .288 | -.029 |
Luis Perdomo | .362 | .051 | .339 | .305 | .313 | .309 | -.030 |
Zack Wheeler | .342 | .079 | .327 | .298 | .293 | .296 | -.031 |
Wade Miley | .364 | .057 | .341 | .305 | .311 | .308 | -.033 |
Joe Musgrove | .336 | .060 | .330 | .296 | .298 | .297 | -.033 |
Francisco Liriano | .333 | .071 | .328 | .295 | .293 | .294 | -.034 |
Jeff Samardzija | .309 | .053 | .323 | .287 | .290 | .288 | -.035 |
Danny Duffy | .283 | .054 | .313 | .279 | .278 | .278 | -.035 |
Patrick Corbin | .372 | .076 | .346 | .308 | .307 | .308 | -.038 |
Jimmy Nelson | .325 | .043 | .337 | .292 | .300 | .296 | -.041 |
Josh Tomlin | .363 | .069 | .347 | .305 | .306 | .306 | -.041 |
Jon Lester | .273 | .052 | .317 | .275 | .274 | .275 | -.042 |
Bartolo Colon | .394 | .072 | .360 | .315 | .318 | .317 | -.043 |
Rick Porcello | .360 | .082 | .346 | .304 | .300 | .302 | -.044 |
Clayton Richard | .311 | .044 | .338 | .288 | .294 | .291 | -.047 |
Adam Wainwright | .336 | .049 | .347 | .296 | .303 | .299 | -.048 |
Daniel Norris | .366 | .071 | .359 | .306 | .307 | .306 | -.053 |
Tyler Anderson | .316 | .088 | .337 | .289 | .278 | .284 | -.053 |
Martin Perez | .349 | .054 | .357 | .300 | .306 | .303 | -.054 |
Michael Wacha | .287 | .053 | .347 | .280 | .280 | .280 | -.067 |
Kevin Gausman | .368 | .087 | .371 | .307 | .301 | .304 | -.067 |
Data extracted during All-Star Break (prior to July 14, 2017 games)
xBABIP 1 = 95+ MPH% only
xBABIP 2 = 95+ MPH% and Brls/BBE
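The last two columns are not defined in the notes, but judging by the values, avg appears to be the mean of the two xBABIPs and diff appears to be avg minus actual BABIP. A short sketch under that assumption, using four rows copied from the table (the helper name `summarize` is hypothetical):

```python
rows = [
    # (player, BABIP, xBABIP 1, xBABIP 2), copied from the table above
    ("Ervin Santana", 0.217, 0.289, 0.296),
    ("Robbie Ray", 0.275, 0.325, 0.332),
    ("Chase Anderson", 0.272, 0.271, 0.273),
    ("Kevin Gausman", 0.371, 0.307, 0.301),
]

def summarize(babip, xbabip_1, xbabip_2):
    """Assumed definitions: avg of the two models, and avg minus actual BABIP."""
    avg = (xbabip_1 + xbabip_2) / 2
    return avg, avg - babip

results = [(name, *summarize(b, x1, x2)) for name, b, x1, x2 in rows]
# Largest positive diff first: BABIP furthest below what the models expect
results.sort(key=lambda r: r[2], reverse=True)
for name, avg, diff in results:
    print(f"{name:<16} avg={avg:.3f} diff={diff:+.3f}")
```

Because the inputs here are already rounded to three decimals, the last digit can differ from the table, which was presumably computed from unrounded values.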
Ray’s exit velocity seems like it’s coming at the same issue from a different angle as the recent discussion of Byung-ho Park. Park was running extremely high exit velocities on contact, so FanGraphs readers were shocked when the Twins released him, but a high average exit velocity doesn’t mean much if you’re rarely making contact. Similarly, Ray allowing a high average exit velocity doesn’t mean all that much when he’s allowing the third lowest rate of contact among qualified pitchers. At extremes like this, it appears that averages are not the best way of describing proficiency, as they lose sight of the number of times that no contact is made.
This is all moot though because BABIP only includes balls in play. Maybe you’re arguing that the denominator is low so BABIP is more prone to significant fluctuation in such an instance like for Ray? So perhaps it’s silly to overanalyze his BABIP when the answer is that it’s just random and is less likely to match his high EV allowed or Hard%?
Not random, just more noisy.