The longer we have Statcast data at our disposal, the more novel uses we find for them. What follows is my proposal to use the difference between expected and actual value, segmented by batted ball type and venue, to determine park factors (and potentially to evaluate defensive value, as described in the footnote). Unfortunately, someone smarter than me was already way ahead of me. I'll get to that in a second.
A typical park factors grid, such as those produced by ESPN or FanGraphs, commonly relies on outcomes — outcomes of plate appearances (ESPN), batted ball categories, or both (FanGraphs). They describe what actually occurred, the way wOBA describes a hitter's actual production. By contrast, expected wOBA (xwOBA) describes what should have occurred based on a batted ball's exit velocity (EV) and launch angle (LA). It strips away everything else, removing all other environmental factors in order to deliver a context-neutral, EV/LA-based value.
The difference between wOBA and xwOBA ("wOBA minus xwOBA," or wOBA-xwOBA for short) therefore effectively captures all value gained or lost to other variables. In other words, if wOBA describes what actually happened in a non-neutral environment, and xwOBA describes what should have happened in a neutral one, then the difference between them characterizes the effect of the environment: the ballpark itself.
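To make the arithmetic concrete, here is a minimal sketch of that per-park differential in Python. The record layout (`park`, `woba_value`, `xwoba_value`) and the sample numbers are hypothetical for illustration, not Statcast's actual schema.

```python
# Minimal sketch: mean per-park wOBA-xwOBA from individual batted balls.
# Positive = actual outcomes beat expectations (hitter-friendly environment);
# negative = the park suppressed value relative to EV/LA expectations.
from collections import defaultdict

def park_differentials(batted_balls):
    """Return mean (wOBA - xwOBA) per park."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for bb in batted_balls:
        sums[bb["park"]] += bb["woba_value"] - bb["xwoba_value"]
        counts[bb["park"]] += 1
    return {park: sums[park] / counts[park] for park in sums}

# Invented sample rows (not real Statcast values):
sample = [
    {"park": "Comerica", "woba_value": 0.0, "xwoba_value": 0.45},    # fly ball run down
    {"park": "Comerica", "woba_value": 0.9, "xwoba_value": 1.10},
    {"park": "MinuteMaid", "woba_value": 2.0, "xwoba_value": 1.40},  # cheap home run
]
diffs = park_differentials(sample)
# -> Comerica about -0.325 (pitcher-friendly), MinuteMaid about +0.600 (hitter-friendly)
```

A real version would, of course, aggregate tens of thousands of batted balls per park per season rather than a handful of toy rows.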
Unfortunately for me (but fortunately for everyone else), Tony Blengino already did this (which is why he’s a former MLB executive and I’m not). In 2017, he used Statcast data to calculate park factors on the basis of expected outcomes relative to actual league-average production. For all intents and purposes, it’s the same idea.
Consider this post a refresher on the topic.
Let me call your attention back to a simpler time. If you search "Miguel Cabrera xwOBA" on Twitter, you'll find, well, not a multitude, but at least a sampling of tweets from the summer of 2017 lamenting Cabrera's (and his teammate's) bad luck as measured by wOBA-xwOBA:
By Statcast's xwOBA, the two hitters who have ran the worst are both Detroit Tigers: Nick Castellanos and Miguel Cabrera. pic.twitter.com/z9SXYOwUU6
— Davis Mattek (@DavisMattek) June 22, 2017
Miguel Cabrera has the largest discrepancy between his wOBA and xwOBA. In fact, his is 26 points larger than 2nd place Matt Carpenter.
— Garion Thorne (@GarionThorne) July 18, 2017
Largest xwOBA vs wOBA gap in 2017 (min 200 AB)? Miguel Cabrera .389 vs .330, 59 point difference. Next is K. Morales & Avila at 37 points.
— Derrick Boyd (@DerrickBHQ) August 29, 2017
The 2017 season marked Cabrera's second consecutive hard-luck season; he had posted the worst wOBA-xwOBA differential in 2016, too, among hitters with at least 350 plate appearances. Eno Sarris even dug into the issue in a piece aptly titled "Buying Low on Miguel Cabrera," looking at, among other things, differences in exit velocities among Tigers hitters and ambient temperatures by venue as possible explanations for this statistical oddity.
I distinctly remember this discussion on Twitter, yet I can't find evidence it ever existed. I recall extensive research and speculation regarding the "hotness" of the radar guns in Detroit. Indeed, evidence from previous years suggests, but doesn't confirm, that the guns in Detroit ran (and maybe still run) hot: Comerica Park's xwOBA on fly balls ranked highest in 2015 and 2016 and 6th-highest in 2017. In 2018, however, it ranked 7th-lowest, suggesting it's not (or was, but is no longer) a consistent issue. This one-year anomaly could be tied to hitter quality in Detroit, but the absence of Cabrera (or anyone else noteworthy) in 2018 likely did not move the needle; hundreds of other hitters graced its halls with tens of thousands more plate appearances. Even if we toss out 2015 and its problematic measurements (it was Statcast's first year), the difference between fly ball xwOBA at Comerica Park in 2016 (1st) and 2018 (24th) is stark and cuts against the "hot gun" argument.
No matter the reason for high levels of expected performance on fly balls in Detroit (natural variance? radar recalibrations?), actual performance has lagged by significant and consistent margins. (In the event of a recalibration, I would expect to see wOBA-xwOBA shrink, which it does not. At this point, that's neither here nor there.)
Maybe Comerica Park is just a bad hitters’ park.
Furthermore: fly ball park factors calculated strictly from actual production (wOBA) for Minute Maid Park (Houston) and the stadium formerly known as Safeco Field (Seattle) would be roughly league-average in 2018 (15th and 16th, respectively, among the 30 ballparks). Instead, when calculated from the difference between wOBA and xwOBA, Safeco Field had the 7th-worst margin (i.e., favoring pitchers) and Houston the 3rd-best (favoring hitters). (Blengino notes in one of his posts linked earlier that Houston "might be the cheap home run capital of the big leagues.") Consider 2017: Houston was 16th in fly ball wOBA (neutral) and Seattle was 24th (pitcher-friendly), a very different outcome for the latter compared to 2018. Yet their wOBA-xwOBA "factors" remained consistent, with Houston extremely hitter-friendly (4th-best) and Seattle extremely pitcher-friendly (4th-worst). I'm highlighting the effects of run-scoring variance here, which Tom Tango, one of the purveyors of Statcast data, conveniently discussed on his blog very recently.
Blengino already did it, but I think it should become commonplace: calculate park factors as the difference between expected and actual production, as opposed to jumping through hoops trying to hold constant an array of variables while regressing run-scoring outcomes. wOBA-xwOBA strips each park of the quality of the pitchers and hitters who play there and merely characterizes the premium or penalty applied to the value of an "average" batted ball, per its EV/LA bucket.
If I can add any value at this point, it might be this one minor recommendation. Blengino calculated differences in expected and actual outcomes by ballpark but (to my knowledge) did not neutralize the batted ball distribution. For example, a fly ball park factor for Yankee Stadium, which hosts a handful of the game's premier sluggers, might be slightly inflated because Yankee hitters are especially likely to pull the ball in the air. Instead, we should compute the wOBA-xwOBA value of each batted ball type and then apply those values to a league-average, rather than park-specific, distribution of batted balls, further controlling for hitter/pitcher quality. Still, what Blengino did is the best step I've seen so far.
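That recommendation can be sketched in a few lines. The batted-ball shares and per-type differentials below are invented for illustration; real inputs would come from Statcast.

```python
# Sketch of the proposed refinement: take each park's wOBA-xwOBA by batted ball
# type, then weight by the LEAGUE-average batted-ball mix instead of the park's
# own mix, so a roster that lofts (or chops) more than average doesn't skew it.

def neutralized_park_factor(park_diffs_by_type, league_mix):
    """park_diffs_by_type: {bb_type: mean wOBA-xwOBA at this park}
    league_mix: {bb_type: league-average share of batted balls}, shares sum to 1."""
    return sum(league_mix[t] * park_diffs_by_type[t] for t in league_mix)

# Illustrative (made-up) numbers:
league_mix = {"GB": 0.44, "LD": 0.25, "FB": 0.31}
yankee_stadium = {"GB": 0.004, "LD": -0.010, "FB": 0.025}

factor = neutralized_park_factor(yankee_stadium, league_mix)
# Weighted sum: 0.44*0.004 + 0.25*(-0.010) + 0.31*0.025 = +0.00701
```

The per-type differentials themselves would still be computed from the park's own batted balls; only the weighting mix is swapped for the league's.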
Here are the wOBA-xwOBA differentials by venue and year for each batted ball type (in separate, consecutive tables). Note that I haven't actually calculated anything here beyond presenting what can be readily pulled from Baseball Savant; that's a project for another day. Even in their crudest, most basic form, not yet indexed against the league average, these incredibly high-level numbers still show how different parks play up or play down specific batted ball types with some modicum of consistency. (Reminder: positive values mean actual outcomes exceeded expected outcomes; negative values, vice versa. Also, click the column headers to sort, for ease of analysis/digestion.)
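Since the raw differentials are not yet indexed against the league average, a natural next step might look like the following. The park names and values are placeholders, and this uses a simple unweighted mean of park values as the "league average"; a real version would weight by batted balls.

```python
# Sketch: re-center raw per-park wOBA-xwOBA differentials on the league average,
# so 0 means "plays like an average park" regardless of league-wide conditions.
# Values below are made up for illustration.

def index_against_league(park_diffs):
    league_avg = sum(park_diffs.values()) / len(park_diffs)  # unweighted, for simplicity
    return {park: d - league_avg for park, d in park_diffs.items()}

raw = {"Comerica": -0.020, "MinuteMaid": 0.015, "Safeco": -0.025}
indexed = index_against_league(raw)
# league_avg = -0.010, so: Comerica -0.010, MinuteMaid +0.025, Safeco -0.015
```

This is the "index against the league average" step deferred in the text, nothing more.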
Ultimately, park factors based on wOBA-xwOBA should give us a better idea of how hitters and pitchers are actually affected when moving to a new home park this offseason. Nelson Cruz gets a modest upgrade. Carlos Santana gets a huge downgrade, but on the basis of line drives, not fly balls. Charlie Morton, meanwhile, gets a fantastic upgrade in Tampa Bay (er, St. Petersburg, technically). Those two moves conform to our existing expectations of the ballparks in question. Conversely, Anibal Sanchez's new home in Washington is decidedly worse than his previous home in Atlanta, as is Matt Shoemaker's new home in Toronto, even though ESPN pegs both moves as nearly neutral per 2018 park factors.
On Measuring Defensive Value, Briefly
Ground ball efficacy by ballpark is all over the place. There are two, and probably only two, factors at play here: lateral launch angle variance (i.e., pull vs. center vs. oppo), which I imagine is mostly luck-based; and infielder defensive skill, which is skill-based (but, in terms of difficulty of opportunities, subject to the variance of the aforementioned lateral launch angle).
In 2018, the Athletics posted the best single-season wOBA-xwOBA on ground balls (-0.036) since the start of 2015. Meanwhile, the Yankees generated their worst ground ball wOBA-xwOBA (+0.019) in the same four-year span. Matt Chapman and Miguel Andujar were, almost universally, considered the best and worst defensive third basemen, respectively.
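A crude version of that defensive read could look like the following sketch. The field names and sample rows are invented, and a real analysis would need to adjust for the spray-angle luck noted above before crediting the infield.

```python
# Crude sketch: team-level wOBA-xwOBA on ground balls as an infield-defense proxy.
# Negative = the defense converted more outs than EV/LA alone predicts (good);
# positive = grounders leaked through for more value than expected (bad).

def team_gb_differential(batted_balls):
    totals = {}
    for bb in batted_balls:
        if bb["bb_type"] != "ground_ball":
            continue  # scope the defensive credit to grounders only
        s, n = totals.get(bb["def_team"], (0.0, 0))
        totals[bb["def_team"]] = (s + bb["woba_value"] - bb["xwoba_value"], n + 1)
    return {team: s / n for team, (s, n) in totals.items()}

# Invented rows (not real Statcast values):
sample = [
    {"def_team": "OAK", "bb_type": "ground_ball", "woba_value": 0.0, "xwoba_value": 0.25},
    {"def_team": "OAK", "bb_type": "fly_ball",    "woba_value": 2.0, "xwoba_value": 1.60},
    {"def_team": "NYY", "bb_type": "ground_ball", "woba_value": 0.9, "xwoba_value": 0.30},
]
gb = team_gb_differential(sample)
# -> OAK grounders about -0.25 (expected hits turned into outs); NYY about +0.60
```

Splitting the same computation by fielder responsibility zones, rather than whole teams, would be the granularity the closing paragraph says is still missing.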
In sum, I think there's validity to using wOBA-xwOBA to measure defensive aptitude among infielders, but I don't have granular enough data or analysis to make any quantitatively defensible assertions yet.