The In-Season Predictiveness of xwOBA

September 5, 2018

I use xwOBA as a leading indicator of good or bad things to come mid-season, for better or for worse. It’d be good to know if such reliance is truly warranted. I further talked myself into the idea when I wrote about several underperforming hitters in early June. Many of the names therein went on some serious heaters afterward, too. That was less prescience than playing the odds: the hitters underperforming xwOBA most extremely through two months always, always (in the Statcast Era™) bounce back to some degree.

It’s “predictive,” but not universally so, and only by virtue of common sense, in the same way a pitcher who allows a sub-.200 batting average on balls in play (BABIP) through two months could not reasonably sustain this high level of contact management. (There’s a discussion to be had here about the gambler’s fallacy, but I don’t think it necessarily applies to baseball. For another day.)

In terms of prior work, it’s all Baseball Prospectus’ Jonathan Judge (only a slight exaggeration): he compared xwOBA to BP’s DRA metric as well as FIP (fielding independent pitching), a much simpler ERA estimator, and showed xwOBA is hardly superior to the field, at least for pitching. However, the article only covered year-to-year, not in-season, correlations.

After our dear and departed (but not dead) Eno Sarris asked Judge if he had looked at in-season correlations specifically, and after our dear and departed (and also not dead) Mike Petriello reinforced the notion that xwOBA could serve as an in-season predictor of regression under certain circumstances, I figured it’s high time I just tackle the question.

So: How predictive is xwOBA of wOBA in-season? For hitters and for pitchers?

Hitters

From Baseball Savant I grabbed all hitter seasons…

from 2015 through 2017 with
at least 200 plate appearances prior to June 30 and
at least 200 plate appearances after July 1
(n = 499)

…and produced a correlation matrix for wOBA, xwOBA, and the wOBA-minus-xwOBA differential (where correlation is measured/denoted by the Pearson coefficient, r) for each half-season, labeled as “1H” (first half) and “2H” (second half). The 200-PA cutoff is completely arbitrary; I wanted to ensure samples within player-seasons were sufficiently large while ensuring the same for the sample of player-seasons. Increasing the thresholds to, say, 250 PA might have a profound impact on the results. (I doubt it, but still, it might.)
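A minimal sketch of the filtering and correlation steps described above, in Python. The field names (pa_1h, woba_1h, and so on) and the synthetic player-seasons are assumptions for illustration only; they are not Baseball Savant’s export format, and the numbers this prints have nothing to do with the tables in this post.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(rows, min_pa=200):
    """Correlate wOBA, xwOBA, and wOBA-minus-xwOBA across both halves.

    `rows` is a list of dicts, one per player-season, with hypothetical keys
    pa_1h, woba_1h, xwoba_1h, pa_2h, woba_2h, xwoba_2h.
    """
    # Keep only player-seasons that clear the PA threshold in both halves.
    kept = [r for r in rows if r["pa_1h"] >= min_pa and r["pa_2h"] >= min_pa]
    columns = {
        "1H wOBA": [r["woba_1h"] for r in kept],
        "1H xwOBA": [r["xwoba_1h"] for r in kept],
        "1H diff": [r["woba_1h"] - r["xwoba_1h"] for r in kept],
        "2H wOBA": [r["woba_2h"] for r in kept],
        "2H xwOBA": [r["xwoba_2h"] for r in kept],
        "2H diff": [r["woba_2h"] - r["xwoba_2h"] for r in kept],
    }
    return {a: {b: pearson_r(columns[a], columns[b]) for b in columns}
            for a in columns}


# Synthetic stand-in data, NOT real Statcast numbers.
random.seed(0)
rows = []
for _ in range(500):
    talent = random.gauss(0.320, 0.025)
    x1 = talent + random.gauss(0, 0.015)
    x2 = talent + random.gauss(0, 0.015)
    rows.append({
        "pa_1h": random.randint(150, 350),
        "pa_2h": random.randint(150, 350),
        "xwoba_1h": x1,
        "woba_1h": x1 + random.gauss(0, 0.020),
        "xwoba_2h": x2,
        "woba_2h": x2 + random.gauss(0, 0.020),
    })

matrix = correlation_matrix(rows)
for name, row in matrix.items():
    print(name, [round(v, 3) for v in row.values()])
```

In practice a data-frame library would handle this in a call or two; the hand-rolled version just makes the Pearson calculation and the PA filter explicit.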

In-Season xwOBA Correlations: Hitters

Metric	1H wOBA	1H xwOBA	1H diff	2H wOBA	2H xwOBA	2H diff
1H wOBA	1.000
1H xwOBA	0.783	1.000
1H diff	0.273	-0.385	1.000
2H wOBA	0.328	0.406	-0.141	1.000
2H xwOBA	0.423	0.629	-0.344	0.798	1.000
2H diff	-0.130	-0.324	0.308	0.363	-0.272	1.000

SOURCE: Statcast (Baseball Savant)

(I color-coded the cells to help illustrate the strength of each relationship. The red diagonal of ones indicates perfect positive correlation of each metric with itself, as it should. Negative ones would indicate perfect negative correlation.)

Same-half xwOBA to wOBA: r = .783, r = .798

It’s a pretty strong relationship! In either half of the season, xwOBA explains 61% to 64% of the variance in wOBA, which is great. That inspires confidence.
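The “variance explained” figures are just squared correlation coefficients, which holds for a simple linear relationship between two variables. A quick check in Python, using hitter correlations from the table above (the function name is only a label for this sketch):

```python
def variance_explained(r):
    """Share of variance explained in a simple linear fit, given Pearson r."""
    return r ** 2


# Hitter correlations taken from the table above.
hitter_correlations = [
    ("1H xwOBA to 1H wOBA", 0.783),
    ("2H xwOBA to 2H wOBA", 0.798),
    ("1H xwOBA to 2H wOBA", 0.406),
]

for label, r in hitter_correlations:
    print(f"{label}: r = {r:.3f}, r^2 = {variance_explained(r):.1%}")
```

That works out to roughly 61%, 64%, and 16%, respectively.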

If you’re concerned about it: I don’t think the higher correlation coefficient in the second half compared to the first half (.798 > .783) means anything, really. It could be a weather thing. It could also be confounded by the sudden deployment of the juiced ball, which would obscure the relationships in the first half of 2015. I described here how xwOBA appears ill-equipped to handle changes in ball specifications and will therefore produce suboptimal estimates without year fixed effects (but, perhaps ironically, is well-equipped to illuminate that Major League Baseball probably tinkered with the ball again in 2018).

1H xwOBA to 2H wOBA: r = .406

Here’s the buried lede: xwOBA before June 30 bears a feeble, albeit non-zero, relationship with wOBA after July 1. If anything, first-half xwOBA correlates more strongly with its second-half self (r = .629) than it does with second-half wOBA, and more strongly than first-half wOBA correlates with second-half wOBA (r = .328), which is a good sign for xwOBA as both a descriptive tool and a predictive tool. The results suggest there’s more consistency in a hitter’s batted ball profile than in the outcomes produced by said batted ball profile. There’s something here (first-half xwOBA explains more than 15% of the variance in second-half wOBA, which is more than nothing), but it’s not particularly promising.

1H wOBA-xwOBA (“diff”) to 2H anything

Not great, Bob. In the grand scheme of things, my efforts to leverage a hitter’s first-half wOBA-xwOBA differential as a predictive tool are for naught. There is a weak relationship with second-half xwOBA, and the fact the correlation coefficient is negative suggests a positive differential “sees” impending gloom (and vice versa).

I warned, though, that this would likely not be the case. For one, a cursory glance at high-level xwOBA data suggests the hitters who routinely outstrip their xwOBAs are speedsters, and those who fall short of their xwOBAs are lumbering first-base types (2016, for example). It all makes sense, too — if xwOBA relies on batted ball velocities and launch angles, then foot speed bears no influence on the algorithm.

It doesn’t mean (to me, at least) that wOBA-xwOBA differential can’t be leveraged at all — like I said, if used strategically (for outliers, namely), it can be a helpful tool.
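One way to picture the “outliers only” approach is a rough Python sketch that ranks hitters by first-half wOBA minus xwOBA and flags only the extremes. The names, the differentials, and the 10% tail cutoff are all made up for illustration; nothing here is a tested threshold.

```python
def flag_extremes(diffs, tail=0.05):
    """Split players into under- and over-performers by wOBA-minus-xwOBA.

    `diffs` maps player name -> first-half wOBA minus xwOBA.
    `tail` is the share of players to flag at each end (an arbitrary choice).
    """
    ranked = sorted(diffs.items(), key=lambda item: item[1])
    k = max(1, int(len(ranked) * tail))
    underperformers = [name for name, _ in ranked[:k]]   # wOBA well below xwOBA
    overperformers = [name for name, _ in ranked[-k:]]   # wOBA well above xwOBA
    return underperformers, overperformers


# Made-up differentials for illustration only.
sample = {
    "Hitter A": -0.062, "Hitter B": -0.011, "Hitter C": 0.004,
    "Hitter D": 0.019, "Hitter E": 0.048, "Hitter F": -0.003,
    "Hitter G": 0.008, "Hitter H": -0.020, "Hitter I": 0.015,
    "Hitter J": 0.001,
}

under, over = flag_extremes(sample, tail=0.10)
print("Bounce-back candidates:", under)
print("Regression candidates:", over)
```

Everyone in the middle of the distribution is deliberately ignored, which matches the weak correlations above: the differential says little about a typical hitter.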

Pitchers

As with hitters, I grabbed all pitcher seasons…

from 2015 through 2017 who faced
at least 200 batters prior to June 30 and
at least 200 batters after July 1.
(n = 341)

(Note the smaller sample size than for hitters. Did you know pitchers get injured?)

In-Season xwOBA Correlations: Pitchers

Metric	1H wOBA	1H xwOBA	1H diff	2H wOBA	2H xwOBA	2H diff
1H wOBA	1.000
1H xwOBA	0.800	1.000
1H diff	0.546	-0.066	1.000
2H wOBA	0.293	0.354	-0.006	1.000
2H xwOBA	0.358	0.443	-0.022	0.823	1.000
2H diff	-0.015	-0.033	0.022	0.540	-0.033	1.000

SOURCE: Statcast (Baseball Savant)

(A lot more blue there for pitchers than hitters.)

Same-half xwOBA to wOBA: r = .800, r = .823

Also strong! Even stronger than for hitters, which is interesting (but maybe not meaningful). Judge found a full-season correlation for pitchers of r = 0.83, which suggests a full season’s worth of plate appearances increases the descriptive capacity of xwOBA by a couple more percentage points. Does that mean xwOBA becomes reliable in-season fairly quickly? A question for another day (or, admittedly, another person smarter than me).

1H xwOBA to 2H wOBA: r = .354

As with hitters, first-half xwOBA doesn’t serve as a strong predictor of second-half wOBA allowed for pitchers. We’re essentially hoping that expected outcomes from a series of batted balls in one half of play mirror actual outcomes from a series of completely independent batted balls in another, which is a lot to ask. There’s some level of “predictiveness” here, but it’s not strong, as can be reasonably expected from a game of peaks and valleys, ebb and flow, etc.

1H wOBA-xwOBA (“diff”) to 2H anything

Bad. All bad. There’s nothing here.

Reflections

For hitters and pitchers, I don’t think these non-relationships are all xwOBA’s fault. The first-half wOBA-xwOBA differential, especially when paired with its second-half counterparts, has no concept of context. The differential doesn’t “know” if a player’s first-half xwOBA and/or wOBA is out of character, and unexpected first-half outcomes don’t preclude unexpected second-half outcomes. The differential also doesn’t “know” that a specific differential (for example, -0.030 wOBA-xwOBA) doesn’t mean the same thing to varying levels of hitter talent.

Player performance is not linear but, instead, characterized by peaks and valleys. xwOBA (the inputs) fluctuates, and wOBA (the output) fluctuates around it in tandem. It’s a messy affair to describe, let alone predict. As a descriptive tool, xwOBA does a very commendable job; as a predictive tool, not so much. But! Again, that does not negate the value of leveraging xwOBA and the wOBA-xwOBA differential in the tails of their distributions, at their most extreme.
