Dissecting Pitcher xBB% Differentials

March 4, 2015

Two weeks ago, I wrote about the importance of evaluating expected strikeout rate (xK%) in the context of each pitcher’s own history. In other words, xK% on its own can only tell you so much about the likelihood and magnitude of a pitcher’s regression toward the mean.

And last week, I refined the expected walk rate (xBB%) metric for pitchers by adding a proxy for pitch sequencing in the form of percentage of counts that reach 3-0 (“3-0%”). This helped better explain the model’s fit with respect to the data, as pitchers who worked into more 3-0 counts tended to walk more batters. (Who knew?)

The logical next step is to combine the two aforementioned analyses: compare xBB% to BB% for each pitcher, and do so over time. I’ll reiterate a couple of key points. Calculating a pitcher’s xBB% can give us a decent idea of how lucky or unlucky he may have been during a given season. Calculating his xBB% and comparing it to his actual BB% on an annual basis can give us a better idea of how he typically performs against his xBB%. That is, if he consistently outperforms his xBB%, perhaps the difference between his xBB% and BB% is not a matter of luck at all but a skill or characteristic not captured by the variables specified in the xBB% equation.

It’s important not to get carried away. Not all pitchers consistently perform one way or the other against their expected rates, whether it’s xK%, xBB%, FIP or xBABIP. But for the ones who do, it’s a lot easier to predict regression, especially when an outlier — a blip on the radar — is easy to identify.

It’s more difficult, however, to predict the direction and magnitude of regression for pitchers who a) are inconsistent, or b) have very little data to evaluate. I will cherry-pick a handful of fantasy-relevant pitchers who fit this bill and registered 2014 BB%-to-xBB% differentials that greatly varied from their 2013 differentials. Keep in mind a negative number indicates a larger xBB% than BB% (over-performance) and a positive number indicates a larger BB% than xBB% (under-performance).
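To make the sign convention concrete, here is a minimal sketch in Python. The function names are mine, not part of the xBB% model, and the BB% inputs are simply what Rosenthal's quoted xBB% figures and differentials below imply (xBB% plus differential), not separately sourced numbers.

```python
def bb_differential(bb_pct, xbb_pct):
    """BB% minus xBB%, in percentage points, rounded to one decimal."""
    return round(bb_pct - xbb_pct, 1)


def describe(diff):
    """Translate the sign of a differential into the article's terminology."""
    if diff < 0:
        return "over-performance"   # walked fewer batters than expected
    if diff > 0:
        return "under-performance"  # walked more batters than expected
    return "neutral"


# Assumed inputs: BB% implied by Rosenthal's xBB% and differential in each year.
rosenthal = {2013: (6.4, 8.1), 2014: (13.6, 11.2)}  # year: (BB%, xBB%)

for year, (bb, xbb) in rosenthal.items():
    diff = bb_differential(bb, xbb)
    print(year, f"{diff:+.1f}%", describe(diff))
```

The differential is always actual minus expected, so a pitcher who beats his xBB% shows up as a negative number.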

Trevor Rosenthal, STL RP
2013: -1.7%, 2014: +2.4%

Rosenthal did a remarkable job of walking more than twice as many batters as he did the year before. If we accept both his expected walk rates as gospel, he should have walked 8.1 and 11.2 percent of batters in 2013 and 2014, respectively. That is still a sizable gap between the two years, but certainly less of an eyesore. If there’s a light at the end of this tunnel, it’s that Rosenthal’s 2013 xBB% aligns more closely with his minor-league BB% (aka his historical performance) than his 2014 xBB% does. His 2014 walk rate would have spelled catastrophe had it not been for a conveniently depressed home run-to-fly ball ratio (HR/FB) despite a plummeting ground ball rate (38 percent) and stable infield fly rate. I would anticipate his walk rate improving in 2015, and I can’t fathom an alternate universe where it wouldn’t, but it’s worth acknowledging that Rosenthal was never overwhelmingly fond of the strike zone in the minors, either.

Santiago Casilla, SF RP
2013: +1.9%, 2014: -1.9%

Casilla seized the closer role from Sergio Romo and ran with it during a season in which his walk rate differential flip-flopped. A look at Casilla’s differentials from 2011 through 2013 reveals consistent under-performance (+1.3%, +1.3%, +1.9%). Expect the differential to revert to its trend, and you’re looking at a pitcher who, rather than cutting his walk rate almost in half, should have walked more than 10 percent of batters, a rate that almost perfectly replicates his career rate. I wouldn’t bet against Romo reclaiming the closer role when Casilla resumes his erratic ways.
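As a rough illustration of what "reverting to trend" means here, the sketch below averages Casilla's 2011-13 differentials and measures how far his 2014 mark sits from that average. Using a simple mean as the trend is my assumption for illustration; the variable names are hypothetical.

```python
from statistics import mean

# Differentials quoted above, in percentage points (BB% minus xBB%).
casilla_prior = [1.3, 1.3, 1.9]   # 2011, 2012, 2013
casilla_2014 = -1.9

# Assumption: treat the plain average of prior seasons as the "trend."
trend = round(mean(casilla_prior), 1)

# How many points of walk rate separate 2014 from the trend.
swing = round(trend - casilla_2014, 1)

print(f"trend differential: {trend:+.1f}%")
print(f"gap between 2014 and trend: {swing:.1f} points")
```

Adding that gap to his actual 2014 walk rate gives the trend-adjusted rate the paragraph above alludes to.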

Jake Arrieta, CHC SP
2013: +0.8%, 2014: -1.6%

Arrieta hasn’t been very consistent, but his three- and five-year differentials of -1.3% and -0.9% render 2013 (+0.8%) the errant mark. (Indeed, it’s his worst since 2010.) Moreover, Arrieta reinvented himself last year; I would still expect a wee bit o’ regression, but I also wouldn’t be surprised if the reborn Arrieta just established a new baseline differential for himself.

Andrew Cashner, SD SP
2013: -0.6%, 2014: +1.6%

Relative to other single-season walk differentials, Cashner’s underwhelm. What makes his 2014 season particularly interesting is he not only suffered some mildly bad luck but also shaved a percentage point off his 2013 walk rate, from 6.7 percent down to 5.7 percent. If 2014 turns out to be legit, and he’s more of a guy with a neutral differential (+0.1% in 2010, -0.1% in 2012), his Doug Fister-esque reluctance to allow free passes will definitely make up for his reluctance to accrue strikeouts.

Transitioning to pitchers with only one year of data: they’re even harder to evaluate. Here are some of 2014’s most extreme walk rate differentials by pitchers who eclipsed 500 pitches, at both ends of the spectrum:

Marco Gonzales (STL SP/RP), +3.0%
Yordano Ventura (KC SP), +1.6%
Jesse Hahn (OAK SP), +1.3%
Masahiro Tanaka (NYY SP), -1.4%
Collin McHugh (HOU SP), -1.7%
Matt Shoemaker (LAA SP), -1.7%

Most of these guys put in close to a full season’s work last year, but Gonzales may be the victim of sample size noise.

Lastly, I leave you with an Excel document chronicling the walk rate differentials by year for any pitcher who has thrown at least 500 pitches in any of the past three seasons. The far-right columns, labeled “3t” and “5t”, are my attempts to capture each pitcher’s consistency across the three-year (2012-14) and five-year (2010-14) samples. I calculated consistency scores only for pitchers who qualified in each year of the respective sample. The higher the score, the more consistent the walk rate differential; zero indicates absolute inconsistency, and greater than ~2 is excellent (those pitchers are highlighted).
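If you want to score consistency on your own, here is one possible approach, sketched in Python. To be clear, this is not necessarily the formula behind the “3t” and “5t” columns: it is an assumption on my part that divides the absolute average differential by the sample standard deviation, which happens to share the property that a score near zero signals no consistent lean. The function name is hypothetical.

```python
from statistics import mean, stdev


def consistency_score(differentials):
    """Assumed score: |mean| / sample standard deviation of the differentials.

    Near zero when a pitcher has no consistent lean; larger when he lands
    on the same side of his xBB% by a similar margin every year.
    """
    if len(differentials) < 2:
        raise ValueError("need at least two seasons")
    spread = stdev(differentials)
    if spread == 0:
        return float("inf")  # identical differentials every season
    return abs(mean(differentials)) / spread


# Differentials quoted earlier in this article, in percentage points.
casilla_2011_13 = [1.3, 1.3, 1.9]
rosenthal_2013_14 = [-1.7, 2.4]

print(round(consistency_score(casilla_2011_13), 2))    # steady under-performer
print(round(consistency_score(rosenthal_2013_14), 2))  # flipped sign year to year
```

Under this assumed formula, Casilla's 2011-13 run scores well above 2, while Rosenthal's two seasons score close to zero.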

But I would also recommend simply eyeballing the data and identifying patterns for yourself. One outlier in the sample can throw off a pitcher’s consistency scores, and the scores cover only the 2012-14 and 2010-14 windows (I don’t calculate three-year consistency scores for 2010-12 or 2011-13), so it’s worth making your own assessments as well. Alas, the most consistent pitchers according to the Excel doc are likely far removed from the regression candidates you seek.
