Pitch Sequencing and Pitcher xBB%: We’re Getting There

I expected to follow up my xK% differential post from last week with a complementary xBB% differential post. For those who don’t enjoy surprises, I’ll let you know now that that didn’t happen. In its stead, I bring what I hope is good news — news that will not only influence a future xBB% differential post but also may impact general pitcher analysis henceforth and possibly international diplomacy.

The title of this post, however, is a tad misleading. I think I can say, with some degree of certainty — and I hope to demonstrate, with some degree of competency — that pitch sequencing indeed plays a role in a pitcher’s walk rate, as the devilishly handsome Mike Podhorzer has postulated. What I can’t describe, with any degree of certainty, is the magnitude of the role it plays. In truth, I desperately want to prove Mike wrong: there must be other factors, outside of pitch sequencing (and pitch framing, perhaps), that help explain a pitcher’s walk rate. For example, I have tried incorporating O-Swing% and Zone%, two PITCHf/x metrics provided by FanGraphs that I swore would fill in the cracks, but they offer little in the way of additional explanatory power.

Undeterred, I revisited the data already available to me. And that’s when I saw it: the percentage of plate appearances that reached a count of three balls, no strikes (“3-0%”). It was love at first sight. To segue poorly, I’ll quote Mike, from his original xBB% post back in 2013:

If a pitcher throws 16 balls all game, but they all come in a row, he has walked four batters. Yet if he pitched seven innings and threw 100 pitches, only 16 balls is one heck of a ratio and would not normally match up with four walks. So sequencing is important, but only if there is a real difference in ability between pitchers.

This is essentially what 3-0% measures in shorthand. A great pitcher ideally would limit the number of plate appearances in which he sees a 3-0 count — a count that, unsurprisingly, correlates pretty strongly with walk rate (BB%) and strike rate (Str%). However, the multicollinearity (the degree to which variables move with one another) is not as strong as I expected — not nearly as strong as the collinearity between strikeout rate (K%) and strikes put into play (I/Str), two other components of the equation. Until I have access to (or can compile) better cross-sectional pitch count data, I will have to settle for using 3-0% as a proxy for pitch-sequencing skill, and I’m OK with that.
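To make Mike's 16-balls example concrete, here is a minimal sketch in Python. It assumes a toy game in which every pitch is either a ball or a strike (no fouls, no balls in play), and the function name `summarize` and the two pitch strings are mine, made up for illustration. Both strings contain exactly 16 balls and 84 strikes; only the order differs.

```python
def summarize(pitches):
    """Walk a string of 'B' (ball) and 'S' (strike) pitches through a toy count.

    Toy assumption: a plate appearance ends only on ball four (walk)
    or strike three (strikeout). Any unfinished final count is ignored.
    """
    balls = strikes = 0
    walks = plate_appearances = reached_3_0 = 0
    for pitch in pitches:
        if pitch == "B":
            balls += 1
        else:
            strikes += 1
        # The count can sit at 3-0 only once per plate appearance.
        if balls == 3 and strikes == 0:
            reached_3_0 += 1
        if balls == 4:
            walks += 1
            plate_appearances += 1
            balls = strikes = 0
        elif strikes == 3:
            plate_appearances += 1
            balls = strikes = 0
    return {
        "walks": walks,
        "plate_appearances": plate_appearances,
        "three_oh_rate": reached_3_0 / plate_appearances,
    }


# Same 16 balls and 84 strikes, sequenced two different ways.
clustered = "B" * 16 + "S" * 84
spread = ("B" + "S" * 5) * 16 + "S" * 4

clustered_summary = summarize(clustered)
spread_summary = summarize(spread)
print(clustered_summary)
print(spread_summary)
```

The clustered sequence walks four batters and the spread-out one walks none, even though the ball-to-strike ratio is identical. The 3-0 rate picks up that difference where Str% alone cannot.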

In an attempt to exactly replicate Mike’s work, I used Baseball Reference pitch data from 2008 through 2012, limiting the sample to pitchers who notched at least 50 innings in a season. For whatever reason, my coefficients and R-squared differed (very slightly) from those produced by his model, despite the data set and model specifications being identical. (Mike and I are stumped by it.) Because of the incongruence, I chose to expand the data to include the 2013 and 2014 seasons, and I changed the threshold to 1,000 pitches thrown. Using total pitches, instead of innings, disregards a pitcher’s efficiency; this is simply a matter of preference on my part. (For reference, 1,000 pitches loosely equates to 60 innings pitched, on average.) Without further ado:

[Figure: pitcher BB% vs. xBB%]

xBB% = 0.598 - 0.264*K% - 0.595*I/Str - 0.494*Str% + 0.515*(3-0%)
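For anyone who wants to plug in numbers, the equation translates directly into a few lines of Python. This is a sketch, not anything official: the function name `expected_bb_rate` is mine, I am assuming every input is expressed as a proportion (0.20 rather than 20), and the sample inputs are invented for illustration rather than taken from a real pitcher.

```python
def expected_bb_rate(k_rate, in_play_per_strike, strike_rate, three_oh_rate):
    """Apply the xBB% equation above.

    Assumption: all four inputs are proportions between 0 and 1,
    and the result is a proportion as well.
    """
    return (
        0.598
        - 0.264 * k_rate
        - 0.595 * in_play_per_strike
        - 0.494 * strike_rate
        + 0.515 * three_oh_rate
    )


# Hypothetical pitcher: 20% K%, 28% I/Str, 63% Str%, 4.5% 3-0%.
xbb = expected_bb_rate(0.20, 0.28, 0.63, 0.045)
print(f"xBB% = {xbb:.1%}")
```

Note the signs: more strikeouts, more strikes in play and more strikes overall all push xBB% down, while more 3-0 counts push it up.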

The model’s adjusted R-squared improves from 0.7515* to 0.8209. It’s a marginal improvement — about seven points — but it’s the first time I’ve seen FanGraphs (or anyone) achieve an adjusted R-squared in the .80s, so I’m happy. Increasing the innings-pitched (or pitch) threshold to 75 IP (~1,250 pitches) improves the adjusted R-squared another two points, but it also ignores many individual seasons by relief pitchers, which would deviate from the goal of this exercise.
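As a refresher on why I quote adjusted R-squared rather than plain R-squared: the adjusted version penalizes each extra predictor, so adding 3-0% only helps if it explains enough to pay for itself. The sketch below uses the standard textbook formula. The sample size and raw R-squared passed to the function are hypothetical, since I have not listed them here; only the two adjusted values (0.7515 and 0.8209) come from the models above.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs, purely to show the penalty at work: with the same raw
# R-squared, the model carrying an extra predictor scores slightly lower.
three_predictors = adjusted_r_squared(0.80, n_obs=500, n_predictors=3)
four_predictors = adjusted_r_squared(0.80, n_obs=500, n_predictors=4)

# The improvement reported above, in points of adjusted R-squared.
improvement_points = (0.8209 - 0.7515) * 100
print(f"{improvement_points:.2f} points")
```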

To reiterate: the xBB% metric could be helpful to a fantasy owner looking to identify pitchers due to regress. Salient examples include 2014’s most extreme outliers, Kevin Gausman (whose xBB% exceeds his 8.0 BB% by 2.4 percentage points) and A.J. Ramos (whose xBB% undercuts his 15.9 BB% by 3.8 percentage points). There is merit to knowing how a pitcher’s BB% annually performs against his xBB%, however. Consider the previous sentence a teaser for next week’s post, which will pair well with my inaugural work regarding a pitcher’s K%-to-xK% differential and a nice pinot noir.
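Here is how the differential works in practice, as a small Python sketch. The BB% figures are the ones quoted above; the xBB% figures are back-calculated from the stated gaps (8.0 + 2.4 and 15.9 - 3.8), and the helper name `differential` is my own.

```python
# BB% as quoted above; xBB% back-calculated from the stated gaps.
pitchers = {
    "Kevin Gausman": {"bb_pct": 8.0, "xbb_pct": 10.4},
    "A.J. Ramos": {"bb_pct": 15.9, "xbb_pct": 12.1},
}


def differential(bb_pct, xbb_pct):
    """xBB% minus BB%, in percentage points, rounded to one decimal."""
    return round(xbb_pct - bb_pct, 1)


gaps = {name: differential(**rates) for name, rates in pitchers.items()}

for name, gap in gaps.items():
    # A positive gap means the model expected more walks than were issued.
    direction = "more" if gap > 0 else "fewer"
    print(f"{name}: {gap:+.1f} points (model expects {direction} walks)")
```

A positive differential flags a pitcher whose walk rate may climb, and a negative one flags a pitcher who may be due for improvement.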

*Because of the inexplicable incongruence between Mike’s and my models, my adjusted R-squared was slightly lower than his, which was .7697.





Two-time FSWA award winner, including 2018 Baseball Writer of the Year, and 8-time award finalist. Featured in Lindy's magazine (2018, 2019), Rotowire magazine (2021), and Baseball Prospectus (2022, 2023, 2024, 2025). Biased toward a nicely rolled baseball pant.

13 Comments
Mike Podhorzer (FanGraphs Staff), 10 years ago

Awesome! Do you think the correlation between K% and I/Str is an issue in both our equations?