xK%, History and Speculating on Dellin Betances

I’d like to talk to you about Dellin Betances.

Wait! Wait. No. No, I wouldn’t. I’d like to talk about Mike Podhorzer first. Mike has published a lot of great work covering the fundamentals of the xK% (and xBB%) metric for pitchers (and hitters), so if you are unfamiliar with or falling behind on his work, I recommend you first click here, here or here. But if you’re lazy, the short of it is: xK%, or expected strikeout rate, is an equation birthed from a linear regression that measures how a pitcher’s looking, swinging and foul-ball strike rates, as well as his overall strike percentage, correlate with his strikeout rate. It doesn’t predict future strikeout rates so much as it retrospectively adjusts past strikeout rates; thus, it is a good tool for identifying pitchers who potentially benefited (or suffered) from good (or bad) luck in a previous season – say, 2014.
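For readers who prefer to see the shape of the thing, a linear regression of this kind can be written generically as below. This is only a sketch of the form implied by the description above: the coefficients are left as unspecified symbols, and the exact specification (which terms appear and how they are scaled) is an assumption here, so defer to Mike's articles for the real equation.

```latex
\mathrm{xK\%} \;=\; \beta_0
  \;+\; \beta_1 \,(\text{looking-strike rate})
  \;+\; \beta_2 \,(\text{swinging-strike rate})
  \;+\; \beta_3 \,(\text{foul-strike rate})
  \;+\; \beta_4 \,(\text{overall strike\%})
```

The gap between a pitcher's actual K% and this fitted value, measured in percentage points, is what tells you whether he struck out more or fewer hitters than his strike profile implies.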

Like many other metrics completely unrelated to xK%, however, there is evidence that certain players consistently out-perform (or under-perform) what their xK% rates predict their actual K% rates should be. (Mike alludes to this trend in his quip about Jeremy Hellickson, an xK% underachiever, in one of the articles linked above.) Similarly to how a power hitter will post consistently higher ratios of home runs to fly balls (HR/FB) than a non-power hitter, or how Mike Trout will probably post some of the highest batting averages on balls in play (BABIP) in the league for years to come, it appears there is some skill, or perhaps a particular characteristic, inherent to pitchers who consistently best, or fall short of, their xK% rates.

Kevin Correia is a particularly salient example. He who has underwhelmed fantasy owners for years has also notched significantly positive margins between his K% and xK% (which I will henceforth refer to as the “differential”) for the past four years:

2011: +2.5% (aka 2.5 percentage points better than his xK%)
2012: +2.5%
2013: +2.4%
2014: +2.3%

Disclaimer: Although I am using the same fundamental equation as Mike, I keep on hand a data set that differs slightly in length, and I use slightly different qualification thresholds. Thus, my coefficients vary, albeit minimally, from those in Mike’s model.

I’m wary of establishing a definitive measure of consistency, but it’s clear Correia hasn’t left much room for year-to-year variance. There are a handful of elite starters who fit the bill as well. Adam Wainwright and Felix Hernandez have outperformed their annual xK% every year in which they threw at least 500 pitches dating back to 2005; Cliff Lee has achieved the feat annually since 2008. On the flip side of the coin, Homer Bailey has underperformed his xK% in each professional season at the major league level, as has Cole Hamels in every year but one.
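If you want to run the same "every year" check yourself, here is a minimal Python sketch. The function name `track_record` and its labels are my own invention for illustration; the only real numbers are Correia's four differentials listed above, and the second pitcher is made up.

```python
def track_record(differentials):
    """Label a pitcher's K% minus xK% history by the sign of each season.

    `differentials` is a list of yearly differentials in percentage points.
    """
    if not differentials:
        return "no data"
    if all(d > 0 for d in differentials):
        return "consistent overperformer"
    if all(d < 0 for d in differentials):
        return "consistent underperformer"
    return "mixed"


# Correia's differentials, 2011 through 2014, as listed in the article.
correia = [2.5, 2.5, 2.4, 2.3]

# Hypothetical pitcher with no discernible pattern (invented values).
noisy = [1.1, -0.4, 0.8, -2.0]

print(track_record(correia))  # consistent overperformer
print(track_record(noisy))    # mixed
```

A sign check like this ignores magnitude entirely, so it flags a pitcher who beats his xK% by a hair every year the same way it flags one who beats it comfortably.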

Thus, calculating a pitcher’s xK% and comparing it to his K% is not enough; it’s important to know how his K% fares historically compared to his xK%. This leaves us in a difficult situation when we evaluate the differential for pitchers who debuted or broke out last year. What does James Paxton’s +1.7% differential actually mean? Or Matt Shoemaker’s +1.3%, or Masahiro Tanaka’s +1.2%? To play it safe, I would expect all of them to regress, as their differentials are relatively small (two-thirds of 2014’s differentials fall between -1.5% and +1.5%).

Which brings me back to Dellin Betances. He recorded a +5.8% differential in 2014, the fourth-highest single-season differential for any pitcher who threw at least 500 pitches in the last 10 years. That kind of statistic screams regression, but a closer look at the data may help us better understand what’s going on.

To start, the rest of the names on the top-10 list, on which Betances finished fourth, look as follows:

1. Craig Kimbrel, +6.6% (2012)
2. Chien-Ming Wang, +6.1% (2005)
3. Aroldis Chapman, +5.9% (2014)
4. Dellin Betances, +5.8% (2014)
5. Brandon League, +5.2% (2005)
6. J.J. Putz, +4.7% (2007)
7. Aroldis Chapman, +4.7% (2012)
8. Craig Kimbrel, +4.6% (2013)
9. Andrew Miller, +4.6% (2014)
10. Neftali Feliz, +4.6% (2009)

The broad trend here is fairly obvious: every entry on the list but Wang is a relief pitcher. (An aside: Wang’s and League’s appearances are especially hilarious, given their year-end strikeout rates were 9.7 percent and 10.5 percent, respectively. Must’ve been something in the water in 2005. The names that follow them that year: prime Mariano Rivera, Roy Halladay and Francisco Rodriguez.) Moreover, half the spots on the list are owned by elite relievers such as Kimbrel, Chapman and Miller. Whether or not you call Betances elite at this point is a matter of personal preference, but for the sake of argument, I am willing to consider him among the elite for now.

Double-moreover, I ask you to please turn your attention to this sampling of names from 2014’s list of top-20 differentials, which could speak for itself if it knew how to use words:

5. David Robertson, +4.4%
11. Brad Boxberger, +3.2%
13. Wade Davis, +3.1%
14. Ken Giles, +2.8%
16. Zach Duke, +2.8%
18. Sean Doolittle, +2.7%
19. Jake McGee, +2.7%

Whatever you may think about the true abilities of these pitchers, positive differentials seem to favor better-than-average pitchers who get to blow hitters away in brief intervals. So while I think Betances’ actual K% stands to regress in 2015, I wouldn’t expect it to fall by 5.8 percentage points, or maybe even half that much. Similarly, Kimbrel’s 2014 differential of +1.6% is the lowest of his career and a solid 2.4 percentage points below his 4-year average differential.

It’s about understanding each pitcher’s history. It’s a lot to ask to mentally retain performance data for every name, but if you decide to calculate a pitcher’s xK% sometime during next year, I simply recommend you also look a couple of years back to get a feel for his track record. For most, there won’t be one; it’ll be an unintelligible sequence of positive and negative numbers of all magnitudes. But, occasionally, some semblance of uniformity will appear.

With that said, it would behoove me to run the regression separately for starters and relievers, as there appears to be evidence of distinct K%-to-xK% trends between the two groups. It would doubly behoove me to figure out why a certain few pitchers consistently over- or under-perform their xK% while most others do not. What is the common factor there?

In the meantime, and for the sake of not burdening you with long lists of numbers, you can view the data behind the analysis in this Excel document, which lists individual season differentials dating back to 2010 for each pitcher who threw at least 500 pitches in 2014. It is all sorted by the column “5t,” which is my consistency measure for each pitcher with five years’ worth of data; “3t” is for 2012 through 2014. The higher the score, the more consistent (the lowest possible score is zero). Pitchers with only 2014 data will not have a t-score and, thus, will be listed farther down the list. If a pitcher is missing completely, remember that he may have yet to throw 500 career major-league pitches (looking at you, Aaron Sanchez).
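The article does not spell out how the t-score columns are computed, so what follows is only one plausible reading, not the confirmed formula: the absolute value of a one-sample t-statistic testing a pitcher's differentials against zero. That reading matches the description (higher means more consistent, the floor is zero, a single season produces no score), but treat it as an assumption. The function name `consistency_t` is hypothetical, and the second pitcher is invented.

```python
import math
import statistics


def consistency_t(differentials):
    """Absolute one-sample t-statistic of yearly differentials against zero.

    ASSUMPTION: this is a guess at what a t-score consistency measure
    could look like; it is not taken from the spreadsheet itself.
    """
    n = len(differentials)
    if n < 2:
        return None  # one season has no spread, so no score
    mean = statistics.mean(differentials)
    sd = statistics.stdev(differentials)  # sample standard deviation
    if sd == 0:
        return math.inf  # identical differentials every year
    return abs(mean) / (sd / math.sqrt(n))


# Correia's differentials, 2011 through 2014, as listed in the article.
correia = [2.5, 2.5, 2.4, 2.3]

# Hypothetical pitcher whose differentials cancel out (invented values).
noisy = [2.0, -1.5, 0.5, -1.0]

print(round(consistency_t(correia), 1))  # large: tight cluster far from zero
print(consistency_t(noisy))              # 0.0: averages out to nothing
print(consistency_t([1.7]))              # None: only one season of data
```

Under this reading a high score requires both a sizable average differential and little year-to-year spread, which is why a pitcher like Correia would sort near the top while someone bouncing between positive and negative sinks toward zero.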





Two-time FSWA award winner, including 2018 Baseball Writer of the Year, and 8-time award finalist. Featured in Lindy's magazine (2018, 2019), Rotowire magazine (2021), and Baseball Prospectus (2022, 2023, 2024, 2025). Biased toward a nicely rolled baseball pant.

14 Comments
Dolemite
10 years ago

“Similarly to how a power hitter will post consistently higher ratios of fly balls to home runs (FB/HR) than a non-power hitter”

I believe you either mean a lower ratio of FB/HR or a higher ratio of HR/FB

welcome to fangraphs
eager to read articles from someone w an econometrics background