xK% Outperformers and Underperformers Through the Years
Yesterday I rolled out an updated version of my pitcher xK% equation, which estimates what a pitcher’s strikeout rate “should be” given various strike and strike type metrics found at Baseball-Reference.com. With my data set, I put together a table calculating historical averages over the period I compiled data for (2011-2016). I’ll share the top 10 pitchers who have outperformed and underperformed their xK% by the largest total margin (so if a pitcher outperformed by 2% in 2011 and 3% in 2012, I’ll be looking at the total of 5%, rather than the average of 2.5%), and we’ll try to figure out what, if anything, the pitchers in each group have in common. So let the fun begin!
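To make the total-versus-average distinction concrete, here is a minimal sketch of the ranking calculation. The pitcher names and season values are made up for illustration (they simply reproduce the 2% and 3% example above), and `summarize` is a hypothetical helper, not the code behind the tables.

```python
from collections import defaultdict

# Hypothetical season rows: (pitcher, season, K%, xK%), in percentage points.
seasons = [
    ("Pitcher A", 2011, 27.0, 25.0),  # outperformed xK% by 2
    ("Pitcher A", 2012, 29.0, 26.0),  # outperformed xK% by 3
    ("Pitcher B", 2011, 18.0, 20.5),  # underperformed xK% by 2.5
]


def summarize(rows):
    """Return {pitcher: (total K%-xK%, average season K%-xK%)}."""
    diffs = defaultdict(list)
    for pitcher, _season, k_pct, xk_pct in rows:
        diffs[pitcher].append(k_pct - xk_pct)
    return {p: (sum(d), sum(d) / len(d)) for p, d in diffs.items()}


summary = summarize(seasons)

# Rank by the total differential, largest outperformance first.
ranked = sorted(summary.items(), key=lambda item: item[1][0], reverse=True)
for pitcher, (total, average) in ranked:
    print(f"{pitcher}: total {total:+.1f}, average {average:+.1f}")
```

Ranking on the total rewards pitchers who beat their xK% over many seasons, which is why a pitcher with fewer seasons can post a larger average differential yet sit lower on the list.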
Player | Average Season K% | Average Season xK% | Average Season K%-xK% | Total K%-xK% |
---|---|---|---|---|
Craig Kimbrel | 40.5% | 37.2% | 3.3% | 19.7% |
Kenley Jansen | 40.1% | 37.0% | 3.1% | 18.4% |
Aroldis Chapman | 42.8% | 39.9% | 2.8% | 16.9% |
Clayton Kershaw | 29.3% | 26.7% | 2.6% | 15.6% |
Stephen Strasburg | 28.9% | 26.3% | 2.6% | 13.1% |
David Robertson | 33.1% | 31.0% | 2.1% | 12.6% |
Andrew Miller | 35.0% | 33.0% | 2.0% | 11.9% |
Cesar Ramos | 18.2% | 16.3% | 1.9% | 11.5% |
Dellin Betances | 40.4% | 36.6% | 3.8% | 11.3% |
Mike Leake | 16.3% | 14.4% | 1.8% | 11.0% |
Group Average | 32.4% | 29.8% |
Well, this is quite the group of pitchers. On average, they posted a 32.4% strikeout rate each season! Of the 10 pitchers, seven of them are relievers, and the top three are arguably the best in the game right now. Our top starting pitcher is no surprise, as he’s the best pitcher on the planet.
Are you thinking this is obvious? Of course the best pitchers are going to outperform the formulas, right? They are the outliers. Well, no, I think it’s the reverse. We perceive these pitchers as the best because they do even more than we expect them to. This is as opposed to us knowing they are the best and then learning that they also outperform their xK%. It’s their outperformance that is driving their success. Or at least it acts as one of the drivers.
Of course, two pitchers seemingly don’t belong: Cesar Ramos and Mike Leake. Ramos has been a journeyman LOOGY who hasn’t exactly racked up the strikeouts, but xK% thinks his strikeout rate should have been even worse. His fastball was averaging in the 91 to 92 mph range until 2013, but then it steadily declined, and it averaged just 88.2 mph this past season. Smartly, he has moved away from the pitch in favor of his slider, and this year, his changeup.
Leake is another soft-tosser, but unlike Ramos, and the majority of the rest of the human population, he has actually gained fastball velocity. But he’s essentially been fastball-cutter his whole career, and then also mixes in three more pitches for a full five-pitch repertoire.
What do these pitchers have in common? It might be easier to tell by comparing their fastball velocities and the pitch each chooses as his secondary weapon:
Player | FBv | Most Frequently Used Secondary Pitch |
---|---|---|
Craig Kimbrel | 96.9 | CB |
Kenley Jansen | 92.9 | SL |
Aroldis Chapman | 98.9 | SL |
Clayton Kershaw | 93.2 | SL |
Stephen Strasburg | 95.2 | CB |
David Robertson | 92.2 | CB |
Andrew Miller | 93.9 | SL |
Cesar Ramos | 90.5 | SL |
Dellin Betances | 96.9 | SL |
Mike Leake | 90.3 | CT |
Group Average | 94.1 |
So clearly as a group, their fastball velocity is above the MLB average. To avoid arguing whether some of the pitchers really throw a slider, or is it actually a curve ball, or wait, it’s a SLURVE (!!), let’s just say that nine of the ten use a breaking ball to complement their fastball. Only Leake leans on the cutter, which isn’t entirely fair, since that’s a type of fastball. And heck, that’s really what Jansen’s fastball is too. If we ignore Leake’s cutter, we’re left with his curve ball, or slider, since they have almost identical usage (the changeup is also just barely behind).
What’s interesting to me is that none of these pitchers rely on the changeup. It’s all breaking balls, all the time when they aren’t throwing the fastball. Hmmmmm.
Now let’s check the underperformers:
Player | Average Season K% | Average Season xK% | Average Season K%-xK% | Total K%-xK% |
---|---|---|---|---|
Matt Belisle | 18.2% | 20.9% | -2.7% | -16.4% |
Luke Gregerson | 22.9% | 25.1% | -2.2% | -13.4% |
Fernando Rodney | 24.4% | 26.6% | -2.2% | -13.2% |
Louis Coleman | 23.3% | 26.4% | -3.2% | -12.6% |
James Russell | 16.4% | 18.6% | -2.2% | -11.1% |
Randall Delgado | 19.5% | 21.4% | -1.8% | -11.0% |
Hector Noesi | 16.1% | 18.8% | -2.7% | -10.8% |
Nathan Eovaldi | 16.8% | 18.6% | -1.8% | -10.5% |
R.A. Dickey | 18.2% | 20.0% | -1.8% | -10.5% |
Pat Neshek | 22.3% | 24.9% | -2.6% | -10.5% |
Group Average | 19.8% | 22.1% |
Wowzers. That’s one boring group of pitchers. And quite a bit less attractive than the first group. But again, this shouldn’t be too surprising as these pitchers haven’t enjoyed as much success because they have posted results that didn’t quite match up with some of their underlying advanced metrics.
There are really only two full-time starters on the list, with a couple of others having some starts here and there included in their averages. Obviously, the group’s average strikeout rate is far lower than the overperformers’. The rich get richer and the poor get poorer.
It’s sad to find Nathan Eovaldi, king of the “strikeout rate doesn’t match the stuff” theme, on this list. Of course he’s here! I have even less of an idea of whether there’s a common thread by just looking at the names than I did with the outperformers group, so let’s go through the same exercise:
Player | FBv | Most Frequently Used Secondary Pitch |
---|---|---|
Matt Belisle | 91.1 | SL |
Luke Gregerson | 89.0 | SL |
Fernando Rodney | 95.4 | CH |
Louis Coleman | 89.6 | SL |
James Russell | 89.0 | SL |
Randall Delgado | 92.3 | CH |
Hector Noesi | 93.0 | CH |
Nathan Eovaldi | 95.8 | SL |
R.A. Dickey | 76.2* | FB |
Pat Neshek | 89.4 | SL |
Group Average | 90.1 |
So as a group, the underperformers averaged 4 mph less in fastball velocity than the overperformers. Excluding Dickey, however, the gap narrows to 2.5 mph, which is still a sizable difference.
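For anyone who wants to check the velocity gap, the arithmetic can be reproduced directly from the two FBv tables. This is just a sketch using the table values as listed; the variable names are my own.

```python
from statistics import mean

# Fastball velocities (mph) copied from the outperformers table.
over = {
    "Craig Kimbrel": 96.9, "Kenley Jansen": 92.9, "Aroldis Chapman": 98.9,
    "Clayton Kershaw": 93.2, "Stephen Strasburg": 95.2, "David Robertson": 92.2,
    "Andrew Miller": 93.9, "Cesar Ramos": 90.5, "Dellin Betances": 96.9,
    "Mike Leake": 90.3,
}

# Fastball velocities (mph) copied from the underperformers table.
under = {
    "Matt Belisle": 91.1, "Luke Gregerson": 89.0, "Fernando Rodney": 95.4,
    "Louis Coleman": 89.6, "James Russell": 89.0, "Randall Delgado": 92.3,
    "Hector Noesi": 93.0, "Nathan Eovaldi": 95.8, "R.A. Dickey": 76.2,
    "Pat Neshek": 89.4,
}

over_avg = mean(over.values())
under_avg = mean(under.values())
# Drop Dickey, whose listed velocity is far below everyone else's.
under_no_dickey = mean(v for name, v in under.items() if name != "R.A. Dickey")

print(f"Outperformers:              {over_avg:.1f} mph")
print(f"Underperformers:            {under_avg:.1f} mph")
print(f"Underperformers, no Dickey: {under_no_dickey:.1f} mph")
print(f"Gap:                        {over_avg - under_avg:.1f} mph")
print(f"Gap, no Dickey:             {over_avg - under_no_dickey:.1f} mph")
```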
It’s interesting to suddenly see changeups show up on the list, but three pitchers out of ten doesn’t really suggest anything. Perhaps if the majority of the group featured the changeup, we could theorize that for whatever reason, a breaking ball was leading to outperformance, while the changeup was leading to underperformance. And although that could still be the case, these two groups of just ten pitchers certainly don’t prove it.
I do recall some previous incarnations of expected strikeout rate equations using fastball velocity as an input. I’m fairly sure that the Steamer projections do, though I don’t think they use any of the strike type rates like I do. I just always assumed that fastball velocity would simply show up in the strike type rates, with higher velocity leading to a higher S/Str% and lower velocity leading to a lower one. But maybe there’s something else that a harder fastball is doing to allow the overperformers to outperform? Unfortunately, if I try to introduce fastball velocity into my equation, the commenters already concerned about the prospect of multicollinearity are going to have their heads explode. So maybe I should proceed with caution, as there’s no doubt fastball velocity correlates highly with S/Str%.
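One rough way to gauge that multicollinearity worry is to measure how strongly fastball velocity tracks S/Str% before adding it to the equation. The sketch below uses made-up numbers, not my data set, and the variance inflation factor shown is the simple two-predictor shortcut, 1 / (1 - r²). With more predictors in the equation, a full check would regress velocity on all of them.

```python
from math import sqrt

# Hypothetical pitcher-seasons: fastball velocity (mph) and S/Str% (%).
# These values are invented for illustration only.
fbv = [88.5, 90.1, 91.8, 93.0, 94.6, 96.2]
s_str = [14.0, 15.2, 15.1, 17.0, 17.4, 19.1]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(fbv, s_str)

# With exactly two predictors, each one's VIF is 1 / (1 - r^2).
# (This divides by zero if the two predictors are perfectly correlated.)
vif = 1 / (1 - r ** 2)

print(f"r = {r:.3f}, two-predictor VIF = {vif:.1f}")
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a sign that the two inputs are carrying mostly the same information, in which case the individual coefficients become hard to interpret even if the overall fit improves.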
Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year and three-time Tout Wars champion. He is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. Follow Mike on X @MikePodhorzer and contact him via email.
Most of the overachievers are at the extremes (whether it is for starters or relievers).
1. Your model may just break down a bit at the extremes and not capture the impact of extremely high swinging strike rates. The relationship between the predictor variables and K% is probably not truly linear, but close enough for government work for the most part. When you get up into the extraordinarily high values, the difference between a linear relationship and whatever the “true” relationship is gets magnified.
2. At the extremes, there just aren’t that many data points, so the model will likely just not fit as well out there.

If I had to guess, if you did a residual plot (plotting residuals on the Y, independent variable on the X) I bet you’d see a pattern of systematic underestimates on the high end.
That’s true of any regression equation we’ve shared on these pages, so you’re most definitely correct. Though you’d think that maybe the underperformers would be the bottom tier of strikeout artists, and they aren’t. They are right at the average.
Also, sadly, I’m not well-versed enough in statistics to even know how to develop a non-linear equation.
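For anyone who wants to try the residual check suggested above, here is a minimal sketch of the idea. The data is synthetic: I assume a made-up, mildly curved relationship between swinging strike rate and K% purely for illustration, fit a straight line to it, and then average the residuals (actual minus predicted) in the low, middle, and high thirds of the predictor. None of these numbers come from the xK% data set.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


# Synthetic predictor: swinging strike rate from 5.0% to 20.0% in 0.5 steps.
xs = [5 + 0.5 * i for i in range(31)]
# Assumed "true" relationship with a small squared term (illustrative only).
ys = [4 + 1.2 * x + 0.04 * x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Average residual in the low, middle, and high thirds of the predictor.
third = len(xs) // 3
low = mean(residuals[:third])
mid = mean(residuals[third:-third])
high = mean(residuals[-third:])

print(f"low {low:+.2f}, middle {mid:+.2f}, high {high:+.2f}")
```

In this made-up case the straight line underestimates at both ends and overestimates in the middle. If real residuals showed a similar pattern, adding a squared term for the predictor would be one simple way to let the equation bend while still using ordinary regression.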