On Explaining Player xK% Divergence

January 18, 2017

Yesterday I continued new xK% equation week by discussing the 10 pitchers who have overperformed and underperformed the metric the most since 2011. While I calculated the group averages and pulled in fastball velocity and each pitcher’s most frequently used secondary pitch, the sample size was far too small to conclude anything. So at the request of commenter JUICEMANE, I have decided to do a larger study in an attempt to explain why some pitchers consistently over- or underperform the xK% equation. Do the players within each group have anything in common with their groupmates?

Out of my 906 players, I arbitrarily chose the 150 biggest overperformers and the 150 biggest underperformers as my data set to analyze. For each pitcher, I pulled in fastball velocity and the most frequently used secondary pitch.
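
A minimal sketch of how such a split could be made, assuming the ranking is by each pitcher's total K% minus xK%. The `total_diff` field, the `split_groups` helper, and the four sample rows are hypothetical and only show the shape of the data; they are not taken from the actual data set.

```python
from operator import itemgetter

def split_groups(pitchers, n):
    """Return (overperformers, underperformers): the n largest and n smallest total K%-xK%."""
    if n < 1 or 2 * n > len(pitchers):
        raise ValueError("n must be at least 1 and the two groups must not overlap")
    ranked = sorted(pitchers, key=itemgetter("total_diff"), reverse=True)
    return ranked[:n], ranked[-n:]

# Made-up rows, purely illustrative (total_diff is in percentage points)
sample = [
    {"name": "Pitcher A", "total_diff": 5.1},
    {"name": "Pitcher B", "total_diff": -4.8},
    {"name": "Pitcher C", "total_diff": 0.3},
    {"name": "Pitcher D", "total_diff": -0.9},
]

over, under = split_groups(sample, n=1)
print(over[0]["name"], under[0]["name"])
```

With the full list of 906 pitchers, the same call with `n=150` would produce the two groups compared below.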

Let’s begin with the group averages:

xK% Group Averages

Group	Average Season K%	Average Season xK%	Average Season K%-xK%	Total K%-xK%	FBv (mph)
Overperformer Average	22.0%	20.5%	1.5%	5.4%	92.3
Underperformer Average	20.1%	21.9%	-1.8%	-5.7%	91.3
Total Data Set Avg (906)	20.1%	20.3%	-0.2%	-0.3%	91.8

There are a couple of takeaways here:

-The overperformer group isn’t “supposed to” be any better than the average pitcher in the entire data set (nearly identical average xK% marks of 20.5% and 20.3%), but because of their overperformance, they actually posted a K% nearly 2 percentage points higher
-The underperformer group matched the entire data set in actual K%, despite posting an xK% 1.6 percentage points higher
-The overperformer group’s average fastball velocity was 0.5 mph higher than the entire data set’s, while the underperformer group’s was 0.5 mph lower
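
The gaps quoted in these takeaways can be checked directly against the table. The short sketch below copies the group averages from the table; the dictionary layout and the `gap` helper are just one convenient way to do the arithmetic.

```python
# Group averages copied from the table above (K% and xK% in percent, FBv in mph)
groups = {
    "over":  {"k": 22.0, "xk": 20.5, "fbv": 92.3},
    "under": {"k": 20.1, "xk": 21.9, "fbv": 91.3},
    "all":   {"k": 20.1, "xk": 20.3, "fbv": 91.8},
}

def gap(group, stat, baseline="all"):
    """Group average minus the baseline average, rounded to one decimal."""
    return round(groups[group][stat] - groups[baseline][stat], 1)

print(gap("over", "xk"))    # 0.2  -> nearly identical expected strikeout rates
print(gap("over", "k"))     # 1.9  -> "nearly 2" percentage points of actual K%
print(gap("under", "k"))    # 0.0  -> underperformers match the data set in K%
print(gap("under", "xk"))   # 1.6  -> but were expected to be 1.6 points better
print(gap("over", "fbv"))   # 0.5  -> mph above the data set
print(gap("under", "fbv"))  # -0.5 -> mph below the data set
```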

From this table, it would sure seem like fastball velocity is an important addition and could add incremental explanatory power to the xK% equation. Unfortunately, I already tried adding velocity into my equation before typing this up and the R-squared didn’t budge. I’m not enough of a statistical wiz to know if velocity isn’t actually meaningful because it didn’t move the R-squared needle, or if a boost in accuracy would simply show up somewhere else.

I tested Aroldis Chapman’s 2014 season, when he averaged 100.3 mph with his fastball, comparing the current xK% to the coefficients calculated with FBv included. The latter equation that incorporated his 100.3 mph fastball only increased his xK% by 0.1%!

So I’m at a loss for what to do, if anything, with fastball velocity. I am also hesitant to include it, because I still would expect it to show up in the strike type rates.
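
One possible reading of the flat R-squared is exactly this: if velocity only matters through the strike type rates, adding it to an equation that already contains them should change almost nothing. The sketch below illustrates that idea with simulated data only. The single `whiff` variable is a stand-in for the strike type rates, every coefficient is an arbitrary assumption, and none of it comes from the actual xK% equation. It uses the standard two-predictor formula for R-squared in terms of pairwise correlations.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r2_two_predictors(x1, x2, y):
    """R-squared of a least-squares fit of y on x1 and x2 (with intercept)."""
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    return (r1 * r1 + r2 * r2 - 2 * r1 * r2 * r12) / (1 - r12 * r12)

# Simulated pitchers (assumption): velocity raises the whiff rate, and K%
# depends on the whiff rate alone, so velocity has no direct path to K%.
rng = random.Random(0)
velo = [rng.gauss(92.0, 2.5) for _ in range(500)]
whiff = [0.09 + 0.004 * (v - 92.0) + rng.gauss(0, 0.01) for v in velo]
k_rate = [0.05 + 1.6 * w + rng.gauss(0, 0.02) for w in whiff]

r2_velo_only = pearson(velo, k_rate) ** 2
r2_whiff_only = pearson(whiff, k_rate) ** 2
r2_both = r2_two_predictors(whiff, velo, k_rate)

print(f"velocity alone:   {r2_velo_only:.3f}")
print(f"whiff rate alone: {r2_whiff_only:.3f}")
print(f"whiff + velocity: {r2_both:.3f}")
```

In a setup like this, velocity explains a real share of K% on its own, yet adds next to nothing once the whiff rate is in the model. That would be consistent with velocity being meaningful but redundant, though it does not prove the same thing is happening in the real data.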

Last, let’s take a look at the most frequently used secondary pitches of each group:

xK% Pitch Type % Group Averages

Pitch Type	All	Overperformers	Underperformers
SL	49.4%	45.3%	48.7%
CB	18.5%	24.7%	12.0%
CH	16.3%	10.7%	22.7%
CT	12.0%	15.3%	11.3%
SF	3.3%	4.0%	4.0%
KN	0.3%	0.0%	1.3%

Well darn, this may be even more illuminating than the gap in fastball velocity! As you peruse the matrix, it’s easy to spot the pink elephants. Compared to all pitchers in the data set, and especially to the underperformers, the overperformer group relies far more on the curve ball and far less on the changeup.

On the other hand, the changeup is the second most common secondary pitch in the underperformers’ arsenal. The changeup is actually the overperformers’ fourth most common secondary pitch, as both the curve ball and cutter are ahead of it.

So let’s summarize:

-The overperformers throw the curve and cutter more often
-The underperformers throw the changeup more often
-The overperformers throw the changeup less often
-The underperformers throw the curve less often
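
To put numbers on that summary, the sketch below takes the percentages from the pitch type table and computes the overperformer-minus-underperformer gap for each pitch. It also converts each percentage into an implied pitcher count, under the assumption that each figure is the share of a group's 150 pitchers whose most frequently used secondary pitch is that type.

```python
# (overperformers %, underperformers %) copied from the pitch type table above
shares = {
    "SL": (45.3, 48.7),
    "CB": (24.7, 12.0),
    "CH": (10.7, 22.7),
    "CT": (15.3, 11.3),
    "SF": (4.0, 4.0),
    "KN": (0.0, 1.3),
}
GROUP_SIZE = 150  # pitchers per group, per the study design

def implied_count(pct, size=GROUP_SIZE):
    """Approximate number of pitchers behind a percentage (assumes shares of pitchers)."""
    return round(pct * size / 100)

diffs = {pitch: round(over - under, 1) for pitch, (over, under) in shares.items()}
over_counts = {pitch: implied_count(over) for pitch, (over, _) in shares.items()}
under_counts = {pitch: implied_count(under) for pitch, (_, under) in shares.items()}

for pitch in shares:
    print(pitch, diffs[pitch], over_counts[pitch], under_counts[pitch])
```

The curve ball (+12.7 points) and changeup (-12.0 points) gaps dwarf the others, and the implied counts add up to 150 in each group, which is at least consistent with the assumption about what the percentages represent.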

So curve ball and cutter = good for outperforming xK%, changeup = bad?

If we settle on this explanation, I wonder aloud again, why wouldn’t these differences simply show up in the strike type rates? Is there something else these pitch types would affect that would influence strikeout rate, but not show up in the components already included in the equation?

2 Comments

Broken Bat

8 years ago

Mike, really enjoy the quality and quantity of your work. You asked a question in your developing work (above). Perhaps the pitch count (1/2 or 3/1) in which the curve balls and changeups are thrown matters? If a pitcher with high velocity throws a change or curve in a two-strike count, it would seem that pitch selection would produce more swings and misses or takes (because of surprise) than it would for a lower-velocity pitcher who uses a higher percentage of changeups and curve balls in two-strike counts (less surprise to the hitter). It might be a ridiculous hypothesis, but I too am looking for something to differentiate the performers, etc.
