Devising a Pitcher xHR/FB Rate

January 8, 2015

I was pretty successful at developing an equation to estimate what a hitter’s HR/FB rate should be given his average fly ball and home run distance, along with the average absolute angle and standard deviation of the distance of those batted balls. My formula resulted in an R-squared mark of 0.649, which seems pretty darn good to me, especially since it completely ignores park factors, which we know play a significant role. On Tuesday, I found that the year-to-year correlation of a pitcher’s batted ball distance is less than half that of a hitter’s. Then yesterday, I discovered there was some correlation between a pitcher’s batted ball distance and his HR/FB rate and ISO mark.

In that last article, I teased that Jeff Zimmerman also armed me with the same angle and standard deviation data for pitchers that he provides me for hitters. So naturally, my first inclination was to line up all the data and run a regression on those three variables to devise a pitcher version of the xHR/FB rate equation.

Once again, my data set consisted of 663 player seasons from 2008 to 2014. The best-fit equation was:

xHR/FB = -0.4211 + (Avg Dist * 0.0013) + (Avg Absolute Angle * 0.0036) + (Std Dev Dist * 0.0016)
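For anyone who wants to plug numbers into the equation, here is a minimal Python sketch. The coefficients are copied from the equation above; the function name, the assumed units (feet for the distances, degrees for the angle), and the sample inputs are made up for illustration and are not from the actual data set.

```python
# Coefficients copied from the best-fit equation in the article.
INTERCEPT = -0.4211
COEF_AVG_DIST = 0.0013
COEF_AVG_ABS_ANGLE = 0.0036
COEF_STD_DEV_DIST = 0.0016


def pitcher_xhr_fb(avg_dist, avg_abs_angle, std_dev_dist):
    """Return a pitcher's xHR/FB as a fraction (0.10 means 10%)."""
    return (
        INTERCEPT
        + avg_dist * COEF_AVG_DIST
        + avg_abs_angle * COEF_AVG_ABS_ANGLE
        + std_dev_dist * COEF_STD_DEV_DIST
    )


# Hypothetical inputs, NOT a real pitcher season.
# Assumed units: feet for the two distance inputs, degrees for the angle.
example = pitcher_xhr_fb(avg_dist=280.0, avg_abs_angle=20.0, std_dev_dist=55.0)
print(f"xHR/FB = {example:.1%}")
```

The function returns a fraction rather than a percentage, so multiply by 100 (or format with `%`) before comparing against a listed HR/FB rate.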

Just like in the hitter’s version, angle plays the most important role, followed by standard deviation. Who would have thought that average distance would bring up the rear? Unfortunately, this equation isn’t nearly as good as the hitter’s one:

Adjusted R-squared = 0.276
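As a quick aside on what "adjusted" means here: adjusted R-squared shrinks the plain R-squared slightly to account for the number of predictors. The sketch below uses the standard textbook adjustment and assumes all 663 player seasons and all three variables went into the fit; the function names are illustrative, and the article does not say which software produced the figure.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: penalize R-squared for the number of predictors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def plain_r_squared(adj_r_squared, n_obs, n_predictors):
    """Invert the adjustment to recover the unadjusted R-squared."""
    return 1.0 - (1.0 - adj_r_squared) * (n_obs - n_predictors - 1) / (n_obs - 1)


N_OBS = 663        # player seasons, per the article
N_PREDICTORS = 3   # avg distance, avg absolute angle, std dev of distance

# Assumption: every one of the 663 seasons was used in the fit.
implied_r2 = plain_r_squared(0.276, N_OBS, N_PREDICTORS)
print(f"Implied unadjusted R-squared: {implied_r2:.3f}")
```

With this many observations and only three predictors, the adjustment is tiny, so the adjusted figure can be read almost interchangeably with a plain R-squared.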

An adjusted R-squared of 0.276 isn’t terrible, but it’s quite low. Here’s a scatter plot of xHR/FB vs HR/FB:

Because I know this question will be asked, the dot all by its lonesome at the top on the 16% xHR/FB line is Miguel Gonzalez, 2014 version, with a 12.1% actual HR/FB.

In this data set, the actual HR/FB rate ranged from 3.1% to 19.2%, while the xHR/FB rate ranged from just 6.6% to 16.0%. That gap hints at part of the problem with coming up with an equation to estimate a pitcher’s HR/FB rate. The stat is very fluky, jumps all over the place, and is mostly controlled by the hitter. So even if we thought we knew what HR/FB rate a pitcher should be allowing, he’s not going to end up there the majority of the time due to total randomness.
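The narrower xHR/FB range is also what a low R-squared fit tends to produce on its own: in an ordinary least-squares fit with an intercept, the variance of the fitted values equals R-squared times the variance of the actual values. The sketch below illustrates this with purely synthetic data (one made-up predictor and a lot of noise), not the real pitcher seasons, and a simple one-variable fit rather than the three-variable equation.

```python
import random

random.seed(0)

# Synthetic data only: a weak signal buried in noise, 663 made-up "seasons".
xs = [random.gauss(0.0, 1.0) for _ in range(663)]
ys = [0.10 + 0.012 * x + random.gauss(0.0, 0.02) for x in xs]


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# One-variable ordinary least squares, computed by hand.
x_bar, y_bar = mean(xs), mean(ys)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in xs]

# R-squared from residuals, and the share of variance the fit retains.
ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
ss_tot = sum((y - y_bar) ** 2 for y in ys)
r_squared = 1.0 - ss_res / ss_tot
variance_ratio = variance(fitted) / variance(ys)

print(f"R-squared: {r_squared:.3f}, variance ratio: {variance_ratio:.3f}")
print(f"actual range: {max(ys) - min(ys):.3f}")
print(f"fitted range: {max(fitted) - min(fitted):.3f}")
```

The two printed ratios match, and the fitted range comes out well inside the actual range, which mirrors the 6.6% to 16.0% versus 3.1% to 19.2% spread in the real data.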

For those curious, here are three notorious HR/FB rate suppressors:

Season	Jered Weaver HR/FB	Jered Weaver xHR/FB	Matt Cain HR/FB	Matt Cain xHR/FB	Clayton Kershaw HR/FB	Clayton Kershaw xHR/FB
2008	8.3%	8.7%	6.8%	9.8%
2009	8.3%	7.9%	8.4%	10.6%	4.1%	9.3%
2010	7.8%	8.4%	7.4%	9.8%	5.8%	9.4%
2011	6.3%	6.8%	3.7%	8.4%	6.7%	7.1%
2012	8.6%	7.3%	8.4%	9.2%	8.1%	9.6%
2013	7.8%	7.9%	10.8%	10.9%	5.8%	8.3%
2014	8.9%	7.9%	13.7%	11.6%	6.6%	9.8%

Looks like Weaver’s HR/FB prevention skills are legit, at least according to this xHR/FB formula. So maybe it’s not just the pitcher-friendly ballpark. Cain had some magic going on earlier in his career, certainly with some help from his home park, but that magic has seemingly disappeared. Kershaw is obviously not a human, and defies any sort of formula. He’s a true outlier, which formulas aren’t meant to work for.
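To put a number on those impressions, the sketch below averages each pitcher's gap between actual HR/FB and xHR/FB across the seasons listed in the table. The values are copied from the table; the one assumption is that the four values in the 2008 row belong to Weaver and Cain, leaving Kershaw without a 2008 entry. The variable and function names are illustrative.

```python
# (HR/FB %, xHR/FB %) by season, copied from the table in the article.
# Assumption: the four 2008 values belong to Weaver and Cain,
# so Kershaw has no 2008 entry.
TABLE = {
    "Jered Weaver": {
        2008: (8.3, 8.7), 2009: (8.3, 7.9), 2010: (7.8, 8.4), 2011: (6.3, 6.8),
        2012: (8.6, 7.3), 2013: (7.8, 7.9), 2014: (8.9, 7.9),
    },
    "Matt Cain": {
        2008: (6.8, 9.8), 2009: (8.4, 10.6), 2010: (7.4, 9.8), 2011: (3.7, 8.4),
        2012: (8.4, 9.2), 2013: (10.8, 10.9), 2014: (13.7, 11.6),
    },
    "Clayton Kershaw": {
        2009: (4.1, 9.3), 2010: (5.8, 9.4), 2011: (6.7, 7.1),
        2012: (8.1, 9.6), 2013: (5.8, 8.3), 2014: (6.6, 9.8),
    },
}


def mean_gap(seasons):
    """Average of (actual HR/FB - xHR/FB) in percentage points."""
    gaps = [actual - expected for actual, expected in seasons.values()]
    return sum(gaps) / len(gaps)


gaps = {name: mean_gap(seasons) for name, seasons in TABLE.items()}
for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f} percentage points")
```

A gap near zero means the formula already expects the low rate, while a clearly negative gap means the pitcher beat his xHR/FB on average over these seasons.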

While this was a fun little exercise, the equation developed isn’t all that helpful. The search continues as we try explaining HR/FB rate differences that aren’t totally chalked up to ballpark and luck.

17 Comments

Rotoholic

10 years ago

How predictive is it? Is there a statistical significance when you use a formula like this:

xHR/FB Year n+1 = Constant A + (Avg Dist Year n * Constant B) + (Avg Absolute Angle Year n * Constant C) + (Std Dev Dist Year n * Constant D)

It’s projection season, so this is a lot more useful than using in-sample testing. I assume that doing an xHR/FB for year n+1 would also include prior HR/FB rates in some capacity, too.

Mike Podhorzer (FanGraphs Staff)

Reply to Rotoholic

I haven’t done the work on its predictive value.
