Devising a Pitcher xHR/FB Rate

I was pretty successful at developing an equation to estimate what a hitter’s HR/FB rate should be given his fly ball + home run distance, along with the average absolute angle and standard deviation of the distance of those batted balls. My formula resulted in an R-squared mark of 0.649, which seems pretty darn good to me, especially when it completely ignores park factors, which we know play a significant role. On Tuesday, I found that the year-to-year correlation of a pitcher’s batted ball distance is less than half that of a hitter’s. Then yesterday, I discovered there was some correlation between a pitcher’s batted ball distance and his HR/FB rate and ISO mark.

In that last article, I teased that Jeff Zimmerman also armed me with the same angle and standard deviation data he provides me for hitters, but for pitchers. So naturally, my first inclination was to line up all the data and run those three variables to attempt to devise a pitcher version of the xHR/FB rate equation.

Once again, my data set consisted of 663 player seasons from 2008 to 2014. The best-fit equation was:

xHR/FB = -0.4211 + (Avg Dist * 0.0013) + (Avg Absolute Angle * 0.0036) + (Std Dev Dist * 0.0016)
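For anyone who wants to plug numbers in, here is a minimal Python sketch of the equation above. The function name and the example inputs are made up for illustration. It also assumes distance is in feet, angle is in degrees, and the result is a fraction (0.095 meaning 9.5%), none of which the equation itself spells out.

```python
def pitcher_xhr_fb(avg_dist, avg_abs_angle, std_dev_dist):
    """Apply the best-fit pitcher equation; returns xHR/FB as a fraction.

    Assumed units (not stated in the article): feet for the two distance
    inputs, degrees for the average absolute angle.
    """
    return (-0.4211
            + avg_dist * 0.0013
            + avg_abs_angle * 0.0036
            + std_dev_dist * 0.0016)


# Hypothetical inputs, not taken from the 663-season data set
example = pitcher_xhr_fb(avg_dist=280.0, avg_abs_angle=20.0, std_dev_dist=50.0)
print(f"{example:.1%}")
```

With those made-up inputs the three terms contribute 0.364, 0.072, and 0.080, which after the intercept works out to roughly 9.5%.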

Just like in the hitter’s version, angle plays the most important role, followed by standard deviation. Who would have thought that average distance would bring up the rear? Unfortunately, this equation isn’t nearly as good as the hitter’s one:

Adjusted R-squared = 0.276
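Since the hitter figure was quoted as a plain R-squared and this one is adjusted, it is worth checking how much the adjustment matters. The sketch below uses the standard adjusted R-squared formula with the 663 player seasons and three predictors from this post; the function names are illustrative, and it assumes the regression included an intercept.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: penalizes R-squared for the number of predictors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


def r_squared_from_adjusted(adj_r_squared, n_obs, n_predictors):
    """Invert the adjustment to recover the unadjusted R-squared."""
    return 1.0 - (1.0 - adj_r_squared) * (n_obs - n_predictors - 1) / (n_obs - 1)


# 663 player seasons and 3 predictors, as described in the post
implied_r2 = r_squared_from_adjusted(0.276, n_obs=663, n_predictors=3)
print(round(implied_r2, 3))
```

With a sample this large and only three variables, the unadjusted value comes out at about 0.279, so the adjustment barely moves the number and the gap to the hitter equation is real.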

That’s not terrible, but it’s quite low. Here’s a scatter plot of xHR/FB vs HR/FB:

[Figure: scatter plot of pitcher xHR/FB vs. HR/FB]

Because I know this question will be asked, the dot all by its lonesome at the top on the 16% xHR/FB line is Miguel Gonzalez, 2014 version, with a 12.1% actual HR/FB.

In this data set, the actual HR/FB rate ranged from 3.1% to 19.2%, while the xHR/FB rate ranged from only 6.6% to 16.0%. That gap seems like part of the problem with coming up with an equation to estimate it. A pitcher's HR/FB rate is very fluky, jumps all over the place, and is mostly controlled by the hitter. So even if we thought we knew what HR/FB rate a pitcher should be allowing, his actual rate isn't going to end up there the majority of the time due to randomness.

For those curious, here are three notorious HR/FB rate suppressors:

        Jered Weaver       Matt Cain          Clayton Kershaw
Year    HR/FB   xHR/FB     HR/FB   xHR/FB     HR/FB   xHR/FB
2008    8.3%    8.7%       6.8%    9.8%       -       -
2009    8.3%    7.9%       8.4%    10.6%      4.1%    9.3%
2010    7.8%    8.4%       7.4%    9.8%       5.8%    9.4%
2011    6.3%    6.8%       3.7%    8.4%       6.7%    7.1%
2012    8.6%    7.3%       8.4%    9.2%       8.1%    9.6%
2013    7.8%    7.9%       10.8%   10.9%      5.8%    8.3%
2014    8.9%    7.9%       13.7%   11.6%      6.6%    9.8%

Looks like Weaver’s HR/FB prevention skills are legit, at least according to this xHR/FB formula. So maybe it’s not just the pitcher-friendly ballpark. Cain had some magic going on earlier in his career, certainly with some help from his home park, but that magic has seemingly disappeared. Kershaw is obviously not human and defies any sort of formula. He’s a true outlier, which formulas aren’t meant to work for.

While this was a fun little exercise, the equation developed isn’t all that helpful. The search continues as we try to explain HR/FB rate differences that can’t be chalked up entirely to ballpark and luck.

Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year. He produces player projections using his own forecasting system and is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. His projections helped him win the inaugural 2013 Tout Wars mixed draft league. Follow Mike on Twitter @MikePodhorzer and contact him via email.

Skin Blues

How predictive is it? Is there any statistical significance when you use a formula like this:

xHR/FB Year n+1 = Constant A + (Avg Dist Year n * Constant B) + (Avg Absolute Angle Year n * Constant C) + (Std Dev Dist Year n * Constant D)

It’s projection season, so this is a lot more useful than using in-sample testing. I assume that doing an xHR/FB for year n+1 would also include prior HR/FB rates in some capacity, too.
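As a rough sketch of how that year n to year n+1 setup could be assembled, the Python below pairs each pitcher season with the same pitcher's following season, so that year n inputs (including the prior HR/FB rate) line up with the year n+1 HR/FB rate. The field names, the helper name, and every number in the sample rows are made up for illustration; the pairs would still need to be fed into a regression to estimate Constants A through D.

```python
def pair_consecutive_seasons(rows):
    """Match each pitcher season (year n) with the same pitcher's year n+1."""
    by_key = {(r["pitcher"], r["year"]): r for r in rows}
    pairs = []
    for pitcher, year in sorted(by_key):
        cur = by_key[(pitcher, year)]
        nxt = by_key.get((pitcher, year + 1))
        if nxt is None:
            continue  # no following season on record, so nothing to predict
        pairs.append({
            "pitcher": pitcher,
            "year_n": year,
            # Year n predictors
            "avg_dist": cur["avg_dist"],
            "avg_abs_angle": cur["avg_abs_angle"],
            "std_dev_dist": cur["std_dev_dist"],
            "hr_fb_prior": cur["hr_fb"],
            # Year n+1 target
            "hr_fb_next": nxt["hr_fb"],
        })
    return pairs


# Made-up rows for two placeholder pitchers (not real data)
rows = [
    {"pitcher": "Pitcher A", "year": 2012, "avg_dist": 275.0,
     "avg_abs_angle": 19.0, "std_dev_dist": 48.0, "hr_fb": 0.090},
    {"pitcher": "Pitcher A", "year": 2013, "avg_dist": 281.0,
     "avg_abs_angle": 21.0, "std_dev_dist": 51.0, "hr_fb": 0.110},
    {"pitcher": "Pitcher A", "year": 2014, "avg_dist": 278.0,
     "avg_abs_angle": 20.0, "std_dev_dist": 50.0, "hr_fb": 0.100},
    {"pitcher": "Pitcher B", "year": 2012, "avg_dist": 270.0,
     "avg_abs_angle": 18.0, "std_dev_dist": 47.0, "hr_fb": 0.080},
    {"pitcher": "Pitcher B", "year": 2014, "avg_dist": 284.0,
     "avg_abs_angle": 22.0, "std_dev_dist": 52.0, "hr_fb": 0.120},
]

pairs = pair_consecutive_seasons(rows)
for p in pairs:
    print(p["pitcher"], p["year_n"], "->", p["year_n"] + 1, p["hr_fb_next"])
```

Pitcher B drops out here because the 2013 season is missing, which is the kind of sample shrinkage an out-of-sample test like this would have to live with.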