The xHR/FB Rate Equation Unmasked

January 26, 2015

A year ago, I took our batted ball distance and HR/FB rate analysis to its final step. Armed with additional metrics from our in-house math ninja Jeff Zimmerman, I published an equation to predict a hitter’s home run per fly ball rate, which I cleverly dubbed xHR/FB. Unfortunately, the actual equation has been hidden behind the FG+ paywall ever since. No longer.

In recent weeks, I have discussed the two additional components of the equation, along with the leaders and laggards in each during the 2014 season. They are the average absolute angle of a hitter’s fly balls and home runs and the standard deviation of the distances (SDD) of those batted balls. I also calculated the year-to-year correlations of these two metrics along with batted ball distance, finding that the SDD was rather stable, while the angle jumped around from season to season.

The study was performed on a population of 2,645 hitter seasons from 2008 to 2013, each with a minimum of 20 home runs plus fly balls.

And the equation is…

xHR/FB = -0.8895 + (Avg Dist * 0.0025) + (Avg Absolute Angle * 0.0048) + (Std Dev Dist * 0.0038)

Adjusted R-Squared = 0.649
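
For anyone who wants to plug numbers into the equation, here is a minimal Python sketch. The coefficients are the rounded ones published above; the function name, the sample hitter, and the units (feet for distances, degrees for angle) are assumptions made for illustration.

```python
# Coefficients as published above (rounded to four decimals).
INTERCEPT = -0.8895
COEF_AVG_DIST = 0.0025        # average fly ball + home run distance (assumed feet)
COEF_AVG_ABS_ANGLE = 0.0048   # average absolute angle (assumed degrees)
COEF_STD_DEV_DIST = 0.0038    # standard deviation of distance, SDD (assumed feet)


def xhr_fb(avg_dist, avg_abs_angle, std_dev_dist):
    """Return the expected HR/FB rate as a fraction (0.10 means 10%)."""
    return (
        INTERCEPT
        + avg_dist * COEF_AVG_DIST
        + avg_abs_angle * COEF_AVG_ABS_ANGLE
        + std_dev_dist * COEF_STD_DEV_DIST
    )


# Hypothetical hitter, not taken from the study's data.
rate = xhr_fb(avg_dist=290.0, avg_abs_angle=20.0, std_dev_dist=55.0)
print(f"xHR/FB = {rate:.1%}")
```

Because the published coefficients are rounded, results from this sketch may differ by a few tenths of a percentage point from figures computed with the full-precision equation.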

Now for a plot of the results:

How important is each component of the formula? I decided to run an experiment, using a baseline (values close to the league average) for each metric and calculating the xHR/FB rate. The baseline xHR/FB rate was 10.9%. I then increased each metric one by one by 10%, leaving the other two metrics at their baseline values, to find the new xHR/FB rate. These were the changes:

Metric	Baseline	10% Increase	New xHR/FB
Avg Dist	275.0	302.5	17.8%
Avg Absolute Angle	19.5	21.5	11.8%
Std Dev Dist	56.0	61.6	13.0%

Not surprisingly, batted ball distance plays the biggest role. A 10% increase in distance shoots the xHR/FB rate up from a 10.9% mark all the way to 17.8%. The other two metrics play more minor roles, with the SDD proving to be a little more significant than the angle. Of course, increasing your batted ball distance by 10% is much more difficult than doing the same with either of the other two metrics.
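
The experiment can be re-run as a short Python sketch using the rounded coefficients from the equation. The function and variable names are assumptions made for illustration. With the rounded coefficients the baseline works out to roughly 10.4% rather than 10.9%, presumably because the original calculation used unrounded coefficients, but the changes from a 10% bump (about 6.9, 0.9, and 2.1 percentage points) line up with the table.

```python
def xhr_fb(avg_dist, avg_abs_angle, std_dev_dist):
    """Expected HR/FB rate as a fraction, using the rounded published coefficients."""
    return (
        -0.8895
        + avg_dist * 0.0025
        + avg_abs_angle * 0.0048
        + std_dev_dist * 0.0038
    )


# Baseline values from the experiment (close to league average).
baseline = {"avg_dist": 275.0, "avg_abs_angle": 19.5, "std_dev_dist": 56.0}
base_rate = xhr_fb(**baseline)

# Bump one metric at a time by 10%, holding the other two at baseline.
changes = {}
for name in baseline:
    bumped = dict(baseline)
    bumped[name] = baseline[name] * 1.10
    changes[name] = xhr_fb(**bumped) - base_rate

print(f"baseline: {base_rate * 100:.2f}%")
for name, delta in changes.items():
    print(f"{name}: {delta * 100:+.2f} percentage points")
```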

So using same-season data, the equation works pretty well to estimate what a hitter’s HR/FB rate should be, as the adjusted R-squared above attests. But how well does it do when forecasting the following season? My population set was composed of 1,198 player seasons from 2009 to 2014. I calculated the correlation of the players’ xHR/FB rates in Year 1 to their actual HR/FB rates in Year 2. I then compared that correlation to the correlation of that data set’s HR/FB rate in Year 1 to HR/FB rate in Year 2. Is my xHR/FB rate better at predicting the following season’s HR/FB rate than HR/FB rate itself? Let’s find out:

Metric	Correlation
xHR/FB Yr 1 to HR/FB Yr 2	0.590
HR/FB Yr 1 to HR/FB Yr 2	0.578
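
The comparison boils down to two correlations against the same Year 2 column. Here is a minimal Python sketch of that calculation; the five hitters and their rates are made up purely for illustration and are not from the study's 1,198 player seasons, and the helper name `pearson` is an assumption.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up rates for five hypothetical hitters (fractions, not percentages).
xhr_fb_year1 = [0.08, 0.12, 0.15, 0.10, 0.18]
hr_fb_year1 = [0.06, 0.14, 0.13, 0.11, 0.20]
hr_fb_year2 = [0.07, 0.11, 0.16, 0.09, 0.17]

# Which Year 1 number tracks Year 2 HR/FB more closely?
r_expected = pearson(xhr_fb_year1, hr_fb_year2)
r_actual = pearson(hr_fb_year1, hr_fb_year2)

print(f"xHR/FB Yr 1 to HR/FB Yr 2: {r_expected:.3f}")
print(f"HR/FB Yr 1 to HR/FB Yr 2:  {r_actual:.3f}")
```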

Success! Barely. But there is reason for more excitement than appears at first glance. An acknowledged weakness of my equation is that it ignores park effects. We know that ballpark plays an enormous role in a hitter’s HR/FB rate, but unfortunately there is no easy way for me to incorporate this type of data. So the fact that an equation that pretends every hitter plays in a neutral park still fared ever so slightly better than just relying on the previous year’s HR/FB rate is a huge win.

Knowing that the metric ignores park effects, you could then take the xHR/FB rate mark and make your own park adjustment to it. But there is a caveat when considering park factors. The conditions at some parks affect the distance of the batted balls themselves. Examples include Coors Field and Petco Park. In those situations, the distances will already account for that particular park effect and therefore further adjustments to the xHR/FB rate won’t be necessary. Essentially, adjustments should only be made for fence distances and any other factors that don’t directly impact batted ball distance.

As a reminder, the average absolute angle and SDD metrics are not currently available online, so at this point, you will be unable to calculate xHR/FB rate marks on your own.

During the rest of the week, I’ll look into the hitters who most underperformed and overperformed their xHR/FB rates in 2014.

24 Comments

wgmcd

10 years ago

Have you thought about trying to model home run rate by using each player as a replicate, but instead of home run / FB % use a binomial distribution with each fly ball being a trial? As it is set up right now, you’re giving equal weight to a player that hit 20 fly balls as one who hit 200. Also, using multiple model inference approaches such as AIC would give you a clearer idea of which model is performing better when looking at future years.

Mike Podhorzer (FanGraphs Staff)

Reply to wgmcd

I wish I was smart enough to know what you were talking about! But yes, my study was based on just a basic regression analysis.

novaether

Reply to Mike Podhorzer

Just google how to do a weighted regression in excel. You should find that the gap between the two R^2s increases when you start weighting if your model is truly superior.
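
The weighting idea raised in this thread can be sketched in a few lines of Python rather than Excel. This is a simplified illustration, not the study's method: it fits HR/FB against a single predictor (average distance) by weighted least squares, with each hitter weighted by his fly ball count so that a 200-fly-ball season counts for more than a 20-fly-ball one. The function name and all sample numbers are hypothetical.

```python
def weighted_linear_fit(xs, ys, weights):
    """Fit y = intercept + slope * x, minimizing sum(w * residual**2)."""
    total_w = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total_w
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up hitter seasons: average distance, HR/FB rate, fly ball count.
avg_dist = [265.0, 280.0, 295.0, 310.0]
hr_fb = [0.05, 0.11, 0.14, 0.22]
fly_balls = [20, 200, 150, 40]

# Unweighted fit (all weights equal) versus a fit weighted by fly ball count.
unweighted = weighted_linear_fit(avg_dist, hr_fb, [1] * len(avg_dist))
weighted = weighted_linear_fit(avg_dist, hr_fb, fly_balls)

print("unweighted intercept, slope:", unweighted)
print("weighted intercept, slope:  ", weighted)
```

The binomial approach suggested earlier in the thread goes a step further by treating each fly ball as a trial; the weighted fit above is only the simpler of the two ideas.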
