Introducing Batter xHR/FB Rate, Version 4.0: The Correlations

Yesterday, I shared the history of my xHR/FB rate equation and the first pieces of research on my journey toward developing Version 4.0. Today, I’ll discuss a myriad of correlations for a myriad of metrics and how those calculations helped me determine which would win a spot in my final equation. Fun!

First, I gathered as much Statcast data as possible that could potentially drive HR/FB rate. I painstakingly downloaded various metric totals for all batters going back to 2015 with at least 50 fly balls plus line drives. That left me with 2,109 batter seasons with which to do lots and lots of math.
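The qualifying filter described above can be sketched in a few lines of Python. This is only an illustration, not the actual workflow used for the article: the record layout, the batted ball type labels, and the batter names are all hypothetical.

```python
from collections import defaultdict

MIN_FB_LD = 50  # minimum fly balls plus line drives, per the text


def qualified_batter_seasons(events, min_fb_ld=MIN_FB_LD):
    """Return the (batter, season) pairs with at least min_fb_ld FB+LD.

    `events` is an iterable of (batter, season, bb_type) tuples. The
    bb_type labels "fly_ball" and "line_drive" are assumed here.
    """
    counts = defaultdict(int)
    for batter, season, bb_type in events:
        if bb_type in ("fly_ball", "line_drive"):
            counts[(batter, season)] += 1
    return {key for key, n in counts.items() if n >= min_fb_ld}


# Made-up sample: Batter A has 55 FB+LD, Batter B has only 30.
events = (
    [("Batter A", 2019, "fly_ball")] * 30
    + [("Batter A", 2019, "line_drive")] * 25
    + [("Batter A", 2019, "ground_ball")] * 40
    + [("Batter B", 2019, "fly_ball")] * 20
    + [("Batter B", 2019, "line_drive")] * 10
)
qualified = qualified_batter_seasons(events)
print(qualified)  # {('Batter A', 2019)}
```

Ground balls are ignored when counting toward the threshold, so a batter with plenty of batted balls overall can still miss the cut.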

I began with correlations. Does the metric even correlate with HR/FB rate, and if so, is it a strong correlation? What follows is a table of correlations with HR/FB rate and definitions for each metric, even if they seem obvious from the name.

Correlations with HR/FB

Metric | Correlation
Avg Dist FB+LD | 0.783
Barrel FB% | 0.763
Avg Dist FB | 0.721
Barrel LD% | 0.689
Std Dev of Dist FB+LD | 0.593
Avg Dist LD | 0.438
Pull LD% | 0.369
Pull FB% | 0.349
Oppo FB% | -0.286
Spin Rate FB | 0.144
Solid Contact FB% | 0.106
Straightaway FB% | -0.036
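For anyone wanting to reproduce figures like these, here is a minimal sketch of a correlation calculation. It assumes the values are Pearson correlation coefficients, which the article does not state explicitly, and the sample numbers are invented purely to show the mechanics.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for five hypothetical batter seasons (not real data).
avg_dist_fb_ld = [285.0, 292.0, 301.0, 310.0, 318.0]
hr_fb_rate = [0.08, 0.11, 0.12, 0.17, 0.21]

r = pearson_r(avg_dist_fb_ld, hr_fb_rate)
print(round(r, 3))
```

In the real data set, each list would hold one value per batter season, so each would have 2,109 entries.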

For maximum understanding, I sorted the metrics in order of the absolute value of their correlations, as we don’t care if the correlation is positive or negative, just that it’s strong.
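The ordering can be reproduced with a short Python sketch using the correlations from the table. The dictionary and variable names are just illustrative choices.

```python
# Correlations with HR/FB, copied from the table.
correlations = {
    "Avg Dist FB+LD": 0.783,
    "Barrel FB%": 0.763,
    "Avg Dist FB": 0.721,
    "Barrel LD%": 0.689,
    "Std Dev of Dist FB+LD": 0.593,
    "Avg Dist LD": 0.438,
    "Pull LD%": 0.369,
    "Pull FB%": 0.349,
    "Oppo FB%": -0.286,
    "Spin Rate FB": 0.144,
    "Solid Contact FB%": 0.106,
    "Straightaway FB%": -0.036,
}

# Rank by strength (absolute value), ignoring the sign.
ranked = sorted(correlations.items(), key=lambda kv: abs(kv[1]), reverse=True)

for metric, r in ranked:
    print(f"{metric:<24}{r:>7.3f}")
```

Sorting by the signed value instead would push Oppo FB% to the very bottom, below Straightaway FB%, even though its relationship with HR/FB is far stronger.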

We find three metrics that correlate above 0.70 with HR/FB:

Avg Dist FB+LD — average batted ball distance of all fly balls and line drives
Barrel FB% — percentage of fly balls classified as “Barrel”
Avg Dist FB — average batted ball distance of all fly balls
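To make the definitions concrete, here is a rough Python sketch of how these rates could be computed from a batter's individual batted balls. The `BattedBall` record, its field names, and the sample values are hypothetical and do not reflect Statcast's actual export format.

```python
from dataclasses import dataclass


@dataclass
class BattedBall:
    bb_type: str        # "fly_ball" or "line_drive" (assumed labels)
    distance_ft: float  # batted ball distance in feet
    is_barrel: bool     # whether the ball was classified as a "Barrel"


def avg_distance(balls, types):
    """Average distance over batted balls whose type is in `types`."""
    dists = [b.distance_ft for b in balls if b.bb_type in types]
    return sum(dists) / len(dists) if dists else float("nan")


def barrel_rate(balls, bb_type):
    """Share of batted balls of `bb_type` that were classified as barrels."""
    subset = [b for b in balls if b.bb_type == bb_type]
    if not subset:
        return float("nan")
    return sum(b.is_barrel for b in subset) / len(subset)


# Made-up batted balls for one hypothetical batter season.
balls = [
    BattedBall("fly_ball", 380.0, True),
    BattedBall("fly_ball", 300.0, False),
    BattedBall("fly_ball", 340.0, False),
    BattedBall("fly_ball", 400.0, True),
    BattedBall("line_drive", 250.0, False),
    BattedBall("line_drive", 330.0, True),
]

avg_dist_fb = avg_distance(balls, {"fly_ball"})                   # 355.0
avg_dist_fb_ld = avg_distance(balls, {"fly_ball", "line_drive"})  # about 333.3
barrel_fb_pct = barrel_rate(balls, "fly_ball")                    # 0.5
```

The same `barrel_rate` helper applies to line drives by passing `"line_drive"` instead.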

It doesn’t make sense to use both Avg Dist FB+LD and Avg Dist FB in my equation, so you’ll have to wait and see which one made the cut. Barrel FB%, meanwhile, makes a compelling case to be included.

Next, we find a metric all alone in the 0.60+ range:

Barrel LD% — percentage of line drives classified as “Barrel”

Aha! That follows from what we learned yesterday: it’s not just barreled fly balls that go for homers at a high rate, as barreled line drives are also highly desirable. The high correlation here confirms that Barrel LD% is an important rate that cannot be ignored when projecting HR/FB rates.

Moving along, we find yet another metric alone in the 0.50+ range:

Std Dev of Dist FB+LD — standard deviation of batted ball distances of all fly balls and line drives

Remember my inaugural edition of xHR/FB? It included the standard deviation of fly ball distance, but the variable disappeared from later versions of my equation because I wasn’t able to access the data, or didn’t know how to. I then realized I could download all the batted ball data from Statcast and calculate the standard deviations myself. It was a time-consuming process, but I did just that. Turns out, this was a very good idea, as the metric correlates strongly with HR/FB.

What this tells us is how varied a batter’s distances are on those batted ball types. A batter whose batted ball distance is ultra consistent is going to have a lower Std Dev of Dist than one who alternates shorter distances and big blasts. Guess which batter is likely to post a higher HR/FB rate? The big blaster, of course.
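A quick Python sketch shows the idea with two invented batters who share the same average distance. The distances are made up, and the sample standard deviation is used here as an assumption, since the article does not say whether the sample or population formula was applied.

```python
import statistics

# Made-up batted ball distances in feet for two hypothetical batters.
consistent = [310, 315, 320, 325, 330]
boom_or_bust = [250, 270, 320, 370, 390]

# Both batters average exactly 320 feet...
print(statistics.mean(consistent), statistics.mean(boom_or_bust))  # 320 320

# ...but the spread of their distances is very different.
sd_consistent = statistics.stdev(consistent)
sd_boom_or_bust = statistics.stdev(boom_or_bust)
print(round(sd_consistent, 1))    # 7.9
print(round(sd_boom_or_bust, 1))  # 60.8
```

Only the second batter ever reaches the 370-plus foot range, which is why a higher standard deviation can signal more home run potential at the same average distance.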

Next is Avg Dist LD, which is the same as defined above, but limited to just line drives. The question eventually became whether I wanted to combine the two batted ball types and use Avg Dist FB+LD or separate them and use both Avg Dist FB and Avg Dist LD.

Next, we get to the batted ball direction rates:

Pull LD% — percentage of line drives that were pulled
Pull FB% — percentage of fly balls that were pulled

While we would have expected the two Pull rates to have similar correlations, it’s mildly interesting that Pull LD% actually sports a higher one than its FB counterpart. It suggests that both should be part of the equation.

After those two, we encounter the first negatively correlated metric:

Oppo FB% — percentage of fly balls that were hit the opposite way

That makes sense! If pulling flies and liners is key to hitting home runs, then going the opposite way makes it quite a bit more challenging to get that ball over the wall.

Now, I present to you a metric I’ve never used before and wasn’t even aware was available:

Spin Rate FB — the average spin rate of all fly balls

We always hear that more spin results in a ball that will travel farther. Now we have some confirmation! It’s a pretty low correlation though, ranking near the bottom of the heap.

Last, we finish with the remaining two metrics with low correlations:

Solid Contact FB% — percentage of fly balls classified as “Solid Contact”
Straightaway FB% — percentage of fly balls that were hit straightaway

After discovering that 21.1% of fly balls classified as “Solid Contact” went for a home run during the Statcast era, I was pretty surprised to find the correlation of that rate with HR/FB rate so low. Weird. Strange. Baffling.

Fly balls hit straightaway don’t matter. They are slightly negatively correlated, so I guess that’s something, but the low correlation suggests they could simply be ignored.

So there you have it, correlations galore! Tomorrow, I will unveil xHR/FB v4.0. Then we can have some fun discussing the components and the hitters who strayed furthest from what the equation calculated, representing potential home run sleepers and busts in 2021.

Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year. He produces player projections using his own forecasting system and is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. His projections helped him win the inaugural 2013 Tout Wars mixed draft league. Follow Mike on Twitter @MikePodhorzer and contact him via email.

Lunch Angle

so.. when are the 2021 Pod Projections coming out? 😀


Yes, inquiring minds want to know!