Introducing Batter xHR/FB Rate, Version 4.0: The Correlations
Yesterday, I shared the history of my xHR/FB rate equation and the first pieces of research on my journey toward developing Version 4.0. Today, I’ll discuss a myriad of correlations for a myriad of metrics and how those calculations helped me determine which would win a spot in my final equation. Fun!
First, I gathered as much Statcast data as possible that could potentially drive HR/FB rate. I painstakingly downloaded various metric totals for all batters going back to 2015 with at least 50 fly balls plus line drives. That left me with 2,109 batter seasons with which to do lots and lots of math.
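As a rough sketch of that sample-size filter, the snippet below keeps only batter seasons with at least 50 fly balls plus line drives. The `BatterSeason` class, its field names, and the sample rows are all hypothetical stand-ins rather than the actual Statcast download, and HR/FB is assumed here to be home runs divided by fly balls.

```python
from dataclasses import dataclass

MIN_FB_PLUS_LD = 50  # cutoff stated in the article: at least 50 fly balls plus line drives


@dataclass
class BatterSeason:
    # Hypothetical record layout; the real Statcast export uses different columns.
    name: str
    season: int
    fly_balls: int
    line_drives: int
    home_runs: int

    @property
    def hr_per_fb(self) -> float:
        # Assumption: HR/FB is home runs divided by fly balls.
        return self.home_runs / self.fly_balls if self.fly_balls else 0.0


def qualified(seasons):
    """Keep only batter seasons that meet the FB+LD minimum."""
    return [s for s in seasons if s.fly_balls + s.line_drives >= MIN_FB_PLUS_LD]


# Made-up rows purely for illustration.
sample = [
    BatterSeason("Batter A", 2019, fly_balls=120, line_drives=90, home_runs=24),
    BatterSeason("Batter B", 2020, fly_balls=20, line_drives=25, home_runs=3),  # 45 total, dropped
    BatterSeason("Batter C", 2020, fly_balls=30, line_drives=20, home_runs=6),  # exactly 50, kept
]

kept = qualified(sample)
for s in kept:
    print(s.name, s.season, round(s.hr_per_fb, 3))
```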
I began with correlations. Does the metric even correlate with HR/FB rate, and if so, is it a strong correlation? What follows is a table of correlations with HR/FB rate and definitions for each metric, even if they seem obvious from the name.
Metric | Correlation with HR/FB |
---|---|
Avg Dist FB+LD | 0.783 |
Barrel FB% | 0.763 |
Avg Dist FB | 0.721 |
Barrel LD% | 0.689 |
Std Dev of Dist FB+LD | 0.593 |
Avg Dist LD | 0.438 |
Pull LD% | 0.369 |
Pull FB% | 0.349 |
Oppo FB% | -0.286 |
Spin Rate FB | 0.144 |
Solid Contact FB% | 0.106 |
Straightaway FB% | -0.036 |
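For readers who want to run this kind of calculation themselves, here is a minimal sketch of a correlation coefficient. It assumes the standard Pearson formula, which the article does not explicitly name, and the two input lists are made-up numbers rather than real batter seasons.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values: one metric (say, Barrel FB%) and HR/FB rate for four batter seasons.
barrel_fb_pct = [10.0, 20.0, 30.0, 40.0]
hr_fb_pct = [8.0, 12.0, 19.0, 25.0]

r = pearson(barrel_fb_pct, hr_fb_pct)
print(f"correlation: {r:.3f}")
```

Running the same function once per metric against the HR/FB column would produce a table like the one above.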
For maximum understanding, I sorted the metrics in order of the absolute value of their correlations, as we don’t care if the correlation is positive or negative, just that it’s strong.
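That ordering can be reproduced in a few lines. The sketch below takes the correlations from the table above and sorts them by absolute value, so a negative correlation such as Oppo FB% slots in by strength rather than sign. The dictionary name and the print formatting are just illustrative choices.

```python
# Correlations with HR/FB rate, copied from the table above (listed alphabetically here).
correlations = {
    "Avg Dist FB": 0.721,
    "Avg Dist FB+LD": 0.783,
    "Avg Dist LD": 0.438,
    "Barrel FB%": 0.763,
    "Barrel LD%": 0.689,
    "Oppo FB%": -0.286,
    "Pull FB%": 0.349,
    "Pull LD%": 0.369,
    "Solid Contact FB%": 0.106,
    "Spin Rate FB": 0.144,
    "Std Dev of Dist FB+LD": 0.593,
    "Straightaway FB%": -0.036,
}

# Sort by strength of the relationship, ignoring its direction.
ranked = sorted(correlations.items(), key=lambda item: abs(item[1]), reverse=True)

for metric, r in ranked:
    print(f"{metric:<24}{r:>7.3f}")
```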
We find three metrics that correlate above 0.70 with HR/FB:
Avg Dist FB+LD — average batted ball distance of all fly balls and line drives
Barrel FB% — percentage of fly balls classified as “Barrel”
Avg Dist FB — average batted ball distance of all fly balls
It doesn’t make sense to use both Avg Dist FB+LD and Avg Dist FB in my equation, since the fly balls behind one are already counted in the other, so you’ll have to wait and see which one made the cut. Barrel FB%, meanwhile, makes a compelling case for inclusion.
Next, we find a metric all alone in the 0.60+ range:
Barrel LD% — percentage of line drives classified as “Barrel”
Aha! That matches what we learned yesterday: it’s not just barreled fly balls that go for homers at a high rate, as barreled line drives are also highly desirable. The high correlation here confirms that Barrel LD% is an important rate that cannot be ignored when projecting HR/FB rates.
Moving along, we find yet another metric alone in the 0.50+ range:
Std Dev of Dist FB+LD — standard deviation of batted ball distances of all fly balls and line drives
Remember my inaugural edition of xHR/FB? It included the standard deviation of fly ball distances, but the variable disappeared from my equation in later versions because I wasn’t able to access the data, or didn’t know how to. I then realized I could download all the batted ball data from Statcast and calculate the standard deviations myself. It was a time-consuming process, but I did just that. Turns out, this was a very good idea, as the metric correlates strongly with HR/FB.
What this tells us is how varied a batter’s distances are on those batted ball types. A batter whose batted ball distance is ultra consistent is going to have a lower Std Dev of Dist than one who alternates shorter distances and big blasts. Guess which batter is likely to post a higher HR/FB rate? The big blaster, of course.
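To make that concrete, here is a small sketch with two hypothetical batters whose fly balls and line drives average the same distance but are spread very differently. All distances are invented, the 350-foot mark is an arbitrary stand-in for "far enough to have a chance at leaving the park," and the sample standard deviation is an assumption, since the article does not say whether a sample or population formula was used.

```python
from statistics import mean, stdev

# Invented batted ball distances in feet; both lists average exactly 300.
consistent = [300, 305, 310, 295, 290]
boom_or_bust = [220, 240, 300, 360, 380]

DEEP_FT = 350  # arbitrary illustrative threshold, not a real fence distance

for label, distances in (("consistent", consistent), ("boom or bust", boom_or_bust)):
    deep = sum(1 for d in distances if d >= DEEP_FT)
    print(
        f"{label:<13} avg={mean(distances):.1f}  "
        f"std dev={stdev(distances):.1f}  balls >= {DEEP_FT} ft: {deep}"
    )
```

The averages are identical, yet only the high-variance batter has any balls past the threshold, which is the intuition behind including Std Dev of Dist alongside average distance.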
Next is Avg Dist LD, which is the same as defined above, but limited to just line drives. The question eventually became whether I wanted to combine the two batted ball types and use Avg Dist FB+LD or separate them and use both Avg Dist FB and Avg Dist LD.
Finally, we get the batted ball direction rates:
Pull LD% — percentage of line drives that were pulled
Pull FB% — percentage of fly balls that were pulled
While we would have expected the two Pull rates to have similar correlations, it’s mildly interesting that Pull LD% actually sports a higher one than its FB counterpart. It suggests that both should be part of the equation.
After those two, we encounter the first negatively correlated metric:
Oppo FB% — percentage of fly balls that were hit the opposite way
That makes sense! If pulling flies and liners is key to hitting home runs, then going the opposite way makes it quite a bit harder to get the ball over the wall.
Now, I present to you a metric I’ve never used before and wasn’t even aware was available:
Spin Rate FB — the average spin rate of all fly balls
We always hear that more spin results in a ball that will travel farther. Now we have some confirmation! It’s a pretty low correlation though, at just 0.144, ranking near the bottom of the heap.
Last, we finish with the remaining two metrics with low correlations:
Solid Contact FB% — percentage of fly balls classified as “Solid Contact”
Straightaway FB% — percentage of fly balls that were hit straightaway
After discovering that 21.1% of fly balls classified as “Solid Contact” went for a home run during the Statcast era, I was pretty surprised to find the correlation of that rate with HR/FB rate so low. Weird. Strange. Baffling.
Fly balls hit straightaway don’t matter. They are slightly negatively correlated, so I guess that’s something, but the low correlation suggests they could simply be ignored.
So there you have it, correlations galore! Tomorrow, I will unveil xHR/FB v4.0, and then we can have some fun discussing the components and the hitters who strayed furthest from what the equation calculated, representing potential home run sleepers and busts in 2021.
Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year and three-time Tout Wars champion. He is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. Follow Mike on X @MikePodhorzer and contact him via email.
so.. when are the 2021 Pod Projections coming out? 😀
Yes, inquiring minds want to know!
LOL, I’m soooooo close. Going to try my hardest to publish tonight.