Introducing Batter xHR/FB Rate, Version 4.0: The Research

If it’s really true that Chicks Dig the Long Ball, then how do they feel about the nerds trying to figure out who will hit those long balls and how many of them they will hit? For fantasy owners, the home run is the ultimate result of a hitter’s plate appearance. It counts for a homer, obviously, but also a run scored, at least one run batted in, and a 1.000 batting average. Unfortunately, a hitter can’t also steal a base while rounding the bags on his trot home, but contributions in four of five categories in just one plate appearance seem good enough. Because of the value of a home run, accurately projecting them is one of the keys to a fantasy championship. Luckily, I’ve spent six years trying to do just that.

So let’s begin with a bit of an xHR/FB rate history, followed by my newest research in developing xHR/FB v4.0.

All the way back in 2015, I unmasked my original batter xHR/FB equation. At the time, Statcast didn’t exist, and I had to rely on Jeff Zimmerman’s scraped data that wasn’t available anywhere else. Although the equation essentially just used the variables I had access to that made the most logical sense at the time, it included metrics we would definitely agree drive HR/FB rate: average fly ball distance, average horizontal angle, and the standard deviation of fly ball distance, which differentiated the hitter who consistently hits 300-foot flies from the one who alternates 400- and 200-footers (the latter would earn higher xHR/FB rate marks, all else being equal). Since the components all made sense, and still do, the equation worked out pretty darn well. This is especially true considering it was only the first iteration and there was a dearth of publicly available metrics to include.

Two years later, I took full advantage of our new favorite toy, Statcast, and shared xHR/FB v2.0. That equation combined two FanGraphs metrics, fly ball Pull% and fly ball Oppo%, with Statcast’s barrels per batted ball event, or Brls/BBE. What barrels provided was a measure of both exit velocity and vertical launch angle, in combination. It’s not enough to learn a batter hits the ball hard or his average launch angle equates to a high fly ball rate. That’s because, for all we know, the batter’s hard hit balls could come on grounders, while all his weakly hit balls are hit in the air, or vice versa. Barrel rate tells us how often the optimal combination of exit velocity and vertical launch angle occurs, allowing me to use just that one rate and kill two birds with one stone. It was a beautiful thing.

With v2.0, I didn’t stop there, however. I became even more ambitious, taking the equation one step further by attempting to adjust it for the batter’s home park. So I added a handedness home run park factor component, which ever so slightly increased the R-squared (versus the non-park adjusted v2.0). We all know that home park plays a significant role in HR/FB rates, but it’s far trickier to account for that in an equation than you might think. That’s likely why the R-squared didn’t jump more dramatically than it did. In the end, the R-squared of v2.0 was higher than v1.0, so I was pleased with the small, but meaningful, gain.

Just a year after unveiling v2.0, and a little more than three years ago, I revealed yet another update, this time v3.0 (though I dubbed it Version 2.0, as an updated Statcast-driven xHR/FB). Our league batted ball metrics go back to 2002, and beginning in 2015 and lasting through 2017, each season represented a new HR/FB rate high. The historic home run surge broke my equation, and I needed to investigate why. After discussions with colleagues, I realized two things: I needed a better denominator for barrels and I needed to account for the drag/spin rate of batted balls. So, rather than use all batted ball events as the barrels denominator, I used “true fly balls”, which exclude pop-ups, and I also added back in a component from my original equation, average fly ball distance. BOOM, the R-squared rocketed and I was thrilled.

Now we’re back to today, a point at which I’ve gone an excruciating three whole years without developing the latest and greatest of xHR/FB rate equations. I was happy with v3.0, so what else is there left to do? A LOT!

As happens frequently, I had the itch to start my xHR/FB rate research from scratch. So began the journey toward a v4.0. My goal was to use Statcast data entirely, rather than mix data sources, as each source defines batted balls differently, and that would skew the numbers.

First, I wanted to calculate a variety of rates based on the Batted Ball Type, Quality of Contact, and Batted Ball Direction filters. Which selections resulted in the highest home run rates? So I created a matrix of each Quality of Contact and Batted Ball Direction versus the Batted Ball Types I had selected and calculated what percentage of total batted balls (of that batted ball type) each represented, along with the home run percentage.

What follows are the calculations for the Fly Ball and Line Drive Batted Ball Types:

Quality of Contact
                            Fly Ball    Line Drive
Barrel
  % of Batted Ball Type     18.4%       9.2%
  HR%                       67.2%       40.4%
Solid Contact
  % of Batted Ball Type     10.4%       12.9%
  HR%                       21.1%       4.1%
Data includes all Statcast years, from 2015-2020

Batted Ball Direction
                            Fly Ball    Line Drive
Pull
  % of Batted Ball Type     24.2%       36.2%
  HR%                       39.3%       8.3%
Straightaway
  % of Batted Ball Type     38.4%       36.0%
  HR%                       10.5%       2.4%
Opposite
  % of Batted Ball Type     37.4%       27.8%
  HR%                       5.9%        2.0%
Data includes all Statcast years, from 2015-2020

The tables display two numbers for each of the five variables (Barrel, Solid Contact, Pull, Straightaway, and Opposite): the percentage of each batted ball type (Fly Ball or Line Drive) that the variable represents, and the percentage of those batted balls that went for home runs. So 18.4% of all fly balls were classified as Barrels, and 67.2% of those fly balls classified as Barrels were home runs. I highlighted the HR% marks that were significant enough to explore using in my v4.0 equation.
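For readers who want to reproduce this kind of matrix, here is a minimal Python sketch of the two calculations in each cell. The records, field order, and function name are all made up for illustration; they are not Statcast data, and this is not necessarily how the numbers above were pulled.

```python
from collections import Counter

# Hypothetical toy records: (batted_ball_type, category, is_home_run).
# These rows are invented for illustration; they are NOT Statcast data.
batted_balls = [
    ("fly_ball", "barrel", True),
    ("fly_ball", "barrel", True),
    ("fly_ball", "barrel", False),
    ("fly_ball", "solid", True),
    ("fly_ball", "solid", False),
    ("fly_ball", "other", False),
    ("fly_ball", "other", False),
    ("fly_ball", "other", False),
    ("line_drive", "barrel", True),
    ("line_drive", "barrel", False),
    ("line_drive", "other", False),
    ("line_drive", "other", False),
]


def rate_matrix(rows):
    """Return, per (type, category) cell, its share of the type and its HR rate."""
    type_totals = Counter()  # all batted balls of each type
    cell_totals = Counter()  # batted balls in each (type, category) cell
    cell_hr = Counter()      # home runs in each (type, category) cell
    for bb_type, category, is_hr in rows:
        type_totals[bb_type] += 1
        cell_totals[(bb_type, category)] += 1
        if is_hr:
            cell_hr[(bb_type, category)] += 1
    return {
        cell: {
            "share_of_type": count / type_totals[cell[0]],
            "hr_rate": cell_hr[cell] / count,
        }
        for cell, count in cell_totals.items()
    }


matrix = rate_matrix(batted_balls)
for (bb_type, category), cell in sorted(matrix.items()):
    print(f"{bb_type:<11}{category:<8}"
          f"{cell['share_of_type']:>7.1%}{cell['hr_rate']:>7.1%}")
```

The same function works for Batted Ball Direction if the second field holds pull, straightaway, or opposite instead of a contact quality.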

The Quality of Contact table displays two of the six selections from that Statcast search filter. Home runs hit with the four remaining Quality of Contact options represented less than 1% of the total balls hit with those qualities, so they were ignored. Of course, the Barrel home run rates are massive. However, we find that all along, I have been ignoring the Solid Contact classification, and should not have been. I had no idea that just over 20% of Solid Contact fly balls have gone for a homer as well. This was eye-opening and I was excited to test these batted balls in my new equation.

What I also learned is that home runs are hit on line drives too! My assumption was that home runs were almost automatically classified as fly balls by most (all?) data sources, even if the combination of exit velocity and launch angle would technically make it a line drive. So I never bothered to look into a hitter’s line drive metrics. That means that all this time of using only fly balls as the denominator for barrel rate, I really should have been using fly balls and line drives.
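To see why the denominator matters, here is a small Python sketch comparing a fly-ball-only barrel rate with one that uses fly balls plus line drives. The counts and the function name are hypothetical, and I am assuming the fly-ball-only version counts only fly ball barrels in the numerator.

```python
def barrel_rate(barrels: int, batted_balls: int) -> float:
    """Barrels divided by the chosen denominator; 0.0 if the denominator is empty."""
    return barrels / batted_balls if batted_balls else 0.0


# Hypothetical season line for one hitter (made-up counts, not real data).
fly_balls, line_drives = 120, 90
fb_barrels, ld_barrels = 24, 9

# Fly balls only: line drive barrels never enter the calculation.
fb_only = barrel_rate(fb_barrels, fly_balls)

# Fly balls plus line drives: both numerator and denominator widen.
fb_plus_ld = barrel_rate(fb_barrels + ld_barrels, fly_balls + line_drives)

print(f"Barrels per fly ball:              {fb_only:.1%}")
print(f"Barrels per fly ball + line drive: {fb_plus_ld:.1%}")
```

Two hitters with identical fly ball numbers can look different once their line drive barrels are counted, which is exactly the information the narrower denominator throws away.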

The Batted Ball Direction table displays all three of the selections from that Statcast search filter. These represent the same categories we show on FanGraphs, but the rates differ, thanks to divergent methodologies. There are some interesting tidbits here. We knew that strictly for hitting home runs, pulling fly balls is highly preferable, but pulling line drives isn’t nearly as productive. While the 8.3% home run rate on pulled line drives is about three and a half times the Straightaway mark (2.4%) and about four times the Opposite mark (2.0%), it’s still fairly low and significantly below the 39.3% rate on pulled fly balls. Still, it was worth highlighting for further investigation.

Furthermore, we actually find another double-digit rate that’s not from a pulled batted ball. Perhaps surprisingly, 10.5% of fly balls hit straightaway have gone for a homer. I would not have guessed that, so adding this batted ball combination as a potential new equation variable was exciting. Finally, what we aren’t surprised about are the lowly Opposite field fly ball and line drive home run rates.
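As a sanity check on the Batted Ball Direction numbers, the three directions cover every batted ball of a given type, so weighting each direction's HR% by its share should give the overall home run rate for that type. The Python sketch below does that arithmetic with the rounded figures from the table. The variable and function names are mine, and this is only a consistency check, not the xHR/FB equation.

```python
# (share of batted ball type, HR%) per direction, copied from the
# Batted Ball Direction table above (rounded values, 2015-2020).
direction_table = {
    "fly_ball": {
        "pull": (0.242, 0.393),
        "straightaway": (0.384, 0.105),
        "opposite": (0.374, 0.059),
    },
    "line_drive": {
        "pull": (0.362, 0.083),
        "straightaway": (0.360, 0.024),
        "opposite": (0.278, 0.020),
    },
}


def implied_hr_rate(rows):
    """Share-weighted sum of HR% across directions for one batted ball type."""
    return sum(share * hr_rate for share, hr_rate in rows.values())


implied = {bb_type: implied_hr_rate(rows)
           for bb_type, rows in direction_table.items()}

for bb_type, rate in implied.items():
    print(f"{bb_type}: implied overall HR rate {rate:.1%}")
```

With the rounded inputs, this works out to roughly 15.7% of fly balls and 4.4% of line drives going for home runs, and pulled fly balls alone account for well over half of the fly ball total.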

So what have we learned from all this data? Let’s summarize what it takes to hit a home run:

  • Barrels are king, and although a fly ball barrel is preferable, a line drive barrel is mighty fine as well
  • Solid Contact batted balls shouldn’t be ignored! Fly balls classified as such go for home runs just over 20% of the time, but all other Solid Contact Batted Ball Types can be ignored
  • Pull all the fly balls! Pulled flies are golden (barrels are therefore platinum). Pulled line drives may be meaningful when predicting home runs.
  • Hitting fly balls straightaway is okay too!
  • Don’t go the opposite way with a fly ball or line drive. Just don’t. Well, if home runs are your goal, of course.

Tomorrow, I’ll explain how I used what I learned from this research to develop my xHR/FB Version 4.0 equation.

Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year. He produces player projections using his own forecasting system and is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. His projections helped him win the inaugural 2013 Tout Wars mixed draft league. Follow Mike on Twitter @MikePodhorzer and contact him via email.
