Diving Deeper Into Pitcher Batted Ball Distance

Yesterday, I continued batted ball distance week with our first look into the pitching side of the equation. Up until then, we had learned a lot about what a hitter’s distance meant, but essentially nothing about a pitcher’s. We now know that a pitcher’s batted ball distance allowed has a year-to-year correlation less than half that of a hitter’s. Still, the correlation was high enough to be meaningful.

At the end of yesterday’s post, I finished with this line:

Where we go from here, I am not entirely sure, but it could help with the next step of trying to further explain BABIP and HR/FB rate differences.

Since I certainly wasn’t going to wait for someone else to jump in and do the research, I immediately went ahead and crunched the numbers myself. I decided to calculate the correlation between a pitcher’s batted ball distance allowed and each of three variables: HR/FB, BABIP and ISO. The first metric was the obvious one, while BABIP was more of a curiosity. It doesn’t really follow that a greater distance allowed results in a higher BABIP, since it’s likely that more of those balls end up out of play, over the fence. I didn’t think of ISO allowed initially because slash line allowed data isn’t on our pitcher pages. So I hopped on over to Baseball Reference to get those numbers.
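For anyone who wants to run this kind of calculation themselves, here is a minimal sketch in Python. It assumes the correlation in question is the ordinary Pearson coefficient, which the post does not state explicitly. The `seasons` rows are made-up placeholder values, not real pitcher seasons, and the names `pearson_r` and `seasons` are purely illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Placeholder rows, NOT real data: (distance_allowed_ft, hr_fb, babip, iso)
seasons = [
    (265.0, 0.080, 0.290, 0.120),
    (272.0, 0.095, 0.301, 0.135),
    (280.0, 0.110, 0.285, 0.150),
    (288.0, 0.105, 0.310, 0.160),
    (295.0, 0.130, 0.295, 0.175),
]

distance = [row[0] for row in seasons]

# Correlate distance allowed with each of the three result metrics
correlations = {}
for name, col in (("HR/FB", 1), ("BABIP", 2), ("ISO", 3)):
    correlations[name] = pearson_r(distance, [row[col] for row in seasons])

for name, r in correlations.items():
    print(f"{name}: {r:.3f}")
```

With real data, each row would be one pitcher season, and the three printed values would be the kind of figures reported in this post.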

My population was 663 pitcher seasons from 2008 to 2014. Let’s go one by one.

HR/FB: 0.442

[Chart: Pitcher Dist-HR-FB Correlation]

There we go. A nice upward sloping line and a respectable correlation. It’s significantly lower than for hitters though, which is interesting. Obviously, park factors, batted ball angle and standard deviation all play a role, but those variables were missing from the hitter’s correlation as well.

Next up is…

BABIP: 0.088

[Chart: Pitcher Dist-BABIP Correlation]

Yeah. No. It’s one heck of a blob. There is a very slight positive correlation here, but not enough to really care about. As speculated above, it makes sense. Hard-hit balls that go a long way will most of the time either clear the fence, which excludes them from BABIP, or get caught near the wall. Perhaps some will go for doubles, but not enough of them to make a difference in the correlation. And besides, a high BABIP is typically the result of lots of line drives and/or avoidance of the pop-up. Line drives don’t go for great distances, so a pitcher could sport the lowest distance allowed and still have every batted ball be a liner that falls between fielders.

And last, we have…

ISO: 0.396

[Chart: Pitcher Dist-ISO Correlation]

There it is again. Similar to HR/FB, with a sweet upward sloping line. But why isn’t it higher? Sure, some doubles and triples are line drives down the lines that don’t necessarily get hit a long distance. But you would think that the farther a pitcher allows batted balls to travel, the higher the ISO he allows. The positive correlation does confirm that this is the case, but not to the degree one might expect.

Since this is the first time we’re delving into this data, it obviously leads to more questions. What we have learned is that this data has meaning: it’s correlated to some degree with other result-based metrics and even has some correlation from year to year.
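One way to put "to some degree" in perspective is to square each correlation, which for a simple one-variable linear fit gives the share of variance explained. The sketch below assumes the three figures reported above are Pearson correlation coefficients (r) and not already-squared values. The dictionary names are just for illustration.

```python
# Correlations reported in this post (assumed to be Pearson r, not r-squared)
reported_r = {"HR/FB": 0.442, "BABIP": 0.088, "ISO": 0.396}

# In a one-variable linear fit, r squared is the share of variance explained
r_squared = {name: r ** 2 for name, r in reported_r.items()}

for name, value in r_squared.items():
    print(f"{name}: r = {reported_r[name]:.3f}, r^2 = {value:.3f}")
```

Under that assumption, distance allowed would account for roughly 20% of the variance in HR/FB, about 16% in ISO, and under 1% in BABIP.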

Why aren’t the HR/FB and ISO correlations higher? What else could we do with this data to further our understanding of a pitcher’s success or failure?

Oh, and yes, I am also now armed with the average absolute angle and standard deviation of distance data.

Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year. He produces player projections using his own forecasting system and is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. His projections helped him win the inaugural 2013 Tout Wars mixed draft league. Follow Mike on Twitter @MikePodhorzer and contact him via email.


I’ve found (in a much smaller, single-season sample) that the R2 for HR+FB distance and ISO is about the same as what you found, but that the correlation was higher for total batted ball distance and ISO. I haven’t thought too much about why that is; maybe there is some subjectivity in the grey area between LD and FB that makes a difference.