Devising a Deserved Barrel%
A couple of weekends ago at BaseballHQ’s First Pitch Arizona conference, The Athletic’s Eno Sarris and I talked about hitter metrics most descriptive and/or predictive of power. In Eno’s presentation, he included a quip from analyst Hareeb al-Saq:
“Knowing barrels on top of average EV [exit velocity] tells you a lot. Knowing average EV on top of barrels tells you a little.”
Eno was surprised by this finding: that barrel rate is a more beneficial metric than average EV, or even EV on a certain type of batted ball event (BBE), such as fly balls and line drives. Incidentally, Al Melchior and I researched this last year and reached the same conclusion: barrels, whether as a percentage of batted ball events or plate appearances, correlated more strongly than average, maximum, or fly ball/line drive EVs did with common power metrics such as home runs per fly ball (HR/FB), isolated power (ISO), or hard-hit rate (Hard%).
However, it made more sense to Eno when I articulated that calculating barrel rate is simply the act of isolating a hitter’s most-optimal batted ball events. In other words, the inclusion of launch angle (LA) adds another explanatory dimension to EV. In my head, it’s like having two separate circles — one for EV, the other for LA, each containing every individual batted ball outcome from the season — and overlapping them. The overlapped portion of the Venn diagram signifies barrels, and it changes in size depending on the quality of the batted ball events.
This visualization gave me an idea. I’m not here to argue the merits or demerits of the formulation of “barrel” as a measure of performance or its descriptive or predictive values. Objectively, one must acknowledge, no matter its arbitrariness, that a barrel is a high-quality shorthand indicator of high-quality contact. My issue with barrels is there’s little way to know if a barrel is earned from an odds standpoint. The question becomes one of probability. Think about the Venn diagram: if you have a collection of EVs and LAs, how many of each can you expect to be paired up optimally to produce a barrel? A barrel is a barrel — it happened, indisputably. But did the hitter deserve the frequency of barrels he produced over the course of the season?
How much of a barrel rate is skill or luck?
This question can be answered pretty resoundingly with just average EV and average LA. Hence: deserved barrel rate, or Deserved Barrel%. (“Expected barrel rate” sounds cooler and falls in line with existing “x” research, but “deserved” more effectively communicates the point: that this is backward-looking. “Expected” sometimes suggests to people it’s “what we should expect” rather than “what we should have expected.”)
Using all hitter-seasons of 300+ batted ball events from 2017 through 2019 (n=542), I specified a multiple regression equation with barrel rate (in this instance, barrels as a percentage of all batted ball events, or Barrel/BBE) as the dependent variable and average EV and average LA as the independent variables:
Barrel/BBE = EV + LA + ε
The first batch of results produced an adjusted r² of 0.62, which, on a scale from 0 to 1, is very strong. (To be clear, a strong fit could have been reasonably expected given EV and LA are the direct inputs to a barrel.) However, the results seemed a little, I don’t know, off; all of the game’s best hitters routinely over-performed their deserved barrel rates, but so did Dee Gordon, who, uh, deserved a negative 5% barrel rate. It occurred to me the relationship between Barrel/BBE and its component parts is not necessarily linear. Linearity might make sense for EV, but it doesn’t for LA: beyond a certain point, too steep an angle leads to pop-ups. Accordingly, I included squared terms to establish a nonlinear relationship. I also included year fixed effects (y_XXXX) to account for the changing ball:
Barrel/BBE = EV + LA + EV² + LA² + y_2017 + y_2018 + ε
Not only did the goodness of fit improve (adjusted r² = 0.67) but also Aaron Judge’s 2017 season suddenly over-performed by a much smaller margin and Gordon disappeared from the over-performers’ list.
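For anyone who wants to try this kind of specification at home, here is a minimal sketch of fitting the linear and the quadratic versions by ordinary least squares and comparing adjusted r². To be clear about the assumptions: the data are synthetic (I am not reproducing the real hitter-seasons here), the simulation coefficients are invented purely so the example has curvature to find, and `fit_ols` is a hypothetical helper name, not anything from the original analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 542  # matches the post's sample size, but every value below is synthetic

# Made-up hitter-season averages (assumption: roughly normal EV and LA)
ev = rng.normal(88.5, 2.5, n)
la = rng.normal(12.0, 4.5, n)
year = rng.choice([2017, 2018, 2019], n)
y2017 = (year == 2017).astype(float)  # year dummies; 2019 is the baseline
y2018 = (year == 2018).astype(float)

# Invented "true" relationship with curvature, plus noise (illustration only)
barrel = (0.07 + 0.012 * (ev - 88.5) + 0.0012 * (ev - 88.5) ** 2
          + 0.002 * (la - 12.0) - 0.0003 * (la - 12.0) ** 2
          + 0.007 * y2017 - 0.002 * y2018
          + rng.normal(0.0, 0.015, n))


def fit_ols(columns, y):
    """Fit y on an intercept plus the given columns; return (beta, adjusted r^2)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    k = X.shape[1] - 1  # number of predictors, excluding the intercept
    adj_r2 = 1.0 - (1.0 - r2) * (len(y) - 1) / (len(y) - k - 1)
    return beta, adj_r2


beta_lin, adj_lin = fit_ols([ev, la], barrel)
beta_quad, adj_quad = fit_ols([ev, la, ev ** 2, la ** 2, y2017, y2018], barrel)

print(f"linear adjusted r^2:    {adj_lin:.3f}")
print(f"quadratic adjusted r^2: {adj_quad:.3f}")
```

Because adjusted r² penalizes each added predictor, the quadratic model only scores higher when the squared terms and year dummies explain real variance, which is the comparison that matters here.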
Judge makes for a fantastic case study. Judge achieved an absurd 25.7% barrel rate while hitting 52 home runs (one per 6.5 batted ball events) in 2017. Without a sufficient track record, but full knowledge of his gargantuan power, how could we know if his performance was legitimate (or, alternatively, how much of his performance was legitimate)? Check out this table:
Year | EV | LA | Actual Barrel/BBE | Deserved Barrel/BBE | Diff |
---|---|---|---|---|---|
2017 | 94.9 | 15.8 | 25.7% | 20.2% | 5.5% |
2018 | 95.9 | 11.4 | 16.2% | 20.6% | -4.4% |
2019 | 94.7 | 12.4 | 20.2% | 18.1% | 2.1% |
Judge’s deserved barrel rate has remained fairly stable, but the pendulum of actual outcomes swings wildly to and fro. Judge found his happy medium in 2019, which is an easy but truthful cop-out to the question of whether his 2017 or 2018 seasons were more legitimate (“it’s probably somewhere in the middle”).
We find ourselves in similar circumstances with J.D. Martinez, who hit 45 home runs in just 489 plate appearances in 2017:
Year | EV | LA | Actual Barrel/BBE | Deserved Barrel/BBE | Diff |
---|---|---|---|---|---|
2017 | 91 | 15.2 | 19.5% | 12.1% | 7.4% |
2018 | 93 | 10.7 | 16.0% | 13.8% | 2.2% |
2019 | 91.3 | 12.5 | 12.0% | 11.2% | 0.8% |
It’s worth noting that average EV and LA are not perfectly sticky year over year. Aging curves and good ol’ random variance affect raw outcomes annually. Even in the absence of knowing exactly how a hitter’s average EV and/or LA might trend from year to year — meaning the best we can do is hold constant their previous season’s EV and LA values — deserved barrel rate still appears more stable than observed/measured/actual/whatever barrel rate.
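One quick way to put a number on "more stable" is to compare the spread (max minus min) of actual and deserved rates across the three seasons. This is only a rough sketch using the values from the two tables above; `spread` is a hypothetical helper, and a three-season range is a crude measure, not a formal stability test.

```python
# Barrel/BBE percentages for 2017, 2018, 2019, copied from the tables above
seasons = {
    "Aaron Judge": {
        "actual": [25.7, 16.2, 20.2],
        "deserved": [20.2, 20.6, 18.1],
    },
    "J.D. Martinez": {
        "actual": [19.5, 16.0, 12.0],
        "deserved": [12.1, 13.8, 11.2],
    },
}


def spread(values):
    """Range in percentage points, rounded to one decimal place."""
    return round(max(values) - min(values), 1)


spreads = {
    name: {kind: spread(vals) for kind, vals in rates.items()}
    for name, rates in seasons.items()
}

for name, result in spreads.items():
    print(f"{name}: actual {result['actual']} pts, deserved {result['deserved']} pts")
```

Judge's actual rate spans 9.5 points against 2.5 for his deserved rate; for Martinez it is 7.5 against 2.6.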
Here’s the final equation for you:
Deserved Barrel/BBE = (–0.132 * EV) + (0.00186 * LA) + (0.00082 * EV²) + (0.000027 * LA²) + (0.00749 for 2017) + (–0.00185 for 2018) + 5.309
(I would argue, for simplicity’s sake, that the year fixed effects, the 2017 and 2018 terms above, are so small that you could probably ignore them. Just FYI.)
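If you would rather not do that arithmetic by hand, here is a small sketch of the equation as a Python function. The function and constant names are mine, not part of any library. One caveat: the published coefficients are rounded, so outputs can land up to about a percentage point away from the table values. Judge's 2017 inputs, for example, come out near 21% rather than the 20.2% shown earlier.

```python
# Year fixed effects from the equation; 2019 is the omitted baseline year
YEAR_EFFECTS = {2017: 0.00749, 2018: -0.00185}


def deserved_barrel_rate(ev, la, year=None):
    """Deserved Barrel/BBE as a fraction (0.20 means 20%).

    ev: average exit velocity in mph; la: average launch angle in degrees.
    Uses the rounded coefficients as published in the post.
    """
    rate = (-0.132 * ev
            + 0.00186 * la
            + 0.00082 * ev ** 2
            + 0.000027 * la ** 2
            + 5.309)
    # Any year without a published effect (including None) gets no adjustment
    return rate + YEAR_EFFECTS.get(year, 0.0)


judge_2017 = deserved_barrel_rate(94.9, 15.8, 2017)
judge_2017_no_year = deserved_barrel_rate(94.9, 15.8)

print(f"Judge 2017, with year effect:    {judge_2017:.1%}")
print(f"Judge 2017, without year effect: {judge_2017_no_year:.1%}")
```

Dropping the year effect moves the 2017 estimate by about three-quarters of a percentage point, which gives a sense of how much precision you give up by ignoring it.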
And here is a list of 2019’s biggest Barrel/BBE over- and under-performers curated at my discretion:
- George Springer, +6.1% (biggest over-performer, min. 300 BBE)
- Fernando Tatis Jr., +5.7% (get your pitchfork)
- Pete Alonso, +5.2%
- José Altuve, +4.7%
- Ronald Acuña Jr., +4.6%
- Eugenio Suárez, +4.5%
…
- Rafael Devers, –3.1%
- Manny Machado, –3.1%
- José Ramírez, –3.4%
- Alex Bregman, –4.5%
- Yuli Gurriel, –4.8% (2nd-biggest under-performer, min. 300 BBE)
It’s important to remember that I haven’t made any declarations about deserved barrel rate being a leading indicator (aka predictive) of future performance. Accordingly, I would not necessarily assume the likes of Devers, Bregman, and Gurriel will be even better in 2020 than they were in 2019. If anything, I think it means their actual barrel rate understates the quality of contact they should’ve produced. In other words, deserved barrel rate would be a better description of what already happened, not a precursor for even better performance moving forward.
Additionally, I’m not sure how Deserved Barrel% would compare to something like Statcast’s expected wOBA (xwOBA). I imagine there may be some descriptive and predictive overlap; however, xwOBA is based on the expected results of actual EV-LA combinations, whereas I’ve tried to use average EV and LA to reverse-engineer those EV-LA combinations from a very high level.
Penultimately: would it surprise you to learn Mike Trout has consistently over-performed his deserved barrel rate by wide margins (+3.8% in 2017, +4.0% in 2018, +5.8% in 2019)? This is one drawback of using simply average EV (and LA, for that matter): it loses sight of the distribution of those EVs, which vary from player to player. It’s very reasonable to assume that Trout’s distribution of EVs is more favorably predisposed to producing barrels than the distribution of another hitter with roughly the same average EV (~90 mph). While the regression equation helps explain a substantial portion of barrel rate, unexplained factors related to skill and luck certainly remain. Yes, that also means Tatis Jr.’s over-performance could be legitimate. I decline to weigh in on the matter.
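To see why the shape of the distribution matters, consider a toy sketch of two hypothetical hitters with the same average EV but different spreads. Everything here is an assumption for illustration: I treat EVs as normally distributed (real distributions are not), the standard deviations are invented, and I use "share of batted balls at 98+ mph" as a stand-in for barrel eligibility, ignoring launch angle entirely.

```python
from statistics import NormalDist

BARREL_MIN_EV = 98.0  # simplification: only the EV floor, no launch-angle window


def share_above(mean_ev, sd_ev, threshold=BARREL_MIN_EV):
    """Share of batted balls at or above the threshold under a normal model."""
    return 1.0 - NormalDist(mean_ev, sd_ev).cdf(threshold)


# Two hypothetical hitters, both averaging 90 mph (made-up spreads)
narrow = share_above(90.0, 10.0)
wide = share_above(90.0, 14.0)

print(f"narrow distribution: {narrow:.1%} of batted balls at 98+ mph")
print(f"wide distribution:   {wide:.1%} of batted balls at 98+ mph")
```

The hitter with the wider distribution clears the threshold more often despite the identical average, which is exactly the information a regression on average EV cannot see.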
And, lastly: I want to mention that I don’t consider myself the most skilled econometrician. Someone like Baseball Prospectus’ Jonathan Judge could (and probably already has, but proprietarily) conduct a more rigorous statistical analysis to determine a deserved barrel rate using something less aggregated than average EV and average LA. Still, I think it’s a good start, and I imagine I will use it a lot this offseason in trying to find a signal amid the noise.
Every time I see these regression formulas I cringe a little. You have two variables in your equation twice, hence multicollinearity, which will impact your R-squared value.
Multicollinearity is a necessary evil imo. Less-concerned about the r-squared although it’s fair to note I should’ve shed light on it in the post.
You also specified the first was an adjusted r^2, which adjusts and penalizes for the added predictors in the model. I’m sure the second r^2, .67, was adjusted too.
Correct!