My favorite part of this year’s World Baseball Classic, aside from the baseball, obviously, was the television broadcasts’ frequent reference to players’ swing speeds. I was floored, even if only because I didn’t know (but should’ve known) we had the technology capable of measuring it. Regarding Major League Baseball and Statcast’s adoption of such a metric, a little birdy told me I shouldn’t hold my breath. Disappointed, I moved on.
Then yesterday, while fooling around in Baseball Savant’s Statcast database trying to diagnose the misalignment of Miguel Cabrera’s outcomes with his peripherals, I noticed the database query’s “sort by” function offered an option to sort by “estimated swing speed.” A quick Google search indicates to me the Statcast and MLB Advanced Media team(s) has (have) yet to formally announce this; sprint speed has been the more exciting recent development, apparently.
Not to me! I quickly got to work querying the data. I also quickly learned that the raw data files underpinning the swing speed summaries previously linked do not include swing speed, which is unhelpful. In other words, swing speed is not communicated to us from Baseball Savant’s organs on a play-by-play basis. I imagine this is by design. So, I was resigned to running a single query that summarized swing speed data at a high level: the average swing speed for every hitter with at least 100 at-bats in a given season, from 2015 through 2017.
Here’s what I found.
The average swing speed from the start of the Statcast Era™ until yesterday is 59.6 mph with a minimum of 51.3 mph (2017, Mallex Smith) and a maximum of 66.5 mph (2015, take a guess). Despite some minor fluctuations, the average and standard deviation of swing speed by year suggest swing speeds haven’t changed recently.
Here’s a leaderboard detailing an assortment of the fastest swing speeds the last three years:
You’ll rarely see a more obvious correlation before ever statistically verifying it. But verify, I will.
Correlations with Other Metrics
So, yeah, swing speed evidently correlates with power. In a sample of 1,106 hitters who met the previously specified at-bat threshold, swing speed and isolated power (ISO) exhibited a very strong positive correlation (R = 0.63, on a scale of -1.0 to 1.0). For posterity, its regression coefficients turned up as such:
xISO = 0.01638*mph - 0.81394
I also modeled the correlation as a quadratic (i.e., incorporating an mph² term) in the event there eventually were negative marginal returns on swing speed, such that a swing that’s too fast might result in declining ISO. Alas, the model turned up no such evidence within the realm of realistic outcomes (a maximum average swing speed of, say, 70 mph).
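For anyone who wants to try the quadratic check themselves, here is a minimal sketch of the idea in Python: fit y = a*x² + b*x + c by least squares and see where the curve turns over. The function names and the sample numbers are made up for illustration; they are not the actual Statcast data or the model I ran.

```python
def solve3(A, b):
    """Solve a 3x3 linear system with Gaussian elimination and partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= f * M[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x


def fit_quadratic_turning_point(xs, ys):
    """Least-squares quadratic fit; returns (a, turning_point_x).

    x is centered on its mean before fitting to keep the sums well-behaved.
    """
    mean_x = sum(xs) / len(xs)
    u = [x - mean_x for x in xs]
    s = [sum(ui ** k for ui in u) for k in range(5)]              # sums of u^0..u^4
    t = [sum((ui ** k) * y for ui, y in zip(u, ys)) for k in range(3)]
    # Normal equations for y = a*u^2 + b*u + c
    a, b, c = solve3(
        [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]],
        [t[2], t[1], t[0]],
    )
    return a, mean_x - b / (2 * a)


# Made-up data: an ISO curve that peaks at 72 mph, sampled from 52 to 66 mph.
speeds = [float(mph) for mph in range(52, 67)]
isos = [-0.002 * (mph - 72.0) ** 2 + 0.3 for mph in speeds]

a, peak = fit_quadratic_turning_point(speeds, isos)
REALISTIC_MAX_MPH = 70.0  # the ceiling suggested in the text
print(f"a = {a:.4f}, turning point = {peak:.1f} mph")
print("negative returns within realistic range:", a < 0 and peak < REALISTIC_MAX_MPH)
```

With a negative a, the turning point is where extra swing speed would start to cost ISO. If it lands above the fastest realistic average swing, as in this toy example, the downturn never shows up in practice.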
In terms of swing speed’s predictive capacity, there existed a weak correlation (R = 0.35) between year-to-year changes in swing speed and ISO, e.g., how much a player’s swing speed changed from 2016 to 2017 compared to how much his ISO changed during that time (n = 590).
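To make the year-to-year comparison concrete, here is a rough sketch of how the paired changes could be built and correlated. The data layout (a dict keyed by player and season) and the sample rows are hypothetical placeholders, not the real query output.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_over_year_deltas(rows):
    """rows maps (player, season) -> (swing_mph, iso).

    Returns paired lists of changes for every player with consecutive seasons.
    """
    d_mph, d_iso = [], []
    for (player, season), (mph, iso) in rows.items():
        nxt = rows.get((player, season + 1))
        if nxt is not None:
            d_mph.append(nxt[0] - mph)
            d_iso.append(nxt[1] - iso)
    return d_mph, d_iso


# Hypothetical player-seasons, for illustration only.
rows = {
    ("A", 2015): (60.0, 0.150), ("A", 2016): (61.0, 0.170), ("A", 2017): (59.5, 0.140),
    ("B", 2015): (57.0, 0.100), ("B", 2016): (57.5, 0.120),
    ("C", 2016): (63.0, 0.220), ("C", 2017): (62.0, 0.210),
    ("D", 2017): (58.0, 0.130),  # only one season, so no pair
}

d_mph, d_iso = year_over_year_deltas(rows)
print(f"n = {len(d_mph)} pairs, R = {pearson_r(d_mph, d_iso):.2f}")
```

Each player contributes one pair per set of back-to-back seasons, which is why the sample of changes (n = 590) is smaller than the sample of player-seasons.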
This may come as no surprise: swing speed is negatively correlated with contact rate (Contact%), albeit weakly (R = -0.35). When modeled as a quadratic (R = 0.36), the model indicated a swing speed around 53 mph might be ideal for making the most contact. It also suggests more contact trades off for less power. (The inverse of that statement perfectly captures our general idea of power hitters: home runs at the expense of strikeouts.)
When modeled as year-to-year differences, there existed no meaningful correlation (R = 0.17).
An aside: overall contact rate doesn’t really account for a player’s plate discipline; in hindsight, I think establishing the correlation between swing speed and zone contact (Z-Contact%) might be a more helpful measure, because those swings count much more than the bad ones.
Weighted On-Base Average
Ultimately, my investigation of Cabrera’s wOBA led me to stumble upon this data in the first place. Swing speed exhibited a moderately strong correlation (R = 0.54) with wOBA…
xwOBA = 0.00937*mph - 0.23793
… and the quadratic form of the model indicated no negative marginal returns within the reasonable range of swing speed outcomes. Moreover, the year-over-year model suggests at least a weak correlation (R = 0.33) between changing swing speeds and wOBAs.
It’s cool stuff. It doesn’t do much yet other than reinforce a lot of what we already know (or, at least, what we thought we knew) as well as what we’re continuing to learn in the era of Aaron Judges, Joey Gallos and Miguel Sanos.
I think there is substantial opportunity to use swing speed as a diagnostic tool. For example, Cabrera and Jose Bautista, both having very bad years relative to what we expect of them, have seen their swing speeds decline 2.4 mph and 2.6 mph, respectively, this season. Cabrera’s 2017 swing speed ranks in the top 10 percent the last three years, so it’s still elite, for all intents and purposes. But Bautista’s is barely above average. He’s old; they’re both getting up there.
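As a back-of-the-envelope illustration of the diagnostic idea, the slopes from the two linear fits above can be applied to those declines. This is only a sketch: the fits describe differences between hitters, and the year-to-year correlations were weaker, so treat the output as a rough expectation rather than a prediction. The variable and function names are mine.

```python
# Slopes from the two linear fits reported earlier (change per 1 mph of swing speed).
ISO_PER_MPH = 0.01638
WOBA_PER_MPH = 0.00937

# Swing speed declines cited for 2017, in mph.
declines_mph = {"Cabrera": 2.4, "Bautista": 2.6}


def implied_drop(decline_mph, slope_per_mph):
    """Expected drop in a metric for a given loss of swing speed, per the linear fit."""
    return decline_mph * slope_per_mph


for name, decline in declines_mph.items():
    iso_drop = implied_drop(decline, ISO_PER_MPH)
    woba_drop = implied_drop(decline, WOBA_PER_MPH)
    print(f"{name}: -{decline} mph -> about -{iso_drop:.3f} ISO, -{woba_drop:.3f} wOBA")
```

For both hitters that works out to roughly 40 points of ISO and a little over 20 points of wOBA.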
So the big question is, are they playing through injury or descending somewhat ungracefully into old age, as human bodies inevitably do? It’s hard to say, but estimated swing speed is the glaring indicator that betrays all of Cabrera’s peripherals (although, again, his swing speed is still elite, suggesting he can produce admirably, even if it’s not MVP-caliber production). While helpful, it does little to solve the problem of the mismatch between Cabrera’s peripherals and outcomes this year. Alas…
We know nothing about this data on a granular level. Given there’s evidence the technology that measures exit velocity runs hot or cold depending on the ballpark, such is likely the case for swing speed. We don’t know if this data needs recalibration, but it likely does.
We don’t know what the fastest swing is. We don’t know if these data include check swings, half-assed swings, etc., and how Statcast controls for them, if at all. We don’t know how the data look on a weekly or monthly basis and how a player’s swing speed might typically change throughout the course of a season.
(Correction: You can find game-level data by choosing “Player & Game Date” as your “group by” function — nothing more granular than that — but you can’t search for more than one or a few selected players at a time. This single-player query for Curtis Granderson shows single-game swing speeds ranging from 26.9 mph to 73.5 mph. In my experience, the query will otherwise time out if you try to pull data for every hitter, and setting a minimum number of ABs affects the “group by” specification. This particular query found 1,629 player-games in 2017 of at least five ABs, so there is hope yet, but that leaves you with a spotty data set.)
(Back to your regularly scheduled caveats.)
We have no idea (yet!) how quickly it becomes reliable (“stabilizes”), although I imagine because it’s subject to nothing but the player’s own talents and efforts (as opposed to a ball in play, subject to countless competing factors), it becomes reliable rather quickly.
Good stuff. I’m excited to find a way to incorporate it into projections and player analysis for 2018.