One Hitter, Two Hitter, Red Hitter, Blue Hitter

Brad Penner-USA TODAY Sports

How would you define Jeff McNeil as a hitter in just a few words? If you had to place him in his own “group” of hitters, who else would you place him with? Last week, I used a cluster analysis to find a player comparable to Luis Arraez and, in turn, to suggest approach changes that might increase his power. This week, I’ll use that same cluster analysis, with just a few tweaks, to determine which combinations of Statcast and plate discipline metrics increase roto value on average. Let’s start with a refresher on my process.

First, I downloaded all data from the Statcast and plate discipline leaderboards for qualified hitters in 2022. Second, I conducted two Principal Component Analyses (PCA), one on the Statcast metrics and one on the plate discipline metrics. This gave me one nice number per player that summarizes all of their Statcast metrics and another that summarizes all of their plate discipline metrics. With those numbers I created this visual:

Batter Profile Clusters
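The summarizing step described above can be sketched in a few lines. This is a minimal illustration and not the code behind the chart: it standardizes each metric and pulls out the first principal component by power iteration, using only the Python standard library (a library such as scikit-learn would normally handle this). Treating the "one nice number" as the first component is an assumption, the function names `standardize` and `first_component` are made up for this example, and because player-level rows are not reproduced in this article, the input is four Statcast columns taken from the cluster-average table further down.

```python
from statistics import mean, pstdev

# EV, maxEV, Barrel%, HardHit% for Clusters 1-6, copied from the article's
# Statcast averages table. Real input would be one row per qualified hitter.
ROWS = [
    [90.8, 113.9, 10.9, 46.1],
    [86.4, 108.0, 3.3, 29.1],
    [89.3, 111.5, 9.9, 40.8],
    [89.9, 112.2, 8.8, 42.1],
    [88.3, 110.4, 5.5, 37.3],
    [90.3, 112.8, 11.0, 44.7],
]


def standardize(rows):
    """Z-score each column so metrics on different scales count equally."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def first_component(rows, iters=200):
    """Return (loadings, scores) for the first principal component."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    # Correlation matrix of the standardized columns.
    corr = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]
    # Power iteration: repeatedly multiply and renormalize to find the
    # dominant eigenvector, which holds the PC1 loadings.
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Each row's score is its projection onto the loadings.
    scores = [sum(a * b for a, b in zip(r, v)) for r in z]
    return v, scores


loadings, scores = first_component(ROWS)
for cluster, score in enumerate(scores, start=1):
    print(f"Cluster {cluster}: PC1 score {score:+.2f}")
```

Running the same function on the plate discipline columns would give the second number per player. The clustering algorithm applied to those two scores is not specified here, so it is left out of the sketch.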

Now, you don’t know what’s good and what’s bad, do you? That’s the trouble with PCA: it’s a great way to summarize, but you lose some of the interpretability of the metrics. So, let me break down each cluster by showing you the averages across all players in each cluster:

Averages by Cluster – Statcast
Cluster EV maxEV LA Barrel% HardHit% AVG xBA SLG xSLG wOBA xwOBA
1 90.8 113.9 13.0 10.9 46.1 0.279 0.273 0.485 0.477 0.357 0.354
2 86.4 108.0 13.2 3.3 29.1 0.261 0.254 0.382 0.357 0.314 0.304
3 89.3 111.5 15.5 9.9 40.8 0.232 0.236 0.406 0.410 0.315 0.321
4 89.9 112.2 13.3 8.8 42.1 0.268 0.258 0.456 0.432 0.347 0.337
5 88.3 110.4 10.3 5.5 37.3 0.261 0.256 0.394 0.383 0.312 0.309
6 90.3 112.8 13.3 11.0 44.7 0.256 0.253 0.449 0.447 0.338 0.339
*Qualified hitters, 2022. Blue cells mark the minimum in each column; yellow cells mark the maximum.

Averages by Cluster – Plate Discipline
Cluster O-Swing% Z-Swing% Swing% O-Contact% Z-Contact% Contact% Zone% F-Strike% SwStr% CStr% CSW%
1 33.9 73.5 49.7 64.6 87.0 77.9 39.8 60.8 11.0 14.1 25.1
2 30.9 66.1 45.8 76.7 91.6 85.8 42.3 60.6 6.6 18.1 24.7
3 30.9 68.9 46.5 61.7 83.5 74.9 41.0 60.7 11.8 16.5 28.2
4 31.0 69.9 46.9 67.0 87.2 79.4 40.9 59.8 9.7 16.0 25.7
5 35.2 70.0 49.4 69.4 88.9 80.6 41.0 62.3 9.8 16.0 25.8
6 31.6 69.0 47.0 62.9 85.2 76.3 41.0 60.7 11.2 16.4 27.6
*Qualified hitters, 2022. Blue cells mark the minimum in each column; yellow cells mark the maximum.

Now, let’s try to classify each of these clusters:

Cluster 1 – Let the big dogs eat. This group leads in nearly all Statcast metrics and is just barely beaten out for the Barrel% prize by Cluster 6. They swing often and see the fewest pitches in the zone. Ex: Aaron Judge.

Cluster 2 – Contact. These hitters are not being fooled, swinging less often but making contact when they do. They have good averages but could benefit from increased power. Ex: Jeff McNeil.

Cluster 3 – Whiffs. These players make less contact, swing and miss a lot, and have low averages and expected averages. Ex: Cody Bellinger.

Cluster 4 – Good, not great. Don’t let the colorless cells fool you; these hitters are good. They make good, hard contact, get on base often, and rarely get fooled. Ex: Juan Soto.

Cluster 5 – Aggressive. These hitters are swinging outside of the zone often, at the first pitch often, and getting on base less often. Ex: Nick Castellanos.

Cluster 6 – Expected. These hitters are accomplishing what x-stats say they should. They hit the ball hard, but could perhaps benefit from a little more patience. Ex: Ryan Mountcastle.
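One way to sanity-check the "Expected" label is to subtract each expected stat from its actual counterpart using the cluster averages in the Statcast table above. The sketch below does only that arithmetic; the helper name `xstat_gaps` is made up for this example, and the figures are cluster averages, not player-level results.

```python
# (AVG, xBA, SLG, xSLG, wOBA, xwOBA) per cluster, copied from the
# "Averages by Cluster – Statcast" table.
CLUSTER_STATS = {
    1: (0.279, 0.273, 0.485, 0.477, 0.357, 0.354),
    2: (0.261, 0.254, 0.382, 0.357, 0.314, 0.304),
    3: (0.232, 0.236, 0.406, 0.410, 0.315, 0.321),
    4: (0.268, 0.258, 0.456, 0.432, 0.347, 0.337),
    5: (0.261, 0.256, 0.394, 0.383, 0.312, 0.309),
    6: (0.256, 0.253, 0.449, 0.447, 0.338, 0.339),
}


def xstat_gaps(stats):
    """Return actual-minus-expected gaps (AVG, SLG, wOBA) for each cluster."""
    gaps = {}
    for cluster, (avg, xba, slg, xslg, woba, xwoba) in stats.items():
        # Round to three places to match the precision of the table.
        gaps[cluster] = (
            round(avg - xba, 3),
            round(slg - xslg, 3),
            round(woba - xwoba, 3),
        )
    return gaps


GAPS = xstat_gaps(CLUSTER_STATS)
for cluster, (d_avg, d_slg, d_woba) in GAPS.items():
    print(f"Cluster {cluster}: AVG {d_avg:+.3f}  SLG {d_slg:+.3f}  wOBA {d_woba:+.3f}")
```

On these averages, Cluster 6 has the smallest gap in all three pairs (0.003, 0.002, and -0.001), which fits its label, while Clusters 2 and 4 out-slugged their xSLG by roughly 25 points.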

Now, to put the cherry on top: which cluster provided the most hitting value (mSB excluded from the calculation) in roto dollars? We can use the YTD 2022 auction calculator to find that, as you would guess, Cluster 1 wins because it had Aaron Judge. But what about the others? Here’s the average value produced by each cluster in 2022:

Cluster 1 $23.61
Cluster 2 $4.69
Cluster 3 $4.71
Cluster 4 $16.39
Cluster 5 $4.46
Cluster 6 $13.60
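To line those dollar figures up against a single quality-of-contact number, here is a small sketch that ranks the clusters by average value and prints each one's xwOBA next to it. Both sets of numbers are copied from this article; pairing them this way is only an illustration, and `rank_by_value` is a made-up helper name.

```python
# Average 2022 roto value ($) and average xwOBA per cluster, from the article.
AVG_VALUE = {1: 23.61, 2: 4.69, 3: 4.71, 4: 16.39, 5: 4.46, 6: 13.60}
XWOBA = {1: 0.354, 2: 0.304, 3: 0.321, 4: 0.337, 5: 0.309, 6: 0.339}


def rank_by_value(values):
    """Return cluster ids ordered from most to least valuable."""
    return sorted(values, key=values.get, reverse=True)


ORDER = rank_by_value(AVG_VALUE)
for cluster in ORDER:
    print(f"Cluster {cluster}: ${AVG_VALUE[cluster]:.2f}  xwOBA {XWOBA[cluster]:.3f}")
```

The ordering comes out 1, 4, 6, 3, 2, 5. The three clusters averaging double-digit dollars are also the three with an xwOBA of .337 or better, and the bottom three are separated by only a quarter.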

If you’re wondering what all of this means for your 2023 draft, you’re not alone. One very interesting thing, however, is that Cluster 4 has no yellow, max-level Statcast highlighting in its row, yet these are the second most valuable players. Cluster 6 is doing what we expect it to do: its hitters are valuable and they own the Barrel% category. While everyone is looking at maxEVs and Baseball Savant profiles, these clusters may be helpful in finding value at a discount. Now, here’s a very long table of each player and their respective cluster. Enjoy.

Classified Hitters
Cluster1 Cluster2 Cluster3 Cluster4 Cluster5 Cluster6
Aaron Judge Jose Altuve Anthony Rizzo Paul Goldschmidt Whit Merrifield Bryan Reynolds
Francisco Lindor Jeff McNeil Josh Rojas Alex Bregman DJ LeMahieu Yordan Alvarez
Trea Turner Wilmer Flores Mike Yastrzemski Cedric Mullins II Randal Grichuk Kyle Schwarber
Corey Seager Elvis Andrus Seth Brown Nolan Arenado Ketel Marte Gleyber Torres
Amed Rosario Ha-seong Kim Jesus Aguilar Christian Yelich Andres Gimenez Matt Chapman
Freddie Freeman Yuli Gurriel Jesse Winker Xander Bogaerts Luis Rengifo J.T. Realmuto
Jose Ramirez Kyle Farmer Mark Canha Manny Machado Jeremy Pena Tommy Pham
Bo Bichette J.P. Crawford MJ Melendez Ty France Jonathan Schoop Yandy Diaz
Pete Alonso Adam Frazier Cody Bellinger Jake Cronenworth Nico Hoerner J.D. Martinez
Jose Abreu Luis Arraez Luke Voit Jurickson Profar Andrew Benintendi Ryan McMahon
Marcus Semien Steven Kwan Carlos Santana Adolis Garcia Thairo Estrada Brandon Drury
Matt Olson Cesar Hernandez Nelson Cruz Shohei Ohtani Isiah Kiner-Falefa Andrew McCutchen
Vladimir Guerrero Jr. Myles Straw Jorge Mateo Rafael Devers Miguel Rojas Willy Adames
Austin Riley Charlie Blackmon Starling Marte Tommy Edman A.J. Pollock Rowdy Tellez
Dansby Swanson Tony Kemp Lane Thomas Alex Verdugo Nick Castellanos Trey Mancini
Max Muncy Josh Bell Justin Turner George Springer
Teoscar Hernandez Brandon Nimmo Gio Urshela Sean Murphy
Marcell Ozuna Bobby Witt Jr. Alejandro Kirk Andrew Vaughn
Ronald Acuna Jr. Alec Bohm Ke’Bryan Hayes C.J. Cron
Hunter Renfroe Ian Happ Eugenio Suarez
Josh Donaldson Randy Arozarena Carlos Correa
Patrick Wisdom Kyle Tucker Will Smith
Trent Grisham Juan Soto Ryan Mountcastle
Eduardo Escobar Nathaniel Lowe Taylor Ward
Mookie Betts Javier Baez
Anthony Santander Brendan Rodgers
Rhys Hoskins Austin Hays
Christian Walker Julio Rodriguez
Daulton Varsho





8 Comments
Jeff Zimmerman (member)
1 year ago

Great work.

I think there are always too many categories. It would be interesting to see which ones are correlated and either combine or remove them.

MaxEV, AvgEV, Barrel%, HardHit%, SLG, xSLG, and a useful xISO (xSLG-xAVG). Get them down to one metric.

It’s the same with plate discipline numbers.

My guess is you’ll end up with a power value, an LA value, a contact value, and an OOZ/chase value.