One Hitter, Two Hitter, Red Hitter, Blue Hitter

December 15, 2022

How would you define Jeff McNeil as a hitter in just a few words? If you had to place him in his own “group” of hitters, who else would you place him with? Last week, I used a cluster analysis to find a player who might compare to Luis Arraez and, in turn, help provide some approach recommendations for increasing his power. This week, I’ll use that same cluster analysis, with just a few tweaks, to determine what combination of Statcast and plate discipline metrics increases roto value on average. Let’s start with a refresher on my process.

First, I downloaded all data from the Statcast and plate discipline leaderboards for qualified hitters in 2022. Second, I conducted two Principal Component Analyses, one on the Statcast metrics and one on the plate discipline metrics. This gave me one nice number per player that encompassed all of their Statcast numbers and one nice number per player that encompassed all of their plate discipline metrics. With those numbers I created this visual:

Batter Profile Clusters
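For readers who want to see the mechanics of the "one nice number" step, here is a minimal sketch. It assumes the number is the first principal component of the standardized metrics, which the write-up above does not spell out. The player-level leaderboard data is not reproduced in this article, so the demo runs on the six cluster-average rows from the Statcast table further down, purely to show how the projection works. The function names are hypothetical, and the same call would apply to the plate discipline block.

```python
import math

# Stand-in input: the six cluster-average rows from the Statcast table in this
# article (EV, maxEV, LA, Barrel%, HardHit%, AVG, xBA, SLG, xSLG, wOBA, xwOBA).
# The real analysis would use one row per qualified hitter instead.
CLUSTER_AVERAGES = [
    [90.8, 113.9, 13.0, 10.9, 46.1, 0.279, 0.273, 0.485, 0.477, 0.357, 0.354],
    [86.4, 108.0, 13.2, 3.3, 29.1, 0.261, 0.254, 0.382, 0.357, 0.314, 0.304],
    [89.3, 111.5, 15.5, 9.9, 40.8, 0.232, 0.236, 0.406, 0.410, 0.315, 0.321],
    [89.9, 112.2, 13.3, 8.8, 42.1, 0.268, 0.258, 0.456, 0.432, 0.347, 0.337],
    [88.3, 110.4, 10.3, 5.5, 37.3, 0.261, 0.256, 0.394, 0.383, 0.312, 0.309],
    [90.3, 112.8, 13.3, 11.0, 44.7, 0.256, 0.253, 0.449, 0.447, 0.338, 0.339],
]


def standardize(rows):
    """Z-score every column so metrics on different scales weigh equally."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / (len(c) - 1))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def first_pc_scores(rows, iters=500):
    """Project each row onto the first principal component (power iteration)."""
    z = standardize(rows)
    n, p = len(z), len(z[0])
    # Covariance of standardized columns, i.e. the correlation matrix.
    cov = [
        [sum(z[k][i] * z[k][j] for k in range(n)) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    v = [1.0] * p  # arbitrary starting direction
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # One summary score per row: dot product with the leading direction.
    return [sum(a * b for a, b in zip(row, v)) for row in z]


scores = first_pc_scores(CLUSTER_AVERAGES)
for cluster, score in enumerate(scores, start=1):
    print(f"Cluster {cluster}: {score:+.2f}")
```

The sign of a principal component is arbitrary, which is part of why the plotted values do not say on their own which direction is "good".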

Now, you don’t know what’s good and what’s bad, do you? That’s the trouble with the PCA. It’s a great way to summarize but you lose some of the interpretability of the metrics. So, let me break down each cluster by showing you the averages across all players in each cluster:

Averages by Cluster – Statcast

Cluster	EV	maxEV	LA	Barrel%	HardHit%	AVG	xBA	SLG	xSLG	wOBA	xwOBA
1	90.8	113.9	13.0	10.9	46.1	0.279	0.273	0.485	0.477	0.357	0.354
2	86.4	108.0	13.2	3.3	29.1	0.261	0.254	0.382	0.357	0.314	0.304
3	89.3	111.5	15.5	9.9	40.8	0.232	0.236	0.406	0.410	0.315	0.321
4	89.9	112.2	13.3	8.8	42.1	0.268	0.258	0.456	0.432	0.347	0.337
5	88.3	110.4	10.3	5.5	37.3	0.261	0.256	0.394	0.383	0.312	0.309
6	90.3	112.8	13.3	11.0	44.7	0.256	0.253	0.449	0.447	0.338	0.339

*Qualified hitters 2022
Blue – Min
Yellow – Max
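Since the blue and yellow cell shading does not carry over into plain text, the minimum and maximum cluster for each Statcast column can be recomputed from the table above. This is a small sketch using only the table's numbers; the variable and function names are my own.

```python
# Cluster averages copied from the Statcast table above (clusters 1-6, in order).
STATCAST = {
    "EV":       [90.8, 86.4, 89.3, 89.9, 88.3, 90.3],
    "maxEV":    [113.9, 108.0, 111.5, 112.2, 110.4, 112.8],
    "LA":       [13.0, 13.2, 15.5, 13.3, 10.3, 13.3],
    "Barrel%":  [10.9, 3.3, 9.9, 8.8, 5.5, 11.0],
    "HardHit%": [46.1, 29.1, 40.8, 42.1, 37.3, 44.7],
    "AVG":      [0.279, 0.261, 0.232, 0.268, 0.261, 0.256],
    "xBA":      [0.273, 0.254, 0.236, 0.258, 0.256, 0.253],
    "SLG":      [0.485, 0.382, 0.406, 0.456, 0.394, 0.449],
    "xSLG":     [0.477, 0.357, 0.410, 0.432, 0.383, 0.447],
    "wOBA":     [0.357, 0.314, 0.315, 0.347, 0.312, 0.338],
    "xwOBA":    [0.354, 0.304, 0.321, 0.337, 0.309, 0.339],
}


def extremes(table):
    """Return {metric: (min_clusters, max_clusters)} using 1-based cluster ids."""
    out = {}
    for metric, vals in table.items():
        lo, hi = min(vals), max(vals)
        out[metric] = (
            [i + 1 for i, v in enumerate(vals) if v == lo],  # "blue" cells
            [i + 1 for i, v in enumerate(vals) if v == hi],  # "yellow" cells
        )
    return out


highlights = extremes(STATCAST)
for metric, (lo, hi) in highlights.items():
    print(f"{metric:9s} min: {lo}  max: {hi}")

# Clusters that are never a column minimum or maximum.
flagged = {c for lo, hi in highlights.values() for c in lo + hi}
unhighlighted = sorted(set(range(1, 7)) - flagged)
print("No highlighted cells:", unhighlighted)
```

With these numbers, Cluster 4 is the only cluster that never holds a column minimum or maximum, and Cluster 6 holds exactly one maximum (Barrel%).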

–

Averages by Cluster – Plate Discipline

Cluster	O-Swing%	Z-Swing%	Swing%	O-Contact%	Z-Contact%	Contact%	Zone%	F-Strike%	SwStr%	CStr%	CSW%
1	33.9	73.5	49.7	64.6	87.0	77.9	39.8	60.8	11.0	14.1	25.1
2	30.9	66.1	45.8	76.7	91.6	85.8	42.3	60.6	6.6	18.1	24.7
3	30.9	68.9	46.5	61.7	83.5	74.9	41.0	60.7	11.8	16.5	28.2
4	31.0	69.9	46.9	67.0	87.2	79.4	40.9	59.8	9.7	16.0	25.7
5	35.2	70.0	49.4	69.4	88.9	80.6	41.0	62.3	9.8	16.0	25.8
6	31.6	69.0	47.0	62.9	85.2	76.3	41.0	60.7	11.2	16.4	27.6

*Qualified hitters 2022
Blue – Min
Yellow – Max

Now, let’s try to classify each of these clusters:

Cluster 1 – Let the big dogs eat. This group rules in nearly all Statcast metrics and is just barely beaten out for the Barrel% prize by Cluster 6. They swing often and see fewer pitches in the zone than any other group. Ex: Aaron Judge.

Cluster 2 – Contact. These hitters are not being fooled, swinging less often but making contact when they do. They have good averages but could benefit from increased power. Ex: Jeff McNeil.

Cluster 3 – Whiffs. These players are not making as much contact, swinging and missing a lot, and have low averages/expected averages. Ex: Cody Bellinger.

Cluster 4 – Good, not great. Don’t let the colorless cells fool you, these hitters are good. They make good hard contact, get on base often and rarely get fooled. Ex: Juan Soto.

Cluster 5 – Aggressive. These hitters are swinging outside of the zone often, seeing first-pitch strikes often, and getting on base less often. Ex: Nick Castellanos.

Cluster 6 – Expected. These hitters are accomplishing what x-stats say they should. They hit the ball hard, but could perhaps benefit from a little more patience. Ex: Ryan Mountcastle.
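The "Expected" label for Cluster 6 can be checked against the Statcast table by adding up how far each cluster's actual numbers sit from their expected counterparts. This is a rough sketch on the cluster averages only; summing absolute gaps across AVG/xBA, SLG/xSLG and wOBA/xwOBA is my own choice of yardstick, not something taken from the analysis above.

```python
# (actual, expected) cluster averages copied from the Statcast table, clusters 1-6.
PAIRS = {
    "AVG vs xBA": (
        [0.279, 0.261, 0.232, 0.268, 0.261, 0.256],
        [0.273, 0.254, 0.236, 0.258, 0.256, 0.253],
    ),
    "SLG vs xSLG": (
        [0.485, 0.382, 0.406, 0.456, 0.394, 0.449],
        [0.477, 0.357, 0.410, 0.432, 0.383, 0.447],
    ),
    "wOBA vs xwOBA": (
        [0.357, 0.314, 0.315, 0.347, 0.312, 0.338],
        [0.354, 0.304, 0.321, 0.337, 0.309, 0.339],
    ),
}


def total_abs_gap(pairs, n_clusters=6):
    """Sum |actual - expected| over all stat pairs, one total per cluster."""
    totals = [0.0] * n_clusters
    for actual, expected in pairs.values():
        for i, (a, x) in enumerate(zip(actual, expected)):
            totals[i] += abs(a - x)
    return [round(t, 3) for t in totals]


gaps = total_abs_gap(PAIRS)
closest = gaps.index(min(gaps)) + 1  # 1-based cluster id with the smallest gap
for cluster, gap in enumerate(gaps, start=1):
    print(f"Cluster {cluster}: total gap {gap:.3f}")
print("Closest to expected: Cluster", closest)
```

By this yardstick Cluster 6 has the smallest combined gap (0.006), which fits its label.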

–

Now, to put the cherry on top, which cluster provided the most hitting value (mSB excluded from calculation) in roto dollars? Well, we can use the YTD 2022 auction calculator to find that, as you would guess, Cluster 1 wins because they had Aaron Judge. But what about the others? Here’s the average value produced from each cluster in 2022:

Cluster 1 $23.61
Cluster 2 $4.69
Cluster 3 $4.71
Cluster 4 $16.39
Cluster 5 $4.46
Cluster 6 $13.60
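To make the ordering explicit, the averages above can be sorted in a few lines. This sketch only re-sorts the six dollar values listed here; the variable names are my own.

```python
# Average 2022 roto dollar value per cluster, copied from the list above.
VALUES = {1: 23.61, 2: 4.69, 3: 4.71, 4: 16.39, 5: 4.46, 6: 13.60}

# Rank clusters from most to least valuable.
ranked = sorted(VALUES.items(), key=lambda kv: kv[1], reverse=True)
order = [cluster for cluster, _ in ranked]

for rank, (cluster, dollars) in enumerate(ranked, start=1):
    print(f"{rank}. Cluster {cluster}: ${dollars:.2f}")

# Dollar drop between the third- and fourth-ranked clusters.
tier_gap = round(ranked[2][1] - ranked[3][1], 2)
print(f"Gap between 3rd and 4th: ${tier_gap:.2f}")
```

The order comes out as Clusters 1, 4, 6, 3, 2, 5, with an $8.89 drop between the third- and fourth-ranked clusters and the bottom three bunched within about a quarter of each other.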

–

If you’re wondering what all of this means for your 2023 draft, you’re not alone. One very interesting thing, however, is that Cluster 4 has no Statcast maximum (yellow) highlighting in its row, yet these are the second most valuable players. Cluster 6 is doing what we expect: its hitters are valuable and they own the Barrel% category. While everyone is looking at maxEVs and Baseball Savant profiles, these clusters may be helpful in finding value at discounts. Now, here’s a very long table of each player and their respective cluster. Enjoy.

Classified Hitters

Cluster 1	Cluster 2	Cluster 3	Cluster 4	Cluster 5	Cluster 6
Aaron Judge	Jose Altuve	Anthony Rizzo	Paul Goldschmidt	Whit Merrifield	Bryan Reynolds
Francisco Lindor	Jeff McNeil	Josh Rojas	Alex Bregman	DJ LeMahieu	Yordan Alvarez
Trea Turner	Wilmer Flores	Mike Yastrzemski	Cedric Mullins II	Randal Grichuk	Kyle Schwarber
Corey Seager	Elvis Andrus	Seth Brown	Nolan Arenado	Ketel Marte	Gleyber Torres
Amed Rosario	Ha-seong Kim	Jesus Aguilar	Christian Yelich	Andres Gimenez	Matt Chapman
Freddie Freeman	Yuli Gurriel	Jesse Winker	Xander Bogaerts	Luis Rengifo	J.T. Realmuto
Jose Ramirez	Kyle Farmer	Mark Canha	Manny Machado	Jeremy Pena	Tommy Pham
Bo Bichette	J.P. Crawford	MJ Melendez	Ty France	Jonathan Schoop	Yandy Diaz
Pete Alonso	Adam Frazier	Cody Bellinger	Jake Cronenworth	Nico Hoerner	J.D. Martinez
Jose Abreu	Luis Arraez	Luke Voit	Jurickson Profar	Andrew Benintendi	Ryan McMahon
Marcus Semien	Steven Kwan	Carlos Santana	Adolis Garcia	Thairo Estrada	Brandon Drury
Matt Olson	Cesar Hernandez	Nelson Cruz	Shohei Ohtani	Isiah Kiner-Falefa	Andrew McCutchen
Vladimir Guerrero Jr.	Myles Straw	Jorge Mateo	Rafael Devers	Miguel Rojas	Willy Adames
Austin Riley	Charlie Blackmon	Starling Marte	Tommy Edman	A.J. Pollock	Rowdy Tellez
Dansby Swanson	Tony Kemp	Lane Thomas	Alex Verdugo	Nick Castellanos	Trey Mancini
		Max Muncy	Josh Bell	Justin Turner	George Springer
		Teoscar Hernandez	Brandon Nimmo	Gio Urshela	Sean Murphy
		Marcell Ozuna	Bobby Witt Jr.	Alejandro Kirk	Andrew Vaughn
		Ronald Acuna Jr.	Alec Bohm	Ke’Bryan Hayes	C.J. Cron
		Hunter Renfroe	Ian Happ		Eugenio Suarez
		Josh Donaldson	Randy Arozarena		Carlos Correa
		Patrick Wisdom	Kyle Tucker		Will Smith
		Trent Grisham	Juan Soto		Ryan Mountcastle
		Eduardo Escobar	Nathaniel Lowe		Taylor Ward
			Mookie Betts		Javier Baez
			Anthony Santander		Brendan Rodgers
			Rhys Hoskins		Austin Hays
			Christian Walker		Julio Rodriguez
					Daulton Varsho


Jeff Zimmerman (FanGraphs Staff)

2 years ago

Great work.

I think there are always too many categories. It would be interesting to see which ones are correlated and either combine or remove them.

MaxEV, AvgEV, Barrel%, HardHit%, SLG, xSLG, and a useful xISO (xSLG-xAVG). Get them down to one metric.

It’s the same with plate discipline numbers.

My guess is you’ll end up with a power value, a LA value, a contact value, and an OOZ/chase value.
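A first pass at the reduction suggested in this comment could look like the sketch below. It computes xISO as xSLG minus xBA (the Statcast table's name for expected average) and checks how tightly two of the power metrics move together. It runs on the six cluster averages only, so it says nothing certain about player-level correlations; the helper names are hypothetical.

```python
import math

# Cluster averages copied from the Statcast table (clusters 1-6, in order).
EV      = [90.8, 86.4, 89.3, 89.9, 88.3, 90.3]
HARDHIT = [46.1, 29.1, 40.8, 42.1, 37.3, 44.7]
XBA     = [0.273, 0.254, 0.236, 0.258, 0.256, 0.253]
XSLG    = [0.477, 0.357, 0.410, 0.432, 0.383, 0.447]

# xISO as defined in the comment: expected slugging minus expected average.
xiso = [round(s - a, 3) for s, a in zip(XSLG, XBA)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


r_ev_hardhit = pearson(EV, HARDHIT)
print("xISO by cluster:", xiso)
print(f"EV vs HardHit% correlation (cluster level): {r_ev_hardhit:.3f}")
```

Metric pairs that correlate this strongly are the natural candidates to combine or drop before re-running the clustering.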
