Introducing: Weighted Plate Discipline Index (wPDI) for Pitchers

April 2, 2019

Today, I will attempt to develop a simple pitcher metric. The exercise will recap the plate discipline data at our disposal, while also giving us the opportunity to unearth some fascinating pitching tendencies of lesser-known hurlers.

To do this, let’s start with the basic ingredients of plate discipline, from the point of view of the pitcher.

We can break down any pitch into these simple binary events:

Was the ball thrown in the strike zone?
Was the ball swung on?
Did the batter make contact with the ball?

On FanGraphs, we have a multitude of statistics which track these plate discipline possibilities. They are available here for every major league player. [Definitions can be found here in our library.]

Let’s now define a few quantities which enumerate the resulting outcomes (for the binary events).

1) Was the ball thrown in the strike zone?

Zone% = Pitches in the strike zone / Total pitches

Zone% = The percent of pitches which are thrown in the zone.
(1-Zone%) = The percent of pitches which are thrown out of the zone.

2) Was the ball swung on?

Z-Swing% = Swings at pitches inside the zone / pitches inside the zone

O-Swing% = Swings at pitches outside the zone / pitches outside the zone

The denominators of these two quantities are a subset of all pitches (i.e. Z-Swing is only for pitches in the zone, and O-Swing is only for pitches outside of the zone). To reflect these outcomes as a percent of ALL pitches, the quantities are:

(Zone%) * Z-Swing% = The percent of all pitches which are in the zone & that are swung on.
(Zone%) * (1- Z-Swing%) = The percent of all pitches which are in the zone & that are not swung on.
(1 – Zone%) * O-Swing% = The percent of all pitches which are out of the zone & that are swung on.
(1 – Zone%) * (1- O-Swing%) = The percent of all pitches which are out of the zone & that are not swung on.

3) Did the batter make contact with the ball?

Z-Contact% = Pitches inside the zone on which contact was made / Swings on pitches inside the zone

O-Contact% = Pitches outside the zone on which contact was made / Swings on pitches outside the zone

Now onto contact. The denominators once again are a subset of all pitches (i.e. Z-Contact is only for pitches swung on in the zone, and O-Contact is only for pitches swung on outside of the zone). To reflect these outcomes as a percent of ALL pitches, the quantities are:

(Zone%) * (Z-Swing%) * Z-Contact% = The percent of all pitches which are in the zone, swung on & contact is made.
(Zone%) * (Z-Swing%) * (1 – Z-Contact%) = The percent of all pitches which are in the zone, swung on & contact is not made.
(1 – Zone%) * (O-Swing%) * O-Contact% = The percent of all pitches which are out of the zone, swung on & contact is made.
(1 – Zone%) * (O-Swing%) * (1 – O-Contact%) = The percent of all pitches which are out of the zone, swung on & contact is not made.

We listed three binary events – each with exactly two possible results. By pure multiplication, there would be 2³ = 8 possible outcomes. However, because a pitch that is not swung on by definition cannot have any contact made – we may remove 2 of the 8 scenarios. We are now left with just 6 possible outcomes, for which we already have defined said quantities.
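The counting argument above can be checked with a short sketch. The `label` helper and variable names are hypothetical, chosen only for illustration:

```python
from itertools import product

# All 2**3 = 8 raw combinations of (in_zone, swung, contact).
raw = list(product((True, False), repeat=3))

# Contact is impossible without a swing, so drop those two combinations.
valid = [(z, s, c) for z, s, c in raw if s or not c]


def label(in_zone, swung, contact):
    """Describe a combination using the article's wording."""
    zone = "In Zone" if in_zone else "Out of Zone"
    if not swung:
        return f"{zone} / No Swing"
    result = "Contact Made" if contact else "No Contact"
    return f"{zone} / Swung On / {result}"


outcomes = sorted(label(*combo) for combo in valid)
print(len(raw), "raw combinations,", len(outcomes), "valid outcomes")
for name in outcomes:
    print(name)
```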

From a pitcher’s perspective, some of these outcomes are better than others. Let’s talk about each of them for a bit.

Classifying the 6 Pitching Outcomes

Outcome	A	B	C	D	E	F
Zone?	Out of Zone	Out of Zone	Out of Zone	In Zone	In Zone	In Zone
Swing?	Swung On	Swung On	No Swing	Swung On	Swung On	No Swing
Contact?	No Contact	Contact Made	No Swing	No Contact	Contact Made	No Swing

Questions: How can we rank these 6 outcomes? Which of these 6 outcomes are good for pitchers? Which are not good? Which are somewhere in the middle?

Outcome A – Out of Zone / Swung On / No Contact – I would classify this as a very desirable outcome. In fact, I think that it is the most desirable. The batter shouldn’t be swinging at a pitch out of the zone in the first place – and on top of that – he didn’t make contact. The pitch will be counted as a strike. It is a very effective and deceptive pitch.

Outcome B – Out of Zone / Swung On / Contact Made – I would classify this as a generally desirable outcome, but it sits in the middle. Pitchers should want batters swinging at pitches outside the zone, and this batter shouldn’t have been swinging. Still, he did make contact, so it is far from the best outcome.

Outcome C – Out of Zone / No Swing – I would classify this as a generally undesirable outcome. Unless you have a particularly good catcher who frames well or unless you have a lousy umpire, 85-90% of these pitches will end up as balls. Obviously, you won’t give up base hits on these pitches – but as far as pitcher effectiveness, I wouldn’t classify this as a positive pitch. Looking at the flip side, it’s a very desirable outcome for a batter.

Outcome D – In Zone / Swung On / No Contact – This outcome is extremely desirable, as it will result in a strike. However, I would rank this outcome lower than A – which was out of the zone. Pitchers should desire swings on pitches outside the zone, rather than inside it.

Outcome E – In Zone / Swung On / Contact – This outcome is the least desirable. The pitch was in the zone, and the ball was struck. This outcome is the biggest source of hits and runs for the opposing team. Of course, a foul ball may ensue, or the contact may be weak and in fair play. Even so, compared to the other outcomes, I would rank this one last.

Outcome F – In Zone / No Swing – A highly desirable outcome. The pitch will be called a strike with high probability, unless you have a catcher who frames poorly or a poor umpire.

Ranking the above:

Pitching Outcome Indexes

Outcome	Description	Index
A	Out of Zone / Swung On / No Contact	100%
D	In Zone / Swung On / No Contact	90%
F	In Zone / No Swing	80%
B	Out of Zone / Swung On / Contact Made	65%
C	Out of Zone / No Swing	10%
E	In Zone / Swung On / Contact Made	0%

The indexes that I provide range from 0% to 100%. The most desirable outcome is at 100%, with the least desirable at 0%. These aren’t arbitrary, although for the moment, they aren’t formed on a purely mathematical basis. The rough idea is:

A & E are clearly the top and bottom to set the range – 100% & 0%.
D is set 10 percentage points lower than A to reflect that a swinging strike out of the zone is the more desirable outcome.
In a recent Twitter poll that I conducted (see the additional notes below), respondents concluded that F ranks lower than D. F is therefore set 10 percentage points lower than D.
B is a middle outcome, but on the positive side. We need to set it over 50%. It is positive since it generates a swing outside of the strike zone, despite the contact. I have set it at 65%.
C is set at 10% to reflect a conservative 10% chance of getting a pitch out of the zone called for a strike.
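As a quick sanity check, the rules of thumb above can be written down and tested against the index table. This is only an illustrative sketch; the `INDEX` dictionary and the check descriptions are names made up for this example, with each index expressed as a fraction (100% = 1.0):

```python
# Proposed indexes from the table above, as fractions (100% -> 1.0).
INDEX = {"A": 1.00, "D": 0.90, "F": 0.80, "B": 0.65, "C": 0.10, "E": 0.00}

# Rank outcomes from most to least desirable for the pitcher.
ranking = sorted(INDEX, key=INDEX.get, reverse=True)
print(ranking)  # ['A', 'D', 'F', 'B', 'C', 'E']

# Each rule of thumb from the list above, checked against the table.
checks = {
    "A and E set the range": INDEX["A"] == 1.0 and INDEX["E"] == 0.0,
    "D is 10 points below A": round(INDEX["A"] - INDEX["D"], 2) == 0.10,
    "F is 10 points below D": round(INDEX["D"] - INDEX["F"], 2) == 0.10,
    "B is above the midpoint": INDEX["B"] > 0.50,
    "C matches the 10% called-strike assumption": INDEX["C"] == 0.10,
}
for description, passed in checks.items():
    print(f"{description}: {passed}")
```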

Now let’s put all of this together and define a new statistic!

Outcomes as % of All Pitches

Here are the outcome definitions in terms of the plate discipline metrics. A% + … + F% = 100%

A% = Out of Zone / Swung On / No Contact = (1 – Zone%) * (O-Swing%) * (1 – O-Contact%)

B% = Out of Zone / Swung On / Contact Made = (1 – Zone%) * (O-Swing%) * O-Contact%

C% = Out of Zone / No Swing = (1 – Zone%) * (1- O-Swing%)

D% = In Zone / Swung On / No Contact = (Zone%) * (Z-Swing%) * (1 – Z-Contact%)

E% = In Zone / Swung On / Contact Made = (Zone%) * (Z-Swing%) * Z-Contact%

F% = In Zone / No Swing = (Zone%) * (1- Z-Swing%)
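The six definitions above translate directly into a small function. The sketch below assumes every input is a fraction between 0 and 1; the function name and the sample rates are hypothetical and do not describe any real pitcher:

```python
def outcome_shares(zone, z_swing, o_swing, z_contact, o_contact):
    """Return outcomes A-F as fractions of all pitches.

    All inputs are fractions between 0 and 1 (e.g. 45% -> 0.45).
    """
    out_of_zone = 1.0 - zone
    return {
        "A": out_of_zone * o_swing * (1.0 - o_contact),  # out, swing, miss
        "B": out_of_zone * o_swing * o_contact,          # out, swing, contact
        "C": out_of_zone * (1.0 - o_swing),              # out, no swing
        "D": zone * z_swing * (1.0 - z_contact),         # in, swing, miss
        "E": zone * z_swing * z_contact,                 # in, swing, contact
        "F": zone * (1.0 - z_swing),                     # in, no swing
    }


# Made-up rates for illustration only.
shares = outcome_shares(
    zone=0.45, z_swing=0.65, o_swing=0.30, z_contact=0.85, o_contact=0.60
)
for name, share in shares.items():
    print(f"{name}% = {share:.1%}")
print(f"Total = {sum(shares.values()):.1%}")
```

Because each question splits the pitches that reach it into two complementary groups, the six shares always add up to 100%, whatever rates are supplied.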

Weighted Plate Discipline Index (wPDI) for Pitchers:

The formula for wPDI, the Weighted Plate Discipline Index:

wPDI = Index_A * A% + Index_B * B% + Index_C * C% + Index_D * D% + Index_E * E% + Index_F * F%
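Here is a minimal sketch of the formula in code, using the proposed indexes expressed as fractions. The function name, its parameters and the sample rates are assumptions made for illustration; the rates are invented and belong to no real pitcher:

```python
# Proposed indexes, as fractions (100% -> 1.0).
INDEX = {"A": 1.00, "B": 0.65, "C": 0.10, "D": 0.90, "E": 0.00, "F": 0.80}


def wpdi(zone, z_swing, o_swing, z_contact, o_contact, index=INDEX):
    """Weighted Plate Discipline Index from five plate discipline rates.

    All rates are fractions between 0 and 1.
    """
    out_of_zone = 1.0 - zone
    shares = {
        "A": out_of_zone * o_swing * (1.0 - o_contact),
        "B": out_of_zone * o_swing * o_contact,
        "C": out_of_zone * (1.0 - o_swing),
        "D": zone * z_swing * (1.0 - z_contact),
        "E": zone * z_swing * z_contact,
        "F": zone * (1.0 - z_swing),
    }
    # Weight each outcome's share of all pitches by its index.
    return sum(index[name] * share for name, share in shares.items())


# Made-up rates for illustration only.
value = wpdi(zone=0.45, z_swing=0.65, o_swing=0.30, z_contact=0.85, o_contact=0.60)
print(f"{value:.3f}")  # 0.334
```

Passing a different `index` dictionary makes it easy to experiment with alternative weights. With every index set to 1.0, the function returns 1.0 for any pitcher, since the six shares sum to 100%.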

Similar to wOBA, this weighted index awards higher values to the better outcomes. It meaningfully aggregates pitcher plate discipline outcomes. It is a way to compare pitchers via one single value.

The indexes are obviously the key. For now, let’s peek at the leaderboards using the proposed indexes. Let’s see if the list of the top pitchers coincides with 2018 surface stats. Let’s also see if we can generate some interesting findings.

The leaderboards below have been generated entirely from 2018 plate discipline data:

Starting Pitcher 2018 wPDI Leaderboard

Name	IP	wPDI
Chris Sale	158.0	.390
Patrick Corbin	200.0	.377
Domingo German	85.7	.372
Jacob deGrom	217.0	.369
Max Scherzer	220.7	.367
Collin McHugh	72.3	.363
Blake Snell	180.7	.361
Carlos Carrasco	192.0	.359
Justin Verlander	214.0	.359
Kyle Hendricks	199.0	.358
Aaron Nola	212.3	.356
Masahiro Tanaka	156.0	.356
Zack Greinke	207.7	.356
Lance McCullers Jr.	128.3	.354
Jason Vargas	92.0	.353
James Paxton	160.3	.353
Trevor Bauer	175.3	.350
Gerrit Cole	200.3	.349
Stephen Strasburg	130.0	.349
Luis Castillo	169.7	.347
Hyun-Jin Ryu	82.3	.346
Marco Gonzales	166.7	.346
Noah Syndergaard	154.3	.346
Shohei Ohtani	51.7	.346
Kenta Maeda	125.3	.346
Chris Archer	148.3	.344
Corey Kluber	215.0	.344
Jack Flaherty	151.0	.344
Felix Pena	92.7	.344
Shane Bieber	114.7	.343
Robbie Ray	123.7	.343
Wade LeBlanc	162.0	.342
Pablo Lopez	58.7	.342
German Marquez	196.0	.342
Rich Hill	132.7	.342
Luis Severino	191.3	.342
Dylan Bundy	171.7	.342
Charlie Morton	167.0	.341
Joe Musgrove	115.3	.340
Trevor Cahill	110.0	.339
Zack Godley	178.3	.339
Ross Stripling	122.0	.339
Mike Clevinger	200.0	.338
Walker Buehler	137.3	.338
Tyler Skaggs	125.3	.338
Andrew Heaney	180.0	.338
Nick Pivetta	164.0	.337
Alex Wood	151.7	.336
Carlos Martinez	118.7	.336
Jameson Taillon	191.0	.336
Jose Berrios	192.3	.336
Garrett Richards	76.3	.335
Anibal Sanchez	136.7	.335
Zack Wheeler	182.3	.334
CC Sabathia	153.0	.333
Cole Hamels	190.7	.333
Miles Mikolas	200.7	.333
John Gant	114.0	.331
Joey Lucchesi	130.0	.331
Vince Velasquez	146.7	.330
Jon Gray	172.3	.330

Minimum 35 IP

Chris Sale sits atop the starting pitcher wPDI leaderboard. Other recognizable names in the top 10 include deGrom, Scherzer, Snell, Carrasco and Verlander. Patrick Corbin makes this list at #2 mostly due to his high A and low E components; he generated a lot of swings and misses outside of the zone and allowed little contact inside the zone.

Domingo German is a player who stands out within the top 10. He especially excelled in outcome F – making sure that batters did not even swing at his pitches in the zone. Keep an eye on him to start 2019. Last night, in German’s victory over the Tigers – his wPDI was .409!

Also interesting to see within the top 25 are Jason Vargas and Marco Gonzales. Vargas excelled at E & F – some of the in-zone outcomes. Gonzales excelled at B & C – some of the out-of-zone outcomes. Gonzales got a lot of batters to make contact on pitches outside of the zone.

Of note is Collin McHugh at #6, although he should technically be classified as a reliever as far as 2018 goes. In his first start of ’19 – he produced a .416 wPDI.

Relief Pitcher 2018 wPDI Leaderboard

Name	IP	wPDI
Ryan Pressly	71.0	.401
Blake Treinen	80.3	.399
Dellin Betances	66.7	.396
Oliver Perez	32.3	.394
Aroldis Chapman	51.3	.390
Will Smith	53.0	.386
Jace Fry	51.3	.383
Edwin Diaz	73.3	.380
Hector Neris	47.7	.378
Kirby Yates	63.0	.375
Erik Goeddel	36.7	.375
Josh Hader	81.3	.373
Pedro Strop	59.7	.372
Craig Stammen	79.0	.371
Jose Leclerc	57.7	.370
JT Chargois	32.3	.369
Luis Santos	20.0	.369
Raisel Iglesias	72.0	.369
Jeanmar Gomez	25.0	.368
Brad Hand	72.0	.368
Tyler Olson	27.3	.365
Daniel Coulombe	23.7	.364
Jose Castillo	38.3	.364
Tommy Kahnle	23.3	.363
Alex Claudio	68.3	.363
Reyes Moronta	65.0	.363
Craig Kimbrel	62.3	.363
Ryan Brasier	33.7	.362
Adam Ottavino	77.7	.362
A.J. Cole	48.3	.362
Alec Mills	18.0	.361
Miguel Diaz	18.7	.361
Jeremy Jeffress	76.7	.360
Keone Kela	52.0	.360
Matt Barnes	61.7	.360
Will Harris	56.7	.360
Seunghwan Oh	68.3	.360
Corbin Burnes	38.0	.360
Brooks Pounders	15.3	.359
Steve Cishek	70.3	.358
Taylor Rogers	68.3	.358
Pedro Araujo	28.0	.358
Aaron Loup	39.7	.357
Jose Alvarado	64.0	.357
Sean Doolittle	45.0	.357
Vidal Nuno	33.0	.356
Andrew Miller	34.0	.355
Jeffrey Springs	32.0	.355
Pat Neshek	24.3	.355
Tony Watson	66.0	.355
Roberto Osuna	38.0	.354
Adam Kolarek	34.3	.354
Seranthony Dominguez	58.0	.354
Tanner Scott	53.3	.353
Tony Barnette	26.3	.353
Chris Devenski	47.3	.352
Jeurys Familia	72.0	.352
Ken Giles	50.3	.351
Trevor Hildenberger	73.0	.351
A.J. Minter	61.3	.351
Sergio Romo	67.3	.351

Minimum 15 IP

Recognizable elite relief pitchers include Diaz, Treinen, Chapman and Betances, who are found within the top ten. Kirby Yates, Josh Hader and Will Smith are other closers found near the top.

Oliver Perez at #4? Well, there must be a reason why he is still employed and still being used in fairly high-leverage situations. Maybe this explains it.

Atop all relievers, though, was Ryan Pressly – who led all pitchers in 2018 at .401 [min 5 IP]. Pressly exhibited elite A, C and F components. That is, Pressly avoided bats extremely well. He generated lots of swings and misses on out-of-zone pitches, yet when the batter didn’t swing – it was often a strike. If Osuna falters in 2019, it’s clear who should be given the next save opportunity.

Jace Fry excelled in components A, E and F – which is somewhat similar to Pressly. Fry threw a few more balls out of the zone which were not swung on, but he even further limited contact on balls swung on in the zone.

Assorted Notes:

The maximum wPDI in 2018 was around .400, with the lowest (not shown) around .250. The average wPDI across all pitchers was approximately .325.
Since A% + B% + C% + D% + E% + F% = 100%, if I had used an index of 100% for each of the outcomes, all pitchers would have exhibited a value of 100%.
wPDI does not currently consider other possibly useful modifiers such as contact type (hard/medium/soft), or call data (called strikes, called balls), etc. Instead, wPDI contemplates only 3 binary events. wPDI currently goes for simplicity – breaking everything down into only 6 possible outcomes.
I took to Twitter to help me with the ranking of outcomes (polls conducted here, here and here). I tried to incorporate poll relativity in creating the initial indexes.
- I completely disagreed with Twitter’s ranking of A vs. D. Twitter slightly preferred swinging / no contact while in the zone over out of the zone. If I am a pitcher – I’d much rather induce a swing on a lousy pitch than on a good one.
- I also disagreed somewhat with Twitter’s ranking of B vs. F – which the voters seemed to be evenly split on. To me, not generating a swing on a pitch in the zone is more desirable than getting contact on an outside pitch.
wPDI is a skills-based metric. At some point into 2019, we will be able to see which pitchers exhibit skills growth and decline.
Although one single game is still a small sample size, it’s nice that wPDI can produce a “game score.” Theoretically, one can track wPDI game to game, and consider rolling averages. For wOBA, where the denominator is plate appearances – one game is an extremely small sample size. With wPDI – the denominator is pitches, which will converge a lot faster.
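The “game score” idea in the last note can be sketched as follows, assuming a pitch-by-pitch log is available. Averaging the index over every pitch is equivalent to the wPDI formula, because each outcome’s share is simply its count divided by total pitches. The function names and the tiny pitch log are hypothetical, and the rolling mean here weights every game equally rather than by pitch count:

```python
from collections import Counter

# Proposed indexes, as fractions (100% -> 1.0).
INDEX = {"A": 1.00, "B": 0.65, "C": 0.10, "D": 0.90, "E": 0.00, "F": 0.80}


def classify(in_zone, swung, contact):
    """Map one pitch to its outcome letter A-F."""
    if in_zone:
        if not swung:
            return "F"
        return "E" if contact else "D"
    if not swung:
        return "C"
    return "B" if contact else "A"


def game_wpdi(pitches):
    """Average index over every pitch thrown in one game."""
    if not pitches:
        raise ValueError("need at least one pitch")
    counts = Counter(classify(*pitch) for pitch in pitches)
    return sum(INDEX[name] * n for name, n in counts.items()) / len(pitches)


def rolling_mean(values, window):
    """Trailing mean over the last `window` games (fewer at the start)."""
    means = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        means.append(sum(chunk) / len(chunk))
    return means


# Made-up four-pitch log: (in_zone, swung, contact) for each pitch.
pitches = [
    (False, True, False),   # A
    (True, True, True),     # E
    (True, False, False),   # F
    (False, False, False),  # C
]
print(f"{game_wpdi(pitches):.3f}")

# Made-up game-by-game values, smoothed over a two-game window.
print(rolling_mean([0.30, 0.40, 0.35], window=2))
```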

All in all, this was a very useful exercise. Looking at the individual components of wPDI can tell you a lot about the effectiveness and deceptive characteristics of pitchers.

There is more work to be done on wPDI, starting with the indexes. There may be index values which nicely correlate outcomes to strikeouts, or which correlate outcomes to the limiting of walks, etc. This was a first, but meaningful, attempt. We also need to ask the question of whether to add more complexity to the metric, or to keep it simple. Should we limit wPDI to these 6 outcomes, or should we add in some other binary events and expand?

wPDI is not fully ready for prime time just yet. I first wanted to establish and to demonstrate the concept. You, the collective readers of this website, are the best possible source of feedback.
