Introducing: Weighted Plate Discipline Index (wPDI) for Pitchers

Today, I will attempt to develop a simple pitcher metric. My exercise will provide us with a recapitulation of the plate discipline data at our disposal, while at the same time affording us the opportunity to unearth some fascinating pitching tendencies of lesser-known hurlers.

To do this, let’s start with the basic ingredients of plate discipline, from the point of view of the pitcher.

We can break down any pitch into these simple binary events:

  1. Was the ball thrown in the strike zone?
  2. Was the ball swung on?
  3. Did the batter make contact with the ball?

On FanGraphs, we have a multitude of statistics which track these plate discipline possibilities. They are available here for every major league player. [Definitions can be found here in our library.]

Let’s now define a few quantities which enumerate the resulting outcomes (for the binary events).

1) Was the ball thrown in the strike zone?

Zone% = Pitches in the strike zone / Total pitches

  • Zone% = The percent of pitches which are thrown in the zone.
  • (1-Zone%) = The percent of pitches which are thrown out of the zone.

2) Was the ball swung on?

Z-Swing% = Swings at pitches inside the zone / pitches inside the zone

O-Swing% = Swings at pitches outside the zone / pitches outside the zone

The denominators of these two quantities are a subset of all pitches (i.e. Z-Swing is only for pitches in the zone, and O-Swing is only for pitches outside of the zone). To reflect these outcomes as a percent of ALL pitches, the quantities are:

  • (Zone%) * Z-Swing% = The percent of all pitches which are in the zone & that are swung on.
  • (Zone%) * (1- Z-Swing%) = The percent of all pitches which are in the zone & that are not swung on.
  • (1 – Zone%) * O-Swing% = The percent of all pitches which are out of the zone & that are swung on.
  • (1 – Zone%) * (1- O-Swing%) = The percent of all pitches which are out of the zone & that are not swung on.

3) Did the batter make contact with the ball?

Z-Contact% = Number of pitches on which contact was made on pitches inside the zone / Swings on pitches inside the zone

O-Contact% = Number of pitches on which contact was made on pitches outside the zone / Swings on pitches outside the zone

Now onto contact. The denominators once again are a subset of all pitches (i.e. Z-Contact is only for pitches swung on in the zone, and O-Contact is only for pitches swung on outside of the zone). To reflect these outcomes as a percent of ALL pitches, the quantities are:

  • (Zone%) * (Z-Swing%) * Z-Contact% = The percent of all pitches which are in the zone, swung on & contact is made.
  • (Zone%) * (Z-Swing%) * (1 – Z-Contact%) = The percent of all pitches which are in the zone, swung on & contact is not made.
  • (1 – Zone%) * (O-Swing%) * O-Contact% = The percent of all pitches which are out of the zone, swung on & contact is made.
  • (1 – Zone%) * (O-Swing%) * (1 – O-Contact%) = The percent of all pitches which are out of the zone, swung on & contact is not made.

We listed three binary events – each with exactly two possible results. By pure multiplication, there would be 2^3 = 8 possible outcomes. However, because a pitch that is not swung on cannot, by definition, result in contact, we may remove 2 of the 8 scenarios. We are now left with just 6 possible outcomes, for which we have already defined the quantities.
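The counting argument above can be checked with a short sketch. The tuple layout (in zone, swung, contact) is just an illustrative encoding, not something taken from the FanGraphs data.

```python
from itertools import product

# Each pitch is described by three yes/no events: (in_zone, swung, contact).
events = list(product([True, False], repeat=3))

# Contact is impossible without a swing, so drop those combinations.
possible = [(z, s, c) for (z, s, c) in events if s or not c]

print(len(events))    # 8
print(len(possible))  # 6
```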

From a pitcher’s perspective, some of these outcomes are better than others. Let’s talk about each of them for a bit.

Classifying the 6 Pitching Outcomes
            Outcome A     Outcome B      Outcome C     Outcome D    Outcome E      Outcome F
Zone?       Out of Zone   Out of Zone    Out of Zone   In Zone      In Zone        In Zone
Swing?      Swung On      Swung On       No Swing      Swung On     Swung On       No Swing
Contact?    No Contact    Contact Made   No Swing      No Contact   Contact Made   No Swing

Questions: How can we rank these 6 outcomes? Which of these 6 outcomes are good for pitchers? Which are not good? Which are somewhere in the middle?

Outcome A – Out of Zone / Swung On / No Contact – I would classify this as a very desirable outcome. In fact, I think that it is the most desirable. The batter shouldn’t be swinging at a pitch out of the zone in the first place – and on top of that – he didn’t make contact. The pitch will be counted as a strike. It is a very effective and deceptive pitch.

Outcome B – Out of Zone / Swung On / Contact Made – I would classify this as a generally desirable outcome, but it is in the middle. Pitchers should want batters swinging at pitches outside the zone, and here the batter swung when he shouldn’t have. Still, it is far from the best outcome, because the batter did make contact.

Outcome C – Out of Zone / No Swing – I would classify this as a generally undesirable outcome. Unless you have a particularly good catcher who frames well or unless you have a lousy umpire, 85-90% of these pitches will end up as balls. Obviously, you won’t give up base hits on these pitches – but as far as pitcher effectiveness, I wouldn’t classify this as a positive pitch. Looking at the flip side, it’s a very desirable outcome for a batter.

Outcome D – In Zone / Swung On / No Contact – This outcome is extremely desirable, as it will result in a strike. However, I would rank this outcome lower than A – which was out of the zone. Pitchers should desire swings on pitches outside the zone, rather than inside it.

Outcome E – In Zone / Swung On / Contact – This outcome is the least desirable. The pitch was in the zone, and the ball was struck. This outcome is the biggest source of hits and runs for the opposing batter. Of course, the result may still be a foul ball or weak contact in fair play. Compared to the other outcomes, though, I would rank this one as the worst.

Outcome F – In Zone / No Swing – A highly desirable outcome. The pitch will be called a strike with high probability, unless you have a poor catcher framer, or a poor umpire.

Ranking the above:

Pitching Outcome Indexes
Outcome   Description                              Index
A         Out of Zone / Swung On / No Contact      100%
D         In Zone / Swung On / No Contact          90%
F         In Zone / No Swing                       80%
B         Out of Zone / Swung On / Contact Made    65%
C         Out of Zone / No Swing                   10%
E         In Zone / Swung On / Contact Made        0%

The indexes that I provide are from 0% to 100%. The most desirable is at 100%, with the least desirable at a 0%. These aren’t arbitrary, although for the moment, they aren’t formed on a purely mathematical basis. The rough idea is:

  • A & E are clearly the top and bottom to set the range – 100% & 0%.
  • D is set 10 percentage points below A, since a swinging strike out of the zone is the more desirable of the two outcomes.
  • In a recent Twitter poll that I conducted (see the additional notes below), respondents concluded that F ranks lower than D. F is therefore set 10 percentage points below D, at 80%.
  • B is a middle outcome, but on the positive side. We need to set it over 50%. It is positive since it generates a swing outside of the strike zone, despite the contact. I have set it at 65%.
  • C is set at 10% to reflect a conservative 10% chance of getting a pitch out of the zone called for a strike.

Now let’s put all of this together and define a new statistic!

Outcomes as % of All Pitches

Here are the outcome definitions in terms of the plate discipline metrics. A% + … + F % = 100%

A% = Out of Zone / Swung On / No Contact = (1 – Zone%) * (O-Swing%) * (1 – O-Contact%)

B% = Out of Zone / Swung On / Contact Made = (1 – Zone%) * (O-Swing%) * O-Contact%

C% = Out of Zone / No Swing = (1 – Zone%) * (1- O-Swing%)

D% = In Zone / Swung On / No Contact = (Zone%) * (Z-Swing%) * (1 – Z-Contact%)

E% = In Zone / Swung On / Contact Made = (Zone%) * (Z-Swing%) * Z-Contact%

F% = In Zone / No Swing = (Zone%) * (1- Z-Swing%)
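As a minimal sketch, the six definitions above translate directly into code. The function name `outcome_shares`, its parameter names and the input values are illustrative assumptions; the inputs are made-up fractions, not any real pitcher's numbers.

```python
def outcome_shares(zone, z_swing, o_swing, z_contact, o_contact):
    """Return the six outcome shares (A-F) as fractions of all pitches.

    All inputs are fractions between 0 and 1 (e.g. 0.45 for 45%).
    """
    out = 1.0 - zone  # share of pitches thrown out of the zone
    return {
        "A": out * o_swing * (1.0 - o_contact),
        "B": out * o_swing * o_contact,
        "C": out * (1.0 - o_swing),
        "D": zone * z_swing * (1.0 - z_contact),
        "E": zone * z_swing * z_contact,
        "F": zone * (1.0 - z_swing),
    }

# Hypothetical inputs, not taken from any real pitcher.
shares = outcome_shares(zone=0.45, z_swing=0.65, o_swing=0.30,
                        z_contact=0.85, o_contact=0.60)
total = sum(shares.values())  # the six shares should add up to 1 (100%)
```

Because every pitch falls into exactly one of the six outcomes, `total` comes out to 1 (up to floating-point rounding) for any valid inputs.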

Weighted Plate Discipline Index (wPDI) for Pitchers:

The formula for wPDI, the Weighted Plate Discipline Index:

wPDI = IndexA * A% + IndexB * B% + IndexC * C% + IndexD * D% + IndexE * E% + IndexF * F%


Similar to wOBA, this weighted index awards higher values to the better outcomes. It meaningfully aggregates pitcher plate discipline outcomes. It is a way to compare pitchers via one single value.
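To make the formula concrete, here is a small sketch that applies the proposed indexes to a set of outcome shares. The function name `wpdi` and the example shares are hypothetical; the shares are illustrative fractions that sum to 1, not real 2018 data.

```python
# Index values proposed in the article, written as fractions (100% -> 1.00).
INDEXES = {"A": 1.00, "B": 0.65, "C": 0.10, "D": 0.90, "E": 0.00, "F": 0.80}

def wpdi(shares, indexes=INDEXES):
    """Weighted sum of the six outcome shares (shares should sum to 1)."""
    return sum(indexes[k] * shares[k] for k in indexes)

# Hypothetical outcome shares (fractions of all pitches), for illustration only.
example_shares = {"A": 0.066, "B": 0.099, "C": 0.385,
                  "D": 0.043875, "E": 0.248625, "F": 0.1575}

value = wpdi(example_shares)
print(round(value, 3))  # 0.334

# Sanity check: with every index at 100%, the result is just the sum of shares.
uniform = wpdi(example_shares, {k: 1.0 for k in INDEXES})
```

With these made-up shares, most of the score comes from outcomes A and F, while outcome E contributes nothing because its index is 0%.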

The indexes are obviously the key. For now, let’s peek at the leaderboards using the proposed indexes. Let’s see if the list of the top pitchers coincides with 2018 surface stats. Let’s also see if we can generate some interesting findings.

The leaderboards below have been generated entirely from 2018 plate discipline data:

Starting Pitcher 2018 wPDI Leaderboard
Name IP wPDI
Chris Sale 158.0 .390
Patrick Corbin 200.0 .377
Domingo German 85.7 .372
Jacob deGrom 217.0 .369
Max Scherzer 220.7 .367
Collin McHugh 72.3 .363
Blake Snell 180.7 .361
Carlos Carrasco 192.0 .359
Justin Verlander 214.0 .359
Kyle Hendricks 199.0 .358
Aaron Nola 212.3 .356
Masahiro Tanaka 156.0 .356
Zack Greinke 207.7 .356
Lance McCullers Jr. 128.3 .354
Jason Vargas 92.0 .353
James Paxton 160.3 .353
Trevor Bauer 175.3 .350
Gerrit Cole 200.3 .349
Stephen Strasburg 130.0 .349
Luis Castillo 169.7 .347
Hyun-Jin Ryu 82.3 .346
Marco Gonzales 166.7 .346
Noah Syndergaard 154.3 .346
Shohei Ohtani 51.7 .346
Kenta Maeda 125.3 .346
Chris Archer 148.3 .344
Corey Kluber 215.0 .344
Jack Flaherty 151.0 .344
Felix Pena 92.7 .344
Shane Bieber 114.7 .343
Robbie Ray 123.7 .343
Wade LeBlanc 162.0 .342
Pablo Lopez 58.7 .342
German Marquez 196.0 .342
Rich Hill 132.7 .342
Luis Severino 191.3 .342
Dylan Bundy 171.7 .342
Charlie Morton 167.0 .341
Joe Musgrove 115.3 .340
Trevor Cahill 110.0 .339
Zack Godley 178.3 .339
Ross Stripling 122.0 .339
Mike Clevinger 200.0 .338
Walker Buehler 137.3 .338
Tyler Skaggs 125.3 .338
Andrew Heaney 180.0 .338
Nick Pivetta 164.0 .337
Alex Wood 151.7 .336
Carlos Martinez 118.7 .336
Jameson Taillon 191.0 .336
Jose Berrios 192.3 .336
Garrett Richards 76.3 .335
Anibal Sanchez 136.7 .335
Zack Wheeler 182.3 .334
CC Sabathia 153.0 .333
Cole Hamels 190.7 .333
Miles Mikolas 200.7 .333
John Gant 114.0 .331
Joey Lucchesi 130.0 .331
Vince Velasquez 146.7 .330
Jon Gray 172.3 .330
Minimum 35 IP

Chris Sale sits atop the starting pitcher wPDI leaderboard. Other recognizable names in the top 10 include deGrom, Scherzer, Snell, Carrasco and Verlander. Patrick Corbin makes this list at #2 mostly due to his high A and low E components; he generated a lot of swings and misses outside of the zone and allowed little contact inside the zone.

Domingo German is a player who stands out within the top 10. He especially excelled in outcome F – making sure that batters did not even swing at his pitches in the zone. Keep an eye on him to start 2019. Last night, in German’s victory over the Tigers – his wPDI was .409!

Also, interesting to see within the top 25 are Jason Vargas and Marco Gonzales. Vargas excelled at E & F – some of the in-zone outcomes. Gonzales excelled at B & C – some of the out-of-zone outcomes. Gonzales got a lot of batters to make contact on pitches outside of the zone.

Of note is Collin McHugh at #6, although he should technically be classified as a reliever as far as 2018 goes. In his first start of ’19 – he produced a .416 wPDI.

Relief Pitcher 2018 wPDI Leaderboard
Name IP wPDI
Ryan Pressly 71.0 .401
Blake Treinen 80.3 .399
Dellin Betances 66.7 .396
Oliver Perez 32.3 .394
Aroldis Chapman 51.3 .390
Will Smith 53.0 .386
Jace Fry 51.3 .383
Edwin Diaz 73.3 .380
Hector Neris 47.7 .378
Kirby Yates 63.0 .375
Erik Goeddel 36.7 .375
Josh Hader 81.3 .373
Pedro Strop 59.7 .372
Craig Stammen 79.0 .371
Jose Leclerc 57.7 .370
JT Chargois 32.3 .369
Luis Santos 20.0 .369
Raisel Iglesias 72.0 .369
Jeanmar Gomez 25.0 .368
Brad Hand 72.0 .368
Tyler Olson 27.3 .365
Daniel Coulombe 23.7 .364
Jose Castillo 38.3 .364
Tommy Kahnle 23.3 .363
Alex Claudio 68.3 .363
Reyes Moronta 65.0 .363
Craig Kimbrel 62.3 .363
Ryan Brasier 33.7 .362
Adam Ottavino 77.7 .362
A.J. Cole 48.3 .362
Alec Mills 18.0 .361
Miguel Diaz 18.7 .361
Jeremy Jeffress 76.7 .360
Keone Kela 52.0 .360
Matt Barnes 61.7 .360
Will Harris 56.7 .360
Seunghwan Oh 68.3 .360
Corbin Burnes 38.0 .360
Brooks Pounders 15.3 .359
Steve Cishek 70.3 .358
Taylor Rogers 68.3 .358
Pedro Araujo 28.0 .358
Aaron Loup 39.7 .357
Jose Alvarado 64.0 .357
Sean Doolittle 45.0 .357
Vidal Nuno 33.0 .356
Andrew Miller 34.0 .355
Jeffrey Springs 32.0 .355
Pat Neshek 24.3 .355
Tony Watson 66.0 .355
Roberto Osuna 38.0 .354
Adam Kolarek 34.3 .354
Seranthony Dominguez 58.0 .354
Tanner Scott 53.3 .353
Tony Barnette 26.3 .353
Chris Devenski 47.3 .352
Jeurys Familia 72.0 .352
Ken Giles 50.3 .351
Trevor Hildenberger 73.0 .351
A.J. Minter 61.3 .351
Sergio Romo 67.3 .351
Minimum 15 IP

Recognizable elite relief pitchers include Diaz, Treinen, Chapman and Betances, who are found within the top ten. Kirby Yates, Josh Hader and Will Smith are other closers found near the top.

Oliver Perez at #4? Well, there must be a reason why he is still employed and still being used in decently high leverage situations. Maybe this explains it.

Atop all relievers, though, was Ryan Pressly – who led all pitchers in 2018 at .401 [min 5 IP]. Pressly exhibited elite A, C and F components. That is, Pressly avoided bats extremely well. He generated lots of swings and misses on out-of-zone pitches, yet when the batter didn’t swing – it was often a strike. If Osuna falters in 2019, it’s clear who should be given the next save opportunity.

Jace Fry excelled in components A, E and F – which is somewhat similar to Pressly. Fry threw a few more balls out of the zone which were not swung on, but he even further limited contact on balls swung on in the zone.

Assorted Notes:

  • The maximum wPDI in 2018 was around .400, with the lowest (not shown) around .250. The average wPDI across all pitchers was approximately .325.
  • Since A% + B% + C% + D% + E% + F% = 100%, if I had used an index of 100% for each of the outcomes, all pitchers would have exhibited a value of 100%.
  • wPDI does not currently consider other possibly useful modifiers such as contact type (hard/medium/soft), or call data (called strikes, called balls), etc. Instead, wPDI contemplates only 3 binary events. wPDI currently goes for simplicity – breaking everything down into only 6 possible outcomes.
  • I took to Twitter to help me with the ranking of outcomes (polls conducted here, here and here). I tried to incorporate poll relativity in creating the initial indexes.
    • I completely disagreed with Twitter’s ranking of A vs. D. Twitter slightly preferred swinging / no contact while in the zone over out of the zone. If I am a pitcher – I’d much rather induce a swing on a lousy pitch than on a good one.
    • I also disagreed somewhat with Twitter’s ranking of B vs. F – which the voters seemed to be evenly split on. To me, not generating a swing on a pitch in the zone is more desirable than getting contact on an outside pitch.
  • wPDI is a skills-based metric. At some point into 2019, we will be able to see which pitchers exhibit skills growth and decline.
  • Although one single game is still a small sample size, it’s nice that wPDI can produce a “game score.” Theoretically, one can track wPDI game to game, and consider rolling averages. For wOBA, where the denominator is plate appearances – one game is an extremely small sample size. With wPDI – the denominator is pitches, which will converge a lot faster.
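The rolling-average idea from the last note can be sketched as follows. Since wPDI is a weighted sum of shares of pitches, combining several games by a pitch-weighted average gives the same value as computing wPDI over all of those pitches at once. The function name `rolling_wpdi`, the window size and the game log are all hypothetical.

```python
def rolling_wpdi(games, window=3):
    """Pitch-weighted wPDI over the last `window` games, for each game.

    `games` is a list of (pitches, wpdi) tuples in chronological order.
    """
    result = []
    for i in range(len(games)):
        recent = games[max(0, i - window + 1): i + 1]
        total_pitches = sum(p for p, _ in recent)
        weighted = sum(p * w for p, w in recent)
        result.append(weighted / total_pitches)
    return result

# Hypothetical game log: (pitch count, single-game wPDI). Not real data.
game_log = [(95, 0.350), (102, 0.310), (88, 0.400), (110, 0.330)]
rolling = rolling_wpdi(game_log, window=3)
```

Weighting by pitch count keeps a short outing from counting as much as a full start.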

All in all, this was a very useful exercise. Looking at the individual components of wPDI can tell you a lot about the effectiveness and deceptive characteristics of pitchers.

There is more work to be done on wPDI, starting with the indexes. There may be index values which nicely correlate outcomes to strikeouts, or which correlate outcomes to the limiting of walks, etc. This was a first, but meaningful, attempt. We also need to ask the question of whether to add more complexity to the metric, or to keep it simple. Should we limit wPDI to these 6 outcomes, or should we add in some other binary events and expand?

wPDI is not fully ready for prime time just yet. I first wanted to establish and to demonstrate the concept. You, the collective readers of this website, are the best possible source of feedback.

We hoped you liked reading Introducing: Weighted Plate Discipline Index (wPDI) for Pitchers by Ariel Cohen!

Ariel was a finalist for two 2018 FSWA Awards - Baseball Article of the Year, and Baseball Writer of the Year. Ariel is the creator of the ATC (Average Total Cost) Projection System. Ariel also writes for CBS Sports and Sportsline, and is the host of the Great Fantasy Baseball Invitational - Beat the Shift Podcast. Ariel and his fantasy partner, Reuven Guy, have used the ATC system projections to finish in the money in several NFBC, RTSports, Doubt Wars and other national leagues, racking up several division titles. Ariel is a member of the inaugural Tout Wars Draft & Hold League. Ariel Cohen is a fellow of the Casualty Actuarial Society (CAS) and the Society of Actuaries (SOA). He is a Vice President of Risk Management for a large international insurance and reinsurance company. Follow Ariel on Twitter at @ATCNY.

NicklePickers
Member

Very cool! Excited for this one to hit prime time.