Which Source for Pitching Metrics is Best?

October 29, 2018

Rob Silver, the 2016 National Fantasy Baseball Championship (NFBC) Main Event winner and high-stakes fantasy baseball extraordinaire, messaged me on Twitter a few days ago to ask a question: Which source of pitching statistics is most accurate? I’m paraphrasing, and I could paraphrase the question any number of ways: Which source should we be using? Which most reliably correlates with pitcher performance?

It was a question for which I had no answer. Admittedly, I use a variety of sources, none of which align with one another. I have noticed this before, but all I can do is shrug and accept it as a quirk of being a sabermetrician who deals with publicly available data.

The sources cryptically mentioned above include the following:

“Plate Discipline,” hosted at FanGraphs; these data are supplied by Baseball Info Solutions (BIS)
“Pitch Info,” also hosted at FanGraphs, where it displaced PITCHf/x data a couple of years ago; these data are effectively PITCHf/x, but cleaned up and refined
PITCHf/x, which populates Baseball Prospectus’ PITCHf/x leaderboards

As noted, none of the data align, at least not perfectly. Moreover, Brooks Baseball, which I use frequently for player-specific analyses of more granular pitching data, is allegedly powered by Pitch Info. But, again, to my knowledge, its numbers do not align perfectly with the Pitch Info data on FanGraphs, even though the two are presumably one and the same.

Silver’s question prompted me to finally tackle the issue head-on. What follows are the results.

I pulled five years’ worth of data from each source, split up by season and limited to pitchers who qualified for the ERA title (at least 162 innings recorded) or threw at least 2,500 pitches. This creates a panel of roughly 350 player-seasons.

For FanGraphs data (Plate Discipline and Pitch Info), I used or calculated the following variables:

Swing that makes contact on a pitch in the zone: Zone% * Z-Swing% * Z-Contact%
Swing that does not make contact on a pitch in the zone: Zone% * Z-Swing% * (1 - Z-Contact%)
Swing that makes contact on a pitch outside the zone: (1 - Zone%) * O-Swing% * O-Contact%
Swing that does not make contact on a pitch outside the zone: (1 - Zone%) * O-Swing% * (1 - O-Contact%)
No swing on a pitch in the zone: Zone% * (1 - Z-Swing%)
No swing on a pitch outside the zone: (1 - Zone%) * (1 - O-Swing%)

For each pitcher, these six frequencies sum to 100%. Together they cover every possible outcome of a pitch at the highest level of aggregation.
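The six formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the article: the function name, the dictionary keys, and the input values are all made up, and the inputs are assumed to be fractions (0.45) rather than percentages (45).

```python
def pitch_outcome_shares(zone, z_swing, z_contact, o_swing, o_contact):
    """Split all pitches into six outcome shares; inputs are fractions in [0, 1]."""
    out_of_zone = 1.0 - zone
    return {
        "zone_contact": zone * z_swing * z_contact,
        "zone_whiff": zone * z_swing * (1.0 - z_contact),
        "chase_contact": out_of_zone * o_swing * o_contact,
        "chase_whiff": out_of_zone * o_swing * (1.0 - o_contact),
        "zone_take": zone * (1.0 - z_swing),
        "chase_take": out_of_zone * (1.0 - o_swing),
    }


# Made-up inputs for illustration only, not a real pitcher's line.
shares = pitch_outcome_shares(
    zone=0.45, z_swing=0.65, z_contact=0.85, o_swing=0.30, o_contact=0.60
)
total = sum(shares.values())  # the six shares partition every pitch
```

Because each formula is a branch of the same tree (in or out of the zone, swing or take, contact or whiff), the shares add up to one for any valid inputs.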

For Baseball Prospectus data (PITCHf/x), I took a more circuitous route given the tools (variables) available to me:

Swing that makes contact: [Sw Rate] * (1 - Whf/Sw)
Swing that does not make contact: [Sw Rate] * Whf/Sw
Called strike percentage: [Called S] / Num
This serves as a proxy for “No swing on a pitch in the zone”
Called ball percentage: [Called B] / Num
This serves as a proxy for “No swing on a pitch outside the zone”
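The four PITCHf/x-based variables above can be sketched the same way. The function and argument names here are hypothetical stand-ins for the leaderboard columns (Sw Rate, Whf/Sw, Called S, Called B, Num), and the numbers are invented for illustration.

```python
def pitchfx_outcome_shares(swing_rate, whiffs_per_swing,
                           called_strikes, called_balls, num_pitches):
    """Build the four per-pitch shares from rate and count columns."""
    return {
        "swing_contact": swing_rate * (1.0 - whiffs_per_swing),
        "swing_whiff": swing_rate * whiffs_per_swing,
        # Counts are divided by total pitches to put them on the same scale.
        "called_strike": called_strikes / num_pitches,
        "called_ball": called_balls / num_pitches,
    }


# Invented season line: rates as fractions, called pitches as raw counts.
pfx = pitchfx_outcome_shares(
    swing_rate=0.47, whiffs_per_swing=0.22,
    called_strikes=510, called_balls=1080, num_pitches=3000,
)
```

Note that the two swing shares always add back up to the swing rate, so the split only separates contact from whiffs; it says nothing about where the pitch was located.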

You might think the less-granular data from the PITCHf/x leaderboards would produce the worst results. You might be wrong (but, also, you might be right — there’s a complication here upon which I’ll expound shortly).

For each set of data, I specified separate regression equations using every outcome listed above as independent variables for each of the following descriptive dependent variables:

K% (strikeout rate)
BB% (walk rate)
ERA (earned run average)
FIP (Fielding Independent Pitching)
xFIP (Expected FIP)
SIERA (Skill-Interactive ERA)
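For readers who want to try this at home, here is a rough sketch of one such regression and its adjusted r², using NumPy on synthetic data. It is not the article's actual model or data. One assumption is worth flagging loudly: because the six outcome shares sum to one, this sketch drops one of them to avoid perfect collinearity with the intercept. The article does not say how that was handled.

```python
import numpy as np


def adjusted_r2(X, y):
    """Fit ordinary least squares with an intercept; return adjusted R-squared."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape  # n observations, p predictors (intercept not counted)
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Synthetic panel of 350 "player-seasons"; every number here is made up.
rng = np.random.default_rng(0)
shares = rng.dirichlet(np.full(6, 20.0), size=350)  # each row sums to 1
X = shares[:, :5]  # assumption: drop the sixth share to avoid collinearity
# Fake strikeout rate driven by the two "whiff" columns plus noise.
k_rate = 0.10 + 0.9 * shares[:, 1] + 0.9 * shares[:, 3] + rng.normal(0, 0.01, 350)

adj = adjusted_r2(X, k_rate)
```

The same function can be called once per dependent variable and data source to fill in a grid of fit measurements.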

This table summarizes the adjusted r² produced by each equation for every dependent variable and data source:

Goodness of Fit Measurements

Metric	PITCHf/x	Plate Discipline	Pitch Info
K%	0.832	0.784	0.804
BB%	0.633	0.589	0.561
ERA	0.227	0.194	0.195
FIP	0.455	0.401	0.408
xFIP	0.544	0.545	0.525
SIERA	0.631	0.582	0.576

Across the board, PITCHf/x performs better than Plate Discipline and Pitch Info (except for xFIP, which neutralizes the ill effects of home runs and fly balls, thereby inadvertently leveling the playing field). You can describe (not predict, but describe) strikeout and walk rates really, really well with each data set, as you should expect. ERA, while a more dubious affair, still bears a moderate correlation to the data. Note the increasing correlation with the peripheral pitching metrics (FIP, xFIP, SIERA), which, by no coincidence whatsoever, mirrors the strength with which they describe/predict ERA in-season. (Please do not make me dig up the SIERA over xFIP over FIP diatribe.)

But, ah, yes, the complication: the PITCHf/x data uses not only fewer variables but also potentially more accurate independent variables. Called strikes and balls are inherently more accurate when describing the outcomes of plate appearances with no swing: they record the call the umpire actually made, so they already account for an umpire expanding the strike zone (e.g., a strike on a pitch typically called a ball) or squeezing the pitcher (a ball on a typical strike). The zone-based variables from the other two sources do not. This gives the PITCHf/x data a slight advantage, although I can’t confirm by how much. Would it be better than the others without it? Worse? Comparable?

Given the uncertainty here, and given how closely the data sources compare with one another, it’s hard for me to call this anything other than a three-way draw. I can’t, in good faith, declare PITCHf/x has a distinct edge (small as it may be) because of this wrinkle. My best advice to you: use all of them; mentally blend them together and understand that they describe pitcher performance in different ways that ultimately produce similar outcomes. Sorry, I know it’s an underwhelming result. At least it brings me, and hopefully Silver, some peace of mind.

I’ll leave you with this final nugget. Swinging strike rate (SwStr%) is easily the most popular standalone peripheral metric that fantasy analysts use as a shorthand for pitcher effectiveness. Here’s how each data source’s swinging strike rate correlates with strikeout rate, just so the record shows:

Plate Discipline: r² = 0.753
Pitch Info: r² = 0.753
PITCHf/x: r² = 0.771

Just note you have to calculate PITCHf/x swinging strike rate on your own (swing rate multiplied by whiffs-per-swing). But it’s only microscopically better. In other words: you’re on your own, kid.
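A quick sketch of that do-it-yourself calculation, plus the r² against strikeout rate, might look like the following. The function names and the small arrays of pitcher data are invented for illustration; none of it comes from the actual leaderboards.

```python
import numpy as np


def swinging_strike_rate(swing_rate, whiffs_per_swing):
    """Whiffs per pitch = swings per pitch * whiffs per swing."""
    return swing_rate * whiffs_per_swing


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length samples."""
    r = np.corrcoef(x, y)[0, 1]
    return float(r ** 2)


# Five made-up pitchers: swing rate, whiffs per swing, and strikeout rate.
swing = np.array([0.45, 0.47, 0.50, 0.46, 0.48])
whiff = np.array([0.18, 0.22, 0.28, 0.20, 0.25])
k_pct = np.array([0.17, 0.22, 0.30, 0.20, 0.26])

swstr = swinging_strike_rate(swing, whiff)
fit = r_squared(swstr, k_pct)
```

With a single predictor, the squared correlation is the same number as the r² from a one-variable regression, which is why a plain correlation is enough here.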

1 Comment

Nicklaus Gaut (member since 2018)

6 years ago

Thank you for the write-up, as similar things have been on my mind lately. You described “Plate Discipline” as being hosted at FanGraphs, with data being supplied by BIS. Are FG’s “Pitch Values” PITCHf/x data that’s “cleaned up and refined” in-house by FanGraphs, or is that done by BIS as well?
