Rob Silver, the 2016 National Fantasy Baseball Championship (NFBC) Main Event winner and high-stakes fantasy baseball extraordinaire, messaged me on Twitter a few days ago to ask a question: Which source of pitching statistics is most accurate? I’m paraphrasing. Also, I could paraphrase the question any number of ways: Which source should we be using? Which most reliably correlates with pitcher performance?
It was a question for which I had no answer. Admittedly, I use a variety of sources, none of which align with one another — something I have noticed before but about which I can do nothing but shrug and accept it as a quirk of being a sabermetrician who bears the struggle of dealing with publicly available data.
The sources cryptically mentioned above include the following:
- “Plate Discipline,” hosted at FanGraphs; these data are supplied by Baseball Info Solutions (BIS)
- “Pitch Info,” also hosted at FanGraphs and which displaced PITCHf/x data a couple of years ago; these data are effectively PITCHf/x, but cleaned up and refined
- PITCHf/x, which populates Baseball Prospectus’ PITCHf/x leaderboards
As mentioned, none of the data align, at least not perfectly. Moreover, Brooks Baseball, which I use frequently for player-specific analyses of more granular pitching data, is allegedly powered by Pitch Info. But, again, to my knowledge, its numbers do not align perfectly with the Pitch Info data on FanGraphs, even though the two are presumably one and the same (but not).
Silver’s question prompted me to finally tackle the issue head-on. What follows are the results.
I pulled five years’ worth of data from each source, split up by season and limited to pitchers who qualified for the ERA title (at least 162 innings recorded) or threw at least 2,500 pitches. This creates a panel of roughly 350 player-seasons.
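The sample filter above can be sketched in a few lines. This is only an illustration: the `PitcherSeason` record, its field names and the three example rows are hypothetical, not the actual data pull.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    # Hypothetical record layout; field names are assumptions for illustration.
    name: str
    season: int
    innings: float  # innings pitched, as a decimal number
    pitches: int    # total pitches thrown


def in_sample(row: PitcherSeason,
              min_innings: float = 162.0,
              min_pitches: int = 2500) -> bool:
    """Keep a player-season that meets either threshold described in the text."""
    return row.innings >= min_innings or row.pitches >= min_pitches


# Made-up rows, purely to exercise the filter.
rows = [
    PitcherSeason("Pitcher A", 2016, 180.0, 2900),  # meets both thresholds
    PitcherSeason("Pitcher B", 2016, 155.0, 2600),  # meets the pitch count only
    PitcherSeason("Pitcher C", 2016, 90.0, 1500),   # meets neither, excluded
]
panel = [r for r in rows if in_sample(r)]
print([r.name for r in panel])
```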
For FanGraphs data (Plate Discipline and Pitch Info), I used or calculated the following variables:
- Swing that makes contact on a pitch in the zone: Zone% * Z-Swing% * Z-Contact%
- Swing that does not make contact on a pitch in the zone: Zone% * Z-Swing% * (1 - Z-Contact%)
- Swing that makes contact on a pitch outside the zone: (1 - Zone%) * O-Swing% * O-Contact%
- Swing that does not make contact on a pitch outside the zone: (1 - Zone%) * O-Swing% * (1 - O-Contact%)
- No swing on a pitch in the zone: Zone% * (1 - Z-Swing%)
- No swing on a pitch outside the zone: (1 - Zone%) * (1 - O-Swing%)
For each player, these percentages sum to 100% and cover every possible outcome of a pitch at the broadest level (in or out of the zone, swing or take, contact or whiff), all expressed as percentages/frequencies.
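For anyone who wants to reproduce the six variables, here is a minimal sketch of the arithmetic. The function name, the dictionary keys and the input values are my own placeholders (the inputs are made up, not a real pitcher's line), and the rates are entered as fractions rather than percentages.

```python
def pitch_outcomes(zone, z_swing, z_contact, o_swing, o_contact):
    """Return the six outcome shares from the list above.

    All inputs are fractions in [0, 1] (e.g. 0.45 for a 45% Zone%).
    """
    out = 1.0 - zone  # share of pitches outside the zone
    return {
        "zone_contact": zone * z_swing * z_contact,
        "zone_whiff": zone * z_swing * (1.0 - z_contact),
        "out_contact": out * o_swing * o_contact,
        "out_whiff": out * o_swing * (1.0 - o_contact),
        "zone_take": zone * (1.0 - z_swing),
        "out_take": out * (1.0 - o_swing),
    }


# Illustrative inputs only.
shares = pitch_outcomes(zone=0.45, z_swing=0.65, z_contact=0.87,
                        o_swing=0.30, o_contact=0.65)
total = sum(shares.values())  # the six shares partition every pitch
print(shares)
print(total)
```

The sum is a handy sanity check: if the six shares do not add up to 1 (within rounding), a percentage was probably entered as 45 instead of 0.45.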
For Baseball Prospectus data (PITCHf/x), I took a more circuitous route given the tools (variables) available to me:
- Swing that makes contact: [Sw Rate] * (1 - Whf/Sw)
- Swing that does not make contact: [Sw Rate] * Whf/Sw
- Called strike percentage: [Called S] / Num
This serves as a proxy for “No swing on a pitch in the zone”
- Called ball percentage: [Called B] / Num
This serves as a proxy for “No swing on a pitch outside the zone”
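The four Baseball Prospectus variables can be sketched the same way. The argument names mirror the leaderboard columns named above (Sw Rate, Whf/Sw, Called S, Called B, Num), but the function itself and the example inputs are hypothetical.

```python
def bp_outcomes(sw_rate, whiff_per_swing, called_strikes, called_balls, num_pitches):
    """Return the four PITCHf/x-based outcome shares from the list above.

    sw_rate and whiff_per_swing are fractions in [0, 1];
    called_strikes, called_balls and num_pitches are raw counts.
    """
    return {
        "contact": sw_rate * (1.0 - whiff_per_swing),
        "whiff": sw_rate * whiff_per_swing,
        "called_strike": called_strikes / num_pitches,  # proxy: take in the zone
        "called_ball": called_balls / num_pitches,      # proxy: take outside the zone
    }


# Made-up inputs for illustration.
bp = bp_outcomes(sw_rate=0.47, whiff_per_swing=0.22,
                 called_strikes=540, called_balls=1050, num_pitches=3000)
print(bp)
```

Note that the two swing shares always add back up to the swing rate, so nothing is lost by splitting it into contact and whiffs.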
You might think the less-granular data from the PITCHf/x leaderboards would produce the worst results. You might be wrong (but, also, you might be right — there’s a complication here upon which I’ll expound shortly).
For each set of data, I specified separate regression equations using every outcome listed above as independent variables for each of the following descriptive dependent variables:
- K% (strikeout rate)
- BB% (walk rate)
- ERA (earned run average)
- FIP (Fielding Independent Pitching)
- xFIP (Expected FIP)
- SIERA (Skill-Interactive ERA)
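For the curious, here is a rough sketch of what one of these regressions looks like: ordinary least squares with an intercept, scored by adjusted r². It is a stand-in written in plain Python rather than the exact software or specification behind the results, and the toy data are fabricated so that the fit is perfect. One assumption worth flagging: the six FanGraphs shares sum to 1, so alongside an intercept one of them would have to be left out as a baseline to keep the system solvable.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no perfectly collinear columns).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def adjusted_r2(features, y):
    """Fit y on the feature columns plus an intercept; return adjusted r-squared."""
    n, p = len(y), len(features[0])
    X = [[1.0] + list(row) for row in features]  # prepend the intercept column
    k = p + 1
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Toy data: two outcome shares per pitcher, with a "K%" built as an exact
# linear function of them, so the adjusted r-squared comes out at essentially 1.
toy_features = [(0.08, 0.16), (0.10, 0.18), (0.12, 0.15),
                (0.09, 0.20), (0.13, 0.17), (0.11, 0.19)]
toy_k_rate = [0.02 + 1.6 * w + 0.2 * cs for w, cs in toy_features]
adj = adjusted_r2(toy_features, toy_k_rate)
print(adj)
```

Adjusted r² penalizes each extra independent variable, which matters here because the data sources do not contribute the same number of variables.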
This table summarizes the adjusted r² produced by each equation for every dependent variable and data source:
| Metric | PITCHf/x | Plate Discipline | Pitch Info |
|---|---|---|---|
Across the board, PITCHf/x performs better than Plate Discipline and Pitch Info (except for xFIP, which neutralizes the ill effects of home runs and fly balls, thereby inadvertently leveling the playing field). You can describe (not predict, but describe) strikeout and walk rates really, really well with each data set, as you should be able to. ERA, while a more dubious affair, still bears a moderate correlation to the data. Note the increasing correlation with the peripheral pitching metrics (FIP, xFIP, SIERA), which, by no coincidence whatsoever, mirrors the strength with which they describe/predict ERA in-season. (Please do not make me dig up the SIERA over xFIP over FIP diatribe.)
But, ah, yes, the complication: the PITCHf/x data uses not only fewer variables but also potentially more accurate independent variables. Called strikes and balls are inherently more accurate when describing the outcomes of pitches with no swing: they record the call the umpire actually made, so an expanded strike zone (i.e., a strike on a pitch typically called a ball) or a squeezed pitcher (a ball on a typical strike) cannot create a mismatch between where the pitch was and how it was scored. This puts PITCHf/x data at a slight advantage, although I can’t confirm by how much. Would it be better than the others without it? Worse? Comparable?
Given the uncertainty here, and given how closely the data sources compare to one another, it’s hard for me to call this anything other than a three-way draw. I can’t, in good faith, declare that PITCHf/x has a distinct edge (small as it may be) because of this wrinkle. My best advice to you: use all of them; mentally blend them together and understand that they describe pitcher performance in different ways that ultimately produce similar outcomes. Sorry, I know it’s an underwhelming result. At least it brings me, and hopefully Silver, some peace of mind.
I’ll leave you with this final nugget. Swinging strike rate (SwStr%) is easily the most popular standalone peripheral metric that fantasy analysts use as a shorthand for pitcher effectiveness. Here’s how each data source’s swinging strike rate correlates with strikeout rate, just so the record shows:
Plate Discipline: r² = 0.753
Pitch Info: r² = 0.753
PITCHf/x: r² = 0.771
Just note you have to calculate PITCHf/x swinging strike rate on your own (swing rate multiplied by whiffs-per-swing). But it’s only microscopically better. In other words: you’re on your own, kid.
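If you want to roll your own, here is a minimal sketch of that calculation plus a plain r² between two columns. The function names and the tiny example lists are mine, invented for illustration; the numbers are not real pitcher data.

```python
def swinging_strike_rate(sw_rate, whiff_per_swing):
    """SwStr% as a fraction: swing rate multiplied by whiffs-per-swing."""
    return sw_rate * whiff_per_swing


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Made-up (swing rate, whiffs-per-swing) pairs and strikeout rates.
swings = [(0.45, 0.18), (0.48, 0.24), (0.50, 0.28), (0.46, 0.21)]
k_rates = [0.17, 0.24, 0.29, 0.22]
swstr = [swinging_strike_rate(s, w) for s, w in swings]
print(swstr)
print(r_squared(swstr, k_rates))
```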