When I started this series, which attempts to determine the projected fantasy value for prospects, I knew today’s step would be the hardest. The issue was converting various pitch grades (and control) into a workable framework for a pitcher’s overall production value. I thought I might not end up with a workable answer, but the following results have promise beyond just grading pitches.
I was able to piece together work from various articles and give each pitch a grade based on the ERA scale. By combining the per-pitch ERAs with a control value, it looks like we can estimate a pitcher’s overall value.
The one grade in the group that is standardized in scouting circles is the fastball grade, which is based on velocity. Here are the standard fastball grades used here at FanGraphs and by several teams.
|Grade||Tool Is Called||Fastball Velo|
The other pitches are graded from a scout’s experience. The key is to be able to take each of the individual grades and come up with an overall composite grade. Here is the process I will use.
Some semi-heavy (confusing) math
The process for giving pitches an ERA value wasn’t straightforward. With a curveball, how can a person tell, even with PITCHf/x and/or Trackman, which pitch break and speeds are best? Even with fastballs, velocity is just one measure, and it doesn’t take into account break and deception. To get the Pitch ERA (pERA), I used each pitch’s groundball rate (GB%) and swinging strike rate (SwStr%) to create an ERA value. Here are the four major steps.
Step 1. Collect pitch groundball rate and swinging strike rates
I hate to say this, but Mr. Eno Sarris was right, or at least he was headed down the right path, when it came to evaluating pitches. A couple of years ago, he started by giving each pitch a benchmark based on the league average (median) values for SwStr% and GB%. Over time, the values were refined, with Rob Parker at FakeTeam.com creating his own Arsenal Scores.
When Eno and I were working on the values, we never could determine how to deal with the pitches, especially four-seam fastballs, which induced weak pop flies. These low-groundball pitches are more valuable than good-groundball pitches because they give up lower BABIP. I finally found a solution.
Step 2. Create a GB% Equation for ERA
For this step, I had already created the equation a year or so ago. The key was remembering that I had created it, considering my advanced age. The equation came from my work showing the virtues of kwERA at The Hardball Times. I called it GBkwERA (horrible name, I know) and here it is:
GBkwERA = kwERA * (-3.518*GB%^2+2.344*GB%+.629)
where kwERA = 5.20 – 12*(K%-BB%)
Note: The 5.20 value can be changed depending on the run environment
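For readers who want to plug in their own numbers, here is a minimal Python sketch of the two equations above. The function and parameter names are my own for illustration, and it assumes all rates are entered as fractions (0.45 for 45%) rather than whole percentages.

```python
def kwera(k_pct, bb_pct, constant=5.20):
    """kwERA from strikeout and walk rates given as fractions (0.20 = 20%).

    The constant defaults to 5.20 and can be changed for the run environment.
    """
    return constant - 12.0 * (k_pct - bb_pct)


def gb_multiplier(gb_pct):
    """Quadratic groundball factor from the GBkwERA equation (gb_pct as a fraction)."""
    return -3.518 * gb_pct ** 2 + 2.344 * gb_pct + 0.629


def gbkwera(k_pct, bb_pct, gb_pct, constant=5.20):
    """GBkwERA = kwERA scaled by the groundball factor."""
    return kwera(k_pct, bb_pct, constant) * gb_multiplier(gb_pct)
```

As a worked example, a 20% strikeout rate, 8% walk rate and 45% groundball rate give a kwERA of 3.76 and a GBkwERA of about 3.65. Because the groundball factor is a downward-opening quadratic that peaks near a 33% groundball rate, both extreme flyball and extreme groundball rates pull the estimate down.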
With this equation, we can insert the pitch-specific groundball rates. That leaves the strikeout and walk rates, which lead us to the next two steps.
Step 3. Converting swinging strike rate to strikeout rate
While pitches can be given a strikeout rate, that’s only true if the pitch ended an at-bat. A pitcher may prefer one pitch to finish off a batter while others set up the final knockout, though, so it’s not a great measure for every pitch, particularly fastballs. For this reason, I prefer to keep using SwStr%. The swinging strike to strikeout transition can be done using this simple rule of thumb:
K% = 2 * SwStr%
I have used the formula for years. I looked at all pitchers who threw at least 60 IP in a season from 2005 to 2016. The r-squared between the two values was .60, and the r-squared value only goes up as the innings pitched increase.
Step 4. Adding a walk rate
For this step, I just used the league average walk rate of 8.0%. Using this average value allows the per-pitch math to be simpler and a separate Control/Command Grade can be individually added later.
Equation to find prospect grades
By putting all the above work together, the pERA equation ends up as:
pERA = (5.20 – 12*(SwStr% * 2 – 8%)) * (-3.518*GB%^2+2.344*GB%+.629)
Here is the spreadsheet with the values and other stats from 2014 to 2016. For reference, here are the 2016 median and average pERA values for each pitch:
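The four steps can be sketched in a few lines of Python. This is only an illustration: the names are hypothetical, rates are assumed to be fractions (0.10 for 10%), and the formula is read as the kwERA term multiplied by the groundball factor, matching the GBkwERA definition in Step 2.

```python
LEAGUE_BB_PCT = 0.08  # Step 4: league average walk rate used for every pitch


def pera(swstr_pct, gb_pct, constant=5.20, bb_pct=LEAGUE_BB_PCT):
    """Pitch ERA from a pitch's swinging strike and groundball rates (fractions)."""
    k_pct = 2.0 * swstr_pct                    # Step 3: K% = 2 * SwStr%
    kw = constant - 12.0 * (k_pct - bb_pct)    # kwERA with the fixed walk rate
    gb_factor = -3.518 * gb_pct ** 2 + 2.344 * gb_pct + 0.629  # Step 2
    return kw * gb_factor
```

For instance, a pitch with a 15% swinging strike rate and a 50% groundball rate comes out to a pERA of about 2.36.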
The individual pERA values can then be weighted by times thrown to create an overall ERA value. This value (first column in the spreadsheet) is missing the control components (walks), but is a great measure of a pitcher’s overall “stuff”.
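The usage weighting described above is a plain weighted average. Here is a small sketch, assuming each pitch type is supplied as a (pitch count, pERA) pair; the function name and input layout are my own and not the spreadsheet's.

```python
def overall_pera(pitches):
    """Usage-weighted pERA from a list of (pitch_count, pera) pairs."""
    total = sum(count for count, _ in pitches)
    if total == 0:
        raise ValueError("no pitches thrown")
    return sum(count * value for count, value in pitches) / total
```

With made-up inputs of 600 pitches at a 4.20 pERA, 300 at 3.10 and 100 at 3.80, the weighted value is 3.83.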
The next step was to create a 20-80 prospect grade for each of the pERA values. For this step, I used standard deviation with a twist. First, here is some background from Kiley McDaniel on how scouting grades and standard deviation are linked:
The invention of the scale is credited to Branch Rickey and whether he intended it or not, it mirrors various scientific scales. 50 is major league average, then each 10 point increment represents a standard deviation better or worse than average. In a normal distribution, three standard deviations in either direction should include 99.7% of your sample, so that’s why the scale is 20 to 80 rather than 0 and 100. That said, the distribution of tools isn’t a normal curve for every tool, but is somewhere close to that for most.
The problem I ran into when working with the ERA scale is that ERA values are not normally distributed. For example, consider a league average ERA of 4.00 and a standard deviation of 1.0. If I looked three standard deviations from the mean, almost no pitchers will have an ERA of 1.00, but an ERA over 6.00 is pretty common. What I did instead was determine the pERA values which would match the share of data samples that make up each step of a normal distribution.
Normally, one standard deviation on either side of the mean in a normal distribution contains 68.2% of all the available data. ERA, and therefore pERA, is not normally distributed, as seen here (average pERA value is 4.49):
So instead, I found the pERA values which were at the 34.1% level above and below the median and continued this for two more deviations. In the end, I ended up with the following values for four-seam fastballs:
|Grade||Standard Deviations||Four-seam Fastball pERA|
Then, I gave each pitch a corresponding grade depending on how many standard deviations the pitch grades (pERA) were from the mean.
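One way to reproduce the percentile-based cutoffs is sketched below. It is an approximation of the procedure described above, not the exact spreadsheet logic: it assumes a non-empty list of pERA values for one pitch type, uses linear interpolation between data points, and hands out grades in whole 10-point steps. All names are hypothetical.

```python
from statistics import NormalDist


def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p between 0 and 1) of an ascending list."""
    idx = p * (len(sorted_vals) - 1)
    lo = int(idx)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (idx - lo) * (sorted_vals[hi] - sorted_vals[lo])


def grade_cutoffs(peras):
    """pERA cutoff for each grade from 20 to 80, matched to normal-curve percentiles."""
    vals = sorted(peras)
    cutoffs = {}
    for sd in range(-3, 4):
        # Lower pERA is better, so +1 SD (grade 60) sits 34.1% below the median.
        p = NormalDist().cdf(-sd)
        cutoffs[50 + 10 * sd] = percentile(vals, p)
    return cutoffs


def grade(pera, cutoffs):
    """Best grade whose cutoff the pERA meets, floored at 20 (a binning assumption)."""
    for g in (80, 70, 60, 50, 40, 30):
        if pera <= cutoffs[g]:
            return g
    return 20
```

Because the cutoffs come from percentiles of the actual data rather than from a mean and standard deviation, the skew in the ERA scale is built into the grades.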
If there is one area I am not 100% happy with right now, it is this step. There just aren’t enough pitches in the top ranges, and those that are there aren’t thrown much. If I start requiring a higher minimum number of pitches thrown, though, I will have removed the bad pitches, which also should be part of the equation. If I am going to adjust information in any one area, it will probably be with this conversion, but I am not sure how.
Command/Control and Overall Grade
With the end in sight, the command and/or control component of the pitching grade must be addressed. I will let Baseball America’s J.J. Cooper explain the difference between command and control from a chat this summer:
Jake (Georgia): Hi JJ, thanks for the chat. Can you please explain the difference between control and command?
J.J. Cooper: Control is throwing strikes. It’s something that can arguably be graded by the stats–teams differ on their philosophies on this but for many teams 3.0 BB/9 in majors = 50 control. With grades going up and down from there. If the catcher sets up for a pitch at the high and outside corner of the zone and the pitcher misses to low and inside but still a strike, the pitcher is still demonstrating control. Control is throwing strikes instead of balls. Command cannot be quantified without a new level of trackman/etc that as of yet has not been developed to my knowledge. Command is hitting your target. If a catcher sets up low and away and the pitch ends up high and inside but a strike, that’s not demonstrating command. Control can be governed partly by mindset. A pitcher may decide that he’ll throw the ball over the middle of the plate rather than give up walks. Or another pitcher (like Tom Glavine) may pitch off of the plate in a 3-ball count preferring to stay with his approach rather than “give in.” It’s why you could argue that Glavine’s command was actually a grade or two better than his control. Here’s a pay story I wrote a few years ago that explains it all in extreme detail.
I am going to go the easy route and just look for an average control grade, which is pretty easy using walks per nine innings (BB/9). I used the same modified standard deviation procedure I used for pitch grades and came up with the following grades for control:
The 3.0 BB/9 value listed in the J.J. Cooper quote is close to the 3.3 BB/9 I calculated. The difference arises because pitchers with better control are allowed to throw more pitches, which weights the league average closer to 3.0 BB/9.
Right now, I am going to take a step which doesn’t make sense mathematically, in an OPS sort of way, but ends up working out fine: I am going to use BB/9 and BB% in the same equation. The reason is that the math is way easier and that BB/9 values are normally used for Control Grades. What I found was that for every one-point change in BB/9, a pitcher’s ERA changes in the same direction by half a point. That rule is just too easy not to use, especially when looking at all pitchers who have control issues. Additionally, I found the following formula to get an overall ERA value (second column in the spreadsheet).
Overall pERA with control = Overall pERA w/o control + ((BB/9 -2) * 0.5)
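The control adjustment is simple enough to write as a one-line Python function. The name is hypothetical, and the 2 and 0.5 constants are taken directly from the formula above.

```python
def pera_with_control(pera_without_control, bb_per_9):
    """Add half a run of ERA for every walk per nine innings above 2.0."""
    return pera_without_control + (bb_per_9 - 2.0) * 0.5
```

For example, a pitcher with a 3.83 overall pERA before control and a 3.5 BB/9 ends up at 4.58.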
That is it for today. Time to wrap things up.
Future work and conclusions
I am at the point where I need to divide the pitchers up into starters and relievers to give them their overall prospect grades, a process which deserves its own article. Additionally, I need to look into the number of pitch types thrown to determine how much a varied arsenal matters. Some initial quick looks seem to put the adjustment at +/- 0.25 ERA, but I want to test out some more ideas.
The preceding process pulls formulas and assumptions from several places. In the end, every value depends on some previous formula. Here is where the values could be adjusted in the future to tighten up the final figure.
- GBkwERA formula: All the other values are based off it, so a small change with it will mean further changes down the line.
- Pitch type standard deviations: I don’t like the results of this step, but every change I made seemed to make the problem worse (like pitches being graded at 110). I am working on a solution, but have nothing yet.
- Command value added: While I feel good about the control value, I would like to add a command value. I am just not sure how to yet.
One final issue with the data right now is that it is just backward looking and probably not the best for predictive purposes. One step I want to add (or someone else can, if they are ambitious enough) is to regress the pERA values (or probably their inputs) to get more predictive values. These regressed values may help determine how a pitcher will perform as they mess with their pitch mix.
I am a little surprised the results worked out as well as they did. In the end, each pitch is given an ERA-equivalent value (pERA) and a Pitch Grade. That value, along with a Control Grade, can be combined through several different formulas to get an overall value. For me, the process next turns to using these pitch values to help get a pitcher’s overall grade and possible MLB production. While I created the process to help with prospect grades, I can see its uses extending to valuing starter-to-reliever transitions, pitchers changing pitch mixes, adding a pitch, or result changes as velocity declines.
Finally, fire away with your questions. Let’s find the holes in the process now so it can be adjusted for future work.
Jeff, one of the authors of the fantasy baseball guide, The Process, writes for RotoGraphs, The Hardball Times, Rotowire, Baseball America, and BaseballHQ. He has been nominated for two SABR Analytics Research Awards for Contemporary Analysis and won it in 2013 in tandem with Bill Petti. He has won three FSWA Awards, including one for his MASH series. In his first two seasons in Tout Wars, he's won the H2H league and mixed auction league. Follow him on Twitter @jeffwzimmerman.