Poll: Which Group of Pitchers Performs Better? – The Results

During the All-Star break, I decided to conduct a little experiment. I took two groups of 10 starting pitchers, made up of those whose ERAs outperformed and underperformed their SIERA marks by the largest margins. There were 437 of you who answered the question “Which Group Posts a Lower ERA RoS?” and 61.1% of you voted for Group A, the SIERA outperformers. Despite this group actually posting a higher SIERA than Group B, you felt that the magic would continue. Let’s look at the results and find out whether the majority was correct.

I’ll start by reviewing how the SIERA outperformers did in both halves:

Group A – The SIERA Outperformers, 1st Half

Name IP K% BB% BABIP LOB% HR/FB ERA SIERA Diff
Bartolo Colon 126.2 14.0% 3.0% 0.287 80.2% 6.0% 2.70 4.19 -1.49
Bronson Arroyo 123.2 13.7% 4.6% 0.254 78.9% 11.6% 3.42 4.41 -0.99
Clayton Kershaw 145.1 24.8% 6.3% 0.238 78.7% 5.4% 1.98 3.24 -1.26
Hiroki Kuroda 118.2 17.7% 5.1% 0.252 82.6% 9.8% 2.65 3.88 -1.23
Jason Marquis 112.1 14.7% 13.2% 0.256 79.7% 19.6% 3.77 5.11 -1.34
Jeff Locke 109.0 16.7% 10.8% 0.228 83.3% 6.7% 2.15 4.56 -2.41
Jorge de la Rosa 109.1 16.6% 8.5% 0.294 76.4% 6.7% 3.21 4.32 -1.11
Mike Leake 117.0 15.0% 5.5% 0.260 79.6% 10.0% 2.69 4.11 -1.42
Patrick Corbin 130.1 21.2% 6.4% 0.246 81.9% 7.8% 2.35 3.61 -1.26
Travis Wood 122.2 17.6% 7.8% 0.227 76.5% 5.8% 2.79 4.45 -1.66
Average 121.2 17.4% 7.0% 0.253 79.6% 8.8% 2.74 4.15 -1.40

Group A – The SIERA Outperformers, 2nd Half

Name IP K% BB% BABIP LOB% HR/FB ERA SIERA Diff
Bartolo Colon 63.2 17.5% 5.2% 0.307 79.7% 6.0% 2.54 4.13 -1.59
Bronson Arroyo 78.1 17.3% 3.5% 0.288 76.3% 18.3% 4.37 3.75 0.62
Clayton Kershaw 90.2 26.7% 4.9% 0.272 83.3% 6.7% 1.59 2.76 -1.17
Hiroki Kuroda 82.2 18.9% 5.4% 0.324 68.5% 11.0% 4.25 3.67 0.58
Jason Marquis 5.1 0.0% 11.1% 0.333 45.5% 0.0% 10.13 7.23 2.90
Jeff Locke 57.1 18.9% 13.5% 0.365 67.0% 14.7% 6.12 4.53 1.59
Jorge de la Rosa 58.1 14.2% 9.0% 0.320 73.9% 9.3% 4.01 4.67 -0.66
Mike Leake 75.1 15.6% 6.7% 0.321 75.3% 13.9% 4.42 4.26 0.16
Patrick Corbin 78.0 19.9% 6.1% 0.337 68.6% 13.7% 5.19 3.67 1.52
Travis Wood 77.1 17.4% 8.4% 0.280 78.6% 8.4% 3.61 4.58 -0.97
Average 66.2 18.6% 6.7% 0.310 74.6% 11.2% 3.98 3.96 0.02

Remember the magic this group benefited from in the first half, the trio of a lucky BABIP, LOB% and HR/FB ratio? Yeah, that good fortune disappeared. Their BABIP and HR/FB marks jumped right back up to the second-half league average, though they did sustain an above-average LOB%, even if it dropped dramatically from the first half. In aggregate, over this relatively small sample of 10 pitchers, they do not actually appear to have any special ability to beat their SIERA. As a group, their ERA was essentially the same as their SIERA in the second half, a far cry from the 1.40 runs by which they outperformed their SIERA in the first half.
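As a rough check on the Average row, here is a minimal Python sketch that rebuilds Group A’s first-half ERA, SIERA and Diff from the individual lines in the table above. It rests on two assumptions the post does not confirm: that the IP column uses the usual baseball notation (.1 and .2 meaning one-third and two-thirds of an inning), and that the group line is weighted by innings rather than being a simple average of the 10 pitchers. The helper names are made up for illustration.

```python
# Group A, 1st half: (IP, ERA, SIERA) copied from the table above.
GROUP_A_FIRST_HALF = {
    "Bartolo Colon": ("126.2", 2.70, 4.19),
    "Bronson Arroyo": ("123.2", 3.42, 4.41),
    "Clayton Kershaw": ("145.1", 1.98, 3.24),
    "Hiroki Kuroda": ("118.2", 2.65, 3.88),
    "Jason Marquis": ("112.1", 3.77, 5.11),
    "Jeff Locke": ("109.0", 2.15, 4.56),
    "Jorge de la Rosa": ("109.1", 3.21, 4.32),
    "Mike Leake": ("117.0", 2.69, 4.11),
    "Patrick Corbin": ("130.1", 2.35, 3.61),
    "Travis Wood": ("122.2", 2.79, 4.45),
}


def ip_to_innings(ip: str) -> float:
    """Convert IP notation to real innings (assumed: .1 = 1/3, .2 = 2/3)."""
    whole, _, thirds = ip.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3


def ip_weighted(rows, column: int) -> float:
    """Innings-weighted average of one rate column (0 = ERA, 1 = SIERA)."""
    total_ip = sum(ip_to_innings(ip) for ip, *_ in rows)
    weighted = sum(ip_to_innings(ip) * rates[column] for ip, *rates in rows)
    return weighted / total_ip


rows = list(GROUP_A_FIRST_HALF.values())
era = ip_weighted(rows, 0)
siera = ip_weighted(rows, 1)
print(f"ERA {era:.2f}  SIERA {siera:.2f}  Diff {era - siera:+.2f}")
```

Because the inputs are already rounded to two decimals, the output can differ from the printed Average row by a hundredth or so. The same function could be pointed at the second-half rows to reproduce the other side of the comparison.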

I asked another question in my original post: “Which Range Will Group A’s ERA Fall Into RoS?” The 3.50-3.74 range garnered the highest percentage of votes at 24.4%, while the correct range of 3.75-3.99 earned the third highest percentage at 21.4%. It seems pretty clear that voters expected regression, just not as much as actually occurred.

Interestingly, this group’s strikeout and walk rates improved from the first half, which pushed its SIERA below 4.00. In the comments, Sky Kalkman, man of many saber-friendly Internet sites, shared his theory that perhaps as a group’s BABIP regresses, as we saw here, its peripherals will improve. That is exactly what happened. It’s still too small a sample to conclude anything, but this theory has me intrigued now.

Group B – The SIERA Underperformers, 1st Half

Name IP K% BB% BABIP LOB% HR/FB ERA SIERA Diff
Edinson Volquez 109.2 18.8% 10.2% 0.342 63.3% 8.8% 5.74 4.31 1.43
Edwin Jackson 100.1 19.5% 8.1% 0.320 62.3% 10.6% 5.11 3.83 1.28
Ian Kennedy 108.0 19.1% 8.4% 0.298 67.1% 12.6% 5.42 4.26 1.16
Jeremy Hellickson 117.2 20.3% 5.5% 0.296 66.9% 10.8% 4.67 3.74 0.93
Joe Blanton 112.1 18.2% 5.1% 0.343 70.6% 18.1% 5.53 3.85 1.68
Matt Cain 112.0 22.1% 7.9% 0.257 63.4% 12.7% 5.06 3.84 1.22
Rick Porcello 99.1 19.4% 4.6% 0.317 65.4% 15.7% 4.80 3.15 1.65
Roberto Hernandez 108.1 18.2% 5.4% 0.304 69.7% 21.2% 4.90 3.63 1.27
Wade Davis 94.2 19.9% 9.4% 0.381 66.3% 13.5% 5.89 4.21 1.68
Yovani Gallardo 113.2 18.3% 8.7% 0.310 65.7% 12.5% 4.83 4.10 0.73
Average 107.2 19.3% 7.3% 0.315 65.9% 13.6% 5.17 3.88 1.29

Group B – The SIERA Underperformers, 2nd Half

Name IP K% BB% BABIP LOB% HR/FB ERA SIERA Diff
Edinson Volquez 60.2 17.4% 9.4% 0.293 67.1% 17.2% 5.64 4.39 1.25
Edwin Jackson 75.0 14.5% 7.0% 0.324 64.6% 9.2% 4.80 4.34 0.46
Ian Kennedy 73.1 22.6% 10.4% 0.290 72.2% 14.3% 4.17 4.04 0.13
Jeremy Hellickson 56.1 14.6% 9.2% 0.328 65.2% 11.1% 6.23 4.94 1.29
Joe Blanton 20.1 14.9% 7.9% 0.361 60.1% 24.0% 8.85 4.26 4.59
Matt Cain 72.1 18.8% 6.1% 0.264 84.5% 8.1% 2.36 4.04 -1.68
Rick Porcello 77.2 19.1% 7.1% 0.312 75.1% 12.1% 3.71 3.68 0.03
Roberto Hernandez 42.2 15.9% 7.1% 0.318 72.4% 20.0% 4.85 3.73 1.12
Wade Davis 40.2 15.0% 9.4% 0.316 71.0% 4.5% 3.98 4.65 -0.67
Yovani Gallardo 67.0 19.2% 8.3% 0.278 79.1% 10.9% 3.09 3.96 -0.87
Average 58.2 17.7% 8.1% 0.303 72.2% 12.3% 4.39 4.17 0.22

This group was terrible in the first half, hampered by a high BABIP and HR/FB ratio and an inability to strand runners. But most of those problems suddenly disappeared in the second half, and the group went from underperforming its SIERA by 1.29 runs to just 0.22 runs. Yes, they still underperformed, but 0.22 is a much smaller gap and within a reasonable error range. In fact, the group actually posted a lower BABIP than Group A in the second half! The other luck metrics weren’t much worse than Group A’s either.

The leading vote-getter for the question “Which Range Will Group B’s ERA Fall Into RoS?” was 4.00-4.24, with 36.2% of the vote. This proved to be a bit too optimistic, and surprisingly only 9.8% of you guessed the correct range of 4.25-4.49, which garnered the fourth highest percentage of votes.

For a second time, we observe a change in peripherals, this time a decline, as the strikeout rate dropped and walk rate increased. This was on the heels of a BABIP decline, once again giving some early credence to Sky’s theory mentioned above. Ultimately, the worse skills led to a higher SIERA in the second half versus the first half.

Now let’s directly compare the average lines of each group in the second half:

Group IP K% BB% BABIP LOB% HR/FB ERA SIERA Diff
SIERA Outperformers 66.2 18.6% 6.7% 0.310 74.6% 11.2% 3.98 3.96 0.02
SIERA Underperformers 58.2 17.7% 8.1% 0.303 72.2% 12.3% 4.39 4.17 0.22

So the 267 of you who voted that Group A, the SIERA outperformers, would post a lower rest-of-season ERA, give yourselves a pat on the back, as you were correct. But I bet it would still surprise many to learn how much the gap narrowed between the two groups. Group A posted the better peripherals and SIERA in the second half, so they should have posted a better ERA. But the story is that they essentially matched their SIERA after significantly outperforming it in the first half. It’s always tempting to fish for an explanation and try to justify the outperformance with unique visual observations and scouting-type analysis. Nobody wants to shrug their shoulders and say they don’t know. But I’m here to tell you that it’s okay: you are allowed to simply chalk it up to luck over a small sample size.
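To put a number on how much the gap narrowed, here is a small sketch that takes the Average rows from the four tables above and computes Group B minus Group A for ERA and SIERA in each half. The dictionary layout and function name are illustrative only; the figures are copied straight from the tables.

```python
# Group averages (ERA, SIERA) copied from the Average rows of the tables above.
AVERAGES = {
    "A": {"1st": (2.74, 4.15), "2nd": (3.98, 3.96)},
    "B": {"1st": (5.17, 3.88), "2nd": (4.39, 4.17)},
}


def gap(half: str, column: int) -> float:
    """Group B minus Group A for one column (0 = ERA, 1 = SIERA)."""
    return round(AVERAGES["B"][half][column] - AVERAGES["A"][half][column], 2)


for half in ("1st", "2nd"):
    print(f"{half} half: ERA gap {gap(half, 0):+.2f}, SIERA gap {gap(half, 1):+.2f}")
```

By this arithmetic the ERA gap shrinks from 2.43 runs to 0.41, while the SIERA gap flips from Group B’s favor (-0.27) to Group A’s (+0.21).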

In the comments of the original post, Wobatus was kind enough to figure out each group’s rest of season ZiPS projections. He calculated Group A’s as 4.16 and Group B’s as 4.12. So, essentially the same. With that context, we do note that Group A slightly outperformed their RoS projection, while Group B underperformed it. However, we don’t know what peripherals the ZiPS projections were projecting, so we’re missing crucial information necessary for a complete analysis.

Obviously, over just two halves of a season, any individual pitcher could keep outperforming or underperforming his SIERA, like we saw with Bartolo Colon and Jeremy Hellickson. But knowing ahead of time which pitchers are going to do that is a fool’s errand. If you know that, as a group, the outperformers will regress while the underperformers will improve, then you have to try your hardest to ignore ERA and rely on the underlying skills and SIERA marks.
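For anyone who wants to see that group-level effect in isolation, here is a toy simulation sketch. It is not built from the pitchers above. It assumes every simulated pitcher has identical true skill, so that ERA minus SIERA in each half is nothing but independent, normally distributed noise; the pitcher count, noise level and seed are arbitrary choices for illustration.

```python
import random
from statistics import mean


def simulate(n_pitchers=100, group_size=10, noise_sd=0.9, seed=2013):
    """Pick the biggest first-half 'outperformers' and track them afterward.

    Toy assumption: ERA minus SIERA is pure noise, independent across halves.
    Returns the picked group's average Diff in the first and second half.
    """
    rng = random.Random(seed)
    first = [rng.gauss(0.0, noise_sd) for _ in range(n_pitchers)]
    second = [rng.gauss(0.0, noise_sd) for _ in range(n_pitchers)]
    # Most negative ERA - SIERA in the first half = biggest outperformers.
    picked = sorted(range(n_pitchers), key=lambda i: first[i])[:group_size]
    return mean(first[i] for i in picked), mean(second[i] for i in picked)


first_diff, second_diff = simulate()
print(f"1st-half Diff {first_diff:+.2f}, 2nd-half Diff {second_diff:+.2f}")
```

Under these assumptions, the selected group shows a large negative Diff in the first half simply because it was chosen for that, while its second-half Diff should sit much closer to zero. Real pitchers are not pure noise, so treat this only as an illustration of the selection effect, not as a model of the results above.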





Mike Podhorzer is the 2015 Fantasy Sports Writers Association Baseball Writer of the Year and three-time Tout Wars champion. He is the author of the eBook Projecting X 2.0: How to Forecast Baseball Player Performance, which teaches you how to project players yourself. Follow Mike on X @MikePodhorzer and contact him via email.

25 Comments
Mister
11 years ago

Cool. This is pretty close to what I thought would happen, but I’m surprised to see that Group A ended up winning by such a large margin. I figured it would be very close in the end.

I can’t come up with a good reason as to why peripherals moved to push SIERAs in the direction of ERAs. BABIP regression would affect per 9 peripherals, but shouldn’t affect per PA peripherals, and SIERA is calculated using K% and BB%, correct?

Mister
11 years ago
Reply to  Mister

I guess the moral of the story is that everything regresses. Any time you have significantly above or below average performances, you should expect regression to the mean. More stable parameters like those that go into SIERA will regress less than metrics like ERA, but they will still regress.

And now I am preaching to the choir.

some.guy
11 years ago
Reply to  Mister

“Any time you have significantly above or below average performances, you should expect regression”

Patently untrue, the way you worded it.

I only mention this because, as a Braves fan, I still hear people saying that Kimbrel’s 90% LOB% will regress to the mean, when instead that number is the mean which accurately represents his talent. Deviations from that mean of 90% should regress, but not to an average performance of around 72%, as you seem to suggest.

Mister
11 years ago
Reply to  Mister

Well, I didn’t say what I meant by “average.” It could mean league average, or it could mean a player’s career average from previous seasons. I almost put in a caveat there to say “league average, unless a player has a long track record of being better/worse than league average,” or something like that, but I couldn’t get the English right so I just left it slightly vague.

And the context here implied that I wasn’t referring to specific individuals, but rather to groups of players. A single player can always be an exception, but probably not a group of 10. Basically, if you pick the top 10 players in ANY stat at the halfway point of the season and then track them the rest of the season, you should find that they move in the direction of the league average in that stat, either to a large or small degree.

Mister
11 years ago
Reply to  Mister

Never mind, though, I’m wrong about this theory applying to this case. The league average SIERA this year was 3.87, and Group B’s first half SIERA was 3.88. So according to my previous explanation, Group B’s SIERA shouldn’t have changed much during the 2nd half. Either this is just random variation, or there really is something about underperforming your SIERA that leads to your SIERA increasing somewhat.

Giovani
11 years ago
Reply to  Mister

You’re sure using a lot of words and comments section space to essentially say nothing and show you don’t understand regression.

Mister
11 years ago
Reply to  Mister

You’re using a small number of words to show that you’re an asshole.

A mustachioed business tycoon
11 years ago
Reply to  Mister

Hey a lot of us have hard work to do reading through comment boards around here!