Poll: Which Group of Pitchers Performs Better? – The Results

October 2, 2013

During the All-Star break, I decided to run a little experiment. I took two groups of 10 starting pitchers, made up of those whose ERAs outperformed and underperformed their SIERA marks by the largest margins. There were 437 of you who answered the question “Which Group Posts a Lower ERA RoS?” and 61.1% of you voted for Group A, the SIERA outperformers. Despite this group actually posting a higher SIERA than Group B, you felt that the magic would continue. Let’s look at the results and see whether the majority was correct.

I’ll start by reviewing how the SIERA outperformers did in both halves:

Group A – The SIERA Outperformers, 1st Half

Name	IP	K%	BB%	BABIP	LOB%	HR/FB	ERA	SIERA	Diff
Bartolo Colon	126.2	14.0%	3.0%	0.287	80.2%	6.0%	2.70	4.19	-1.49
Bronson Arroyo	123.2	13.7%	4.6%	0.254	78.9%	11.6%	3.42	4.41	-0.99
Clayton Kershaw	145.1	24.8%	6.3%	0.238	78.7%	5.4%	1.98	3.24	-1.26
Hiroki Kuroda	118.2	17.7%	5.1%	0.252	82.6%	9.8%	2.65	3.88	-1.23
Jason Marquis	112.1	14.7%	13.2%	0.256	79.7%	19.6%	3.77	5.11	-1.34
Jeff Locke	109.0	16.7%	10.8%	0.228	83.3%	6.7%	2.15	4.56	-2.41
Jorge de la Rosa	109.1	16.6%	8.5%	0.294	76.4%	6.7%	3.21	4.32	-1.11
Mike Leake	117.0	15.0%	5.5%	0.260	79.6%	10.0%	2.69	4.11	-1.42
Patrick Corbin	130.1	21.2%	6.4%	0.246	81.9%	7.8%	2.35	3.61	-1.26
Travis Wood	122.2	17.6%	7.8%	0.227	76.5%	5.8%	2.79	4.45	-1.66
Average	121.2	17.4%	7.0%	0.253	79.6%	8.8%	2.74	4.15	-1.40

Group A – The SIERA Outperformers, 2nd Half

Name	IP	K%	BB%	BABIP	LOB%	HR/FB	ERA	SIERA	Diff
Bartolo Colon	63.2	17.5%	5.2%	0.307	79.7%	6.0%	2.54	4.13	-1.59
Bronson Arroyo	78.1	17.3%	3.5%	0.288	76.3%	18.3%	4.37	3.75	0.62
Clayton Kershaw	90.2	26.7%	4.9%	0.272	83.3%	6.7%	1.59	2.76	-1.17
Hiroki Kuroda	82.2	18.9%	5.4%	0.324	68.5%	11.0%	4.25	3.67	0.58
Jason Marquis	5.1	0.0%	11.1%	0.333	45.5%	0.0%	10.13	7.23	2.90
Jeff Locke	57.1	18.9%	13.5%	0.365	67.0%	14.7%	6.12	4.53	1.59
Jorge de la Rosa	58.1	14.2%	9.0%	0.320	73.9%	9.3%	4.01	4.67	-0.66
Mike Leake	75.1	15.6%	6.7%	0.321	75.3%	13.9%	4.42	4.26	0.16
Patrick Corbin	78.0	19.9%	6.1%	0.337	68.6%	13.7%	5.19	3.67	1.52
Travis Wood	77.1	17.4%	8.4%	0.280	78.6%	8.4%	3.61	4.58	-0.97
Average	66.2	18.6%	6.7%	0.310	74.6%	11.2%	3.98	3.96	0.02

Remember that magic this group benefited from, the trio of a lucky BABIP, LOB% and HR/FB ratio in the first half? Yeah, that good fortune disappeared. Their BABIP and HR/FB marks jumped right back up to the second-half league average, but they did sustain an above-average LOB%, even though it dropped dramatically from the first half. In aggregate, this relatively small sample of 10 pitchers showed no special ability to beat SIERA. As a group, their ERA was essentially the same as their SIERA in the second half, a far cry from the 1.40 runs they outperformed their SIERA by in the first half.
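For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Diff column for Group A’s average rows. Diff is simply ERA minus SIERA, so a negative number means the group beat its SIERA. The dictionary layout and the function name `era_minus_siera` are made up for illustration. The inputs are the rounded averages from the tables above, which is why the first-half figure comes out at -1.41 rather than the -1.40 in the table (the table value was presumably computed from unrounded numbers).

```python
# Group A average rows from the tables above (rounded values).
GROUP_A = {
    "1st half": {"ERA": 2.74, "SIERA": 4.15},
    "2nd half": {"ERA": 3.98, "SIERA": 3.96},
}


def era_minus_siera(line):
    """Diff as used in the tables: negative means ERA beat SIERA."""
    return round(line["ERA"] - line["SIERA"], 2)


# Diff for each half, keyed the same way as GROUP_A.
diffs = {half: era_minus_siera(line) for half, line in GROUP_A.items()}

# How far the group's Diff moved between halves (positive = toward/above SIERA).
swing = round(diffs["2nd half"] - diffs["1st half"], 2)

if __name__ == "__main__":
    for half, diff in diffs.items():
        print(f"{half}: Diff = {diff:+.2f}")
    print(f"Swing between halves: {swing:+.2f} runs")
```

The swing is the whole story of the group in one number: nearly all of the first-half outperformance was gone in the second half.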

I asked another question in my original post: “Which Range Will Group A’s ERA Fall Into RoS?” The 3.50-3.74 range garnered the highest percentage of votes at 24.4%, while the correct range of 3.75-3.99 earned the third-highest percentage at 21.4%. It seems pretty clear that everyone assumed regression, but not as much as actually occurred.

Interestingly, this group’s strikeout and walk rates improved from the first half, which pushed its SIERA below 4.00. In the comments, Sky Kalkman, man of many saber-friendly Internet sites, shared his theory that perhaps as a pitcher’s BABIP regresses upward, as we saw in this group, his peripherals will improve. That is exactly what happened. It’s still too small a sample to conclude anything, but this theory has me intrigued now.

Group B – The SIERA Underperformers, 1st Half

Name	IP	K%	BB%	BABIP	LOB%	HR/FB	ERA	SIERA	Diff
Edinson Volquez	109.2	18.8%	10.2%	0.342	63.3%	8.8%	5.74	4.31	1.43
Edwin Jackson	100.1	19.5%	8.1%	0.320	62.3%	10.6%	5.11	3.83	1.28
Ian Kennedy	108.0	19.1%	8.4%	0.298	67.1%	12.6%	5.42	4.26	1.16
Jeremy Hellickson	117.2	20.3%	5.5%	0.296	66.9%	10.8%	4.67	3.74	0.93
Joe Blanton	112.1	18.2%	5.1%	0.343	70.6%	18.1%	5.53	3.85	1.68
Matt Cain	112.0	22.1%	7.9%	0.257	63.4%	12.7%	5.06	3.84	1.22
Rick Porcello	99.1	19.4%	4.6%	0.317	65.4%	15.7%	4.80	3.15	1.65
Roberto Hernandez	108.1	18.2%	5.4%	0.304	69.7%	21.2%	4.90	3.63	1.27
Wade Davis	94.2	19.9%	9.4%	0.381	66.3%	13.5%	5.89	4.21	1.68
Yovani Gallardo	113.2	18.3%	8.7%	0.310	65.7%	12.5%	4.83	4.10	0.73
Average	107.2	19.3%	7.3%	0.315	65.9%	13.6%	5.17	3.88	1.29

Group B – The SIERA Underperformers, 2nd Half

Name	IP	K%	BB%	BABIP	LOB%	HR/FB	ERA	SIERA	Diff
Edinson Volquez	60.2	17.4%	9.4%	0.293	67.1%	17.2%	5.64	4.39	1.25
Edwin Jackson	75.0	14.5%	7.0%	0.324	64.6%	9.2%	4.80	4.34	0.46
Ian Kennedy	73.1	22.6%	10.4%	0.290	72.2%	14.3%	4.17	4.04	0.13
Jeremy Hellickson	56.1	14.6%	9.2%	0.328	65.2%	11.1%	6.23	4.94	1.29
Joe Blanton	20.1	14.9%	7.9%	0.361	60.1%	24.0%	8.85	4.26	4.59
Matt Cain	72.1	18.8%	6.1%	0.264	84.5%	8.1%	2.36	4.04	-1.68
Rick Porcello	77.2	19.1%	7.1%	0.312	75.1%	12.1%	3.71	3.68	0.03
Roberto Hernandez	42.2	15.9%	7.1%	0.318	72.4%	20.0%	4.85	3.73	1.12
Wade Davis	40.2	15.0%	9.4%	0.316	71.0%	4.5%	3.98	4.65	-0.67
Yovani Gallardo	67.0	19.2%	8.3%	0.278	79.1%	10.9%	3.09	3.96	-0.87
Average	58.2	17.7%	8.1%	0.303	72.2%	12.3%	4.39	4.17	0.22

This group was terrible in the first half, hampered by a high BABIP and HR/FB ratio and an inability to strand runners. But most of those problems suddenly disappeared in the second half, and the group went from underperforming their SIERA marks by 1.29 runs to just 0.22 runs. Yes, they still underperformed, but 0.22 is much less significant and within a reasonable error range. In fact, the group actually posted a lower BABIP than Group A in the second half! The other luck metrics weren’t much worse than Group A’s either.

The leading vote-getter to the question of “Which Range Will Group B’s ERA Fall Into RoS?” was 4.00-4.24, with 36.2% of the vote. This proved to be a bit too optimistic and surprisingly only 9.8% of you guessed the correct range of 4.25-4.49, which garnered the fourth highest percentage of votes.

For a second time, we observe a change in peripherals, this time a decline, as the strikeout rate dropped and walk rate increased. This was on the heels of a BABIP decline, once again giving some early credence to Sky’s theory mentioned above. Ultimately, the worse skills led to a higher SIERA in the second half versus the first half.

Now let’s directly compare the average lines of each group in the second half:

Group	IP	K%	BB%	BABIP	LOB%	HR/FB	ERA	SIERA	Diff
SIERA Outperformers	66.2	18.6%	6.7%	0.310	74.6%	11.2%	3.98	3.96	0.02
SIERA Underperformers	58.2	17.7%	8.1%	0.303	72.2%	12.3%	4.39	4.17	0.22

So the 267 of you who voted that Group A, the SIERA outperformers, would post a lower rest-of-season ERA can give yourselves a pat on the back, as you were correct. But I bet it would still surprise many to learn how much the gap between the two groups narrowed. Group A posted the better peripherals and SIERA, so they should have posted a better ERA. But the story is that they essentially matched their SIERA after significantly outperforming it in the first half. It’s always tempting to fish for an explanation and try to justify the outperformance with unique visual observations and scouting-type analysis. Nobody wants to shrug their shoulders and say they don’t know. But I’m here to tell you that it’s okay: you are allowed to simply write it off as luck over a small sample size.
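To put numbers on the vote count and on how much the ERA gap narrowed, here is a short Python sketch using only the figures already quoted in this post. The variable and function names are arbitrary, and the ERAs are the rounded group averages from the tables, so treat the output as a sanity check rather than anything precise.

```python
TOTAL_VOTES = 437
GROUP_A_SHARE = 0.611  # 61.1% picked Group A, the SIERA outperformers

# Average ERA by group and half, taken from the tables above (rounded).
ERA = {
    "Group A": {"1st half": 2.74, "2nd half": 3.98},
    "Group B": {"1st half": 5.17, "2nd half": 4.39},
}


def era_gap(half):
    """Group B's average ERA minus Group A's for the given half."""
    return round(ERA["Group B"][half] - ERA["Group A"][half], 2)


# Number of voters implied by the reported percentage.
group_a_votes = round(TOTAL_VOTES * GROUP_A_SHARE)
first_half_gap = era_gap("1st half")
second_half_gap = era_gap("2nd half")

if __name__ == "__main__":
    print(f"Votes for Group A: {group_a_votes}")
    print(f"1st half ERA gap: {first_half_gap:.2f} runs")
    print(f"2nd half ERA gap: {second_half_gap:.2f} runs")
```

By this arithmetic the gap between the groups shrank from 2.43 runs in the first half to 0.41 runs in the second, which is about what the SIERA gap alone would suggest.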

In the comments of the original post, Wobatus was kind enough to figure out each group’s rest of season ZiPS projections. He calculated Group A’s as 4.16 and Group B’s as 4.12. So, essentially the same. With that context, we do note that Group A slightly outperformed their RoS projection, while Group B underperformed it. However, we don’t know what peripherals the ZiPS projections were projecting, so we’re missing crucial information necessary for a complete analysis.

Obviously, over just two halves of a season, any pitcher could outperform or underperform his SIERA mark, as we saw with Bartolo Colon and Jeremy Hellickson. But knowing ahead of time which pitchers are going to do that is a fool’s errand. If you know that, as a group, the outperformers will regress while the underperformers will improve, then you have to try your hardest to ignore ERA and rely on the underlying skills and SIERA marks.
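If you want to see what “luck over a small sample” looks like in the abstract, here is a toy Python simulation. It is a sketch under loud assumptions, not a model of real pitchers: every pitcher’s true ERA is assumed to equal his SIERA, each half-season Diff (ERA minus SIERA) is pure normal noise, and the pool size of 100 and the noise level of 0.8 runs are numbers I made up for illustration. The simulation picks the 10 biggest first-half outperformers and then checks what that same group does in the second half.

```python
import random
from statistics import mean


def simulate_season(rng, n_pitchers=100, n_selected=10, noise_sd=0.8):
    """Return (first_half_diff, second_half_diff) for the top outperformers.

    Assumption: Diff = ERA - SIERA is pure noise around zero in each half,
    i.e. nobody has a real skill for beating SIERA.
    """
    first = [rng.gauss(0.0, noise_sd) for _ in range(n_pitchers)]
    second = [rng.gauss(0.0, noise_sd) for _ in range(n_pitchers)]
    # Select the pitchers with the most negative first-half Diff.
    chosen = sorted(range(n_pitchers), key=lambda i: first[i])[:n_selected]
    return (
        mean(first[i] for i in chosen),
        mean(second[i] for i in chosen),
    )


def average_over_seasons(n_seasons=500, seed=2013):
    """Average the selected group's Diff in each half over many fake seasons."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    results = [simulate_season(rng) for _ in range(n_seasons)]
    return mean(r[0] for r in results), mean(r[1] for r in results)


if __name__ == "__main__":
    first_half, second_half = average_over_seasons()
    print(f"Selected group, 1st half Diff: {first_half:+.2f}")
    print(f"Selected group, 2nd half Diff: {second_half:+.2f}")
```

Under these assumptions, the selected group looks like it has a big SIERA-beating skill in the first half purely because of how it was picked, and then sits close to zero in the second half. That is the same shape as the Group A result, though the real data obviously cannot prove the pure-luck assumption.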

25 Comments

Mister

11 years ago

Cool. This is pretty close to what I thought would happen, but I’m surprised to see that Group A ended up winning by such a large margin. I figured it would be very close in the end.

I can’t come up with a good reason as to why peripherals moved to push SIERAs in the direction of ERAs. BABIP regression would affect per 9 peripherals, but shouldn’t affect per PA peripherals, and SIERA is calculated using K% and BB%, correct?

Reply to Mister

I guess the moral of the story is that everything regresses. Any time you have significantly above or below average performances, you should expect regression to the mean. More stable parameters like those that go into SIERA will regress less than metrics like ERA, but they will still regress.

And now I am preaching to the choir.

some.guy

“Any time you have significantly above or below average performances, you should expect regression”

Patently untrue, the way you worded it.

I only mention this because, as a Braves fan, I still hear people thinking that Kimbrel’s 90% LOB% will regress to the mean, when instead that number is the mean which accurately represents his talent. Deviations from that mean of 90% should regress, but not to an average performance of around 72%, as you seem to suggest.

Well, I didn’t say what I meant by “average.” It could mean league average, or it could mean a player’s career average from previous seasons. I almost put in a caveat there to say “league average, unless a player has a long track record of being better/worse than league average,” or something like that, but I couldn’t get the English right so I just left it slightly vague.

And the context here implied that I wasn’t referring to specific individuals, but rather to groups of players. A single player can always be an exception, but probably not a group of 10. Basically, if you pick the top 10 players in ANY stat at the halfway point of the season and then track them the rest of the season, you should find that they move in the direction of the league average in that stat, either to a large or small degree.

Never mind though, I’m wrong about this theory applying to this case. The league average SIERA this year was 3.87, and Group B’s first half SIERA was 3.88. So according to my previous explanation, Group B’s SIERA shouldn’t have changed much during the 2nd half. Either this is just random variation, or there really is something about underperforming your SIERA that leads to your SIERA increasing somewhat.

Giovani

You’re sure using a lot of words and comments section space to essentially say nothing and show you don’t understand regression.

You’re using a small number of words to show that you’re an asshole.

A mustachioed business tycoon

Hey a lot of us have hard work to do reading through comment boards around here!
