Predictability of Pitcher On-Base BABIP

by Jeff Zimmerman
July 6, 2022

I’ve had a hard time figuring out Dean Kremer and couldn’t explain his low 2.48 ERA while most of his ERA estimators are in the high 4.00s. I finally got around to looking at his splits with the bases empty and with runners on base. With the bases empty, Kremer has a .391 BABIP. Once someone gets on base, he has a .167 BABIP. Since he has a .310 overall BABIP, the discrepancy wasn’t obvious.

I decided to look into the predictability of BABIP with and without runners on base, along with related BABIP issues, and how much such a gap can help predict an ERA regression.

To do the study, I looked at starters (more than 90% of their games were starts) who threw at least 80 IP in season one and at least 40 IP in the next to help account for survivor bias. Just a reminder that previous studies found it takes forever for a pitcher’s normal BABIP to stabilize (2000 BIP, or about four seasons of 180 IP).

The first test compares BABIP with runners on base from one season to the next. The r-squared is 0.005, which might as well be zero.

For another way to look at this, I compared the overall BABIP from year 1 to the year 2 BABIP with runners on base. The previous season’s combined BABIP is about twice as predictive (an r-squared of just .009) but still useless.

So to go a step further, how much does a high or low BABIP with runners on base inflate or deflate a pitcher’s ERA? I used xFIP because it’s the most predictive of the ERA estimators in a short sample like the one we are using here. Here is a comparison of:

- A pitcher’s BABIP with runners on base
- The difference between ERA and xFIP

There is an obvious trend. The slope of the line is 10.7. Assuming a set talent BABIP for a pitcher (.287 BABIP is the league average), we should expect the pitcher’s ERA to rise by about 0.107 for every 10 points (.010) that his BABIP with runners on base is under his talent level (call it 0.1 per 10 points for ease of calculation).
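For readers who want to run the same kind of comparison on their own data, here is a minimal sketch of the sample filter and the line fit described above. It is not the code behind the article's numbers. The function names (`qualifies`, `fit_line`) and every value in `rows` are made up for illustration, and an ordinary least squares fit is assumed.

```python
def qualifies(start_share, ip_year1, ip_year2):
    """Sample filter from the text: a starter (more than 90% of games as starts)
    with at least 80 IP in season one and at least 40 IP in season two."""
    return start_share > 0.90 and ip_year1 >= 80 and ip_year2 >= 40


def fit_line(xs, ys):
    """Ordinary least squares fit of y on x.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up pitcher seasons, for illustration only (NOT the article's data):
# (start_share, ip_year1, ip_year2, runners_on_babip_y1, runners_on_babip_y2)
rows = [
    (1.00, 180.0, 150.0, 0.250, 0.301),
    (0.95, 120.0, 90.0, 0.310, 0.285),
    (1.00, 160.0, 45.0, 0.290, 0.310),
    (0.50, 85.0, 60.0, 0.270, 0.260),   # dropped: too many relief outings
    (1.00, 70.0, 100.0, 0.330, 0.295),  # dropped: under 80 IP in year one
    (1.00, 200.0, 170.0, 0.275, 0.280),
]

sample = [r for r in rows if qualifies(r[0], r[1], r[2])]
year1 = [r[3] for r in sample]
year2 = [r[4] for r in sample]

slope, intercept, r_squared = fit_line(year1, year2)
print(f"n={len(sample)} slope={slope:.3f} r^2={r_squared:.3f}")
```

The same `fit_line` helper could be pointed at runners-on BABIP versus (ERA minus xFIP) to estimate a slope like the 10.7 quoted above, given real pitcher seasons.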
Here is an example. Going back to Kremer, I’ll assume that a .287 BABIP is his talent. He has a .167 BABIP with runners on. That works out to a .120 difference (.287 - .167). Multiplying that difference by 10.7 gives a 1.28 (10.7 * .120) expected increase in ERA going forward, which works out to a 3.76 ERA (2.48 + 1.28).

His projected ERA doesn’t go all the way up to his xFIP and, in these extreme cases, it shouldn’t. There are just so many inputs to ERA that aren’t being accounted for (e.g. bullpen and defense) that this process will never provide a magic solution. What it does, though, is help explain one of the disconnects between a pitcher’s ERA and his ERA estimators.

Here are the qualified pitchers with the lowest BABIPs with runners on base and their overall ERA and xFIP (the linked table is the ERA only with runners on base).

Qualified Starters With Lowest On-Base BABIP

| Name | On-base BABIP | ERA | xFIP |
| --- | --- | --- | --- |
| Luis Garcia | .184 | 3.81 | 3.61 |
| Tyler Anderson | .200 | 3.09 | 3.97 |
| Pablo López | .207 | 2.97 | 3.49 |
| Joe Musgrove | .207 | 2.25 | 3.08 |
| Marco Gonzales | .209 | 3.29 | 4.93 |
| Alek Manoah | .221 | 2.33 | 3.84 |
| Sandy Alcantara | .222 | 1.82 | 3.39 |
| Julio Urias | .224 | 2.57 | 3.88 |
| Nick Martinez | .226 | 3.63 | 4.13 |
| Framber Valdez | .227 | 2.67 | 3.14 |

Most of these guys are good and, even if their ERAs increased up to their xFIPs, they would still be serviceable. The one exception is Marco Gonzales. Over his career, Gonzales’s ERA has consistently outperformed his xFIP by 0.65 (3.97 ERA vs. 4.62 xFIP). The catch is that it’s not because of his BABIP with runners on base (.296 BABIP with runners on, .282 with the bases empty). A safe assumption with Gonzales is that his ERA should head up to about 4.30 (4.93 xFIP - 0.65) by just using his career difference between the two.
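The two back-of-the-envelope adjustments used in this article, the BABIP-based bump for Kremer and the career-gap adjustment for Gonzales, can be sketched as a pair of small functions. This is only an illustration of the arithmetic. The function names are hypothetical, and the defaults assume the league-average .287 BABIP as the pitcher's talent level and the 10.7 slope quoted above.

```python
LEAGUE_BABIP = 0.287  # league-average BABIP quoted in the article (assumed talent level)
SLOPE = 10.7          # slope of (ERA - xFIP) against runners-on BABIP, from the article


def babip_adjusted_era(era, runners_on_babip, talent_babip=LEAGUE_BABIP, slope=SLOPE):
    """ERA after the runners-on BABIP moves back to the assumed talent level."""
    return era + slope * (talent_babip - runners_on_babip)


def career_gap_era(xfip, career_era, career_xfip):
    """ERA implied by current xFIP minus the pitcher's career xFIP-minus-ERA gap."""
    return xfip - (career_xfip - career_era)


# Inputs taken from the article.
kremer = babip_adjusted_era(era=2.48, runners_on_babip=0.167)
gonzales = career_gap_era(xfip=4.93, career_era=3.97, career_xfip=4.62)

print(f"Kremer: {kremer:.2f}")
print(f"Gonzales: {gonzales:.2f}")
```

With the article's inputs this gives 3.76 for Kremer and 4.28 for Gonzales, which the text rounds to about 4.30. A pitcher whose runners-on BABIP is above the assumed talent level would get a negative adjustment from the same function.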