Archive for Meta Analysis

Readjusting Batted Ball Input for pERA

A few years back, I created pERA (pitch ERA) to give each pitch a grade based on its results. I never included any kind of walk rate in the individual pitch grades; walks only entered at the final value, when I added in BB/9. A few months back, I looked into Ball% and immediately knew I needed to add it to the pERA formula. On top of that, I added a weak contact element. After a new finding, I needed to go back and tweak the batted ball numbers. Read the rest of this entry »


Strikeout and Walk Adjustments From Minor League Rules

This past season, the minor leagues experimented with several rules including pre-tacked baseballs and automatic strike zones. The following is a look at how those rules changed the amount of expected production from players.

First off, I’m not going to weave a narrative around this data dump. There is no polishing this turd. The information can be referenced later as fantasy managers begin to dive into 2023’s results while preparing for next season. Read the rest of this entry »


2023 Projection Showdown — THE BAT X vs Steamer Home Run Forecasts, Part 1, A Review

Now that the regular season has ended, it’s recap time! Over the next couple of weeks (months?), I’ll be reviewing all my preseason articles. I want to always be held accountable for the advice I provide, but also it’s fun to find out what actually happened and if I was right. We’ll start with the first in my new series this year, the 2023 Projection Showdown, which pitted THE BAT X against Steamer in various hitting categories. We begin with the first category of home runs. In part 1, I identified the hitters who THE BAT X projected for a higher 600 at-bat home run pace than Steamer. Let’s find out which projection system proved closer.
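As a rough sketch of the comparison described above, the 600 at-bat pace and the "which system was closer" check could be computed like this. The function names and the sample numbers are hypothetical and only illustrate the arithmetic; they are not taken from the review itself.

```python
def hr_per_600_ab(home_runs, at_bats):
    """Scale a home run total to a 600 at-bat pace."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return home_runs * 600 / at_bats


def closer_system(actual_pace, bat_x_pace, steamer_pace):
    """Return the system whose projected pace missed the actual pace by less."""
    bat_x_miss = abs(actual_pace - bat_x_pace)
    steamer_miss = abs(actual_pace - steamer_pace)
    if bat_x_miss == steamer_miss:
        return "tie"
    return "THE BAT X" if bat_x_miss < steamer_miss else "Steamer"


# Hypothetical hitter: 25 HR in 500 AB, projected paces of 31 and 26
actual = hr_per_600_ab(25, 500)
print(actual, closer_system(actual, 31.0, 26.0))
```

Using absolute error is an assumption here; a review could just as easily score the systems by signed error or by counting wins across hitters.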

Read the rest of this entry »


Simplifying My Life: Power and Contact Thresholds

There are too many stats (“Welcome to FanGraphs”), so I decided to take a step back and try to remove as much noise as possible when making decisions. I’m not reinventing any concept, just concentrating on the most important factors. The fewer, the better. Today, I’m going to focus on my “new” power factor and mention how I settled on Contact%.

I know several other sources have a focus on keeping their inputs basic, but each one disagrees with the results. I decided to add to the disagreement and pick out the best options for the standard roto game. Read the rest of this entry »


Linking STUFFF Changes to Fantasy Relevant Stats

I have a major love-hate relationship with the STUFFF metrics. After just a few pitches, useful information becomes available to determine if a pitcher has improved or not. On the other hand, my issues with STUFFF are the lack of transparency and the way the values change as the dataset grows. With all the STUFFF talk, all I want to know is how changes in it will affect a pitcher’s fantasy-relevant stats. In my first article, I set some ERA baselines for the STUFFF values. The next step is to understand what effect a change in a STUFFF value has on a pitcher. For example, if I hear a pitcher’s Stuff+ jumped from 90 to 110, why should I care? Is the pitcher’s ERA going to drop by 1.00 or by 0.10 or not at all? I decided to just make a major data dump to have a reference when a STUFFF value does move.

Caution: The following values may or may not be predictive. They could just be descriptive. There is just not enough information (two years of data) to run any ideal predictive test at this point, especially with STUFFF’s vagueness and ever-changing nature.

Read the rest of this entry »


Upgrading My Individual Pitch Result Metric

On a personal level, the All-Star break can be declared a success as I’ve made major improvements to my pitch result evaluator, pERA. I was supposed to dive into it last season, but I spent most of the time dealing with the league’s new rules, so this update got pushed off until now. I planned on adding Ball Percentage (Ball%), Called Strikes (CStr%), and Statcast batted ball information. I felt each addition would provide a clearer picture of the pitcher’s pitches. I eventually found out I was double counting the same information with Ball% and CStr% and needed to remove one. Read the rest of this entry »


For a Starter to Beat His ERA Estimators …

The “ability” of a pitcher to consistently beat his ERA estimators will always be a discussion topic. Today, I’m going to put context on who has suppressed their ERA for two straight seasons and how they performed in the third season. I’ve been trying to see if I have missed anything while digging into under- and overperforming starters and found that I might have missed the obvious: the starter’s team.

Before getting to the team context, here are the baseline chances for starting pitchers to consistently beat certain ERA benchmarks. Read the rest of this entry »


Fastball Quality Matters …

Last week, I examined if throwing too many four-seam fastballs led to a pitcher being predictable and getting hit around. What I noticed was that I needed to expand out past just four-seamers and include sinkers. Again, I failed to find a connection between fastball quality-and-quantity and weakly hit batted balls. Instead, I was able to determine some benchmarks to find good fastballs.

Through some observations, I believed that throwing too many fastballs, especially if they were of poor quality (e.g. slow, average spin), would get hit harder. I dug through the numbers just hoping for my thoughts to be verified but I found jack squat. Nothing. Read the rest of this entry »


What is Too Many Four-Seamers?

The question came up when I examined David Peterson. I wondered if he was getting hit around because he was throwing a ton of subpar fastballs. Today, I’m back-testing the theory.

I had no idea what I was going to find but the results, positive or negative, will help to shape future studies. I examined starters from 2021 and 2022 who threw at least 20 innings (n=201). I limited the time frame to those two seasons because the STUFFF metrics have only been around that long. Also, I limited this study to guys who threw their four-seamer more than their sinker. I started with just four-seamers and stayed away from sinkers. The STUFFF metrics are separated based on pitch type so I wanted to stay in one lane.

The narrative behind four-seamers (or any fastball) would be that batters would familiarize themselves with these fastballs. I know that bad fastballs won’t generate as many strikeouts, but do they get hit around more, especially if that’s all batters see?

Additionally, I included my pERA values, which are based only on whether the pitch gets a swing-and-miss (SwStr%) and the direction it is hit (GB%). These values might seem high, but I don’t scale the value based on pitch type and fastballs generate fewer swings-and-misses than non-fastballs. It’s time to start the journey.

First, I grouped the pitchers by how far their ERA estimator was from their actual ERA. Here are the results.

Four-Seamer Fastball Metrics Depending on ERA-FIP

| ERA-FIP | > 1 | Between -1 and 1 | < -1 |
|---|---|---|---|
| BABIP | .322 | .286 | .241 |
| HR/9 | 1.5 | 1.2 | 1.3 |
| K% | 18.7% | 21.6% | 22.6% |
| FF% | 42.5% | 37.8% | 34.4% |
| FF%/(FF%+SI%) | 79.1% | 78.4% | 71.1% |
| FFv | 93.1 | 93.1 | 92.9 |
| wFF/C | -1.26 | -0.21 | 0.12 |
| Stuff+ | 86.4 | 91.9 | 94.9 |
| Bot+ | 47.6 | 52.4 | 50.0 |
| pERA | 4.82 | 4.67 | 4.68 |

 

Four-Seamer Fastball Metrics Depending on ERA-xFIP

| ERA-xFIP | > 1 | Between -1 and 1 | < -1 |
|---|---|---|---|
| BABIP | .310 | .287 | .254 |
| HR/9 | 1.8 | 1.2 | 1.0 |
| K% | 18.9% | 21.7% | 22.9% |
| FF% | 39.4% | 38.2% | 35.1% |
| FF%/(FF%+SI%) | 77.9% | 78.2% | 76.9% |
| FFv | 93.0 | 93.2 | 92.9 |
| wFF/C | -1.57 | -0.19 | 0.76 |
| Stuff+ | 87.2 | 91.3 | 99.1 |
| Bot+ | 48.8 | 52.2 | 53.5 |
| pERA | 4.88 | 4.68 | 4.50 |

 

Four-Seamer Fastball Metrics Depending on ERA-SIERA

| ERA-SIERA | > 1 | Between -1 and 1 | < -1 |
|---|---|---|---|
| BABIP | .307 | .287 | .264 |
| HR/9 | 1.9 | 1.2 | 0.9 |
| K% | 18.9% | 21.8% | 21.6% |
| FF% | 39.7% | 38.0% | 36.6% |
| FF%/(FF%+SI%) | 79.6% | 77.5% | 79.2% |
| FFv | 92.8 | 93.2 | 92.7 |
| wFF/C | -1.51 | -0.21 | 0.58 |
| Stuff+ | 87.4 | 92.0 | 93.4 |
| Bot+ | 49.2 | 52.4 | 51.7 |
| pERA | 4.87 | 4.67 | 4.58 |

 

Four-Seamer Fastball Metrics Depending on ERA-xERA

| ERA-xERA | > 1 | Between -1 and 1 | < -1 |
|---|---|---|---|
| BABIP | .309 | .286 | .276 |
| HR/9 | 1.8 | 1.2 | 1.3 |
| K% | 18.9% | 21.9% | 19.8% |
| FF% | 41.0% | 38.0% | 35.7% |
| FF%/(FF%+SI%) | 80.1% | 78.8% | 70.8% |
| FFv | 92.5 | 93.2 | 92.9 |
| wFF/C | -1.61 | -0.13 | -0.39 |
| Stuff+ | 85.2 | 92.6 | 88.6 |
| Bot+ | 47.1 | 52.7 | 49.2 |
| pERA | 4.83 | 4.65 | 4.86 |

There is a lot to unpack, but the biggest takeaways for me are:

  • The pitchers with higher-than-expected ERA threw more fastballs on average.
  • The pitchers with higher-than-expected ERA generally had worse STUFFF.
  • The pitchers with lower-than-expected ERA mixed in more sinkers.
  • Fastball velocity didn’t matter. It still remains linked to strikeouts.
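For anyone wanting to repeat this kind of grouping, here is a minimal sketch of the bucketing step. It assumes a simple unweighted average within each bucket and uses made-up rows, so it shows the mechanics only and is not necessarily how the tables above were built. The function and field names are hypothetical.

```python
from statistics import mean


def gap_bucket(era, estimator, cutoff=1.0):
    """Label a starter by how far his ERA sits from an ERA estimator."""
    gap = era - estimator
    if gap > cutoff:
        return "> 1"
    if gap < -cutoff:
        return "< -1"
    return "Between -1 and 1"


def average_by_bucket(pitchers, estimator_key, metric_key):
    """Average one metric within each ERA-minus-estimator bucket."""
    groups = {}
    for p in pitchers:
        label = gap_bucket(p["ERA"], p[estimator_key])
        groups.setdefault(label, []).append(p[metric_key])
    return {label: round(mean(values), 1) for label, values in groups.items()}


# Hypothetical rows, not real pitchers
sample = [
    {"ERA": 5.40, "FIP": 4.10, "FF%": 44.0},
    {"ERA": 5.10, "FIP": 3.90, "FF%": 41.0},
    {"ERA": 3.80, "FIP": 3.90, "FF%": 38.0},
    {"ERA": 2.90, "FIP": 4.20, "FF%": 34.0},
]
print(average_by_bucket(sample, "FIP", "FF%"))
```

Swapping `"FIP"` for another estimator column, or `"FF%"` for another metric, would fill in the other rows and tables the same way.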

Here are two more groupings by HR/9 and BABIP.

Average Four-Seamer Fastball Metrics Depending on HR/9

| HR/9 | > 1.7 | Between 0.7 and 1.7 | < 0.7 |
|---|---|---|---|
| BABIP | .294 | .285 | .293 |
| HR/9 | 2.2 | 1.2 | 0.6 |
| K% | 18.1% | 21.8% | 23.8% |
| FF% | 39.7% | 37.7% | 38.8% |
| FF%/(FF%+SI%) | 79.3% | 78.2% | 73.8% |
| FFv | 92.466 | 93.156 | 93.943 |
| wFF/C | -1.72 | -0.09 | 0.40 |
| Stuff+ | 85.7 | 92.5 | 92.3 |
| Bot+ | 49.3 | 52.1 | 53.8 |
| pERA | 4.99 | 4.65 | 4.49 |

 

Average Four-Seamer Fastball Metrics Depending on BABIP

| BABIP | > .317 | Between .253 and .317 | < .253 |
|---|---|---|---|
| BABIP | .334 | .284 | .237 |
| HR/9 | 1.3 | 1.3 | 1.2 |
| K% | 20.1% | 21.4% | 22.9% |
| FF% | 40.5% | 37.8% | 36.0% |
| FF%/(FF%+SI%) | 75.5% | 78.9% | 76.9% |
| FFv | 93.212 | 93.112 | 92.941 |
| wFF/C | -0.76 | -0.32 | 0.50 |
| Stuff+ | 85.6 | 92.1 | 96.8 |
| Bot+ | 51.1 | 52.1 | 51.5 |
| pERA | 4.75 | 4.69 | 4.59 |

The results are a little messier, but the conclusions are close to being the same.

  • The pitchers who got hit around threw a few more fastballs on average.
  • The pitchers who got hit around had worse STUFFF.
  • Neither fastball velocity nor the sinker/four-seam mix mattered for over- or underperforming the batted ball metrics.

The two major factors seem to be the usage rate and the STUFFF metrics.

After eyeballing the above tables, the warning signs seem to be a four-seam usage over 40% along with a Stuff+ value under 90 and a Bot Stuff under 50. To see if these benchmarks work, I took the 2023 starters and grouped them.
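The grouping step can be sketched roughly as follows. The thresholds are the benchmarks from the paragraph above; the field names, the function names, and the sample rows are hypothetical, and a plain average of ERA minus estimator is an assumption rather than a confirmed method.

```python
from statistics import mean


def flag_four_seamer(p, usage_min=40.0, stuff_max=90.0, bot_max=50.0):
    """Flag heavy four-seam usage paired with a poor stuff grade."""
    heavy = p["FF%"] > usage_min
    return {
        "heavy_low_stuff": heavy and p["Stuff+"] < stuff_max,
        "heavy_low_bot": heavy and p["botStf"] < bot_max,
    }


def mean_gap(pitchers, flag, estimator):
    """Average ERA minus estimator for flagged starters and for everyone else."""
    flagged, others = [], []
    for p in pitchers:
        gap = p["ERA"] - p[estimator]
        (flagged if flag_four_seamer(p)[flag] else others).append(gap)

    def avg(values):
        return round(mean(values), 2) if values else None

    return avg(flagged), avg(others)


# Hypothetical starters, not real 2023 data
starters = [
    {"FF%": 45.0, "Stuff+": 85.0, "botStf": 45.0, "ERA": 4.50, "FIP": 4.00},
    {"FF%": 42.0, "Stuff+": 95.0, "botStf": 48.0, "ERA": 3.60, "FIP": 3.90},
    {"FF%": 35.0, "Stuff+": 80.0, "botStf": 40.0, "ERA": 4.00, "FIP": 4.20},
]
print(mean_gap(starters, "heavy_low_bot", "FIP"))
```

A positive number for the flagged group would mean those starters posted an ERA above their estimator, which is what the benchmarks were expected to show.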

 

2023 ERA Minus ERA Estimators for Starters Throwing Lots of Bad Four-Seamers

| Four-seam traits | FIP | xFIP | SIERA |
|---|---|---|---|
| Usage >40%, BotStuff <50 | -0.10 | -0.19 | -0.03 |
| Everyone else | 0.06 | 0.07 | 0.04 |
| Usage >40%, Stuff+ <90 | -0.12 | 0.19 | 0.17 |
| Everyone else | 0.06 | 0.06 | 0.04 |

The pitchers I expected to perform worse actually performed better. That’s suboptimal. I did find out what possibly didn’t work, but it would be nice if the values were predictive. I ran one last comparison for future reference. Here are the pitchers’ stats, split by whether their ERA is above or below their ERA estimators so far this season.

 

2023 Stats Grouped by ERA Minus ERA Estimator Above or Below Zero

| ERA minus estimator | FF% | wFA/C | BABIP | HR/9 | botStf FF | Stf+ FF |
|---|---|---|---|---|---|---|
| ERA-FIP >0 | 40.2% | -0.53 | .320 | 1.4 | 47.9 | 93.6 |
| ERA-FIP <0 | 42.6% | 0.17 | .268 | 1.3 | 49.7 | 96.6 |
| ERA-xFIP >0 | 41.2% | -0.80 | .318 | 1.6 | 48.0 | 92.7 |
| ERA-xFIP <0 | 41.5% | 0.47 | .270 | 1.0 | 49.5 | 97.6 |
| ERA-SIERA >0 | 40.7% | -0.86 | .318 | 1.6 | 47.3 | 92.2 |
| ERA-SIERA <0 | 42.1% | 0.53 | .270 | 1.0 | 50.3 | 98.2 |

The usage doesn’t matter this season but the STUFFF values show some signs worth continued investigation.

That’s enough failure for one article. Here is what I see needs to be done next.

  • Sinkers will be included by weighting the results by usage. David Peterson mixes in some (bad) sinkers so maybe the combination brings more clarity.
  • I’m going to attempt a fastball grade that takes into account the predictive values (STUFFF), pitch results (pERA), and batted ball results (pVAL). From some past work, I wasn’t a huge fan of pVALs but I think they might help show the possible disconnects between shape and results (e.g. ability to hide the ball).
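As a rough illustration of the first item, weighting by usage could look like the sketch below. The function name and the sample grades are hypothetical, and a linear usage-weighted average is only one possible way to combine the two pitches.

```python
def usage_weighted(values_by_pitch, usage_by_pitch):
    """Combine per-pitch values into one number, weighted by usage share."""
    total_usage = sum(usage_by_pitch.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    weighted_sum = sum(
        values_by_pitch[pitch] * usage for pitch, usage in usage_by_pitch.items()
    )
    return weighted_sum / total_usage


# Hypothetical pitcher: four-seamer (FF) and sinker (SI) Stuff+ and usage
stuff_plus = {"FF": 95.0, "SI": 80.0}
usage_pct = {"FF": 30.0, "SI": 20.0}
print(usage_weighted(stuff_plus, usage_pct))  # (95*30 + 80*20) / 50
```

Dividing by the total fastball usage rather than by 100 keeps the result on the same scale as the inputs even though the two pitches do not add up to the whole arsenal.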

While I didn’t uncover anything groundbreaking, I found what not to believe and, hopefully, I can improve the future results.


Strikeout Rate’s Link to WHIP

I’m still in disbelief from a recent finding I made. It started with this comment in a recent article I wrote about STUFFF:

How much WHIP changed in the two “Stuff” models was almost too good to be true. In both cases, the walk rate increased as a pitcher’s stuff got better, but the hit suppression was so large that the WHIP declined.

Well, I was wrong about the hit suppression. I went back and found no link to BABIP. The difference exists because WHIP uses an innings denominator, and a strikeout records an out while removing the chance for a hit or a walk. Any other out comes down to the random chance of a batted ball. I know it’s confusing, so here is an example assuming a pitcher with a 9 K/9, 3 BB/9, and .300 BABIP who throws 6 IP/GS. Read the rest of this entry »
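The mechanism can be sketched with a deliberately simplified model. It assumes every out is either a strikeout or an out on a ball in play, and it ignores home runs, hit batters, double plays, and outs on the bases. The function name and the model are assumptions for illustration and are not necessarily the worked example in the full article. Innings per start only scale the counts, so the sketch works per nine innings.

```python
def simple_whip(k_per_9, bb_per_9, babip):
    """Estimate WHIP when outs are only strikeouts or balls-in-play outs."""
    outs_needed_in_play = 27 - k_per_9              # outs the defense must record
    balls_in_play = outs_needed_in_play / (1 - babip)
    hits = balls_in_play * babip                    # balls in play that fall for hits
    return (hits + bb_per_9) / 9


# Same walk rate and BABIP, different strikeout rates
for k9 in (6, 9, 12):
    print(k9, round(simple_whip(k9, 3, 0.300), 3))
```

In this toy model, every extra strikeout removes a ball in play that would otherwise have had a BABIP-sized chance of becoming a hit, so WHIP falls as K/9 rises even though BABIP never changes.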