2022 Projection Accuracy: Hitter Playing Time

February 3, 2023

What started as a checkup on how projections performed turned into a fairly important finding about how to use them. On the projection front, aggregators, especially when built smartly, continue to crush the competition. The big revelation is ZiPS finishing near the top even though it uses zero human input.

First off, here are last season’s results with my conclusion.

Hitter Playing Time
For playing time, three of the aggregators, Average, ZEILE, and ATC shoved in this category (Depth Charts takes a hit because it only uses one playing time input). It’s an easy win for the Wisdom of the Crowds.

To find this year’s player set to test, I used all the hitters drafted in at least 42 of the 47 NFBC Main Events. From this list, I excluded Seiya Suzuki because several systems didn’t include him. I also excluded Nelson Cruz, Luis Garcia, Manuel Margot, Jake Fraley, Seth Brown, Garrett Cooper, and Darin Ruf 러프 because one or more systems didn’t have a projection for them. The alternative was to remove four projection systems, but I decided it was better to have more projections and a few players missing. In all, this process was run on 223 hitters.
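The selection rule above can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the process actually used for this article: the function name, the dictionary layout, and the sample players are all hypothetical, and only the 42-draft threshold comes from the text.

```python
MIN_DRAFTS = 42  # from the article: drafted in at least 42 of the 47 Main Events


def build_player_pool(draft_counts, projections, min_drafts=MIN_DRAFTS):
    """Keep hitters who were drafted often enough AND appear in every system.

    draft_counts: {player: number of drafts the player was taken in}
    projections:  {system name: {player: projected playing time}}
    Both layouts are assumptions made for this sketch.
    """
    drafted = {p for p, n in draft_counts.items() if n >= min_drafts}
    # A player is "covered" only if every system has a projection for him.
    covered = set.intersection(*(set(players) for players in projections.values()))
    return sorted(drafted & covered)


# Made-up example data.
draft_counts = {"Hitter 1": 47, "Hitter 2": 45, "Hitter 3": 12}
projections = {
    "System A": {"Hitter 1": 600, "Hitter 2": 500, "Hitter 3": 300},
    "System B": {"Hitter 1": 580, "Hitter 3": 250},  # no Hitter 2 projection
}
print(build_player_pool(draft_counts, projections))
```

Dropping a player who is missing from even one system keeps every system graded on the same set of hitters, which is the tradeoff described above.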

With the same dataset, I removed the hitters who missed a significant part of the season due to injury. The players left out were Adalberto Mondesi, Miguel Sano, Alex Kirilloff, Kris Bryant, Austin Meadows, Anthony Rendon, Ozzie Albies, Jazz Chisholm Jr., and David Fletcher.

To determine accuracy, I calculated the Root Mean Square Error (RMSE) for four different sets of values. RMSE is a “measure of how far from the regression line data points are” and the smaller the value, the better.
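For reference, here is a minimal sketch of the RMSE calculation in Python. It assumes each system’s projected playing time is compared directly with the hitter’s actual season total; the function name and the sample numbers are made up for illustration.

```python
import math


def rmse(projected, actual):
    """Root mean square error between projected and actual values."""
    if len(projected) != len(actual) or not projected:
        raise ValueError("need two equal-length, non-empty sequences")
    squared_errors = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up playing-time numbers for three hitters (not real data).
projected_pt = [600, 550, 480]
actual_pt = [630, 400, 500]
print(round(rmse(projected_pt, actual_pt), 1))
```

Because the errors are squared before averaging, one big miss (the 150 in the second hitter here) counts for far more than several small ones. That is why removing the hurt players changes the values so much.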

I collected the projections on April 6th from a mix of 23 different sets. Some were free while others were behind a paywall. Those behind a paywall are labeled as Paywall with a number (e.g. Paywall #1). Additionally, some of the projections were aggregates of other projections. All but one of the aggregators were publicly available. The one that wasn’t is called Aggregator #1. ATC, Depth Charts, and ZEILE are the projections that aggregate their competitors. Also, Steamer, ZiPS DC, and Depth Charts use the same playing time projections. THE BAT and THE BAT X use the playing time from ATC.

Finally, I looked into several ways to aggregate the projections to see if there was a preferred method and they were:

Average of all
Median of all
Preseason smart average: For this one, I had Rob Silver look at last season’s results, pick three sources to average, and they were used. He chose THE BAT X, Razzball, and Paywall #6.
Post-season best average: This started with an average of nine of the projections that I know get regular updates during the preseason. Next, I removed the worst remaining system using this year’s results. The value needed to get under 130.9, the top value for a standalone system.
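The first three methods all come down to combining several systems player by player. Here is a minimal Python sketch of that step, with hypothetical system names and made-up numbers; it is not the spreadsheet or code behind the results in this article.

```python
from statistics import mean, median

# Hypothetical projections: system name -> {player: projected playing time}.
projections = {
    "System A": {"Hitter 1": 600, "Hitter 2": 450},
    "System B": {"Hitter 1": 560, "Hitter 2": 500},
    "System C": {"Hitter 1": 610, "Hitter 2": 380},
}


def aggregate(projections, systems, combine=mean):
    """Combine the chosen systems' projections player by player."""
    # Only players projected by every chosen system can be combined.
    players = set.intersection(*(set(projections[s]) for s in systems))
    return {p: combine([projections[s][p] for s in systems]) for p in sorted(players)}


average_of_all = aggregate(projections, list(projections))
median_of_all = aggregate(projections, list(projections), combine=median)
# A "smart average" is the same calculation over a hand-picked subset.
smart_average = aggregate(projections, ["System A", "System C"])
print(average_of_all, median_of_all, smart_average)
```

The median is less sensitive to one system being far off on a player, while the average lets every system move the number.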

Here are the results.

RMSE Value as Worse Systems Are Removed

Systems	RMSE
9	134.0
8	133.4
7	132.9
6	131.8
5	130.7
4	129.4
3	128.8
2	131.9
1	130.9
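The removal process behind this table can be sketched as a simple loop. One assumption to flag: the sketch treats the “worst remaining system” as the one with the highest standalone RMSE this season, which may not be the exact rule used here (another reading is to drop whichever system’s removal helps the average most). All names and numbers below are made up.

```python
import math


def rmse(projected, actual):
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(projected, actual)) / len(actual))


def average_projection(projections, systems):
    """Player-by-player mean across the given systems (lists aligned by player)."""
    n_players = len(projections[systems[0]])
    return [sum(projections[s][i] for s in systems) / len(systems) for i in range(n_players)]


def backward_elimination(projections, actual):
    """Return [(number_of_systems, rmse_of_average, systems_kept), ...]."""
    remaining = sorted(projections)
    steps = []
    while remaining:
        combined = average_projection(projections, remaining)
        steps.append((len(remaining), rmse(combined, actual), list(remaining)))
        # Assumption: "worst" means the highest standalone RMSE this season.
        worst = max(remaining, key=lambda s: rmse(projections[s], actual))
        remaining.remove(worst)
    return steps


# Made-up data: three systems, four hitters.
actual = [600, 450, 300, 520]
projections = {
    "System A": [580, 470, 350, 500],
    "System B": [640, 400, 280, 560],
    "System C": [500, 550, 450, 400],
}
for count, error, kept in backward_elimination(projections, actual):
    print(count, round(error, 1), kept)
```

Since the systems are picked using the same season they are graded on, the result is a best-case benchmark rather than something that could have been chosen in March.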

The three systems that had the best results are all publicly available: Razzball, ZiPS, and Davenport.

With all that out of the way, here are the rankings using the full 223 hitters.

RMSE Values: All Players

System	RMSE
Post-Season Best	128.8
Aggregator #1	130.8
Davenport	130.9
ZiPS	132.4
THE BAT X	133.5
THE BAT	133.6
ATC	134.3
Preseason Guess	135.1
Mr. Cheatsheet	135.3
Median	136.0
Razzball	136.2
CBS	137.8
ZEILE	137.8
Paywall #2	138.6
Average	139.3
Paywall #5	140.0
Draft Buddy	140.1
FreezeStats	142.6
Paywall #3	142.7
Steamer	142.8
Paywall #4	143.1
DepthCharts	143.9
Paywall #6	143.9
Paywall #1	144.8
ZiPS DC	145.1
Rotoholic	169.4
Mays Copeland	173.4

Before drawing any conclusions, here are the results without the hurt players.

RMSE Values: Hurt Players Removed

System	RMSE
Post-Season Best	112.1
Davenport	113.6
Aggregator #1	115.2
THE BAT X	115.5
THE BAT	115.6
ATC	116.1
Mr. Cheatsheet	116.9
ZiPS	117.4
Median	117.5
Preseason Guess	117.6
Average	118.5
CBS	119.3
ZEILE	119.3
Razzball	119.8
Draft Buddy	120.9
Paywall #5	121.9
Paywall #3	123.1
Paywall #2	123.6
FreezeStats	124.1
Steamer	124.3
DepthCharts	124.8
Paywall #4	125.4
ZiPS DC	126.0
Paywall #1	126.3
Paywall #6	127.2
Rotoholic	150.3
Mays Copeland	157.8

Like last season, the aggregated systems (e.g. ATC, THE BATs, Median, ZEILE) are near the top. The two standalone projections near the top are Davenport and ZiPS. Last season, they didn’t perform horribly, but they weren’t good enough to stand out. Here are those rankings.

Note: I might be talking about Mr. Cheatsheet next year as a projection to target.

Both of them had a bad finish but they both were near the top at other times. For standalone playing time projections, they should be given consideration along with the aggregators and Razzball.

Since the playing time from ZiPS is separate from the other playing time projections here at FanGraphs, I asked Dan Szymborski how he sets the playing time for ZiPS.

So setting playing time by just knowing player traits is at least average and outperforms most projection systems.

I was not surprised to find that some of the factors helped predict playing time. While the short 2020 season has caused some hiccups, I found playing time projections could be improved by knowing a hitter’s previous playing time (injuries), player talent (good players play more than crappy players), and age. My formula was just a 10% improvement, but still helpful.

What ZiPS is doing is pointing out factors analysts might be missing. For example, why does ZiPS have Gunnar Henderson at 557 AB and Steamer down at 531 AB? A system must be even lower on Henderson’s playing time and is dragging ATC down to 510 AB.

One issue with ZiPS is that it doesn’t robotically zero out playing time, so its projections add up to more plate appearances than are available in a season. It’s not close to a perfect projection system, but it is definitely catching some factors other projections aren’t.
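To make the plate appearance issue concrete, here is one simple way a system could force its projections to fit a fixed team total: scale everyone down proportionally. This is only a sketch of the idea, not how ZiPS or any other system handles playing time, and the budget and player totals are made-up round numbers.

```python
def scale_to_budget(team_pa, budget):
    """Shrink projections proportionally if they exceed the team budget.

    team_pa: {player: projected plate appearances} for one team
    budget:  total plate appearances assumed available (hypothetical input)
    """
    total = sum(team_pa.values())
    if total <= budget:
        return dict(team_pa)  # nothing to fix
    factor = budget / total
    return {player: pa * factor for player, pa in team_pa.items()}


# Made-up example: three hitters projected for 2,000 PA with only 1,800 available.
team_pa = {"Hitter 1": 700, "Hitter 2": 650, "Hitter 3": 650}
scaled = scale_to_budget(team_pa, budget=1800)
print(scaled)
```

Proportional scaling keeps the ranking of players on a team intact but trims the everyday players the most in absolute terms, which is one reason a hand-built depth chart can end up with different numbers.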

Here are a couple of issues I could see chopping into ZiPS’s high rank going forward.

It could just be a recent blip where analysts are still having problems evaluating playing time so near to the shortened 2020 season and the 2021 late start. Once baseball gets back to normal, analysts might perform better.
The other projection creators could start spotting their biases and make adjustments to correct them. I’m not sure about this change happening. I discussed ZiPS’s performance with two people behind the better projections and they blew off the ZiPS results.

It’s always a ton of work to set up these projection comparisons. As expected, the aggregators dominated again with a couple of single systems (ZiPS and Davenport) taking a step up this past season. It’s interesting that ZiPS performed as well as it did considering it has no human input.

9 Comments

Gregg (Member since 2020)

2 years ago

Really great stuff, Dr. Z. I always thought projection systems need to be analyzed by parsing PT and rate stats (not by combining them). This is one side of the coin – would love to see the rate side with this level of detail.

One of my big takeaways from the “RMSE – Hurt Players Removed” chart is not to pay for any projection system. Only one of the six paywall systems beat the average (barely).

Also, I think you can just call ZiPS DC as Depth Charts since you’re not using any ZiPS rate stats there?
