A Quick Analysis of 2016 Hitter Projections

I’ve been using projections to create dollar values for my fantasy leagues for more than ten years, and understanding how to convert projections into dollars is only half the battle. The other half is deciding which projections to use in the first place. Should you use only one set of projections? Or multiple? Should you use the freely available projections here on FanGraphs? Or should you pay for projections from other sources? I’m not going to answer any of those questions definitively, but let’s take a look at a handful of projection sources and compare their projections to 2016 actual results.

A few caveats before I start throwing up tables:

1) I chose ottoneu FanGraphs points (FGPts) per plate appearance as my primary measure, so there is an ottoneu-focused lens here, but this linear-weights-based scoring correlates very strongly with wOBA on a per plate appearance basis.
2) I’m not using every projection set that’s available, so this isn’t meant to be an exhaustive review. I primarily focused on the projections available here at FanGraphs, but also included the PECOTA projections from Baseball Prospectus.
3) This covers only 2016 results, so keep in mind that good projection performance last year is no guarantee of good performance in the future.
4) Only players who accumulated 100+ plate appearances and had projections from all systems are included.
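To make caveat 4 concrete, here is a minimal sketch of that sample filter. The function name, data layout, and player numbers are all made up for illustration; this is not the actual process or data behind the tables in this post.

```python
# Hypothetical sketch of the sample filter described in caveat 4.
SYSTEMS = ("Steamer", "ZiPS", "Pod", "FANS", "PECOTA")
MIN_PA = 100  # minimum actual plate appearances to be included


def eligible_players(actual_pa, projections):
    """Return players with at least MIN_PA actual PA and a projection from every system.

    actual_pa:   {player: actual plate appearances}
    projections: {system: {player: projected value}}
    """
    return sorted(
        name
        for name, pa in actual_pa.items()
        if pa >= MIN_PA
        and all(name in projections.get(system, {}) for system in SYSTEMS)
    )


# Made-up players and numbers, for illustration only.
actual_pa = {"Player A": 612, "Player B": 88, "Player C": 340}
projections = {
    "Steamer": {"Player A": 0.95, "Player B": 0.80, "Player C": 0.85},
    "ZiPS": {"Player A": 0.97, "Player B": 0.78, "Player C": 0.88},
    "Pod": {"Player A": 0.93, "Player B": 0.82, "Player C": 0.84},
    "FANS": {"Player A": 1.01, "Player B": 0.85},  # no FANS line for Player C
    "PECOTA": {"Player A": 0.96, "Player B": 0.79, "Player C": 0.86},
}

sample = eligible_players(actual_pa, projections)
print(sample)
```

Player B misses the plate appearance cutoff and Player C lacks a FANS projection, so only Player A survives. Requiring a projection from every system keeps the comparison apples to apples, at the cost of dropping players that some systems skipped.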

With that out of my system, let’s take a look at the root-mean-square error (RMSE) results for FGPts Pts/PA. Note that the lower the number, the closer the projections were to the actual results.

FGPts Pts/PA RMSE ’16 Projections vs Actual
Ages Count Steamer ZiPS Pod FANS PECOTA
21-26 83 0.229 0.229 0.224 0.238 0.224
27-30 97 0.221 0.226 0.229 0.234 0.231
31-40 87 0.199 0.202 0.200 0.212 0.206
ALL 267 0.217 0.219 0.218 0.229 0.221

For my sample of 267 hitters Steamer had the best Pts/PA projections, but four out of the five sets were very close, with the crowdsourced Fans projections bringing up the rear. For the age 21 to 26 bucket (representing rookies and potential breakout players), the Pod projections (from our own Mike Podhorzer) tied with PECOTA (from Baseball Prospectus) for the lowest error. Steamer edged out the other systems in both the 27 to 30 and 31 to 40 age ranges.
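For anyone who wants to run this kind of comparison themselves, here is a rough sketch of RMSE computed per age bucket. The age ranges match the table above, but the function names and the four sample rows are invented for illustration and are not the data behind the table.

```python
import math
from collections import defaultdict

# Age buckets as used in the tables: (label, youngest age, oldest age).
AGE_BUCKETS = (("21-26", 21, 26), ("27-30", 27, 30), ("31-40", 31, 40))


def rmse(projected, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(projected) != len(actual) or not projected:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))


def bucket_for(age):
    """Return the bucket label for an age, or None if it falls outside every bucket."""
    for label, low, high in AGE_BUCKETS:
        if low <= age <= high:
            return label
    return None


def rmse_by_bucket(rows):
    """rows: iterable of (age, projected, actual). Returns {bucket label or 'ALL': RMSE}."""
    grouped = defaultdict(lambda: ([], []))
    for age, projected, actual in rows:
        label = bucket_for(age)
        if label is None:
            continue  # skip ages outside the buckets
        for key in (label, "ALL"):
            grouped[key][0].append(projected)
            grouped[key][1].append(actual)
    return {key: rmse(p, a) for key, (p, a) in grouped.items()}


# Made-up (age, projected Pts/PA, actual Pts/PA) rows, for illustration only.
rows = [(24, 0.90, 1.10), (25, 1.00, 0.80), (29, 0.85, 0.85), (33, 0.70, 0.90)]
results = rmse_by_bucket(rows)
for label in ("21-26", "27-30", "31-40", "ALL"):
    print(label, round(results[label], 3))
```

Because the errors are squared before averaging, one big miss hurts a system more than several small ones, which is worth keeping in mind when reading the tables.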

So we’ve looked at which systems performed better on a rate basis, but how did the projections do in projecting playing time? Here are the RMSE results for PA:

FGPts PA RMSE ’16 Projections vs Actual
Ages Count Steamer ZiPS Pod FANS PECOTA
21-26 83 133.5 150.8 130.8 143.5 130.6
27-30 97 135.3 153.0 140.9 158.5 146.6
31-40 87 120.7 136.8 119.2 131.5 140.8
ALL 267 130.1 147.2 131.0 145.5 139.9

Steamer once again leads the pack for all players, but keep in mind that the Steamer playing time shown on FanGraphs is actually fed from our staff-maintained Depth Charts. ZiPS performs poorly here because its playing time estimates don’t account for 25-man rosters or lineup/bench roles.

Looking at the various age buckets, PECOTA and Pod once again do the best with the youngest players, while Pod and Steamer do best with the veterans. It’s interesting to note that every system did worse projecting playing time for the 27 to 30 year olds than for either of the other age ranges.
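As a quick sanity check on the playing time table, the bucket rows can be pooled back into the ALL row: since RMSE is the square root of a mean squared error, the squared bucket values combine weighted by player count. The sketch below uses only the numbers printed in the PA table above; the function and variable names are mine.

```python
import math

# Player counts and PA RMSE values copied from the table above.
COUNTS = {"21-26": 83, "27-30": 97, "31-40": 87}
PA_RMSE = {
    "Steamer": {"21-26": 133.5, "27-30": 135.3, "31-40": 120.7, "ALL": 130.1},
    "ZiPS": {"21-26": 150.8, "27-30": 153.0, "31-40": 136.8, "ALL": 147.2},
    "Pod": {"21-26": 130.8, "27-30": 140.9, "31-40": 119.2, "ALL": 131.0},
    "FANS": {"21-26": 143.5, "27-30": 158.5, "31-40": 131.5, "ALL": 145.5},
    "PECOTA": {"21-26": 130.6, "27-30": 146.6, "31-40": 140.8, "ALL": 139.9},
}


def pooled_rmse(bucket_rmse, counts):
    """Combine per-bucket RMSEs into one overall RMSE, weighting squared errors by count."""
    total = sum(counts.values())
    sum_squared_error = sum(counts[b] * bucket_rmse[b] ** 2 for b in counts)
    return math.sqrt(sum_squared_error / total)


for system, row in PA_RMSE.items():
    worst = max(COUNTS, key=lambda bucket: row[bucket])
    pooled = pooled_rmse(row, COUNTS)
    print(f"{system:8s} pooled={pooled:6.1f} reported={row['ALL']:6.1f} worst bucket={worst}")
```

The pooled values should match the reported ALL row up to the rounding in the table, and the worst bucket for each system should come out as 27-30, matching the observation above.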

Overall it seems clear to me that Steamer did the best job projecting 2016 performance, both on a rate basis and in playing time (with the help of FanGraphs Depth Charts), but could we have done even better using an aggregate projection approach? Let’s take a look!

FGPts Pts/PA RMSE ’16 Projections vs Actual
Ages Count Steamer ZiPS Pod FANS PECOTA Aggregate Machine
21-26 83 0.229 0.229 0.224 0.238 0.224 0.221 0.222
27-30 97 0.221 0.226 0.229 0.234 0.231 0.223 0.223
31-40 87 0.199 0.202 0.200 0.212 0.206 0.197 0.196
ALL 267 0.217 0.219 0.218 0.229 0.221 0.214 0.215

Well, would you look at that! The Aggregate column represents a simple average of all five projections, and the Machine column omits the FANS projections (machine is a bit of a misnomer, since Podhorzer is surely not a host).
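Here is a small sketch of how the two blends can be built: a straight per-player average across systems, once with all five and once without FANS. The projected and actual Pts/PA values are invented for illustration, and the helper names are my own.

```python
import math

SYSTEMS = ("Steamer", "ZiPS", "Pod", "FANS", "PECOTA")
MACHINE = tuple(s for s in SYSTEMS if s != "FANS")  # everything except the Fans


def blend(projections, systems):
    """Simple per-player average across the given systems.

    projections: {system: [projected value for each player, in the same player order]}
    """
    columns = [projections[s] for s in systems]
    return [sum(values) / len(values) for values in zip(*columns)]


def rmse(projected, actual):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up Pts/PA numbers for three players, for illustration only.
actual = [1.00, 0.80, 0.90]
projections = {
    "Steamer": [0.95, 0.85, 0.80],
    "ZiPS": [1.10, 0.75, 0.95],
    "Pod": [0.90, 0.90, 0.85],
    "FANS": [1.20, 0.95, 1.05],
    "PECOTA": [1.05, 0.70, 0.90],
}

aggregate = blend(projections, SYSTEMS)
machine = blend(projections, MACHINE)

for name in SYSTEMS:
    print(f"{name:9s} {rmse(projections[name], actual):.3f}")
print(f"{'Aggregate':9s} {rmse(aggregate, actual):.3f}")
print(f"{'Machine':9s} {rmse(machine, actual):.3f}")
```

One reason blending tends to help: the RMSE of a simple average can never be worse than the average of the individual RMSEs, and it improves on that average whenever the systems miss in different directions. It is not guaranteed to beat the single best system, though.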

The Aggregate projections for all players did better than Steamer alone, and did better than PECOTA and Pod for the youngest age bucket. Steamer retains its edge with 27 to 30 year olds, and the Machine projections performed best for veterans.

Now let’s do the same thing, but looking at plate appearances:

FGPts PA RMSE ’16 Projections vs Actual
Ages Count Steamer ZiPS Pod FANS PECOTA Aggregate Machine
21-26 83 133.5 150.8 130.8 143.5 130.6 125.6 124.2
27-30 97 135.3 153.0 140.9 158.5 146.6 138.6 136.1
31-40 87 120.7 136.8 119.2 131.5 140.8 118.3 117.5
ALL 267 130.1 147.2 131.0 145.5 139.9 128.3 126.6

The Machine projection aggregate pretty much runs away with the playing time analysis. The Machine does better than the Aggregate here because it excludes the Fans projections, which performed very poorly with respect to playing time.

Conclusion

Last year I used a combination of Steamer/ZiPS/Pod projections when preparing my personal ottoneu dollar values, and incorporated the Fans projections as part of my playing time estimates. Based on these results I may swap out ZiPS for PECOTA, or at the very least add PECOTA to my aggregate mix. In addition, the assumption I had that the fans do a pretty good job of estimating playing time seems to have been wrong, so I’ll be sticking to a Steamer (Depth Charts)/Pod mix this year.

Keep an eye out for a similar analysis of pitching projections in January!

We hoped you liked reading A Quick Analysis of 2016 Hitter Projections by Justin Vibber!

Justin is a lifelong Cubs fan who has been playing fantasy baseball for 20+ years, and an ottoneu addict since 2012. Follow him on Twitter @justinvibber.

White Jar

Cool analysis. But could you please clarify what is included in the aggregate and machine totals?

TheEmbassy

If I’m reading correctly, aggregate is all the projections he tested, while machine omits the fan projections.