Unsolved Mystery: Prospect Pedigree on Hitting Projections
My current aim in fantasy baseball is to find instances where player evaluations can be improved. With several prospects recently getting called up, I am trying to answer a simple question: is there any projection information to be gained from being a highly touted prospect? The short answer is yes, but it took me a while to get good results.
I wanted to keep the analysis simple, so I used all available Steamer projections, which go back to 2010. Additionally, I used Baseball America’s top 100 ranked prospects for that time frame. From these two data sets, I compared the hitters’ projected results to their actual results for their first few seasons.
I expected to find that the top-100 prospects would outperform their projections and the rest of the rookies would underperform. Overall, the two values would average out. I got to those results, but it took a few detours.
I ran just a couple of tests before asking Steamer’s creator, Jared Cross, to help me understand some discrepancies. He gave me a few more tests to run and I eventually found my way.
As I previously stated, I wanted to start with an uncomplicated process, so I just compared OPS. It is a decent proxy for hitting ability and it’s widely available. For the first test, I just subtracted the projected OPS from the actual OPS without regard to plate appearances, so a positive value means the hitter beat his projection.
I like using an unweighted average to help remove survivor bias. If two players with the same talent level (.700 OPS) are promoted and one hits for a .300 OPS in his first week and the other for 1.100, the luckier guy has a better chance of staying in the majors and accumulating plate appearances.
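The unweighted comparison described above can be sketched in a few lines of Python. This is only an illustration: the three hitters are hypothetical rows (the first two mirror the .300 and 1.100 example), and the sign convention of actual minus projected, where a positive value means the hitter beat his projection, is an assumption inferred from how the results are discussed later.

```python
from statistics import mean, median

# Hypothetical rows of (projected OPS, actual OPS); not data from this study.
hitters = [
    (0.700, 0.300),  # the unlucky call-up from the example above
    (0.700, 1.100),  # the lucky call-up from the example above
    (0.720, 0.690),
]

def ops_differences(rows):
    """Actual minus projected OPS per hitter, ignoring plate appearances."""
    return [actual - projected for projected, actual in rows]

diffs = ops_differences(hitters)

# Every hitter counts once, no matter how long he stayed in the majors.
unweighted_average = mean(diffs)
unweighted_median = median(diffs)

print(f"average: {unweighted_average:+.3f}, median: {unweighted_median:+.3f}")
```

Because each hitter counts once, the lucky hitter who sticks around for 500 plate appearances has no more pull on the result than the unlucky one who is sent down after a week.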
Time Frame | Top-100 Average | Top-100 Median | Non-Prospect Average | Non-Prospect Median
---|---|---|---|---
Same Year | .015 | .008 | -.048 | -.039
Year +1 | -.051 | -.015 | -.049 | -.029
Note: The first key with these values is to look at the median values. Hitters with just a few plate appearances could skew the average values even though the values came back somewhat close. I will continue to provide both values for context.
The projections were close, with the top prospects finishing a bit over their projections and the non-prospects finishing quite a bit lower in their rookie season. The two groups show little difference in year two, but both are below their projections.
While this table is not perfect, the ‘Same Year’ data end up close to my final results.
The next step was to weight the results by plate appearances. The top prospects may be projected for more playing time, so they may influence the projections more. I used two methods to weight the plate appearances: the smaller of the pair and the harmonic mean.
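Here is a rough sketch of the two weighting schemes. It assumes that "the pair" means each hitter's projected and actual plate appearances, which is not spelled out above. The rows and field names (`proj_pa`, `actual_pa`, `proj_ops`, `actual_ops`) are hypothetical.

```python
def harmonic_mean_pa(projected_pa, actual_pa):
    """Harmonic mean of two PA totals; zero if either total is zero."""
    if projected_pa <= 0 or actual_pa <= 0:
        return 0.0
    return 2 * projected_pa * actual_pa / (projected_pa + actual_pa)

def weighted_ops_difference(rows, weight_fn):
    """Weighted average of (actual OPS - projected OPS) across hitters."""
    weights = [weight_fn(r["proj_pa"], r["actual_pa"]) for r in rows]
    total = sum(weights)
    if total == 0:
        return None  # nobody had usable plate appearances
    diffs = [r["actual_ops"] - r["proj_ops"] for r in rows]
    return sum(w * d for w, d in zip(weights, diffs)) / total

# Hypothetical hitters; not data from this study.
hitters = [
    {"proj_pa": 400, "actual_pa": 100, "proj_ops": 0.700, "actual_ops": 0.650},
    {"proj_pa": 50, "actual_pa": 300, "proj_ops": 0.680, "actual_ops": 0.720},
]

# Assumption: the "pair" is (projected PA, actual PA) for each hitter.
min_weighted = weighted_ops_difference(hitters, min)
harmonic_weighted = weighted_ops_difference(hitters, harmonic_mean_pa)

print(f"min: {min_weighted:+.3f}, harmonic: {harmonic_weighted:+.3f}")
```

Both weights stay small whenever either plate appearance total is small, so a hitter needs a meaningful sample on both sides to move the result.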
Time Frame | Prospects Harmonic Mean | Prospects Min Value | Non-Prospects Harmonic Mean | Non-Prospects Min Value
---|---|---|---|---
Year Of | -.008 | -.010 | -.003 | -.002
Plus One | -.012 | -.009 | -.013 | -.012
Using this method, the values merge almost perfectly, with the non-prospects performing better in their first season.
I stopped analyzing the year plus one data as the additional prospect ranks made little difference in the projections.
For the next analysis, I split the data by those players projected to play in the majors and those who weren’t. I wanted to see if the projections did better for players expected to be in the majors (higher in the minors and getting preseason hype) or those who were called up early.
I used a cutoff of 10 projected plate appearances (PA) for the two groups. First, here are the average and median differences for the four groups.
Stat | Top-100 w/ PA | Top-100 w/o PA | Non-Prospect w/ PA | Non-Prospect w/o PA
---|---|---|---|---
Average | -.024 | .096 | -.030 | -.052
Median | -.031 | .073 | -.039 | -.037
Again concentrating on just the median values, one number sticks out: the OPS difference for the Top-100 players not projected for much MLB playing time. The other three are about the same low numbers. These results don’t make much sense, so I needed to dig some more.
The focus moves to why this group differs. There are several possibilities, and I’ll first look at the percentage of plate appearances at each level for the season before their debut and the season of their debut.
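For anyone wanting to reproduce the four-way split, a minimal sketch follows. The 10-PA cutoff comes from the text, but treating exactly 10 projected PA as "w/ PA" is an assumption, and the rows and field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

PA_CUTOFF = 10  # from the text; counting >= 10 as "w/ PA" is an assumption

def group_label(is_top_100, projected_pa):
    """Name one of the four groups for a hitter."""
    prospect = "Top-100" if is_top_100 else "Non-Prospect"
    playing_time = "w/ PA" if projected_pa >= PA_CUTOFF else "w/o PA"
    return f"{prospect} {playing_time}"

def summarize(rows):
    """Average and median of (actual OPS - projected OPS) per group."""
    groups = defaultdict(list)
    for row in rows:
        label = group_label(row["top_100"], row["proj_pa"])
        groups[label].append(row["actual_ops"] - row["proj_ops"])
    return {label: (mean(d), median(d)) for label, d in groups.items()}

# Hypothetical hitters; not data from this study.
rows = [
    {"top_100": True, "proj_pa": 0, "proj_ops": 0.650, "actual_ops": 0.720},
    {"top_100": True, "proj_pa": 350, "proj_ops": 0.740, "actual_ops": 0.710},
    {"top_100": False, "proj_pa": 5, "proj_ops": 0.640, "actual_ops": 0.600},
    {"top_100": False, "proj_pa": 120, "proj_ops": 0.690, "actual_ops": 0.660},
]

summary = summarize(rows)
for label, (avg, med) in sorted(summary.items()):
    print(f"{label}: average {avg:+.3f}, median {med:+.3f}")
```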
Group | Season | AAA | AA | A | Rookie
---|---|---|---|---|---
Low PA Proj | Season Before | 3.5% | 31.7% | 32.7% | 0.2%
Low PA Proj | Season Of | 35.3% | 23.6% | 4.6% | 0.1%
High PA Proj | Season Before | 14.0% | 48.5% | 19.0% | 0.2%
High PA Proj | Season Of | 61.5% | 11.5% | 0.6% | 0.1%
In the season before their debut, the hitters with a small number of projected plate appearances were most often in Single-A, while the high plate appearance group played mainly in Double-A. The data behind the high plate appearance group's projections likely includes more seasons, especially against better competition.
Since these projections are pre-season, the low-PA group’s performance may have taken a step forward in their promotion season and this improvement isn’t expressed in the pre-season projections.
To see how much these low-level top-100 prospects might have improved their stock, I compared their projections from the debut season to the next season. The median jump in OPS was 64 points (the average was 95 points). These values are almost in line with how much they are over their debut season’s projected OPS.
One final test. Instead of dividing the players by projected plate appearances, I grouped them by their highest previous season level and divided those groups into Top-100 prospects and normal rookies.
Group | Stat | AAA | AA | A | Average of 3
---|---|---|---|---|---
Top-100 | Average | .038 | -.025 | .080 | .031
Top-100 | Median | .027 | -.028 | .139 | .046
Non-prospect | Average | -.063 | -.047 | .053 | -.019
Non-prospect | Median | -.043 | -.044 | .050 | -.012
Difference | Average | .100 | .022 | .028 | .050
Difference | Median | .070 | .015 | .089 | .058
These values are close to the first table, with the Top-100 prospects coming in higher than the non-prospects among hitters who had some Triple-A experience. Some weirdness did exist, with the Double-A results being lower than expected and the Single-A results higher, but they even out. The "Average of 3" column at least gives me a starting point to continue my analysis.
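As a quick arithmetic check, the "Average of 3" and "Difference" figures can be recomputed from the published per-level values. The snippet below uses only the rounded numbers from the table, so a result can differ from the published one by a point of OPS because of rounding. The variable names are mine.

```python
LEVELS = ("AAA", "AA", "A")

# Rounded OPS differences (actual minus projected) copied from the table.
top_100 = {
    "Average": {"AAA": 0.038, "AA": -0.025, "A": 0.080},
    "Median": {"AAA": 0.027, "AA": -0.028, "A": 0.139},
}
non_prospect = {
    "Average": {"AAA": -0.063, "AA": -0.047, "A": 0.053},
    "Median": {"AAA": -0.043, "AA": -0.044, "A": 0.050},
}

def average_of_levels(by_level):
    """Simple average across the three minor league levels."""
    return sum(by_level[level] for level in LEVELS) / len(LEVELS)

gaps = {}
for stat in ("Average", "Median"):
    top = average_of_levels(top_100[stat])
    non = average_of_levels(non_prospect[stat])
    gaps[stat] = top - non  # Top-100 edge over non-prospects
    print(f"{stat}: Top-100 {top:+.3f}, Non-prospect {non:+.3f}, gap {gaps[stat]:+.3f}")
```

Both gaps land at roughly 50 to 60 points of OPS, which is where the headline number in the conclusion comes from.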
Conclusion
I’m calling it quits for now and I’d be surprised if many readers made it this far. Trying to see if prospect rankings could provide any additional information was tough but fruitful. About a 50-point difference in projected versus actual OPS exists between Top-100 prospects and regular rookies. The difference disappears the next season as the projections quickly catch up with the player’s talent. On average, fantasy owners should expect Top-100 rookies to provide ~40 more points of OPS and non-prospects ~10 fewer points of OPS than their projected values.
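Applied to a single projection, the rule of thumb above might look like the sketch below. The +40 and -10 point adjustments are the approximate values from the conclusion, reading "rookies" there as Top-100 prospects is an assumption, and the .700 projection is a hypothetical input.

```python
# Approximate adjustments from the conclusion, in OPS (40 points = .040).
TOP_100_ADJUSTMENT = 0.040
NON_PROSPECT_ADJUSTMENT = -0.010

def adjusted_ops(projected_ops, is_top_100):
    """Shift a rookie-season OPS projection by the average observed miss."""
    bump = TOP_100_ADJUSTMENT if is_top_100 else NON_PROSPECT_ADJUSTMENT
    return round(projected_ops + bump, 3)

# Hypothetical .700 OPS projection for a debuting hitter.
top_prospect_ops = adjusted_ops(0.700, is_top_100=True)
non_prospect_ops = adjusted_ops(0.700, is_top_100=False)

print(f"Top-100: {top_prospect_ops:.3f}, non-prospect: {non_prospect_ops:.3f}")
```

These are group averages, so the adjustment says nothing about how any single rookie will do. It only applies to the debut season, since the gap disappears the next year.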
While OPS doesn’t immediately translate to an offensive category, it helps to provide easy context to help find the difference. From this point, I plan on splitting out the OPS components (e.g. AVG, HR/PA, etc) to make them more usable for fantasy owners. Additionally, I can determine if any statistical differences exist from top-100 subsets. And there are still pitchers to get to. It will be a long journey for sure.
Jeff, one of the authors of the fantasy baseball guide, The Process, writes for RotoGraphs, The Hardball Times, Rotowire, Baseball America, and BaseballHQ. He has been nominated for two SABR Analytics Research Awards for Contemporary Analysis and won it in 2013 in tandem with Bill Petti. He has won four FSWA Awards including one for his Mining the News series. He's won Tout Wars three times, LABR twice, and got his first NFBC Main Event win in 2021. Follow him on Twitter @jeffwzimmerman.
Great work. Personally, what stood out was how great callups without any AA experience performed. I think it opens the whole can of worms as to what exactly pre-season projections are intended to do: a) project the true talent level for a given player over the next season, or b) project how a player will actually perform at the MLB level given the important knowledge that he will get called up this season.
I’m pretty sure it’s “a” for most systems, but as your research suggests, adjustments may be in order for those players who truly unexpectedly get called up — and even more so for top 100 guys.
The same effect occurs with post-season projections. If an expected bad team is actually decent and makes the playoffs, its pre-season rating was almost certainly too low. So the pre-season odds for these teams to win the pennant or WS are likely to be underestimated by most models unless they make the appropriate adjustment.
One issue is that the projections for the lower level players are low. For example, Tatis Jr. has the following Steamer projection: .219/.280/.346 for a .626 OPS. Using the procedure, his OPS should be projected to .666, which is close to the 2017 hitting versions of Hamilton, Odor, and Swanson.
Not every prospect will help the team but many will