xStats, Steamer, and Players With Small Sample Sizes

xStats seems to excel at identifying changes in small sample sizes, which might make it an ideal in-season tool for fantasy players, since it can rapidly adapt to a player’s changing profile. When you start looking at players with larger samples, other projection systems like Steamer or ZiPS exceed the predictive value of xStats. This is a weakness I hope to address in the future through a combination of more data and refined algorithms. In the meantime, the ability to project players with little major league experience is a valuable trait. In the past few weeks I have written articles about Trea Turner and Gary Sanchez. Sanchez is particularly interesting due to his ridiculously over-the-top power numbers, and I think the method I came up with for adapting his batted ball numbers to a more realistic projection is reasonable (feel free to tell me if you disagree). This week I’m casting a wider net and looking at a larger group of young players.

For the purposes here, I’m looking at players under the age of 26 who have at least 300 plate appearances in my records (and my records don’t necessarily line up with MLB records, due to various measurement and reporting errors). This isn’t an exhaustive list; it focuses on players who likely have some fantasy value.

Batting Average

Batting average is the most difficult stat of the lot to predict. Generally speaking, and you’ll see this is a trend, xStats is not as high as Steamer on most of these batters. Or, in the case of batting average, on any of them. According to xStats, all of these guys could see their batting averages drop. Of particular note, it knocks Yasiel Puig down more than a full standard deviation, from an above-average .284 to a below-average .232. PECOTA and ZiPS peg him around .279 and .275 respectively, so perhaps xStats is being unnecessarily harsh. Either way, it is probably something you want to keep in mind.

According to this, Yelich could face a significant drop in batting average, a z-score drop of more than 0.5, and Jose Ramirez has a z-score drop of more than 1. These are two guys who could, according to this system, face pretty serious drops in fantasy value relative to other projection systems. Obviously, this isn’t the be-all and end-all; you probably trust the more advanced projection systems much more than this. I don’t blame you. But this is a data point, something to keep in mind when you look around the table on draft day.
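
To put a z-score drop back into raw batting average terms, here is a minimal Python sketch. It assumes both projections are standardized against the same mean and standard deviation, in which case the mean cancels and the gap is just the raw difference divided by the standard deviation. The function names are hypothetical, and the spread is backed out from Puig’s numbers (.284, .232, and the -1.63 ΔzAVG listed for him in the summary table), not taken from xStats itself.

```python
def delta_z(xstats_value, steamer_value, sd):
    """Difference in z-scores, assuming a shared mean and standard deviation."""
    return (xstats_value - steamer_value) / sd


def implied_sd(xstats_value, steamer_value, dz):
    """Back out the standard deviation implied by a reported z-score gap."""
    return (xstats_value - steamer_value) / dz


# Puig: Steamer .284, xStats .232, reported gap of -1.63 in batting average.
puig_sd = implied_sd(0.232, 0.284, -1.63)
print(f"implied SD of batting average: {puig_sd:.4f}")

# Under the same assumption, Yelich's -0.56 gap in raw batting average.
yelich_points = -0.56 * puig_sd
print(f"Yelich gap in raw average: {yelich_points:.3f}")
```

Under that assumption, one standard deviation is roughly 32 points of batting average, so Yelich’s gap comes to about 18 points.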

Slugging

xStats may be better suited to projecting slugging, as it is more adept at predicting extra base hits. Yasiel Puig is sitting way at the bottom of the list, again. xStats doesn’t like his batted balls one tiny bit. Carlos Correa is towards the top of this list, too, sitting right behind Grichuk, Castellanos and Naquin. This system suggests Grichuk’s power is legitimate and well earned. He is sitting on an 8.8% Value Hit rate, far exceeding the 5.7% MLB average. xStats feels he got shortchanged in the hit department, and suggests he may have deserved 10 more singles, 1 more triple, and 1 more home run, but 3 fewer doubles. That’s a lot of potentially missed hits over the course of only one season, and it explains why xStats is so big on him in all of these categories.

Didi Gregorius, whom I wrote about once before, should not be expected to sustain his 2016 power figures. His batted balls don’t suggest it is any sort of sustainable ability, and I suggest being wary of his value going into 2017. Yes, he plays in many hitter-friendly parks, but even with the recent ballpark adjustments I’ve made to xStats, his batted ball profile still can’t justify his recent power surge.

Weighted On Base Average

Aledmys Diaz makes a surprise appearance at the top of this list, although it probably isn’t much use in a fantasy sense. xStats feels he deserves more doubles, and an ever so slight uptick in home runs. This increases his xOBA above what Steamer projects. However, xStats’ lower estimated batting average hurts him much more than any small increase in home run performance, and of course the number of doubles is effectively meaningless in a fantasy setting, especially since the lower number of estimated singles also decreases his slugging.

Take a look at Giancarlo Stanton, though. Steamer gives him a .383 wOBA, and xStats gives him a .345 xOBA, quite a large difference. His batted ball stats are likely spoiled by the string of injuries he has suffered the past few seasons, but at the same time Stanton hasn’t proven capable of sustained success in the majors. That may be an unfair statement, given the injuries, and I honestly feel a bit dirty saying it, but I’m not sure how else you can evaluate him. He hits the ball ridiculously hard, harder than anyone else in the game, but even so his xStats don’t seem to reflect that. There is more to baseball than exit velocity. You also need launch angle, and even more importantly you need consistency.

Below I have a summary of the z-scores I’ve calculated for these stats. They may differ slightly from z-scores you see in other places, since I did them by hand. The HR, wOBA, AVG, and SLG columns show the difference in z-scores between xStats and Steamer; positive numbers show that xStats is higher. The final column is the sum of those four differences. This table should be sortable, so feel free to sort by whichever column you find interesting.

Home Runs

Nick Castellanos certainly had a breakout offensive year in 2016, although xStats was pretty high on him in 2015 as well. Back in 2015, even though he only hit 15 in-game, xStats threw 20 expected home runs at him, effectively claiming five of his doubles deserved to be promoted into four-baggers. In 2016, this power was realized as he hit 18 home runs in 150 fewer plate appearances. Even still, xStats claims he deserved 26 home runs in 2016, and when his batted ball profiles from both seasons are combined, xStats claims he ought to be hitting roughly 0.045 home runs per plate appearance. Or, if you prefer, 1 home run every 22.4 plate appearances, or 19.6% xHR/FB.
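
The rate conversions here are simple arithmetic, sketched below in Python. The 600 plate appearance season is a hypothetical round number chosen for illustration, not a playing time projection from xStats or Steamer, and the function name is made up for this example.

```python
PA_PER_HR = 22.4  # from the text: one expected home run every 22.4 plate appearances

hr_per_pa = 1 / PA_PER_HR


def expected_home_runs(plate_appearances, rate=hr_per_pa):
    """Scale a per-plate-appearance home run rate to a number of plate appearances."""
    return plate_appearances * rate


print(f"HR per PA: {hr_per_pa:.3f}")
print(f"HR over 600 PA (hypothetical): {expected_home_runs(600):.1f}")
```

At that rate, a full 600 plate appearance season would work out to roughly 27 home runs.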

In 2016, xStats registered a power surge from Carlos Correa, which is unfortunate because his actual home run total dropped: he hit 2 fewer home runs in 150 more plate appearances. xStats is more favorable, estimating 26 last season, six more than he registered in-game. If Correa keeps up that sort of batted ball quality, and there is little doubt he can, since he is an outstanding athlete with great power, the home runs will come. xStats estimates he will hit 27 in 2017, five more than Steamer projects. That’s half a z-score of increase, a very solid power gap. Perhaps something you should keep in mind on draft day, assuming you play in a league where he’s still available.

Change in Z-Score xStats vs Steamer
Name PA ΔzHR ΔzwOBA ΔzAVG ΔzSLG Δz
Randal Grichuk 493 0.45 0.14 -0.12 0.47 0.94
Carlos Correa 628 0.49 -0.05 -0.19 0.16 0.41
Tyler Naquin 441 0.26 0.27 -0.38 0.24 0.39
Derek Dietrich 412 0.43 -0.06 -0.25 0.03 0.15
Miguel Sano 546 0.20 0.03 -0.25 0.10 0.08
Nick Castellanos 510 0.58 -0.14 -0.63 0.25 0.06
Devon Travis 524 0.11 -0.06 -0.28 0.04 -0.19
Brandon Drury 442 0.09 -0.14 -0.28 -0.09 -0.42
Aledmys Diaz 561 0.14 0.13 -0.72 -0.18 -0.63
Jackie Bradley 546 0.30 -0.19 -0.69 -0.06 -0.64
Christian Yelich 620 0.32 -0.25 -0.56 -0.16 -0.65
Tommy Joseph 536 -0.01 -0.05 -0.56 -0.25 -0.87
C. J. Cron 431 -0.10 -0.25 -0.34 -0.25 -0.94
Anthony Rendon 647 -0.12 -0.24 -0.50 -0.38 -1.24
Tim Anderson 625 0.33 -0.22 -1.10 -0.28 -1.27
Joc Pederson 537 -0.19 -0.25 -0.47 -0.37 -1.28
Yasmany Tomas 430 -0.37 -0.36 -0.47 -0.55 -1.75
J. T. Realmuto 445 -0.10 -0.46 -0.78 -0.43 -1.77
Marcell Ozuna 500 -0.15 -0.41 -0.72 -0.52 -1.80
Marcus Semien 592 -0.10 -0.46 -0.78 -0.56 -1.90
Kevin Kiermaier 590 -0.33 -0.43 -0.50 -0.66 -1.92
Odubel Herrera 643 -0.04 -0.49 -0.97 -0.52 -2.02
Avisail Garcia 453 -0.17 -0.48 -0.85 -0.63 -2.13
Addison Russell 557 -0.02 -0.54 -1.00 -0.59 -2.15
Kolten Wong 449 -0.28 -0.48 -0.69 -0.81 -2.26
Giancarlo Stanton 515 -0.25 -0.60 -0.72 -0.71 -2.28
Bryce Harper 595 -0.39 -0.57 -0.85 -0.74 -2.55
Cesar Hernandez 624 0.00 -0.60 -1.19 -0.80 -2.59
Andrelton Simmons 483 -0.41 -0.55 -0.81 -0.87 -2.64
Jose Ramirez 584 -0.22 -0.59 -1.16 -0.74 -2.71
Ender Inciarte 627 -0.26 -0.65 -1.13 -0.75 -2.79
Byron Buxton 513 -0.36 -0.62 -1.03 -0.83 -2.84
Didi Gregorius 581 -0.56 -0.67 -0.81 -0.88 -2.92
Javier Baez 452 -0.29 -0.78 -0.97 -0.88 -2.92
Yasiel Puig 489 -0.38 -1.00 -1.63 -1.41 -4.42
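
For anyone who wants to re-sort or sanity-check the table offline, here is a minimal Python sketch. It copies a handful of rows from the table above (not the full list), confirms that the final column is the sum of the four component columns, and sorts by a chosen column. The `Row` type and the function names are hypothetical, made up for this example.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    pa: int
    hr: float
    woba: float
    avg: float
    slg: float
    total: float


# A few rows copied from the table above; extend with the rest as needed.
ROWS = [
    Row("Randal Grichuk", 493, 0.45, 0.14, -0.12, 0.47, 0.94),
    Row("Carlos Correa", 628, 0.49, -0.05, -0.19, 0.16, 0.41),
    Row("Nick Castellanos", 510, 0.58, -0.14, -0.63, 0.25, 0.06),
    Row("Yasiel Puig", 489, -0.38, -1.00, -1.63, -1.41, -4.42),
]


def component_sum(row):
    """Sum the four z-score differences, rounded to the table's two decimals."""
    return round(row.hr + row.woba + row.avg + row.slg, 2)


def sort_by(rows, column, descending=True):
    """Return rows ordered by the named column."""
    return sorted(rows, key=lambda r: getattr(r, column), reverse=descending)


# Check that the final column matches the sum of its components.
for row in ROWS:
    assert component_sum(row) == row.total, row.name

# Print the sample rows ordered by the home run column, highest first.
for row in sort_by(ROWS, "hr"):
    print(f"{row.name:<18} {row.hr:+.2f}")
```

Swapping `"hr"` for `"avg"`, `"slg"`, `"woba"`, or `"total"` reorders the same rows by that column.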

So, if you believe these stats, Grichuk, Naquin, Correa, Castellanos, and Sano are guys you may want to keep your eye on, while you may want to stay away from Puig, Baez, Gregorius, Inciarte, and Ramirez. Some of the other names at the bottom of this list, Harper and Stanton in particular, are very interesting cases. They are somewhat outside the scope of this piece, where I am trying to focus a bit more on guys with smaller sample sizes, but these are superstar talents that xStats is, relatively speaking, pretty low on.

None of these xStats conclusions should be taken as gospel. But they should be seen as data points, something to consider in the greater context of the information available. Take heed of the warnings, and take note of the bright spots. Every piece of information helps.





Andrew Perpetua is the creator of CitiFieldHR.com and xStats.org, and plays around with Statcast data for fun. Follow him on Twitter @AndrewPerpetua.

11 Comments
Carl Pavanos Mustache
7 years ago

Andrew, will you be releasing projections on your xStats site?

Carl Pavanos Mustache
7 years ago

If you don’t mind me asking, where can I find your 2017 estimates, if you’ve released them? I can’t seem to find them on xstats. Maybe I’m just blind.