Author Archive

A Pitch Mechanics Consistency Data Experiment Part II

On July 17th of the 2022 season in Minnesota, Dylan Cease dealt. He threw seven innings, gave up only one hit, and recorded eight strikeouts. His showing earned a game score of 83. It wouldn’t be his highest game score of the year (94); in fact, it wouldn’t even be his second-highest (90), but it was a great outing nonetheless. I’m going to use this game as a way of continuing my analysis from last week on what we can measure from a pitching mechanics standpoint using Statcast pitch-level data. Like in last week’s post, I took the following variables from Cease’s fastballs in that great July 17th start:

‘release_pos_x’, ‘release_pos_z’, ‘release_spin_rate’, ‘release_extension’, ‘spin_axis’

I then conducted a principal component analysis in order to bring these five columns of data down into two. That allows me to then plot the data points on a scatter plot like so:

Cease 7/17/22 PCA Scatter Plot

The graph above shows two principal components of all of Cease’s fastballs thrown on July 17th. I am interested in understanding whether the spread, or variance, of these data points relates in any way to performance. A helpful suggestion from FanGraphs member “couthcommander” came in on last week’s post:

“[C]an you…change the point-character shape based on inning?”

Cease 7/17/22 PCA Scatter Plot By Inning

I chose a slightly different route and changed the color of the points based on the inning. I was expecting to see the darker points (later innings) on the outer edges of the scatter plot and lighter points (earlier innings) tighter around the center, but it’s hard to notice much of a pattern from this one game. Let’s visualize it in a different way. Rather than directly plotting the two principal components as X and Y, I calculated the variance of each by inning and compared the two components:

PCA 1 and 2 Variance by Inning Bar Chart


 

The first principal component shows higher variance as the game goes on through the fourth inning, but then comes back down for the fifth and seventh. A similar pattern is shown in the second component but only through inning two. The variance in PCA2 jumped in inning five but came back down in inning seven. No fastballs were thrown in inning 6.
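Here’s a rough sketch of how the per-inning variance described above could be calculated. The (inning, PCA1, PCA2) rows are made-up placeholder values, not Cease’s actual component scores, `variance_by_inning` is a hypothetical helper name, and the use of sample variance is an assumption since the post doesn’t specify.

```python
from collections import defaultdict
from statistics import variance

# Placeholder (inning, pca1, pca2) rows. These are NOT real Cease data.
pitches = [
    (1, 0.010, 0.002), (1, 0.012, 0.004), (1, 0.008, 0.003),
    (2, 0.020, 0.001), (2, 0.005, 0.006), (2, 0.015, 0.002),
    (3, 0.011, 0.003),  # a lone pitch: variance is undefined, so the inning is skipped
]

def variance_by_inning(rows):
    """Return {inning: (var_pca1, var_pca2)} using sample variance (assumed)."""
    grouped = defaultdict(list)
    for inning, pc1, pc2 in rows:
        grouped[inning].append((pc1, pc2))
    result = {}
    for inning, scores in sorted(grouped.items()):
        if len(scores) < 2:
            continue  # need at least two pitches to measure spread
        pc1_vals, pc2_vals = zip(*scores)
        result[inning] = (variance(pc1_vals), variance(pc2_vals))
    return result

by_inning = variance_by_inning(pitches)
for inning, (v1, v2) in by_inning.items():
    print(f"Inning {inning}: PCA1 variance={v1:.6f}, PCA2 variance={v2:.6f}")
```

Innings with fewer than two fastballs drop out of the result, which is one way to handle an inning like the sixth, where no fastballs were thrown.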

It’s important to remind ourselves of what we’re actually looking at here. PCA1 finds a new axis of variation in this multi-dimensional dataset. Imagine a straight line being drawn through a multi-dimensional scatter plot. This new “principal component” does its best job of explaining as much of the variability in the dataset as possible. PCA2 then does the same with whatever variability is left over, which is why PCA1 is always at least a little more informative than PCA2. The bar chart is telling us that as the game progressed, the first component became more variable through the fourth inning and then stabilized in the fifth. But remember, this is only explaining the following:

‘release_pos_x’, ‘release_pos_z’, ‘release_spin_rate’, ‘release_extension’, ‘spin_axis’

So the question is, does it matter? Does the variance of a component measure of these five features correlate with success? We can look at the components of Cease’s starts before and after the great July 17th start.

 

–July 12th @ CLE: Game Score 66–
PCA1 = 3.3
PCA2 = 0.3

–July 17th @ MIN: Game Score 83–
PCA1 = 1.8
PCA2 = 0.3

–July 24th VS CLE: Game Score 63–
PCA1 = 2.1
PCA2 = 0.2

Variance = STD(PCA)^2 × 10,000

 

While this is in no way conclusive evidence, it’s a start. The variance of PCA1 was lowest on July 17th. The next step in this analysis, as always, is to bring in more data! I will work towards answering the question, does a low variance PCA1 or PCA2 correlate with better performance? If it does, fantasy managers could use this information, if it is tracked and made available, to determine hot spots in a season where pitchers are locked-in. Thanks for participating in this data journey with me. We’ll see where it takes us.
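As a sketch of what that next step might look like, the snippet below computes a Pearson correlation between PCA1 variance and game score using only the three starts listed above. The `pearson_r` helper is written out by hand for illustration; it is not necessarily how the full analysis will be done.

```python
from math import sqrt

# (PCA1 variance x 10,000, game score) for the three starts listed above.
starts = {
    "July 12 @ CLE": (3.3, 66),
    "July 17 @ MIN": (1.8, 83),
    "July 24 vs CLE": (2.1, 63),
}

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_movement / (spread_x * spread_y)

variances = [v for v, _ in starts.values()]
game_scores = [g for _, g in starts.values()]
r = pearson_r(variances, game_scores)
print(f"Pearson r between PCA1 variance and game score: {r:.2f}")
```

With these three starts the coefficient comes out moderately negative, meaning lower variance lined up with higher game scores, but three data points can’t support any real conclusion. That is exactly why more starts are needed.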

 

 

 


Ottoneu: Prospect Pitchers That Might Be Worth Rostering for 2024

ZiPs 2024 gives us some insight as to how prospects will perform if and when they make it to the big leagues. If we can get a general sense of how a player will perform with projections, we can get a general sense of how much they should be valued. To call this process an oversimplification is to look up at the sun and say, “Bright!” Yes, it is an oversimplification, that’s a given. First, we’re trying to predict the future performance of a player who hasn’t actually done it yet. Next, we’re trying to determine how much that performance will be worth without any real context. Where will they play? Who will be on their team? Are they as mentally strong as they are physically strong? Finally, we’re assuming they’ll be healthy.

This oversimplified process can only give us a sense of who might perform like a big leaguer in 2024 and since I’m writing from a FanGraphs points scoring system viewpoint, we can make comparisons with other, more established pitchers. Here’s a reminder of my process. First, I find prospect pitchers yet to debut using The Board. Next, I bring in the ZiPs 2024 projections for the players on that list. Not all of them have projections. After that, I convert their projected stats into FanGraphs Ottoneu points. Finally, I throw the prospects and their projected points into Justin Vibber’s Surplus Calculator output for 2023 and make comparisons. The result tells me how these pitchers will perform in 2024 if they are in a pool of 2023 projected players. The dollar value given assumes that next year’s player pool will be much like this year’s player pool. Here’s an example:

Player Comparison and Value Creation
Name IP rPTS rPTS/IP Dollars
Brandon Pfaadt 153.0 738.0 4.82 $5-$8
Jordan Montgomery 157.3 735.7 4.93 $8
*Yellow=Estimated value

Pfaadt is already grabbing the attention of Ottoneu players as his current FanGraphs points average salary is $4, or $3 Median. Will he increase in value by the end of 2024? ZiPs likes his chances and you can compare his projected points total for 2024 with this year’s Jordan Montgomery. If you pay over the average now, let’s say $6, and this projection comes to fruition, you’ll have a good chance of generating value in 2024. There is, however, another scenario where ZiPs is off the mark and he only brings in $4 in 2024. In that case, you’ll be overpaying. Here are the rest of the 2024 ZiPs projected prospect pitchers and what their value could be at the end of the 2024 season:

Projected Prospect Value for 2024
Name IP rPTS PTS/IP Value
Kodai Senga 142.0 688.2 4.8 $13-$15
Brandon Pfaadt 153.0 738.0 4.8 $5-$8
Tanner Bibee 115.0 466.0 4.1 $3-$5
Grayson Rodriguez 121.7 567.4 4.7 $3-$5
Ricky Tiedemann 112.0 513.0 4.6 $3-$5
Robert Gasser 120.0 511.4 4.3 $3-$5
Gavin Stone 108.0 464.0 4.3 $3-$5
Kyle Harrison 112.0 520.7 4.6 $3-$4
Taj Bradley 120.3 528.8 4.4 $2-$5
Gavin Williams 110.3 457.1 4.1 $2-$3
Andrew Painter 112.7 451.2 4.0 $2-$3
Daniel Espino 104.3 446.6 4.3 $2-$3
Bobby Miller 105.3 421.1 4.0 $2-$3
Mick Abel 105.0 371.0 3.5 $1-$2
Owen White 104.0 438.1 4.2 $1
Ben Joyce 56.3 275.9 4.9 $1
*Ottoneu FanGraphs Points Leagues
**Estimates generated by comparing players with similar projections to Justin Vibber’s Auction Calculator values
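The PTS/IP column is just projected points divided by projected innings. A quick sketch using a few rows copied from the table (the `points_per_inning` helper name is made up for illustration, and the innings are treated as plain decimals, which is an assumption):

```python
# (name, projected IP, projected rPTS) copied from the table above.
projections = [
    ("Kodai Senga", 142.0, 688.2),
    ("Brandon Pfaadt", 153.0, 738.0),
    ("Taj Bradley", 120.3, 528.8),
    ("Ben Joyce", 56.3, 275.9),
]

def points_per_inning(innings, points):
    """Projected points per inning pitched (innings treated as a plain decimal)."""
    return points / innings

rates = {name: round(points_per_inning(ip, pts), 1) for name, ip, pts in projections}
for name, rate in rates.items():
    print(f"{name}: {rate} PTS/IP")
```

The rate is what lets a reliever like Joyce, with far fewer innings, be compared with the starters on the list.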

Let’s compare these estimated 2024 values with some current (2023) average/median Ottoneu salaries:

Current FanGraphs Points Leagues Avg./Med.:

Kodai Senga – Average: $15 / Median: $15
Grayson Rodriguez – Average: $4 / Median: $6
Taj Bradley – Average: $3 / Median: $3
Kyle Harrison – Average: $3 / Median: $3
Ricky Tiedemann – Average: $3 / Median: $3
Robert Gasser – Average: $2 / Median: $3
Tanner Bibee – Average: $2 / Median: $1
Gavin Stone – Average: $2 / Median: $2

This is just one way of trying to look into an uncertain future; mashing a bunch of different spreadsheets together and then estimating a value. Is it worth doing, or would you rather just pay a few dollars now to see what happens later? I think this analysis helps us do both. Remember that the goal is to identify future value and not current value. It allows us to prospect on players because we like them or we believe in them or we saw them at a AA game and were impressed. But, it also allows us to put some kind of filter on how we are rostering and for how much. Are you rostering Taj Bradley for $7 because he was bumped up during arbitration, or you got him in a rebuild trade deal when someone else realized his salary was too high? It may be time to re-examine that hold because, by this analysis at least, he won’t reach that value in 2024. Everyone has a strategy and this is just one approach, but it’s utilizing analytical tools and projections from smarter people than myself to provide insight and that can’t be a bad thing.
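That “filter” can be sketched as a simple check of a salary against the estimated 2024 range. The ranges and average salaries below are copied from the tables above; the `classify` helper and its labels are hypothetical, just one way to frame the comparison.

```python
# Estimated 2024 value ranges (low, high) in dollars, copied from the table above.
estimated_range = {
    "Kodai Senga": (13, 15),
    "Grayson Rodriguez": (3, 5),
    "Taj Bradley": (2, 5),
    "Kyle Harrison": (3, 4),
    "Ricky Tiedemann": (3, 5),
    "Robert Gasser": (3, 5),
    "Tanner Bibee": (3, 5),
    "Gavin Stone": (3, 5),
}
# Current average salaries in FanGraphs points leagues, copied from the list above.
average_salary = {
    "Kodai Senga": 15, "Grayson Rodriguez": 4, "Taj Bradley": 3, "Kyle Harrison": 3,
    "Ricky Tiedemann": 3, "Robert Gasser": 2, "Tanner Bibee": 2, "Gavin Stone": 2,
}

def classify(salary, value_range):
    """Label a salary relative to an estimated (low, high) value range."""
    low, high = value_range
    if salary < low:
        return "below range"
    if salary > high:
        return "above range"
    return "within range"

verdicts = {name: classify(average_salary[name], estimated_range[name])
            for name in estimated_range}
for name, verdict in verdicts.items():
    print(f"{name}: ${average_salary[name]} average salary is {verdict}")

# The $7 Taj Bradley hold from the paragraph above:
print("Taj Bradley at $7:", classify(7, estimated_range["Taj Bradley"]))
```

Swap in your own league’s salaries for `average_salary` to run the same check on your roster.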


A Pitch Mechanics Consistency Data Experiment

The second word in the “music to many people’s ears” term, Spring Training, is an important one to consider. Pro ball players are training. They are preparing for the season. What types of things are they working on? Beat writers report out every year that pitchers are tinkering with new grips, different release points, varying arm slots, diets, cleats, the list is endless. This is assumed to be even more evident in pitchers. As they ramp up to game-ready status, what exactly are they ramping up and can it be quantified by a writer with only so much publicly available data at his fingertips? Away we go in answering that question together.

With Statcast data available in spring training ballparks, we can access pitch-level data from the good folks at Baseball Savant. God bless them. There are a few metrics that measure what I would consider pitcher mechanics and here they are:

[‘release_speed’, ‘release_pos_x’, ‘release_pos_z’, ‘effective_speed’, ‘release_spin_rate’, ‘release_extension’, ‘spin_axis’]

These seven variables are incredibly manageable from a data perspective when compared to some of the more advanced biomechanical data teams and private company analysts are working with today. However, it can be really difficult to notice patterns from game to game just by looking at a spreadsheet:

Max Scherzer Statcast Data – 3/3/23
release_speed release_pos_x release_pos_z effective_speed release_spin_rate release_extension spin_axis
94.1 -3.28 5.38 94.1 2269 6.3 229
93.2 -3.09 5.56 92.9 2213 6.2 221
93.7 -3.19 5.41 93.2 2384 6.0 226
92.9 -3.11 5.41 92.8 2317 6.2 224
93.2 -3.07 5.6 93.1 2223 6.2 219

Yes, you could look at this and make general assumptions. But, what if we want to visualize this? What if we wanted to hyper-analyze this so that the only people who really know what the heck is going on are the ones who are too busy playing the game in hyperspace? Bring in principal component analysis!

I’ve used this technique for a few articles here on FanGraphs. In this case, a principal component is being created based on multi-dimensional data, like the spreadsheet above with numerous columns, to create a new column. It cuts through the data and builds new “axes of variation” to better explain multiple data points. A simpler way of explaining this is that it’s taking multiple columns in the spreadsheet and condensing them into one. If we then create two of those new, condensed data columns, or principal components, then we can create a visualization. If this is too much data talk for you, hopefully it gets better as I bring in the baseball.
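For the curious, here’s a minimal, dependency-free sketch of that condensing step, run on the five Scherzer pitches from the table above. It is not the exact pipeline behind the plots in this post: standardizing each column first and finding the components by power iteration are assumptions made for illustration, and helper names like `two_components` are made up. Five pitches are also far too few to say anything meaningful; the point is only to show the mechanics.

```python
import math

# The five Scherzer pitches from the table above. Columns: release_speed,
# release_pos_x, release_pos_z, effective_speed, release_spin_rate,
# release_extension, spin_axis.
pitches = [
    [94.1, -3.28, 5.38, 94.1, 2269, 6.3, 229],
    [93.2, -3.09, 5.56, 92.9, 2213, 6.2, 221],
    [93.7, -3.19, 5.41, 93.2, 2384, 6.0, 226],
    [92.9, -3.11, 5.41, 92.8, 2317, 6.2, 224],
    [93.2, -3.07, 5.60, 93.1, 2223, 6.2, 219],
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation (assumed step)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    cols = list(zip(*rows))
    return [[dot(cols[i], cols[j]) / (n - 1) for j in range(d)] for i in range(d)]

def top_eigenpair(matrix, iterations=1000):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    vec = [float(i + 1) for i in range(len(matrix))]  # arbitrary non-zero start
    for _ in range(iterations):
        nxt = [dot(row, vec) for row in matrix]
        norm = math.sqrt(dot(nxt, nxt))
        vec = [x / norm for x in nxt]
    eigenvalue = dot(vec, [dot(row, vec) for row in matrix])
    return eigenvalue, vec

def two_components(rows):
    """Project rows onto their first two principal axes."""
    z = standardize(rows)
    cov = covariance(z)
    val1, vec1 = top_eigenpair(cov)
    d = len(cov)
    # Remove the first axis from the covariance matrix, then find the next one.
    deflated = [[cov[i][j] - val1 * vec1[i] * vec1[j] for j in range(d)]
                for i in range(d)]
    val2, vec2 = top_eigenpair(deflated)
    scores = [(dot(row, vec1), dot(row, vec2)) for row in z]
    return scores, (vec1, vec2), (val1, val2)

scores, (axis1, axis2), (var1, var2) = two_components(pitches)
for pc1, pc2 in scores:
    print(f"PC1={pc1:+.2f}  PC2={pc2:+.2f}")
```

Each pitch ends up as one (PC1, PC2) pair, which is what gets plotted as a single dot. `var1` and `var2` are the variances of the two new columns, and the first is never smaller than the second.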

Let’s start with the young, yet-to-debut major league pitcher, Grayson Rodriguez. How do the metrics above look, game by game as he ramps up for a season in which he expects to debut? I’ll create two principal components to help summarize a dataset similar to the table above and I’ll plot them on the x and y axis of a scatter plot, like this:

Gray-Rod Game 1 PCA Scatter

What we see is two variables, principal components one and two, explaining all the variables listed at the top of this article for one Gray-Rod Spring Training game’s worth of fastballs. It’s not very exciting. But bring a second game’s worth of fastballs into the visual and the excitement levels go through the roof!

Gray-Rod Game 1 and Game 2 PCA Scatter

…Ok, maybe it’s not that much more exciting. But, at least we can now see a little more of a story starting to develop. Ideally, since these variables are mostly repeatable we should see the blue and red dots sit closer together. What’s up with that game 2 outlier at the top of the second plot? We can compare that pitch with the averages of the other pitches in that game to analyze it further:

Data Point Evaluation
Pitch release_speed release_pos_x release_pos_z effective_speed release_spin_rate release_extension spin_axis PC1 PC2
Data Point in Question 98.2 -2.27 6.14 100.3 2077.0 7.3 207.0 -0.00 0.02
Averages of Outing 97.9 -2.14 6.11 99.8 2021.1 7.4 208.4 -0.00 -0.00
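To make the table easier to read, here’s a small sketch that subtracts the outing averages from the pitch in question, column by column. The numbers are copied from the table; the variable names are just for illustration.

```python
columns = ["release_speed", "release_pos_x", "release_pos_z", "effective_speed",
           "release_spin_rate", "release_extension", "spin_axis"]
pitch_in_question = [98.2, -2.27, 6.14, 100.3, 2077.0, 7.3, 207.0]
outing_average = [97.9, -2.14, 6.11, 99.8, 2021.1, 7.4, 208.4]

# Positive numbers mean the pitch was above the outing average for that metric.
differences = {
    name: round(pitch - avg, 2)
    for name, pitch, avg in zip(columns, pitch_in_question, outing_average)
}
for name, diff in differences.items():
    print(f"{name:>18}: {diff:+.2f}")
```

The biggest gap relative to its scale is in spin rate, at roughly 56 rpm above the outing average.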

It seems that this pitch had a slightly higher release point, effective speed, and spin rate than his average for the outing. Is this significant? I really have no idea. It could just be noise. I would love to know if Rodriguez would have noticed any difference after that pitch was thrown. Would he have admitted that he really wanted to get that guy out? Let’s go to the video to see what the situation was:

…Oh, wait. We can’t because MASN doesn’t want to film in sunny Florida. Luckily, we can still look at the Baseball Savant video-less page here. On a 2-0 count against Spencer Torkelson, maybe Gray-Rod reared back and put a little extra mustard into making sure he didn’t get to 3-0. We’ll likely never know.

How might this compare with a pitcher who is more established? Let’s conduct this same analysis on two of Max Scherzer’s spring outings this season and compare:

Scherzer Game 1 and Game 2 PCA Scatter

Scherzer shows a little tighter spread between all of his pitches and lacks the clear outliers showcased by Rodriguez. The more interesting part to me is that the pitches get closer together from game one to game two. Could that mean anything? Could he be getting ramped up and more consistent, more repeatable?

Now the ultimate question in baseball analytics: how can we actually use this to win? I believe checking in on pitcher components throughout the season may be able to help us identify fatigued players who need rest in order to get their components back into a form that is more in line with areas of success. This would require measuring the game-by-game spread or variation of the points. If that number is larger, is that a measurement of inconsistency? If it is lower, does it correlate with success? This analysis really brings up more questions than it answers, as per usual:

What if we changed the colors of the data points in the visualization to reflect individual start game scores?

Are tighter pitches (less spread among a single game’s point locations) better?

What could be done with more data? Can this analysis be applied to biomechanical data?

How does it apply to non-fastballs? Do certain pitchers struggle with repeated motion on certain pitches and not others?
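Measuring the game-by-game spread mentioned above could look something like the sketch below. The metric (mean distance of each pitch from the game’s center point) is just one reasonable choice, not an established standard, and the (PC1, PC2) points are made-up placeholders rather than real Rodriguez or Scherzer data.

```python
from math import dist
from statistics import mean

# Placeholder (PC1, PC2) points per game. These are NOT real pitch data.
games = {
    "Game 1": [(0.00, 0.00), (0.02, 0.00), (0.00, 0.02), (0.02, 0.02)],
    "Game 2": [(0.005, 0.005), (0.015, 0.005), (0.005, 0.015), (0.015, 0.015)],
}

def spread(points):
    """Mean Euclidean distance of each point from the game's centroid."""
    centroid = (mean(p[0] for p in points), mean(p[1] for p in points))
    return mean(dist(p, centroid) for p in points)

spreads = {game: spread(points) for game, points in games.items()}
for game, value in spreads.items():
    print(f"{game}: spread = {value:.4f}")
```

A smaller number means the dots sit closer together, which is the “tighter” pattern described for Scherzer’s second outing.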

If this post is a thread in that old spring training baseball jersey you pulled out of the closet for your trip to Florida, then let’s start pulling until there’s nothing left and you’ve gotta borrow sunscreen from the shirtless guy next to you. My hope is that with a little more time and research, I’ll be able to utilize this analysis to detect in-season struggles by starting pitchers.


Ottoneu: Prospect Hitters That Might Be Worth Rostering for 2024

Rostering minor leaguers in any Ottoneu format is super fun. It’s fun because you get to take a chance on a player. Build your roster now in order to plan for the future. Isn’t that what so many real-world teams are doing? Yet, it can be cumbersome to hold onto a player for a few years while paying them $1 or more in salary when you really don’t know when, or even if, they will pan out.

A few months ago, I was up against this very challenge. Rostering Jack Leiter was exciting. He was a college pitcher with a lot of hype and talent and now a minor leaguer who has held onto both of those traits. But, I needed to try and quantify how much he would be worth in a few years. His MLB ETA isn’t until this season and likely the end of this season. Looking for a range of reasonable salary expectations, I used ZiPs’ 2024 and 2025 projections to get a sense of how to value Leiter in a few years. My analysis concluded that Leiter would likely only return somewhere around $4 in 2023 and that’s if he meets the higher end of likely outcomes. I was paying $6, too much for this year, so I cut him.

Rostering Leiter at $6 doesn’t make sense for 2023. But now that he is back on the waiver wire, should I attempt to re-roster him for $1? In order for that to make sense, two things need to happen. First, he needs to return over $1 of value in 2023. Second, he needs to return over $1 of value in 2024 and pass through arbitration untouched at the end of the 2023 season. Only then is he worth selecting for $1. In this analysis, I’ll look at the players who are most likely to pass through those qualifiers. Choo! Choo! All aboard the hypothetical train! Here’s a map of our journey:

Step 1: Take a deep breath. A lot of assumptions are going to be made here. This is not an exact science. Please, do not rush to your player page and start auctioning off these players at these exact prices. Use these values as a jumping-off point.

Step 2: Grab a pool of minor league players who have not yet debuted from “The Board” with an FV (Future Value) greater than or equal to 30 and subset it down to players with a “Current Level” not equal to “MLB”.

Step 3: Download ZiPs 2024 projections for the players in Step 2 and calculate their Ottoneu FanGraphs points totals based on those projections.

Step 4: Add those players into the auction calculator’s outputs, sort by rPTS, and see what the value is for players in that same rPTS area. Keep in mind, player value is altered by positional adjustments.

This process ultimately tells us what these players’ auction value could be if they play to their full ZiPs projections in 2024 and the league looks identical to this season’s (2023) player pool. It tells us how much they should be valued for next year.

Out of the 271 prospects I downloaded from The Board, 242 had a “Current Level” other than MLB. Of those 242, only 60 have been projected by ZiPs for 2024. Of those 60, only 15 (but number 1 on the list kind of doesn’t count) end up with a positive projected value in 2024. Here they are:

Projected Prospect Value for 2024
Name rPTS Pos aPOS Dollar Range
Masataka Yoshida 769.2 LF $20.33 $22-$26
Christian Encarnacion-Strand 749.0 3B $12.55 $18-$20
Anthony Volpe 719.9 2B $15.03 $15-$16
Andy Pages 680.8 RF $20.33 $13-$15
Endy Rodriguez 633.2 C $24.10 $12-$14
Colton Cowser 650.6 LF $20.33 $11-$12
George Valera 640.4 RF $20.33 $9-$10
Noelvi Marte 656.7 3B $12.55 $5-$6
Michael Busch 655.3 DH $12.55 $5-$6
Matt Mervis 638.9 1B $14.73 $4-$5
Ceddanne Rafaela 609.9 CF $20.33 $4-$5
Elly De La Cruz 623.6 SS $14.55 $2-$3
Jordan Westburg 614.8 2B $15.03 $1-$2
Addison Barger 599.7 2B $15.03 $1-$2
Zac Veen 582.5 RF $20.33 $1-$2
*Ottoneu FanGraphs Points Leagues
**ZiPs projections

Now, it’s very, very possible that this process excluded some highly touted prospects. Hopefully, I was clear enough with my process that you can hypothesize as to why they were left out or replicate a version of this for yourself. This is not a top draftable prospects list. I repeat: this is not a top draftable prospects list. For example, where is Jordan Walker? Well, ZiPs doesn’t necessarily project Walker to be roster-able in 2024. Here’s where the Cardinals prospect stands in this process:

Jordan Walker Prospect Value Comps for 2024
Name rPTS Pos aPOS Dollar Range
Josh Lowe 539.1 OF $19.28 -$4.58
Jordan Walker 538.8 RF $19.28
Adam Duvall 537.6 OF $19.28 -$4.79
*Ottoneu FanGraphs Points Leagues
**ZiPs projections
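The comp step (Step 4) can be sketched as “find the closest players just above and just below the prospect in rPTS and use their dollar values as a bracket.” The two comps below are copied from the table; `bracket_value` is a hypothetical helper, and it ignores positional adjustments, which only works here because all three players share the same aPOS.

```python
# Auction-calculator comps from the table above: (name, rPTS, dollar value).
player_pool = [
    ("Josh Lowe", 539.1, -4.58),
    ("Adam Duvall", 537.6, -4.79),
]

def bracket_value(target_rpts, pool):
    """Return (low, high) dollar values from the nearest comps below and above target_rpts.

    Returns None when the pool has no player on one side of the target.
    """
    below = [p for p in pool if p[1] <= target_rpts]
    above = [p for p in pool if p[1] >= target_rpts]
    if not below or not above:
        return None
    nearest_below = max(below, key=lambda p: p[1])
    nearest_above = min(above, key=lambda p: p[1])
    return tuple(sorted((nearest_below[2], nearest_above[2])))

walker_range = bracket_value(538.8, player_pool)
print(f"Jordan Walker (538.8 rPTS) estimated value range: {walker_range}")
```

Both ends of Walker’s bracket are negative, which is why he doesn’t show up on the list of positive-value prospects.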

Now, to see why Walker falls out of a roster-able projected salary, let’s look at his 2024 ZiPs projections compared to a couple players high on this list, Colton Cowser and Christian Encarnacion-Strand:

Prospect Comp by ZiPs
Name AB H 2B 3B HR BB HBP SB CS
Christian Encarnacion-Strand 523 136 28 4 29 34 9 4 1
Colton Cowser 525 124 27 1 16 66 14 8 3
Jordan Walker 491 117 26 3 14 39 7 11 3
*Ottoneu FanGraphs Points Leagues
**ZiPs projections
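For anyone who wants to replicate the points conversion (Step 3), here’s a short sketch. The stat lines are copied from the table above. The scoring weights are not listed in this post; they’re filled in here as the Ottoneu FanGraphs points hitting weights, so double-check them against your league’s settings.

```python
# Assumed Ottoneu FanGraphs points weights for hitters (not stated in the post).
HITTING_WEIGHTS = {
    "AB": -1.0, "H": 5.6, "2B": 2.9, "3B": 5.7, "HR": 9.4,
    "BB": 3.0, "HBP": 3.0, "SB": 1.9, "CS": -2.8,
}

# 2024 ZiPs lines copied from the table above.
zips_2024 = {
    "Christian Encarnacion-Strand": {"AB": 523, "H": 136, "2B": 28, "3B": 4, "HR": 29,
                                     "BB": 34, "HBP": 9, "SB": 4, "CS": 1},
    "Colton Cowser": {"AB": 525, "H": 124, "2B": 27, "3B": 1, "HR": 16,
                      "BB": 66, "HBP": 14, "SB": 8, "CS": 3},
    "Jordan Walker": {"AB": 491, "H": 117, "2B": 26, "3B": 3, "HR": 14,
                      "BB": 39, "HBP": 7, "SB": 11, "CS": 3},
}

def ottoneu_points(stat_line):
    """Weighted sum of a projected stat line."""
    return sum(HITTING_WEIGHTS[stat] * value for stat, value in stat_line.items())

points = {name: round(ottoneu_points(line), 1) for name, line in zips_2024.items()}
for name, total in points.items():
    print(f"{name}: {total} points")
```

With those weights the totals come out to 749.0, 650.6, and 538.8, matching the rPTS listed for these three players in the tables above.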

You can see why Cowser has more projected value in 2024, but you can also see how this system has some flaws. Walker has already been creating a lot of buzz in Spring Training and many might look at that 2024 projection and scoff. Projecting players who haven’t appeared in the MLB yet is hard. Time is money and if I had more of it, I would go about the valuation a little more scientifically. Rather than comparing rPTS and suggesting a value range, it would be interesting to iterate through runs of the auction calculator, each time removing the player the minor leaguer would be replacing and determining value from a more believable player pool. I think I’ve gotten close enough for now. Roster any of the players listed above for $1, make it through the end of the 2023 season arbitration without their salaries increasing, and you have a good chance at value in 2024.


Ottoneu: These Pitchers Are More Valuable In Points Leagues

Question: What’s the most important thing to remember when drafting in any fantasy baseball league?

Wise-Guy Answer: The type of beer you have on deck.

Wise-Guy’s Friend’s Answer: Making sure you have snacks that don’t grease up your keyboard!

Serious Answer: Your league’s scoring system.



An Exercise in Injury Impact

Last week I built two hypothetical fantasy teams. The first team, the ATC Swiss-Army Knives, was comprised of players from each of the first ten rounds (12-team) with the lowest ATC IntraSD measurement. The second team, the ATC Scary Single Blades, was built in the opposite way, taking the player in each round with the highest IntraSD. For kicks, I then built a third team that was a healthy mix of the two.


Two Ways of Spreading Risk: How To Use ATC’s IntraSD to Balance Your Hitters

Do you remember being a kid and really wanting a swiss-army knife? They were always locked up in a case, out of reach of 10-year-old grips which would happily test the product right there in the store if the guy with the keys would let them. The swiss-army knife could do it all and would turn any youngster into an adventuring hero instantly. But next to the super utility red magic instrument were the scarier single-blade knives that you were sure, positive, your mom would say no to. In fact, you weren’t so sure yourself if you could handle such an object. If fantasy baseball players on draft day are knives spinning in the display case at your local hardware store, ATC’s IntraSD provides instructions on how to use each one.

Spend a moment sorting by IntraSD on the ATC projections page and you’ll quickly understand what the metric seeks to explain. Players like Seiya Suzuki and Fernando Tatis Jr. have very low values and represent your swiss-army all-around player. Players like Pete Alonso and Jake McCarthy have high values and represent your scary, yet very sharp and useful, single-blade knives or category-specific players. IntraSD, in essence, measures how equally, or unequally, distributed a player’s roto category values are. Here are our four examples in both roto-values as created by z-scores on our auction calculator and the raw projections that generate those values:

Auction Calculator Category Values
Name PA mAVG mRBI mR mSB mHR
Fernando Tatis 502 $2.84 $1.88 $1.41 $5.07 $4.26
Seiya Suzuki 567 -$0.19 -$2.15 -$2.01 $0.85 -$1.38
Pete Alonso 657 $0.35 $10.71 $3.82 -$2.52 $8.55
Jake McCarthy 556 -$0.32 -$5.50 -$2.58 $9.82 -$6.51
*ATC projections, **Auction Calculator default settings, ***Yellow=LowIntra, Red=HighIntra

ATC Projections
Name Team G PA AB HR R RBI SB AVG
Fernando Tatis Jr. SDP 117 502 441 31 82 82 20 0.277
Seiya Suzuki CHC 135 567 497 20 73 69 11 0.262
Pete Alonso NYM 154 657 573 38 89 110 4 0.264
Jake McCarthy ARI 137 556 501 11 71 58 30 0.262
*Yellow=LowIntra, Red=HighIntra
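To get a feel for what “equally distributed” means, here’s a rough proxy: the standard deviation of each player’s five category dollar values from the first table. To be clear, this is not how ATC calculates IntraSD, and it won’t reproduce the published numbers; it’s only meant to show the idea of measuring spread across categories.

```python
from statistics import stdev

# Category dollar values (mAVG, mRBI, mR, mSB, mHR) copied from the first table above.
category_values = {
    "Fernando Tatis Jr.": [2.84, 1.88, 1.41, 5.07, 4.26],
    "Seiya Suzuki": [-0.19, -2.15, -2.01, 0.85, -1.38],
    "Pete Alonso": [0.35, 10.71, 3.82, -2.52, 8.55],
    "Jake McCarthy": [-0.32, -5.50, -2.58, 9.82, -6.51],
}

# Proxy spread: sample standard deviation across a player's own categories.
category_spread = {name: stdev(values) for name, values in category_values.items()}

# Most balanced (smallest spread) first.
ranked = sorted(category_spread, key=category_spread.get)
for name in ranked:
    print(f"{name}: spread of category values = {category_spread[name]:.2f}")
```

Even this crude version sorts the four players in the same order as their IntraSD values: Suzuki, Tatis, Alonso, McCarthy.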

My favorite way to approach a draft is to target the low IntraSD value players. I’m all for drafting Tatis and Suzuki because of the spread of their values. You can see that both players have fairly even values across categories and Tatis provides a little bump in the mSB value. Alonso’s value is mostly tied up in mRBI and mHR while McCarthy’s is isolated to mSB.

I love spreading the value around and I don’t like having one-category players on my team. However, many fantasy managers have great success in balancing a team with those solo cats. So let’s experiment with this season’s player pool. Let’s build two teams using two different strategies. We’ll use ATC’s IntraSD to build one team that has players whose value is equally distributed across offensive categories. Then, we’ll build a second team that isolates specific skills. I’ve only included hitters in the first 10 rounds because pulling in pitchers to this analysis makes things complicated. I’ll write up a pitchers-only version of this post next week. For this analysis, I’m using ATC projections and ranking players based on 12-team leagues and average draft position (ADP) from NFBC drafts. I’m also not using a specific draft position, rather, I’m simply pulling out the highest IntraSD hitter and the lowest IntraSD hitter per round. Here are the two players per round who meet those qualifiers:

Round 1
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Ronald Acuña Jr. 2.0 633 $2.24 $0.61 $8.96 $11.43 $3.55 $42.99 1.31
Kyle Tucker 6.2 630 $2.84 $7.32 $2.76 $5.85 $5.49 $40.45 0.49

If you have the number two pick and you skip over Acuña for Tucker, you’re bound to get some funny looks in the draft room. This analysis is a little unfair in the first round, because all players with first round ADP and high dollar value will naturally have high value in each category. But without going through this exercise, I would have never realized that Kyle Tucker (33) is expected to hit more home runs than Acuña (29). In the end, Acuña’s mSB and mR values edge out the total dollar contest.

Round 2
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Fernando Tatis Jr. 18.7 502 $2.84 $1.88 $1.41 $5.07 $4.26 $25.73 0.40
Pete Alonso 19.3 657 $0.35 $10.71 $3.82 -$2.52 $8.55 $32.81 1.45

You know what’s going on here. Tatis provides stolen bases and Alonso doesn’t. Alonso is expected to hit more home runs (38, Tatis 31) and Tatis has better potential batting average value. Tatis is a perfect example of an all-around player and should be the captain of the ATC Swiss-Army Knives.

Round 3
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
J.T. Realmuto 25.9 550 $0.03 -$0.45 -$2.02 $3.24 -$1.49 $24.42 0.43
Nolan Arenado 35.2 638 $0.92 $6.43 $0.10 -$2.50 $3.03 $24.31 0.95

Realmuto’s value is on par with Arenado’s because of a huge positional adjustment and while Arenado looks like the better player from a projections standpoint, every team needs a catcher. To have a catcher with a low IntraSD, one who can contribute in all categories exceptionally well relative to other catchers, has tremendous value.

Round 4
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Jose Altuve 37.9 632 $3.20 -$1.25 $7.34 $1.60 $1.33 $27.16 0.63
Matt Olson 46.3 659 -$3.72 $8.09 $3.99 -$3.68 $6.61 $23.19 1.52

What an interesting battle this one is. Olson has most of his value tied up in home runs and the RBI that come with those home runs. But Altuve is so well balanced. Could there still be some more pop and stolen base potential in the 32-year-old second baseman? Projections say yes. It’s a tough decision to make.

Round 5
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Ozzie Albies 57.1 597 -$0.71 $1.26 $1.56 $2.59 $0.49 $20.12 0.32
Kyle Schwarber 58.7 618 -$7.73 $5.27 $5.04 -$1.32 $9.16 $26.60 1.70

Round four and round two end up looking very similar. Home runs, runs, and RBI, or stolen bases and average? Remember that scarcity is built into these raw category (z-score) values, so while Schwarber’s 39 ATC projected home runs are third best in the league, Albies’ 15 ATC projected stolen bases land him at 48th. However, since it is more difficult to find stolen bases lying around on the waiver wire than it is to find home runs, Albies’ stolen base value rounds out his IntraSD to a lower number.

Round 6
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Adolis García 59.0 610 -$7.20 $2.07 -$0.85 $4.84 $1.56 $16.60 1.16
Adley Rutschman 71.6 552 -$1.34 -$4.89 -$1.87 -$2.69 -$3.17 $11.17 0.40

Another catcher appears on team ATC Swiss-Army Knives and already we can see drawbacks to simply selecting the most category-diverse player per round. Unless you want to go H.A.M. in a two catcher league, I wouldn’t advise drafting both Rutschman and Realmuto. But, this does point out the batting-average drain a player like García can be on your roster. However, that stolen base value is always attention grabbing. Had you selected Realmuto as your catcher earlier and avoided García, your team would be much more balanced.

Round 7
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Tommy Edman 75.9 624 -$0.56 -$5.68 $2.23 $7.72 -$6.01 $12.64 1.46
Dansby Swanson 79.1 648 -$2.77 $1.07 $2.19 $2.16 -$0.02 $12.89 0.51

A lot of times we see a projection line like that of Swanson’s:

G: 151 PA: 648 HR: 23 R: 85 RBI: 79 SB: 14

…and nothing really stands out. But, it’s an incredible line. Compare it with Edman’s:

G: 147 PA: 624 HR: 12 R: 85 RBI: 58 SB: 26

…and you might jump out of your shoes when you see that 26 stolen base projection. But Swanson has him beat nearly everywhere else. Are you noticing a pattern? If a player has high positive stolen base value and is lacking in any other category, their IntraSD increases and they become a little more one-sided.

Round 8
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Starling Marte 87.1 550 $2.28 -$4.77 -$0.87 $8.00 -$4.87 $15.97 1.38
Gunnar Henderson 90.2 584 -$2.49 -$1.84 -$1.89 $0.64 -$2.10 $8.65 0.25

Henderson holds the third lowest IntraSD among players in this analysis and he has the second lowest IntraSD among projected third basemen. Reds prospect Spencer Steer holds the lowest among third basemen at 0.20. Add to that impressive mark the fact that the third base position has such a steep drop off this upcoming season and Henderson’s value, or perhaps ADP, is likely to continue to go up. But once again, that stolen base inhibition comes creeping into your mouse clicking finger and the “Draft” button starts pulsing.

Round 9
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Alejandro Kirk 104.3 489 $3.12 -$4.09 -$7.49 -$4.32 -$3.80 $8.54 0.87
Seiya Suzuki 105.8 567 -$0.19 -$2.15 -$2.01 $0.85 -$1.38 $11.30 0.12

Suzuki’s recent injury news came as a heavy blow to a few of my draft strategies. He has the lowest IntraSD among all players projected by ATC and I planned on capitalizing on that fact, making him the tiny little scissors in the swiss-army knife that all too often come in handy. He is projected to be such a useful, all-around player that those less “in the know” may let him slide down draft boards. Add to that the fact that he is no longer a first-year player and the shiny new toy in the draft room, and I had Suzuki as one of my most common targets. While his season is certainly not over, oblique injuries are so nagging and tough to assess. Kirk, on the other hand, can start to balance out a team that focused on those stolen bases and home runs early on.

Round 10
Name ADP PA mAVG mRBI mR mSB mHR Dollars IntraSD
Jake McCarthy 108.3 556 -$0.32 -$5.50 -$2.58 $9.82 -$6.51 $11.09 1.74
Gleyber Torres 114.1 573 -$1.29 -$1.48 -$1.93 $1.01 -$0.73 $10.52 0.15

Much like the way this exercise allowed me to notice Kyle Tucker’s “better than Acuña’s” home run projections, I unexpectedly noticed Torres as a well-balanced player. His playing time is certainly in question as the Yankees have Isiah Kiner-Falefa and DJ LeMahieu, but Roster Resource currently has Torres listed as the lead-off hitter and starting second baseman. I really do believe in a healthy home run total from Torres in 2023 given his HR/FB improvements in 2022 along with his improved ability to hit the fastball. McCarthy’s IntraSD tells us the same old story of how valuable higher stolen base projections can be.

Now, it’s time to look at how our individual teams would fare in standard roto formats. Please, meet team ATC Scary Single-Blades and team ATC Swiss-Army Knives:

The ATC Scary Single-Blades
Total HR Total R Total RBI Total SB AVG Average IntraSD
248 820 799 153 0.258 1.4

C: Alejandro Kirk 1B: Pete Alonso, Matt Olson 2B: Tommy Edman SS: – 3B: Nolan Arenado OF: Ronald Acuña Jr., Kyle Schwarber, Adolis García, Starling Marte, Jake McCarthy

The ATC Swiss-Army Knives
Total HR Total R Total RBI Total SB AVG Average IntraSD
233 800 756 138 0.263 0.4

C: J.T. Realmuto, Adley Rutschman 1B: – 2B: Jose Altuve, Gleyber Torres, Ozzie Albies SS: Fernando Tatis Jr., Dansby Swanson 3B: Gunnar Henderson OF: Seiya Suzuki, Kyle Tucker

Fascinating. It turns out that the scary single-blade knife slowly spinning in fluorescent light is the better option. The natural balance of one-sided players ends up producing better totals than seeking a diverse team. Looking at the positional distribution of both teams tells us that selecting from either side of the strategy leaves us with a poorly constructed roster. In the end, Mom knows best and as you stomped around the hardware store begging for her to buy you a knife that you probably shouldn’t have, she suggested you settle for something more age-appropriate, the Victorinox Knife, a useful yet less overly utilitarian option:

The ATC Victorinox
Total HR Total R Total RBI Total SB AVG Average IntraSD
264 820 818 130 0.257 0.69

C: J.T. Realmuto 1B: Pete Alonso 2B: Jose Altuve, Gleyber Torres SS: Dansby Swanson 3B: Gunnar Henderson OF: Seiya Suzuki, Kyle Tucker, Kyle Schwarber, Adolis García

Finally, here are the hypothetical roto standings of our three-team league:

Hypothetical Roto Standings
Team HR R RBI SB AVG Total
Single-Blades 2 2 2 3 2 11
Swiss-Army Knives 1 1 1 2 3 8
Victorinox 3 2 3 1 1 10
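For anyone who wants to rerun the standings with their own three (or twelve) teams, here is a minimal sketch of the scoring. It assumes each category is scored as one point plus the number of teams strictly below you, which is the tie treatment that reproduces the table above (the two teams tied at 820 runs each get 2). Many leagues instead split the points for a tie, which would give each of those teams 2.5. The function name `roto_points` is just an illustration.

```python
# Category totals copied from the three team tables above.
teams = {
    "Single-Blades":     {"HR": 248, "R": 820, "RBI": 799, "SB": 153, "AVG": 0.258},
    "Swiss-Army Knives": {"HR": 233, "R": 800, "RBI": 756, "SB": 138, "AVG": 0.263},
    "Victorinox":        {"HR": 264, "R": 820, "RBI": 818, "SB": 130, "AVG": 0.257},
}


def roto_points(teams):
    """Score each category as 1 + (number of teams strictly below this one).

    Assumption: tied teams both receive the lower shared value, which is
    how the table in the post treats the 820-run tie.
    """
    categories = list(next(iter(teams.values())).keys())
    points = {name: {} for name in teams}
    for cat in categories:
        for name, stats in teams.items():
            below = sum(1 for other in teams.values() if other[cat] < stats[cat])
            points[name][cat] = 1 + below
    return points


standings = roto_points(teams)
totals = {name: sum(cats.values()) for name, cats in standings.items()}

# Print teams from most to fewest total points.
for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:<18} {standings[name]} total={total}")
```

Swapping in your own projected totals for each mock roster is the whole exercise; every higher-is-better category works the same way.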

The results of this exercise would indicate that drafting a healthy balance of high IntraSD players is actually a really good thing. But that caveat of not actually being able to get these players in a snake draft is a pretty big one. Regardless, it is an exercise you can conduct yourself with your known draft position prior to draft day. A balanced team is a good one, but if you can get players who are the best at certain categories, like the Single-Blades, you will be drafting a very good team. There is one thing that has not been included in this analysis thus far: what happens if a player just doesn’t meet expectations for a category? IntraSD also allows you to spread that risk. If you have a base-stealer who stops stealing bases and no back-up plan, then you had better get creative mid-season. Take a look at your personal draft strategy from a perspective of balance using IntraSD and you’ll likely gain some insight that could better prepare you for the season.


Ottoneu: These Players Are More Valuable In Points Leagues

It’s always good to remind yourself of your league’s scoring system before you start a re-draft for the season. If you’re like me and you play in multiple fantasy baseball leagues with multiple scoring systems, things can get a little blended together. Here are some really important points to remember when comparing Ottoneu points and standard roto Read the rest of this entry »


Ottoneu: Where Auction Price Is Just a Concept

What is fantasy baseball money anyway? It’s made up, Monopoly money, fake, and not tied to any real financial logic. If I want to pay Mike Trout $75, I’ll do it, so long as no one in my league wants to pay him $76. It’s within my rights as an Ottoneu player. Some people pull up to the closest gas station they can find, don’t even look at the price per gallon, pump away and jam the “No” button when asked, “Do you want a receipt?” Others drive around for miles waiting to find that station showing a price $0.05 lower than all the rest, print that receipt, stuff it in their wallet, and keep it as proof of the deal they got to show their neighbors at a later date. What I’m really trying to get at here is that the price you pay can be entirely different from the price someone else is willing to pay, and this is really what playing in fantasy baseball auction leagues is all about.

Click over to the auction calculator, set it to Ottoneu FanGraphs points league scoring, and take a look at Aaron Judge’s 2022 year-to-date (YTD) value. $93! That’s 1458.2 points worth of one outfielder. Starting in 2018, the FanGraphs Staff II League champion typically accumulated a few hundred points above 18,500. For teams that rostered Judge in 2022 and came close to this point total, Judge represented roughly eight percent of their points total. But any manager who bid anywhere above $70 for Judge last season was likely ridiculed. While Judge’s current average (and median) salary in FanGraphs points leagues is $50, his value won’t be realized until the end of this season. Yes, the person who bid $70 for Aaron Judge this time last year came out on top, but did they cut him and send him back to re-draft this year? He’s valued by the auction calculator at $68.80 and that doesn’t include inflation.

So, who do you believe? Where do you look? How many of you have already stopped reading this and are currently typing in the comments section about how your valuation is the valuation? To be honest, I don’t know. What I do know is the auction calculator provides a good jumping-off point. If you really want to get a close estimate of what you should pay in an auction, you can plug in all the keepers in your league and it will help with controlling for inflation. I chose not to go through that process and instead added roughly 20% to a player’s auction calculator value to get a “Predicted Value”. Last week, our FanGraphs Staff II league conducted its yearly re-draft. Here are the 10 players furthest from that prediction in the negative (Underpays) and the 10 players furthest from that prediction in the positive (Overpays):

Player Salary Underpays
Name Auction Calc Value Predicted Value Winning Bid Diff
Christian Yelich $29.40 $35.29 $20.00 -$15.29
José Abreu $22.45 $26.94 $18.00 -$8.94
Mitch Haniger $15.64 $18.77 $10.00 -$8.77
Mike Yastrzemski $5.96 $7.15 $1.00 -$6.15
Paul Goldschmidt $37.41 $44.89 $39.00 -$5.89
Mike Trout $53.62 $64.35 $60.00 -$4.35
Javier Báez $6.09 $7.31 $3.00 -$4.31
Rhys Hoskins $21.09 $25.31 $21.00 -$4.31
Wil Myers $5.83 $6.99 $3.00 -$3.99
Austin Hays $3.86 $4.64 $1.00 -$3.64
*Auction calculator outputs used Steamer Projections
*Diff = Winning Bid – Predicted Value

Player Salary Overpays
Name Auction Calc Value Predicted Value Winning Bid Diff
Jesse Winker $2.60 $3.12 $23.00 $19.88
Ozzie Albies $8.14 $9.77 $27.00 $17.23
DJ LeMahieu $1.00 $1.00 $17.00 $16.00
Michael Brantley $1.00 $1.00 $16.00 $15.00
Trevor Story $1.00 $1.00 $12.00 $11.00
J.D. Martinez $1.00 $1.20 $12.00 $10.80
Giancarlo Stanton $17.66 $21.20 $31.00 $9.80
Dylan Carlson $1.00 $1.00 $10.00 $9.00
Cody Bellinger $1.00 $1.00 $9.00 $8.00
Spencer Steer $1.00 $1.00 $9.00 $8.00
*Auction calculator outputs used Steamer Projections
*Diff = Winning Bid – Predicted Value
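The two columns that do the work in these tables are easy to rebuild. Below is a minimal sketch, assuming the flat 20% bump described above and the footnoted definition of Diff. The function names are mine, and because the table displays rounded calculator values, a handful of rows can land a cent or two away from what this produces.

```python
INFLATION = 0.20  # the flat bump described in the post


def predicted_value(calc_value, inflation=INFLATION):
    """Inflate an auction calculator value by a flat percentage."""
    return round(calc_value * (1 + inflation), 2)


def bid_diff(winning_bid, calc_value):
    """Diff = Winning Bid - Predicted Value (negative means an underpay)."""
    return round(winning_bid - predicted_value(calc_value), 2)


# (name, auction calc value, winning bid) copied from the Underpays table.
rows = [
    ("José Abreu", 22.45, 18.00),
    ("Mitch Haniger", 15.64, 10.00),
    ("Paul Goldschmidt", 37.41, 39.00),
]

for name, calc, bid in rows:
    print(f"{name:<18} predicted=${predicted_value(calc):.2f} diff={bid_diff(bid, calc):+.2f}")
```

If you plug in your league's keepers and get a real inflation rate, only the `INFLATION` constant needs to change.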

There will always be a few players whose final salary is way out of line with what we would expect. It happens in every draft and someone undoubtedly tweets about it looking for crowd-sourced justification that the price is out of line. But only showing you the highest and lowest differentials doesn’t do the predictions analysis justice. The graph below shows my average predictions versus the average actual winning bid (salary) by decile. The decile rank was created using the predicted value. Keep in mind that this is a re-draft, so the player pool is limited to the non-keeper, free agents.

Actuals vs. Preds, Salary

There are huge discrepancies from decile four onward because my predictions didn’t make negative players worth $1, but instead kept their negative price and, even worse, added negative value in my inflation calculation. This is not the way to do it but had I converted all these players to $1 the graph wouldn’t become any more informative because all the blue bars would average to $1. The way it is constructed now at least gives you a sense of how unlikely players in lower deciles are to actually return a positive value. With that explanation out of the way, we can see that deciles five, six, and nine have larger standard deviations in actuals, meaning drafters were willing to pay well over $1 in many cases. If this informs us of anything, it’s that in most leagues, a $1 player is difficult to define/value and it remains an area with huge potential for those who can come close to doing so.
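The decile comparison can be sketched in a few lines. This assumes decile 1 holds the highest predicted values (the post only says the ranking was built from the predicted value), and the sample pairs are made up purely to show how negative predictions drag the predicted average below $1 while actual bids bottom out at $1.

```python
from statistics import mean


def decile_averages(players, n_bins=10):
    """Average predicted value and winning bid within each predicted-value decile.

    players: list of (predicted_value, winning_bid) pairs.
    Assumption: decile 1 is the group with the highest predicted values.
    """
    ranked = sorted(players, key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    rows = []
    for d in range(n_bins):
        group = ranked[d * n // n_bins:(d + 1) * n // n_bins]
        if group:
            rows.append((
                d + 1,
                mean(pred for pred, _ in group),
                mean(bid for _, bid in group),
            ))
    return rows


# Hypothetical data, not the real draft: predictions run from $40 down to -$17,
# while nobody can actually be bought for less than $1.
sample = [(40 - 3 * i, max(1, 40 - 3 * i)) for i in range(20)]
decile_rows = decile_averages(sample)

for decile, avg_pred, avg_bid in decile_rows:
    print(f"decile {decile:>2}: predicted={avg_pred:7.2f} actual={avg_bid:6.2f}")
```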

Lastly, here are the mean squared error measurements of each valuation from this draft, and this time I’ve made all negative-value players worth $1. This is because, in reality, if my spreadsheet said player X was valued at -$13 but I disagreed, I would only bid $1.

Mean Squared Error (MSE):

AVERAGE((Actual Salary – Auction Calculated Value) ^ 2) = 34.1

AVERAGE((Actual Salary – Adjusted or Predicted Value) ^ 2) = 32.0

Remember that the Adjusted/Predicted Value, shown as the blue bars in the graph above, was simply the auction calculator value adjusted for inflation by adding 20%. Overall, these adjustments proved more accurate by MSE.
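The spreadsheet formula above translates directly to code. This is a minimal sketch with made-up numbers, so it will not reproduce the 34.1 and 32.0 figures, which come from the full draft. The `floor_at_one` helper is my name for the step of making every negative-value player worth $1 before measuring error.

```python
def floor_at_one(values):
    """Treat any player valued below $1 as a $1 player."""
    return [max(v, 1.0) for v in values]


def mse(actual, predicted):
    """Mean squared error: AVERAGE((actual - predicted) ^ 2)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must be the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical three-player draft, not the real data.
winning_bids = [20.0, 1.0, 12.0]
raw_values = [29.4, -13.0, 1.0]  # the -13.0 is floored to 1.0 below

error = mse(winning_bids, floor_at_one(raw_values))
print(f"MSE = {error:.2f}")
```

Running it once with the calculator values and once with the inflated values, over the same list of winning bids, gives the two numbers being compared.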

Adjusting for inflation, in this case, was a good move, but in the grand scheme of things it didn’t really make individual predictions that much better. Just look at Jesse Winker in the “Overpay” table above. Winker’s low game/PA projection places him just above replacement level with 590.6 projected points, but at 5.2 points per game. In other words, when you play him, he’ll be better than the average player but it’s unclear just how much he will play. If you take the under on playing time you’ll have a really valuable player that you can plug in and out of your lineup. If you take the over, that points per game number will start to go down, possibly finding him somewhere in-between those two points measurements with a higher total points but a lower points per game, making him, in my opinion, a much different player. Have I written myself into circles? Maybe. But, it’s all related to the fact that trying to find accurate valuation for players in Ottoneu leagues is a challenge. The key here is that all of these values, total points, points per game, adjusted for inflation values, are all based on projections and projections, by nature, are inaccurate.

Will we ever get closer to true auction values? Probably not, and that’s what makes it so much fun. Is $10 a good price for Mitch Haniger? I think so. I’m the one who paid it. By my calculation I saved myself $8. But only the San Francisco Giants outfield and Haniger’s ability to stay there will tell me whether I’m right or wrong.


Launch Angles, Release Points and Hit Predictability

Through games played on June 23rd, 2022, Luis Arraez held the highest batting average in MLB at .349. He was just ahead of Paul Goldschmidt (.340), who was in the midst of putting together a career year, and Xander Bogaerts (.335), who was just being Xander Bogaerts. So, if you had chosen a player that you thought was most likely to get a hit the following day, June 24th, any of those three players would have been a safe bet. But, it’s just not that simple, is it? Goldschmidt played the next day but went 0-4. Arraez played and went 0-4. Bogaerts didn’t play. And that really is the challenge in trying to predict something like who will get a hit each day. That’s why there remains a $5.6 million jackpot on the line.

I’ve written about my ventures in using analytics and a predictive model (Jolt) to help with daily batter hit predictions while playing in the Beat the Streak contest. You can learn more about the contest here, you can listen to a podcast about it during the season and you can sign up to play the game yourself! I won’t write much about the specifics of the contest, but it is the motivation for this research. The general idea is that you choose a player each day that you think will get a hit. If he does, you get a point; if he doesn’t, you go back down to zero. The goal is to reach 56. But, strip away the millions, strip away any contest or fantasy-style game, and what we’re left with is the question of how to best predict the next day’s hitters.

Jolt, the name of the model I’ve built to aid in making this prediction and a tribute to “Joltin’ Joe” DiMaggio, was built on the concept that the launch angle and launch speed of the hitter matter tremendously. Since we know that certain launch angles are more likely to lead to a hit and that balls hit hard also add to that likelihood, we can look for players who do that type of thing often. To show this in a visualization, I randomly sampled a few days’ worth of Baseball Savant batted-ball data for each month of the 2022 season, subset that data down to only batted balls from four-seam fastballs, and looked at the distribution of hits versus non-hits:

Launch Angle Distributions - Hits vs. Non-Hits

In this sample of data, batted balls (this does include home runs) ended in hits much more often with launch angles between, roughly, 10 and 20 degrees, and that is something we have known for a while now. Balls launched at these angles are much more likely to be line drives and are therefore more difficult for fielders to get to. Let’s use this information and go back to June 24th. Through games played on the 23rd, there were eight qualified hitters whose average launch angle sat right in that solid 18-degree bin. Here they are, along with their batting averages up to that point and their June 24th results:
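The hits-versus-non-hits comparison by launch angle boils down to a hit rate per angle bin. Here is a minimal sketch, assuming each batted ball has already been reduced to a launch angle and a hit/no-hit flag. The 10-degree bin width and the sample records are my own illustrative choices, not the data behind the histogram.

```python
from collections import defaultdict


def hit_rate_by_launch_angle(batted_balls, bin_width=10):
    """Return {bin_start_degrees: share of batted balls that were hits}.

    batted_balls: iterable of (launch_angle_degrees, is_hit) pairs.
    A ball at 18 degrees lands in the bin starting at 10; one at -5 lands in -10.
    """
    counts = defaultdict(lambda: [0, 0])  # bin_start -> [hits, total]
    for angle, is_hit in batted_balls:
        bin_start = int(angle // bin_width) * bin_width
        counts[bin_start][0] += int(is_hit)
        counts[bin_start][1] += 1
    return {b: hits / total for b, (hits, total) in sorted(counts.items())}


# Hypothetical batted balls, for illustration only.
sample = [(12, True), (15, True), (18, False), (-5, False),
          (45, False), (3, True), (8, False)]
rates = hit_rate_by_launch_angle(sample)

for bin_start, rate in rates.items():
    print(f"{bin_start:>4} to {bin_start + 10:>3} degrees: {rate:.0%} hits")
```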

Hit Results 6/24/22, Mid Average LA
Name LA AVG 6/24/22
Mike Yastrzemski 18.8 0.250 1-4
Will Smith 18.8 0.256 2-5
Justin Turner 18.8 0.220 0-4
Ha-Seong Kim 18.8 0.226 1-3
Christian Walker 18.5 0.208 1-4
Cedric Mullins II 18.4 0.248 1-4
Marcus Semien 18.2 0.228 2-5
Mookie Betts 18.1 0.273 0-0
*Among qualified hitters with an 18-degree average launch angle through 6/23/22

Yahtzee! Is it really that easy? Just pick the hitter who has been to the plate a lot and has a level, hit-falling average launch angle? You might think this is basically the same as selecting line-drive hitters, but it’s not. At least, those two measurements aren’t showing the same hitters. Only Will Smith found himself in both the group of players above and in the top 20 qualified hitters by line-drive percentage. The funniest part about this sample of hitters is that the hitter with the highest batting average, Mookie Betts, is one of only two players to not get a hit the following day. This is random of course. But, just for kicks, let’s do it with hitters who had been putting the ball on the ground (5 degrees) too much through June 23rd and look at how they did on the 24th:

Hit Results 6/24/22, Low Average LA
Name LA AVG 6/24/22
Yandy Díaz 5.1 0.263 0-4
Nicky Lopez 5.1 0.217
Vladimir Guerrero Jr. 5.3 0.264 2-5
Miguel Cabrera 5.7 0.299 1-4
Juan Soto 5.8 0.214 1-4
*Among qualified hitters with a 5 degree average launch angle through 6/23/22

Ok, theory killed? Clearly you can see from the histograms that while a launch angle in the 10 to 20 range falls for a hit more often, there are a lot of other launch angles that fall for hits too. And balls hit at those same 10-to-20-degree angles don’t always fall for hits, either. While finding players who have a tight launch angle, Alex Chamberlain style, would be a good strategy, you would likely find yourself choosing the same handful of hitters every day, thus limiting your player pool. Just look at Paul Goldschmidt’s 2022 cumulative average launch angle:

Goldschmidt Cumulative LA

It becomes fairly stable around BBE number 200. Choosing Goldy every day would have been a good strategy in 2022, but for obvious reasons it wouldn’t allow you to string together 56 consecutive hits, and that’s the name of the game. It would also fail to take into consideration who was throwing the ball, which is arguably 50% (but probably more) of the equation. Jolt’s first iteration sought to find players with good launch angles matching good release points. The thinking was that high release points translate to steeper approach angles and that uppercut swings keep the bat in the zone longer on those particular pitches. It’s not a new concept in baseball. Ted Williams’ book The Science of Hitting, revised in 1986, detailed some of this thinking.
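The cumulative average launch angle in the Goldschmidt chart is just a running mean over batted-ball events (BBE) in chronological order. A minimal sketch, with a made-up list of launch angles standing in for a real season:

```python
from itertools import accumulate


def cumulative_average(launch_angles):
    """Running mean of launch angle after each batted-ball event."""
    running_totals = accumulate(launch_angles)
    return [total / n for n, total in enumerate(running_totals, start=1)]


# Hypothetical launch angles in the order the balls were put in play.
angles = [10, 20, 30, -4, 22]
running = cumulative_average(angles)
print([round(x, 1) for x in running])
```

Plotting `running` against the BBE count is what shows the line settling down as the sample grows.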

It was even backed up by the model’s validation. Jolt, iteration one, found ‘release_pos_z’, or the “vertical release position of the ball measured in feet from the catcher’s perspective” according to Baseball Savant, to be the fifth most important variable in predicting a hit out of all of Statcast’s outputs. Unfortunately, this is much, much more descriptive than it is predictive. A trained model will say that the launch angle of a batted ball and the release point of the pitch help “predict” whether the ball falls for a hit or not. While a release point can be predicted, especially if you isolate a single pitch type like the fastball, you can’t really predict the angle at which the ball will be hit before the batter swings. Here’s an example:

Release Pos Z vs. LA Scatter Plot

In this image, green represents hits in the data and red represents non-hits. The green band going across the chart tells us just how important the launch angle is, but it doesn’t have a relationship with the release point of the pitch. Any of these release points can match up with any of these launch angles and fall for a hit. It is uncommon for any launch angle above ~70 degrees to fall for a hit, but there may be a few outliers in there. Regardless, matching up a pitcher’s release point with a batter’s launch angle doesn’t seem to provide much detail when analyzing this data. Most of us would completely disagree with the data in this case, but it doesn’t mean that hitters are actively upper-cut swinging on high-released fastballs because I would imagine, that’s friggin’ impossible to do. But maybe it naturally happens? Maybe looking at it from a vertical approach angle (VAA), the angle of the ball as it crosses into the zone, is the better…um…approach? Let’s see:

LA vs VAA (FA)

This graph would tell you that besides the outliers, all VAAs can be hit with all LAs. Again, there is a HUGE discrepancy between what we computer baseball nerds see and read and think and what a hitter actually does. If there are any hitters out there who know that tomorrow’s starting pitcher has a very steep vertical approach angle, are they altering their swing or approach to match it? Um…I’ll guess…no. It’s hard enough for them to decide to swing or take. But there must be something I’m missing. The launch angle at which a ball that falls for a hit is struck may not necessarily relate to how high the pitcher is releasing the ball, but certainly, some swing types are better against those pitches than others. But what is measuring that? What is measuring the actual swing? There have been some attempts made, like the data collected by Swing Graphs, but nothing that I’ve seen is freely available to the public.
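For anyone who wants to build the VAA side of this chart themselves, here is a rough sketch of one commonly used way to estimate it from Statcast's pitch-level velocity and acceleration fields. This is an assumption on my part about the method, since nothing above spells out how VAA was calculated. It treats acceleration as constant, takes the measurement point as 50 feet from the plate, and uses the front edge of the plate at 17 inches. The inputs correspond to the `vy0`, `vz0`, `ay`, and `az` columns.

```python
import math

Y_START_FT = 50.0          # assumed distance at which velocity/acceleration are reported
Y_PLATE_FT = 17.0 / 12.0   # assumed front edge of home plate


def vertical_approach_angle(vy0, vz0, ay, az):
    """Estimate VAA in degrees (negative means the ball is descending).

    vy0, vz0: velocity toward the plate and vertically, in ft/s
              (vy0 is negative because the ball moves toward y = 0).
    ay, az:   acceleration in ft/s^2, assumed constant over the flight.
    """
    # Velocity toward the plate when the ball reaches the front edge.
    vy_f = -math.sqrt(vy0 ** 2 - 2.0 * ay * (Y_START_FT - Y_PLATE_FT))
    # Time taken to get there, then the vertical velocity at that moment.
    t = (vy_f - vy0) / ay
    vz_f = vz0 + az * t
    return -math.degrees(math.atan(vz_f / vy_f))


# Hypothetical fastball-like inputs, not a real pitch.
vaa = vertical_approach_angle(vy0=-135.0, vz0=-6.0, ay=28.0, az=-16.0)
print(f"VAA is roughly {vaa:.1f} degrees")
```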

A model, whether it has a good R-squared, mean squared error, misclassification rate, or get-a-lot-of-likes-on-Twitter rate, doesn’t do a good job of telling us whether a certain launch angle will be more successful against a certain vertical approach angle because it’s just too random and there’s too much noise. It’s also too difficult to create that data before it happens in order to make predictions on it. But, Jolt simply won’t quit. Iterations continue and there remains work to be done to better model tomorrow’s hit likelihood. In fact, MLB does it for its Beat the Streak app. But, no one has found success in just picking the top-recommended hitter each day, have they? Of course, I’m not implying that a model will be the only way to win this contest. In fact, I don’t think one single person will ever be able to win this contest. However, sometimes a simple model coupled with logical thinking and sound judgment is best. Jolt’s next attempt will focus on that. Just look at the table of June 24th’s outcomes at the top of this page and you’ll see, there’s something to just choosing who is hot each day and who works well against the guy standing on the mound.