Predicted 2019 NFBC ADP
Disclaimer: This is just for fun. I am, by no means, claiming that the predicted average draft positions (ADPs) described below will happen. Obviously! I'm no prophet. These predictions are, at best, educated guesses. In fact, they aren't even my predictions; they're yours. Or, well, they're not your predictions, either; they're my computer's predictions, made by fitting your drafting behavior to observed events.
That's a complicated way of saying: by using historical ADP data and end-of-season (EOS) values, we can model future ADP values. (xADP, if you will.) Namely, with 2018 EOS, 2018 ADP, and 2017 EOS, we can predict 2019 ADP and explain almost 60 percent of its variance (adjusted r² = 0.59).
Over the years, I’ve compiled seven years’ worth of ADP data from the National Fantasy Baseball Championship (NFBC) and EOS values courtesy of Razzball — almost 5,500 player-seasons, although for these purposes I’ve narrowed my n to 2,315.
The problem with ADP (and EOS) ranks is that they're perfectly linear, whereas the values they represent are not. For example, the difference in value between the 1st and 2nd picks in a draft is not the same as the difference in value between the 399th and 400th picks. To resolve this issue, I converted the linear ranks (1, 2, 3 … 99, 100, 101 … 398, 399, 400) into logarithmic dollar equivalents ($58, $52, $48 … $13.70, $13.60, $13.50 … $0.17, $0.15, $0.12). In order to do this, though, I needed to construct a hypothetical league in which these values would apply. Somewhat arbitrarily, I chose a 15-team league with 27-player rosters, such that the draft pool would be 405 players deep. I made this decision partly due to limitations with my older data. However, I'm not convinced that changing the league size (not dramatically, at least) would make much difference anyway.
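If you want to play along at home, a log curve like the one sketched below lands close to those figures. To be clear, this is a reconstruction, not my actual conversion code; the $58 top value and the exact decay shape are assumptions.

```python
import numpy as np

# Sketch: map linear draft ranks (1..405) onto a logarithmically
# decaying dollar scale. The top value ($58) and the log shape are
# illustrative assumptions, chosen because they land near the
# figures quoted above.
POOL_SIZE = 15 * 27  # 15 teams x 27 roster spots = 405 drafted players

ranks = np.arange(1, POOL_SIZE + 1)
top_value = 58.0
dollars = top_value * (1 - np.log(ranks) / np.log(POOL_SIZE + 1))

print(dollars[:3])   # ~[58.0, 51.3, 47.4]: steep drop-off up top
print(dollars[-3:])  # pennies at the back of the pool
```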
(Here’s some messy math. Feel free to skip to the next paragraph if it’s not your jam.)
Because the data effectively operate as a panel (hundreds of players whose actual and perceived values are observed over time), and because we cannot dissociate ourselves from our knowledge, perceptions, and biases of these players, there exists serial correlation among the values. This correlation (the influence of 2017 EOS on 2018 ADP, and so on) artificially inflates the fit of the model. A standard linear regression of this same data produces an adjusted r² of 0.85, which is incredible but misleading. To correct for this, I specified a first-differences regression, which subtracts the previous period's value from the current period's (for example, 2018 ADP minus 2017 EOS).
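In code, the first-differences fit looks something like this sketch. The file name and column names are placeholders rather than my actual dataset, but the structure of the regression is the same:

```python
import pandas as pd
import statsmodels.api as sm

# Sketch of the first-differences fit. "nfbc_panel.csv" and the column
# names are placeholders: each row is one player-season, with ADP and
# EOS already converted to dollar values.
df = pd.read_csv("nfbc_panel.csv")

# Differencing away the serial correlation: the response is the change
# from last year's EOS value to this year's ADP...
y = df["adp_t"] - df["eos_t1"]  # e.g., 2018 ADP minus 2017 EOS

# ...regressed on the two lagged gaps.
X = pd.DataFrame({
    "perf_gap": df["eos_t1"] - df["adp_t1"],  # how far the player out- or under-earned his draft price
    "hype_gap": df["adp_t1"] - df["eos_t2"],  # how far the market moved off the season before that
})

fit = sm.OLS(y, sm.add_constant(X)).fit()
print(fit.rsquared_adj)  # ~0.59 on the data described above
```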
(OK, stop skipping.)
Here’s the final model:
ADP_t = α[EOS_{t−1} − ADP_{t−1}] + β[ADP_{t−1} − EOS_{t−2}] + (ε_t − ε_{t−1}) + EOS_{t−1}
(…maybe you could’ve skipped that part, too.)
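For what it's worth, once α and β are estimated, turning the equation into a point forecast is simple: the error terms drop out and everything else is arithmetic. The coefficient values below are placeholders, since I'm not reporting the fitted ones here.

```python
# Placeholder coefficients; these are NOT the fitted values.
ALPHA, BETA = 0.5, 0.3

def predict_adp(eos_t1: float, adp_t1: float, eos_t2: float) -> float:
    """Point forecast of year-t ADP (in dollars) from the two prior seasons."""
    return (ALPHA * (eos_t1 - adp_t1)   # performance vs. last year's price
            + BETA * (adp_t1 - eos_t2)  # last year's price vs. the season before
            + eos_t1)                   # anchored to last year's EOS value

# A player who earned $30 in 2018 after costing $20, having earned $15
# in 2017, projects above his 2018 price for 2019:
print(predict_adp(eos_t1=30.0, adp_t1=20.0, eos_t2=15.0))  # 36.5 with these placeholders
```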
I can tell you, without even really looking at the numbers, that the model will probably struggle in some areas. Obviously, it won’t know if a player has retired. It’s unaware of seasons truncated by injury or midseason debut. It has no concept of prospect hype and will absolutely whiff on Vladimir Guerrero Jr., who has yet to debut but will be the prospect gem with tons of helium in 2019 fantasy drafts. It is blissfully unaware of the out-of-control Adalberto Mondesi hype train. You can yell at me all you want about how stupid that is, but you’d be yelling at a computer that can’t think for itself.
It'll be months before we see this work bear fruit, but we can at least compare these predictions to the results of the Too Early Mock Drafts that ran in September, the data for which were compiled by the enigmatic Smada.
If there are any glaring omissions, it's probably because the player made an impact in 2018 despite being drafted outside the top 405 players. Let me know who's missing and I can give you his predicted ADP value!
I’m also curious to hear your thoughts — on the first round, on perceived accuracy or inaccuracy, everything. My computer wants your feedback. At the end of the day, though, this isn’t anything more than a stupid data trick. It’ll be fun (for me, at least) to see how close this gets.
The only glaring omission seems to be Juan Soto, who I imagine wasn't drafted all that much. I'm interested in where he places; I imagine not too far from Acuna, in the low 50s.
I'd also guess you'll see some differences in how Kris Bryant is treated in real drafts vs. this model, since injuries played a big role for him this year.
Ah! Soto. Yes, you're right: the model missed him because, I believe, he wasn't drafted at all in the NFBC in 2018. And because he was effectively negative value in 2017 EOS (didn't play) and negative value in 2018 ADP (wasn't drafted), the computer perceives his breakout as a fluke: 317th overall. No way that's happening.
Re: Bryant: right. Same applies to Daniel Murphy, probably; no way he goes at the back end of the top-200. Injured guys will fall through the cracks here.
I think both Soto and Acuna are likely to slot a bit higher in real drafts than where Acuna landed on the list above. Bryant will definitely end up much higher.
This is kinda fun, but considering the limitations of the model acknowledged by the author, it's probably not gonna end up being useful in most practical cases. Better to use actual ADP data, adjusted for league context and the latest news reports, as you approach your drafts next preseason.
As a first step, this research is amazing for those of us in Dynasty and Keeper leagues. Even if the projections start out flawed, they might get better every year, to the point where they can be a real asset in an owner's portfolio.
Thanks! I was hoping someone would say they’d use it for that. I plan to use the results to help guide some of my offseason trade ventures, but we’ll see if it bears fruit.