Over the past year or two, many people have noted that the Statcast numbers in Comerica Park seem inflated compared to the real-world results for the batted balls. From cursory research it appeared that this trend may be focused in right field, but perhaps not. Either way, you see Miguel Cabrera topping the xwOBA leaderboards on Baseball Savant, and xOBA leaderboards on xStats, while his true results are lagging behind.
On one hand, that may appear to be a positive sign for a possible recovery, given his down season in 2017. On the other hand, it shouldn’t sit that well. His xwOBA is about 60 points higher than his real wOBA. The xStats fare a little better, with a difference closer to 40 points. Either way, that is an absolutely enormous discrepancy.
There have been some hints that this effect may be focused in right field, which I noticed with Nick Castellanos. When I tell you something weird is going on in right field of Comerica Park, you might instinctively pin the blame on that ludicrously deep right center field fence. I certainly did. That fence is about 430 feet deep and 11 feet high. It is one of the deepest areas in any MLB ballpark, rivaling AT&T Park.
But I no longer believe that is the explanation. More on that in a moment. First, I want to explain my methods.
As I have written about in a number of posts recently, I’ve been working on a physics model to estimate the trajectory of every batted ball. It isn’t totally finished yet, but I work on it a little every week and I try to share the results as I move along. The model works by picking a target height, which is the height of the wall along the batted ball’s angle in the origin stadium. I then follow the trajectory of the ball until it falls beneath that target height, and spit out the distance from home plate at that point.
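To make the "follow the trajectory until it falls beneath the target height" step concrete, here is a minimal sketch of that idea. It is not my actual model: the function name `distance_at_height`, the constant drag coefficient, the lift formula, the 3-foot contact height, and the air density are all illustrative assumptions, and it ignores wind and weather entirely.

```python
import math

def distance_at_height(exit_speed_mph, launch_angle_deg, target_height_ft,
                       spin_rpm=2500.0, dt=0.001):
    """Distance from home plate (ft) where the ball drops below target_height_ft.

    Hypothetical helper. All physical constants are rough assumptions.
    """
    mass = 0.145                      # kg, approximate baseball mass
    radius = 0.0366                   # m, approximate baseball radius
    area = math.pi * radius ** 2      # m^2, cross-sectional area
    rho = 1.2                         # kg/m^3, assumed air density (no weather)
    cd = 0.35                         # assumed constant drag coefficient
    g = 9.81                          # m/s^2
    omega = spin_rpm * 2.0 * math.pi / 60.0   # backspin in rad/s

    speed = exit_speed_mph * 0.44704  # mph -> m/s
    theta = math.radians(launch_angle_deg)
    vx, vy = speed * math.cos(theta), speed * math.sin(theta)
    x, y = 0.0, 0.9                   # assumed contact height of about 3 ft
    target = target_height_ft * 0.3048
    was_above = y >= target
    t = 0.0
    while t < 20.0:                   # safety cap on flight time
        v = max(math.hypot(vx, vy), 1e-9)
        s = radius * omega / v        # dimensionless spin factor
        cl = s / (0.4 + 2.32 * s)     # assumed saturating lift coefficient
        k = 0.5 * rho * area * v / mass
        # Drag opposes velocity; backspin lift acts perpendicular to it.
        ax = -k * cd * vx - k * cl * vy
        ay = -k * cd * vy + k * cl * vx - g
        vx += ax * dt
        vy += ay * dt
        x += vx * dt
        y += vy * dt
        t += dt
        if y >= target:
            was_above = True
        elif was_above or y <= 0.0:
            break                     # fell below the target, or hit the ground
    return x / 0.3048                 # m -> ft
```

Calling `distance_at_height(105, 28, 11)` would give the distance at which a 105 mph, 28 degree fly ball drops below an 11-foot wall height under these assumptions. A ball that never reaches the target height is simply followed to the ground.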
Next, I compare that calculated distance with the actual distance to the wall. I subtract the true wall distance from the calculated distance and call that the Delta. I grouped these Deltas into 5-foot buckets (-15, -10, -5, 0, 5, etc.) and found the home run probability for each. I have plotted those results for Comerica Park in the chart below. I then fit a curve, which you can also see in that chart. I am using this curve to calculate the home run probability.
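The Delta and bucketing step can be sketched as follows. The function names are hypothetical, and rounding each Delta to the nearest multiple of 5 is an assumption about how the buckets are formed.

```python
import math
from collections import defaultdict

def bucket_delta(calculated_ft, wall_ft, width=5):
    """Delta = calculated distance minus wall distance, rounded to a bucket."""
    delta = calculated_ft - wall_ft
    # Round half up to the nearest multiple of `width` (assumed convention).
    return int(math.floor(delta / width + 0.5)) * width

def home_run_rate_by_bucket(balls, width=5):
    """balls: iterable of (calculated_ft, wall_ft, is_home_run) tuples."""
    counts = defaultdict(lambda: [0, 0])   # bucket -> [home runs, total balls]
    for calculated_ft, wall_ft, is_home_run in balls:
        bucket = bucket_delta(calculated_ft, wall_ft, width)
        counts[bucket][0] += int(is_home_run)
        counts[bucket][1] += 1
    return {b: hr / n for b, (hr, n) in sorted(counts.items())}
```

Each bucket's rate is just home runs divided by batted balls in that bucket, which is what the curve is then fit against.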
I’ve since realized .07 and 1.55 make a better fit for this curve. The difference isn’t that great, though.
Okay, let’s take a step back. You may be wondering why I need this sort of logistic regression to find the home run probability. If the trajectory says the ball goes over the wall, it is a homer, right?
Not really. My physics model is making a few assumptions. First, it assumes every fly ball has 2500 RPM backspin. Variations in this spin rate can add or subtract a number of feet, and change the results dramatically. Next, there might be variation from one ball to the next in terms of drag, which might change the behavior of the ball by quite a bit. Then there is the issue of temperature, humidity, and pressure, for which the calculations use only rough estimates. And then you have the issue of wind, which is totally ignored by the model.
So, there are a number of variables that are not being accounted for by the model itself which must be accounted for in some other manner. This sigmoid curve is an attempt to correct for some of this variation.
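For illustration, a sigmoid of this kind might look like the sketch below. The exact functional form of my curve, and how the two constants quoted earlier enter it, is not spelled out in this post, so the parameter names `slope` and `offset_ft` and the values in the usage note are placeholders rather than the fitted numbers.

```python
import math

def home_run_probability(delta_ft, slope, offset_ft):
    """Logistic curve mapping Delta (ft) to a home run probability.

    slope     : how quickly the probability rises per foot of Delta (assumed name)
    offset_ft : Delta at which the probability is exactly 50% (assumed name)
    """
    return 1.0 / (1.0 + math.exp(-slope * (delta_ft - offset_ft)))
```

With placeholder values such as `slope=0.1` and `offset_ft=5.0`, a ball that clears the wall by exactly 5 feet in the model gets a 50% chance, and the probability rises smoothly toward 1 as the Delta grows. The soft edge is what absorbs the unmodeled spin, drag, weather, and wind.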
Okay, back to Miguel Cabrera. I have plotted Cabrera’s spray chart using the distances calculated by my model, and I created a Viz to show this data, which you can see here. To be clear, these points represent the area in the field where the ball dropped below the height of the fence. These are not where the ball landed.
This is all of Miggy’s data in Comerica Park since 2015, excluding the balls that weren’t tracked by Trackman. They are color coded by my estimated home run probability, and each shape represents the real game outcome.
There is something weird about this chart, isn’t there? In left field you have a few balls that go well beyond the wall, but are not home runs. In right field… well, let’s clean up some of the clutter and look more closely at right field.
These are all of the balls with at least a 1% chance to leave the yard, according to my estimates. Notice how many outs there are in right field that are awarded a high home run probability. I count 23 that go well beyond the wall and an additional 9 along the wall, in addition to four doubles and a single that are all beyond the wall.
At the end of the day, these trajectories represent the average batted ball distance, and there should be a normal distribution of true distances around this mean. So you’d expect about half of the balls to land shorter and half longer, all things being equal. The results of the sigmoid curve hint at a different result: the ‘average’ distance of my model may in fact be too high (perhaps 2500 rpm for a flyball is too high). Either way, Cabrera’s results in this park might require more than random variation to explain. Only 22 of the 93 balls pictured ended up being home runs. The model estimated there would be 35 home runs. Another big discrepancy.
That is a bit weird. So let’s look at all batters in Comerica. I set this to a minimum home run probability of 15%, and eliminated all actual home runs to clean up the image and make it easier to see.
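As a rough check on "more than random variation," here is a back-of-the-envelope sketch using only the three numbers above. It assumes each ball is an independent event and that the per-ball probabilities sum to the (rounded) 35 expected home runs. For a fixed expected total, the spread of the home run count is largest when every ball has the same probability, so this gives a conservative, smallest-case estimate of how unusual the shortfall is. The function name is hypothetical.

```python
import math

def shortfall_in_sd(n_balls, actual_hr, expected_hr):
    """Conservative z-score: (actual - expected) / largest possible SD."""
    p_bar = expected_hr / n_balls
    # Variance of a sum of independent 0/1 outcomes with a fixed mean is
    # maximized when every probability equals p_bar (binomial case).
    max_sd = math.sqrt(n_balls * p_bar * (1.0 - p_bar))
    return (actual_hr - expected_hr) / max_sd

z = shortfall_in_sd(93, 22, 35)   # about -2.8
```

Even in that most forgiving case, 22 home runs sits close to three standard deviations below the estimate, and the real spread of probabilities would only make the gap look larger.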
Judging from this alone, there appears to be something going on in right field in Comerica Park. It is possible that my model is wrong, but it is doing a pretty good job overall at estimating home run rates. I have made another viz showing that off, which you can view here.
Considering the totality of the evidence currently at my disposal, including all of the weird Statcast stats we’ve seen coming out of Comerica Park, I believe the simplest explanation is wind. Which is terrible, because wind might be the most difficult variable to account for.
This leaves us in a weird place, though. If Tigers players like Miguel Cabrera are having their stats artificially suppressed by wind, what does that mean for their future value? I would imagine it means they would hold more value if they changed teams, which is something we witnessed with J.D. Martinez at the trade deadline. But what about Miguel Cabrera? He has a long term deal with the Tigers that makes him practically untradable.
You shouldn’t look at his expected results, either xStats or Savant, and assume Miggy will bounce back in 2018. His back injury is a big deal, for one, but this stadium issue isn’t going away, either. Cutting the ball through right field requires hitting the ball about 98 or 99 mph, judging by the results I see. You can play around with the viz to see for yourself. Miggy has been hitting steadily fewer high velocity batted balls. It is possible that he has crossed some critical point where he is no longer hitting enough of his balls in play hard enough to make up for this environmental challenge. And it is possible his bat speed may continue to drop, given his injury concerns.
| Year | Above 95 mph | Above 98 mph | Above 100 mph |
The players leaving Comerica Park, though? That is something you should be happy to see. Take a look at Ian Kinsler’s right field spray chart.
Angel Stadium is slightly deeper and has taller walls, but perhaps more favorable hitting conditions and a little luck could turn those outs into hits. Either way, he can only go up from here: he’s 0 for 11 with these balls in Comerica.
Long story short, the various types of expected stats in Comerica Park appear to be failing to take into account some environmental variable. This variable appears to be influencing right field to a far greater extent than left field. I believe it could be wind, but it could be something else.