Taking a Look at Changes in Contact Rate
Back in 2009, Eric Seidman wrote a piece here that looked into when samples become reliable for certain statistics. The piece was based on work done by Pizza Cutter. You can read the piece here for a full explanation of how the conclusions were reached, but below is a list showing how many PA it takes for each statistic to become reliable.
50 PA: Swing %
100 PA: Contact Rate
150 PA: Strikeout Rate, Line Drive Rate, Pitches/PA
200 PA: Walk Rate, Groundball Rate, GB/FB
250 PA: Flyball Rate
300 PA: Home Run Rate, HR/FB
500 PA: OBP, SLG, OPS, 1B Rate, Popup Rate
550 PA: ISO
Prior to yesterday’s games, the cut off for qualified hitters in 2013 was 99 PA. As a result, we can now look at 2013 Contact% and compare it to career rates to see who is making contact at a much higher or much lower rate so far this season.
The average gap between 2013 Contact% and career Contact% was -0.22 percentage points. The standard deviation was 3.41 percentage points. No players were two standard deviations or more above the mean, but six were two standard deviations or more below the mean. They are listed below.
Name | 2013 Contact% | Career Contact% | Contact% Gap
Pedro Alvarez | 62.50% | 69.90% | -7.40%
Albert Pujols | 78.30% | 85.80% | -7.50%
Dan Uggla | 64.30% | 72.70% | -8.40%
Jeff Francoeur | 69.40% | 77.80% | -8.40%
Jason Castro | 73.50% | 82.20% | -8.70%
Colby Rasmus | 59.20% | 75.90% | -16.70%
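For anyone who wants to check the two-standard-deviation cutoff, here is a minimal sketch. It uses only the mean (-0.22) and standard deviation (3.41) quoted above plus the six rows in the table. The function names `z_score` and `flag_outliers` are made up for illustration, and the full 2013 player pool is not reproduced here.

```python
MEAN_GAP = -0.22  # average gap from the article, in percentage points
SD_GAP = 3.41     # standard deviation from the article

# (2013 Contact%, career Contact%) for the six hitters listed in the table
players = {
    "Pedro Alvarez": (62.5, 69.9),
    "Albert Pujols": (78.3, 85.8),
    "Dan Uggla": (64.3, 72.7),
    "Jeff Francoeur": (69.4, 77.8),
    "Jason Castro": (73.5, 82.2),
    "Colby Rasmus": (59.2, 75.9),
}


def z_score(gap, mean=MEAN_GAP, sd=SD_GAP):
    """How many standard deviations a gap sits from the average gap."""
    return (gap - mean) / sd


def flag_outliers(rates, n_sd=2.0):
    """Return {name: (gap, z)} for hitters at least n_sd SDs from the mean."""
    flagged = {}
    for name, (season, career) in rates.items():
        gap = round(season - career, 1)
        z = z_score(gap)
        if abs(z) >= n_sd:
            flagged[name] = (gap, round(z, 2))
    return flagged


lower_cutoff = MEAN_GAP - 2 * SD_GAP  # gaps at or below this are flagged
upper_cutoff = MEAN_GAP + 2 * SD_GAP  # gaps at or above this would be flagged

for name, (gap, z) in flag_outliers(players).items():
    print(f"{name}: gap {gap:+.1f}, z = {z:+.2f}")
```

The lower cutoff works out to about -7.04 points, which is why Alvarez at -7.4 just makes the list.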
This is kind of a big deal because there is a strong correlation between Contact% and K%. I did a quick regression test using the 179 hitters who had 100+ PA last year, and got an r-squared of 0.8192. Not that the relationship of those two things wasn’t obvious beforehand.
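If you want to run the same kind of check yourself, the r-squared of a simple one-variable regression is just the squared correlation between the two columns. Below is a rough sketch in plain Python. The function name `r_squared` is my own, and the five data points are made-up toy values chosen to be perfectly linear, not the 179-hitter sample used above.

```python
def r_squared(x, y):
    """Squared Pearson correlation, which equals R^2 for a simple linear fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Toy values only (NOT real player data): K% falls one point for
# every point of Contact%, so the fit is perfect.
contact_pct = [70, 75, 80, 85, 90]
k_pct = [30, 25, 20, 15, 10]

print(r_squared(contact_pct, k_pct))
```

With real Contact% and K% columns swapped in for the toy lists, the same function would give the figure to compare against the 0.8192 quoted above.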
Aside from Pujols, all of these guys were high strikeout hitters to begin with. But this is further evidence that Pujols is in full-on decline mode. When I calculated average and standard deviation for the gap between 2013 Swing% and career Swing%, Pujols was one of only four players whose gap in Swing% was more than two standard deviations above the mean. In other words, not only is Pujols’ contact rate down 7.5 percentage points, his swing rate is also up 5.8 points. Not a good combination.
There were two other notable names that showed up fairly high on the “making less contact” list, slow starters B.J. Upton and Jay Bruce. Their contact percentages are down 6.5% and 4.1%, respectively. Obviously, their strikeout rates are way up, and their slash lines don’t look like they have in the past.
Of the two, you should be less worried about B.J. He’s walking more than he did last year, and his BABIP is a minuscule .209.
On the other hand, Bruce should be a serious concern. His average is right around .250 like it always is, but he has needed a .362 BABIP to keep it there. When the luck goes away and he’s left with significantly worse contact skills, the average may fall off the map. It would be tough to sell high since the power hasn’t been there and because he hasn’t even been a top-70 outfielder according to ESPN’s player rater. But if there is someone out there hoping for a rebound and still willing to pay 75 cents on the dollar, take it.
At the top of the “making more contact” list, you unsurprisingly find the names of some of the biggest surprises of the season. Nate McLouth and Matt Carpenter have the largest positive gaps between their 2013 and career contact rates at 6.3% and 6%, respectively. McLouth’s plate discipline numbers are super impressive as he has a 14% BB% and a 9.1% K%. Thanks to the increase in contact, McLouth is currently a top five outfielder per the player rater, and Carpenter has been a top ten option at both second and third base.
Both guys are also buy-high candidates. Neither one is relying on a completely unsustainable BABIP. They’re both a little above average at .315 and .316, but they aren’t going to get hit too hard by regression. If someone added them off the wire and is looking to cash in and sell high before they regress, take them up on that offer.
Mark Reynolds also shows up in 8th on the list with a 4.7% increase in Contact%. As a result, his K% is 7.3% lower than his career average. That has led to a .280 batting average (.290 BABIP) for Mark freaking Reynolds. You can’t predict ball.
Below is a list of those with a gap between their 2013 and career contact rates that is more than one standard deviation above the mean.
Name | 2013 Contact% | Career Contact% | Contact% Gap
Nate McLouth | 92.00% | 85.70% | 6.30%
Matt Carpenter | 91.40% | 85.40% | 6.00%
Torii Hunter | 82.60% | 76.80% | 5.80%
Chris Davis | 74.40% | 68.80% | 5.60%
Trevor Plouffe | 84.40% | 79.10% | 5.30%
Alfonso Soriano | 79.30% | 74.10% | 5.20%
Mark Reynolds | 68.90% | 64.20% | 4.70%
Lorenzo Cain | 84.40% | 79.70% | 4.70%
Josh Rutledge | 84.20% | 79.50% | 4.70%
Donovan Solano | 89.30% | 84.70% | 4.60%
Jonathan Lucroy | 90.40% | 85.80% | 4.60%
Angel Pagan | 92.50% | 87.90% | 4.60%
James Loney | 92.40% | 87.90% | 4.50%
Jed Lowrie | 88.60% | 84.20% | 4.40%
Josh Donaldson | 82.40% | 78.00% | 4.40%
Joey Votto | 82.50% | 78.20% | 4.30%
Starling Marte | 79.60% | 75.40% | 4.20%
Adrian Beltre | 84.30% | 80.10% | 4.20%
Manny Machado | 83.10% | 78.90% | 4.20%
Ben Zobrist | 87.70% | 83.70% | 4.00%
Russell Martin | 86.30% | 82.30% | 4.00%
Shin-Soo Choo | 80.20% | 76.20% | 4.00%
Jayson Werth | 82.10% | 78.10% | 4.00%
Matt Holliday | 82.40% | 78.40% | 4.00%
Ruben Tejada | 89.70% | 85.90% | 3.80%
Miguel Cabrera | 82.80% | 79.20% | 3.60%
Greg Dobbs | 85.50% | 82.00% | 3.50%
Austin Jackson | 82.90% | 79.40% | 3.50%
Norichika Aoki | 92.40% | 89.00% | 3.40%
John Buck | 78.20% | 74.80% | 3.40%
Justin Morneau | 84.00% | 80.80% | 3.20%
You can find more of Brett's work on TheFantasyFix.com or follow him on Twitter @TheRealTAL.
What does it mean that a statistic has “become reliable”? Does it mean we should expect the contact rate of these hitters to remain about where it is for the rest of the season, or should we expect those whose contact rate is higher than their career average to come back down, and vice versa?
I guess the idea is that it means we shouldn’t necessarily expect regression. Like if a guy has a .400 BABIP through 100 PA, we expect that to regress to somewhere close to his average. But if it’s “reliable” it means we’re not really talking about an anomaly and could expect to continue to see the new rate.
No, Brett, I don’t think that’s a good explanation. I think Pizza Cutter’s work was to show the point at which a sample’s results explained 50% of subsequent variation. Even if that’s not phrased exactly right, I’m positive that there’s still a ton of regression left to do at the “stabilization” point.
That you’d expect the stat for the rest of the season to be closer to what it is now than what pre-season projections said.
Here is how Tango suggests the data be used:
“I like to get things to r=.50. You’ll see the reason in a minute. If r=.70, when PA=300, then r=.50 when PA=130. It’s not important how I got that for now.
Ok, so what can you do with that?
This means that you can add 130 PA of league average HR/FB to any player, to get an estimate of his true talent.”
And to get to r=.5 he suggests multiplying the r=.7 value by 3/7. So it’s a bit of math and research to look up league averages. Pizza Cutter describes it as this: “At .70, the rate of signal to noise crosses the halfway point”. Tango’s method seems more accurate, and Pizza’s method seems quicker. Seems they’ve been arguing over which one to use for about a decade, now.
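For what it’s worth, here is a rough sketch of how the 3/7 shortcut and the “add 130 PA of league average” step fit together. It assumes reliability follows r = PA / (PA + k), which is my reading of the quote rather than something spelled out in it. The function names and the rates in the usage lines are placeholders, not real data.

```python
def half_reliability_pa(pa_at_r70):
    """PA at which r reaches 0.50, given the PA at which r = 0.70.

    Assumes r = PA / (PA + k). Solving 0.70 = PA70 / (PA70 + k)
    gives k = PA70 * 3 / 7, and r = 0.50 exactly when PA = k.
    """
    return pa_at_r70 * 3 / 7


def regress_to_league(observed, pa, league, ballast_pa):
    """Blend an observed rate with ballast_pa PA of league-average rate."""
    return (observed * pa + league * ballast_pa) / (pa + ballast_pa)


# Tango's HR/FB example: r = 0.70 at 300 PA.
ballast = half_reliability_pa(300)  # about 128.6, which Tango rounds to 130

# Placeholder rates (NOT real data): a hitter at 0.20 through 130 PA,
# with a league average of 0.10, lands halfway between the two.
estimate = regress_to_league(observed=0.20, pa=130, league=0.10, ballast_pa=130)
print(round(ballast, 1), round(estimate, 3))
```

At exactly the r=.50 sample size the estimate sits halfway between the player’s rate and league average, which is why there is still plenty of regression left at the “stabilization” point.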
Razzball’s Jaywrong wrote about this recently. If you’re looking for more analysis and explanation about what these PA benchmarks mean, I suggest checking it out.
http://razzball.com/when-is-a-streak-not-a-streak-anymore/