Taking Hitter Analysis to Another Level

Truthfully, I have had it. Here at RotoGraphs we are almost too good at identifying emerging pitchers. We can reliably find up-and-coming pitchers, with Arsenal Score being the most recent great accomplishment. I am also to blame, having looked at how pitchers change and whether they may be injury prone (Scott Kazmir, for example). Most of the advancements in examining pitchers are because of the addition of Pitchf/x data. We know immediately if a pitcher is throwing slower or if he has a new pitch. I feel hitters have taken a back seat for a few years and I would like to try to increase our knowledge of them. I plan on expanding into new areas for batters. Also, I am completely open to new ideas from our readers.

One of the biggest issues with hitters is that it is tough to know if or when they have changed. The difference may involve power, foot speed, swing adjustments or how pitchers are adjusting to the hitter. I want to take hitter analysis to the next level. I will attempt to live on the “tip of the spear”. The problem with living on the tip is I will probably get cut a few times. I am going to look at some never-published data and see if it is useful or predictive. I could see several of my ideas not working out.

Finally, before I get into some of my ideas, I will gladly welcome any of yours. It can be a tweak to something I bring up or a completely new idea. I am ready to give hitters their fair shake.

Idea 1 – Use Pitchf/x Data To See How Pitchers Are Attacking Hitters

One idea is to see how pitchers are attacking certain hitters and if any changes exist. This idea started with Rob Arthur’s work at Baseball Prospectus and continued on with our own Eno Sarris at JABO. I am going to steal their ideas. I plan to examine three items: percentage of fastballs seen, percentage of pitches in the strike zone and first-pitch strikes. As a hitter becomes more prolific, pitchers will give the batter fewer and fewer decent pitches to swing at. The key then is to see how the hitter responds. Here is an example with Salvador Perez over the last four half seasons.

Note: I have not written the code to get first pitch strikes yet.

Half season: FB%, Zone%, wRC+
2013 1H: 58%, 47%, 94
2013 2H: 53%, 48%, 121
2014 1H: 52%, 45%, 116
2014 2H: 47%, 47%, 61

Pitchers aren’t giving Perez any fastballs to swing at and he may be struggling because of it.

Another example can be seen with Billy Hamilton and how fastballs may be bringing his game down.

Half season: FB%, Zone%, wRC+
2013 2H: 71%, 58%, 155
2014 1H: 60%, 51%, 105
2014 2H: 70%, 52%, 60

Pitchers found Hamilton could hit fastballs in the zone and adjusted to throwing him tons of fastballs, but out of the strike zone.

The key with this information is to see if other MLB teams think they have found a hitter’s weakness and are trying to exploit it.
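As a rough illustration of the bookkeeping behind the FB% and Zone% columns above, here is a minimal sketch that groups pitch-level records by half season. The record keys (`year`, `month`, `pitch_type`, `in_zone`), the set of pitch codes counted as fastballs and the July 1 cutoff between halves are all assumptions made for this example, not the definitions used for the tables above. The sample pitches are made up purely to exercise the function.

```python
from collections import defaultdict

# Assumption: which pitch-type codes count as fastballs. The article does not
# say which types were included, so adjust this set as needed.
FASTBALL_TYPES = {"FF", "FT", "FC", "SI"}


def half_of_season(month):
    """Label a pitch as first or second half (July 1 cutoff is an assumption)."""
    return "1H" if month < 7 else "2H"


def attack_profile(pitches):
    """Return {(year, half): (FB%, Zone%)} for the pitches seen by one hitter.

    Each pitch is a dict with hypothetical keys:
    'year', 'month', 'pitch_type' and 'in_zone' (bool).
    """
    counts = defaultdict(lambda: [0, 0, 0])  # total, fastballs, in-zone
    for p in pitches:
        key = (p["year"], half_of_season(p["month"]))
        counts[key][0] += 1
        counts[key][1] += p["pitch_type"] in FASTBALL_TYPES
        counts[key][2] += bool(p["in_zone"])
    return {
        key: (100.0 * fb / total, 100.0 * zone / total)
        for key, (total, fb, zone) in sorted(counts.items())
    }


# Made-up pitches, only to show the shape of the input and output.
sample = [
    {"year": 2014, "month": 5, "pitch_type": "FF", "in_zone": True},
    {"year": 2014, "month": 5, "pitch_type": "SL", "in_zone": False},
    {"year": 2014, "month": 8, "pitch_type": "CU", "in_zone": True},
    {"year": 2014, "month": 8, "pitch_type": "CH", "in_zone": False},
    {"year": 2014, "month": 8, "pitch_type": "FF", "in_zone": False},
    {"year": 2014, "month": 8, "pitch_type": "SL", "in_zone": False},
]
profile = attack_profile(sample)
for (year, half), (fb_pct, zone_pct) in profile.items():
    print(f"{year} {half}: {fb_pct:.0f}%, {zone_pct:.0f}%")
```

Comparing consecutive half seasons from this output is then a matter of subtracting one row from the next and flagging large swings.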

Idea 2 – Use Improved Batted Ball Data to Rank and Categorize Hitters

Batted ball data helps to give us fans a better idea of how a hitter is performing at the simplest level. Are they getting on base from infield hits or are they making solid contact for line drives? With Inside Edge data, we can take the information a step further. Besides the three common trajectories (line drives, groundballs, fly balls), their information includes data on whether the batted ball contact is well-hit, medium or weak. Instead of just three classifications, there are nine. I went and found the likelihood of the event happening, the BABIP (including home runs), and wBABIP (or wOBAcon; how hard is the contact). Here are the league averages for each of the nine batted ball types.

Batted Ball Type: BABIP, wOBAcon, % of batted ball
Groundball – Weak: .151, .112, 31.4%
Groundball – Medium: .461, .416, 9.5%
Groundball – Well-Hit: .647, .610, 3.8%
Line Drive – Weak: .622, .579, 2.3%
Line Drive – Medium: .650, .638, 7.3%
Line Drive – Well-Hit: .719, .815, 11.1%
Flyball – Weak: .078, .074, 18.5%
Flyball – Medium: .069, .081, 8.2%
Flyball – Well-Hit: .641, 1.168, 7.8%

The key outcomes a hitter wants are all Line Drives and Well-Hit Flyballs. With this information, I was able to create expected values knowing a hitter’s batted ball mix. For reference, here are the top five regular hitters according to xwOBAcon from 2012 to 2014:

Name: xwOBAcon, xBABIP
Giancarlo Stanton: .518, .361
Chris Davis: .504, .361
Jose Abreu: .498, .369
Mike Trout: .496, .363
Chris Carter: .484, .352

Those five batters are decent. Between Carter and Trout are two interesting names, George Springer and Jorge Soler, with fewer batted balls. Are they sleepers? We just don’t know yet, but maybe.
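One plausible way to turn a batted ball mix into expected values is a share-weighted average of the league rates in the table above. The sketch below assumes that reading; the exact method behind the xwOBAcon and xBABIP leaderboard is not spelled out here, so treat the function as an illustration only. The league numbers are copied from the table, and the short category keys (`GB_W`, `LD_WH` and so on) are labels chosen for this example.

```python
# League averages from the table above: (BABIP incl. HR, wOBAcon, % of batted balls).
LEAGUE = {
    "GB_W": (0.151, 0.112, 31.4),
    "GB_M": (0.461, 0.416, 9.5),
    "GB_WH": (0.647, 0.610, 3.8),
    "LD_W": (0.622, 0.579, 2.3),
    "LD_M": (0.650, 0.638, 7.3),
    "LD_WH": (0.719, 0.815, 11.1),
    "FB_W": (0.078, 0.074, 18.5),
    "FB_M": (0.069, 0.081, 8.2),
    "FB_WH": (0.641, 1.168, 7.8),
}


def expected_rates(mix):
    """Return (xBABIP, xwOBAcon) for a batted ball mix.

    `mix` maps category -> share of batted balls (percent or fraction).
    Shares are normalized by their sum, so rounding in the inputs is tolerated.
    Assumption: expected value = share-weighted average of league rates.
    """
    total = sum(mix.values())
    xbabip = sum(share * LEAGUE[cat][0] for cat, share in mix.items()) / total
    xwobacon = sum(share * LEAGUE[cat][1] for cat, share in mix.items()) / total
    return xbabip, xwobacon


# Sanity check: feed the league-average mix back through the function.
league_mix = {cat: share for cat, (_, _, share) in LEAGUE.items()}
league_xbabip, league_xwobacon = expected_rates(league_mix)
print(f"League mix: xBABIP {league_xbabip:.3f}, xwOBAcon {league_xwobacon:.3f}")
```

Any hitter's nine-category mix can be passed in the same way; a mix tilted toward well-hit fly balls and line drives pulls both numbers up.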

Another feature I plan on utilizing is to find other players similar to a player’s batted ball mix. Take highly touted Gregory Polanco. He had a disappointing 2014 MLB season with a .235 AVG. Currently the Fans here at FanGraphs have him hitting .268 while Steamer has him at .235. Did the Fans see something the computers didn’t? Here are the percentages of his batted balls in each category (ranked by likelihood of a hit occurring).

Note: I am going to either put these values on a +/- system like wRC+ so it is on a 100 scale for easy comparison or maybe have the values measured in standard deviations from the mean.

LD_WH: 9.3%
FB_WH: 3.5%
GB_WH: 1.8%
LD_W: 0.4%
FB_W: 16.4%
GB_W: 40.3%
LD_M: 12.8%
FB_M: 6.6%
GB_M: 8.9%

Just looking at those values it is tough to tell what he is like. What I can do is find some hitters with similar batted ball values. The top ten comparables in order are Erick Aybar, Billy Hamilton, Alexei Ramirez, Juan Pierre, Alexi Amarista, Endy Chavez, Rajai Davis, Jon Jay, Kolten Wong, Mike Aviles. Not exactly the league’s best hitters.
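To make the two ideas above concrete (a 100-based scale and a search for similar hitters), here is a small sketch using only numbers already shown: Polanco's mix and the league-average shares. The index formula (100 times the player's share divided by the league share) and the use of plain Euclidean distance in percentage points are assumptions for illustration; the comparables listed above may have been found with a different similarity measure. The `rank_comparables` helper is hypothetical and expects the caller to supply other hitters' mixes.

```python
import math

# League-average share of batted balls (%) from the league table.
LEAGUE_SHARE = {
    "GB_W": 31.4, "GB_M": 9.5, "GB_WH": 3.8,
    "LD_W": 2.3, "LD_M": 7.3, "LD_WH": 11.1,
    "FB_W": 18.5, "FB_M": 8.2, "FB_WH": 7.8,
}

# Gregory Polanco's 2014 mix (%) as listed above.
POLANCO = {
    "LD_WH": 9.3, "FB_WH": 3.5, "GB_WH": 1.8,
    "LD_W": 0.4, "FB_W": 16.4, "GB_W": 40.3,
    "LD_M": 12.8, "FB_M": 6.6, "GB_M": 8.9,
}


def plus_index(mix, league=LEAGUE_SHARE):
    """Scale each category so 100 equals the league-average share (assumed formula)."""
    return {cat: 100.0 * mix[cat] / league[cat] for cat in league}


def mix_distance(a, b):
    """Euclidean distance between two mixes, in percentage points (assumed metric)."""
    return math.sqrt(sum((a[cat] - b[cat]) ** 2 for cat in LEAGUE_SHARE))


def rank_comparables(target, candidates):
    """Return candidate names sorted from most to least similar to `target`."""
    return sorted(candidates, key=lambda name: mix_distance(target, candidates[name]))


polanco_index = plus_index(POLANCO)
for cat, value in sorted(polanco_index.items(), key=lambda kv: kv[1]):
    print(f"{cat}: {value:.0f}")
print(f"Distance from league-average mix: {mix_distance(POLANCO, LEAGUE_SHARE):.1f}")
```

On this scale, Polanco's weak groundballs sit well above 100 and his well-hit fly balls well below it, which lines up with the type of hitter his comparables suggest.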

I know there is quite a bit to digest on this subject, but I think progress can be made to identify different hitter types early on. Some early testing of mine shows that batted ball mix stabilizes in about 50 batted balls, so small samples are useful.

Besides getting the type of hitter, changes in swing can be examined. When a hitter says they are changing their approach, the results can be looked at.

Idea 3: In-season expected walk and strikeout rates using Pitchf/x data

A few years back, I found walk and strikeout rates can be estimated using the O-Swing%, O-Contact%, Z-Contact% and Z-Swing% values found here at FanGraphs. By looking at a hitter’s approach, we can get a quick idea of whether they are hacking too much at bad pitches and whether this approach will catch up with them.

For example, the master hacker Javier Baez had a 44% K% in the majors. Looking at his swing tendencies, his K% should be closer to 35%. A lower K% is expected in 2015, but not near the magical 30% level which seems to keep players in the majors.

Idea 4: Player Speed Times

This idea I have wanted to track for years and have not been able to get it going. I am just going to force myself to implement it and hope I can get some help from some others. I would like to collect the times from home to first base when a hitter is giving it their all to beat out a throw for a hit. With this information, changes in talent from age or injuries could be detected.

What I am asking from those helping is the player, handedness if a switch hitter, and the game and inning of the sprint to first. If you can time the run, even better. If you don’t have the chance to time the run, just let me know the other information and I can watch the game. Just try to keep a notepad nearby while watching your favorite game to mark down such instances. Additionally, I will go back and get some times on players coming back from injuries by looking for double plays or infield hits.

Here are some examples of how the data will look.

Note: The Trout times are from a Jeff Sullivan article and the Altuve time is from a game I watched this morning for Quick Looks.

Jose Altuve
9/22/14: 4.25 (R)

Mike Trout
2012: 3.83 (R)
2013: 3.77 (R)
2014: 4.00 (R)

The times to first information can be added to any of my article comment sections or just Tweet the info to #time2first. I will check each and when I have time, I will add them to a spreadsheet.
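For anyone who wants to keep their own log before sending times in, here is a minimal sketch of one possible record layout and a change check. The field names and the `changes` helper are hypothetical; the only data used are the Trout and Altuve times listed above. The helper assumes times for a player are entered in chronological order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SprintTime:
    player: str
    when: str       # season or game date, as written in the examples above
    seconds: float  # home-to-first time
    bats: str       # side of the plate for that run, "R" or "L"


TIMES = [
    SprintTime("Mike Trout", "2012", 3.83, "R"),
    SprintTime("Mike Trout", "2013", 3.77, "R"),
    SprintTime("Mike Trout", "2014", 4.00, "R"),
    SprintTime("Jose Altuve", "9/22/14", 4.25, "R"),
]


def changes(times, player):
    """Return [(when, change in seconds vs. the previous entry)] for one player.

    Positive values mean the runner got slower. Assumes `times` is already
    in chronological order for that player.
    """
    runs = [t for t in times if t.player == player]
    return [
        (later.when, round(later.seconds - earlier.seconds, 2))
        for earlier, later in zip(runs, runs[1:])
    ]


trout_changes = changes(TIMES, "Mike Trout")
for when, delta in trout_changes:
    print(f"Mike Trout {when}: {delta:+.2f} seconds")
```

A player with a single logged time, like Altuve here, simply returns an empty list until a second time is recorded.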

Wrapping things up

Overall, I would like to provide more information on hitters. By using some new data, manipulating already available data and doing a little work, hopefully I can provide a better understanding of why a hitter’s performance changed in the past and why it may be different in the future. Right now, it looks like I will provide the raw data each Sunday night so it can be used to set rosters on Monday.





Jeff, one of the authors of the fantasy baseball guide, The Process, writes for RotoGraphs, The Hardball Times, Rotowire, Baseball America, and BaseballHQ. He has been nominated for two SABR Analytics Research Awards for Contemporary Analysis and won it in 2013 in tandem with Bill Petti. He has won four FSWA Awards including one for his Mining the News series. He's won Tout Wars three times, LABR twice, and got his first NFBC Main Event win in 2021. Follow him on Twitter @jeffwzimmerman.

everdiso
9 years ago

can we get ball speed and angle off the bat from pitch f/x?

once we suss out what hit angle range is most successful, wouldn’t that be the ideal measurement of hit and power skill?