Is Stolen Base Rate Predictive of Anything?

January 22, 2018

Last week, I began an examination of stolen base rates. The process is messy with too many variables and nuances to consider. I’m examining the information through several different lenses and seeing what applies. Today, I’m going to look at how success rate plays a role.

Team Level Analysis

As front offices have adopted sabermetric principles more and more, they have come around to the idea that stolen bases are helpful only when the success rate is high. In 2000, the league-wide success rate was 69%, and it had increased to 73% by last season.

Knowing that each team is made up of different players whose individual success rates are a factor, here are the three-year success rates along with the total stolen base attempt percentage, SBA% = (CS+SB)/(1B+HBP+BB).

Team SB% (2015-2017)

Team	SB%	SBA%
Cleveland Indians	78.6%	6.7%
Arizona Diamondbacks	78.0%	6.2%
Kansas City Royals	76.2%	17.9%
Boston Red Sox	76.2%	4.7%
Washington Nationals	76.2%	6.1%
New York Yankees	75.8%	12.4%
Milwaukee Brewers	75.7%	6.7%
Cincinnati Reds	75.4%	6.4%
Toronto Blue Jays	74.1%	9.0%
San Diego Padres	73.5%	8.0%
Miami Marlins	72.7%	9.9%
Minnesota Twins	72.4%	6.9%
Texas Rangers	72.1%	8.6%
Oakland Athletics	71.4%	7.7%
Houston Astros	70.7%	8.8%
Philadelphia Phillies	70.4%	5.3%
Chicago Cubs	70.0%	8.2%
Los Angeles Angels of Anaheim	70.0%	11.5%
New York Mets	69.7%	8.6%
San Francisco Giants	69.6%	7.5%
Atlanta Braves	69.3%	6.7%
Pittsburgh Pirates	68.6%	5.1%
Los Angeles Dodgers	68.2%	5.6%
Tampa Bay Rays	67.0%	6.5%
Chicago White Sox	66.5%	7.2%
Seattle Mariners	66.5%	9.3%
St. Louis Cardinals	66.1%	3.5%
Colorado Rockies	65.7%	4.3%
Baltimore Orioles	65.3%	5.8%
Detroit Tigers	64.4%	5.8%
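
For anyone who wants to reproduce the two columns, here is a minimal Python sketch of the rate definitions used in the table. The function and parameter names are my own, and the sample totals are invented purely for illustration; they are not real team data.

```python
def sb_pct(sb, cs):
    """Stolen base success rate: SB / (SB + CS)."""
    attempts = sb + cs
    return sb / attempts if attempts else 0.0


def sba_pct(sb, cs, singles, hbp, bb):
    """Attempt rate as defined in the article: (CS + SB) / (1B + HBP + BB)."""
    opportunities = singles + hbp + bb
    return (sb + cs) / opportunities if opportunities else 0.0


# Made-up totals for illustration only (not real team data).
example = {"sb": 90, "cs": 30, "singles": 900, "hbp": 60, "bb": 540}

print(f"SB%:  {sb_pct(example['sb'], example['cs']):.1%}")
print(f"SBA%: {sba_pct(**example):.1%}")
```

The denominator only counts times a runner reached first base, so it is a rough proxy for opportunities rather than an exact count of open-base situations.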

The difference between top and bottom is striking, with the Indians at a 79% success rate and the Tigers down at 64%, and there are disparities throughout the list. The differences are useless, though, unless the rates are predictive from season to season. Without correcting for talent level, the r-squared from Year 1 to Year 2 across all matched pairs from 2015 to 2017 is .36. Next, I removed the matched seasons in which the team changed managers, and the r-squared dropped to .27. While that is a decline, the values are similar.
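
A rough sketch of how a matched-pair comparison like this could be run is below. It assumes the r-squared is the square of the Pearson correlation between a team's SB% in one season and the next, which is my reading and not a confirmed description of the method. The team codes, rates, and the `changed_manager` set are all hypothetical placeholders.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


def matched_pairs(rates, changed_manager=frozenset()):
    """Pair each team-season with the same team's following season.

    rates: {(team, year): SB%}
    changed_manager: set of (team, year) keys where that year opened with a
    new manager; pairs ending in such a season are skipped.
    """
    pairs = []
    for (team, year), rate in sorted(rates.items()):
        nxt = (team, year + 1)
        if nxt in rates and nxt not in changed_manager:
            pairs.append((rate, rates[nxt]))
    return pairs


# Invented SB% values for three hypothetical teams (illustration only).
rates = {
    ("AAA", 2015): 0.78, ("AAA", 2016): 0.75, ("AAA", 2017): 0.77,
    ("BBB", 2015): 0.66, ("BBB", 2016): 0.70, ("BBB", 2017): 0.64,
    ("CCC", 2015): 0.72, ("CCC", 2016): 0.71, ("CCC", 2017): 0.74,
}

all_pairs = matched_pairs(rates)
same_manager = matched_pairs(rates, changed_manager={("BBB", 2016)})

for label, pairs in (("all pairs", all_pairs), ("same manager", same_manager)):
    year1, year2 = zip(*pairs)
    print(label, len(pairs), round(r_squared(year1, year2), 2))
```

With only 30 teams and two season-to-season transitions, the real sample is 60 pairs at most, so a gap like .36 versus .27 should be read loosely.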

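The biggest takeaway seems to be that front offices have more of an impact on the stolen base rate than managers do, since limiting the sample to teams that kept their manager did not make the rates any more consistent.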

One item which bugs me is the difference in team speed. To try to limit its effects, here are the success rate rankings for only those players with an average Speed Score (between 4.0 and 6.0), along with the overall success rate.

Team SB% With Speed Score From 4.0 to 6.0 (2015-2017)

Team	4 to 6 Speed Score SB%	Overall SB%
Cleveland Indians	79.0%	78.6%
Boston Red Sox	78.7%	76.2%
Toronto Blue Jays	78.7%	74.1%
San Diego Padres	77.0%	73.5%
New York Yankees	76.2%	75.8%
Arizona Diamondbacks	76.1%	78.0%
Oakland Athletics	76.0%	71.4%
Milwaukee Brewers	75.3%	75.7%
Washington Nationals	74.7%	76.2%
Miami Marlins	74.7%	72.7%
San Francisco Giants	74.4%	69.6%
Kansas City Royals	73.7%	76.2%
Minnesota Twins	73.6%	72.4%
Houston Astros	73.0%	70.7%
Texas Rangers	72.4%	72.1%
Los Angeles Angels of Anaheim	72.0%	70.0%
Chicago Cubs	71.4%	70.0%
Los Angeles Dodgers	71.3%	68.2%
Atlanta Braves	70.8%	69.3%
Baltimore Orioles	70.2%	65.3%
New York Mets	69.7%	69.7%
Cincinnati Reds	69.7%	75.4%
Pittsburgh Pirates	68.2%	68.6%
Seattle Mariners	67.9%	66.5%
Tampa Bay Rays	67.8%	67.0%
St. Louis Cardinals	67.7%	66.1%
Philadelphia Phillies	67.0%	70.4%
Detroit Tigers	64.9%	64.4%
Colorado Rockies	64.2%	65.7%
Chicago White Sox	61.3%	66.5%

Teams which accept low stolen base success rates from their average runners do so with all of their runners.

Fantasy owners should be leery of players going to teams which require a high success rate. An example is Stephen Piscotty (three for nine last season, a 33% SB%) going from the Cardinals (66% SB%) to the Athletics (71%). Steamer projects him for 4 SB next season. I could see the Athletics completely removing his stolen base opportunities (SBA%). He’s not a perfect example, but take note of the teams on the extremes.

That’s it for team-level data, at least for now. Time to move on to the player data.

Player Level Analysis

The general idea I’m looking to investigate is how much impact success rate has on stolen base opportunities. My first test was to take hitters who had varying levels of success in season 1 and compare how their attempt rates changed in season 2. I used hitters from 2006 to the present who had 300 PA in each paired season (all values are percentage point changes).

Player SBA% For Various Success Rates In Y1 to Y2

Success rate	< 50%	50% to 60%	60% to 70%	70% to 80%	80% to 90%	> 90%
SBA% Diff
Average	0.1%	0.1%	-0.1%	-1.3%	-2.4%	-2.3%
Median	-1.1%	-1.1%	-0.8%	-2.4%	-3.6%	-1.4%
SBA%
Average	9.8%	10.7%	13.6%	18.3%	18.9%	15.8%
Median	9.1%	9.3%	12.8%	16.5%	16.3%	12.2%

A couple of observations: The first is that, by median, every group saw its attempt rate drop some. This is not a surprise, with most hitters reaching their peak speeds before they reach the majors. The second point may be more fantasy relevant. Those hitters who had high success rates in the previous season saw their SBA% drop more than average. Just because a hitter was successful in the previous season, it doesn’t mean he’ll steal a lot in the next one.
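
The grouping behind the table can be sketched roughly as follows. This is only an illustration under my own assumptions: the bucket boundaries are treated as half-open intervals, the 300 PA filter is assumed to have been applied already, and the player seasons are invented numbers, not real data.

```python
from collections import defaultdict
from statistics import mean, median

# Bucket edges mirror the table's columns. Treating each bucket as
# [low, high) is an assumption about how ties at the edges are handled.
BUCKETS = [
    (0.0, 0.5, "< 50%"),
    (0.5, 0.6, "50% to 60%"),
    (0.6, 0.7, "60% to 70%"),
    (0.7, 0.8, "70% to 80%"),
    (0.8, 0.9, "80% to 90%"),
    (0.9, 1.01, "> 90%"),
]


def bucket_for(success_rate):
    """Return the table column label for a season-1 success rate (0 to 1)."""
    for low, high, label in BUCKETS:
        if low <= success_rate < high:
            return label
    raise ValueError(f"success rate out of range: {success_rate}")


def sba_change_by_bucket(player_seasons):
    """Group Y2 minus Y1 SBA% changes by the Y1 success-rate bucket.

    player_seasons: iterable of (y1_sb_pct, y1_sba_pct, y2_sba_pct) tuples,
    all expressed as fractions. Returns {label: (average, median)}.
    """
    diffs = defaultdict(list)
    for y1_sb, y1_sba, y2_sba in player_seasons:
        diffs[bucket_for(y1_sb)].append(y2_sba - y1_sba)
    return {label: (mean(d), median(d)) for label, d in diffs.items()}


# Invented paired seasons (illustration only).
sample = [
    (0.45, 0.08, 0.09),
    (0.85, 0.20, 0.16),
    (0.88, 0.18, 0.15),
    (0.82, 0.22, 0.21),
]

for label, (avg, med) in sba_change_by_bucket(sample).items():
    print(f"{label}: average {avg:+.3f}, median {med:+.3f}")
```

Reporting both the average and the median, as the table does, helps show when a few extreme runners are pulling a group's average around.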

Next, here are the SBA% changes for runners who saw their success rate drop or rise in a season’s first half (again, all values are percentage point changes).

First Half SB% Change & Second Half Results

	Average		Median
Change in SB%	1H to 2H SBA%	1H to 2H SB%	1H to 2H SBA%	1H to 2H SB%
>30% point Drop	3.4%	37.1%	4.1%	38.9%
20% to 30% Drop	1.5%	17.2%	2.7%	20.8%
10% to 20% Drop	-1.1%	5.2%	-0.4%	4.7%
0% to 10% Drop	0.7%	2.7%	0.6%	2.9%
0% to 10% Increase	-0.1%	-3.2%	0.1%	-1.9%
10% to 20% Increase	1.8%	-9.3%	2.3%	-7.1%
20% to 30% Increase	3.9%	-10.3%	3.0%	-13.3%
>30% point Increase	4.6%	-26.2%	5.7%	-23.7%

Welcome to a table on human nature. Here’s why each pair of columns matters.

Columns one and three: If a player was highly successful or highly unsuccessful in the season’s first half, he will try to steal more in the season’s second half. I’m guessing those who were unsuccessful may have been injured while those who were successful kept on stealing.
Columns two and four: The jump or drop in success rate regresses to the mean.

Owners may be able to take advantage of players with high first-half success rates by trading them before their second-half crash.

The preceding examples were the only cases I could find in which SB% is useful. I talked to someone who was working on a similar project and he gave me some much-needed insight on SB%. It’s not much of a predictive factor for anything and has probably been overutilized for years. I’m not going to give away any more of his work, but generally ignore SB% except to find a team’s tolerance level and to find those runners who will be overconfident and keep running, but less successfully, into a season’s second half.

My quest for a better understanding of stolen base numbers has been slow. Even while examining SB%, I went down several avenues with no luck. The next major change is using Statcast’s Sprint Speed values. I need to find the values and add them to my database so I can easily link them to players. I’m not sure when I will complete the task, so any more stolen base talk is on hold for now. Until then, let me know if you have any ideas for possible study areas or improvements.

5 Comments

Alan

7 years ago

A possible explanation for the first half/second half result is, in some cases, “strength of schedule” with respect to opposing pitchers and catchers. If a base stealer failed a lot in the first half, it may mean they were trusted to steal against the best. Their manager will then also trust them against lesser opponents in the second half, expecting the success rate to rebound.

This could probably be evaluated with some digging, but it would be hard to make very strong conclusions because of how noisy the data would be. One such source of noise is that catchers get dinged up often and then heal, and this may be hard to consistently identify in the data.
