Imperfect Game, Part 2: Attack of the Sea Lice

August 19, 2015

Last week, we began our quixotic attempt to design the Universal Baseball Association: the ideal full-season Roto-style Fantasy league. This quest was interrupted by a vacation, or, more accurately, “vacation.” The chief benefit of doing something you don’t especially want to do in someplace you don’t especially want to do it in is that circumstances induce you to imagine you’re doing something you’d prefer to be doing in a place in which you’d prefer to be doing it. So there we were, being nibbled upon by sea lice in the saltiest water this side of the Dead Sea. Suddenly and magically, the similarly-immersed children who’d foolishly been placed in our care seemed to vanish (don’t worry; they reappeared all too soon) and we were transported, in the kingdom of our minds, to our desks and computers, free to brood further about the categories and rules of the UBA.

Herewith the results. First, though, a quick summary of the criteria by which we judge our success; if you want more of the same, check last week’s post. The categories in an ideal league should, we think, be as few as possible; be susceptible of appreciation by both the average baseball fan and the stat geek; be “primary,” in that they reflect the building blocks of baseball performance, rather than being “derivative” stats that massage these primary categories to and beyond the vanishing point; focus on a player’s individual achievement rather than the team-based context in which that achievement occurs and can be distorted; yet produce stats that, in the aggregate, look roughly like the stats that a Reality Baseball team would produce over a full season.

The first thing we remembered during the Plague of the Sea Lice is that, if Sabermetricians are to be believed, the three categories of the game—offense, pitching, fielding—aren’t equally significant for Reality Baseball outcomes. Rather, the ratio among them is something like 3/2/1. Preserving these proportions as simply as possible means a six-category game. The hitting part of offense isn’t that hard. OBP is a no-brainer, and of course you also want a category that reflects power. Our initial idea was to use Isolated Power, which, as a stat, is perfectly complementary to OBP—it’s the stuff a player does that takes him to second base and beyond before the next hitter’s plate appearance begins. But as we ruminated further, we concluded that good old, plain old home runs do a better job of reflecting a player’s pure power, while also correlating with ISO and being more average-fan-friendly.

We don’t know whether speed is one-third of all offense, and do know that you’re unavoidably double-counting it if you use it as a category when you also have OBP. But you’ve got to include it somehow, and (thanks to alert reader Pat’s Bat, whose comment last week got us thinking harder about this) we see that SB-CS puts too much emphasis on stolen bases, which aren’t that significant to overall outcomes and are too context-dependent anyway. We drafted Billy Hamilton in the second round of our NFBC Main Event draft, and we’re not sorry, but any set of rules that makes this a viable strategy needs tweaking.

There are many noble attempts to capture overall speed-on-offense (as opposed to offensive speed, which is what Gertrude married Claudius with). Speed Scores are the best known. (As Fangraphs notes, there are more nuanced ways to capture total speed in a stat, whereas Speed Scores are “slightly outdated,” but hey—so are we.) Even Speed Scores, though, mix percentages and counting stats, and rely on hidden-stat “opportunities” to steal a base, ground into a double play, or score a run. They produce a “blender” number that runs afoul of our accessibility criterion. So the best we can do for our Speed category is to combine the counting stats found in Speed Scores. Let’s call the result Modified Speed Score. Here’s the formula: (SB-CS)+3B-GIDP-(H+W-R). It’s ungainly; it probably produces a negative number; but it satisfies our criteria, and should reflect reality more than SB or SB-CS.
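For anyone who wants to see how ungainly it is, here is a minimal Python sketch of the Modified Speed Score formula above. The function name and the stat line fed into it are made up for illustration; they do not belong to any real player.

```python
def modified_speed_score(sb, cs, triples, gidp, hits, walks, runs):
    """MSS = (SB - CS) + 3B - GIDP - (H + W - R), all counting stats."""
    net_steals = sb - cs
    # Times on base via hit or walk that did NOT end in a run scored.
    stranded = hits + walks - runs
    return net_steals + triples - gidp - stranded


# Hypothetical season line, invented purely to exercise the formula.
mss = modified_speed_score(sb=30, cs=8, triples=6, gidp=5,
                           hits=150, walks=50, runs=90)
print(mss)  # -87
```

Even a 30-steal season comes out well below zero here, because the H+W-R term dwarfs everything else. That is consistent with the warning that the number will probably be negative; what matters in a Roto league is how teams rank against each other, not the sign.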

Can pitching be reduced to two categories? Yes, we think. One of them has to be Component ERA (CERA), which for Fantasy purposes (God bless Bill James) is a truly elegant stat. What it is, essentially, is an estimate of a pitcher’s ERA on the basis of the outcomes produced by the individual hitters he faces. It thus avoids the pitfalls of both regular ERA (which often depends on what happens after a pitcher leaves the game) and WHIP (which looks at baserunners, but not at what base the pitcher puts them on). Plus, it’s widely available, and it takes the same form as the most commonly-used measure of pitcher performance.

The other pitching category has to combine starters and relievers, should be as context-independent as possible (which isn’t very), and has to use stats that are within the ken of ESPN (or MLB.com, which we’re adding for reasons that will presently become clear). With starting pitchers, we like Quality Starts, for the same reason other people do: they attempt to isolate the (duh) quality of a starter’s performance, apart from what his team’s hitters happen to be doing. The standard definition of a Quality Start is one in which the guy pitches at least six innings and surrenders no more than three earned runs. We can live with that. True, it doesn’t sufficiently distinguish between, say, Clayton Kershaw and Colby Lewis, but we count on CERA to do that, and there’s no question that the Colby Lewises of the world have significant value to their Reality Baseball teams, precisely because they’re able to get through 6 innings without surrendering more than 3 runs. Indeed, sometimes we think a QS category should consider only the first six innings: if a guy gives up Run Number Four in the 7th inning, it’s the manager’s fault for not resorting to the overstuffed bullpens that are now universal in MLB. But we won’t insist on it.
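The Quality Start test is simple enough to write down. This is a sketch only, with two assumptions that are ours: innings are passed in as outs recorded (18 outs is six full innings), which avoids the "6.1 means six and a third" notation problem, and the run count is earned runs, which is how the stat is usually scored.

```python
def is_quality_start(outs_recorded: int, earned_runs: int) -> bool:
    """At least six innings (18 outs) and no more than three earned runs."""
    return outs_recorded >= 18 and earned_runs <= 3


# Hypothetical starts, invented for illustration.
print(is_quality_start(outs_recorded=18, earned_runs=3))  # True
print(is_quality_start(outs_recorded=17, earned_runs=0))  # False
print(is_quality_start(outs_recorded=21, earned_runs=4))  # False
```

The last case is the one grumbled about above: seven innings, with the fourth run perhaps scoring in the 7th, and no Quality Start.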

As for relief pitching: there are metrics out there that combine the “cleanliness” of a relief pitcher’s performance with the “leverage” of the situation in which he appears to produce a number that purportedly evaluates the guy’s performance with precision. Someday, perhaps, when androids have supplanted us, one of these metrics will prevail. For now, though, the best we can come up with is something that doesn’t pretend that Saves are all that matters, and the best we can do with that is to use Holds. Not perfect, because they’re still too outcome-dependent. We don’t see why a guy who, say, enters a tie game in the 8th or 9th inning and keeps things that way shouldn’t get an honorific stat of some sort.

But the formula can’t just be QS+H+SV-BS. That puts too much emphasis on relief pitching. You want, at a minimum, to have your starter/reliever ratio reflect the underlying innings-pitched ratio, which is about 2-to-1. This is doable (do the math if you like, or else trust us) by tripling (not, as we suggested last week, doubling) the QS total. So the formula for this category is (3xQS)+H+SV-BS.
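As a minimal sketch, the combined pitching category can be computed like this. The function name, the `qs_weight` parameter, and the sample staff totals are hypothetical; only the formula (3xQS)+H+SV-BS comes from the text.

```python
def pitching_category(qs, holds, saves, blown_saves, qs_weight=3):
    """(3 x QS) + H + SV - BS, with the QS multiplier left adjustable."""
    return qs_weight * qs + holds + saves - blown_saves


# Hypothetical full-season staff totals, invented for illustration.
total = pitching_category(qs=80, holds=60, saves=40, blown_saves=10)
print(total)  # 330

# The doubled-QS version floated last week, for comparison.
print(pitching_category(80, 60, 40, 10, qs_weight=2))  # 250
```

Keeping the multiplier as a parameter makes it easy for a league to try other weights if it thinks the starter/reliever balance should be different.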

And now for fielding, the rock on which all attempts at a comprehensive set of Fantasy categories founder. One problem with fielding is, as everyone knows, that the traditional fielding measure of FPCT doesn’t reflect true quality of fielding performance at any position. Another is that FPCTs aren’t commensurate among positions, which will lead you to distort your roster if you use the aggregated FPCT of your individual players. If your league uses this as a category, your Utility guy is never going to be a Third Baseman, your Corner Infielder is always going to be a First Baseman, your Middle Infielder is usually going to be a Second Baseman, and your MVP is going to be someone like the 2014-model Danny Santana, just because he qualified at SS but played OF.

If, like us, you really want to factor defense into your league, and you want to use a blender stat like dWAR, we understand. But something else, which we’re borrowing from Fantasy Football, occurs to us. In FF, you don’t draft, say, J.J. Watt or Richard Sherman. You draft an entire team’s defense. How about the same approach in Fantasy Baseball? Though we’ve been emphasizing (to the extent possible) individual achievement as one touchstone for our categories, defense in Reality Baseball is considerably more a team effort than the primal pitcher-vs.-hitter gladiatorial showdown. So why not reflect that by drafting the defense of an entire baseball team in addition to your individual players? And that’s your sixth category.

But the Fielding Percentage problem continues to bedevil us. FPCT is driven by errors. It is a truth generally acknowledged that, whatever may have been the case in the days of the pancake baseball glove, nowadays range (the balls a fielder can get to) matters more than errors (what the fielder does if he is able to intersect with them). There is, fortunately, a readily-available stat that reflects a team’s defensive efficiency. Serendipitously, it is called a Defensive Efficiency Rating, and it measures the percentage of balls put in play that a team’s defense turns into outs. ESPN doesn’t track it, but MLB.com does, so the stat squeaks by if you squint a bit. The top team in DER this season is Kansas City, which is 12th in FPCT; Detroit, which is 4th in FPCT, is 20th in DER. We doubt that the pitchers on either team would tell you that the Tigers’ defense is better than the Royals’.
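Taking the description above literally, a team's DER is the share of balls in play that become outs. The sketch below is a simplification under our own assumptions: published versions of the stat work backward from box-score totals, and the input names and numbers here are hypothetical rather than real team data.

```python
def defensive_efficiency(balls_in_play, hits_in_play, reached_on_error):
    """Share of balls put in play that the defense converted into outs.

    Simplified: a ball in play counts as a non-out if the batter reached
    on a hit (home runs excluded) or on an error.
    """
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    outs_made = balls_in_play - hits_in_play - reached_on_error
    return outs_made / balls_in_play


# Hypothetical team totals, invented for illustration.
der = defensive_efficiency(balls_in_play=4000, hits_in_play=1150,
                           reached_on_error=50)
print(round(der, 3))  # 0.7
```

Errors are only one small term in this calculation. Every ball a fielder fails to reach shows up as a hit in play, which is why this number can diverge so sharply from FPCT.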

If you think that DER is too outré for the casual fan, then go ahead and use full-team FPCT. It’s better than nothing: there’s a positive correlation, but it’s less than you might think (or we thought, until we checked). But for us, the six categories are:

OBP
HR
MSS, which is (SB-CS)+3B-GIDP-(H+W-R)
CERA
(3 x QS)+H+SV-BS
Full-Team DER

A couple of words about league rules and settings. The only things we absolutely need are an innings minimum and maximum; we can fine-tune the exact numbers later on, when we know how many pitchers each team has. We’d like our UBA rosters to approximate those of Reality Baseball teams, so we’d probably go with 13 hitters (eliminating the Corner Infield position) and 10 pitchers, though we can see using 14/10, 13/11, maybe even the classic 14/9 or the actually-prevailing 13/12. And since there’s such a thing as FIP (Fantasy-Independent Life), we’d process all transactions overnight rather than immediately, and use either FAAB or rolling waivers, rather than first-come.

So: is this all just a silly exercise? Well, of course it is. Do we think the UBA is an improvement on standard 5×5? Maybe not, especially if 5×5 uses OBP rather than BA, SV+H rather than SV, perhaps QS rather than Wins, and transmutes into 6×5 by using full-team DER. But maybe so. And would we play in a league with these categories and rules? In a sea louse’s heartbeat. How about you?

7 Comments

Stephen

9 years ago

I like what you’re doing here, but to be honest this would be stripping down fantasy baseball to the point where it is so predictable that it’s boring. The happy medium is, as you mentioned, switching to OBP, QS, SV+H, and SB-CS, but a league using the modified speed score seems like it would be too predictable, even more so with the team defensive metric. With the type of system you’ve theoretically proposed, there just wouldn’t be nearly as much excitement in watching a SP’s start, as Ks would no longer be a ‘yes’ moment, same goes for any run that your SP’s team scores. Same goes for anticipating a SB from your guy on 1st. It would be nice to change/combine runs and RBI into ISO or wRC, as those are the most team and lineup position dependent, but any time you do that you get more intertwined with the other cats, so idk. It’s a good notion but it just doesn’t sound as entertaining as the modified layouts that you stated at the end.

hebrew (Member since 2016)

Reply to Stephen

totally agree. you need some randomness to make for the excitement.

i’m in a 20-team mixed league that is 6×6 and uses: R, HR, RBI, OBP, SLG, SB. A nice blend of more forward-thinking rate stats and traditional counting stats.

Pitchers are similar: QS, ERA, WHIP, SV, HLDS, K/9. It’s still a little RP-heavy, but we solve that issue by having an IP minimum.

Reply to hebrew

My only knock on using SLG is that it runs into multicollinearity problems with both HR and, to a lesser extent, OBP, whereas ISO eliminates the OBP/AVG factor. Including Slugging and then also having an OBP category is almost the same as having both an AVG and OBP league, and ISO eliminates that completely. You should eliminate K/9 and just do Ks with a very small parameter for a total innings cap and minimum, yet I can see how that would be a little RP-heavy with 5 of the 6 cats being more beneficial to having a RP over an SP. Might be solvable by having more rigid P position requirements, like 4 SPs, 2 RPs, and one P, idk.

how are OBP and SLG as similar as AVG and OBP? I don’t buy that too much.
