Modeling Whiffs and GBs Using Velo and Movement: A Reprise
Pitch modeling isn’t anything particularly unique or groundbreaking. It’s the kind of thing Harry Pavlidis and Jonathan Judge (of Baseball Prospectus) and our once-editor Eno Sarris (now of The Athletic) have investigated for years. I won’t claim to break new ground here. I’m just a nerd who likes testing hypotheses for himself.
Last year, I used velocity and movement, courtesy of PITCHf/x, to model swinging strike and ground ball rates for pitchers. That post was not my best work (easy to say in hindsight), primarily because of limitations with the data. The data, from Baseball Prospectus, was aggregated, such that I couldn’t isolate any single pitch thrown by a pitcher. The advent of Statcast has enabled us to do exactly that, providing publicly accessible hyper-granular pitch-level data and changing how the public sphere of sabermetricians nerd out.
Something I have wanted to do for a long time is refresh my previously linked analysis, but with (1) Statcast data and (2) a different modeling approach: namely, a probit model rather than a multiple regression model. For most of you, this means nothing. It’s gibberish. I don’t intend to wade too deeply into the weeds of the modeling, lest I disorient or alienate. Mostly, I just want to communicate that I think it’s an exciting and different way to answer the everlasting question: how do a pitch’s velocity, movement, and spin rate affect its outcome?
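For the curious, here is a minimal sketch of what a probit model does with those inputs. It is not the model fitted for this post. A probit passes a weighted sum of a pitch's features through the standard normal CDF, so the prediction always lands between 0 and 1, whereas a plain linear regression can spit out impossible rates. The function names, the feature units, and every coefficient below are hypothetical, made up purely for illustration.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def whiff_probability(velo_mph, horiz_in, vert_in, spin_rpm, coefs):
    """Probit prediction: Phi(intercept + sum of coefficient * feature)."""
    index = (
        coefs["intercept"]
        + coefs["velo"] * velo_mph
        + coefs["horiz"] * horiz_in
        + coefs["vert"] * vert_in
        + coefs["spin"] * spin_rpm
    )
    return normal_cdf(index)


# Hypothetical coefficients: NOT fitted to any data, illustration only.
EXAMPLE_COEFS = {
    "intercept": -3.0,
    "velo": 0.02,    # per mph
    "horiz": 0.01,   # per inch of horizontal movement
    "vert": 0.015,   # per inch of vertical movement
    "spin": 0.0001,  # per rpm
}

if __name__ == "__main__":
    # A made-up pitch: 95 mph, 6 in. of run, 9 in. of rise, 2300 rpm.
    p = whiff_probability(95.0, 6.0, 9.0, 2300.0, EXAMPLE_COEFS)
    print(f"Predicted whiff probability: {p:.3f}")
```

In practice the coefficients would be estimated from pitch-level data by maximum likelihood. The sketch covers only the prediction step, which is where the probit differs most visibly from a multiple regression.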