Friday, December 14, 2007

The Bi-Modal Mr. Clemens

Per New York Times coverage of the Mitchell Report, Brian McNamee testified to serving as Roger Clemens' personal steroid injector:

Mr. Clemens had a 40-39 record from 1993 through 1996 and was not re-signed by the Boston Red Sox. The next year, he signed with the Toronto Blue Jays and began working out with Mr. McNamee (…) Mr. Clemens had two of the best years in pitching history in 1997 and 1998, winning the Cy Young Award in both seasons and also led the league in wins, earned run average and strikeouts (…) After Mr. Clemens declined to 14-10 with a 4.60 ERA in 1999, New York hired Mr. McNamee as assistant strength coach. During one stretch after that, Mr. Clemens won 27 games against three losses for the Yankees.

Damning on its face, but do the data back up such a claim? Using training with McNamee as a proxy for steroid use, we ran a basic t-test. We reject the null hypothesis of no difference at 95% confidence, in favor of the alternative, “Rocket better on than off juice,” with a difference in Earned Run Average of -0.953. A similar analysis of Walks plus Hits per Inning Pitched also rejects the null at 95% confidence, with a difference of -0.230.
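For readers who want to try a comparison like this themselves, here is a minimal sketch of a two-sample t-test. It assumes Welch's unequal-variance form, which may not be the exact variant we ran, and the function name welch_t and its two inputs (lists of season ERAs with and without McNamee) are hypothetical. No Clemens numbers are baked in; supply your own.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (difference in means, t statistic, approximate degrees of freedom).

    Assumption: Welch's unequal-variance t-test. Each sample needs at
    least two values, and at least one sample must have nonzero variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean.
    se_a = variance(sample_a) / n_a
    se_b = variance(sample_b) / n_b
    diff = mean(sample_a) - mean(sample_b)
    t_stat = diff / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return diff, t_stat, dof
```

Calling welch_t(eras_with, eras_without) returns a negative difference when the first group has the lower ERA. Whether that difference clears the 95% bar then depends on comparing the t statistic against the critical value for the returned degrees of freedom.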

But given the multitude of confounding factors, a rough-and-ready comparison of averages is by no means conclusive. What may be more telling is that there is no statistically significant difference in ERA, in either mean or variation, between Roger’s first 12 seasons and his last 12. In fact, the only extended run of years that is statistically aberrant from the whole is his 1986-1992 golden age with the Red Sox. However, this supports two opposing schools of thought.
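The first-half versus second-half comparison can be set up along these lines. This is only a sketch: the function name compare_halves is hypothetical, it takes sample variance as the measure of variation (an assumption), and it reports raw summary figures rather than a significance test.

```python
from statistics import mean, variance


def compare_halves(season_eras):
    """Split a chronological list of season ERAs in two and compare the halves.

    Assumption: sample variance stands in for "variation". Each half
    needs at least two seasons and the second half a nonzero variance.
    """
    mid = len(season_eras) // 2
    first, second = season_eras[:mid], season_eras[mid:]
    return {
        # Negative means the first half had the lower average ERA.
        "mean_difference": mean(first) - mean(second),
        # A ratio near 1 means the two halves were about equally erratic.
        "variance_ratio": variance(first) / variance(second),
    }
```

A mean difference near zero and a variance ratio near one are what "symmetrical halves" would look like in the numbers; a formal test would still be needed to call any gap significant.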

In the Rocket’s defense, because his performance varied a great deal within any short span, we can’t presume that a particular surge resulted from steroid use. Against him stands the conventional wisdom that it just isn’t natural for a fastball pitcher to have symmetrical halves to such a lengthy career, regardless of workout regimen, and even for such a Clydesdale of a Texan.

We call for further study, including a regression analysis that tests the explanatory power of developing a splitter at 42.