Sunday, September 16, 2007

Why Andy MacPhail is wrong

Note: This article contains some math: nothing heavy-duty, indeed stuff that probably many college-educated readers have seen at some point in their life, but may well have forgotten. I know this may alienate some readers, but unfortunately I don't see any way of getting to the conclusions I want without it. If you don't understand the math at one point, you will have to just trust me that the conclusions I draw really do follow from the premises.

I was watching an Orioles-Red Sox game last week--a good game, a pitching duel between Jeremy Guthrie and Josh Beckett--when Andy MacPhail, the COO of the Orioles, came on. When asked what he would do to remedy yet another disappointing season, he emphasized that he would focus heavily on improving the Orioles' pitching. MacPhail could not resist appealing to a number of hoary old chestnuts, such as that "pitching is 75% of the game" and other such nonsense; but it still seems like he has a good point. After all, the Orioles' offense is bad, but not outrageously so. Their pitching, despite the presence of arguably the most dominant pitcher in the AL this year (Bedard), has been abysmal; they are second only to the incomparable Tampa Bay Devil Rays in runs allowed per game.

MacPhail's contention is plausible, but is there any way of testing it? After all, even though their offense is not in the sorry state of their pitching, the Orioles would certainly be helped by improved hitting. Is there some way of telling whether they should be focusing on their hitting or their defense (overall defense, i.e. pitching and fielding)? Baseball Prospectus' book Baseball Between the Numbers looks at this question in one chapter, and although the chapter is good, it is, ironically, more historical and anecdotal. I want to take a more technical look at the question.

My starting point is one of the best formulas of sabermetrics, Bill James' pythagorean winning percentage. What this formula says is that, if RS is the number of runs a team scores and RA is the number of runs a team allows, their pythagorean winning percentage is RS^2/(RS^2+RA^2). (The name comes from the exponent "2" reminding James of the Pythagorean formula in Euclidean geometry; other exponents are actually more accurate, but this does not affect the analysis.) With some notable exceptions, this formula usually predicts a team's actual wins to within about 3 games, and the prediction is even more accurate when several seasons are considered. It is also, during the season, a better predictor of future performance than past record (a fact which the Yankees and the Mariners have exemplified this year, in opposite ways). As such, the pythagorean winning percentage is sometimes considered a better reflection of a team's "true talent" than their actual winning percentage. (The interested reader can consult this article by a math professor at Brown for a derivation of the formula under reasonable assumptions about run distributions.)
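For readers who like to see a formula in action, here is a minimal Python sketch of the pythagorean winning percentage. The function name and the season totals (800 runs scored, 700 allowed, 162 games) are made up purely for illustration; they are not any real team's numbers.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage: RS^k / (RS^k + RA^k)."""
    rs_k = runs_scored ** exponent
    ra_k = runs_allowed ** exponent
    return rs_k / (rs_k + ra_k)


# Hypothetical season totals, chosen only for illustration.
runs_scored = 800
runs_allowed = 700
games = 162

pct = pythagorean_pct(runs_scored, runs_allowed)
expected_wins = pct * games

print(f"pythagorean winning percentage: {pct:.3f}")
print(f"expected wins over {games} games: {expected_wins:.1f}")
```

With these invented totals the formula gives a winning percentage of about .566, or roughly 92 wins over a full season.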

To simplify notation, we will use x instead of RS, y instead of RA, and write P(x,y)=x^2/(x^2+y^2) for the pythagorean winning percentage. The goal of a team's front office should be to make P(x,y) as big as possible, i.e. to win as many games as possible (actually, this is not quite right; we will discuss a complication a little later). Thus, the question of whether to focus on hitting or fielding becomes: does P(x,y) increase more when x increases or when y decreases? Ideally, we would look at this question by looking at the marginal benefit of scoring one more run compared to allowing one fewer, i.e. we would compare P(x+1,y)-P(x,y) with P(x,y-1)-P(x,y). Unfortunately, these expressions are ugly. Instead, we will note that, even though as a matter of fact x and y are always positive integers, the formula P(x,y) still makes sense if they are arbitrary real numbers. Thus instead of considering the marginal benefit P(x+1,y)-P(x,y) we instead look at the partial derivative Px(x,y) of P with respect to x, and similarly we consider -Py(x,y), the negative of the partial derivative with respect to y, in lieu of P(x,y-1)-P(x,y).
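To get a feel for how little is lost by swapping one-run differences for derivatives, here is a small Python sketch. The run totals are again invented for illustration, and the derivatives are estimated numerically with a central difference (my choice for the sketch, not a hand computation). The benefit of allowing one fewer run is taken to be P(x,y-1)-P(x,y), which is positive.

```python
def P(x, y):
    """Pythagorean winning percentage with exponent 2."""
    return x**2 / (x**2 + y**2)


def one_more_run_scored(x, y):
    """Marginal benefit of scoring one more run."""
    return P(x + 1, y) - P(x, y)


def one_fewer_run_allowed(x, y):
    """Marginal benefit of allowing one fewer run."""
    return P(x, y - 1) - P(x, y)


def partial_x(x, y, h=1e-4):
    """Central-difference estimate of the partial derivative in x."""
    return (P(x + h, y) - P(x - h, y)) / (2 * h)


def partial_y(x, y, h=1e-4):
    """Central-difference estimate of the partial derivative in y."""
    return (P(x, y + h) - P(x, y - h)) / (2 * h)


# Hypothetical totals, chosen only for illustration.
x, y = 800.0, 700.0

print("one more run scored:   ", one_more_run_scored(x, y))
print("partial derivative Px: ", partial_x(x, y))
print("one fewer run allowed: ", one_fewer_run_allowed(x, y))
print("negative of Py:        ", -partial_y(x, y))
```

At realistic season totals the one-run differences and the derivatives agree to several decimal places, which is why the substitution is harmless.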

These derivatives are trivial to compute, but I don't want to clutter this post with more symbols. After computing them, one looks at the ratio -Py(x,y)/Px(x,y); then improving defense is more valuable than improving hitting if and only if this ratio is bigger than 1. After a little algebra, one can compute that
-Py(x,y)/Px(x,y)=x/y
In other words, the ratio of the benefit of improving defense to the benefit of improving offense is equal to the ratio between runs scored and runs allowed! This is a conclusion that, to the best of my knowledge, has not appeared before. It has immediate applications to the problem at hand: one would expect a team to score more runs than it allows if and only if it is an above-average team, i.e. has a record over .500. Thus, good teams should focus on improving their pitching, while bad teams should focus on their hitting. Furthermore, the better you are, the larger the ratio x/y presumably is, and therefore the larger the relative benefit to improving defense compared to offense; the inverse point holds for bad teams. This is a surprising result: it says, for instance, that the Red Sox should be focusing more than any other team this offseason on improving their pitching, while the Devil Rays should be focusing primarily on improving their hitting.
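For readers who want to check the algebra behind the ratio, here is the computation written out with the quotient rule. The symbols are the same as above, with P_x and P_y standing for what the text writes as Px and Py.

```latex
\begin{align*}
P(x,y) &= \frac{x^2}{x^2+y^2} \\
P_x(x,y) &= \frac{2x\,(x^2+y^2) - x^2 \cdot 2x}{(x^2+y^2)^2}
          = \frac{2xy^2}{(x^2+y^2)^2} \\
P_y(x,y) &= \frac{0 \cdot (x^2+y^2) - x^2 \cdot 2y}{(x^2+y^2)^2}
          = -\frac{2x^2y}{(x^2+y^2)^2} \\
-\frac{P_y(x,y)}{P_x(x,y)} &= \frac{2x^2y}{2xy^2} = \frac{x}{y}
\end{align*}
```

The common denominator cancels in the last step, which is why the ratio comes out so clean.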

There are a few other things worth pointing out. First, as already mentioned, the exponent "2" was irrelevant to the computation; a reader who knows calculus can convince himself or herself that in fact the same ratio holds irrespective of the exponent. Furthermore, it has been shown that similar formulas work pretty well in other sports, like basketball and hockey, albeit with different exponents. Thus, the conclusion of this post--good teams should focus on defense, bad teams on offense--is true for many other sports as well.
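A reader who would rather let a computer do the convincing can try the following Python sketch. It estimates the ratio -Py/Px numerically for several exponents and compares it to x/y. The exponents and run totals below are illustrative picks of mine, not fitted values for any particular sport.

```python
def pythag(x, y, k):
    """Pythagorean winning percentage with a general exponent k."""
    return x**k / (x**k + y**k)


def benefit_ratio(x, y, k, h=1e-4):
    """Estimate -Py/Px using central differences."""
    p_x = (pythag(x + h, y, k) - pythag(x - h, y, k)) / (2 * h)
    p_y = (pythag(x, y + h, k) - pythag(x, y - h, k)) / (2 * h)
    return -p_y / p_x


# Hypothetical run totals and illustrative exponents.
x, y = 800.0, 700.0
exponents = [1.0, 1.83, 2.0, 14.0]

for k in exponents:
    print(f"exponent {k}: ratio = {benefit_ratio(x, y, k):.6f}, x/y = {x / y:.6f}")
```

Whatever exponent is plugged in, the estimated ratio should match x/y up to the small error of the numerical derivative.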

So, was MacPhail wrong? Probably, though there are still complications. For instance, it may be the case that, even though improving hitting would benefit the O's more than improving pitching, it is cheaper to improve pitching, so the marginal benefit to the team's bottom line is greater. However, a) I doubt that this is true, given how expensive pitching is, and b) I don't think MacPhail would admit on public television that he was sacrificing the success of the team to maximize profits. Of course, the Orioles won't ever contend with pitching this bad (although if Bedard and Guthrie keep it up and Mazzone can teach Cabrera to control himself, they would have a pretty nice 1-3); but at the moment, of bigger concern to MacPhail should be the fact that no one on the team has as many as 20 home runs.

