Monday, March 16, 2009

David Ortiz's Home Run Distribution in Relation to Game Score

For a while I have wanted to crunch some numbers, so for no particular reason I will examine David Ortiz’s home run totals in relation to the score of the game. We will find out whether he hits more home runs in closer games, and whether any difference between his actual and expected home run distributions is statistically significant. I’ll try to keep it as basic as possible, and where I can I’ll provide links if you want to read more about the things I mention.

First, here is the table showing how many home runs David Ortiz hit in each run differential situation (+4… means up by 4 or more runs, +3 means up by 3 runs, etc.) and his expected home runs in each situation. For example, about 12% (651 out of 5428) of David Ortiz’s plate appearances came when his team was up by 4 or more runs. Therefore, if he hits home runs at a consistent rate no matter what the score is, 12% of his home runs should come when his team is up by 4 or more runs, or 34.66 of his 289 home runs. The “X^2 Value” is the chi-square contribution for each situation, (actual - expected)^2 / expected, and the Total column adds those contributions into the test statistic. In a sentence, a chi-square goodness-of-fit test checks whether observed counts match the counts you would expect under some assumption; here, it will tell us whether David Ortiz’s home runs are spread across the run differentials in proportion to his plate appearances in each situation. If you don’t care to learn about chi-square tests, you can completely ignore the previous two sentences and the numbers in the bottom row, because I’ll explain everything without fancy numbers.

Score         +4…     +3     +2     +1   Tied     -1     -2     -3    -4…  Total
PA            651    323    415    584   1498    613    451    287    606   5428
HR             29     21     30     23     89     33     22     13     29    289
Exp. HR     34.66  17.20  22.10  31.09  79.76  32.64  24.01  15.28  32.26    289
X^2 Value   0.925  0.841  2.828  2.107  1.071  0.004  0.169  0.340  0.330  8.614
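If you want to check the table yourself, here is a minimal Python sketch of the arithmetic described above. The plate appearance and home run counts are copied from the table; the function and variable names are my own placeholders, not anything standard.

```python
# Plate appearances and home runs by run differential, copied from the table.
labels = ["+4...", "+3", "+2", "+1", "Tied", "-1", "-2", "-3", "-4..."]
pa = [651, 323, 415, 584, 1498, 613, 451, 287, 606]
hr = [29, 21, 30, 23, 89, 33, 22, 13, 29]


def chi_square_cells(observed, weights):
    """Return (expected counts, per-cell chi-square terms).

    Expected counts split the observed total in proportion to the weights,
    which here are plate appearances in each situation.
    """
    total_obs = sum(observed)
    total_w = sum(weights)
    expected = [total_obs * w / total_w for w in weights]
    terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    return expected, terms


expected, terms = chi_square_cells(hr, pa)
for label, e, t in zip(labels, expected, terms):
    print(f"{label:>5}  expected HR {e:6.2f}  X^2 term {t:.3f}")
print(f"total X^2: {sum(terms):.3f}")
```

Up to rounding, the printed values should line up with the "Exp. HR" and "X^2 Value" rows. Swapping in a different set of counts only requires changing the `pa` and `hr` lists.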


And the data graphically:



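Just by looking at the data, the expected home runs match the actual home runs pretty well, and the test supports that conclusion: with nine categories (8 degrees of freedom), the total of 8.614 is well below the 15.507 needed for significance at the 5% level. In other words, David Ortiz’s home runs are distributed across the run differentials the way his plate appearances in each situation say they should be.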
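For anyone curious how a chi-square total turns into a verdict, here is a rough Python sketch. It assumes 8 degrees of freedom (nine run differential categories minus one), which is my reading of the setup rather than something stated above, and it uses a closed-form tail formula that only works for an even number of degrees of freedom. The function name is made up for illustration.

```python
import math


def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** k / math.factorial(k) for k in range(df // 2)
    )


total = 8.614  # total X^2 from the table
df = 9 - 1     # assumption: nine categories, so 8 degrees of freedom

p_value = chi_square_sf_even_df(total, df)
print(f"p-value: {p_value:.3f}")
print("significant at 5%?", p_value < 0.05)
```

A p-value above 0.05 means the gaps between actual and expected home runs are about what random variation would produce, so the test gives no reason to doubt that the home runs follow the plate appearances.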

However, this does not tell the whole story. As Ortiz has spent most of his career hitting third, he nearly always starts the game with a plate appearance in a tied game. Also, coming up tied or down one in the 3rd or 4th inning is much different from doing so in the 8th or 9th. It may be more interesting to look at these same numbers restricted to the 7th inning or later. So let’s do that. Here is the same table as before, but now it only includes plate appearances and home runs in the 7th inning or later.


Score         +4…     +3     +2     +1   Tied     -1     -2     -3    -4…  Total
PA            296    102    128    144    232    173    160    122    321   1678
HR             14      6      9      6     20     11      5      6     13     90
Exp. HR     15.88   5.47   6.87   7.72  12.44   9.28   8.58   6.54  17.22     90
X^2 Value   0.222  0.051  0.664  0.385  4.589  0.319  1.495  0.045  1.033  8.802


And the data graphically:



Here, the story looks a little different. The test still tells us that, as a whole, David Ortiz’s home runs are distributed as they should be based on his plate appearances (the total of 8.802 is well below the 15.507 needed for significance at the 5% level). However, one weakness of the chi-square test is that it only looks at the distribution as a whole and cannot tell us whether a single difference is statistically significant. As is plain from the graph, there is a large discrepancy between Ortiz’s expected and actual home runs in tied games in the 7th inning or later, a difference that could be seen to a lesser degree in the previous graph. To be exact, he was expected to hit only 12.44 home runs in those situations, but in his career he has hit 20. A discrepancy in the other direction can be seen when his team is up or down by 4 or more runs. In those less pressure-packed (and less important) situations, Ortiz hits fewer home runs than expected.

There are two possible explanations for this. One is that the differences are not significant at all and that random chance accounts for them. The other is that Ortiz simply tries harder when the game is on the line. If the game is close late, a hitter will be concentrating more, and it is possible he will perform better. If the outcome is already decided, a player like David Ortiz will likely be mailing it in, and therefore may hit fewer home runs.

I am of the opinion that for the situations where the game is decided, the second explanation works: Ortiz will not be trying as hard. However, I think it is much harder to “turn it on” in late and close situations. The difference in home runs in late tie games is likely due to chance, not Ortiz’s so-called clutch hitting ability. Still, the debate about clutch hitting is a never-ending one that has yet to be firmly settled with statistics. You will have to form your own opinion on this one.
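For readers who want to poke at the tied-game number by itself, here is a rough Python sketch. It is a different technique from the chi-square test used above: an exact binomial tail, under the assumption that each of the 90 late home runs independently lands in a tied game with probability 232/1678 (the tied-game share of late plate appearances). The function and variable names are made up for illustration.

```python
import math


def binomial_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


late_hr_total = 90       # home runs in the 7th inning or later
tied_hr = 20             # of those, hit with the score tied
tied_share = 232 / 1678  # share of late plate appearances with the score tied

tail = binomial_upper_tail(tied_hr, late_hr_total, tied_share)
print(f"chance of {tied_hr} or more tied-game home runs: {tail:.4f}")
```

Treat the result with caution. The tied-game cell was singled out only after looking at all nine cells, and with nine chances for something to stand out, one unusual-looking cell is much less surprising than the raw probability suggests.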

Interesting baseball fact of the day: Of David Ortiz’s 9 career walk-off home runs, 4 have been hit to right field, 3 to center field, 1 to left-center field, and 1 to left field.

P.S. If anybody has an interesting baseball question that might be answered with numbers or otherwise, post it in the comments. I'd be more than happy to have some ideas to play around with.
