An Excel spreadsheet can be used to perform the simulation, and a player’s statistics can be entered into the sheet to set the probabilities of the various outcomes. There are 17 possible outcomes to each plate appearance: strikeout, walk, HBP, error, short/medium/long single, short/long double, triple, home run, ground out, ground into double play (with men on base), line drive/infield fly, and short/medium/long fly out. Innings are simulated, and the number of runs scored per inning (and, by extension, per game) is recorded.

The obvious upside to this method is that a true simulation is used, which should be much more accurate than relying on raw statistics alone. The downside is that there is currently no way to factor in steals, or the fact that some players are better than others at taking the extra base on a hit. However, this should not have a significant effect on the study.

The players who were in the top 10 for Batting Value in 2009 according to fangraphs.com were examined. A total of 1440 innings, or 160 “games,” were simulated for each player. The table of results follows:
Player | Runs/Game |
Albert Pujols | 10.24 |
Joe Mauer | 9.90 |
Prince Fielder | 8.71 |
Hanley Ramirez | 8.45 |
Mark Teixeira | 8.00 |
Ben Zobrist | 7.83 |
Adrian Gonzalez | 7.76 |
Miguel Cabrera | 7.68 |
Derrek Lee | 7.64 |
Ryan Braun | 7.55 |
The results are not very surprising. Albert Pujols and Joe Mauer were the MVPs of their respective leagues and by a considerable margin the best offensive players in Major League Baseball. For perspective, the average runs scored per game in the NL was 4.43, and in the AL it was 4.82.
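For readers who would rather script the simulation than build it in Excel, here is a rough Python sketch of the idea. It is not the spreadsheet itself: it assumes a lineup made up of the same hitter nine times, collapses the 17 outcomes into six, and uses simple base-advancement rules of my own choosing (runners move up exactly as many bases as the batter, and walks only force runners). The probabilities in `OUTCOME_PROBS` are made-up placeholders, not any real player's numbers.

```python
import random

# Hypothetical per-plate-appearance probabilities, for illustration only.
# They are not any real player's line. The "out" probability must be > 0,
# otherwise an inning would never end.
OUTCOME_PROBS = {
    "out": 0.660,
    "walk": 0.090,      # walks and HBP lumped together (a simplification)
    "single": 0.160,
    "double": 0.050,
    "triple": 0.005,
    "home_run": 0.035,
}

BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "home_run": 4}


def advance(bases, outcome):
    """Apply a non-out outcome to (first, second, third); return (bases, runs)."""
    first, second, third = bases
    if outcome == "walk":
        # Only forced runners move; a run scores only with the bases loaded.
        runs = 1 if (first and second and third) else 0
        new_third = third or (first and second)
        new_second = second or first
        return (True, new_second, new_third), runs

    gained = BASES_GAINED[outcome]
    runs = 0
    new_bases = [False, False, False]
    # Position 0 is the batter; positions 1-3 are the runners on base.
    for position, occupied in enumerate((True, first, second, third)):
        if not occupied:
            continue
        destination = position + gained
        if destination >= 4:
            runs += 1
        else:
            new_bases[destination - 1] = True
    return tuple(new_bases), runs


def simulate_inning(probs, rng):
    """Play one half-inning and return the number of runs scored."""
    outcomes = list(probs)
    weights = list(probs.values())
    bases, outs, runs = (False, False, False), 0, 0
    while outs < 3:
        outcome = rng.choices(outcomes, weights=weights)[0]
        if outcome == "out":
            outs += 1
        else:
            bases, scored = advance(bases, outcome)
            runs += scored
    return runs


def runs_per_game(probs, games=160, innings=9, seed=0):
    """Average runs per game over games * innings simulated innings."""
    rng = random.Random(seed)
    total = sum(simulate_inning(probs, rng) for _ in range(games * innings))
    return total / games
```

Calling `runs_per_game(OUTCOME_PROBS)` runs the same 1440 innings (160 games of nine innings) used in the study. Swapping in a real player's outcome rates, or splitting the hits and outs into the finer categories listed earlier, only requires changing the dictionary and the `advance` rules.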
The Runs/Game figures in the table above can be used to roughly gauge a player’s value. With them, we can see which statistics correlate most highly with Runs/Game as given by the Monte Carlo simulation, and therefore which statistics matter most to a team’s success. The results will likely tell us nothing new, but they will be interesting nonetheless.
Two statistics had correlations over .9: OBP and OPS. Remember, the closer the correlation is to 1, the more strongly the two statistics are related.
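As a reminder of what a correlation is, here is a small Python sketch of the Pearson correlation coefficient, which I am assuming is the measure in question. `RUNS_PER_GAME` is the simulated column from the results table; `placeholder_stat` is a made-up column built directly from it, purely to show what a perfect correlation looks like. It is not a real statistic for these players.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)


# Simulated Runs/Game for the ten players, in table order.
RUNS_PER_GAME = [10.24, 9.90, 8.71, 8.45, 8.00, 7.83, 7.76, 7.68, 7.64, 7.55]

# Placeholder column (NOT real data): an exact linear function of Runs/Game,
# so its correlation with Runs/Game is 1 up to rounding error.
placeholder_stat = [2 * r + 1 for r in RUNS_PER_GAME]

perfect = pearson_r(RUNS_PER_GAME, placeholder_stat)
```

To reproduce a row of the correlation table, replace `placeholder_stat` with the ten players' actual values for that statistic, listed in the same order.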
Statistic | Correlation |
AVG | 0.579 |
OBP | 0.928 |
SLG | 0.796 |
OPS | 0.922 |
Hits | 0.244 |
HR | 0.258 |
Walks | 0.356 |
This should not surprise any good stat-minded baseball fan. On-Base Percentage has the highest correlation and OPS is just behind. Slugging Percentage is the only other statistically significant correlation, which is of little surprise as well. Here are the scatter plots of Average Runs/Game against OBP and AVG. Note that “Average” is the R/G from the table above.
As you can see, there is a fairly strong linear relationship between R/G and OBP, while the relationship between R/G and AVG is considerably more scattered.
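The claim about statistical significance can be checked with a quick Python sketch. It assumes each correlation was computed over the ten players in the results table and uses a two-tailed test at the 5% level (my choice of threshold). The test statistic is t = r * sqrt(n - 2) / sqrt(1 - r^2), compared against 2.306, the critical value of Student's t with 8 degrees of freedom.

```python
import math

N_PLAYERS = 10        # assumed sample size: the ten players in the table
T_CRITICAL = 2.306    # two-tailed 5% critical value, Student's t, 8 df

# Correlations with simulated Runs/Game, copied from the table.
CORRELATIONS = {
    "AVG": 0.579,
    "OBP": 0.928,
    "SLG": 0.796,
    "OPS": 0.922,
    "Hits": 0.244,
    "HR": 0.258,
    "Walks": 0.356,
}


def t_statistic(r, n):
    """t statistic for testing whether a correlation r over n points is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


significant = []
for stat, r in CORRELATIONS.items():
    t = t_statistic(r, N_PLAYERS)
    if abs(t) > T_CRITICAL:
        significant.append(stat)
    print(f"{stat:<6} r = {r:.3f}  t = {t:.2f}")

print("Significant at the 5% level:", ", ".join(significant))
```

Under these assumptions only OBP, SLG and OPS clear the threshold. AVG comes close but falls short with a sample of ten players.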
Interesting Baseball Fact of the Day: The lowest single-season ERA for a pitcher who gave up more than one hit per inning (since 1961) belongs to Tommy John in 1977. John gave up 225 hits in 220.1 IP en route to a 2.78 ERA, a 1.25 WHIP, a 20-7 record and a second-place finish in the Cy Young voting.