An analysis by a specially convened team of academic researchers, projecting the finish times of runners at the 2013 Boston Marathon, will be presented publicly for the first time at the New England Symposium on Statistics in Sports, to be held on September 21 at Harvard University in Cambridge, MA.
The Symposium is a meeting of statisticians and quantitative analysts connected with sports teams, sports media, and universities to discuss common problems of interest in statistical modeling and analysis of sports data. The symposium is part of a year-long series of programs and events around the world during the International Year of Statistics.
The scheduled presentations include a statistical model for predicting the finish times of individuals who were running in the 2013 Boston Marathon but were unable to complete the race when it was abruptly halted after bombs exploded near the finish line last April. That research was done at the request of the Boston Athletic Association, organizer of the Boston Marathon.
The researchers will explain how multi-year data was analyzed “to create projected times for this year’s runners and discuss some features of the resulting projections.” The BAA opted not to use the statistical model that was developed and provided to the race organizers, choosing instead to use a direct extrapolation of an individual’s time at the point the race was stopped – a result that was, in most cases, more favorable to the runners than the more complex analytical model developed by the research team. Runners were provided those times within two months of the Marathon.
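To illustrate what a direct extrapolation can look like, here is a minimal sketch that assumes a runner would have held the same average pace from the last recorded point to the finish. The function name, the units (seconds and kilometers) and the constant-pace assumption are ours for illustration; the article does not describe the exact calculation the BAA used.

```python
MARATHON_KM = 42.195  # standard marathon distance in kilometers


def extrapolate_finish(split_seconds, split_km, total_km=MARATHON_KM):
    """Project a finish time by assuming the average pace so far is held.

    Hypothetical helper for illustration only, not the BAA's actual method.
    """
    if split_km <= 0 or split_km > total_km:
        raise ValueError("split_km must be between 0 and the race distance")
    if split_seconds <= 0:
        raise ValueError("split_seconds must be positive")
    pace = split_seconds / split_km  # seconds per kilometer so far
    return pace * total_km


# Example: a runner who reached the 40 km mark in exactly 3 hours.
projected = extrapolate_finish(3 * 3600, 40.0)
print(round(projected))  # projected finish in seconds
```

Because many runners slow in the final miles, a constant-pace projection of this kind would tend to give a faster time than a model that accounts for late-race slowing, which is consistent with the extrapolated results being more favorable to most runners.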
Just over 5,600 official entrants were unable to cross the Boylston Street finish line on April 15 when the race was stopped at 2:50 p.m. Of those, 2,611 are from Massachusetts and 726 are international participants, according to the Boston Athletic Association. In total, residents of all 50 U.S. states (and four U.S. territories) and 47 countries are among this group.
The lead researcher was Richard L. Smith, the Mark L. Reed III Distinguished Professor of Statistics and Professor of Biostatistics at the University of North Carolina, Chapel Hill. He is also Director of the Statistical and Applied Mathematical Sciences Institute, which is supported by the National Science Foundation. Smith has previously run in the Boston Marathon, as have three members of his research team.
The researchers included Dorit Hammerling of the Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, North Carolina; Matthew Cefalu and Francesca Dominici of the Department of Biostatistics, Harvard School of Public Health; Jessi Cisewski of the Department of Statistics, Carnegie Mellon University; Amy Grady of the Department of Statistics and Operations Research, University of North Carolina at Chapel Hill; Charles Paulson of Puffinware LLC, State College, PA; and Giovanni Parmigiani of the Department of Biostatistics, Harvard School of Public Health and Dana Farber Cancer Institute, Boston.
Smith said that while the BAA decision is “perfectly understandable,” the statistical model his team developed has been credited by the BAA as being helpful in their decision-making, and has both merit and validity that would be of interest to statisticians, runners and sports fans. The model has potential for future use in projecting runners’ finish times from intermediate times during the race, and the research team intends to focus their Symposium presentation on that potential.
The research team analyzed the times of individual runners from the 2010 and 2011 Marathons (2012 data was not used because the day was unusually hot, unlike 2013) at various points of the 26-mile course and developed a statistical model that projected the finish for every 2013 runner based on how similar runners finished in the previous years. Many runners, for example, tend to slow, but at differing rates, in the race’s final miles. Others have a strong finish. The “sophisticated” analysis was developed to offer a more elaborate extrapolation of what individual finish times might have been. The researchers provided the BAA with a “complete file” indicating a projected finish for each runner.
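The general idea of projecting a finish from how similar runners fared in earlier years can be sketched as follows. This is not the research team's model, whose details are not given here; it is a simple nearest-neighbor stand-in. The function name, the choice of `k`, and the sample data are all hypothetical.

```python
def project_finish(split_seconds, history, k=3):
    """Project a finish time from the k most similar past runners.

    history: list of (split_seconds, finish_seconds) pairs recorded at the
    same checkpoint in earlier races. Illustrative only; the actual model
    developed by the research team is not described in this article.
    """
    if not history:
        raise ValueError("history must not be empty")
    # Pick the k past runners whose split is closest to this runner's split.
    nearest = sorted(history, key=lambda rec: abs(rec[0] - split_seconds))[:k]
    # Average how much those runners' finish times exceeded their splits.
    ratios = [finish / split for split, finish in nearest]
    return split_seconds * sum(ratios) / len(ratios)


# Made-up past results: (time at checkpoint, finish time), in seconds.
past_results = [
    (10000, 11000),
    (10100, 11110),
    (9900, 10890),
    (20000, 30000),  # a much slower runner, ignored when k=3
]
print(round(project_finish(10000, past_results, k=3)))
```

Unlike a constant-pace extrapolation, an approach of this kind lets the projection reflect late-race slowing or strong finishes observed among comparable runners.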
“It was an interesting challenge, and we were pleased to be asked by the BAA to work on this project,” Smith said. “Their decision makes perfect sense, but we are proud of our work and the way in which it came together.” Smith said that in addition to the first-time presentation at the Symposium, the team plans to publish their work in a professional journal.
The group of official entrants who were prevented from completing the race includes 2,983 women and 2,650 men, and ages range from 18 to 82. A month after the race, the B.A.A. announced that all of the official entrants who did not finish would be invited back to participate in the 2014 Boston Marathon, to be held on April 21. A special registration period for those individuals closed last Thursday, and more than 4,500 runners have signed up. As part of a rolling registration process, registration for runners who have qualifying times opens on September 9.
The BAA has also announced that the 118th Boston Marathon field will be increased to 36,000 due to increased interest in next year’s race. Traditionally, the field numbers about 25,000. Last year, just over 400 participants from Connecticut were registered for the race. The largest field in recent years was in 1996, for the 100th anniversary of the race, when 38,708 individuals registered.
“We understand many marathoners and qualifiers want to run Boston in 2014, and we appreciate the support and patience that the running community has demonstrated because of the bombings that occurred this past Spring,” said B.A.A. Executive Director Tom Grilk.
The conference co-chairs of the New England Symposium on Statistics in Sports are Mark Glickman and Scott Evans. Glickman is Senior Statistician at the Center for Healthcare Organization and Implementation Research and a Research Professor in the Department of Health Policy and Management at Boston University. Evans is a Senior Research Scientist in the Department of Biostatistics at Harvard’s School of Public Health. Registration is now open for the Sept. 21 symposium.
© 2013 CT by the Numbers