API data anomaly during Olympics

While preparing some analysis comparing this summer with last, I discovered something strange. The Ministry of Environmental Protection’s datacenter now lists two Air Pollution Index values for 8/16/08:

[Image: the two API values listed for 8/16/08]
This is very unusual for several reasons:

1) There should only be one API value for each day. Although I have occasionally noticed data points missing, I’ve never seen two different data points for the same day.

2) This double data point did not appear until months after the Olympics were over. I know this both because I was tracking the Olympic air quality data on a day-by-day basis last August, and also because I downloaded datasets earlier this year that did not include the double point.

3) There is a rather large discrepancy between the originally reported API for 8/16/08, which was 23, and the new second data point for the same day, which is 84. Because 8/16/08 fell right in the middle of the Olympics, the second data point, if correct, has implications for the overall air quality assessment of the Games.
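
For anyone who wants to run the same kind of check on their own downloads, here is a minimal Python sketch. It is not the script used for this post. It assumes each dataset has already been parsed into a list of (date, API) pairs; the function names and the ISO date strings are hypothetical choices for illustration. The only values included are the two 8/16/08 figures mentioned above.

```python
from collections import defaultdict

def find_duplicate_days(records):
    """Return {date: [api values]} for every date listed more than once."""
    by_day = defaultdict(list)
    for day, api in records:
        by_day[day].append(api)
    return {day: values for day, values in by_day.items() if len(values) > 1}

def new_records(earlier, current):
    """Return records in the current listing that were absent from an earlier download."""
    seen = set(earlier)
    return [rec for rec in current if rec not in seen]

# Assumed layout: (ISO date string, API value). Only the 8/16/08 values
# from the post are shown; a real dataset would contain every day.
earlier_snapshot = [("2008-08-16", 23)]
current_listing = [("2008-08-16", 23), ("2008-08-16", 84)]

duplicates = find_duplicate_days(current_listing)
added = new_records(earlier_snapshot, current_listing)

print(duplicates)  # {'2008-08-16': [23, 84]}
print(added)       # [('2008-08-16', 84)]
```

Comparing against an earlier download matters as much as the duplicate check itself, since it shows which of the two values appeared after the fact.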

My guess and hope is that it’s just some strange glitch in the data reporting system. I’ll continue to monitor it to see if anything changes.
