On May 3, 2006, the Motion Picture Association of America (MPAA) issued a press release sharing the results of a piracy study performed by LEK Consulting. The release read, in part:
According to the study, MPAA studios lost $6.1 billion to piracy in 2005, which is consistent with a piracy study conducted by Smith Barney in 2003 that predicted the motion picture industry would lose $5.4 billion to piracy in 2005….The average film copyright thief is male, between the ages of 16-24 and lives in an urban area. College students in the U.S., Korea and Hungary contribute the most to each country’s individual loss.
The study itself claimed that college students were responsible for 44% of the MPAA studios’ losses in the United States. This drew a lot of attention at the time (see, for example, this article in The Hollywood Reporter), but it turns out that number was a little bit off. And by “a little bit” I mean “a lot”. In another press release on January 22, 2008, the MPAA stated:
While in the process of recently updating that study with current data, we discovered there had been an isolated error in the LEK process two years ago that resulted in an inflated number for piracy by college students. The 2005 study had incorrectly concluded that 44 percent of the motion picture industry’s domestic losses were attributable to piracy by college students. The 2007 study will report that number to be approximately 15 percent — or nearly a quarter of a billion dollars in stolen content annually by college students in the U.S.
Apparently the cause of the mistake was in the assumptions that were used for modeling. According to “MPAA admit error in piracy study” in The Hollywood Reporter, in the 2005 study LEK Consulting assumed that every downloaded movie is one that would have been bought. For the 2007 study, LEK made the assumption that a person would only buy a proportion of the movies that they otherwise illegally download for free. This mistake is categorized as a data entry error although it seems more fundamental to me, since reasonable assumptions are at the heart of mathematical modeling.
Extra credit question: If the change from 44% to 15% was entirely due to assuming that college students would only buy x% of the movies they download, can you find x?
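Here is one hedged way to set up that question, as a rough sketch rather than the answer LEK or the MPAA would give. It assumes (my assumption, not theirs) that the losses attributed to everyone else stay fixed and that only the college students' losses are scaled by x. The function name `buy_fraction` is made up for illustration.

```python
# Sketch only: the "everyone else is unchanged" assumption and all names
# here are illustrative, not taken from the LEK study.

def buy_fraction(old_share: float, new_share: float) -> float:
    """Return x such that scaling the students' losses by x moves their
    share of total losses from old_share to new_share, holding all other
    losses fixed."""
    # Normalise the original total to 1: students = old_share,
    # everyone else = 1 - old_share.
    others = 1.0 - old_share
    # Solve x * old_share / (x * old_share + others) = new_share for x.
    return new_share * others / (old_share * (1.0 - new_share))


x = buy_fraction(0.44, 0.15)

# Sanity check: recompute the students' share using the solved x.
new_share = x * 0.44 / (x * 0.44 + 0.56)

print(f"x is roughly {x:.1%}")
print(f"recomputed share: {new_share:.1%}")
```

Under a different reading, where the total losses are held fixed instead, x would simply be 15/44, or about 34%. Which setup is "right" depends on modeling details the press release does not give.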
This error is interesting in and of itself, but what I found equally interesting was the challenge of describing, in numerical terms, the size of the error. The percentage error is just under 200% (using $\frac{|\text{reported} - \text{actual}|}{\text{actual}} \times 100\% = \frac{44 - 15}{15} \times 100\% \approx 193\%$). However, two of the stories I found were:
“Whoops! MPAA’s College Piracy Numbers Off by Nearly 30%” over on Rotten Tomatoes, and
It is easy to see where each of those numbers comes from, but it still amuses me how the headlines inadvertently emphasize how easy it is to make simple errors, even as the MPAA correction emphasizes that those errors can sometimes have significant consequences.
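To make the arithmetic concrete, here is a small sketch of how the same gap between 44% and 15% can be described in different ways. The "nearly 30%" and "just under 200%" figures come from the text above. The third number (the ratio) is just another way to describe the gap that I am adding for comparison, and the variable names are mine.

```python
reported = 44.0   # percent of losses attributed to students in the 2005 study
corrected = 15.0  # percent, per the January 2008 correction

# Difference in percentage points: the source of "off by nearly 30%".
point_gap = reported - corrected

# Error relative to the corrected value: the "just under 200%" figure.
relative_error = point_gap / corrected * 100

# How many times larger the reported figure was than the corrected one.
overstatement = reported / corrected

print(f"{point_gap:.0f} percentage points")
print(f"{relative_error:.1f}% relative error")
print(f"{overstatement:.2f}x the corrected figure")
```

The "nearly 30%" headline is really a difference of 29 percentage points, which is a much smaller-sounding number than the 193% relative error even though both describe the same mistake.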
Embarrassing Confession: I posted this last night, and when I looked at it this morning I found an error in my math formula.