Adi is a social business blogger and community manager who writes for sites such as Social Business News and Social Media Today. Away from the computer he enjoys cycling, particularly in the Alps. Adi is a DZone Zone Leader and has posted 859 posts at DZone.

Can Wikipedia predict movie success?

08.28.2013

Using online data to predict trends is increasingly popular. Whilst a number of these endeavours have taken an altruistic angle, such as attempts to use social data to co-ordinate disaster response, even more projects have had a commercial intent.

For instance, projects have tried using Google search data to predict stock prices, or Twitter mentions to predict box office success.

The latest attempt also has its sights set on the box office, although this time researchers from Oxford University are attempting to predict a movie's success from its Wikipedia entry.

They believed that the activity level of editors, combined with the number of views a page receives, could predict the success a movie will have at the box office.

To test the theory, they analysed activity levels on 312 Wikipedia pages for movies prior to their release at the cinema. The analysis included the number of views the page received, the number of editors who had contributed to the article, the number of individual edits made and the collaborative rigour of the editing train of the article.
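To make the idea concrete, here is a minimal sketch of how four activity measures like these could feed a revenue prediction. It assumes an ordinary least-squares fit, which is an illustrative choice and not necessarily the model the researchers used. The function names, the feature values and the weights are all hypothetical, and the toy numbers are made up purely so the example runs.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(features: Sequence[Sequence[float]],
                      targets: Sequence[float]) -> List[float]:
    """Return [intercept, w1, w2, ...] minimising the squared error."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Apply fitted weights to one movie's activity measures."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature_row))


# Hypothetical, pre-scaled activity measures per movie page, in the order
# (views, editors, edits, rigour). These are NOT real Wikipedia figures.
toy_features = [
    (1, 2, 3, 1), (2, 1, 5, 0), (3, 4, 2, 2), (4, 3, 7, 1),
    (5, 6, 1, 3), (2, 5, 4, 4), (6, 1, 2, 0),
]
# Hypothetical weights used only to generate the toy revenue figures.
true_weights = [1.0, 2.0, 0.5, 0.1, 3.0]
toy_revenue = [predict(true_weights, row) for row in toy_features]

weights = fit_least_squares(toy_features, toy_revenue)
```

Because the toy revenue is generated from known weights, the fit simply recovers them. On real data the measures would first need collecting for each page over the weeks before release, and the fit would be judged on movies held out of the training set.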

They found that there were clear links between the activity of the Wikipedia page and the revenue earned at the box office.

The analysis presented here can make predictions with reasonable accuracy as early as one month before release. It is evident that the prediction is more precise for more successful movies. Some examples of the movies whose box office receipts were predicted accurately are Iron Man 2, Alice in Wonderland, Toy Story 3, Inception, Clash of the Titans, and Shutter Island.

The researchers also believed that monitoring Wikipedia represented a better approach than scouring Twitter.

The predicting power of the Wikipedia-based model, despite its simplicity compared to the Twitter-based one, can be explained by the fact that many of the Wikipedia editors are committed followers of the movie industry who gather information and edit related articles significantly earlier than the release date, whereas the “mass” production of tweets only occurs very close to the release time, mostly evoked by marketing campaigns.

What’s more, the researchers are also confident that the approach can easily be extended to other fields, including finance and public policy.