John Beieler

PhD Student in Political Science

Mapping Protest Data

Edit July 17, 2013:

Since this map got picked up by the Guardian, I thought I would clarify some points further for those who aren’t familiar with the data.

First, the GDELT data is based on news reports from a variety of sources (a list of sources used can be found here under “Data Sources”). For better or for worse, journalistic accounts of events are about the best we can do for large-scale, global projects such as this.

Second, if an event occurs but does not have a specific location within a country, e.g., “Protestors in Syria…”, the event is geolocated to the centroid of the country. This means that some locations may show odd events, and a surprisingly high number of them.

Third, the “Event Count” featured on the map is the number of protest events that occurred at that location for the entire first half of 2013. This means that if the “Event Count” variable shows 60, then there were 60 unique protest events at that location. This is not a measure of the scale or intensity of a given protest, or even of how many times a certain protest was mentioned in the news media (though GDELT does record this); it is simply a count of unique events.

Next, geolocation is hard, especially on the scale GDELT works on (300+ million events spanning over 30 years), so some of the points may not be perfect. Even if 10 million events are located in the wrong place, however, that’s still an error rate of only about 3%.

Finally, and this is mentioned in the post further below, GDELT uses the CAMEO coding scheme to classify events. This means that many different types of protest behavior are recorded, not just the protests or riots that come to mind when one thinks of Egypt or Turkey. Russia verbally protesting actions of the United States is a protest event. As a result, there are both a higher number of events and events in locations that a person might not associate with a protest or riot.

GDELT and Protests

Given the recent spate of protests around the world, there was some discussion between Jay Ulfelder, Patrick Brandt, Kalev Leetaru, Phil Schrodt, and myself about the possibility of using GDELT to examine some of the protest activity. Much of this still remains in the discussion stage, but some data was pulled from GDELT, and I decided to venture into the world of map making. As a caveat, I’ve never really worked with geographic visualization of data, and this is my first cut at this type of work. So, without further ado, the map is located at http://cdb.io/14RHla0.

Data

The data used to create the map contains all CAMEO codes that begin with 14, which is the general category for “protest” events, for the year 2013. Including data from earlier than 2013 made the map much too cluttered. A potential issue with the use of this CAMEO category is that it picks up governments protesting other governments, politicians protesting policies, etc. This is why the U.S. is blank: it was a shining beacon of protest activity that distracted from the other parts of the map. If anyone is interested I can put the U.S. data back in and regenerate the map. The data was grouped by latitude and longitude coordinates, and a count of protest events at each location is included. If you zoom in, you are able to see the individual points, which when clicked provide information about the location and number of events.
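
For anyone who wants to try something similar, here is a rough sketch of the filtering and grouping described above. It is not the code used to build the map. The field names (EventCode, ActionGeo_Lat, ActionGeo_Long, ActionGeo_CountryCode) are assumptions about how a GDELT export might be laid out, the function name and the exclude_countries parameter are hypothetical, and the sample rows are made up.

```python
from collections import Counter


def count_protest_events(events, exclude_countries=("US",)):
    """Count protest events at each (latitude, longitude) pair.

    An event is treated as a protest when its CAMEO code starts with "14".
    """
    counts = Counter()
    for event in events:
        # Compare codes as strings so that "141" and "1411" both match.
        if not str(event["EventCode"]).startswith("14"):
            continue
        # Assumed field: a country code used to leave the U.S. off the map.
        if event.get("ActionGeo_CountryCode") in exclude_countries:
            continue
        counts[(event["ActionGeo_Lat"], event["ActionGeo_Long"])] += 1
    return counts


# Made-up rows for illustration only; these are not real GDELT records.
sample_events = [
    {"EventCode": "141", "ActionGeo_Lat": 30.0444, "ActionGeo_Long": 31.2357,
     "ActionGeo_CountryCode": "EG"},
    {"EventCode": "145", "ActionGeo_Lat": 30.0444, "ActionGeo_Long": 31.2357,
     "ActionGeo_CountryCode": "EG"},
    {"EventCode": "1411", "ActionGeo_Lat": 41.0082, "ActionGeo_Long": 28.9784,
     "ActionGeo_CountryCode": "TU"},
    # Not a protest code, so it is skipped.
    {"EventCode": "043", "ActionGeo_Lat": 30.0444, "ActionGeo_Long": 31.2357,
     "ActionGeo_CountryCode": "EG"},
    # A protest code, but dropped by the default U.S. exclusion.
    {"EventCode": "141", "ActionGeo_Lat": 38.9072, "ActionGeo_Long": -77.0369,
     "ActionGeo_CountryCode": "US"},
]

event_counts = count_protest_events(sample_events)
for (lat, lon), n in event_counts.most_common():
    print(lat, lon, n)
```

Each key in the result is one point on the map and each value is that point's event count. Passing exclude_countries=() would put the U.S. events back in.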

The main takeaway from this map seems to be that GDELT does a pretty good job of capturing the broad trends of protest activity; the areas that are “bright” are those that would generally be expected to be so.

Note on Tools

The map was created using the fantastic CartoDB, along with CSS provided by Josh Stevens.