This semester I’m taking a class on terrorism. Overall I’ve found the class very enjoyable; political violence is always a fascinating topic. With that said, one issue popped up repeatedly in the readings for the seminar. Almost every paper would lay out a theory, discuss the data used, and run some statistical tests, which is pretty standard social science research. The issue arose in the “Discussion” or “Conclusion” sections. Almost invariably the authors would discuss the practical implications of their research, which is fine until the dreaded word “prediction” appears. Then claims about the predictive accuracy of the models used in the paper would rear their ugly heads, even though these models were explicitly not predictive models. This became the soapbox that I would drag out repeatedly throughout the semester. Finally, I decided to put my own assertions to the test and see how some models performed on out-of-sample predictive tests.
The Paper
The week I chose to perform this analysis we discussed papers that focused on counterterrorism policies. One paper was “Foreign Aid Versus Military Intervention in the War on Terror” (behind a paywall) by Jean-Paul Azam and Veronique Thelen. A full discussion of their theory is beyond the scope of this post, but the authors’ basic argument is that foreign aid can reduce the number of transnational terrorist events for a number of reasons. They test this general assertion using a variety of models. Luckily, they provide replication data for their paper on the Journal of Conflict Resolution website linked above. I set out to replicate exactly the model presented in Table 2, Model 4, for those following along in the paper.
The Replication
The first step to replicating the paper was to run their Stata do-file on
the data they provided. This is done to ensure that I have exactly the same
variables in my models that they do in theirs, specifically the residualized
variables they present. As a note, I will avoid commenting on the general
modeling choices made in the paper; this post is only about the predictive
accuracy of models on out-of-sample data. I work with what the authors worked
with. After running the do-file, I check to make sure that the results
generated by Stata are the same as those presented in the paper, which is
roughly true in the case of the coefficient estimates. I then take the data,
including the newly generated variables, and read it into Python using
statsmodels. After the data is read in, I replicate the formula used in
the paper, generate a train-test split of 75% and 25%, and then fit the
negative binomial model. The random 75%-25% split is appropriate here since
the data is cross-sectional rather than time series in nature.
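The split-and-fit step can be sketched roughly as follows. This is a minimal illustration, not the code from the original analysis: the data is simulated, and the column names (aid, intervention, events) and the formula are hypothetical stand-ins for the variables in the replication file. It assumes numpy, pandas, and statsmodels are installed.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the replication data (assumption: the real
# file has different columns and many more covariates).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "aid": rng.normal(size=n),           # hypothetical regressor
    "intervention": rng.normal(size=n),  # hypothetical regressor
})
mu = np.exp(1.0 - 0.3 * df["aid"] + 0.2 * df["intervention"]).to_numpy()
alpha = 0.5  # overdispersion, used only to simulate the counts
# A gamma-Poisson mixture yields negative binomial counts with mean mu.
df["events"] = rng.poisson(rng.gamma(shape=1 / alpha, scale=alpha * mu))

# Random 75%/25% train-test split.
train = df.sample(frac=0.75, random_state=0)
test = df.drop(train.index)

# Fit the negative binomial model on the training rows only.
mod = smf.negativebinomial(
    "events ~ aid + intervention", data=train
).fit(disp=0, maxiter=200)
print(mod.params)
```

With the real data, the DataFrame would come from the Stata file written by the do-file (for example via pandas.read_stata) rather than from simulation.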
The results of this analysis roughly match the results presented in the paper.
Some differences are to be expected, of course, due to the reduced number of
observations. Since these results seem to match those in the paper, the
out-of-sample forecasts can be run. This is done using the predict method
of the mod object. The true and predicted results are then compared using
the root mean squared error. I’m not completely confident in the use of the
RMSE, but it is a quick and easy way to perform a comparison in this case.
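For reference, the RMSE comparison amounts to the small helper below. The observed and predicted values here are made up purely to show the arithmetic; in the analysis itself, the predictions would come from calling predict on the fitted model with the held-out 25% of the data.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Made-up event counts and predictions, for illustration only.
observed = [0, 2, 10]
predicted = [1.0, 2.0, 6.0]
print(rmse(observed, predicted))  # sqrt(17 / 3), about 2.38
```

Because the errors are squared before averaging, a few badly missed observations with large event counts can dominate the RMSE.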
The RMSE shows that the predicted values tend to be off by about 23 events. This indicates that the model does not predict the number of terrorist events well on out-of-sample data. Admittedly, this analysis may not be the most favorable to Azam and Thelen. In order to give them the benefit of the doubt, I also used another modeling technique to see if this inaccuracy persists. I used a random forest with the same set of variables used in the negative binomial model and then examined the RMSE.
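A random forest version of the same exercise might look like the sketch below. It assumes scikit-learn is available and again uses simulated counts with two hypothetical regressors in place of the replication data; the forest settings (200 trees, fixed seed) are illustrative choices, not the ones used for the numbers reported in this post.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Simulated stand-in data (assumption: two hypothetical regressors).
rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 2))
mu = np.exp(1.0 - 0.3 * X[:, 0] + 0.2 * X[:, 1])
# Overdispersed counts from a gamma-Poisson mixture with mean mu.
y = rng.poisson(rng.gamma(shape=2.0, scale=mu / 2.0))

# Same 75%/25% random split as before.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Fit on the training rows, predict the held-out rows.
forest = RandomForestRegressor(n_estimators=200, random_state=0)
forest.fit(X_train, y_train)
pred = forest.predict(X_test)

# Out-of-sample RMSE, comparable to the negative binomial result.
rf_rmse = float(np.sqrt(np.mean((y_test - pred) ** 2)))
print(rf_rmse)
```

Using the same split for both models keeps the two RMSE values directly comparable.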
Even using a method that is possibly more robust, the prediction still doesn’t seem very satisfactory.
Models in the Social Sciences
In case it was not explicitly clear in the beginning of this post, I want to make it clear now: this analysis is not a direct critique of Azam and Thelen. I think their theory is sound and it makes sense that foreign aid and the other variables they analyze can make a difference in terrorism. I also do not think that the type of inferential statistics used by Azam and Thelen, and commonly used elsewhere in political science, is wrong in all cases. I do believe we can gain knowledge from models like those presented by Azam and Thelen. What I do take issue with, however, is the call to arms that is frequently sounded in the conclusions of papers, often directed towards policy makers, that the results of the paper predict phenomenon X and thus policy Y should be enacted, especially when policy Y is something like military intervention. Before making these types of claims, some type of out-of-sample test should be performed to determine how well the model actually predicts the outcome in question.