Here are 11,904 rows of new data from the US Department of State’s 2018-2019 Foreign Military Training Report

This weekend, security and human rights wonks received an early Christmas present from the U.S. Department of State: the Foreign Military Training and DoD Engagement Activities of Interest, 2018-2019.

Published annually since 2000, this extensive report contains detailed information about non-classified training activities the Department of State and Department of Defense have funded and delivered to non-US security forces in the preceding fiscal year. The 2018-2019 report, then, delivers detailed data on trainings that took place in the 2017-2018 period, along with a less detailed list of those planned and ongoing in the 2018-2019 period.

Earlier in the year, I blogged about why the data in these reports are important, and why at Security Force Monitor we started a project to make the data more accessible:

The value of this very detailed data is high but its accessibility is very low. This is because of the backwards way the data are published: 1000s of pages of tables in PDFs […] Our hope is that when the next report arrives in a short few months, we will be able to turn it into machine readable data and pass it around the sector in minutes, rather than months.

So, over this weekend we grabbed the 462-page PDF of the 2018-2019 report and extracted 11,904 rows of new training data from Volume I: Section IV, which details trainings that took place in the 2017-2018 period. We’ve added this data to our online database so you can search and download it:

https://trainingdata.securityforcemonitor.org/state-department-data/training_data?source_uuid__exact=8d439057-bc2d-4141-8ff3-48d6842150eb

(To download the data for use in a spreadsheet, scroll down until you see “Advanced export” – tick “Download file” and “Stream all rows” and then select “Export CSV”)
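If you would rather script the download than click through the export form, the sketch below builds an equivalent CSV link. It assumes the platform follows Datasette-style URL conventions, where adding `.csv` to the table path returns CSV and `_stream=on` streams all rows. That is an inference from the link and export options above, not something confirmed here, and the helper name `csv_export_url` is hypothetical.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Table URL and source UUID are copied from the link above.
TABLE_URL = (
    "https://trainingdata.securityforcemonitor.org/"
    "state-department-data/training_data"
)
SOURCE_UUID = "8d439057-bc2d-4141-8ff3-48d6842150eb"


def csv_export_url(table_url, filters, stream_all=True):
    """Build a CSV export link for a table URL (hypothetical helper).

    Assumes Datasette-style conventions: a ".csv" suffix on the table
    path and "_stream=on" to return every row instead of one page.
    """
    parts = urlsplit(table_url)
    params = dict(filters)
    if stream_all:
        params["_stream"] = "on"  # assumption: mirrors "Stream all rows"
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path + ".csv", urlencode(params), "")
    )


url = csv_export_url(TABLE_URL, {"source_uuid__exact": SOURCE_UUID})
print(url)
```

The printed link can be pasted into a browser or passed to a download tool. If the platform does not behave as assumed, the export form described above remains the reliable route.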

With the inclusion of this new 2018-2019 data, we’re publishing data on 213,603 trainings in 185 countries between 1 October 2000 and 30 September 2018.

The data publishing platform is flexible and powerful and gives you tools to search, sort, facet, filter and download all or parts of the dataset. The platform offers a user interface that you can use to build queries; you can also query the data directly using Structured Query Language (SQL), but this will take a bit of practice to master. I’ll write up some better instructions and add them to our Research Handbook in due course. In the meantime, here are some queries to help you get started:

The nature of the automated data extraction process means we can’t guarantee that the data are error-free. You can see the complete extraction code and process over on GitHub, and we’ve done our best to ensure that it accurately turns the content of the original PDF report into data. Before using this data, however, please check it against the source.
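One quick first check after downloading is to confirm that the export has the number of rows you expect. The sketch below counts the data rows in a CSV and compares the result with the 11,904 figure quoted in this post. The function name and the tiny in-memory sample (including its column names) are made up for illustration. With a real export you would open the downloaded file and pass it in instead.

```python
import csv
import io

# Row count quoted in this post for the 2018-2019 report.
EXPECTED_ROWS = 11_904


def count_data_rows(csv_file):
    """Count the rows after the header line in an open CSV file."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for _ in reader)


# Illustrative sample only: these column names and values are invented.
sample = io.StringIO("id,country\n1,A\n2,B\n")
rows = count_data_rows(sample)
print(rows, "rows;", "matches" if rows == EXPECTED_ROWS else "does not match")
```

A matching count only shows that nothing was dropped during download. It says nothing about whether individual values were extracted correctly, so spot checks against the PDF are still needed.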

Happy data wrangling!

Image: Tom Longley, CC-BY-4.0
