This repo contains a conveniently-packaged CSV file containing data from the Pentagon's military-gear-for-civilian-police program (a.k.a LESO 1033), as well as the Python 3 scripts for reproducing the process.
In response to some very persistent Freedom of Information Act requests by investigative journalists, the Pentagon now posts the LESO 1033 records as Excel files. However, the data is split across many sheets. The code in this repo cleans and concatenates the data, and also joins it to the product codes from the federal procurement system.
If you just want the data as a plaintext, comma-delimited file with nearly 195,000 rows, you can download it from here:
raw/master/stash/compiled/leso-1033.csv
I make no guarantees about the accuracy of this dataset. I have not double-checked the numbers, so it's up to you to confirm them. This repo contains the original files, and you can of course read and run the code to reproduce the result yourself.
I've listed a number of data journalism stories and projects that you can cross-check against. In fact, I highly recommend you read these stories to understand the origins of this dataset, because like all datasets, it is not all that it seems to be. At the very least, remember to multiply the Quantity column by the Acquisition Value when calculating totals.
A sample row with headers:
Header | Sample value |
---|---|
State | TN |
Station Name (LEA) | SMITH COUNTY SHERIFF DEPT |
NSN | 2355-01-590-1660 |
Item Name | MINE RESISTANT VEHICLE |
Quantity | 1 |
UI | Each |
Acquisition Value | 733000 |
DEMIL Code | C |
DEMIL IC | 1 |
Ship Date | 2014-02-13 |
PSC NAME | COMBAT, ASSAULT, AND TACTICAL VEHICLES, WHEELED |
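As a minimal sketch of the advice above about multiplying Quantity by Acquisition Value, the snippet below totals a tiny in-memory CSV using only the standard library. The second row is hypothetical and exists only to show a quantity greater than one. The sketch also assumes the Acquisition Value column holds plain numbers (no currency symbols or thousands separators), so check that against the real file before relying on it.

```python
import csv
import io
from decimal import Decimal

# Illustrative data: the first row mirrors the sample row above,
# the second row is hypothetical.
SAMPLE_CSV = """State,Station Name (LEA),Item Name,Quantity,Acquisition Value
TN,SMITH COUNTY SHERIFF DEPT,MINE RESISTANT VEHICLE,1,733000
XX,HYPOTHETICAL POLICE DEPT,FIELD PACK,10,50.25
"""


def total_value(csv_file):
    """Sum Quantity * Acquisition Value over every row of an open CSV file."""
    total = Decimal("0")
    for row in csv.DictReader(csv_file):
        # Acquisition Value is treated as a per-unit cost, so scale by Quantity.
        total += Decimal(row["Quantity"]) * Decimal(row["Acquisition Value"])
    return total


print(total_value(io.StringIO(SAMPLE_CSV)))
```

To run this against the compiled file, you could pass `open("stash/compiled/leso-1033.csv", newline="")` in place of the `io.StringIO` object.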
The U.S. Defense Department's Law Enforcement Support Office (LESO) 1033 Program was originally established as a way to send unused surplus military gear to U.S. law enforcement agencies involved in the War on Drugs. The 1033 program was later expanded to any U.S. law enforcement agency, though preference is given to requests related to counter-drug and counter-terrorism work.
The program has, of late, come under scrutiny because of photos of heavily-equipped police during the civil unrest in Ferguson during 2014. After persistent FOIA requests, particularly by MuckRock, the Pentagon released the records of what it has sent out and to which law enforcement agencies.
Here are some relevant news stories and projects:
- MuckRock • How we got the Pentagon to reveal what gear they gave cops
- The Pentagon Finally Details its Weapons-for-Cops Giveaway
- Muckrock FOIA: Program 1033 transfers nationwide 2000 to 2014
- Central Florida police acquire military surplus gear (clickorlando.com) - Just so that you're aware, not all helicopters are equal. You should read this excellent local story to see how this dataset has a lot of ambiguous details.
- Mapping the Spread of the Military’s Surplus Gear - The New York Times
- MRAPs And Bayonets: What We Know About The Pentagon's 1033 Program : NPR (npr.org)
- A reusable data processing workflow - a guide by NPR Visuals on how it processed the 1033 data
- In my public affairs data journalism class last year, I made the 1033 data the subject of the midterm
Note: In the stories (including my midterm) that happened before December 2014, the dataset used is a previous iteration in which the Pentagon refused to disclose which law enforcement agencies were involved. So that version of the dataset only includes details of the state and county of the receiving agency.
The dataset hosted in this repo has the name of the agency and the state.
The LESO 1033 Program data comes from the DLA Disposition Services eReading Room. (You can see a mirror of that page as captured on August 12, 2015 and rendered on GitHub Pages.)
The 1033 data is distributed as Excel spreadsheets, which contain a row for every line-item distribution: what kind of equipment, how many units, and to which agency. Included in the data is the original acquisition cost of each item, which can serve as a (very) rough estimate of the value of the surplus distributed.
Product Service Codes (PSC) are used by the Federal Procurement Data System to categorize the various products. Acquisition.gov has a PDF manual with way more detail than you need (I've stashed a copy here).
With PSC data, we can add some general categorization to the LESO data, which contains an NSN (short for National Stock Number) column and an Item Name column:
NSN | Item Name |
---|---|
1005-00-073-9421 | RIFLE,5.56 MILLIMETER |
8415-01-546-8809 | JACKET,COLD WEATHER |
1240-01-439-2730 | BINOCULAR |
8465-01-465-2057 | POUCH,RADIO,MOLLE |
8465-01-515-8615 | FIELD PACK |
1005-01-562-9455 | HOLDER,MULTIPLE MAGAZINE |
The first four digits of the NSN correspond to a PSC. The scripts/compile_data.py script matches those NSN digits to the corresponding PSC:
NSN | Item Name | PSC NAME |
---|---|---|
1005-00-073-9421 | RIFLE,5.56 MILLIMETER | GUNS, THROUGH 30MM |
8415-01-546-8809 | JACKET,COLD WEATHER | CLOTHING, SPECIAL PURPOSE |
1240-01-439-2730 | BINOCULAR | OPTICAL SIGHTING AND RANGING EQUIPMENT |
8465-01-465-2057 | POUCH,RADIO,MOLLE | INDIVIDUAL EQUIPMENT |
1005-01-562-9455 | HOLDER,MULTIPLE MAGAZINE | GUNS, THROUGH 30MM |
As you can see, you can't quite rely on PSC NAME alone, as it defines certain firearm accessories in the same categories as firearms themselves.
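The matching described above can be sketched as a simple prefix lookup. This is not the repo's actual implementation. The `PSC_NAMES` dictionary and the `psc_name_for` function are hypothetical names, and the code-to-name pairs are just the ones visible in the tables in this README rather than the full PSC spreadsheet.

```python
# Hypothetical excerpt of a PSC lookup table; every pair below is taken
# from the sample tables in this README, not from the full PSC spreadsheet.
PSC_NAMES = {
    "1005": "GUNS, THROUGH 30MM",
    "1240": "OPTICAL SIGHTING AND RANGING EQUIPMENT",
    "2355": "COMBAT, ASSAULT, AND TACTICAL VEHICLES, WHEELED",
    "8415": "CLOTHING, SPECIAL PURPOSE",
    "8465": "INDIVIDUAL EQUIPMENT",
}


def psc_name_for(nsn, lookup=PSC_NAMES):
    """Return the PSC name for an NSN, or None if its code is not in the lookup."""
    psc_code = nsn.strip()[:4]  # the first four digits of the NSN are the PSC
    return lookup.get(psc_code)


for nsn, item in [
    ("1005-00-073-9421", "RIFLE,5.56 MILLIMETER"),
    ("1005-01-562-9455", "HOLDER,MULTIPLE MAGAZINE"),
]:
    print(nsn, item, psc_name_for(nsn), sep=" | ")
```

Both items come back under the same PSC name, which is why Item Name still matters when you want to count actual firearms.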
The FPDS has a wiki which has a direct link to an Excel spreadsheet of PSC data:
http://www.fpdsng.com/downloads/psc_data_Oct012015.xls
The programming scripts in this repo are written in Python 3.4.x and use a few third-party libraries, most notably Requests and xlrd (for Excel work). Setting up your programming environment can be quite complicated, so I won't offer detailed directions on that. But one relatively easy way to start is to use the Anaconda distribution for Python 3.
If you clone this repo:
$ git clone https://github.com/datahoarder/leso_1033.git
# change into the directory:
$ cd leso_1033
Once the right dependencies are installed, you can run the two scripts needed to collect and clean the data:
$ python -m scripts.fetch.fetch_data
$ python -m scripts.compile.compile_data
Here's a brief description of what they do:
The fetch_data script simply downloads the data files and stores them in the stash/fetched directory.
The compile_data script uses the xlrd library to open each Excel file and extract the data. Each of the LESO Excel workbooks contains multiple sheets, one for each state, and so this script does the menial work of joining those together.
This script also contains the logic that matches up the product categories from the PSC spreadsheet with each record in the LESO 1033 spreadsheets.
When the script is finished reconciling and cleaning the data, it outputs a plain-text CSV: stash/compiled/leso-1033.csv
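To illustrate the joining step, here is a rough standard-library sketch that uses plain Python lists to stand in for the per-state sheets. The real script reads the sheets with xlrd, and this is not its code. The `combine_sheets` and `write_csv` names, the trimmed-down header, and the "XX" sheet are all hypothetical.

```python
import csv
import io

# Trimmed-down, illustrative header; the real sheets have more columns.
HEADER = ["State", "Station Name (LEA)", "Item Name", "Quantity"]


def combine_sheets(sheets):
    """Concatenate per-state sheets that share a header row into one list of rows."""
    combined = []
    header = None
    for sheet_name, rows in sheets.items():
        if not rows:
            continue  # skip empty sheets
        sheet_header, *data_rows = rows
        if header is None:
            header = sheet_header
            combined.append(header)  # keep the header only once
        elif sheet_header != header:
            raise ValueError(f"unexpected header in sheet {sheet_name!r}")
        combined.extend(data_rows)
    return combined


def write_csv(rows, out_file):
    """Write the combined rows as comma-delimited text."""
    csv.writer(out_file, lineterminator="\n").writerows(rows)


# One row from the sample above plus one hypothetical row.
sheets = {
    "TN": [HEADER, ["TN", "SMITH COUNTY SHERIFF DEPT", "MINE RESISTANT VEHICLE", 1]],
    "XX": [HEADER, ["XX", "HYPOTHETICAL POLICE DEPT", "FIELD PACK", 10]],
}
combined = combine_sheets(sheets)
buffer = io.StringIO()
write_csv(combined, buffer)
print(buffer.getvalue())
```

Raising an error on a mismatched header is a design choice for this sketch: it makes a changed sheet layout fail loudly instead of silently misaligning columns.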
Since government sites can change, I've stashed mirrors for the relevant DoD and FPDS landing pages with links to copies of the data files, just in case you want to run my fetching scripts and get the same results:
The wget command that I used to do the mirrors:
wget --adjust-extension \
--no-directories \
--no-host-directories \
--recursive --level=1 \
--execute robots=off \
--convert-links --backup-converted \
--timestamping --page-requisites \
--directory-prefix=dispositionservices.dla.mil-1033 \
--user-agent="Mac OS X" \
http://www.dispositionservices.dla.mil/EFOIA-Privacy/Pages/ereadingroom.aspx
wget --adjust-extension -HD www.fpdsng.com \
--no-directories \
--no-host-directories \
--recursive --level=1 \
--execute robots=off \
--convert-links --backup-converted \
--timestamping --page-requisites \
--directory-prefix=fpds.gov-PSC \
--user-agent="Mac OS X" \
https://www.fpds.gov/wiki/index.php/PSC,_NAICS_and_more