The collation part of the intelligence cycle (if, indeed, it is a cycle) is the least appreciated and least ‘sexy’ part of the whole thing.
Defined, collation (sometimes also called ‘processing’ or considered a subset of that word) is:
Once the collection plan is executed and information arrives, it is processed for exploitation. This involves the translation of raw intelligence materials from a foreign language, evaluation of relevance and reliability, and collation of the raw intelligence in preparation for exploitation. 1
In short, it’s converting the varied inputs of intelligence (interviews, bank records, photographs, etc.) into forms of data that allow the information to be analyzed. Generally, it sucks: it means converting or coding one set of data into, for example, spreadsheets or databases. It’s also very time-consuming and can be fraught with errors.
It may interest you to know that in the law enforcement and homeland security fields there is a real dearth of standardized datasets from which to draw to assist with analysis. Want to know if the use of explosive devices by extremists is on the increase or decline? Well, it depends on whose data you use and none of it is great or universally accepted.
With regard to terrorism studies, START’s Global Terrorism Database is among the best, but it has problems as well. These aren’t really the fault of the START folks (they’re very nice and extremely helpful) but stem, in part, from problems with collation. A few examples…
Let’s say you want to capture terrorist plots and attacks in the United States. What would you count? When does a plot go from some knucklehead blowing off steam and talking nonsense to a real threat? Should we even make that distinction? Is it enough to say ‘I want to attack the United States’ or would the individual need to identify a target?
Many researchers try to sidestep this issue by trying to use the criminal justice system to do their work for them. It’s easy to count people who have been charged under terrorism statutes or convicted of terrorist crimes.
That system leaves a lot to be desired. What happens if charges are dropped or overturned? Many times, a terrorism arrest is announced to much fanfare only to have most of the substantive charges dropped for one reason or another. Then you overcount the plots and attacks. What if, for tactical prosecutorial reasons, the suspect is charged with something other than terrorism offenses? Then you risk undercounting plots and attacks.
This is a real issue and I’d point you to this group that was arrested in Fort Stewart, Georgia a couple of years ago as evidence of that. A number of current and former soldiers with plans to attack an Army base, assassinate the President and with two (and possibly more) homicides and other crimes under their belt. Never heard of this terrorist threat? I’m not surprised as they weren’t charged under a terrorist statute. So, they fall into the ‘general crime’ category. Clearly, (assuming the initial reports were correct) these are folks you’d want included in any analysis of the terrorist threat.
Unfortunately, this is too big a job for state and local agencies to tackle, and it’s not clear if a federal agency (the FBI or DHS probably) has the interest or will to do this. Further, if you want widespread acceptance, such a task is probably best undertaken with broad buy-in, transparency and peer review.
These sorts of things had been bumping around in my mind for a while when I saw this story from Slate. After the Newtown shooting, they started documenting all the firearms deaths in the United States. Now, one year in, they’ve got more than 10,000 reports of gun deaths and, in order to analyze them, have to… (wait for it) …collate all that data.
Rather than trying to do it in-house or hire someone to do it (or foist it off on some poor interns) they decided to try crowd-sourcing it.
Here’s how it works. They provide you with an article about a gun death and the name of a victim. You read it and then select whether the death fits within one of the following categories:
- shot by law enforcement
- shot by civilian in self defence
- link not working
The nice thing about it is, apparently, it focuses on the victim rather than just the article. So, I suppose that with an article where two people died (let’s say a murder/suicide), one person may get asked about the suicide and another about the murder. Also, they don’t just rely on one person’s response 2 but rather on a consensus of some number of respondents. That’s a nice check on work but, as we all know, the majority isn’t always right.
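To make the consensus idea concrete, here’s a minimal sketch of how such a check might work. To be clear, this is my own illustration and not Slate’s actual method: the function name `consensus_label` and the thresholds `min_votes` and `min_share` are assumptions I’ve made up for the example.

```python
from collections import Counter


def consensus_label(responses, min_votes=3, min_share=0.6):
    """Return the agreed category for one victim, or None if unclear.

    min_votes and min_share are hypothetical thresholds chosen for
    illustration; the source does not say what rule is actually used.
    """
    # Too few respondents to call it a consensus at all.
    if len(responses) < min_votes:
        return None

    # Find the most common answer and how many people gave it.
    label, count = Counter(responses).most_common(1)[0]

    # Accept only if a large enough share of respondents agree.
    if count / len(responses) >= min_share:
        return label
    return None


votes = [
    "shot by law enforcement",
    "shot by law enforcement",
    "shot by law enforcement",
    "link not working",
]
print(consensus_label(votes))
```

Anything that comes back as `None` would still need a human to look at it, which is exactly where the ‘majority isn’t always right’ worry comes in: a rule like this only tells you people agree, not that they’re correct.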
Now, they did recently identify a significant flaw in their methodology. While their data collection was exclusively based on open source reporting, it quickly became clear that suicide by gun is under-reported. On those occasions when suicides are reported (not very often) it’s almost unheard of for the report to describe the method of death. Still, as in most things, admitting you’ve got a problem is the first step in fixing it.
Another example comes from the British Library.
In 2008, the British Library, in partnership with Microsoft, embarked on a project to digitize thousands of out-of-copyright books from the 17th, 18th, and 19th centuries. Included within those books were maps, diagrams, illustrations, photographs, and more. The Library has uploaded more than a million of them onto Flickr and released them into the public domain. It’s now asking for help.
Next year, it plans to launch a crowdsourced application to fill the gap, to enable humans to describe the images. This information will then be used to train an automated classifier that will be run against the entire corpus.
The library is also soliciting ideas for how to present the collection to aid the tagging and metadata generation, and also make the pictures easier to navigate.
I wonder if such a thing couldn’t be done with terrorism? Certainly there are dozens (probably hundreds) of agencies with some stake in counterterrorism in the United States, and there are far fewer terrorism incidents than gun crimes, so the task shouldn’t be too onerous. You could mandate participation at the federal level or (perhaps even better) make it a condition of receiving federal grants. You could work out some formula based upon the number of personnel in an agency and the scope of its counterterrorism mandate to determine how many records it should collate. Each record would be collated by multiple people and either accepted (in cases of significant consensus) or reviewed by some panel when it’s not clear what the community view is.
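For what it’s worth, the ‘formula’ part needn’t be complicated. Here’s a rough sketch of one way to split a pile of records among agencies in proportion to personnel times a mandate score. Everything in it is an assumption on my part: the function name `allocate_records`, the agency names, the idea of a numeric ‘scope’ score, and the choice to hand leftover records to the agencies with the largest fractional share.

```python
def allocate_records(agencies, total_records):
    """Split total_records among agencies by personnel * scope.

    agencies maps a name to (personnel, scope), where scope is a
    hypothetical numeric score for the breadth of the agency's
    counterterrorism mandate.
    """
    weights = {name: personnel * scope
               for name, (personnel, scope) in agencies.items()}
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one agency needs a positive weight")

    # Exact (fractional) share for each agency.
    exact = {name: total_records * w / total_weight
             for name, w in weights.items()}

    # Round down first, then hand out the leftover records one at a
    # time to the agencies with the largest fractional remainders.
    quotas = {name: int(share) for name, share in exact.items()}
    leftover = total_records - sum(quotas.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - quotas[n],
                          reverse=True)
    for name in by_remainder[:leftover]:
        quotas[name] += 1
    return quotas


# Made-up agencies: (personnel, scope score)
example = {
    "Agency A": (100, 1.0),
    "Agency B": (50, 2.0),
    "Agency C": (25, 1.0),
}
print(allocate_records(example, 90))
```

If each record is supposed to be collated by several people, you’d simply multiply the number of records by the number of coders you want per record before splitting up the workload.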