Big data grows up

October 19th, 2015

This article was authored by Stefan Hammond, and was originally posted on

Big data is often associated with big retailers. Build a customer database, then push out info (and special deals) that appeals to specific, pre-determined tastes.

The retail angle has become an overused paradigm for big data usage. If you buy a blue sweater every month from a single retail outlet, expect online ads of dreary predictability.

I’m not a good candidate, as my online shopping consists of CDs by obscure bands and Blu-ray discs of Japanese films from the 60s. I’m not just in the “long tail” but in obscure sections of the tail. The product suggestions on the few e-commerce sites I visit are laughable.

I’m not an ideal target for accurate crunching of massive data. A recent story in the LA Times, however, outlined a useful and interesting use of big data.

Be all that data can be
Join the all-volunteer US Army and its attitude is simple: “All your data are belong to us.” It makes sense: a functioning armed force needs a large amount of granular data on the soldiers it will train and possibly deploy.

Now the Army is using its dataset for a survey that aims to identify the soldiers most likely to commit violent crimes. The methodology warrants a closer look.

According to the Times, researchers studied the military records of all 975,057 soldiers who served during a six-year period and developed an algorithm intended to identify those at greatest risk of perpetrating severe, violent crimes.

The researchers drew on 38 databases that codified 446 variables for each soldier who served between 2004 and 2009, said the Times. During that period, 5,771 soldiers committed murder, manslaughter, kidnapping, robbery or other violent crimes.

Pattern recognition
Researchers examined patterns among the violent offenders and used the data to create a risk model that took account of their demographic characteristics, health histories, career details, and other factors predating their crimes.

For men (the vast majority of both soldiers and offenders), 24 factors were identified. Those most at risk were young, poor, low-ranking members of ethnic minorities who had a history of disciplinary trouble, a suicide attempt, or a recent demotion, according to a report published Tuesday in the journal Psychological Medicine.

The highest-risk group – just 5% of the total population of male soldiers – accounted for 36% of the crimes perpetrated by men, and each year (on average), 15 of every 1,000 of those men committed a violent offense, said the Times. To test their model, researchers applied it to a sample of 43,248 soldiers who served between 2011 and 2013 – they found that the 5% identified as most at-risk were responsible for 51% of the violent crimes committed by those soldiers.
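To put those percentages in perspective, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above. The function name `concentration_ratio` is a label made up for this illustration, not a term from the study, and the calculation is simple arithmetic rather than the researchers' actual risk model.

```python
def concentration_ratio(crime_share: float, population_share: float) -> float:
    """Return how many times over-represented a group is among offenders.

    Hypothetical helper for illustration: share of crimes divided by
    share of population. A value of 1.0 means no over-representation.
    """
    if not 0 < population_share <= 1:
        raise ValueError("population_share must be in (0, 1]")
    return crime_share / population_share


# Figures as reported in the article (male soldiers only).
development = concentration_ratio(crime_share=0.36, population_share=0.05)
validation = concentration_ratio(crime_share=0.51, population_share=0.05)

# 15 of every 1,000 men in the highest-risk group offended per year, on average.
annual_rate = 15 / 1000

print(f"2004-2009 records: {development:.1f}x over-represented")
print(f"2011-2013 sample:  {validation:.1f}x over-represented")
print(f"Highest-risk men with no violent offense in a given year: {1 - annual_rate:.1%}")
```

By this arithmetic the highest-risk 5% were roughly seven times over-represented among offenders in the original records and about ten times in the later sample, yet about 98.5% of them committed no violent offense in any given year.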

Effectiveness questioned
There’s debate about how to redirect people prone to violence, said the Times article, pointing out that even in the highest-risk group, most people do not become offenders.

An intensive violence-prevention program “would make sense only if the interventions are shown to be highly efficient — something that has not yet been demonstrated,” said study co-author John Monahan, a law professor at the University of Virginia.

Well, OK. But as a whole, this is the best use-case for big data I’ve seen to date. It identifies an at-risk group in a proactive manner, without the need for individual evaluation (which is expensive, time-consuming, and otherwise problematic).

I don’t agree with Monahan that an intensive violence-prevention program “would make sense only if the interventions are shown to be highly efficient.” I’m not a law professor, but I believe that if someone who is having difficulties is shown that an organization cares, they have a chance to talk to someone who can help and, if necessary, turn things around.

In short, this is where big data meets the uncertainties of human behavior and provides a way out for at least some of those the data identifies as being at-risk. And that’s a good thing.

“The military has extraordinary data systems,” said Ronald Kessler, a Harvard sociologist and co-author. “There is an ability to do this targeting in a way that can’t be done anywhere else.”
