UK police are using AI to inform custodial decisions – but it could be discriminating against the poor

Durham Constabulary, which has been testing its HART algorithm since 2017, recently made changes to avoid reinforcing human biases against people living in certain areas

An algorithm designed to help UK police make custody decisions has been altered amid concerns that it could discriminate against people from poorer areas. A review of its operation also found large discrepancies between human predictions and those made by the system.

For the last five years, Durham Constabulary and computer science academics have been developing the Harm Assessment Risk Tool (HART). The artificial intelligence system is designed to predict whether suspects are at a low, moderate or high risk of committing further crimes within a two-year period.

The algorithm is one of the first to be used by police forces in the UK. It does not decide whether suspects should be kept in custody but is intended to help police officers decide whether a person should be referred to a rehabilitation programme called Checkpoint. The scheme is designed to intervene in proceedings rather than push people through the UK's court system.

HART uses data from 34 different categories – covering a person's age, gender and offending history – to rate people as a low, moderate or high risk. Within these data categories is postcode information. The police force is now removing the primary postcode field, which includes the first four digits of Durham postcodes, from the AI system. "HART is currently being refreshed with more recent data, and with an aim of removing one of the two postcode predictors," reads a draft academic paper reviewing the use of the algorithm, published in September 2017. The paper was co-authored by one member of the police force.
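For illustration only, here is a minimal sketch in R – the language the force says HART is built in – of what dropping a single predictor from a feature set involves. The column names and values are invented for this example and are not drawn from the real custody data:

```r
# Invented stand-in for custody records; column names are hypothetical
custody <- data.frame(
  age             = c(24, 31, 45),
  offending_hist  = c(5, 1, 0),
  postcode_prefix = c("DH1 3", "DH7 8", "DL4 1"),  # finer-grained "primary" postcode field
  postcode_area   = c("DH", "DH", "DL")            # coarser, second postcode predictor
)

# Refreshing the predictor set with the primary postcode field removed
predictors <- custody[, setdiff(names(custody), "postcode_prefix")]
```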

"I have a concern about the primary postcode predictor being in there," says Andrew Wooff, a criminology lecturer at Edinburgh Napier University, who specialises in the criminal justice system. Wooff adds that including location and socio-demographic data can reinforce existing biases in policing decisions and the judicial system. "You could see a situation where you are amplifying existing patterns of offending, if the police are responding to forecasts of high risk postcode areas."

The academic paper, the first review of HART to be published, states that postcode data could be related to "community deprivation". A person's "residence may be a relevant factor because of the aims of the intervention," the paper continues. If the postcode data is relied upon to build future models of reoffending, it could draw more police attention to those neighbourhoods. "It is these predictors that are used to build the model – as opposed to the model itself – that are of central concern," the paper states.

The paper also highlights a "clear difference of opinion between human and algorithmic forecasts". During initial trials of the algorithm, members of the police force were asked to mimic its outcomes by predicting whether a person would be at a low, moderate or high risk of reoffending. Almost two-thirds of the time (63.5 per cent), police officers ranked offenders in the moderate category. "The model and officers agree only 56.2 per cent of the time," the paper explains.

WIRED contacted Durham Constabulary with questions about its alterations to the algorithm but had not received a response at the time of publication.

Inside HART

"You are being invited to take part in a research study," a script read by police officers in Durham says. The study may "change your life forever", officers are told to say, and the person charged with an offence will not receive a criminal conviction if they complete the study.

The Checkpoint programme is an experiment being run by Durham Constabulary and Cambridge University. Its aim is to reduce reoffending by addressing the reasons a person has committed a crime – drug or alcohol abuse, homelessness and mental health problems are listed as areas where help can be provided.

And it's Checkpoint that the HART algorithm feeds into. People who are classed as having a "moderate" chance of committing another crime can be offered inclusion in the programme. If they're judged to be a high or low risk, they cannot be included.

"People's lives are already being affected by the status quo," says Jennifer Doleac a professor of public policy and economics at the University of Virginia. "But is there a better way to do this that could lead to more just outcomes and get us closer to social goals than other practices?" Checkpoint was given an award by the charity the Howard League for Penal Reform which commended it for trying to keep people out of the criminal justice system.

HART is a machine learning system, written in the R programming language, that makes its predictions using random forests – a technique that combines the votes of many individual decision trees.

Every decision HART comes to is based on historical data: it looks at previous information and predicts future outcomes. For the first model of HART, Durham Constabulary gave the system the details of 104,000 custody events from 2008 to 2012. From here it used 34 predictors – including the location data – to create a prediction about each person. Every conclusion reached by HART is based on 509 votes by the system (each vote is either low, moderate or high).
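To make that concrete, here is a minimal, purely illustrative sketch in R of a 509-tree random forest casting votes on an invented custody record. The predictor names, toy data and choice of the randomForest package are assumptions for illustration; the real model's 34 predictors and training details are not public:

```r
library(randomForest)

set.seed(1)
# Invented stand-in for historical custody events; not the real 34 predictors
history <- data.frame(
  age            = sample(18:65, 1000, replace = TRUE),
  prior_offences = rpois(1000, 2),
  intel_reports  = rpois(1000, 1),
  outcome        = factor(sample(c("low", "moderate", "high"), 1000, replace = TRUE))
)

# 509 trees, one per "vote"; each tree independently classifies a new case
rf <- randomForest(outcome ~ ., data = history, ntree = 509)

# Raw vote counts per class for a new custody event (the counts sum to 509)
new_case <- data.frame(age = 24, prior_offences = 5, intel_reports = 22)
predict(rf, new_case, type = "vote", norm.votes = FALSE)
```

In this setup the final low/moderate/high rating is simply the class that collects the most of the 509 votes, which is how the vote tallies reported for real cases below can be read.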

Research published by Sheena Urwin, Durham Constabulary's head of criminal justice and the lead on the project, shows HART working on real-world data. The earliest version of the algorithmic model predicted that a 24-year-old man with a criminal history of violence – police had 22 previous intelligence reports on him – would be at high risk of offending again (the model gave him 414 high votes, 87 moderate and 8 low). He was later arrested and convicted for murder.

UK police forces using AI

South Wales Police: The Welsh police force is using AI as part of its facial recognition system. It has been using its face-scanning technology in real time since 2017. Several arrests have been made after people were cross-referenced against a database of 500,000 custody images.

Kent Police: Since December 2012, officers in Kent have been using a system called PredPol to predict where crimes may occur. The system is trained on historical crime data and uses this to highlight areas where police officers may be needed.

Durham Constabulary: Durham police's use of HART focuses on custody decisions and is designed as a tool to support officers. The force began using it in 2017.

Policing by algorithm

While the use of artificial intelligence predictions in policing and law enforcement remains in its early stages, there are plenty of warning signs for policing bodies intent on developing algorithmic systems. A widely cited ProPublica investigation in 2016 revealed how the COMPAS software – created by Northpointe – was biased against black offenders.

Another study, from George Mason law professor Megan Stevenson, looked at the impact of algorithmic risk assessments in Kentucky and found no large benefits from the system. Analysing data from more than one million criminal cases, Stevenson concluded pretrial risk assessments "led to neither the dramatic efficiency gains predicted by risk assessment's champions, nor the increase in racial disparities predicted by its critics". The study also found that judges using the Kentucky system tended to revert back to their own opinions and methods the longer they used the risk assessments.

To prevent existing human biases – around race and social prejudice – from creeping into HART's use, Durham Constabulary has provided members of its staff with awareness sessions on unconscious bias. The police force also stresses that it has not included race among the predictors its algorithm uses and that the tool's output is an aid to help humans make decisions. ("I cannot give you the figures," Urwin told MPs in December 2017, "but they go against the algorithmic forecast because that is not the be-all and end-all; it is decision support.")

Edinburgh's Wooff says he worries that the "time-pressured, resource intensive" world of policing could see officers relying too much on a computer-generated decision. "I can imagine a situation where a police officer may rely more on the system than their own decision-making processes," he says, adding that a paper trail may be useful for officers who are making decisions. "Partly that might be so that you can justify a decision when something goes wrong."

A separate study looking at COMPAS' accuracy found that it makes the same decisions as untrained humans. "The COMPAS predictions were no more accurate than the predictions of humans responding to an online survey, who have little to no criminal justice experience," says Julia Dressel, the author of the study, who is now an engineer at Apple.

Dressel and Dartmouth College professor Hany Farid paid people on Amazon's Mechanical Turk to predict whether criminals would be likely to reoffend, and compared their answers with the results from COMPAS. Both the humans and the algorithm predicted reoffending with around 67 per cent accuracy. "We can't assume that because something was built using big data that it is going to be able to predict the future," Dressel says. "We need to hold them to really high standards, we need to test them, we need to have them prove they are as accurate and effective as they say they are."
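The accuracy figure in that comparison is simply the share of cases where a prediction matched whether the person actually went on to reoffend. A minimal sketch of that calculation in R, on invented toy data rather than the study's dataset:

```r
# Invented toy data: 1 = reoffended, 0 = did not
actual         <- c(1, 0, 1, 1, 0, 0, 1, 0, 1, 0)
algorithm_pred <- c(1, 0, 0, 1, 0, 1, 1, 0, 1, 0)
human_pred     <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0)

# Accuracy = proportion of predictions that matched the real outcome
mean(algorithm_pred == actual)  # 0.8 on this toy data
mean(human_pred == actual)      # 0.8 on this toy data
```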

Open to scrutiny

Durham's algorithm is a black box. It isn't possible for the system to fully explain how it makes decisions, which are based on more than 4.2 million points within the model. "Opacity seems difficult to avoid," concludes the September 2017 review of HART. At present the system only contains data held by the Durham force but in future it is possible extra information from local councils or the UK's Police National Database could be incorporated.

The police force has attempted to get around this lack of transparency by creating a framework for when algorithmic assessment tools should be used by police. Called Algo-care, it says algorithms should be lawful, accurate, challengeable, responsible and explainable.

"Really, accountability can never just be a checklist," says Dillon Reisman, a technical fellow at the AI Now Institute, which examines the social impact of AI on society. "It's good to see they have considered [Algo-care] but they [also] have to consider whether it is appropriate to use these algorithms in the first place."

The police force has refused to publicly reveal the underlying code of HART, arguing that it wouldn't be in the public interest and could undermine the system in its research stage. However, it says it would be willing to give the underlying system to a central organisation.

"Durham Constabulary would be prepared to reveal the HART algorithm and the associated personal data and custody event datasets to an algorithmic regulator," the police force said in a response to a query about publication of the data.

Reisman argues that more information will be needed. "Code is not sufficient for auditing algorithms," he says. "You need information about how people will act on the decisions that come out of algorithmic decisions."

But until this happens, the effectiveness of AI policing systems remains open to question. The September 2017 review of HART, which was co-authored by Durham's Urwin, raised questions about whether algorithmic predictions are "appropriate at all" and whether some data – such as race – should ever be included in policing systems.

"It is really hard to look at past behaviour and predict with a high accuracy what somebody will do in the coming two years," Farid, who co-authored the analysis of COMPAS, says. "If you cannot predict that accurately maybe we should just try to abandon trying to use that as a criteria and look for things that are actually easier to predict and find a balance for civil liberties and the safety of our societies."

Updated March 1, 2018: The headline on this article has been altered to clarify that HART can inform decisions, not make them.

This article was originally published by WIRED UK