How A Horrific Murder Exposes The Great Failure Of Facebook’s AI Moderation

When 21-year-old Brandon Andrew Clark posted a series of graphic images last week of the slain corpse of 17-year-old Bianca Devins to Instagram and Discord, users immediately began spreading the gory pictures online, often alongside brutal, misogynist commentary.

Some said the victim, an ‘e-girl’ who was popular on 4Chan, deserved it, and others called for even more violence against women. Clark, who appears to have live-posted the murder itself on Instagram (a series of posts reportedly showed the body, the road near the crime scene, and an act of bloody self-harm), took the time to change his bio to hint at his forthcoming suicide attempt, and to attempt to craft a multi-platform narrative around the killing as it unfolded.

Police on Monday charged Clark with Devins’ murder.

Like the deeply socially mediated Christchurch, New Zealand, shootings in March, the act demonstrates both the suspected killer’s savvy for web platform mechanics and just how rapidly extreme content spreads.

It also demonstrates how little has changed since these acts of real-time violence have grown prominent, and, yet again, how badly platforms are failing to keep content like this off their farms: The Instagram post of Devins’ body was left on a platform shared by 1 billion monthly active users, for what was reportedly most of Sunday, when the content was posted. At one point, Instagram placed the image behind a filter screen that merely cautioned against graphic content before finally removing the post and the account altogether.

Tech executives have long said they’re deploying cutting-edge AI and automated content moderation systems to keep this from happening. It’s been nearly a year and a half since Mark Zuckerberg, the CEO of Facebook, which owns Instagram, said advanced AI tools would soon make occurrences like this a thing of the past.

“Over the long term, building AI tools is going to be the scalable way to identify and root out most of this harmful content,” Zuckerberg said in a congressional hearing in April 2018.

“The combination of building AI and hiring what is going to be tens of thousands of people to work on these problems, I think we’ll see us make very meaningful progress going forward,” he said on an earnings call the same month. “These are not unsolvable problems.”

And yet, here we are in July of 2019, and Instagram users exposed to graphic content are forced to take matters into their own hands, trying to drown out a flood of repulsive and opportunistic murder porn posts by co-opting the hashtag and mass-posting pics of innocuous pink clouds. It still wasn’t enough to stymie the spread of incel-aggrandising murder pics.

Content moderation is a deeply complex and incomprehensibly difficult undertaking on platforms with billions of users. But not all moderation tasks are created equal. AI, for instance, is much better at flagging instances of nudity and gore than it is at picking up hate speech. Facebook said as much when, under fire for allowing the spread of disinformation, it explained at its F8 conference last year how its AI tools would help it fight extreme content.

“The bottom line is that automated AI tools help mainly in seven areas: nudity, graphic violence, terrorist content, hate speech, spam, fake accounts and suicide prevention,” CNET reported at the time. “For things like nudity and graphic violence, problematic posts are detected by technology called ‘computer vision,’ software that’s trained to flag the content because of certain elements in the image. Sometimes that graphic content is taken down, and sometimes it’s put behind a warning screen.”

So the question is, why, more than a year after Zuckerberg touted AI moderation tech, did Instagram, and its parent company Facebook, reportedly take most of a day to remove the Devins post? A post that has terrorised, traumatised and enraged the victim’s family? A post that could not more obviously violate Facebook and Instagram’s community guidelines?

On Facebook, the fact that posts depicting ‘Violence and Incitement’ will be banned is the subject of the very first part of the very first section of its guidelines, a lengthy, 22-point document. Instagram’s community guidelines similarly state, “Sharing graphic images for sadistic pleasure or to glorify violence is never allowed.”

Why wasn’t Zuck’s oft-discussed automation-led system up to the task of flagging this very obviously graphic, obscene, and hateful post before it reached any user’s screen? Why do automated content moderation systems, which have for years been heralded by the platforms as the chief weapon in our arsenal against offensive, extreme content, continue to fail their remit?

“When it comes to sharing violent images or gore, most of the major platforms already have rules prohibiting this kind of content, so yes, the issue is not necessarily ‘having a policy’ but how that policy is implemented,” says Robyn Caplan, an affiliate researcher at the Data & Society Research Institute who studies platforms and content moderation. “The major platforms are increasingly using automation for videos and photographs. They use hashing to create a unique identifier for the offending photo, which can then be compared against other identical photos that have been uploaded.”

This is one reason why there’s been some success in keeping extreme content like ISIS-related pro-terrorism posts and child pornography off the big platforms: the tech companies operate a shared database of flagged terrorist content, and can in many cases tag and remove hateful content before anyone sees it. (Though nonprofits like the Counter Extremism Project say it’s not as many cases as they’d like you to think.)
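To make the hashing idea concrete, here is a minimal sketch in Python. It is not PhotoDNA or any platform's actual system. The function names, the tiny pixel grids, and the distance threshold are all assumptions chosen for illustration. It contrasts an exact hash, which only matches byte-identical files, with a simple "average hash" that still matches when an image has been slightly altered before re-upload.

```python
import hashlib


def exact_hash(image_bytes: bytes) -> str:
    """Cryptographic fingerprint: matches only byte-for-byte identical files."""
    return hashlib.sha256(image_bytes).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Toy perceptual hash over a small grayscale grid (values 0-255).

    Each pixel contributes one bit: 1 if brighter than the grid's mean.
    Real systems first shrink the image to a fixed size; that step is
    assumed to have happened already.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_blocked(pixels: list[list[int]], blocklist: set[int],
               max_distance: int = 1) -> bool:
    """True if the upload is within max_distance bits of any flagged hash.

    max_distance is an illustrative threshold, not a recommended value.
    """
    h = average_hash(pixels)
    return any(hamming(h, known) <= max_distance for known in blocklist)


# Hypothetical data: one flagged image, a brightened copy, an unrelated image.
flagged = [[10, 200], [220, 30]]
brightened_copy = [[20, 210], [230, 40]]
unrelated = [[200, 10], [30, 220]]

blocklist = {average_hash(flagged)}
```

In this sketch the brightened copy produces the same perceptual hash as the flagged original and would be caught, while its SHA-256 digest would differ. That gap is one reason re-uploads with small edits can slip past naive exact matching.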

But then there’s the mounting list of seemingly obvious failures: The terrorist videos Theresa May excoriated Facebook for in the aftermath of the 2017 London Bridge attack. The Christchurch shooting videos, which were allowed onto Facebook 20 per cent of the time — making for hundreds of thousands of posts — when users shared them on the platform, according to the company. And now we have the Bianca Devins posts, which were repeatedly reported by users, and yet stayed live on the platform for hours.

Facebook is supposedly operating an advanced computer vision system that auto-flags offensive posts of graphic violence, and a database where offensive images receive hashes and can be taken down automatically when uploaded.

An openly gratuitous, gruesome post depicting fatal violence against a teenager posted by a man insinuating that he was the one who killed her should rank among the easiest type of content to flag by an algorithm trained to do so. And yet these posts persisted. Why?

It could be that the technology Facebook is using is simply not good enough.

“The tech we already have can do a vastly better job, but there is just no incentive for companies to deploy it for content moderation,” Kalev Leetaru tells me in an email conversation. Leetaru is Senior Fellow at Auburn University and runs GDELT, a “global open monitoring project” with support from Google Jigsaw.

“While today’s deep image classification algorithms are far from perfect and can’t catch everything, they are quite good at flagging a wide swath of common violent imagery in all its forms, including imagery depicting weapons being used or visible blood or persons exhibiting extreme visual anguish,” he says. “The technology is there to catch a great deal of the violent imagery that proliferates. The reason platforms are reluctant to deploy it comes down to several factors.”

Those factors, he argues, are context, cost, and profit.

Context is clear enough: as in the infamous case of the ‘Terror of War’ photo, which depicts a naked and napalm-scarred girl in anguish and which Facebook erroneously censored to ample criticism, the autonomous models can flag a post as offensive and the moderators will still have to untangle whether they did so correctly. AI is no substitute for human judgment.

Then there’s the cost of running a sufficiently advanced system. “High-quality models that have been trained on diverse imagery are computationally expensive to run,” he says, and “unlike copyright infringement in which the platforms are forced legally to spend what they need to to catch illegal uploads, there are no legal requirements in most countries to combat violent imagery, so there is no incentive for them to do so.”

Finally, Leetaru notes the profit motive. Extreme posts generate a lot of clicks, shares, and commentary, and contrary to common sense, perhaps, “[t]errorism, hate speech, human trafficking, sexual assault and other horrific imagery actually benefits the sites monetarily,” he says.

And in a commercial platform that has prioritised growth above all else, all discretion over content ultimately yields to matters of profitability.

“While meaning and intent of user-generated content may often be imagined to be the most important factors by which content is evaluated for a site,” Sarah T. Roberts, an assistant professor in the Department of Information Studies at the University of California, Los Angeles wrote in a recent paper, “its value to the platform as a potentially revenue-generating commodity is actually the key criterion and the one to which all moderation decisions are ultimately reduced.”

This is also a reason that platforms’ moderation algorithms more aggressively target posts linked to ISIS terrorism than they do, say, white nationalism — if Arabic language speakers who don’t violate any rules get swept up in the dragnet, that’s less of a risk to their bottom line than if high-profile conservatives do.

If it’s expensive and resource-intensive to run good, autonomous content moderation systems, and doing so will deprive Facebook and Instagram of engagement, then it’s not hard to see why the platforms would continue to drag their feet in upgrading the tech and attendant policy. After all, despite more than a year straight of nearly nonstop scandal and public policy failures, Facebook’s stock continues to climb. (“Facebook … has been embroiled in privacy scandals, Russian election interference backlash, and more for well over a year now,” Yahoo! Finance noted in May, rating the stock as a ‘buy’. “Despite all of the negativity, it seems that the average Facebook user doesn’t seem to care.”)

After Zuckerberg’s umpteenth mea culpa publicity tour, we could maybe be forgiven for thinking that Facebook has civic duty in mind as it fumbles through apology after apology, but lax, ultimately poisonous moderation policies have not proven to be particularly injurious to the company’s bottom line. And that, again, is what matters to these platforms at the end of the day.

When I emailed Instagram about why, despite their automated content moderation system, it took so long to remove the offensive post by Clark, who went by @yesjuliet on the platform, the company sent me this statement, attributable to a Facebook spokesperson: “Our thoughts go out to those affected by this tragic event. We are taking every measure to remove this content from our platforms.”

Instagram also sent me an outline of other points regarding the case, which the spokesperson said was on background — a condition to which I did not agree in advance. The details are worth sharing unfiltered, as they illustrate what the company means by “taking every measure” and how it characterises its use of its much-touted AI moderation technology in a real-world scenario:

– As with other major news stories, we’re seeing content on our site related to this tragic event and we’re removing content that violates our policies, such as support for the crime.

– Once this tragic event was surfaced to us on Sunday — we removed the content in question from @yesjuliet’s Instagram Stories, and our teams across the company began monitoring for further information and developments in real time to understand the situation and what else could manifest on Instagram.

– While we’re unable to share the time it took to remove the post, it did not take 24 hours. This is inaccurate.

– Our policy and operations teams, as well as our teams who communicate with law enforcement, began coordinating to ensure we had as much information possible about the event so that we could determine whether content on our site violated our policies.

– Then on Monday, when the crime and the suspect’s identity was confirmed, we immediately removed his accounts on both Instagram and Facebook.

– Additionally, our teams also knew to expect that once the suspect was named, that people may try to create accounts impersonating him so they immediately starting proactively looking for those and removing them. They’ve been using a combination of technology, as well as reports from our community, to take these accounts down.

– They are also reviewing hashtags and accounts claiming to share this content and taking action in line with our policies, for example, we blocked the hashtag #yesjuliet, #yesjulietpicture,#checkyesjuliet ,#yesjulietvideo for attempting to spread the image.

– Finally — to stop the content from spreading, we are using technology that allows us to proactively find other attempts to upload the image in question, and automatically remove them before anyone sees them. We have also put this measure in place for images shared on other sites, to ensure these images aren’t also posted on Instagram.

We’re currently in touch with law enforcement.

Since that didn’t answer the question of why the automated system didn’t detect and remove an image obviously in violation of its policies for many hours — if it was not a full day, reports say it was close — I followed up. The Instagram spokesperson confirmed the platform does “have artificial intelligence in place to find violating content like this,” but she did not explain the lag.

“Our goal is to take action as soon as possible,” the spokesperson said, “there is always room for improvement. We don’t want people seeing content that violates our policies.”

Maybe the most infuriating possibility is that Leetaru and other critics are right, and Facebook and Instagram are simply delaying or failing to make use of the technology they’ve paid so much lip service to, because it risks impeding the rapid proliferation of content.

“Facebook/Instagram have been particularly bad about deploying deep learning to combat violent imagery,” Leetaru says. Despite having “top notch” AI research staff, they continue to lag behind their peers. “It’s unclear why.” He notes that Facebook says they failed to better deal with the New Zealand shooting video because they didn’t have enough training examples.

“But in reality one would never attempt to build an all-in-one classifier for those kinds of videos because there simply are not enough videos to generate robust training sets.” Instead, he says, one would build models to look for instances of blood, weapons, and so forth, to form a composite — a pretty obvious distinction, in his eyes.
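The composite approach Leetaru describes can be sketched roughly as follows. This is an illustration only, not Facebook's pipeline or Leetaru's own implementation. The signal names echo the examples he gives (blood, weapons, visible anguish), while the scores, the noisy-OR combination rule, the thresholds, and the action labels are all assumptions. The per-signal scores are taken to come from separate, narrowly trained detectors, which are not shown.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Scores in [0, 1] from hypothetical narrow detectors (not shown)."""
    blood: float
    weapon: float
    anguish: float


def composite_score(s: Signals) -> float:
    """Noisy-OR: chance that at least one signal is a true hit.

    Treats the detectors as independent, which is a simplifying assumption.
    """
    miss = 1.0
    for p in (s.blood, s.weapon, s.anguish):
        miss *= 1.0 - p
    return 1.0 - miss


def triage(s: Signals, review_at: float = 0.5, block_at: float = 0.9) -> str:
    """Map the composite score to an action; thresholds are illustrative."""
    score = composite_score(s)
    if score >= block_at:
        return "block_pending_review"
    if score >= review_at:
        return "human_review"
    return "allow"


# Hypothetical detector outputs for three uploads.
graphic = Signals(blood=0.9, weapon=0.8, anguish=0.0)
ambiguous = Signals(blood=0.5, weapon=0.0, anguish=0.0)
benign = Signals(blood=0.1, weapon=0.1, anguish=0.1)
```

The point of the design is that each narrow detector can be trained on plentiful everyday examples, so no single model needs a large set of attack videos. The middle tier still routes uncertain cases to human moderators, since context remains a judgment call.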

This has been a recurring complaint with Facebook, in fact, that it has failed to show interest in adopting better auto-moderating technology. After the London Bridge attacks two years ago, Dr. Hany Farid, the chair of the computer science department at Dartmouth and one of the minds behind PhotoDNA, which helps platforms ID and ban child pornography, told On the Media that he’d tried to help Facebook improve its capacity for detecting terrorist activity. He said he offered them access to his eGLYPH system and was turned away.

“I have to say it’s really frustrating because every time we see horrific things on Facebook or on YouTube or on Twitter we get the standard press release from the companies saying, ‘We take online safety very seriously. There is no room on our networks for this type of material,’” Farid said, according to CNBC. “And yet the companies continue to drag their feet. They continue to ignore technology that could be used and doesn’t affect their business model in any significant way.”

Robyn Caplan of Data & Society is also at a loss for why, at least in the case of the Devins murder photos, the autonomous systems failed. “This is the same technology they use for copyright and the terrorist database, which I think is why people are so confused as to why it’s not working more effectively here,” she tells me.

She adds that the context concerns of Christchurch, where Facebook backed off on heavier moderation because it didn’t want to ban the news clips that were edited into some of the pro-terrorism posts, wouldn’t really be present in the Devins incident. “This could be a case where brigades, botnets, motivated groups of individuals are just uploading faster than platforms can handle it,” Caplan tells me. “In that sense, both increasing teams of moderation and better hashing could help.”

Which, once again, comes down to a matter of resources. And Facebook may simply be uninterested or unwilling to dedicate the resources necessary to improve its systems. After all, it’s had little incentive to do so.

“Other than a few high-publicity cases of advertiser backlash against particularly high profile cases,” Leetaru tells me, “advertisers aren’t forcing the companies to do better, and governments aren’t putting any pressure on them, so they have little incentive to do better.”

This is obviously a problem. As we saw in Christchurch, and now, again, closer to home with the killing of Devins, we’re watching long-simmering incubators of hate spill over into the real world. It used to be hyperbole to say the walls between online extremism and reality were breaking down and leading to violence, but just look at the performative nature of these last two attention-hounded killings. It’s happening now, and it’s going to happen again.

“People become fluent in the culture of online extremism, they make and consume edgy memes, they cluster and harden. And once in a while, one of them erupts,” Kevin Roose wrote in the New York Times after the Christchurch shooting.

“We need to understand and address the poisonous pipeline of extremism that has emerged over the past several years, whose ultimate effects are impossible to quantify but clearly far too big to ignore. It’s not going away.” Indeed, as Miles Klee points out, users are moving through that pipeline faster than ever, blurring the line between gaming murderous behaviour and just plain murder.

The automated moderation systems that might stop content that promulgates a culture of hate — content plausibly capable of inciting copycat violence — are failing. They’re failing because, yes, it’s a complex and difficult problem, and many communities on Facebook and beyond are ruthlessly intent on promoting toxic content. But they’re also failing because tech companies are apparently unwilling to put their money where their feeds are, and to deploy robust systems able to block the rot as quickly as possible.

It’s as simple as that, sadly: Profit and expansion have been prioritised over tools potent enough to stop the posts and a moderating program sufficient to contextualise them.

The tragic posts of Bianca Devins should be an ideal use case for computer vision and deep learning moderation AI: disturbing images featuring a victim of a deranged murderer should be a prime target for a program capable of flagging blood, wounds, and death.

If the world’s largest platform, run by one of the wealthiest companies on the planet, can’t block the most outwardly extreme content, after dozens of promises that AI will enable effective moderation from the platform’s CEO, then it has failed its users and the public at large.

It’s time to look Facebook and Instagram’s failure with AI moderation systems in the eye, and demand some legitimate, buzzword-free answers as to why.

Lifeline, Australia’s 24-hour crisis support and suicide prevention phone line service is available at 13 11 14 for those that need it.
