Social Media and Content Moderation: Why ‘Dangerous Individuals and Organizations’ Policies...

Courtney Davis ’21

Courtney Davis ’21 is a Hackworth Fellow with the Markkula Center for Applied Ethics at Santa Clara University. Views are her own.

In early January 2021, Facebook de-platformed Donald Trump for violating its Dangerous Individuals & Organizations policy, which aims to “prevent and disrupt real-world harm” by disallowing organizations or individuals “that proclaim a violent mission [from having] a presence on Facebook.” Almost simultaneously, Twitter de-platformed Trump for violating its Glorification of Violence policy, which likewise aims to prevent users from glorifying acts of violence on the platform, particularly content that could “inspire others to replicate those violent acts and cause real offline harm.”

Representatives from each platform issued statements addressing their company’s approach to policy enforcement. The justification for Trump’s removal was nearly identical in both instances. “We removed these statements yesterday because we judged that their effect—and likely their intent—would be to provoke further violence,” wrote Zuckerberg on January 7th. He went on to call the risks of preserving Trump’s account “simply too great.” Twitter’s company account echoed this sentiment the following day. “We have permanently suspended the account due to the risk of further incitement of violence.” Both companies linked their decisions to individual posts made by Trump either during or immediately after the insurrection. While Facebook cited posts published by Trump on January 6th, Twitter cited tweets published on January 8th.

Importantly, the fate of Donald Trump’s infamous social media presence came down to a few poorly timed posts, at least from the perspective of policy enforcement. Two posts were the difference between Trump’s status as a “public figure with protected speech” and a “dangerous individual with a violent mission”—his speech had finally crossed the hotly debated harm threshold. Had he stayed silent during the insurrection, Trump might still boast a monopoly on our digital news feeds.

Now, nearly three months later, former President Donald Trump is fending off an onslaught of criminal investigations and lawsuits that directly cite his old social media posts as primary evidence of misconduct. An unusual body of Twitter threads and Facebook posts even shaped the Senate impeachment trial.

The Article of Impeachment passed by the House of Representatives cited Trump as the primary cause of the violent insurrection in January. “Incited by President Trump,” the Article reads, “members of the crowd ... unlawfully breached and vandalized the Capitol, injured and killed law enforcement personnel, menaced members of Congress, the vice president, and Congressional personnel, and engaged in other violent, deadly, destructive, and seditious acts.”

Surely these are consequences that Facebook and Twitter executives would qualify as “real-world harm” and “offline harm,” yet all of these acts occurred before Trump was de-platformed.

This fact might hold less weight if the House hadn’t presented a comprehensive analysis of Trump’s social media activity during the impeachment trial in an attempt to prove that Trump qualified as a “dangerous individual with a violent mission” long before January 6th. The House managers’ opening statements alone contained over a dozen screenshots of Trump’s Twitter feed. Posts dating back to October were presented as evidence that Trump’s incendiary speech was patterned and intentional. They argued that Trump used disinformation and incendiary rhetoric to design a back-up plan in case he didn’t win the election fairly. Of course, his back-up plan was nothing short of a hostile takeover, and all of these posts, they argued, implicated him as inciter-in-chief.

Photo: Associated Press

Why, then, were Trump’s accounts removed in January after it was too late? And what does all of this say about the efficacy of Facebook and Twitter’s incitement policies? For one, it suggests that a post-by-post enforcement model inadequately defends against nuanced and coordinated incitement campaigns. Perhaps a more robust approach to policy enforcement is necessary—one that captures and contextualizes patterned rhetoric and controversial posts that do not explicitly violate community standards.

Even after the impeachment trial, several House Representatives sued Donald Trump for incitement. Mississippi Congressman Bennie Thompson accused Donald Trump of violating the Ku Klux Klan Act, calling the insurrection at the Capitol a “direct, intended, and foreseeable result of the Defendant’s unlawful conspiracy.” This case also leverages tweets from early December as evidence of incitement.

On December 10th, days after armed protestors threatened Michigan Secretary of State Jocelyn Benson at her home, Trump tweeted, “People are upset, and they have a right to be... This is going to escalate dramatically. This is a very dangerous moment in history.” Trump also made a habit of endorsing the actions of his most radical supporters through his social media. As “Stop the Steal” rallies became particularly violent in early December, Trump tweeted, “I’ll be seeing them!” He also used his platforms to direct this momentum: “Big protest in DC on January 6th. Be there, will be wild!” None of these posts qualified as incitement under Facebook’s and Twitter’s terms.

Beyond Trump’s own posts, the case also draws attention to his following’s response, illustrating just how these campaigns gain momentum on social media.

It highlights how some Trump supporters discussed the need to “go to war” with dissenters on message boards. It mentions how the Arizona GOP Twitter account posted a clip from the movie Rambo and asked if supporters were willing to die for Trump. On January 1st, the chair of the Women for America First organization assured Trump that the cavalry was coming in a post that the president then re-tweeted.

When evaluated individually, none of these posts violate Facebook and Twitter’s Community Standards. They do not contain explicit evidence of the poster’s violent intentions. There is no “proclamation of a violent mission.” When evaluated all together (as they were in the Impeachment Trial and the Thompson Case), the posts prove that Trump led a pointed and enduring incitement campaign.

Whether or not he was conscious of it, Trump took advantage of the policy language that blurs the line between implicit and explicit incitement in Facebook’s Community Standards. Facebook claims to prevent and disrupt real-world harm by disallowing individuals that proclaim a violent mission from having a presence on their platforms. Yet the insurrection was real-world harm, and Trump’s mission was undoubtedly violent. Clearly, Facebook’s content moderation procedures are morally dubious.

Nearly all content moderation decisions occur at the individual-post level. Harmful content must be flagged by AI or reported by a user before it is sent to one of Facebook’s 15,000 content moderators. These moderators are responsible for deciding whether or not the flagged post violates any relevant Community Standards. They are provided no context beyond the language that is contained within the individual post. Sure, low-level Facebook employees were not responsible for evaluating Donald Trump’s posts. But they were certainly responsible for handling many of the accounts that internalized and reciprocated his rhetoric.

Even the higher-up Facebook executives responsible for monitoring Trump’s account could only operate within the post-by-post enforcement model. This is proven by the official justifications that were given by Zuckerberg and other members of the Facebook team about their policy decision. Trump was de-platformed because of the comments he made on January 6th, not because of the instrumental role he played in constructing a larger hate-fueled, anti-democratic narrative.

And while Trump began writing his hate-fueled narrative long before October, “Stop the Steal” gave it cohesion and direction. Within this framework, Trump and his followers were able to transmit coded messages that bypassed community standards on social media platforms. Suddenly, plans to coordinate violent action at rallies and election events were politically legitimized. The message “Stop the Steal” itself is a moral appeal: a call to correct serious and perverse injustices. Trump’s most devout followers believed they could not act otherwise. It was their duty to act. Evidence of their sense of duty was scattered all over Facebook and Twitter in the months leading up to the insurrection, the same duty that inspired thousands to chant “Victory or Death” at Trump rallies.

So what, then, is the content moderation equivalent of evaluating all of Trump’s posts together? What is the content moderation equivalent of evaluating all of Trump’s posts in conjunction with the rhetoric espoused by his followers? What might qualify as a “dangerous narrative,” or is Facebook only able to account for dangerous individuals and organizations?

May 4, 2021
