Every day, we read the news, and we read quotes from people in it. Most of the time these are journalism’s “experts.” At other times we read the voices of those within the communities being covered. But who is quoted, how often, and in what proportions by gender, title, race/community, and so on? These are the questions that annual diversity audits address for news organizations that can afford them.
But for everyday reporters and editors who are part of large and small newsrooms, there is no “everyday system” to monitor their own quoting patterns and nudge themselves towards diversity, equity, and inclusion (DEI) norms. To start addressing this gap with human-augmented technology, the Journalism and Media Ethics program at the Markkula Center for Applied Ethics has prototyped a WordPress Source Diversity Dashboard and Monitor toolkit.
The WordPress plugin helps reporters visualize source-diversity proportions for quotes in their article drafts as well as published pieces. Unlike one-time, manual audits, the plugin offers near on-demand feedback to ease barriers to assessing stories’ representativeness, and provides immediate opportunities to fix inequitable reporting.
Figure 1: Prototype of Source Diversity Dashboard. A reporter or editor can request annotation of a draft or published article and see a table of quotes displayed with DEI data.
Figure 2: Prototype of Source Diversity Dashboard. The Source Diversity plugin displays monthly DEI-annotated data for all the articles from the news site or a given author. The display includes top-quoted persons as well.
The plugin works in conjunction with a news article annotation server (located at Santa Clara University) to identify quotes in the text of stories and compute source-diversity proportions for gender, title, and community representation, available for review in real time. We demonstrated early versions of the plugin to a few early-adopter WordPress newsrooms beginning in the fall of 2021, and as of January 2022 the plugin is in trial deployments.
Separately, and to complement the plugin, we have prototyped a web monitor application that offers the same visualizations at the site level for thousands of U.S. “news” sites. At full scale, the monitor system will be able to host source-diversity data for up to 7,000 U.S. sites. The data dashboards will be at site level (i.e., source diversity for all the articles coming out of a site) and are organized by month. We’ve built a database of site source-diversity proportions by processing text taken from the Lexis-Nexis news archival service. Demos of the monitor tool will begin in February and will primarily be offered to editors of news sites interested in seeing their own source-diversity patterns.
The Google News Initiative and Local Media Association recently conducted a webinar titled “Innovation Challenge Webinar: Showcase of newsroom tools that advance diversity, equity and inclusion.” Subbu Vincent, director of Journalism and Media Ethics at the Markkula Center for Applied Ethics, and Yi Fang, associate professor in the Department of Computer Science and Engineering at Santa Clara University, shared more about the project and its application in local newsrooms.
A third, unanticipated outcome of this toolkit effort is that the project now has a DEI news annotation API server for limited use. Newsrooms that want to build their own source-diversity dashboards can prototype their efforts with a REST API service. Send an article to this server and you will get the data back for that article in JSON format: the quotes, who spoke them, their likely gender, their likely community group (BIPOC or white), whether each is an expert quote, etc. This service is now running on a prototype SCU server and is not yet ready for at-scale use. We're happy to share the API with any newsrooms staffed with developers who want to give it a spin to test and build their own on-demand DEI audit apps. Get in touch with us if you’d like to know more. [email@example.com]
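For newsroom developers, here is a minimal Python sketch of how a response like the one described above might be turned into source-diversity proportions. It is an illustration only: the JSON layout, the field names (`quotes`, `speaker`, `gender`, `community`, `is_expert`), and the sample values are all assumptions, since the service's actual schema is not documented here. The sketch parses a sample response string rather than calling the server, because the endpoint and request format are shared only on request.

```python
import json
from collections import Counter

# Hypothetical response body. Every field name and value below is an
# assumption made for illustration, not the service's documented schema.
SAMPLE_RESPONSE = """
{
  "quotes": [
    {"quote": "Sample quote one.", "speaker": "Speaker A",
     "gender": "female", "community": "BIPOC", "is_expert": true},
    {"quote": "Sample quote two.", "speaker": "Speaker B",
     "gender": "male", "community": "white", "is_expert": true},
    {"quote": "Sample quote three.", "speaker": "Speaker C",
     "gender": "female", "community": "BIPOC", "is_expert": false},
    {"quote": "Sample quote four.", "speaker": "Speaker D",
     "community": "white", "is_expert": true}
  ]
}
"""


def proportions(quotes, field):
    """Return each value's share of the quotes for one annotation field.

    Quotes that lack the field are counted under "unknown".
    """
    counts = Counter(q.get(field, "unknown") for q in quotes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: count / total for value, count in counts.items()}


quotes = json.loads(SAMPLE_RESPONSE)["quotes"]
gender_share = proportions(quotes, "gender")
community_share = proportions(quotes, "community")
expert_share = proportions(quotes, "is_expert")

print("gender:", gender_share)
print("community:", community_share)
print("expert:", expert_share)
```

Counting a missing annotation as "unknown" keeps the shares honest when the server cannot suggest a value for a speaker. A real dashboard would likely also aggregate these per-article shares by month and by author.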
We’re currently testing the WordPress plugin with Wisconsin Watch and preparing for trial with a few other newsrooms.
If your newsroom uses WordPress and you are interested in testing the plugin, please contact Subbu Vincent by email at firstname.lastname@example.org.
Stay tuned for more this year.
Our prototype software development team
We appreciate all the effort our engineering students have put in to bring the DEI toolkit to this stage of release. We started with wireframes. From there, through UX screens, plugin coding, back-end database setup, annotation API service bring-up, and bootstrapping the plugin with data for the dashboards, they have done it all.
Xiaoxiao Shang, Computer Science and Engg (Ph.D. student) - backend and API server
Zhiyuan Peng, Computer Science and Engg (Ph.D. student) - Race/Ethnicity/Community Group suggestion API
Qiming Yuan, Computer Science and Engg (M.S. '22) - WordPress development and Monitor app
Sabiq Khan, Computer Science and Engg (B.S. Senior '22) - WordPress development
Lauren Xie, Computer Science and Engg (B.S. Junior '23) - Testing and UX evaluation
[Also: Upadnya Chavarkar (M.S. Computer Science and Engg '21) helped with the early stages of this work.]
Subramaniam Vincent, director, Journalism and Media Ethics, Markkula Center for Applied Ethics
Yi Fang, associate professor, Department of Computer Science and Engineering, Santa Clara University
We gratefully acknowledge support from Google News Initiative, Facebook Research, and Santa Clara University.
The DEI audit toolkit we have developed is based on a news article quote-annotation kernel we built for the Markkula Center’s Journalistic Behavior Detection Dataset project in 2020. That project was funded by the News Quality Initiative, without which the 2021 DEI audit toolkit work would not have been possible. https://newsq.net/2019/12/18/how-might-we-detect-who-is-and-who-is-not-doing-journalism-online/