[Image credit: Teresa Berndtsson / Better Images of AI / Letter Word Text Taxonomy / CC-BY 4.0]
The need to identify the relevant stakeholders—all of the people and groups who will be impacted by a particular choice—is the first step mentioned in many ethical decision-making frameworks. It is a key component that underlies the ethical analysis to follow. Yet, in its recent decision to release ChatGPT into the wild, OpenAI seems to have failed to consider students and educators among the stakeholders who would be impacted by the text-generating technology.
OpenAI announced the public release of its AI tool on November 30, 2022. Within weeks, there were reports of students submitting ChatGPT-generated responses to assignments in a variety of courses, at various educational levels. Teachers scrambled to respond; so did others.
Programmers are more likely to understand and use such tools, so they acted first: by December 5th, Stack Overflow, a question-and-answer site for developers, had already moved to ban the submission of ChatGPT-generated answers. As the administrators of the site explained,
“because the average rate of getting correct answers from ChatGPT is too low, the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking and looking for correct answers. The primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good and the answers are very easy to produce.” [emphasis in the original]
Some technologists are also playing an educational role by demystifying the technology and clearly explaining its limitations. One of them, Colin Fraser, explains that language models are programmed to “record empirical relationships between word frequencies over a historical corpus of text, and use those empirical relationships to create random sequences of words that have similar statistical properties to the training data.” He adds, “The only thing anchoring the output of a language model to the truth is the truth’s relationship to word frequencies in the training data, and nothing guarantees that relationship to be solid.”
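Fraser’s point can be made concrete with a toy sketch. The following is not OpenAI’s code or architecture, just an illustrative bigram model: it records how often each word follows each other word in a tiny corpus, then samples random sequences with similar statistics. Nothing in the procedure checks whether the output is true.

```python
import random
from collections import defaultdict

# A tiny corpus; real models train on vastly more text,
# but the principle is the same: only word co-occurrence counts.
corpus = ("the model writes fluent text . "
          "the model writes confident text . "
          "fluent text is not true text .").split()

# Record empirical bigram frequencies: how often each word follows another.
bigram_counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def generate(start: str, length: int, seed: int = 0) -> list[str]:
    """Sample a word sequence using only the observed bigram frequencies."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(length - 1):
        followers = bigram_counts[words[-1]]
        if not followers:
            break  # dead end: this word never had a successor in the corpus
        nxt = rng.choices(list(followers), weights=followers.values())[0]
        words.append(nxt)
    return words

print(" ".join(generate("the", 8)))
```

The output is fluent-looking by construction, because every adjacent word pair was seen in the training text; whether the resulting sentence is accurate is simply never part of the computation.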
Compare those comments to OpenAI’s statements in its announcement blog post on November 30:
“We’ve trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer followup questions, admit its mistakes, challenge incorrect premises, and reject inappropriate requests. ChatGPT is a sibling model to InstructGPT, which is trained to follow an instruction in a prompt and provide a detailed response.”
It is only further down the post, if you scroll past the “Samples” and “Methods” sections, ignoring multiple buttons that invite you to try the program, that you get to the “Limitations” section. The first one listed is that “ChatGPT sometimes writes plausible-sounding but incorrect or nonsensical answers.”
Another limitation listed is that ChatGPT “is often excessively verbose and overuses certain phrases…. These issues arise from biases in the training data (trainers prefer longer answers that look more comprehensive).” Left unsaid is that “longer answers that look more comprehensive” is a pretty good description of a lot of bad writing.
Imagine if the introductory paragraph had instead read, “We are introducing a model optimized for dialogue—not for accuracy or for good writing.” Would that have discouraged at least some of the widespread misuses of the tool—including those by students?
As instructors continued to struggle to identify AI-generated text and put policies in place to react to its deployment, on January 31, 2023, OpenAI published another blog post—this time announcing a “New AI classifier for indicating AI-written text.” This time, some careful internal drafters chose the word “indicating”—as opposed to “identifying.” This time, a disclaimer came much earlier: the first line of the second paragraph, bolded, reads, “Our classifier is not fully reliable.” Unfortunately, that immediately proves to be a startling understatement. The next line quantifies the “not fully reliable” part: “In our evaluations on a ‘challenge set’ of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as ‘likely AI-written,’ while incorrectly labeling human-written text as AI-written 9% of the time (false positives).”
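A back-of-the-envelope calculation shows what those two rates mean in a classroom. The scenario below is hypothetical (a class of 200 essays, half AI-written, half not—my assumption, not OpenAI’s evaluation setup); only the 26% and 9% rates come from the blog post.

```python
# Hypothetical classroom: 100 AI-written essays and 100 human-written ones.
# Rates from OpenAI's post: 26% true positives, 9% false positives.
ai_essays, human_essays = 100, 100
tpr, fpr = 0.26, 0.09

flagged_ai = ai_essays * tpr         # AI essays correctly flagged
missed_ai = ai_essays - flagged_ai   # AI essays that slip through undetected
flagged_human = human_essays * fpr   # honest students falsely accused

# Of all essays the classifier flags, what fraction are actually AI-written?
precision = flagged_ai / (flagged_ai + flagged_human)

print(f"AI essays caught:       {flagged_ai:.0f} of {ai_essays}")
print(f"AI essays missed:       {missed_ai:.0f} of {ai_essays}")
print(f"Humans falsely flagged: {flagged_human:.0f} of {human_essays}")
print(f"Precision of a flag:    {precision:.0%}")
```

Under these assumptions the tool misses 74 of the 100 AI-written essays while still falsely flagging 9 honest students—a detector that is nearly useless for catching cheating yet risky enough to harm the innocent.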
The post continues with its understatements—for example, when it states, “[w]e recognize that identifying AI-written text has been an important point of discussion among educators…” “Discussion” doesn’t begin to do justice to the upheaval that the release of this tool has created in education.
No organization can identify every possible way in which its products might be misused. Some likely misuses, however, are fairly obvious. OpenAI couldn’t have been surprised by the fact that students might turn in ChatGPT-generated writing as their own. It could have consulted with educators and students before releasing the product into the wild. It could have waited to release it at the same time as a tool that would do a better job of identifying and watermarking AI-generated responses.
Instead, two months after the public release of ChatGPT, OpenAI blogged, “We are engaging with educators in the U.S. to learn what they are seeing in their classrooms and to discuss ChatGPT’s capabilities and limitations…. These are important conversations to have as part of our mission is to deploy large language models safely, in direct contact with affected communities.”
We need a redefinition of what constitutes ethical “direct contact with affected communities.”
As the observant data scientist Colin Fraser pointed out, “ChatGPT seems to assign relatively high probabilities to responses containing an admission of error. This appears to the user as though it recognizes and corrects its mistakes.” It’s ironic, then, to read the latest OpenAI blog post, with its careful sidestepping of any admission of error—or apology. But then, Fraser also reminds us that ChatGPT’s apparent recognitions of error are themselves just probabilistic analyses of word combinations: “There is no reexamination, there is no searching for errors, there is no correction, there is no apology.”
Lack of self-awareness should be expected in AI, but not accepted in organizations run by humans.