Beyond the Black Box: Unpacking the Impacts of Generative AI in Academia

Erik Wieboldt, Senior Staff Writer

Pandora’s box is open. Generative AI exists and will continue to influence academic and instructional settings. How we choose to monitor, detect, and use this tool, as individuals and at a university level, will determine what comes from this technology. For many, GenAI tools feel indispensable as our expectations for research change alongside them. To explore the impact of GenAI tools such as ChatGPT on educational structure and learning, I participated in a student panel during UC San Diego’s Academic Integrity Virtual Symposium. My fellow panelists were Kharylle Rosario, a junior majoring in Molecular & Cell Biology; Nathaniel Mackler, a senior majoring in Cognitive Science; and Sukham Sidhu, a junior majoring in Economics & Data Science. Our panel was moderated by Avaneesh Narla, a doctoral candidate in Physics with a Quantitative Biology Specialization.


Our panel discussed the range of impacts that GenAI has in education and beyond, including in the fields of law, medicine, and even creative writing. In education, we acknowledged that while GenAI can be used as a tool to support learning, there is also the potential for misuse. For example, the line between plagiarism and original work becomes blurred with GenAI use. In many cases, we also cannot identify the sources from which a GenAI tool is pulling, so there is an argument to be made that GenAI is stealing intellectual property when it generates text or images. That said, there is no strict legal code to guide GenAI use, at least in the United States, and in education, restrictions on its use are implemented inconsistently.


Detection of GenAI use is another hot topic in education. Tools like GPTZero provide a percentage likelihood that a provided text is AI-generated or written by a human. While this novel tool could theoretically deter students from simply submitting GenAI output as their own work because of the risk of being detected, GPTZero is not flawless. Its developers claim a detection accuracy rate “higher than 98%,” which is outstanding for such a new technology. However, it’s worth noting that, within this margin of error, there can be both false positives and false negatives. With some institutions considering an expulsion policy for the use of GenAI, false positives could result in serious harm.


Our panel also discussed the ethical implications of GenAI use in other areas. Microsoft’s Tay chatbot had to be taken down within 16 hours of its 2016 launch because of inflammatory hate speech. Because the data GenAI is trained on is influenced by human biases, so too are its outputs. There is also the issue of the “Black Box” of artificial intelligence: those who created the code that drives GenAI do not fully understand how it works. This Black Box effect is of concern because, in some cases, language generation tools have been wildly incorrect and have cited sources that are fabricated or do not exist. On top of inaccuracy, there have been specific examples of tools like ChatGPT producing strange and discriminatory outputs. It’s also important to highlight that GPT-3, the predecessor to ChatGPT created by OpenAI, was prone to “violent, sexist, and racist remarks” as well. According to a report by Time magazine, to curb these biases, OpenAI “sent tens of thousands of snippets of text to an outsourcing firm in Kenya,” using very graphic material to train the system to detect and filter such content. That outsourcing firm, San Francisco-based Sama, paid its workers between “$1.32 and $2 per hour depending on seniority and performance” to work through some of the most vile content the internet had to offer. While this relationship between OpenAI and Sama later fell through, the creation of artificially generated text relies on exploitative labor. Regardless of possible benefits, its origins cannot be overlooked.


The origins of GenAI systems are important to consider when assessing their usefulness in academic settings. These tools are still being developed. They have flaws, and in many cases need human oversight to function well and ethically. The usefulness of these GenAI tools does not exist in a vacuum. While there have been many helpful uses of AI systems, such as predicting abnormalities early in health screenings and training models to translate obscure languages that may otherwise have been lost to time, the ethics and ground rules of this technology need to be seriously considered for general, academic, and industry use. I’m happy to have spoken on a panel with students from different majors and departments, with different educational backgrounds and different perspectives on how artificial intelligence impacts our environments and learning. It frightens me, however, that these conversations are not more mainstream. The ethical implications, origins, and future of GenAI systems are crucial to understanding their relative impact. I hope that conversations about the net good of GenAI systems continue to happen so that we can figure out how best to use AI. The possibilities are beyond our imagination, but hopefully not beyond our control.

Image generated with the assistance of OpenAI’s DALL-E 2 by Sparky Mitra