Maya Ackerman (@ackermanmaya) is a faculty scholar at the Markkula Center for Applied Ethics and an assistant professor in Santa Clara University’s Department of Computer Science and Engineering. She is also the CEO/co-founder of the musical AI startup WaveAI. Views are her own.
November 2022 marked the beginning of a new era: the great awakening of generative AI. The emergence of creative machines, which seemingly appeared out of nowhere, captivated the minds of many. People reacted with both awe and horror to the machines' shocking creative abilities in visual art and language. However, amidst the general reaction, as a generative AI researcher I saw something quite different.
Firstly, a little bit of history. I will not delve into how generative AI has existed since the 1950s, with creative machines spinning songs in the style of Bach and Vivaldi (see the work of UCSC professor emeritus David Cope), and I won’t dwell on how machine-made artworks were already showcased in galleries in the 1960s (see Harold Cohen’s AARON). I will start our journey, instead, with two all-too-familiar players: Google and Microsoft.
The largest players in tech are able to invest in their own research and stay ahead of emerging trends, which also means that, with their deep PR budgets, they are often credited with innovation that has been born out of academia. Google and Microsoft were involved in generative AI well before November 2022. You may recall Google Magenta releasing automated music in classical styles. Or, perhaps, you’ve seen the hallucination-like artworks produced by Deep Dream, a unique art style allegedly born out of a bug in Google’s image recognition software.
Microsoft, on the other hand, gave us Tay, a chatbot prematurely unleashed onto Twitter. Much to Microsoft's embarrassment, Tay rapidly turned racist while on display for the entire world to see. Microsoft learned its lesson: Keep your brand safely away from risky innovation. Then came OpenAI, a seemingly independent startup that's often associated with Elon Musk. But if you peek under the hood, you'll find that OpenAI is closely tied to Microsoft, with the tech giant being the largest investor in this incredibly well-funded startup.
OpenAI made a massive step forward in generative AI using a very simple recipe: It took generative AI models developed by academics and poured many millions of dollars into making them orders of magnitude larger than ever before, training the models on about 10% of the internet. Though not as innovative as it may appear, this is a legitimately major step forward. Just as the number of neurons in the human brain is a critical aspect of our intelligence, the size of an artificial neural network greatly impacts its abilities.
The GPT text models by OpenAI were complemented by DALL-E, a very large text-to-image model, which produces original artworks based on user prompts in just about any style, or in the manner of just about any artist, you can imagine. This is where Stability AI comes in. Founded by a former hedge fund manager, Stability AI was able to accrue the millions needed to build its own version of DALL-E, which it named Stable Diffusion. Its big move was then releasing the model publicly as open source.
With OpenAI (think: Microsoft) pouring a generous PR budget into popularizing DALL-E, it is no surprise that the developer community rushed to check out the open-source version offered by Stability AI. This paved the way for Stability AI to raise an incredible $101M in November 2022, giving it unicorn status.
With the investor community waking up to this emerging trend, startup founders rushed to pivot and create new companies in the generative AI space. For a moment, all eyes were on Lensa AI, as the internet filled with AI-generated selfies. Then OpenAI released ChatGPT, solidifying generative AI as the next great thing.
As we enter 2023, the potential of generative AI to transform industries is becoming more apparent. From creating art, music, text, and marketing copy to perhaps even threatening search, generative AI is already making its mark. However, there are also concerns about the impact of this technology on human creatives. Some ask: If anyone can easily copy an artist's style, what is the point of creating art? This has led to legal action, with artists suing companies like Stability AI for using their images without permission.
It's important to remember that generative AI has had decades of development within academia. In my experience as a generative AI researcher, I've never seen creative machines in the academic context that focus on imitating the style of a living artist, unless the system was created by that artist. Instead, we have general-purpose systems that can create anything from poetry, music, and pottery to stories, dance choreography, and more. Many academics have even experimented with the idea of AI systems developing their own unique styles, as a way to push the creative abilities of their machines.
As someone who has dedicated her career to the research and commercialization of creative machines, I can attest to the incredible potential of these systems. However, I also believe that it is crucial for the industry to find more ethical and sustainable ways of working alongside human creatives, rather than competing against them. As academia has shown, it is certainly possible to create powerful generative AI technology without these pitfalls.
The true power of generative AI lies in its ability to elevate human creativity, rather than replace it. In fact, instead of encouraging imitation of other human artists, AI can help creatives find their own unique voice. As the CEO/co-founder of WaveAI, I have seen firsthand how our systems have helped millions of musicians and aspiring creators to express themselves in new and exciting ways. This is not simply about speed and efficiency, but about opening up a realm of creative possibilities and offering a new type of creative process where inspiration is always abundant.
As we navigate the social changes brought about by generative AI, it is important to keep a clear vision for the future: A future where machines are used to expand our creative capabilities and push the boundaries of what is possible. Within a decade, generative AI will no longer be a source of shock or awe, but a fundamental part of our lives. It is our responsibility to ensure that this transition is handled in an ethical and sustainable manner.