The A.I. Art Rebellion: How They Work
Wanna fight back against the Machine? A series of essays on how (and why) to do it.
Preamble: Where I’m Coming From
Most people who have been following my social media this year are aware that I have been experimenting with MidJourney, one of the more popular and accessible A.I. Art Generators. Unlike the majority of casual users, who log in to Discord to enter a few free prompts into the new art toy and lose interest within fifteen minutes, I actually have experience with A.I. development projects (I’ve worked on two of them) and I’m interested in A.I. Ethics (in fact, I was interviewed for a course in A.I. Ethics at Athabasca University).
I immediately recognized that A.I. Art tech had the potential to be immensely powerful…but also immensely disruptive. So I decided to pay for my MidJourney account and dedicate some serious focused time and attention to figuring out how to use it to make images.
For those who might wonder why: I had two reasons.
Reason Number One was professional development. I am a narrative designer, a wordsmith who operates in the videogame industry. When I am working on a team of any size, a certain amount of my time and energy every day has to be spent in communicating my ideas to other people. To do this, I use whatever tools are available to create documents and decks (slideshows) which summarize and illustrate my ideas about characters, story beats, world-building and lore in a game universe. I often need images to communicate my ideas, not just words.
I’ve been doing this job for over 20 years, and I’ve worked alongside many amazing artists. Some were my colleagues in-house, doing concept art and 2D/3D assets for games on a full-time basis. Others were freelance artists who signed on with me part-time when I was acting as my team’s outsource art director, particularly on the Sword of the Stars series.
Over time, I’ve developed considerable patience and competence in writing briefs and summaries, assembling visual references, and working through the stages of concepting with artists doing 2D and 3D work. But I have discovered that there can be profound misalignments of vision between people who tend to think in words (Word People), and people who tend to think visually (Picture People).
Since A.I. Art Generators literally function by bridging the gap between words and pictures, and since all the skills I’ve developed working with real human artists are very similar to the skills needed to work well with the A.I., I thought I should give it a whirl.
Reason Number Two for studying MidJourney can be summed up as “KNOW THINE ENEMY”. I love and respect human artists. My mother and my daughter are both life-long creators in visual media. I’ve tried many kinds of visual art myself, including textile painting. And I’ve spent thousands and thousands of dollars supporting visual artists in various ways over the years, from funding their Patreons to commissioning their work while doing pre-production or spin-off projects as a game developer.
In order to assess MidJourney as a threat to the livelihoods and property rights of human artists, I had to see how it works—and specifically what it can and cannot do. I wanted to see what it could accomplish, where it struggled and failed, what its limits might be—and perhaps more importantly, how fast it could learn and improve.
In the past 30 days or so, as beta versions of the various AI Art Generators have become more and more powerful, the conversation about their impacts on the world around us is definitely coming to a head.
At this year’s Colorado State Fair, an A.I.-Generated Work of Art won a juried digital art contest, defeating human artists using other digital tools.
An A.I. Art Generator has produced the cover art of a Big 5 book and graced the cover of Cosmopolitan Magazine.
Author and artist Ursula Vernon has created some beautiful and profound webcomics using a combination of AI-generated art and backgrounds, as well as her own drawing and graphic design skills.
Science fiction writer John Scalzi has posted his thoughts about exploring AI art, followed quickly by an update renouncing public exploration of AI art when it was used for a book cover.
So. It is probably past time for me to start talking seriously about A.I.-Generated Art, rather than just sharing the results of my experiments.
Some people really don’t seem to know what it is, how it works, and what we can/should DO about its disruptive effects on human artists. (Hint…it’s probably the same thing we need to do for human workers in any field. More on this later.)
For those who might be wondering how we got here, technologically and legally, the first thing you need to know about an AI Art Generator is how it produces an image.
How It Works
To do its magic, any given A.I. Art Generator needs a pair of neural nets. One side of the bot has to be trained to understand human language; the other side has to be trained to understand and produce imagery of some kind. In some respects, an A.I. Art Generator very much mimics the functioning of a real human brain—our linguistic/rational and daydreaming/visual cortices are similarly divided.
The linguistic side of the bot uses its training to interpret a “prompt” (a word or series of words) and break it down into a series of information tags which are handed over to the visual side of the bot to serve as the guidelines to assemble an “asset” (an image). One of the reasons I like MidJourney is that the AI always makes four attempts at generating an image from any given prompt, producing quick thumbnails which the human user then has a chance to use, reject completely, or modify by making variations.
When the linguistic data is transferred to the artistic side of the bot, however, you can often see the imperfections in the communication between the two sides of the A.I.’s brain.
Example: when I entered the following simple prompt into MidJourney today, this was the output.
If I change my settings to a previous version of MJ, I can see the “artistic growth” of the A.I. over the past several months. Here’s what the same prompt would have given me with the version of MidJourney that was available in July 2022:
And going back in time to when I first joined the beta test group, here are the thumbnails the bot could have given me back in April.
As you can see, at various stages of its history, MidJourney has struggled mightily with the production of human features. And it still struggles with facial symmetry of people and animals, and complex features such as hands, paws, tails and other extremities, even in Version 4. Artists who work with the bot often have to spend a certain amount of time correcting the wonky elements of its outputs—the more time they spend making these corrections, the better the result looks.
Another quirk is that MidJourney struggles with grammar. This is most evident when you add a color variable to the prompt. As you can see above, I asked for a woman with brown skin and red hair, but the earlier versions of the bot sometimes also wanted to give me red flowers, a red dress, or a red haze in the background. Because I asked for red, right?
I also asked for a field of wildflowers, but the wildflowers sometimes creep onto the woman’s clothing and hair. Because MOAR FLOWERS. The A.I. is looking at a set of word tags and applying them as best it can to create an image that will serve my needs, but it has trouble placing boundaries on the various parameters of the image. It’s the same problem that gives it trouble counting the fingers on a human hand, or the legs on a dog or cat—something that human children figure out by the second grade, as a rule.
When I give it a prompt, the machine understands that “Woman” is the subject of my desired image. “Brown-skinned” is a descriptor of the Woman. “Standing” is a pose. “In a field” is a location. “—of wildflowers” is a modification of “field”, but also an element to be added to the image. Etc.
When I don’t specify a feature of the image, like the time of day, the weather or the quality of light, MidJourney will offer me a default—in this image, the defaults are obviously day-time, clear to partly cloudy skies, summer or spring, etc.
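To make the process above concrete, here is a toy sketch of prompt decomposition: recognized phrases get role tags (subject, descriptor, pose, location), and anything the prompt leaves unspecified falls back to a default. This is purely illustrative—MidJourney’s actual parser is proprietary and far more sophisticated than a phrase lookup, and every name and value here is my own invention.

```python
# Hypothetical illustration of decomposing a prompt into tagged parameters.
# MidJourney's real parser is proprietary; this is only a conceptual sketch.

ROLE_HINTS = {
    "woman": "subject",
    "brown-skinned": "descriptor",
    "standing": "pose",
    "in a field": "location",
    "of wildflowers": "modifier",
}

# Parameters the user never mentioned get plausible defaults,
# much as the bot defaults to daytime, clear skies, etc.
DEFAULTS = {
    "time_of_day": "daytime",
    "weather": "clear to partly cloudy",
    "season": "spring or summer",
}

def parse_prompt(prompt: str) -> dict:
    """Assign a role tag to each recognized phrase, then fill in
    any unspecified parameters with defaults."""
    tags = {}
    lowered = prompt.lower()
    for phrase, role in ROLE_HINTS.items():
        if phrase in lowered:
            tags.setdefault(role, []).append(phrase)
    for param, value in DEFAULTS.items():
        tags.setdefault(param, value)
    return tags

print(parse_prompt("Brown-skinned woman standing in a field of wildflowers"))
```

Note that in this toy version, nothing stops the “wildflowers” tag from bleeding into other parts of the image—which is roughly the boundary problem described above.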
Being able to work with all this data is the reason that MidJourney, like a real human artist, can take a verbal prompt and translate it into an image which has never existed before. But like a real human artist, MJ needs to look at a lot of art, in order to make art. All artists need training and most require references—believe me, I’m speaking from painful personal experience on that score.
Don’t Forget the User
In all the discussion and focus on the machine itself, of course, people often forget that all A.I.-Generated Art relies on a third, much more sophisticated machine—the human user. Or, more accurately, in the case of MJ—hundreds or thousands of human users per day.
When you look at the sequence of MidJourney results above, please keep in mind that no new Training Data was added between the release of the different versions. What took MidJourney from making crude blobby pastel drawings to making polished, commercially viable imagery was the users—people who have donated thousands of hours of their time in generating, upgrading, and flagging the output of the AI, telling the devs when it was getting warmer or colder.
So just as 2.3 billion human-made images were used to teach the A.I. what an image is…every human being who has interacted with the A.I. since it began generating images has been teaching the bot how to please us, and what human beings want to see. Every time we use it, we’re telling it what is beautiful, interesting, funny, cute, moving, etc—what is “art”.
Similarly…our desire to see beauty, and our desire to take what’s inside our brains and get it out into the visible world, is the entire driving force behind this tech. Human beings want to create, and the AI is a tool that allows some of us to overcome our handicaps and obstacles in order to be creatively expressive.
So yes. It’s easy for a person who has had the time, money, and physical ability to become a professional artist to tell us that this new tool is evil, and that “real artists” should retain their monopoly on creating visual images that have aesthetic or commercial value.
But are they in the right? I’m not sure. And what gives me doubts is not just that the A.I. is a fun toy, or that I’ve made some art I like with it.
I’m dubious because this is not the first disruptive technology I’ve seen in my lifetime.
I remember a time before there was such a thing as digital art at all.
I remember a time before there was such a thing as digital photography.
And I’m old enough to remember that every time technology expanded a creative field to empower more people with less money, time and access to be creative, the people who had already beaten the game were angry, scared and disgusted at the disruption of the status quo…and bitter that the new artists had it so easy.
And yet, from my perspective, the world became a better and more interesting place every time a new group of people started making images in these new ways. And although some business models changed or went out of vogue, others are still going strong—you seldom find the tech department of your local drugstore developing film nowadays, for example, but they’re still happy to make prints of your digital photos.
Anyway. Right now, I don’t know what the future holds. At present what I see is disruption, and disruption is always dangerous—it lifts some people up while it threatens others.
So I will continue sharing what I’ve learned. I will write a few more essays this week to try and dispel some of the misinformation and distress I’ve been seeing on-line, and make it a little more clear how the technology works. And I’ll also be sharing what I’ve learned as a MidJourney user to help concerned people to take effective action and pressure the developers of A.I. Art Generators to behave in a more caring and ethical fashion.
This is about as much information as I can fit into a single essay, but I’ll be back next time with some detailed information about where AI Art Generators acquire their Training Data, and how you can find out if your art, likeness or images have been used as Training Data without your consent.
In the mean time, though, I want you to walk away with some information that will be useful in the battle to support real human creators whose livelihoods depend on the commercial value of their original creations.
Understand that all A.I.-Generated Art ALREADY HAS developer-created boundaries. MidJourney in particular is engineered to avoid erotic images, drug images, and certain types of political and horror imagery. There are explicit bans on terms related to sex, sexual body parts, blood and gore, and certain kinds of violence—over 400 terms in all. You are also not allowed to ask it to draw pictures of any known porn star, Hitler or President Xi Jinping.
The fact that prompts containing a politician’s or a porn star’s name can be denied service means that a prompt containing ANY person’s name could be denied service. And yes. That means yours! If you are an artist who has invested significant time and labor into developing a visual style, and you don’t want your style to be copied by legions of MidJourney users, the tools already exist to protect your property rights. ALL YOU HAVE TO DO IS FORCE THE DEVELOPERS TO USE THE TOOLS THEY ALREADY HAVE. That means that if/when you target them with lawsuits or petitions, one of the first things you should fight for is the ability to OPT OUT and have your name banned from the language parser.
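Mechanically, the kind of filter the developers already run is simple: a banned-term list checked before any image is generated. Here’s a minimal sketch of that idea—the terms, function name, and list contents are all hypothetical (the real list reportedly runs to over 400 entries and is not public), but it shows why adding an opted-out artist’s name would cost the developers almost nothing.

```python
# Hypothetical sketch of a prompt blocklist. The real MidJourney filter
# is not public; this only illustrates how an artist "opt out" could
# ride on machinery the developers already use.

BANNED_TERMS = {
    "hitler",          # an example from the published restrictions
    "xi jinping",      # likewise
    "jane q. artist",  # hypothetical opted-out artist name
}

def is_allowed(prompt: str) -> bool:
    """Deny service if the prompt contains any banned term.
    Adding an artist's name to the set is all an opt-out would take."""
    lowered = prompt.lower()
    return not any(term in lowered for term in BANNED_TERMS)

print(is_allowed("portrait in the style of jane q. artist"))  # False: denied
print(is_allowed("portrait of a lighthouse at dawn"))         # True: allowed
```

A substring check like this is crude—real filters have to handle misspellings and evasions—but the principle is the same: the gate already exists, and the fight is over whose names go on the list.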
Being incorporated into Training Data does not mean your art has been stolen per se…at least in the legal sense of the word. Please be aware that the laws as currently written are on the side of the developers, so far as legally generating these images go—and they and their investors know it. At this stage of the game, banning the use of training data gleaned by scraping the Internet would be incredibly hard. You would have to overturn a bunch of different judicial rulings that protect data scraping at various stages, and knock down two or three different companies who collect data legally.
Whether you can own or steal the copyright of any images created by an A.I. is a difficult question. Current copyright laws are out of date, and have no references to images that are created using tech that didn’t exist when they were written. Right now, different A.I. Art Generation developers are taking advantage of this ambiguity, and making very different (and mutually contradictory) claims about who owns the art that is output by their A.I.’s. One wants to claim the company itself owns all outputs of the A.I. Another claims all A.I. art exists or should exist under a Creative Commons license. A third says your A.I.-Generated Art may belong to you if you have a paid account with them, but if you’re using the free account it belongs to the developers and the community. And even if you have paid, they may have the right to be credited if you used their machine to generate the work.
While it’s not really possible to copyright most AI-generated art at present, it also wouldn’t be easy to ban it. Currently, a huge number of AI-generated images would be naturally protected under the same laws that protect parody and satire in other media—including images of corporate-owned characters, logos etc., like this painting where artist Paweł Kuczyński makes fun of Pokémon GO.
My current recommendation for artists and their supporters who consider A.I.-Generated Art a threat is to apply pressure to the developers directly, on two fronts.
First: while your images MAY have been acquired legally to be used as Training Data for the A.I., your art is still copyrighted to you, and you have not consented to have derivative works made, sold, or used commercially by the company or its users.
Thanks to recent events, it is provably the case that certain outputs of the AI do threaten the livelihood of living artists and can undercut the commercial value of any given artist’s unique visual style.
That means that Capitalism is actually on your side! (For once!) You just have to organize an effective resistance to the companies making the bots, and pressure them to acknowledge the threat to your livelihood.
Second: focus your demands on the tools that the developers have already created for their own use. Every time the users of an A.I. input your name, or the name of your copyrighted characters/IP’s, that information could be useful in one of two ways: you can interrupt and prevent the A.I. from using those words to generate images (“opt out”), or you can demand a cut of the company’s profits based on the number of times your works or name are used to generate digital fan art (“opt in”).
If you work collectively with other living artists (and the estates of those artists who may be deceased, but whose works have not entered the public domain), I think you have a good chance of pressuring the developers to recognize your legal and moral rights and block users from exploiting you with the machine—what I would call “the Forbidden Name” model.
Or, if you would prefer to be inspiring to the unwashed masses, there’s an equally good chance that you could start to collect passive income every time that your name, your IP’s or your copyrighted images are used as prompts—what I would call “the Spotify Model”.
Someday soon, it may no longer be possible to use certain names for “in the style of __X” prompts. You could probably also prevent the AI from using prompts based on characters owned by a particular artist, since fan art of that character is likely to showcase the artist’s style.
Example: if you ban the name “Mike Mignola” but also the character “Hellboy”, you couldn’t get the result below: