Jennifer Chen
December 15, 2022

What is AI Generated Art?

Maybe you’re interested in art or technology. Maybe you’re not interested in either! Regardless, you’ve probably noticed that AI generated art has been getting a lot of attention. With a simple prompt, such as “friendly robot riding a large horse,” programs like DALL-E and Midjourney generate images that look like they were created by an actual person. As these programs have become increasingly popular, more and more people are sharing their creations and sparking a lot of discussion about the future of art and technology. Some are optimistic that AI art is a new tool to further enhance creativity, while others worry that this is another case of “AI replacing humans.” 

As with many technology issues, the debate over AI generated art is quite complex, and with this post I hope to break down those complexities for you. Even if you consider yourself a ‘spectator’ and not an artist, understanding how the technology works will help you form your own opinions about AI generated art and about other ways AI is affecting our lives.

How it works

To say the AI is “creating” art is a bit misleading. Defining precisely who is doing the creating is one of the issues though! But first, here’s how it works. 

With many of the popular platforms used to generate art through AI, when someone types a prompt, a computer program generates an image by following specific instructions that determine the best match to the words in the prompt. These instructions are called algorithms[1]. To build these programs, massive amounts of data are used to train the algorithms to know what to create. If your prompt is “friendly robot riding a large horse,” the program draws on what it learned from training images described with those terms and then generates a new image. Such programs usually begin by training the algorithm on a massive dataset, for example Microsoft’s Common Objects in Context (COCO), which depicts complex everyday scenes of common objects in a natural context. The entirety of this dataset, including both the visual images and any accompanying metadata (e.g. tags or a caption describing each image), is ingested by the algorithm during training.
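For readers who like to see things in code, here is a toy sketch of the three-step process described in footnote [1]: a text encoder, a mapping from text encoding to image encoding, and an image decoder. Everything in it is an assumption made for illustration. The function names are hypothetical, the "encoder" simply hashes words into numbers, and the "mapping" is a fixed rule rather than something learned from training data. It is not how DALL-E or Midjourney are actually implemented; it only shows how the three stages hand data to one another.

```python
import hashlib
from typing import List

def encode_text(prompt: str, dim: int = 8) -> List[float]:
    """Toy text encoder (hypothetical): turn the prompt into a list of numbers.

    Each word is hashed, and the hash bytes are averaged across words so the
    result is a vector of `dim` values between 0 and 1.
    """
    words = prompt.lower().split()
    vec = [0.0] * dim
    for word in words:
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        for i in range(dim):
            vec[i] += digest[i] / 255.0
    return [v / max(len(words), 1) for v in vec]

def map_to_image_encoding(text_vec: List[float]) -> List[float]:
    """Toy mapping step (hypothetical).

    A real system learns this mapping from huge numbers of captioned images.
    Here it is a fixed, made-up rule: reverse the vector and invert each value.
    """
    return [1.0 - v for v in reversed(text_vec)]

def decode_image(image_vec: List[float], width: int = 4, height: int = 4) -> List[List[int]]:
    """Toy image decoder (hypothetical): turn numbers into a grayscale pixel grid."""
    pixels = []
    for y in range(height):
        row = []
        for x in range(width):
            value = image_vec[(y * width + x) % len(image_vec)]
            row.append(round(value * 255))  # 0 = black, 255 = white
        pixels.append(row)
    return pixels

def generate(prompt: str) -> List[List[int]]:
    """Run the three stages in order: encode text, map, decode image."""
    return decode_image(map_to_image_encoding(encode_text(prompt)))

if __name__ == "__main__":
    for row in generate("friendly robot riding a large horse"):
        print(row)
```

The output is just a small grid of numbers, not a picture of a robot. The point is the shape of the pipeline: words go in, they become numbers, and numbers become pixels. What makes real programs impressive is that the middle steps are learned from training data rather than hard-coded.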

This process is not automatic. Collecting the massive amount of data is done by humans. Labeling the data (accurately) is done manually by humans. Developing the instructions (algorithms) is done by humans. And refining the algorithms to be better is done by humans. 

As you can see, AI generated art depends heavily not only on human instructions, but also on human-created data that was used to train the program. So, where does all the data come from? 

Source of data

One common source of data for AI generated art is publicly available sets of images from online galleries. In one example, an AI art generator was given a dataset consisting entirely of works by a recently deceased artist, Kim Jung Gi, along with metadata[2] that described each work with various keywords and properties: colors, subject, objects, etc. After the program ingested the entire set and was trained on the contents – what a person looks like in Kim Jung Gi’s style, what perspective and lighting are often used in his drawings, the style of his brushstrokes – a user could prompt it to create “a horse on a rooftop” drawn in the style of Kim Jung Gi. The result was an image matching the prompt, in Kim Jung Gi’s drawing style. The program[3] was then made available to anyone who wished to use it to create images in Kim Jung Gi’s style.


Three general areas of concern with AI generated art include:

  1. Is it actually art? 
  2. Where does the data come from? 
  3. Does AI generated art replace human artists, especially those whose livelihood depends on their art?

1. Is it actually art?

This is a big question, one that is asked over and over as new technologies emerge. Many argue that for something to be considered art, it should be the result of the process of human creativity[4], from the thinking to the execution, and that purely AI-generated imagery therefore should not be classified as ‘art’. Others argue that since there is still a human entering the prompt, there is still creativity in choosing the initial keywords. As this technology is so new and still developing, there aren’t many guidelines for users and creators, particularly around the massive datasets used to train the program, which leads us to the second concern.

2. Where does the data come from?

As we mentioned earlier, in order to train the algorithms, you need a lot - a huge amount! - of data. Many artists are concerned that when they post their art publicly in digital spaces, anyone can scrape it without consent or permission and then generate images that replicate their art style, all without the artist’s knowledge[5]. In the earlier Kim Jung Gi example, there was immediate backlash because the creator built the generator by feeding it images of Kim Jung Gi’s work from Google, without notifying anyone or asking for consent[6]. As you form your own opinions about AI generated art, make sure to consider where the data comes from, for example whether it was drawn from public domain sources.

3. Does AI generated art replace human artists, especially those whose livelihood depends on their art?

Some say the programs that create AI generated art are no different from other tools, like a pen or paintbrush, and that human creativity will always be needed. Even among artists, there are those who feel it is a new direction for art, giving artists a powerful new tool for their creations. Others fear AI generated art will diminish the value of human-created art. In any sector, there is always a risk that advanced technology will replace some jobs.


While there are concerns around the ethics of using AI to generate images, that doesn’t mean the technology should be dismissed. Because it is so new, we, as both creators and consumers, can help to shape the conversation right now. Whether you’re interested in trying out these AI generators for yourself or simply looking through the creations, here are some questions to keep in mind:

  • Where did the data used to train the algorithm come from?
  • Was the data from publicly available sources or with consent from the creator?
  • If you’re considering using non-public domain sources of data (e.g. personal online galleries), have you asked the artist for their permission to use their work?
  • Regardless of what data may be ingested for training, did you acknowledge the source?
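
If you are assembling images yourself, the questions above can be turned into a simple checklist in code. The sketch below is a minimal illustration under stated assumptions: the `ImageRecord` fields and the `concerns` function are hypothetical names invented here, and real datasets may record licensing and consent very differently, if at all.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ImageRecord:
    """One image considered for training (all fields are hypothetical)."""
    title: str
    source: str             # where the image was collected from; "" if unknown
    public_domain: bool     # is the work in the public domain?
    artist_consented: bool  # did the artist give permission to use it?
    credited: bool          # is the source acknowledged?

def concerns(record: ImageRecord) -> List[str]:
    """Return the checklist questions this record fails to answer well."""
    found = []
    if not record.source:
        found.append("source is unknown")
    if not (record.public_domain or record.artist_consented):
        found.append("not public domain and no artist consent")
    if not record.credited:
        found.append("source was not acknowledged")
    return found

if __name__ == "__main__":
    records = [
        ImageRecord("Sunset", "museum open-access collection", True, False, True),
        ImageRecord("Sketch", "", False, False, False),
    ]
    for record in records:
        print(record.title, "->", concerns(record) or "no concerns found")
```

An empty list does not prove that using an image is ethical or legal. It only means that none of the questions in this checklist raised a flag.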

Thanks to Cassia Artanegara, Jared Maslin, and Jessica Traynor 


[1] To be more explicit, it's a three-step process involving a text encoder, a mapping process from text encoding to image encoding, and then an image decoder that translates the mapping result into the image returned to the user.

[2] Metadata provides high-level descriptions of the data. It is essentially a set of tags.

[3] See the original Twitter post.

[4] There's a lot of current chatter around whether AI generated material is even copyrightable.

[5] As our friend and colleague Cassia Artanegara points out, “This is the main concern around the training data - artists don't necessarily consent to the use of their art to train the algorithms. This also falls in line with familiar patterns of unknown/independent artists and designers getting their work plagiarized or co-opted by people/businesses with more social power.”

[6] In this case the artist was unfortunately deceased, but many felt it was disrespectful to the family.

Data Curious is a public resource supported by Good Research LLC in collaboration with the Center for Digital Civil Society at University of San Diego.

To contact us, send us an email at hello@datacurious.org.
