Dalle2 AI is a tool for stealing copyrighted IP

AI vs IP

Is Artificial Intelligence being used as a means to extract value from copyrighted intellectual property? For years pundits and advocates have struggled to attach tangible meaning to the value of privacy. The article “Privacy is Cybersecurity for People” is an attempt to show that watching your everyday behavior has become an industry that transforms your data into intelligence for the business world. Building on the concept of “Surveillance Capitalism”, the idea has gained traction in the public discourse, even if not in practical terms. Most of us now recognize that our mundane daily activities are being turned into profitable business inputs. But what can be done about it, and why do we really care? With the advent of AI there is now a reason to care – let’s look at a practical example of this process at work, using Dalle2 to illustrate.

Copyright and Intellectual Property

I’m not going to retread old ground here except to say that anything you create is your copyrighted intellectual property. There are caveats and nuance to be sure, but for simplicity we’ll stick with that definition for now. So, when an AI like Dalle2 “creates” an image, does it truly “make” that image from its “imagination”? No, it does not, because it is not alive. Dalle2 AI and similar tools are really Generative Search Engines that amalgamate a lot of images based on textual input. Put simply, they search the open web intensely, categorize everything algorithmically, and recombine the results using predefined techniques like an “automated Photoshop.” Remember, software is nothing but stacks upon stacks of patterns. But if Dalle2 AI can’t actually create anything, then what is it doing?

A Practical Example

Here are some popular images that Dalle2 AI shows off in its main interface when users log in. They are compelling and you are invited to believe the miracle that Dalle2 created them. But since we know Dalle2 isn’t alive, and can’t actually create, what is happening? What is the source of these images? It is a parlor trick that uses pattern-based sleight of hand to convince the user that something like magic is happening in front of their eyes. As though the AI has conjured gold from a mass of digital lead.

Standard Search Engines

If you search a common search engine for similar images using descriptive text, you realize Dalle2’s images aren’t unique at all. Using a few search terms similar to the descriptions of the images above produces results very similar to the Dalle2 AI images. Except these are images created by humans in one manner or another. Some are obviously actual photos created by users staging a scene. Other images are created by users with digital imaging software to manifest their artistic vision. Finally, others are digitized photos of traditional art created by hand with brushes, pens, inks, and paints. Once you recognize Dalle2 AI has no awareness of the world in which it exists – or the actual meaning or value of the images it creates – it is easy to see it is just “Photoshop” with a really intuitive interface. You realize that Dalle2 AI is using a mountain of existing material and generating a composite image based on user input.

Various Examples of “David with Headphones”
Various Examples of “Monkey Astronaut”
Various Examples of “Color Explosions”

Giving Credit Where Due

If you get access to Dalle2, they make it plain that anything generated by it is the copyrighted intellectual property of OpenAI. Dalle2 AI even automatically adds a little color band watermark in the bottom-right of any image. But Dalle2 isn’t a free public resource – it is a paid tool. To summarize: a machine is generating images based on the copyrighted work of humans, and its owner is selling access to the tool that generates the images. You are only paying for access to the tool – you don’t own the generated images. To drive the point home, let’s look at what happens if you use a Reverse-Image Lookup tool with the generated Dalle2 AI images. The fact that another algorithm-based tool for image analysis (like TinEye) only returns results from Dalle2 AI reinforces the impression that Dalle2 actually created a unique asset. One “Search Engine” (say, Google) provides the images to another (Generative) Search Engine (Dalle2) while yet another (Reverse) Search Engine (TinEye) appears to validate authentic originality. Madness!

What About the CYA Problem?

The most perverse and socially subversive result is that companies using artificial intelligence are likely to use copyright protections to shield themselves. AI companies are likely to argue that every instance of image generation is a variation different from anything that existed before it. This is the basis of copyright protections for creative works and, combined with the area of “Fair Use”, governs the business of Intellectual Property. So, it is telling that Dalle2 AI terms specify that everything it creates is owned by OpenAI and not by the people who pay to supply the prompts (the instructions of the end user). So, on one hand a corporate tool is automatically turning publicly available copyrighted content into a corporate asset; on the other hand, those paying to use the tool have no property right to anything it creates. Likewise, no attempt is made to recognize or remunerate the original creators on whose work the generated images are based. In short, Dalle2 AI is nothing but a giant copyright appropriation engine: a means of transforming the work of others into the profit of the artificial intelligence’s owner.

  • Step 1: Scan Massive Archive of Copyrighted Material without Demonstrable Market Value
  • Step 2: Train Machine Learning Algorithm to Synthesize Material based on a Taxonomy of Classification
  • Step 3: Associate Taxonomy of Classification with Popular Vernacular (voice prompts, characterizations, everyday parlance, etc.)
  • Step 4: Claim Copyright of Synthesized Output is the Property of Algorithm Owners
  • Step 5: Sell Access to the Algorithm and Quietly Ignore Users’ Application of Produced Results
  • Step 6: Sue Users for Copyright and/or Terms Breach in the Event Produced Results have Civil or Criminal Penalties

This is an Affront to Common Sense

If we continue on this path, we will eventually find ourselves in a position where corporations use the “Commercial Free Speech” legal interpretation to claim Fair Use protections for their AI algorithm’s “Speech.” After all, Dalle2 AI tools aren’t selling the Image but access to the tools to create the Images. The logical conclusion is that OpenAI (and others) will use copyright protections intended for the creation of novel work – or fair use – as a shield. This sets up the potentially perverse situation where corporate machines use copyrighted works to generate profits while relying on “Fair Use” protections to shield themselves from input costs and fees. If you recall, the point of the “Fair Use” doctrine was to protect the freedom of speech of citizens. Nowhere in Dalle2 AI are you allowed to create expressions that reference politics, political parties, sexuality, gender, brands, trademarks, logos, slogans, celebrity – basically anything that might be specific parody or critique. Dalle2 AI relies on Fair Use to create copyrighted outputs that manage to undermine the expression of free speech while making money by creating valuable intellectual property. That is some next-level “Meta” Orwellian Corporate Dystopia.

Celebrity, Political Commentary, and Criticism are Denied

The Future of AI

What comes next? If this pattern works as a means to extract value from copyrighted work, expect it to be applied to open source code, scholarship and academic research, your likeness in images on your social media, all those emails and correspondence stored in free email services – anything for which you cannot prove infringement or demonstrate damages. How did Google train LaMDA? Based on this example it seems likely they used all the emails you have blithely stored in Gmail.

I guess this should not really be a surprise. If you think about it, you’ll notice the same principle is at work with Alexa, Siri, and other Voice AIs. If they aren’t selling you a product, all they can tell you is what’s on National Public Radio, Wikipedia, or the Weather.

It may be that the category of “generative search engine” itself is simply problematic in that all it really does is shift value from copyright holders to AI companies. That realization on the part of corporate AI owners seems like a more plausible explanation for why their output is obfuscated and purposely sub-optimized. They aren’t worried about disrupting society. It’s a slow-drip campaign to see if they get sued: first by artists generally, then by artists specifically, then by brand-owners, then by celebrities, and finally by public figures. In the end the question will be whether AIs have the right of free expression.

Header Photo by Arseny Togulev