Arthur's Blog

A Copyright Danger of AI Generated Content

2026-04-04 || Tags: programming ip copyright

While it has not been tested in any courts, there are some people[1], including myself, who think that there is a real risk that "authors" of content generated by AI will not obtain any copyright in this work. I use content in a very broad sense here: it includes images, videos, novels, source code and more. As it is most relevant to me, I'm going to basically only consider source code though.

Why Does It Matter If I Don't Get Copyright?

So, why should we care at all? What does copyright even give me? Good questions!

Copyright is a right that gives authors/coders/companies-that-employ-programmers the ability to stop someone else from copying their work (duh! hint is in the name). If you do not have this right, you will have one less, quite powerful, tool in your quiver to prevent someone from copying your work.

If you're into your free and open source software, this will also mean that your free and open source software license will not apply and will mean nothing at all - essentially turning your lovely open sourced repository into public domain code.

Without this right, you are basically only left with legal recourses relating to trade secret protection and breach of contracts, if any were in place [2]. These sorts of fights are tough. Giving up any tools that you have is not something you want to be doing.

But I Came Up With The Original Ideas Behind The Work!

Doesn't matter. Unless you have a patent or other specific agreement/contract, there is sweet fuck all you can do to stop someone taking the idea underlying your code/novel/image. Copyright only covers the "expression" of an idea and code is considered an expression of an underlying idea/algorithm. Patents cover the idea itself. This is sometimes called the Idea-expression distinction. If you're reading this blog, I highly suspect you'd be interested in reading about this if you haven't yet.

So, What Can I Do? (For Proprietary Software)

A first, extremely obvious, way is to simply not use AI. The issue discussed here is purely a result of copyright potentially not existing for works generated using AI.

Given how the world is changing though, especially with agentic coding, this is very quickly becoming a less and less likely option if you wish to remain competitive. Your competitors are writing thousands of lines of code using it and it really does feel like if you're not using agents to write your code, you are being left in the dust.

So, with AI code generation as a given, what can you do?

Show Your Working

The tests for whether copyright exists come down to human authorship. Again, no case law, but I do think there is an argument that a software engineer's ability to select, adapt, reject, reform, rephrase, and re-prompt while using AI to generate code shows that a real human is giving actual creative input. And where there is this creative input, we are very much moving towards copyright existing in the work.

To do this, I would suggest you/your engineers take detailed-ish notes about what they are generating and how. A note on what prompts were used, a note on where perhaps a solution the AI devised wasn't ideal, a comment on directing the AI to do something a certain way. All these things are great. How you store this, I'm not sure.

If you are using AI like an auto-complete, then the best solution I can think of is to provide comments on commits of the code. This has the advantage of being directly associated with the code, without adding a whole lot of noise to the code itself should a human need to read it later.
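As a rough illustration of what such a commit comment could look like, here is a minimal Python sketch that assembles one. The function name, the "Human input:" and "Prompts used:" headings, and the example text are all made up for illustration; nothing here is an established convention or a legally tested format.

```python
def build_commit_message(summary, human_decisions, prompts=None):
    """Format a commit message that records the human's creative input.

    The section headings are hypothetical; use whatever wording suits you.
    """
    lines = [summary, "", "Human input:"]
    for decision in human_decisions:
        lines.append(f"- {decision}")
    if prompts:
        lines.append("")
        lines.append("Prompts used:")
        for prompt in prompts:
            lines.append(f"- {prompt}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values are invented purely to show the shape of the output.
    print(build_commit_message(
        "Add retry logic to the upload client",
        [
            "Rejected the suggested fixed delay; asked for exponential backoff",
            "Renamed the helper and capped retries at 5 by hand",
        ],
        prompts=["Add retries to upload() with backoff"],
    ))
```

The resulting text can be pasted into your commit, so the record of what you selected, rejected and reworked travels with the change itself.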

Comments in code might be appropriate in some cases, however, as mentioned, that could be noisy.

If you are going full-on, generating large swaths of code using AI (à la vibe coding), this might not be appropriate, as the AI is generating thousands of lines of code and doing all its commits etc itself. In this case, documenting how you prepared your prompts, how you planned it out with the AI, and anything of the sort is, I think, a good way to go. Then, obviously, you have to store this somewhere. Where you store that I'm not sure beyond the normal engineer's log book (or Confluence page or whatever people use these days). Painful for sure.
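One low-effort option, sketched below, is an append-only log file kept alongside the project. This is only an assumption about what might work: the function name, the field names and the JSON Lines format are my own choices for illustration, not a recommendation backed by any case law.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_session(log_path, prompt, outcome, notes):
    """Append one entry describing an AI-assisted step to a JSON Lines file.

    All field names are hypothetical; record whatever shows your own input.
    """
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "outcome": outcome,  # e.g. "accepted", "rejected", "reworked"
        "notes": notes,
    }
    # Open in append mode so earlier entries are never overwritten.
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry
```

Calling something like `record_session("ai-log.jsonl", "Plan the importer in three stages", "reworked", "Merged stages two and three myself")` after each planning or prompting step builds up a dated trail of the decisions you made.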

Clear Delineation

I would very highly suggest you separate out any AI generated code into its own modules and/or functions. As with many things in life, clarity is good. If a function is AI generated, say so. If a file is AI generated, label it as such. This does kind of suggest "anyone reading this may use this code as they like", but the flip side is the assertion that anything not labelled this way would be covered by copyright.

With this, should the worst happen, and your entire code base is leaked, at least you will have a stronger case around the parts that do have copyright protection and it will hopefully be one less thing to argue in a court.

Even putting a comment in there somewhere (especially if "free" software) to say as much would help.
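If you do label files this way, a small script can tell you which files carry the label and which do not. The sketch below assumes a marker string of "AI-GENERATED" placed in a comment within the first few lines of each file; that marker, the function name and the defaults are all hypothetical choices of mine.

```python
from pathlib import Path

MARKER = "AI-GENERATED"  # hypothetical label; pick whatever wording you like


def split_by_label(root, pattern="*.py", marker=MARKER, head_lines=5):
    """Return (labelled, unlabelled) lists of files found under root.

    A file counts as labelled if the marker appears in its first
    `head_lines` lines.
    """
    labelled, unlabelled = [], []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as fh:
            head = [fh.readline() for _ in range(head_lines)]
        if any(marker in line for line in head):
            labelled.append(path)
        else:
            unlabelled.append(path)
    return labelled, unlabelled
```

Running `split_by_label("src")` gives you two lists you can keep with your records: the files you have declared as AI generated, and the files you are asserting a human wrote.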

Keep It Secret

This entire problem can be avoided if no one ever sees the code. If no one can read the code, then no one can copy it.

Simple and easy. But at the same time, difficult and impossible.

I'm not going to go into techniques on how to do this. I think that might be an alright topic for another blog post.

Keep It Protected

The first port of call for this is education around keeping things secret. Most people understand that the code they write for a company is company property, but not everyone will fully appreciate what that actually means. Telling people "this is secret, do not share" can go some way.

Secondly, make sure contracts are in place for anyone that interacts with the code. Employment contracts are the most obvious. Your employment contract likely already includes clauses about not sharing anything. This does have its limits though. Even if an employee does share something, with malice even, and you believe you've given away a million pounds worth of value, what are you going to do? Sure, you can try to sue them for damages, but that individual is not going to have a million pounds to pay you back in any way.

A contract with another company with similar "no sharing" clauses will be of more value as they likely have some money and it would be worth going after them.

So, What Can I Do? (For My Open Source Project)

Permissive Licenses

If you're using a permissive license (e.g. MIT, BSD, Apache), there is probably nothing for you to worry about. Permissive licenses already allow others to take your code without giving you much of anything in return (sometimes not even attribution). The permissiveness you've allowed through using such a license already means anyone can take your code and do whatever they want with it.

Free Licenses

If you're using a free license (e.g. GPL) this story is a bit different. You need copyright to exist to enforce the four freedoms for your users. That is legally how free software works.

Everything I have written above about proprietary software applies here (except the keeping it secret and keeping it protected parts).

Caveats

As with everything legal, your specific situation will be different. I am an attorney, but I am not your attorney and this is not legal advice for you or your company.

This is presented as roughly 70% correct, to cover the general case and highlight this potential problem (as well as for the sake of brevity - I don't want to write a legal essay exploring all corners).

Finally, this is not settled law. Given the hotness of AI generated code, I would not be surprised if we do get a nice juicy court case in the next couple of years where this question is addressed directly. Until that occurs though, all we can do is interpret law and take good guesses at how we think courts will decide.

If, after reading this, you can see something like this being a problem, ask your lawyer.

[1] See second to last paragraph.

[2] There are probably more - I can't think of any obvious, readily applicable ones.