AI tools like ChatGPT are built on mass copyright infringement

@dragonwriter 3d
In the US, copyright is not viewed as a natural property right but as a contingent property interest for the limited purpose of promoting “the progress of science and useful arts”.

Pointing out that a major advancement would be foreclosed by a proposed interpretation of the scope of copyright law is an argument that that interpretation is either an incorrect interpretation of the statute or an area where the statute conflicts with its Constitutional purpose.

Of course, the Globe and Mail is Canadian, but is Canadian law the applicable governing law here?

@Animats 3d
Probably not. Humans are trained by reading books and writing something similar, but not identical. Unless what comes out looks a lot like a specific thing that went in, it's probably not copyright infringement in the US.

Artists making copies of famous pictures is a standard part of artist training. Tribute bands are a thing. Elvis impersonators are a mini-industry. The lawsuits so far tend to involve "passing off" new work as someone else's work, as with that ML-generated popular song passed off as a new work by a major artist. That's not a copyright issue. That's ordinary fraud.

The bitching from writers and musicians reflects not that they're being ripped off, but that they're being out-produced. Authors didn't expect to be in the position of John Henry vs. the steam hammer. Now they are.

@koboll 4d
Copyright infringement is when you take copyrighted work and distribute it directly, or so close to directly that it can't be said to be "transformative".

Obviously LLM outputs are transformative, so this argument falls completely flat. As the writer is a copyright lawyer, it's hard to conclude anything other than they are knowingly lying, or at minimum wishcasting what they want the law to say instead of what it does say.

I think the misconception stems from the layman's understanding of copyright clipping off the last part of that sentence so it's just "Copyright infringement is when you take copyrighted work".

Proof of the success of industry campaigns to vilify things like taping broadcast television.

@version_five 4d
The time for copyright has long passed anyway. It's not clear to me that any infringement actually takes place in training AI, and I think we've all seen the arguments. But even if it did, it just means that "infringement" is a stupid construct that needs to be done away with. Intellectual property as a whole is a failed experiment that doesn't actually spur creation, which is the argued point for why the rest of us give up natural rights.

FWIW, the author appears to be the founder or former founder of Licensy - you can guess what it's about, so I wouldn't take the article as a neutral legal opinion anyway.

@DennisP 4d
If it weren't for our dumb copyright laws, we could train AI on all the world's books and scientific papers instead of whatever's available online, and take the next great leap in human progress.

@fooker 3d
If it's a useful technology, no amount of lawyering will stop it.

If you don't have it, your competitors will, and your people will use the competitors' software anyway.

@guardiangod 3d
What distinguishes a transformation function like a human (20+ years of education with different input->output) from a transformation function like a computer (20k hours of training->output)?

What about musician training? Movie directors?

@dataviz1000 3d
They output the most likely token (unless I'm mistaken), which means tools like ChatGPT are the ultimate prior art machines. They answer with what is most common among similar sources. For example, when I ask ChatGPT to build a state machine, the state machine it attempts to build is the sum of all prior art, not any one specific copyrighted machine.
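
To make the "most likely token" point concrete, here is a rough sketch of one decoding step, not how ChatGPT is actually implemented. The three-word vocabulary, the logits, and the function names `softmax`, `greedy_pick`, and `sample_pick` are all made up for illustration. Note the assumption baked in: greedy decoding always takes the top token, while many deployed systems instead sample from the distribution with a temperature, so the output is not always the single most likely token.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(vocab, logits):
    """Greedy decoding: always return the single most probable token."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best]


def sample_pick(vocab, logits, temperature=1.0, rng=None):
    """Sampling: draw a token in proportion to its probability."""
    rng = rng or random.Random()
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Hypothetical next-token scores after a prompt like "build a state ..."
vocab = ["state", "machine", "banana"]
logits = [2.0, 1.0, -1.0]

print(greedy_pick(vocab, logits))
print(sample_pick(vocab, logits, temperature=0.8, rng=random.Random(0)))
```

Either way, the pick depends on scores aggregated over everything seen in training rather than on a lookup of one source document, which is the sense in which the output reflects "the sum of all prior art".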

@gmuslera 3d
Probably the biggest copyright infringement is related to open licenses. That is the kind of code that is easiest to scrape, but those licenses put some restrictions on how it can be used, like including the license file, or requiring that code based on it have the same freedoms as the original code (i.e. not using it in closed-source commercial programs).

Probably the same goes for open content licenses.

@Exuma 3d
Now here's an opinion I don't care about.