Tags: ai-legal, claude-code, open-source, ip-risk

Can AI Coding Agents Relicense Open Source? The chardet Precedent

3 min read

Assuming AI-generated code is free of IP entanglement is legally untested — companies using AI to rewrite or reimplement existing software may be inheriting liability, not escaping it.

By Simon Willison

What Happened

chardet 7.0.0 was recently released as a rewrite under the MIT license — a more permissive license than the original LGPL. The rewrite was accomplished using Claude Code, an AI coding agent. The intent was to produce a clean-room reimplementation that sidesteps the original license's restrictions.

On the surface, this looks like a clever use of AI: feed a coding agent the specification, get a fresh implementation, change the license. Problem solved.

Except it isn't that simple.

Why This Isn't a True Clean-Room Rewrite

A clean-room rewrite has a specific legal meaning. It requires that the engineers writing the new implementation have no prior knowledge of the original code. The idea is to eliminate any possibility that the new code is a derivative work — consciously or unconsciously — of the original.

Two conditions break that standard here:

The maintainer's knowledge. The person overseeing this rewrite has twelve years of deep familiarity with the original chardet codebase. That knowledge doesn't disappear because an AI is doing the typing. The direction, review, and judgment applied to Claude's output are all shaped by that prior knowledge.

Claude's training data. The original chardet code is publicly available and almost certainly appeared in Claude's training corpus. The model's outputs are, at some level, informed by having processed the original implementation. This makes the "clean" in clean-room genuinely questionable: the AI itself may be a conduit for the very knowledge a clean-room process is designed to exclude.

The Legal Precedent Problem

This isn't just a philosophical point. As more projects attempt AI-assisted relicensing, courts and legal teams will have to decide what counts as a derivative work when an AI is in the loop. Neither current copyright law nor open source licensing frameworks were written with this scenario in mind.

The chardet case is likely to become a reference point — and not necessarily a reassuring one for teams hoping AI rewrites solve their licensing problems cleanly.

What This Means Beyond Open Source

The implications extend well past the open source community. Any company using AI to write code that is derived from, inspired by, or structurally similar to existing codebases faces comparable IP exposure. That includes:

  • Engineering teams using AI agents trained on proprietary competitor code

  • Companies rewriting vendor-licensed software in-house with AI assistance

  • Any scenario where the AI's training data and the human's prior knowledge together constitute implicit access to protected implementations

The comfortable assumption — that AI-generated code is inherently original — is not legally established. It is an assumption, and one that is increasingly being put to the test.