Hiltzik: An AI firm faces a $1-trillion piracy bill


The artificial intelligence camp likes big numbers. The sum raised by OpenAI in its latest funding round: $40 billion. Expected investments in AI by Meta, Amazon, Alphabet and Microsoft this year: $320 billion. The market value of Nvidia Corp., the chip supplier to AI companies: $4.2 trillion.

AI adherents treat all these figures as validating the promise and potential of the new technology. But here is a figure that points in the opposite direction: $1.05 trillion.

That is how much the company Anthropic could be on the hook for if a jury decides that it willfully pirated 6 million copyrighted books while "training" its AI bot, Claude, and if the jury decides to hit it with the maximum statutory damages of $150,000 per work.

Anthropic faces at least the potential of business-ending liability.

– Edward Lee, Santa Clara University School of Law

That puts Anthropic in "a legal fight for its very existence," says Edward Lee, an intellectual property law expert at the Santa Clara University School of Law.

The threat was born July 17, when U.S. District Judge William Alsup certified a copyright infringement lawsuit brought against Anthropic by several published authors as a class action.

I wrote about the case last month. At that point, Alsup had rejected the plaintiffs' copyright infringement claim, finding that Anthropic's use of copyrighted material to develop its AI bot fell within a copyright exemption known as "fair use."

But he also found that Anthropic's downloading of copies of 7 million books from online "shadow libraries," which included countless copyrighted works, without authorization, smacked of piracy.

"We will have a trial on the pirated copies ... and the resulting damages," he advised Anthropic, ominously. He put meat on those bones with his subsequent order, designating the class as the copyright owners of books Anthropic downloaded from the shadow libraries LibGen and PiLiMi. (Several of my own books are found in Books3, another such library, but Books3 is not part of this case and I don't know whether my books are in the other libraries.)

Class certification could considerably streamline the litigation against Anthropic. "Instead of millions of separate proceedings with millions of juries," Alsup wrote in his initial ruling, "we will have a single proceeding before a single jury."

Class certification adds another wrinkle, a potentially major one, to the ongoing legal disputes over the use of published works to "train" AI systems. The process consists of feeding enormous quantities of published material into the systems, some of it scraped from the web, some drawn from digitized libraries that may include copyrighted content as well as material in the public domain.

The goal is to give AI bots enough data to allow them to glean language patterns that they can regurgitate, when asked a question, in a form that appears to be (but is not really) the output of an intelligent entity.

Authors, musicians and artists have filed numerous lawsuits asserting that this process infringes their copyrights, because in most cases they neither granted permission nor were compensated for the use.

One of the most recent cases, filed last month in federal court in New York by authors including Kai Bird, co-author of "American Prometheus," which became the authorized source for the film "Oppenheimer," charges that Microsoft downloaded "approximately 200,000 pirated books" via Books3 to train its own AI bot, Megatron.

Like many other copyright plaintiffs, Bird and his fellow plaintiffs argue that the company could have trained Megatron using works in the public domain or obtained under license. But "either of those would have taken more time and cost more money than the option Microsoft chose," the plaintiffs say: training its bot "without authorization or compensation as if the laws protecting copyrighted works did not exist."

I asked Microsoft for a response, but did not receive one.

Among the judges who have considered these issues, the tide seems to be running in favor of treating the training process as fair use. Indeed, Alsup himself reached that conclusion in the Anthropic case, ruling that the use of the downloaded material to train AI was fair use. But he also heard evidence that Anthropic had retained the downloaded material for other purposes, specifically to build its own research library. That is not fair use, he found, exposing Anthropic to accusations of copyright piracy.

Alsup's ruling was unusual, but also "Solomonic," Lee told me. His fair use finding offered a "partial victory" for Anthropic, but his potential piracy finding put Anthropic in "a very difficult spot," Lee says. That's because financial penalties for copyright infringement can be gargantuan, ranging from $750 per work to $150,000, the latter if a jury finds that the user engaged in willful infringement.

As many as 7 million works may have been downloaded by Anthropic, according to documents in the case, though an indeterminate number of those works were duplicated between the two shadow libraries the company used, and may also have been duplicated among the copyrighted works the company actually paid for. The number of works won't be known until at least Sept. 1, the deadline Alsup gave the plaintiffs to submit a list of all the allegedly infringed works downloaded from the shadow libraries.

If subtracting the duplicates still leaves the total of individually infringed works at 7 million, a bill of $150,000 per work would total $1.05 trillion. That would flatten Anthropic financially: the company's annual revenue is estimated at about $3 billion, and its value on the private market is estimated at about $100 billion.
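The arithmetic behind these figures can be checked in a few lines. This is a back-of-the-envelope sketch using only the numbers reported in the case documents (7 million works, statutory damages of $750 to $150,000 per work); it is illustrative, not a legal damages model.

```python
# Statutory damages exposure, using figures reported in the case:
# up to 7 million downloaded works, and a statutory range of
# $750 (floor) to $150,000 (cap for willful infringement) per work.
works = 7_000_000
floor_per_work = 750
willful_cap_per_work = 150_000

floor_bill = works * floor_per_work          # exposure even at the statutory minimum
ceiling_bill = works * willful_cap_per_work  # exposure at the willful-infringement cap

print(f"Floor:   ${floor_bill:,}")    # Floor:   $5,250,000,000
print(f"Ceiling: ${ceiling_bill:,}")  # Ceiling: $1,050,000,000,000
```

Even the $750-per-work floor yields $5.25 billion, above the company's estimated annual revenue; the willful-infringement ceiling yields the $1.05-trillion figure.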

"In practical terms," Lee wrote on his blog, "ChatGPT Is Eating the World," class certification means "Anthropic faces at least the potential of business-ending liability."

Anthropic did not respond to my request for comment on that prospect. In a motion asking Alsup to certify his ruling for appeal to the 9th Circuit Court of Appeals or to reconsider his finding, however, the company emphasized the blow its position would deal to the AI industry.

If his position were widely adopted, Anthropic said, then "training by any company that downloaded works from third-party websites like LibGen or Books3 could constitute copyright infringement."

That was an implicit admission that the use of shadow libraries is widespread in the AI camp, but also a suggestion that, since it was the shadow libraries that committed the alleged piracy, the AI companies that used them should not be punished.

Anthropic also noted in its motion that the plaintiffs in its case did not raise the piracy issue themselves; Alsup raised it on his own, by treating the training of AI bots and the creation of a research library as two separate uses, the first permitted as fair use, the second not. That deprived Anthropic of an opportunity to respond to the theory in court.

The company observed that a fellow federal judge at Alsup's San Francisco courthouse, Vince Chhabria, reached a contradictory conclusion only two days after Alsup, absolving Meta Platforms of a copyright infringement claim on similar facts, under the fair use exemption.

Alsup's class certification is likely to roil both the plaintiff and defendant camps in the ongoing controversy over the development of AI. Plaintiffs who have not made a piracy claim in their lawsuits may be prompted to add one. Defendants will come under greater pressure to head off lawsuits by rushing to strike licensing deals with writers, musicians and artists. That will be especially true if another judge accepts Alsup's reasoning on piracy. "It could well encourage other lawsuits," Lee says.

For Anthropic, the challenge will be "to try to convince a jury that the award of damages should be $750 per work," Lee said. Alsup's ruling makes this case one of the rare proceedings in which "the plaintiffs have the upper hand," now that they have won class certification. "All these companies are going to be under great pressure to negotiate settlements with the plaintiffs; otherwise, they're at the mercy of a jury, and you can't bank on what a jury might do."
