
Authors Sue Microsoft Over Alleged Copyright Infringement in AI Training

Authors Sue Microsoft, Alleging Unpermitted Use of Books in AI Training.

As artificial intelligence expands rapidly, the question of whether its progress infringes on creative freedom and copyright is being raised more and more often. The debate has taken a new turn with a lawsuit recently filed in a New York court in the United States, in which several renowned authors have leveled serious accusations against Microsoft.

What is the Case?

A group of authors has accused the tech giant Microsoft of using their copyrighted books without permission to train its AI model, Megatron. The authors claim that Microsoft surreptitiously downloaded digital versions of pirated books available on the internet and used them for model training.

The lawsuit includes prominent figures like biographer Kai Bird, author and critic Jia Tolentino, and writer Daniel Okrent, who have filed complaints in court, stating that their books were fed into the AI model without authorization.

AI Model Trained on Pirated Content?

According to the complaint, Microsoft used a collection of approximately 200,000 pirated books to train the Megatron AI. The complaint says these books were illegally available on the internet and were incorporated into the model without any license or permission.

The authors allege that the model now mimics elements of their creative works, including writing style, subject matter, tone, and even sentence structure. In other words, the system now generates responses that resemble the original authors' works.

AI Companies' Argument: 'Transformative Use' – Fair Use

Several companies in the AI sector, such as OpenAI, Meta, and Anthropic, have previously faced similar allegations. These companies claim that they are utilizing copyrighted material under the principle of fair use because they are creating new and transformative content from it.

They also argue that if a license fee had to be paid every time any data was used, then technological development would become impossible for AI startups.

What Has Happened in Court So Far?

The lawsuit comes at a time when a federal court in California has ruled, in the Anthropic case, that training an AI model on copyrighted material could fall under 'fair use', but that it would be considered infringement if the material is pirated.

That ruling is the first major legal decision in the United States on the use of copyrighted material for AI training, and it may set the direction for many future cases.

Authors' Demands

The authors have sought two primary forms of relief from the New York court:

  1. Injunction: To prevent Microsoft from using such copyrighted material in the future.
  2. Statutory Damages: Compensation of up to $150,000 (approximately ₹1.28 crore) for each infringed work.

If the court rules in favor of the authors, Microsoft could face damages running into billions of dollars.

Microsoft's Response?

Microsoft has not yet issued a formal response to the case. The authors' lawyers have also refrained from making any specific comments to the press. Even so, the lawsuit has intensified the debate in the technology and literary worlds.

Why Is This Case Important for the AI Industry?

AI models require a vast amount of data for training, including books, websites, blogs, and social media posts. If courts consistently rule in favor of copyright holders, this could become a legal and economic challenge for the AI industry.

Experts believe that tech companies may in the future need to enter into licensing agreements with creators and publishers. This could slow the development of AI systems, but it would also signal a move towards a fairer model.
