New Federal Bill Could Require Disclosure of Songs Used in AI Training
Source: Billboard
Representative Adam Schiff (D-Calif.) introduced new legislation in the U.S. House of Representatives on Tuesday (April 9) which, if passed, would require AI companies to disclose which copyrighted works were used to train their models, or face a financial penalty. Called the Generative AI Copyright Disclosure Act, the new bill would apply to both new models and retroactively to previously released and used generative AI systems.
The bill requires that a full list of copyrighted works in an AI model's training data set be filed with the Copyright Office no later than 30 days before the model becomes available to consumers. This would also be required when the training data set for an existing model is altered in a significant manner. Financial penalties for non-compliance would be determined on a case-by-case basis by the Copyright Office, based on factors like the company's history of noncompliance and the company's size.
Generative AI models are trained on up to trillions of existing works. In some cases, data sets, which can include anything from film scripts to news articles to music, are licensed from copyright owners, but often these models will scrape the internet for large swaths of content, some of which is copyrighted, without the consent or knowledge of the author. Many of the world's largest AI companies have publicly defended this practice, calling it fair use, but many of those working in creative industries take the position that this is a form of widespread copyright infringement.
The debate has sparked a number of lawsuits between copyright owners and AI companies. In October, Universal Music Group, ABKCO, Concord Music Group, and other music publishers filed a lawsuit against AI giant Anthropic for unlawfully exploiting their copyrighted song lyrics to train AI models.
-snip-
Read more: https://www.billboard.com/business/legal/federal-bill-ai-training-require-disclosure-songs-used-1235651089/
highplainsdem
(49,006 posts)
highplainsdem
(49,006 posts)
how unfair it is that AI companies will have to list the copyrighted works they used to train their AI, since this will cost them money, and they'll have to register more copyrighted works they used whenever there's a major update, and the registry will be public which is so unfair when it might be the AI's competitive secret sauce AND they might be sued if the owners of those copyrights find out their intellectual property was used. She talked about the need to limit "the power of copyright monopolists."
In other words, how dare the government expect AI companies to stop stealing copyrighted work to train their AI!
When someone pointed out that AI companies made the decision to steal all that data, the lawyer didn't disagree with it being called stealing, but said that AI companies needed immense, diverse datasets, and trying to get permission before taking all that data would have been "untenable."
highplainsdem
(49,006 posts)
Jose Garcia
(2,598 posts)
LudwigPastorius
(9,156 posts)
Are you seriously suggesting that an author has to disclose every single book they've ever read in their lifetime? Or, a musician every single song they've ever heard?
Jose Garcia
(2,598 posts)
Why is it necessary for AI artists to make a disclosure that traditional artists do not have to make? It would appear that this legislation is simply an attempt to suppress one group of individuals for the benefit of another.
Ned Ludd would be proud.
LudwigPastorius
(9,156 posts)
Unfortunately, I foresee AI companies striking licensing deals on the Spotify model.
Something like, $0.0000001 to every artist scraped for every copy of software sold.