Meta torrented & seeded 81.7 TB dataset containing copyrighted data

gameshot911 | 512 days ago

arstechnica.com

1,270 points | 938 comments

Summary

Meta has been accused of using over 81.7 TB of pirated books to train its AI language models, prompting concerns from authors and publishers about the legality and ethics of the practice.

Comments (938)