Meta torrented & seeded 81.7 TB dataset containing copyrighted data

gameshot911 | arstechnica.com
1,270 points | 938 comments

Meta has been accused of using over 81.7 TB of pirated books to train its AI language models, raising concerns among authors and publishers about the legality and ethics of the practice.

Comments (938)