OpenAI and benefactor Microsoft, maker of ChatGPT, are now the target of a $3 billion class action lawsuit alleging that they broke the law in scraping the web for training data. The merits of the lawsuit are not fully clear to me, but I can say that massive scraping is probably necessary for a large language model with ChatGPT's flexibility, various fights about this scraping (including this one and beyond it) are already heating up, and web scraping is in fact a legally murky activity.

OpenAI targeted in $3 billion class-action lawsuit

Alexander Mueller