
That was basically my edit; we may have missed each other.

The only point I can see making is that I somehow doubt OpenAI actually paid the author for the contents of the book, but maybe they did.

Maybe they borrowed ebooks one by one from some digital library and ingested them that way for free.



Somehow I feel like they just scraped LibGen and Sci-Hub.

Much more "scalable", i.e. an easy way to get a raw training dataset. Move fast and break things.



