Prosecraft.io, a site that used novels to power a data-driven project displaying word count, passive voice, and other, much more subjective writing-style markers such as vividness, shut down today after authors protested the project. Prosecraft used the full text of over 25,000 books, all of it copyrighted material, to develop a library of data. Authors, once they caught wind of what was happening, immediately hated this.
Zach Rosenberg was the author who first brought the site to wider attention among authors on X, the site formerly known as Twitter. Pretty soon, more and more authors spoke out, including high-profile names like Jeff VanderMeer (The Southern Reach trilogy), Indra Das (The Devourers), and Gretchen Felker-Martin (Manhunt).
Part of the backlash stems from Prosecraft’s admitted use of “AI algorithms.” In a blog post dated October 5, 2018, Benji Smith, the developer of both Prosecraft and Shaxpir, a writing program based on the data mined from Prosecraft’s library, stated that “we taught our machine-learning [AI] algorithms to recognize which kinds of words can be used in which kinds of contexts, by looking at the types of words and phrases that tend to occur within similar sentences and paragraphs.” Additionally, he wrote that Shaxpir “[analyzed] more than 560 million words of fiction, from more than 5,800 books, written by more than 3,300 popular authors.” He does not disclose where he obtained those works of fiction, or whether he had permission to use them.
While the technology used is not necessarily a generative large language model like ChatGPT, it is not a stretch to say that incorporating generative LLM algorithms could have been on the horizon for Prosecraft. And since the site had a massive library of books, authors’ fears are entirely valid. In the wake of this backlash, Smith has written a lengthy blog post on Medium explaining why he voluntarily took down Prosecraft.
Although Prosecraft was only using portions of the text, it did not have permission from any authors or publishers to create a database based on the entire work of an author or the full text of a book. Smith wrote on the blog, “since I was only publishing summary statistics, and small snippets from the text of those books, I believed I was honoring the spirit of the Fair Use doctrine, which doesn’t require the consent of the original author.”
While this holds some water, Fair Use does not, by any stretch of the imagination, allow you to use an author’s entire copyrighted work without permission as part of a data training program that feeds into your own “AI algorithm.” This situation is certainly going to be a lesson for many people, and it’s clear that authors are not going to allow their work to be used to train LLMs and vector networks.