The Distributors
Generative AI's ability to produce content is rapidly eclipsing human velocity, threatening the time, thought, and experience required to create meaningful work. Traditional and online publishing is unprepared for the massive changes AI will bring. What should our tools and platforms do to protect the integrity of writers’ works? Is publishing the first line of defense against LLMs?
In The Watchmakers, we wrote about the implications of generative AI on creative writing. Now, we’ll explore its effect on publishing.
Consider the bot
Hostile la vista...
Of course, every AI-skepticism article should begin with a Terminator reference.
When we were growing up in the ’90s, killer robots were everywhere: modern nightmare-myths stalking a decade of almost unbridled tech-optimism. In the not-so-far future, killer drones hovered over battlefields and mechanized bodybuilders trundled across scorched and barren landscapes. In our suburban living rooms, the family PC humming innocently in the corner, we watched the surviving humans sift through the existential debris, overwhelmed and haunted by the trust that they had placed in the machines.
We spent a lot of time wondering how we'd fare against the bots. For fun, mostly. Would their cold red stares terrify us? Would we hesitate because they seemed just human enough? And at what moment would we understand, down in our bones, that we were completely fucked?
We’re not fucked—but we are worried. The advance of AI is a long ramp through murky territory. For years, it's progressed at a shallow incline—degrees—almost unnoticeable from the flats. But a year or two ago it started climbing rapidly, and with ChatGPT’s debut, the world began to notice. Could we have imagined, in our childhood nightmares, that the “killer AI” would be a chatbot—devaluing artistic integrity and human creativity? (And is that a better or worse enemy than a T-800?)
The future is unclear. But years before killer AI hovercraft floated above an atomic landscape—before humans gave up authenticity for instant gratification—a group of boring publishing executives debated and equivocated within the sterile confines of the Frankfurt Book Messe.
Mess at the Messe
Frankfurt Book Messe
At 2023’s Frankfurt Book Messe, the largest book fair in the world and a gathering of publishers, agents, and authors, a palpable uncertainty about generative AI's role in publishing seemed to dominate the panels.
During a seminar on the impact of AI on rights, moderator Tom Chatfield opened the discussion by highlighting AI's revolutionary potential: "I think that AI challenges us to think in new ways about creativity and written work and words, just as photography challenged art profoundly." But the mood quickly darkened.
"At their worst, (AI systems) can be forces of mediocrity or deception or simulation that push us aside or cannibalize our works and ideas while impoverishing our own culture and mental processes."
There aren’t that many events that threaten to upend a business as staid as publishing. Most recently, there was the internet-fueled eBook revolution of the aughts, which blasted open a near-infinite market in self-publishing, offering opportunities for independent authors to control their destinies and fundamentally change how writers engage with the industry. But the advance of generative AI seems to signal a greater sea change—though visibility is murky at best.
Generative AI-integrated writing tools like Sudowrite sought to engage in the business of writing, imploring users to churn out a book a day with a bot co-writer doing the mundane work of actually writing, and incentivizing writers to game the system. And the system is, as of 2024, inherently flawed. The publishing industry’s lack of readiness to safeguard authors' interests in the face of rapid technological change is nothing new. But Frankfurt revealed an industry grappling with a dichotomy: the potential of these tools to enhance what is, of course, a business, and their potential to undermine artistic integrity.
Writers, left to their own devices, face a wild new landscape. AI's ability to produce content rapidly creates an uneven playing field: the sheer volume and speed of LLMs easily eclipse human velocity, threatening the time, thought, and experience required to create meaningful work. And the rapid publication of AI content compromises quality, pushing authentic voices further into obscurity.
Already forced to compete for their share of the attention economy, writers in a world inundated with AI will find it difficult, if not impossible, to maintain solid ground, own their rights and earn rightful compensation. And younger writers who have yet to build a significant platform will struggle to be heard in a marketplace flooded with facsimiles.
The verification problem
AI publishing schemes have taken root on Amazon.
While online publishing platforms offer freedom to authors, they often treat their work as pure commodity. Initially lauded for democratizing the publishing process, they are fast becoming repositories for AI-generated content (with hazardous results). Even Amazon had to take action—limiting self-publishing to a generous three books per day—underscoring the severity of the problem, and the unwillingness to fix it. Platforms like Wattpad and Inkitt have employed a kind of “honor system”—disclose… or don’t. (¯\_(ツ)_/¯)
In August 2023, author Jane Friedman spotted a number of self-published books on Amazon falsely attributed to her. And on Twitter, other writers reported finding similar books, suspecting AI-generated texts of exploiting established writers’ reputations in a widespread scam. Despite the authors’ efforts to remove them, they continued to appear on the platform. “We invest heavily to provide a trustworthy shopping experience and protect customers and authors from misuse of our service,” said one Amazon spokesperson. Transparency remains elusive: Amazon doesn’t care enough to maintain ethical standards in published content, nor does it employ detection tools, which are continually playing catch-up with LLMs. Its business model is about maximizing customer choice at the expense of vendors. And the number of AI-“authored” books on Amazon, the largest bookseller in the world, is still unknown.
Clarifying the legal status of AI-generated content holds significant implications for copyright and authorship (as discussed in part one). In late December 2023, The New York Times sued OpenAI and Microsoft for copyright infringement, alleging that millions of articles published by The Times were used to train ChatGPT and other LLMs, which now compete with news outlets as sources of “reliable information.” The suit argues that the defendants are liable for “billions of dollars in statutory and actual damages related to the unlawful copying and use of The Times’s uniquely valuable works,” and demands the destruction of any models and training data that incorporate The Times’s copyrighted material.
While the publishing industries and regulatory bodies in the EU, UK, and US are beginning to address these concerns, there is little consensus on how to legally differentiate between human and AI-generated content. Unlike in art and music, very few literary copyright claims have succeeded on plagiarism grounds alone. With a giant of American media taking legal action, the question of copyright creates a precarious situation for authors whose works appear on platforms that permit the publication of AI content.
The first generation of AI-detection tools has proven unreliable, often resulting in false positives and negatives. A well-trained nose employing the sniff test might work sometimes, but as with generative art models like Midjourney, the hands are looking better every day. Soon, it may be possible to train a model on a particular author’s style: the stamp and seal of a writer. As LLM improvements continue to blur the lines between author and artifice, and publication becomes a grift for quick cash, how will these future landscapes look?
And how can writers ensure authenticity?
A template for authenticity
What does it mean to protect a work? And what does a tool that protects not only writers, but writing look like?
Attribution (also known as provenance) and security are vital in safeguarding the integrity of a writer’s work, from its inception all the way to publication.
Self-publishing has a problem... (X)
Attribution—who (or what) wrote what?
Soon, published work will face a critical litmus test: did a human write this? Any tech advancement hinges on the control and quality of its inputs. A blockchain's effectiveness depends on the accuracy of its data; an LLM is only as good as the words it ingests. That may not always hold, as the ways machines use data are rapidly improving, but it remains difficult to fake millions of keystrokes over an extended time.
The process of creating a text—how long it’s been edited, the number of sessions, the types of operations (writing, editing, deleting, pausing, pasting)—signifies the hand of the author. Detailed version history, meticulously logging and safeguarding inputs throughout every stage of the work, could document and help to prove authenticity. In establishing a clear and traceable record of creation, writers could already be writing the code of their rightful ownership.
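One way to picture such a record is an append-only log in which every editing operation is hashed together with the entry before it, so that quietly rewriting history breaks the chain. The sketch below is only an illustration under our own assumptions: the EditLog class, its field names, and the operation labels are hypothetical, not a description of any existing editor.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder "previous hash" for the very first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash an event together with the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


@dataclass
class EditLog:
    """Hypothetical append-only record of editing operations for one text."""

    entries: list = field(default_factory=list)

    def record(self, op: str, text: str, session: int, timestamp: float) -> None:
        # op is a label such as "write", "delete", or "paste" (assumed names).
        payload = {"op": op, "text": text, "session": session, "t": timestamp}
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        self.entries.append({"payload": payload, "hash": _digest(prev_hash, payload)})

    def verify(self) -> bool:
        """Recompute every hash; any edited or reordered entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["hash"] != _digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True

    def summary(self) -> dict:
        """Count sessions and operation types: the 'hand of the author'."""
        ops: dict = {}
        for entry in self.entries:
            op = entry["payload"]["op"]
            ops[op] = ops.get(op, 0) + 1
        sessions = {entry["payload"]["session"] for entry in self.entries}
        return {"sessions": len(sessions), "operations": ops}


log = EditLog()
log.record("write", "It was a dark and stormy night.", session=1, timestamp=0.0)
log.record("delete", "dark and ", session=1, timestamp=42.5)
log.record("paste", "Meanwhile, in Frankfurt...", session=2, timestamp=86400.0)
print(log.verify())
print(log.summary())
```

A chain like this only shows that a record has not been altered after the fact. It does not prove by itself that a human did the typing, and it is only as trustworthy as wherever the latest hash is stored.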
Security and safety, from idea to publishing
But protecting a writer's work means more than just protecting text. Safety should extend to all parts of the writing process, material and legal. The first step is building a better, more secure text editor with a strong commitment to data privacy: one that prioritizes data security, stays transparent with writers, and reduces the risk of LLMs training on their work without consent.
Robots.txt: a text file used to instruct search engine robots how to crawl pages on a website.
Publishing is... a lot trickier. Ideal solutions would strike a balance between content discoverability and prevention of unauthorized use. Blocking AI crawlers with robots.txt might offer some level of protection against larger LLMs, especially those under greater scrutiny. But this approach is still lacking. As LLMs become smaller and more diffuse, inevitably falling into the hands of bad-faith actors, a simple “no entry” block won't suffice.
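For a sense of what that “no entry” sign looks like in practice, here is a minimal sketch using Python's standard-library robots.txt parser. GPTBot (OpenAI) and CCBot (Common Crawl) are real crawler names; the rules themselves, the may_crawl helper, and the example.com URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Illustrative rules: turn away two known AI-related crawlers,
# leave the site open to everyone else (e.g. search engines).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


STORY_URL = "https://example.com/stories/chapter-1"  # placeholder address
for agent in ("GPTBot", "CCBot", "Googlebot"):
    print(agent, may_crawl(agent, STORY_URL))
```

The catch is the one described above: robots.txt is a request, not a lock. A crawler that ignores the file, or simply announces itself under a different name, walks straight past it.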
The concept of AI cloaking, or obfuscating the data that a model “reads,” is maturing, at least in the visual arts. Glaze cloaks an artist’s work by adding a layer of changes that is imperceptible to the human eye but disrupts generative image models. Machines read the image differently, so a model that attempts to mimic the artist's style yields a different result than intended, corrupting that data for training.
A similar tool doesn’t exist for writing yet, though a solution might be as simple as scrambling text, or using fonts that confuse a crawler. But like antivirus software, it would need to constantly adapt as LLMs evolve and become ever more sophisticated. (If you’re a researcher or startup working on this problem, reach out!)
An integrated tool that could follow the entire journey of a piece—from initial drafting to final edits, and all the way through to publishing—could create a digital trail, or record of the work behind it. That record would support writers at every stage of their work, preserving their unique voice and vision in a world where the origins of creative work are more clouded than ever.
The challenge ahead
At Ellipsus, we don’t have the answers baked into code... but we’re working on it.
The problem, as we see it, has two parts:
I. Developing the technology itself, in a constantly evolving landscape.
II. Giving creatives more say (and control) over their work than technologists, who are motivated by the generative aspect of AI without the corresponding care for risk and protection.
We’ve started with a text editor—supporting writers with a secure, safe way to write together, away from the prying eyes of LLMs. But we also want to ensure that LLMs can’t trawl public content without consequence—with both technical and legal barriers protecting the integrity of publishing (and human integrity).
The rapid advance of AI has exposed these challenges to those who care to solve them. Should an industry fast losing its grip on the future and failing to protect authors allow the proliferation of tools that thrive on imitation for profit, distorting the essence of creativity? Technologists are misreading writers, operating under the assumption that creatives want shortcuts and superficial solutions; that readers care as little for the work they consume as they do for disposable plastic toys and knockoff beauty products.
We're not Luddites or doomers at Ellipsus. In spite of, or because of, these challenges, we have big hopes for the future of storytelling. That future is generative but genuine, and exhilarating not because we’ve discovered how to replicate human voices without the humanity, but because writers are shaping the future of storytelling together. It's happening in huge, collaborative, creative online spaces, platforms where millions jump into immersive new fandoms. Anywhere there's a resonant and real basis for creation, there will be innovative new worlds to explore, and stories that are true.
In a moment when authenticity is increasingly up for debate, it is most valued when it is rare.
AI'll be back...