Perspectives

The Distributors

Generative AI's ability to produce content is rapidly eclipsing human velocity, threatening the time, thought, and experience required to create meaningful work. Traditional and online publishing is unprepared for the massive changes AI will bring. What should our tools and platforms do to protect the integrity of writers’ works? Is publishing the first line of defense against LLMs?

In The Watchmakers, we wrote about the implications of generative AI on creative writing. Now, we’ll explore its effect on publishing.

Written by Rex Mizrach

Publish date: 05/01/2024

Consider the bot

Hostile la vista...

Of course, every AI-skepticism article should begin with a Terminator reference.

When we were growing up in the ’90s, killer robots held our fascination as modern nightmare-myths that our parents let us watch (for some reason). Killer drones hovering over battlefields; mechanized bodybuilders trundling across the landscape. Humans left to sift through the debris of their existence, overwhelmed by despair. We spent a lot of time thinking about how we would fare fighting these bots. Would their red-bulbed stares terrify us? And when would be the moment we would believe, down to our bones, that we’re fucked?

We’re not fucked—but we are worried. The advance of AI is a long ramp through shadowed territory. For years, it has progressed at a shallow incline—mere degrees—almost unnoticeable from the flats. But a year or two ago it steepened. With ChatGPT’s debut, the world began to notice, and our legs began to fatigue. Could we have imagined, in our childhood nightmares, that the “killer AI” would be a chatbot—devaluing artistic integrity and human creativity? (And is that a better or worse enemy than a T-800?)

The future is unclear. But years before killer AI hovercraft floated above an atomic-scorched landscape—before humans gave up authenticity for instant gratification—a group of boring publishing executives debated and equivocated within the sterile confines of the Frankfurt Book Messe.

Mess at the Messe

Frankfurt Book Messe

At 2023’s Frankfurt Book Messe (the largest book fair in the world, and a gathering of publishers, agents, and authors), the panels on generative AI's role in publishing seemed dominated by a palpable sense of uncertainty.

During a seminar on the impact of AI on rights, moderator Tom Chatfield opened the discussion by highlighting AI's revolutionary potential: "I think that AI challenges us to think in new ways about creativity and written work and words, just as photography challenged art profoundly." But the mood quickly shifted, manifesting a darker tone.

"At their worst, [AI systems] can be forces of mediocrity or deception or simulation that push us aside or cannibalize our works and ideas while impoverishing our own culture and mental processes."

There aren’t that many events that threaten to upend a business as staid as publishing. Most recently, there was the internet-fueled eBook revolution of the aughts, which blasted open a near-infinite market in self-publishing, offering opportunities for independent authors to control their destinies and fundamentally change how writers engage with the industry. But the advance of generative AI seems to signal a greater sea change—though visibility is murky at best.

Writing tools with integrated generative AI, like Sudowrite, have set out to engage in the business of writing: they implore users to churn out a book a day, with a bot co-writer doing the mundane work of actually writing, which incentivizes writers to game the system. And the system is, as of 2024, inherently flawed. The publishing industry’s lack of readiness to safeguard authors' interests in the face of rapid technological change is nothing new. But Frankfurt revealed an industry grappling with a dichotomy: tools that could enhance an industry that is, of course, a business, and the potential of those same tools to undermine artistic integrity.

Writers, left to their own devices, face a wild new landscape. AI's ability to produce content rapidly creates an uneven playing field: the sheer volume and speed of LLM output easily eclipse human velocity, threatening the time, thought, and experience required to create meaningful work. And the rapid publication of AI content compromises quality, pushing authentic voices further into obscurity.

Already forced to compete for their share of the attention economy, writers in a world inundated with AI will find it difficult, if not impossible, to maintain solid ground, own their rights and earn rightful compensation. And younger writers who have yet to build a significant platform will struggle to be heard in a marketplace flooded with facsimiles.

The verification problem

AI publishing schemes have taken root on Amazon.

While online publishing platforms offer freedom to authors, they often treat their work as pure commodity. Initially lauded for democratizing the publishing process, they are fast becoming repositories for AI-generated content (with hazardous results). Even Amazon had to take action—limiting self-publishing to a generous three books per day—underscoring the severity of the problem, and the unwillingness to fix it. Platforms like Wattpad and Inkitt have employed a kind of “honor system”—disclose… or don’t. (¯\_(ツ)_/¯)

In August 2023, author Jane Friedman spotted a number of self-published books on Amazon falsely attributed to her. On Twitter, other writers reported finding similar books, suspecting that AI-generated texts were exploiting established writers’ reputations in a widespread scam. Despite the authors’ efforts to remove them, the books continued to appear on the platform. “We invest heavily to provide a trustworthy shopping experience and protect customers and authors from misuse of our service,” said one Amazon spokesperson. The quest for transparency remains opaque: Amazon doesn’t care enough to maintain ethical standards in published content, nor does it employ detection tools, which are continually playing catch-up with LLMs. Its business model is about maximizing customer choice at the expense of vendors. And the number of AI-“authored” books on Amazon (the largest bookseller in the world) is still unknown.

Clarifying the legal status of AI-generated content holds significant implications for copyright and authorship (as discussed in part one). In late December 2023, The New York Times sued OpenAI and Microsoft for copyright infringement, stating that millions of articles published by The Times were used in training ChatGPT and other LLMs, which now compete with news outlets as sources of “reliable information.” The suit argues that the defendants are liable for “billions of dollars in statutory and actual damages related to the unlawful copying and use of The Times’s uniquely valuable works,” and demands the destruction of any models and training data that incorporate The Times’s copyrighted material.

While the publishing industries and regulatory bodies in the EU, UK, and US are beginning to address these concerns, there is little consensus on how to legally differentiate between human and AI-generated content. Unlike art and music, very few literary copyright claims have succeeded on plagiarism grounds alone. With a giant of American media taking legal action, the question of copyright creates a precarious situation for authors whose works appear on platforms that permit the publication of AI content.

The first generation of AI-detection tools has proven unreliable, often producing false positives and false negatives. A well-trained nose employing the sniff test might work sometimes, but as with generative art models like Midjourney, the hands are looking better every day. Soon, it may be possible to train a model on a particular author’s style: the stamp and seal of a writer. As LLM improvements continue to blur the lines between author and artifice, and publication becomes a grift for quick cash, how will these future landscapes look?

And how can writers ensure authenticity?

A template for authenticity

What does it mean to protect a work? And what does a tool that protects not only writers, but writing look like?

Attribution (also known as provenance) and security are vital in safeguarding the integrity of a writer’s work, from its inception all the way to publication.

Self-publishing has a problem... (X)

Attribution—who (or what) wrote what?

Soon, published work will face a critical litmus test: did a human write this? Any technological advancement hinges on the control and quality of its inputs. A blockchain's effectiveness depends on the accuracy of its data; an LLM is only as good as the words it ingests. While this may not always be the case (the ways machines use data are improving rapidly), it’s difficult to fake millions of keystrokes over an extended time.

The process of creating a text—how long it’s been edited, the number of sessions, the types of operations (writing, editing, deleting, copying, pasting)—signifies the hand of the author. Providing detailed version history, meticulously logging and safeguarding inputs throughout every stage of the work, could document and prove authenticity. In establishing a clear and traceable record of creation, writers could provide undeniable evidence of their rightful ownership.
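To make the idea of a traceable record concrete, here is a minimal sketch of an edit log in which every entry is chained to the one before it by a hash, so that altering any earlier entry is detectable. Everything in it is an assumption made for illustration: the `EditLog` class, its field names, and the choice of a SHA-256 hash chain are hypothetical, not a description of how any existing editor works.

```python
import hashlib
import json
from collections import Counter

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """Hash an entry's contents in a stable (sorted-key) JSON form."""
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class EditLog:
    """Hypothetical append-only record of writing operations."""

    def __init__(self):
        self.entries = []

    def record(self, session, operation, chars, timestamp):
        """Append one operation (write, edit, delete, copy, paste)."""
        prev = self.entries[-1]["hash"] if self.entries else GENESIS
        body = {
            "session": session,
            "operation": operation,
            "chars": chars,
            "timestamp": timestamp,
            "prev": prev,  # links this entry to the one before it
        }
        self.entries.append({**body, "hash": _digest(body)})

    def verify(self):
        """Return True only if no entry was altered, removed, or reordered."""
        prev = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev"] != prev or entry["hash"] != _digest(body):
                return False
            prev = entry["hash"]
        return True

    def summary(self):
        """Count distinct sessions and operations by type."""
        sessions = {entry["session"] for entry in self.entries}
        operations = Counter(entry["operation"] for entry in self.entries)
        return {"sessions": len(sessions), "operations": dict(operations)}


log = EditLog()
log.record("session-1", "write", 120, "2024-05-01T09:00:00Z")
log.record("session-1", "delete", 15, "2024-05-01T09:04:00Z")
log.record("session-2", "paste", 40, "2024-05-02T18:30:00Z")
print(log.verify(), log.summary())
```

A chain like this only shows that the record has not been rewritten after the fact; it does not, by itself, prove that a human typed the text. It would also need its latest hash stored somewhere the author cannot quietly change, which is left out of this sketch.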

Security and safety, from idea to publishing

Shielding a writer's work involves not just protecting its text, but ensuring safety in every part of the process, on both the material and legal fronts. Building a better, more secure text editor with a clear commitment to data privacy is the first step: one that prioritizes data security and maintains a transparent stance on usage terms, reducing the risk of LLMs training on a writer's work without consent.

Robots.txt: a text file used to instruct search engine robots how to crawl pages on a website.

Publishing is trickier. Ideally, the goal is to strike a balance between discoverability of content and preventing its unauthorized use. Disallowing AI crawlers in robots.txt offers some level of protection against larger LLMs, especially those under greater governmental scrutiny. But this approach is still lacking, because compliance with robots.txt is voluntary. As LLMs become smaller and more diffuse, inevitably falling into the hands of bad-faith actors, a simple “no entry” block will prove inadequate.
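As a rough illustration of that balance, the sketch below defines a robots.txt that turns away two crawlers associated with AI training data (OpenAI's GPTBot and Common Crawl's CCBot) while leaving the site open to everyone else, then checks the rules with Python's standard `urllib.robotparser`. The choice of crawlers, the example URL, and the `may_crawl` helper are assumptions for illustration; a real site would need to keep its own list current.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block two AI-related crawlers, allow all others.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt):
    """Parse robots.txt rules from a string rather than fetching a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser, user_agent, url):
    """Hypothetical helper: would a rule-abiding crawler fetch this URL?"""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
story_url = "https://example.com/stories/chapter-1"  # placeholder URL

for agent in ("GPTBot", "CCBot", "Googlebot"):
    print(agent, may_crawl(parser, agent, story_url))
```

The check only describes what a well-behaved crawler would do. Nothing in the file stops a crawler that chooses to ignore it, which is the limitation described above.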

The concept of AI cloaking (obfuscating the data that a model “reads”) is maturing, at least for the visual arts. Glaze cloaks an artist’s work in a way that disrupts generative image models without compromising it for the viewer, adding a layer of changes to the image that is imperceptible to the human eye. Machines read it differently, so the style that AI attempts to mimic will yield a different result than intended, corrupting that data for training.

A similar tool doesn’t yet exist for writing—though a solution might be as simple as scrambling text, or using fonts that confuse a crawler. But much like antivirus software, it would need to constantly adapt as LLMs evolve and become ever more sophisticated. (If you’re a researcher or startup working on this problem—reach out!)
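To show what "scrambling text" could mean in its simplest form, here is a toy sketch: the published text has its letters swapped according to a secret mapping, and a custom font (not shown, and assumed here) would draw each swapped letter as the original one, so a reader sees the real words while a crawler collects gibberish. The function names and the seeded letter shuffle are hypothetical choices for illustration, not an existing tool.

```python
import random
import string


def build_cipher(seed):
    """Build a seeded one-to-one mapping over the lowercase ASCII letters."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))


def invert(cipher):
    """Reverse the mapping, standing in for what the custom font would do."""
    return {scrambled: plain for plain, scrambled in cipher.items()}


def substitute(text, mapping):
    """Swap letters using the mapping; keep case, punctuation, and spacing."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in mapping:
            swapped = mapping[lower]
            out.append(swapped.upper() if ch.isupper() else swapped)
        else:
            out.append(ch)
    return "".join(out)


cipher = build_cipher(seed=2024)
original = "Stories that ring true."
published = substitute(original, cipher)           # what a crawler would read
displayed = substitute(published, invert(cipher))  # what the font would show

print(published)
print(displayed == original)
```

A scheme this simple would be easy to undo, and it would also break screen readers, copy and paste, and search, so it illustrates the shape of the idea more than a workable defense.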

An integrated tool that can encompass the entire journey of a piece, from its initial drafting to editing, all the way through to publishing, would not only facilitate the creative process but also serve as a digital trail. Relevant data to support writers at every stage of their work can ensure their unique voice and vision are preserved and recognized in a world that is more clouded than ever.

The challenge ahead

At Ellipsus, we don’t have the answers baked into code—but we’re working on it.

The problem, as we see it, has two parts:

I. Developing the technology itself, in an ever-evolving landscape.

II. Giving creatives more say (and control) over their work than technologists, who are motivated by the generative aspect of AI without the corresponding care for risk and protection.

We’ve started with a text editor—supporting writers with a secure, safe way to write together, away from the prying eyes of LLMs. But we also want to ensure that LLMs can’t trawl public content without consequence—with both technical and legal barriers protecting the integrity of publishing (and human integrity).

The rapid advance of AI has exposed these challenges to those who care to solve them. Should an industry fast losing its grip on the future and failing to protect authors allow the proliferation of tools that thrive on imitation for profit, distorting the essence of creativity? Technologists are misreading writers, operating under the assumption that creatives want shortcuts and superficial solutions, and that readers will care as little for the content they consume as they do for disposable plastic toys and knockoff beauty products.

We don’t want to be accused of Ludditism or doomerism. In spite of, or because of these challenges, we have high hopes for the future of storytelling. That future is generative—but genuine. This new world is exhilarating not because we’ve discovered how to replicate human voices without the humanity, but because writers are shaping the future of storytelling together—finding shared joy in collaborating on original projects, discussing novels with friends, being immersed in vibrant new fandoms. Anywhere there's a resonant and real basis for creation, there will be innovative new worlds to explore, and stories that ring true.

In an age when authenticity is increasingly up for debate, it is most valued when it is rare.

AI'll be back...