I Built an AI Pipeline for Books, Here's the Architecture
Source: Dev.to
Here’s What We Learned From 50K Books
Most AI‑writing tools are just chat wrappers: paste a prompt, get text, copy it into Google Docs, repeat. For a full book that means hundreds of round‑trips and a total loss of context between them.
I’ve spent three years in the AI + publishing space—publishing books myself, building a reading platform (NanoReads, 130+ books, 341K readers), and talking to hundreds of authors. The same complaints kept coming up:
- AI loses track of what happened ten chapters ago.
- Every chapter sounds different.
- Dialogue is flat.
- The output is full of “Moreover…”, “Furthermore…”, “It’s worth noting that…”.
These aren’t model‑quality problems. After generating 50K+ books on our platform (AIWriteBook), we’re confident the bottleneck is the specification pipeline, not the language model.
The Architecture
We treat book creation as a multi‑stage compilation pipeline:
```
Book Metadata → Character Graph → Chapter Outlines → Chapter Content
      |                |                  |                 |
   (schema)         (schema)           (schema)        (streaming)
```
Each stage produces schema‑constrained structured output that feeds the next stage. Nothing is free‑form until the final prose generation.
Stage 1 – Book Metadata
The user supplies a title and a short description. The AI then generates a structured metadata object that becomes the single source of truth for everything downstream.
```json
{
  "title": "The Dragon's Reluctant Mate",
  "genres": ["Fantasy", "Romance"],
  "tone": ["dark", "romantic", "suspenseful"],
  "style": ["dialogue-heavy", "fast-paced"],
  "target_audience": "Adult fantasy romance readers",
  "plot_techniques": ["enemies-to-lovers", "slow-burn", "foreshadowing"],
  "writing_style": "..."
}
```
Tone, style, and audience are constraints, not suggestions.
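A lightweight way to make those constraints binding is to validate the model's output against the schema before anything downstream consumes it. A minimal sketch in TypeScript (field names mirror the example above; sample values are illustrative, and the real pipeline's validator is presumably stricter):

```typescript
// Shape of the Stage 1 output (mirrors the JSON example above).
interface BookMetadata {
  title: string;
  genres: string[];
  tone: string[];
  style: string[];
  target_audience: string;
  plot_techniques: string[];
  writing_style: string;
}

// Type guard: reject any model output that drifts from the schema,
// so later stages never see malformed metadata.
function isBookMetadata(value: unknown): value is BookMetadata {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const isStrArray = (x: unknown): boolean =>
    Array.isArray(x) && x.every((s) => typeof s === "string");
  return (
    typeof v.title === "string" &&
    isStrArray(v.genres) &&
    isStrArray(v.tone) &&
    isStrArray(v.style) &&
    typeof v.target_audience === "string" &&
    isStrArray(v.plot_techniques) &&
    typeof v.writing_style === "string"
  );
}

// Illustrative sample (the writing_style value is a placeholder).
const sampleMetadata = {
  title: "The Dragon's Reluctant Mate",
  genres: ["Fantasy", "Romance"],
  tone: ["dark", "romantic", "suspenseful"],
  style: ["dialogue-heavy", "fast-paced"],
  target_audience: "Adult fantasy romance readers",
  plot_techniques: ["enemies-to-lovers", "slow-burn", "foreshadowing"],
  writing_style: "...",
};
```

Anything that fails the guard gets regenerated rather than passed along, which is what keeps later stages predictable about their inputs.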
Stage 2 – Character Graph
Each character is a structured node containing voice, motivation, arc, and internal conflict. When generating a chapter we only pass the characters that actually appear, together with their current arc position and relationship dynamics.
```json
{
  "name": "Kira Ashvane",
  "role": "protagonist",
  "voice": "Sharp, clipped sentences. Uses sarcasm as defense.",
  "motivation": "Prove she doesn't need the dragon clan's protection",
  "internal_conflict": "Craves belonging but fears vulnerability",
  "arc": "Isolation → reluctant alliance → trust → sacrifice"
}
```
Because the model receives explicit voice specs per character, dialogue no longer sounds homogeneous.
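The "only pass the characters that actually appear" step reduces to a lookup from the chapter's interaction list into the graph. A sketch, with shapes assumed from the examples in this post (the placeholder `"..."` fields stand in for full specs):

```typescript
interface CharacterNode {
  name: string;
  role: string;
  voice: string;
  motivation: string;
  internal_conflict: string;
  arc: string;
}

interface Interaction {
  characters: string[];
  dynamic: string;
}

// Given the full character graph and a chapter's interaction list,
// return only the nodes the chapter actually needs. Keeping the prompt
// to the relevant subset is what lets per-character voice specs stay
// sharp instead of being diluted across the entire cast.
function charactersForChapter(
  graph: CharacterNode[],
  interactions: Interaction[],
): CharacterNode[] {
  const needed = new Set(interactions.flatMap((i) => i.characters));
  return graph.filter((c) => needed.has(c.name));
}

const graph: CharacterNode[] = [
  { name: "Kira Ashvane", role: "protagonist", voice: "Sharp, clipped sentences.", motivation: "...", internal_conflict: "...", arc: "..." },
  { name: "Draethor", role: "love interest", voice: "Formal, archaic diction.", motivation: "...", internal_conflict: "...", arc: "..." },
  { name: "Maelis", role: "mentor", voice: "Riddling, gentle.", motivation: "...", internal_conflict: "...", arc: "..." },
];

const cast = charactersForChapter(graph, [
  { characters: ["Kira Ashvane", "Draethor"], dynamic: "hostile tension" },
]);
// cast holds only Kira Ashvane and Draethor; Maelis stays out of the prompt.
```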
Stage 3 – Chapter Outlines
This is the most important stage. Every chapter gets a detailed spec that guides the downstream generation.
```json
{
  "chapter_number": 3,
  "title": "The Binding Ceremony",
  "events": [
    "Kira is forced to attend the bonding ritual",
    "..."
  ],
  "locations": [
    "Dragon temple, obsidian halls lit by bioluminescent moss"
  ],
  "twists": [
    "The ritual reveals Kira has dormant dragon magic"
  ],
  "character_interactions": [
    {
      "characters": ["Kira", "Draethor"],
      "dynamic": "hostile tension with undercurrent of curiosity"
    }
  ],
  "word_count": 2800
}
```
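Because outlines and the character graph are both structured, they can be cross-checked before prose generation: every name an outline mentions should exist in Stage 2's graph, or the prose stage would have to improvise an unspecified voice. A hypothetical consistency check:

```typescript
// Cross-stage check (sketch): collect any character named in a chapter
// outline that is missing from the Stage 2 character graph.
function undefinedCharacters(
  knownNames: string[],
  interactions: { characters: string[] }[],
): string[] {
  const known = new Set(knownNames);
  const mentioned = interactions.flatMap((i) => i.characters);
  return [...new Set(mentioned.filter((n) => !known.has(n)))];
}

const missing = undefinedCharacters(
  ["Kira", "Draethor"],
  [{ characters: ["Kira", "Draethor"] }, { characters: ["Kira", "Seren"] }],
);
// missing is ["Seren"]: reject or regenerate the outline before Stage 4.
```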
Internal A/B Test
| Metric | Default Outline | Customized Outline |
|---|---|---|
| Export rate | 16% | 34% |
| Satisfaction (out of 5) | 3.4 | 4.3 |
| Regenerations / chapter | 1.8 | 0.7 |
| Completion rate | 41% | 72% |
A mediocre model with a detailed outline beats a good model with a vague outline. As in software engineering, garbage in → garbage out.
Stage 4 – Chapter Generation
The only streaming stage. The model receives:
- Book metadata
- Relevant characters with voice specs
- The chapter’s outline
- Summaries of previous chapters (for continuity)
- Author’s writing‑style samples
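Those five inputs can be assembled into one generation context. A sketch (the section labels and the summary cap are my own assumptions, not the production prompt format):

```typescript
interface ChapterContext {
  metadata: string;            // serialized Stage 1 output
  characters: string[];        // voice specs for this chapter's cast only
  outline: string;             // serialized Stage 3 spec
  previousSummaries: string[]; // one short summary per earlier chapter
  styleSamples: string[];      // author-uploaded writing samples
}

// Assemble the prompt for the prose model. Passing summaries rather than
// full chapters preserves continuity without blowing the context window;
// only the most recent summaries are kept.
function buildPrompt(ctx: ChapterContext, maxSummaries = 10): string {
  const recent = ctx.previousSummaries.slice(-maxSummaries);
  return [
    "## Book metadata\n" + ctx.metadata,
    "## Characters in this chapter\n" + ctx.characters.join("\n"),
    "## Story so far\n" + recent.join("\n"),
    "## Style samples\n" + ctx.styleSamples.join("\n---\n"),
    "## Chapter outline\n" + ctx.outline,
  ].join("\n\n");
}
```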
We use a two‑model strategy:
- Gemini Flash – handles all structural work (fast, cheap, excels at schema‑constrained output).
- Frontier model – produces the final prose.
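The routing rule is simple enough to state in code; the model identifiers below are placeholders, not the exact strings used in production:

```typescript
type Stage = "metadata" | "characters" | "outline" | "prose";

// Route schema-constrained structural stages to the fast, cheap model
// and reserve the expensive frontier model for final prose only.
function pickModel(stage: Stage): string {
  return stage === "prose" ? "frontier-model" : "gemini-flash";
}
```

Since three of the four stages are structural, most calls in a full book run hit the cheap model.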
Voice Training
Authors can upload 3–5 writing samples. We extract style features and feed them as few‑shot examples during generation.
Results from our data:
- 2.4× higher export rate with voice training.
- 41% fewer regeneration requests.
- 67% less manual editing.
- Fewer than three samples → marginal improvement.
- More than five samples → diminishing returns.
Without voice training the output feels like generic GPT; authors either abandon the project or spend hours rewriting. With voice training, the “AI slop” problem largely disappears because the model now has concrete anchors for style.
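The sample-count findings above suggest a trivial gate in front of few-shot conditioning, sketched here (the return shape is my own invention):

```typescript
// Clamp uploaded writing samples to the 3-5 window the data supports:
// below three the gain is marginal, above five returns diminish.
function selectVoiceSamples(samples: string[]): { samples: string[]; effective: boolean } {
  return {
    samples: samples.slice(0, 5),    // cap at five
    effective: samples.length >= 3,  // flag when conditioning is worthwhile
  };
}
```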
Fiction vs. Non‑Fiction Pipelines
Fiction
Uses the character graph + plot‑continuity pipeline described above.
Non‑Fiction
A separate architecture that starts from reference material.
```
Reference Files → Content Extraction → Book Structure Selection
                                                |
                        Chapter Outlines (with assigned references)
                                                |
                          Chapter Content (with citations)
```
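One simple (and admittedly naive) way to do the "assigned references" step is keyword overlap between each reference document and each chapter title; whatever relevance scoring the real pipeline uses, the shape is the same:

```typescript
// Lowercase word set for crude lexical matching; short words are dropped.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/\W+/).filter((w) => w.length > 3));
}

// Assign each reference to the chapter whose title shares the most
// words with its text: a naive stand-in for real relevance scoring.
function assignReferences(
  chapterTitles: string[],
  references: { name: string; text: string }[],
): Map<string, string[]> {
  const chapterTokens = chapterTitles.map(tokenize);
  const result = new Map<string, string[]>(chapterTitles.map((t) => [t, []]));
  for (const ref of references) {
    const refTokens = tokenize(ref.text);
    let best = 0;
    let bestScore = -1;
    chapterTokens.forEach((tokens, i) => {
      let score = 0;
      for (const w of tokens) if (refTokens.has(w)) score++;
      if (score > bestScore) {
        bestScore = score;
        best = i;
      }
    });
    result.get(chapterTitles[best])!.push(ref.name);
  }
  return result;
}
```

With references attached per chapter, the prose stage can cite concrete material instead of generalizing.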
Impact of Reference Material
| Condition | Export Rate | Satisfaction |
|---|---|---|
| With reference materials | +38% | 4.4 / 5 |
| Without reference materials | baseline | 3.5 / 5 |
When the model has concrete data—named studies, real quotes, specific statistics—it produces far more trustworthy and satisfying nonfiction.
Takeaways
- Specification matters more than model size. A detailed, schema‑driven pipeline yields higher quality than simply scaling the LLM.
- Voice specs per character prevent flat dialogue.
- Chapter outlines are the single biggest lever for consistency, continuity, and author satisfaction.
- Few‑shot voice training dramatically reduces post‑generation editing.
- Non‑fiction requires a data‑centric pipeline that injects citations and reference material early.
Treating book generation like a compiler—metadata → graph → outline → stream—turns the chaotic “prompt‑and‑hope” workflow into a predictable, repeatable production line.
Things We Learned From 50K Books
Chapter length sweet spot is 2,000–3,500 words.
- Below 2,000, chapters feel underdeveloped.
- Above 3,500, the model starts repeating itself with different phrasing, introducing tangents, and padding with unnecessary description.
- Above 5,000, quality drops hard. If a chapter needs to be long, splitting it works better than generating a single long one.
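That splitting rule can be sketched as a planner that divides a target book length into chapters near the middle of the sweet spot (for totals below 2,000 words it just returns one short chapter):

```typescript
// Split a target word count into chapters that each land in the
// 2,000-3,500 sweet spot, aiming near the midpoint (~2,750 words).
function planChapters(totalWords: number, min = 2000, max = 3500): number[] {
  const target = (min + max) / 2;
  const count = Math.max(1, Math.round(totalWords / target));
  const base = Math.floor(totalWords / count);
  const lengths: number[] = Array(count).fill(base);
  // Distribute the remainder one word at a time so the sum is exact.
  for (let i = 0; i < totalWords - base * count; i++) lengths[i]++;
  return lengths;
}
```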
Genre Matters a Lot
| Genre | Export Rate |
|---|---|
| Romance | 31% |
| Literary fiction | 11% |
| Humor | 13% |
| Poetry | 9% |
AI performs best with genres that have established conventions and abundant training data, and struggles with voice‑dependent, highly creative writing.
Only 23% of generated books get exported for publishing.
The successful books share traits:
- 3.2× more time on outline editing
- Voice training enabled in 74 % of cases
- At least one manual edit in 89 % of chapters
Books that make it to publish are iterated on, not one‑click generated.
Multilingual Quality Varies
- Spanish, French, German are close to English quality.
- Polish, Russian, Japanese, Korean are good but noticeably lower.
- Smaller languages are usable for drafts.
Quality correlates with the volume of training data. For authors in less‑represented languages, generating in English and translating often yields better results than native generation.
Stack
- Frontend: Next.js, Tailwind, Supabase client
- Backend: Supabase Edge Functions (Deno)
- AI: Gemini Flash (structural), Frontier models (prose)
- Languages: 30+ supported
Wrapping Up
The main thing we took away from building this: the quality problem in AI‑generated books is a specification problem, not a model problem.
- Vague prompt + “generate” → slop.
- Detailed character graph, structured outline, voice samples, and proper constraints → genuinely good output.
If you want to try it, there’s a free tier that gives you a full 7‑chapter book.
Happy to answer questions about the architecture, the data, or anything about AI + publishing.
Tags: #ai #writing #books #showdev #webdev #productivity