The web should be readable by machines. Here's a simple way to do it.

Published: February 17, 2026 at 08:23 AM EST
Source: Dev.to

The problem is simple

Every AI agent on the internet is doing the same thing: fetching HTML, guessing what’s content, and getting it wrong.

When an AI tool tries to use your website, this is what it sees:

![](/avatars/sarah.jpg)
Sarah Chen

AI Agents Need Structure

The web was built for...

BUY NOW

Is “Sarah Chen” the author or a commenter? Where does the article end and the ad begin? The machine has to guess, and it often guesses wrong.

We have robots.txt to tell machines what to stay away from. We have nothing to tell them what we have.


What I believe

  • The web should be readable by machines — not just humans.
    AI agents are becoming how people find information. If your content isn’t structured for them, it’s increasingly invisible.

  • Structured data shouldn’t require a custom API.
    Every integration today is bespoke. Every scraper is a hack. There should be one simple convention that works everywhere.

  • Attribution shouldn’t be optional.
    If a machine reads your content, it should know who made it and how to credit them. That should be part of the protocol, not an afterthought.

  • Open beats proprietary.
    If we don’t build an open standard for this, every AI company will build its own closed pipeline. That’s worse for everyone.


So I built something

FlyWeb is a JSON file at /.well-known/flyweb.json. It lets any website describe its content in a way machines can understand.

{
  "flyweb": "1.0",
  "entity": "My Tech Blog",
  "type": "blog",
  "attribution": {
    "required": true,
    "must_link": true
  },
  "resources": {
    "posts": {
      "path": "/.flyweb/posts",
      "format": "jsonl",
      "fields": ["title", "author", "date", "tags", "content", "url"],
      "access": "free",
      "query": "?tag={tag}&limit={n}"
    }
  }
}

One file. An AI agent that finds it knows what content you have, where to get it as clean data, how to query it, and how to credit you. No SDK, no API key, no OAuth—just a file and a convention.
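
To make "knows what content you have" concrete, here is a minimal sketch of how an agent might summarize a manifest once it has it in hand. The helper name `describeResources` is hypothetical and not part of any FlyWeb package; the field names are taken only from the example manifest above, and the real spec may define more.

```javascript
// Hypothetical helper (not part of the FlyWeb SDK): summarize what a
// manifest advertises, using only fields shown in the example above.
function describeResources(manifest) {
  const resources = manifest.resources ?? {};
  return Object.entries(resources).map(([name, res]) => ({
    name,
    path: res.path,
    format: res.format,
    fields: res.fields ?? [],
    // Assumption: a resource is queryable when it declares a query template.
    queryable: typeof res.query === 'string',
  }));
}

const exampleManifest = {
  flyweb: '1.0',
  entity: 'My Tech Blog',
  type: 'blog',
  attribution: { required: true, must_link: true },
  resources: {
    posts: {
      path: '/.flyweb/posts',
      format: 'jsonl',
      fields: ['title', 'author', 'date', 'tags', 'content', 'url'],
      access: 'free',
      query: '?tag={tag}&limit={n}',
    },
  },
};

console.log(describeResources(exampleManifest));
```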


How it works

Discovery – AI agents check /.well-known/flyweb.json, the same way crawlers check robots.txt.
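
A rough sketch of that discovery step, assuming a missing or unparseable file simply means the site has not adopted FlyWeb. `manifestUrl` and `discoverManifest` are hypothetical names for illustration, not the SDK's API, and the fetch function is passed in so the logic can be exercised without a network.

```javascript
// Resolve the well-known manifest location from any URL on the site.
function manifestUrl(siteUrl) {
  return new URL('/.well-known/flyweb.json', siteUrl).href;
}

// Hypothetical discovery helper. Assumption: any non-2xx response or
// invalid JSON is treated as "no FlyWeb here" rather than an error.
async function discoverManifest(siteUrl, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(manifestUrl(siteUrl));
  if (!res.ok) return null;
  try {
    return await res.json();
  } catch {
    return null;
  }
}

console.log(manifestUrl('https://example.com/posts/42?ref=home'));
```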

Structure – Content is served as clean JSON or JSONL at paths you define.

GET /.flyweb/posts
{"title": "Why AI Needs Structure", "author": "Sarah Chen", "date": "2026-02-15", "content": "..."}
{"title": "The Future of Web Protocols", "author": "Sarah Chen", "date": "2026-02-10", "content": "..."}
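
JSONL is one JSON object per line, so reading a response like the one above takes very little code. This is a minimal sketch; `parseJsonl` is a hypothetical helper, and skipping blank lines is my assumption rather than something the spec states.

```javascript
// Parse a JSONL body: one JSON value per non-empty line.
function parseJsonl(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // assumption: blank lines are ignored
    .map((line) => JSON.parse(line));
}

const body = [
  '{"title": "Why AI Needs Structure", "author": "Sarah Chen", "date": "2026-02-15", "content": "..."}',
  '{"title": "The Future of Web Protocols", "author": "Sarah Chen", "date": "2026-02-10", "content": "..."}',
  '',
].join('\n');

for (const post of parseJsonl(body)) {
  console.log(`${post.date}  ${post.title}`);
}
```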

Query – Standard URL parameters. Nothing fancy.

GET /.flyweb/posts?tag=ai&limit=5
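
One way an agent could turn the manifest's query template ("?tag={tag}&limit={n}") into a request URL like the one above. The post does not spell out the template semantics, so this sketch assumes each `{name}` placeholder is replaced by a caller-supplied value and that parameters without a supplied value are dropped. `buildQueryUrl` is a hypothetical name.

```javascript
// Hypothetical helper: expand a resource's query template into a full URL.
// Assumption: "{name}" placeholders map to keys in `values`; parameters
// whose placeholder has no value are omitted from the request.
function buildQueryUrl(origin, resource, values = {}) {
  const url = new URL(resource.path, origin);
  const template = new URLSearchParams(resource.query ?? '');
  for (const [key, raw] of template) {
    const match = /^\{(\w+)\}$/.exec(raw);
    const value = match ? values[match[1]] : raw;
    if (value !== undefined) url.searchParams.set(key, String(value));
  }
  return url.href;
}

const postsResource = { path: '/.flyweb/posts', query: '?tag={tag}&limit={n}' };
console.log(buildQueryUrl('https://example.com', postsResource, { tag: 'ai', n: 5 }));
```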

Before and after

Without FlyWeb – the AI guesses. It parses your Tailwind classes, hopes it finds the right element, and gives you zero credit.

With FlyWeb – the AI gets this:

{
  "title": "AI Agents Need Structure",
  "author": "Sarah Chen",
  "date": "2026-02-15",
  "tags": ["ai", "web"],
  "content": "The web was built for...",
  "url": "https://example.com/posts/42"
}

No guessing. No scraping. No hallucinated metadata.


Attribution is not optional

This is the part I care about most.

{
  "attribution": {
    "required": true,
    "license": "CC-BY-4.0",
    "must_link": true
  }
}

You can give your content away for free. You shouldn’t have to give up credit. In FlyWeb, attribution is part of the protocol—not a suggestion, not a “best practice”, but a spec requirement.
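
On the consuming side, honoring that block could look something like the sketch below. This is one possible reading, not the spec's wording: `creditLine` is a hypothetical helper, the output format is invented for illustration, and treating `must_link` as "refuse to cite a record that has no url" is my assumption.

```javascript
// Hypothetical helper: build a credit string for a record, based on the
// site's attribution block. The format of the string is illustrative only.
function creditLine(record, attribution = {}, entity = '') {
  if (!attribution.required) return null;
  // Assumption: must_link means the credit has to carry the record's URL.
  if (attribution.must_link && !record.url) {
    throw new Error('attribution.must_link is set but the record has no url');
  }
  const parts = [`"${record.title}" by ${record.author}`];
  if (entity) parts.push(entity);
  if (attribution.license) parts.push(attribution.license);
  if (attribution.must_link) parts.push(record.url);
  return parts.join(', ');
}

const samplePost = {
  title: 'AI Agents Need Structure',
  author: 'Sarah Chen',
  url: 'https://example.com/posts/42',
};
const sampleAttribution = { required: true, license: 'CC-BY-4.0', must_link: true };

console.log(creditLine(samplePost, sampleAttribution, 'My Tech Blog'));
```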


Adding it takes minutes

CLI

npx flyweb init

Framework plugins

npm i next-flyweb      # Next.js
npm i astro-flyweb     # Astro
npm i sveltekit-flyweb # SvelteKit
npm i nuxt-flyweb      # Nuxt
npm i express-flyweb   # Express

WordPress

There’s a plugin that auto‑generates the config from your posts and pages.

Validate

npx flyweb check https://your-site.com

For AI developers

Client SDK for consuming FlyWeb data

import { discover, fetchResource } from 'flyweb/client';

const site = await discover('https://techcrunch.com');
const articles = await fetchResource(
  'https://techcrunch.com',
  site.config.resources.articles,
  { params: { tag: 'ai' }, limit: 10 }
);
// Clean JSON. No scraping.

MCP server for Claude Code, Cursor, and similar tools

{
  "mcpServers": {
    "flyweb": {
      "command": "npx",
      "args": ["-y", "flyweb-mcp"]
    }
  }
}

I don’t know if this will work

I’m not going to pretend this is guaranteed to succeed. Protocols are hard. Adoption is harder.

But the problem is real. AI agents are scraping the web blind, and content creators are left without proper credit or structure. FlyWeb aims to give both sides a clear, open, and easy‑to‑implement solution.

Every month that passes without an open standard is another month where proprietary pipelines get more entrenched.

FlyWeb is a small bet that a simple, open convention can fix this before it’s too late to fix.


The Protocol Is Open

  • GitHub
  • Spec
  • Website
  • Docs
  • npm
  • MCP Server

MIT licensed. No vendor lock‑in. No payment. If you think the web should be readable by machines, try it out. If you have ideas, PRs are welcome.

The web was built for human eyes. It shouldn’t stay that way.
