---
title: "Agents shouldn't need to scrape your blog"
url: "https://mondello.dev/posts/agents-shouldnt-need-to-scrape-your-blog"
slug: "agents-shouldnt-need-to-scrape-your-blog"
published: "2026-04-17T02:24:39.635Z"
updated: "2026-04-17T20:12:58.101Z"
excerpt: "I shipped two MCP tools to mondello.dev this morning so agents can walk my x402 skills marketplace end-to-end without parsing a single line of HTML. list_skills returns the catalog. get_skill_preview returns the free sample. The payment happens at /raw. Four surfaces, one protocol, zero scraping."
author: "Romy Mondello"
audio:
  url: "https://mondello.dev/audio/01KP71KJ5XHGVYM18MHH0P0AXH-narration-1776395873552.mp3"
  duration: "8:17"
  size_bytes: 7951725
  format: "audio/mpeg"
license: "CC BY 4.0 unless otherwise noted"
source: "https://github.com/integrate-your-mind/mondello-dev"
---
# Agents shouldn't need to scrape your blog

I shipped two MCP tools to mondello.dev this morning: `list_skills` and `get_skill_preview`. The skills marketplace I launched three days ago already had four agent-facing surfaces. `/skills` is the HTML storefront. `/skills.json` is a machine-readable manifest. `/skills/<slug>/raw` is the x402-gated download. `/skills/<slug>/install` is the bash installer. What it didn't have was MCP.

Without MCP, an agent installing a skill had to either fetch and parse a JSON endpoint it had to already know about, or scrape the HTML storefront. Both work. Neither is how agents should discover anything.

## What was broken

The "how installation works" section of the `/skills` page read:

> Any MCP agent that speaks x402 can pay directly against `/skills/<slug>/raw`. See `/skills.json` for the full catalog with payment metadata.

That's half the story. An agent connected over MCP has a typed protocol. It shouldn't need to know that `/skills.json` exists. It shouldn't need to fetch an HTTP URL, parse JSON, map fields, and then compose its own install command. It should ask the MCP server for the catalog and the prices. The server should answer.

Before this morning, that conversation couldn't happen. `/api/mcp` exposed fourteen tools — posts, cues, changelog entries, site stats — and zero tools about the marketplace one directory away. Those existing fourteen didn't fit the job. Skills aren't posts. They have prices, payment metadata, and a paid/preview split that `list_posts` and `get_post_markdown` don't model.

## What I shipped

Two tools. About 165 lines of TypeScript in `src/pages/api/mcp.ts`.

`list_skills` returns the full catalog. Every published skill, with title, excerpt, price, and the install one-liner. It also returns `currency: "USDC"` and `network: "eip155:8453"` at the top level so an x402-aware agent can budget the round-trip before fetching anything:

```json
{
  "count": 8,
  "currency": "USDC",
  "network": "eip155:8453",
  "skills": [
    {
      "slug": "auto-audio-player",
      "title": "Auto-Discovering Audio Player",
      "price": "$0.10",
      "previewUrl": "https://mondello.dev/skills/auto-audio-player",
      "rawUrl": "https://mondello.dev/skills/auto-audio-player/raw",
      "installCommand": "curl -sSL https://mondello.dev/skills/auto-audio-player/install | bash"
    }
    // …7 more
  ]
}
```
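The top-level `currency` and `network` fields make that budgeting trivial. A minimal sketch, assuming the `$0.10` price-string format above (both helpers are mine, not part of the site):

```typescript
// Sketch: an agent budgeting against the catalog before fetching anything.
// parsePrice and withinBudget are illustrative helpers, not mcp.ts source.
function parsePrice(price: string): number {
  // "$0.10" -> 0.10; the catalog prices all use this dollar-string format.
  const n = Number(price.replace(/[^0-9.]/g, ""));
  if (Number.isNaN(n)) throw new Error(`Unparseable price: ${price}`);
  return n;
}

function withinBudget(prices: string[], budgetUsd: number): boolean {
  // Sum the skills the agent intends to buy and compare against its cap.
  return prices.reduce((sum, p) => sum + parsePrice(p), 0) <= budgetUsd;
}
```

One `list_skills` call gives the agent everything this check needs, before any money moves.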

`get_skill_preview` returns the [free preview](/skills/hello-claude) for one skill by slug. Not the body — the body is behind x402 at `/raw` and stays there. The preview comes from a dedicated `preview` portableText field I author in the admin UI alongside the paid body, with the excerpt as a fallback. A few paragraphs that demonstrate the shape of the skill so an agent (or a human) can decide if it's worth buying.
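The fallback itself is a one-liner. A sketch of the shape, with an assumed `Skill` type (the real field names in `mcp.ts` may differ):

```typescript
// Sketch of the preview-with-excerpt-fallback resolution described above.
// The Skill interface is an assumption for illustration, not the site's schema.
interface Skill {
  slug: string;
  title: string;
  excerpt: string;
  preview?: string; // rendered from the authored preview field, when present
}

function resolvePreview(skill: Skill): string {
  // Prefer the dedicated preview; fall back to the excerpt.
  return skill.preview?.trim() || skill.excerpt;
}
```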

Both wire into the existing JSON-RPC 2.0 dispatcher alongside `search_cues`, `list_posts`, `get_post_markdown`, and the rest. No special-casing, no new auth, no new routes. Two more entries in the tool catalog.
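That wiring pattern can be sketched as a plain handler map; the names and shapes below are illustrative, not the actual `mcp.ts` source:

```typescript
// Illustrative JSON-RPC 2.0 tool dispatch; the real mcp.ts shape may differ.
type ToolHandler = (args: Record<string, unknown>) => any;

const tools: Record<string, ToolHandler> = {
  // ...the fourteen existing content tools (list_posts, search_cues, ...) register here
  list_skills: () => ({ count: 0, currency: "USDC", network: "eip155:8453", skills: [] }),
  get_skill_preview: (args) => ({ slug: args.slug, preview: "(free preview text)" }),
};

// One dispatcher handles every tools/call; a new tool is just a new map entry.
function handleToolsCall(req: { id: number; params: { name: string; arguments?: Record<string, unknown> } }) {
  const handler = tools[req.params.name];
  if (!handler) {
    return { jsonrpc: "2.0", id: req.id, error: { code: -32602, message: `Unknown tool: ${req.params.name}` } };
  }
  return { jsonrpc: "2.0", id: req.id, result: handler(req.params.arguments ?? {}) };
}
```

The "two more entries in the tool catalog" claim falls out of this shape: the dispatcher never changes, only the map does.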

## The walk, end to end

Here's what an MCP-connected agent does now to discover, preview, and install a skill:

1. `tools/list` — sees `list_skills` and `get_skill_preview` in the catalog alongside the content tools.
2. `list_skills` — gets the eight skills, picks the one that matches the task at hand, notes the price.
3. `get_skill_preview(slug)` — reads the preview, decides the skill is relevant.
4. `GET /skills/<slug>/raw` with x402 payment header — settles $0.10 USDC on Base, receives the markdown.
5. Writes to `~/.claude/skills/<slug>/skill.md` and the skill is live for the rest of the session.
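On the wire, the first three steps are JSON-RPC 2.0 envelopes POSTed to `/api/mcp`. A sketch (the `rpc` helper is mine; the tool names come from the catalog above):

```typescript
// Build the JSON-RPC 2.0 envelopes for the first three steps of the walk.
// rpc() is an illustrative helper; only the endpoint and tool names come from the post.
function rpc(id: number, method: string, params?: Record<string, unknown>) {
  return { jsonrpc: "2.0" as const, id, method, ...(params ? { params } : {}) };
}

const step1 = rpc(1, "tools/list");
const step2 = rpc(2, "tools/call", { name: "list_skills", arguments: {} });
const step3 = rpc(3, "tools/call", {
  name: "get_skill_preview",
  arguments: { slug: "auto-audio-player" },
});

// The payment step leaves MCP entirely: a plain GET against the x402-gated raw URL.
// An x402-aware client retries the initial 402 response with a signed payment header:
// await fetch("https://mondello.dev/skills/auto-audio-player/raw", {
//   headers: { "X-PAYMENT": signedPaymentPayload }, // x402 retry (sketch, not run here)
// });
```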

Five HTTP requests. One payment. No HTML in sight. The win isn't raw latency — a single curl of a well-structured index beats five round-trips on the stopwatch. The win is *stability*. Every step returns a typed payload with a stable schema. Nothing renders. Nothing lazy-loads. The CSS can change tomorrow and this flow is unaffected.

## Why this matters

Yes, "build an API, not a scraper" is a fifteen-year-old argument with a new acronym. What's new isn't the principle. It's that MCP is a single client-side protocol the biggest agent clients already speak, so "ship an API" no longer means "and convince every consumer to integrate with it separately." That asymmetry is the unlock.

Most sites that say "agents welcome" mean "we won't block you." They ship HTML and hope the scraper holds. It holds until the layout shifts, an ad renders mid-content, or a component loads async. Well-structured semantic HTML is a real contract — I'm not pretending otherwise — and `/skills.json` is a legitimate middle ground for any agent that'd rather fetch a typed manifest than speak RPC. But scrapers are still a hack we tolerate because writing an API felt like too much work. MCP makes the alternative easy. The protocol is small. The transport is HTTP. The tool shape is a name, a description, and a JSON schema. If a site already has an MCP server, a new tool is fifty lines.
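"A name, a description, and a JSON schema" is the whole surface area. A sketch of one descriptor of the kind a `tools/list` response carries (the descriptor content is illustrative, not the site's actual metadata):

```typescript
// Minimal MCP tool descriptor: a name, a description, and a JSON Schema for inputs.
// The field layout follows the MCP tools/list shape; the content is illustrative.
const getSkillPreviewTool = {
  name: "get_skill_preview",
  description: "Return the free preview for one skill, by slug.",
  inputSchema: {
    type: "object" as const,
    properties: {
      slug: { type: "string", description: "Skill slug, e.g. auto-audio-player" },
    },
    required: ["slug"],
  },
};
```

Everything a client needs to call the tool correctly is in that one object, which is why adding a tool to an existing server is cheap.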

An agent interacting with mondello.dev doesn't need to know what a blog is. It doesn't need to know what HTML looks like. It needs to know MCP and x402, both documented in under ten pages each. Everything else is the site telling the agent, in the protocols the agent already understands, exactly what's for sale, what it costs, and how to buy it.

The missing piece is discovery. An agent only calls `list_skills` if someone first registers the MCP server. `/.well-known/mcp` and agent-side registries are the obvious fixes, and neither is in place today. Until they are, MCP discovery is an out-of-band step — somebody has to tell the agent where to look. That's the live problem.

## The bigger pattern

Same pattern, everywhere on this blog:

- Posts are served as Markdown with YAML frontmatter at `/<slug>.md`, not just HTML.

- Search results come back with audio-timestamped deep-links in `/api/search/cues.json`, not embedded in a page that you have to render and click.

- The full archive is one MCP call (`get_archive`) away, returning every post as one concatenated Markdown document.

- Even the visual pieces — the OG cards with per-post [audio waveforms](/skills/auto-audio-player), the social share previews — are SVG files served directly.
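Consuming that Markdown surface takes a frontmatter split, not an HTML parser. A naive sketch (the helper is mine, and assumes posts open with a `---` fence, as this one does):

```typescript
// Split YAML frontmatter from the body of a /<slug>.md response.
// Deliberately naive: assumes the document starts with a "---" fence.
function splitFrontmatter(doc: string): { frontmatter: string; body: string } {
  const match = doc.match(/^---\r?\n([\s\S]*?)\r?\n---\r?\n?/);
  if (!match) return { frontmatter: "", body: doc };
  return { frontmatter: match[1], body: doc.slice(match[0].length) };
}
```

Ten lines of string handling against a stable text format, versus a DOM walk against a layout that can change tomorrow.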

The blog is the UI for humans. The protocols are the UI for agents. Same content, same database, same authoring workflow. Nothing duplicated. Nothing "adapted for AI." Posts are posts. Skills are skills. MCP and x402 just mean both audiences can consume them without one pretending to be the other.

If you want to see it from the agent side:

```bash
claude mcp add mondello https://mondello.dev/api/mcp
```

Register the server, ask Claude for the skills catalog, and it has a typed tool that fits — it'll reach for `list_skills` before it reaches for fetch. Five round-trips, one payment, no HTML in sight. The discovery is still on you. The rest isn't.

The build-and-ship ritual that deploys this post — Astro build, post-build hook patching, `wrangler deploy` — is itself a skill at [/skills/emdash-deploy](/skills/emdash-deploy). Same pattern: one curl command, x402 USDC, the recipe lives in your `~/.claude/skills/` directory after payment.
