
Content gateway

Sell access to web content (articles, research, premium docs) through APIHub. Verify your domain, set URL rules, pick a protection pattern.

How it works

Agents call proxy.apihub.io/{your-slug}/fetch?url=... with a target URL. The proxy:

  1. Looks up the verified domain by hostname
  2. Finds the most-specific matching pricing rule by URL pattern
  3. Charges the agent that rule's price
  4. Fetches the URL via headless browser
  5. Returns clean extracted text + structure (no nav, no ads)
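As a rough illustration of the agent side of this flow, the sketch below builds the gateway URL for a target page. It assumes the proxy is served over HTTPS and that the target URL should be percent-encoded in the query string; the slug `example-blog` is hypothetical. How the agent authenticates and pays is not covered on this page, so it is left out.

```python
from urllib.parse import urlencode

# Assumption: the proxy is reached over HTTPS; the docs give only the host.
PROXY_BASE = "https://proxy.apihub.io"


def build_fetch_url(slug: str, target_url: str) -> str:
    """Return the gateway URL that fetches target_url through a provider's service."""
    # urlencode percent-encodes the target so its own ':' and '/' survive as one parameter.
    query = urlencode({"url": target_url})
    return f"{PROXY_BASE}/{slug}/fetch?{query}"


if __name__ == "__main__":
    # "example-blog" is a hypothetical slug used only for illustration.
    print(build_fetch_url("example-blog", "https://blog.example.com/premium/article-1"))
```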

1. Create a content service

From your dashboard, click New service and pick "content" as the service type. The slug becomes the path: proxy.apihub.io/{slug}/fetch.

2. Verify your domain

Before any pricing rules apply, you must prove you own the domain. There are two methods:

DNS TXT record

Add a TXT record on your apex domain:

DNS record
Type:  TXT
Name:  @ (or your apex)
Value: apihub-verify=<your-token>

We query Cloudflare DNS-over-HTTPS to verify. Click "Verify" in the dashboard once the record propagates (usually within minutes).
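If you want to confirm the record has propagated before clicking "Verify", a check along these lines may help. It is a sketch, not APIHub's actual verifier: it queries Cloudflare's public DNS-over-HTTPS JSON endpoint and looks for the `apihub-verify=<token>` value. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import urlencode

DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"
TXT_RECORD_TYPE = 16  # numeric DNS record type for TXT


def txt_values(doh_response: dict) -> list:
    """Extract TXT strings from a DoH JSON response, dropping the surrounding quotes."""
    answers = doh_response.get("Answer", [])
    return [a["data"].strip('"') for a in answers if a.get("type") == TXT_RECORD_TYPE]


def has_verify_token(doh_response: dict, token: str) -> bool:
    """True if any TXT record equals apihub-verify=<token>."""
    return f"apihub-verify={token}" in txt_values(doh_response)


def query_txt(domain: str) -> dict:
    """Fetch the TXT records for domain as parsed JSON (performs a network call)."""
    url = f"{DOH_ENDPOINT}?{urlencode({'name': domain, 'type': 'TXT'})}"
    request = urllib.request.Request(url, headers={"Accept": "application/dns-json"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling `has_verify_token(query_txt("example.com"), "<your-token>")` would then tell you whether the record is visible from Cloudflare's resolver.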

HTML meta tag

Add this tag to your homepage:

HTML
<meta name="apihub-verify" content="<your-token>">

We send a GET request to your homepage and look for the tag. This method is faster to deploy if you control your site code.
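To sanity-check your own page before verifying, something like the following could be used. It is a minimal sketch using only the standard library's HTML parser; the class and function names are hypothetical, and APIHub's actual check may differ (for example in how it handles redirects or tag placement).

```python
from html.parser import HTMLParser


class VerifyTagParser(HTMLParser):
    """Collects the content of every <meta name="apihub-verify"> tag."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if attr_map.get("name") == "apihub-verify" and attr_map.get("content"):
            self.tokens.append(attr_map["content"])


def page_has_token(html: str, token: str) -> bool:
    """True if the page contains the verification meta tag with this token."""
    parser = VerifyTagParser()
    parser.feed(html)
    return token in parser.tokens
```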

Until verified

Requests to your domain return DOMAIN_NOT_VERIFIED. No charges happen, no content is served.

3. Add pricing rules

Each verified domain gets a list of URL-pattern rules. Each rule has:

  • URL pattern: glob-style match against the URL path (/articles/*, /premium/**, /blog/2026/*)
  • Price per request: in microdollars
  • Free preview chars (optional): if set, the response includes a truncated preview when no payment is provided
  • Wait-for selector (optional): CSS selector the headless browser waits for before extracting (for JS-rendered pages)
  • Exclude selectors (optional): CSS selectors to strip from the extracted content (nav, ads, related posts)

Rules are matched most-specific-first, with specificity measured by pattern length (longer patterns are tried first). If no rule matches the request URL, the call returns 404 NO_CONTENT_RULE with no charge.
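The matching behaviour can be approximated as follows. This is a sketch under stated assumptions, not the gateway's implementation: it assumes `*` matches within a single path segment and `**` matches across segments (the docs say only "glob-style"), and it orders rules by raw pattern length. The `Rule` type is hypothetical. Prices are in microdollars, so 2000 corresponds to $0.002.

```python
import re
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass
class Rule:
    pattern: str             # glob-style pattern matched against the URL path
    price_microdollars: int  # 1 microdollar = 0.000001 USD


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob into a regex. Assumed semantics: '*' stays in one segment, '**' does not."""
    pieces = []
    for part in re.split(r"(\*\*|\*)", pattern):
        if part == "**":
            pieces.append(".*")
        elif part == "*":
            pieces.append("[^/]*")
        else:
            pieces.append(re.escape(part))
    return re.compile("".join(pieces))


def match_rule(rules: List[Rule], url: str) -> Optional[Rule]:
    """Return the longest-pattern rule whose glob matches the URL path, or None."""
    path = urlparse(url).path
    for rule in sorted(rules, key=lambda r: len(r.pattern), reverse=True):
        if glob_to_regex(rule.pattern).fullmatch(path):
            return rule
    return None  # the gateway would answer 404 NO_CONTENT_RULE here
```

With rules for `/blog/**` and `/blog/2026/*`, a request for `/blog/2026/post` would pick the second rule because its pattern is longer.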

4. Pick a protection pattern

The proxy fetches your URLs as a normal browser would. Whether direct scrapers can also reach the content is up to your backend. Three patterns:

Pattern A: Soft paywall

The page is publicly readable. You rely on APIHub being the easiest legitimate way for agents to consume your content; direct scrapers can technically still read it, but they sit outside your distribution funnel.

Best for: blogs, news, indexable reference content.
Setup: just verify the domain and add pricing rules.

Pattern B: IP whitelist

Your CDN/origin firewall blocks all traffic except Cloudflare IPs. The proxy fetches via Cloudflare, so it's naturally allowed.

Best for: mid-value gated content where stopping casual scrapers is enough.
Caveat: any other scraper whose traffic comes from Cloudflare IPs could also get through; combine with Pattern C for stronger guarantees.
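Pattern B is normally enforced at the CDN or firewall, but if you need the same check in application code, a sketch could look like this. The ranges below are documentation placeholders (RFC 5737 and RFC 3849), not real Cloudflare ranges; you would load the current list that Cloudflare publishes instead. The function name is hypothetical.

```python
import ipaddress

# Placeholder ranges for illustration only. Replace with Cloudflare's published IP ranges.
ALLOWED_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed(client_ip: str) -> bool:
    """True if client_ip falls inside any allowlisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in ALLOWED_RANGES)
```

Make sure `client_ip` is the connecting peer's address as seen by your edge, not a value taken from a header that a client can forge.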

Pattern C: Signed auth header (recommended for high-value content)

You issue APIHub a private token. The proxy injects it on every fetch. Your backend gates content based on that header. Scrapers without the token get whatever logged-out users see.

Setup: on your service's edit page, set:

Service config
upstream_auth_header: X-Provider-Auth
upstream_auth_value: sk_provider_abc123xyz

Your backend then checks the request header and serves content only when the value matches your secret. To rotate the secret, update the service config; the proxy picks up the new value immediately.
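On the backend side, the check might be sketched as below. It assumes your framework exposes request headers as a mapping and that you store the expected secret yourself; the function name is hypothetical. A constant-time comparison is used so response timing does not leak how much of the secret matched.

```python
import hmac
from typing import Mapping


def is_gateway_request(
    headers: Mapping[str, str],
    expected_secret: str,
    header_name: str = "X-Provider-Auth",
) -> bool:
    """True if the request carries the shared secret configured for the proxy."""
    # HTTP header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    presented = lowered.get(header_name.lower(), "")
    return hmac.compare_digest(presented.encode(), expected_secret.encode())
```

Requests that fail the check would get whatever your logged-out users see, as described above.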

What APIHub guarantees vs what you handle

APIHub guarantees

  • Domain ownership verification before pricing rules apply
  • Atomic per-fetch payment (no double-charges or missed charges)
  • Transaction record for every fetch attributed to your service
  • Daily batched payouts to your wallet (commission deducted)
  • Clean structured content delivery (text, headings, metadata)
  • The ability to update pricing or pause your service at any time

You handle

  • Whether your content is actually gated (Pattern A, B, or C)
  • Setting URL patterns and per-pattern prices
  • Content quality (the proxy returns whatever your upstream serves)
  • Rotating the upstream auth secret if you use Pattern C
  • Wallet address configuration for payouts

Response shape

A successful fetch returns the clean structured content. If the rule has free preview chars configured and the agent has no balance, the agent receives a truncated preview instead.

Response
{
  "ok": true,
  "data": {
    "source_url": "https://blog.example.com/premium/article-1",
    "title": "The Future of AI Agents",
    "description": "An in-depth analysis...",
    "text": "Full article text...",
    "headings": [
      { "level": 1, "text": "The Future of AI Agents" },
      { "level": 2, "text": "Introduction" }
    ],
    "word_count": 1200,
    "truncated": false
  }
}
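For reference, an agent consuming this response might parse it roughly as follows. The field names come from the example above; the `FetchedContent` type and helper names are hypothetical, and the shape of error responses is not documented here, so the sketch simply raises when `ok` is not true.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FetchedContent:
    source_url: str
    title: str
    text: str
    word_count: int
    truncated: bool
    headings: List[Tuple[int, str]]  # (level, text) pairs in document order


def parse_fetch_response(body: str) -> FetchedContent:
    """Parse a successful gateway response body into a FetchedContent."""
    payload = json.loads(body)
    if not payload.get("ok"):
        # Assumption: the error shape is not specified on this page, so fail loudly.
        raise ValueError("gateway reported a failed fetch")
    data = payload["data"]
    return FetchedContent(
        source_url=data["source_url"],
        title=data["title"],
        text=data["text"],
        word_count=data["word_count"],
        truncated=data["truncated"],
        headings=[(h["level"], h["text"]) for h in data["headings"]],
    )


def outline(content: FetchedContent) -> str:
    """Render the headings as an indented outline, two spaces per level."""
    return "\n".join("  " * (level - 1) + text for level, text in content.headings)
```

Checking `truncated` before using `text` lets the agent tell a full article from a partial one.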

Recommendation

Start with Pattern A to ship fast. If you later see direct scrapers cutting into your revenue, add Pattern C without changing the URL surface.