TL;DR

Firecrawl's /crawl endpoint can now monitor an entire website for changes by adding changeTracking to scrapeOptions.formats. Every URL returned is flagged new, same, changed, or removed, with an optional git-diff (free) or JSON-schema diff (5 credits/page). What used to be a custom pipeline of scrape → hash → state DB → diff renderer is now a single API flag.

What's new

Change tracking originally launched on /scrape in April 2025 (Launch Week III, Day 1). The unlock now is that the same primitive works inside /crawl and /batch/scrape — so you can fire one job at a documentation tree, a competitor pricing site, or a product catalog and get back a per-page change verdict for the whole site. Version 2.6.0 (Nov 2025) made the comparison engine faster and more reliable, and recent 2026 releases added onlyCleanContent to strip nav and ads before diffing.

How it works

You tell Firecrawl to compute change tracking the same way you'd ask for markdown:

POST /v2/crawl
{
  "url": "https://example.com",
  "limit": 50,
  "scrapeOptions": {
    "formats": ["markdown", "changeTracking"]
  }
}
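For illustration, the request above can be assembled in Python with only the standard library. This is a minimal sketch, not an official client: the base URL, the Bearer-token header, and the helper name build_crawl_request are assumptions that this post does not state, and the request is built but never sent.

```python
import json
import urllib.request

# Assumed base URL; check it against your Firecrawl account or self-hosted instance.
API_BASE = "https://api.firecrawl.dev"


def build_crawl_request(url: str, limit: int, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v2/crawl request with change tracking on."""
    payload = {
        "url": url,
        "limit": limit,
        "scrapeOptions": {
            # markdown is listed next to changeTracking, as in the JSON body above
            "formats": ["markdown", "changeTracking"],
        },
    }
    return urllib.request.Request(
        f"{API_BASE}/v2/crawl",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer authentication is an assumption in this sketch
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_crawl_request("https://example.com", limit=50, api_key="YOUR_API_KEY")
# Passing req to urllib.request.urlopen would submit the job; it is omitted to keep the sketch offline.
```

Keeping the payload in one helper makes it easier to hold the scrape configuration stable between runs.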

The markdown format must always accompany changeTracking — comparisons run on the markdown body, not raw HTML. Each page in the response gets a changeTracking object with these fields:

  • changeStatus: new, same, changed, or removed
  • previousScrapeAt: ISO timestamp of the last scrape, or null on first run
  • visibility: visible (linked from nav) or hidden (URL works but unlinked)
  • diff: line-level changes when git-diff mode is on
  • json: structured field comparison when JSON mode is on
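As a sketch of how these fields might be consumed, the snippet below groups crawled pages by changeStatus. The page shape used here (a metadata.sourceURL key next to changeTracking) and the sample data are assumptions for illustration, not a documented response.

```python
from collections import defaultdict


def group_by_status(pages: list[dict]) -> dict[str, list[str]]:
    """Map each changeStatus value to the URLs that reported it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for page in pages:
        tracking = page.get("changeTracking")
        if tracking is None:
            continue  # page was scraped without change tracking
        # metadata.sourceURL is an assumed location for the page URL
        url = page.get("metadata", {}).get("sourceURL", "<unknown>")
        groups[tracking["changeStatus"]].append(url)
    return dict(groups)


# Hypothetical pages, shaped after the field list above
sample_pages = [
    {
        "metadata": {"sourceURL": "https://example.com/"},
        "changeTracking": {
            "changeStatus": "same",
            "previousScrapeAt": "2025-11-01T00:00:00Z",
            "visibility": "visible",
        },
    },
    {
        "metadata": {"sourceURL": "https://example.com/pricing"},
        "changeTracking": {
            "changeStatus": "changed",
            "previousScrapeAt": "2025-11-01T00:00:00Z",
            "visibility": "visible",
        },
    },
    {
        "metadata": {"sourceURL": "https://example.com/blog/launch"},
        "changeTracking": {
            "changeStatus": "new",
            "previousScrapeAt": None,  # first time this URL was seen
            "visibility": "visible",
        },
    },
]

summary = group_by_status(sample_pages)
for status, urls in summary.items():
    print(f"{status}: {len(urls)} page(s)")
```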

Two diff modes

Pick the mode that matches what you actually need to react to:

| Mode | What you get | Cost | Best for |
|---|---|---|---|
| git-diff | Line-by-line text diff with add/del/normal markers + structured chunks | Free | Docs, blog posts, ToS, policy pages |
| json | Schema-extracted fields compared old vs new | 5 credits/page | Pricing, stock, SKU, structured data |

Combine both by passing modes: ["git-diff", "json"] when you want the full text diff and a structured field-level summary in one call.

Use cases that actually move the needle

  • Competitor pricing intel: run a daily crawl of a competitor's pricing page in JSON mode against a {plan, price, billing_period} schema and alert on any field delta.
  • Doc-driven RAG hygiene: crawl API docs weekly and re-embed only the pages flagged changed, which avoids the cost of re-indexing the whole corpus.
  • Inventory and SKU tracking: monitor product feeds for in-stock / out-of-stock and price shifts at scale.
  • Compliance watch: track ToS, privacy policy, and regulatory page edits with git-diff for an auditable record.
  • News and market reports: let financial tools fetch only the reports that were modified, not the whole archive.
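The pricing-alert case above might reduce to a small comparison step. In this sketch the JSON-mode comparison is assumed to map each schema field to a pair of previous and current values; that shape, the function name, and the sample values are illustrative assumptions and should be checked against a real response.

```python
def changed_fields(comparison: dict[str, dict]) -> dict[str, tuple]:
    """Return {field: (previous, current)} for every field whose value moved."""
    return {
        field: (values.get("previous"), values.get("current"))
        for field, values in comparison.items()
        if values.get("previous") != values.get("current")
    }


# Hypothetical comparison for a {plan, price, billing_period} schema
sample_comparison = {
    "plan": {"previous": "Pro", "current": "Pro"},
    "price": {"previous": "$49", "current": "$59"},
    "billing_period": {"previous": "monthly", "current": "monthly"},
}

deltas = changed_fields(sample_comparison)
if deltas:
    # Replace print with whatever alerting channel the pipeline already uses
    print(f"Pricing changed: {deltas}")
```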

Tagging: track the same URL on multiple cadences

The tag parameter lets you keep parallel histories for the same URL. Run an hourly tag for the homepage and a weekly tag for the same page if you want both signal levels — comparisons stay scoped to matching tag, team, URL, and markdown config.

{
  "type": "changeTracking",
  "modes": ["git-diff", "json"],
  "tag": "hourly",
  "schema": { "type": "object", "properties": { "price": { "type": "string" } } }
}
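To run the same URL on two cadences, one option is to generate the format object per tag and reuse everything else. The helpers below are a hypothetical sketch that only builds request bodies. The field names mirror the JSON above, and placing the object in scrapeOptions.formats next to markdown is an assumption based on the earlier request.

```python
from typing import Optional


def tracking_format(tag: str, modes: list[str], schema: Optional[dict] = None) -> dict:
    """Build a changeTracking format object scoped to one tag."""
    fmt: dict = {"type": "changeTracking", "modes": modes, "tag": tag}
    if schema is not None:
        fmt["schema"] = schema  # only meaningful when "json" is among the modes
    return fmt


def crawl_body(url: str, limit: int, fmt: dict) -> dict:
    """Wrap a format object in a crawl request body (assumed placement)."""
    return {
        "url": url,
        "limit": limit,
        "scrapeOptions": {"formats": ["markdown", fmt]},
    }


price_schema = {"type": "object", "properties": {"price": {"type": "string"}}}

# Same URL, same modes and schema; only the tag differs between the two histories
hourly = crawl_body(
    "https://example.com", 50,
    tracking_format("hourly", ["git-diff", "json"], price_schema),
)
weekly = crawl_body(
    "https://example.com", 50,
    tracking_format("weekly", ["git-diff", "json"], price_schema),
)
```

Holding everything but the tag constant keeps each history comparable with its own earlier runs.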

How it compares

Plenty of tools watch web pages for changes — Visualping, Distill, Hexowatch — but they're built around a UI dashboard. Firecrawl's bet is different: change tracking is a format, not a product. You get the same primitive in the same call you already make for scraping, so it drops into existing RAG pipelines, agents, and ETL jobs without a separate vendor. Compared to rolling your own (markdown hash + state DB + diff renderer), you skip the snapshot store, the diff library, and the schema-extraction LLM call — Firecrawl already does all three.

Limitations & pricing

  • Snapshots are persistent and never expire — long gaps between scrapes still produce valid diffs.
  • Different includeTags, excludeTags, or onlyMainContent values across runs produce unreliable comparisons. Keep your config stable per tag.
  • Change tracking requests bypass index caching and ignore maxAge — expect a fresh fetch every time.
  • Pricing: change status and git-diff cost nothing beyond the standard scrape credits; JSON mode costs 5 credits per page because it runs LLM extraction.
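As a quick worked example of the JSON-mode figure: at 5 credits per page, a 50-page crawl spends 250 credits on JSON comparisons, and running it daily for 30 days spends 7,500. The helper below is a hypothetical sketch that covers only the JSON-mode portion; standard scrape credits are not modeled because this post does not state them.

```python
JSON_MODE_CREDITS_PER_PAGE = 5  # figure quoted in the pricing note above


def json_mode_credits(pages: int, runs: int = 1) -> int:
    """Credits attributable to JSON-mode change tracking only."""
    if pages < 0 or runs < 0:
        raise ValueError("pages and runs must be non-negative")
    return pages * runs * JSON_MODE_CREDITS_PER_PAGE


per_crawl = json_mode_credits(50)            # one 50-page crawl
per_month = json_mode_credits(50, runs=30)   # the same crawl run daily for 30 days
print(per_crawl, per_month)
```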

What's next

Pair the crawl with webhooks (already supported on async jobs) so you receive changed pages as they're processed instead of polling for the final result. Or skip the plumbing entirely and try Firecrawl Observer, the open-source monitoring dashboard built on top of this same API.

Sources: Firecrawl docs, Launch Week III announcement, Changelog, Firecrawl on X.