Firecrawl /crawl + changeTracking: Monitor an entire site for pricing, docs, and inventory shifts with one flag

TL;DR

Firecrawl's /crawl endpoint can now monitor an entire website for changes by adding changeTracking to scrapeOptions.formats. Every URL returned is flagged new, same, changed, or removed, with an optional git-diff (free) or JSON-schema diff (5 credits/page). What used to be a custom pipeline of scrape → hash → state DB → diff renderer is now a single API flag.

What's new

Change tracking originally launched on /scrape in April 2025 (Launch Week III, Day 1). What's new is that the same primitive works inside /crawl and /batch/scrape, so you can fire one job at a documentation tree, a competitor pricing site, or a product catalog and get back a per-page change verdict for the whole site. Version 2.6.0 (Nov 2025) made the comparison engine faster and more reliable, and recent 2026 releases added onlyCleanContent to strip nav and ads before diffing.

How it works

You tell Firecrawl to compute change tracking the same way you'd ask for markdown:

POST /v2/crawl
{
  "url": "https://example.com",
  "limit": 50,
  "scrapeOptions": {
    "formats": ["markdown", "changeTracking"]
  }
}
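
As a rough illustration, the request above could be sent from Python using only the standard library. The base URL https://api.firecrawl.dev, the Bearer-token header, and the helper names build_crawl_payload and start_crawl are assumptions made for this sketch, not details stated in this article; check them against the current Firecrawl API reference before relying on them.

```python
import json
import urllib.request

# Assumed base URL; not stated in this article.
API_BASE = "https://api.firecrawl.dev"


def build_crawl_payload(url: str, limit: int = 50) -> dict:
    """Build the /v2/crawl request body shown above."""
    return {
        "url": url,
        "limit": limit,
        "scrapeOptions": {
            # markdown is listed alongside changeTracking, as in the request above
            "formats": ["markdown", "changeTracking"],
        },
    }


def start_crawl(api_key: str, payload: dict) -> dict:
    """POST the payload and return the decoded JSON reply (hypothetical helper)."""
    request = urllib.request.Request(
        f"{API_BASE}/v2/crawl",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed auth scheme: a Bearer token in the Authorization header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the payload builder separate from the network call makes the request shape easy to unit-test without an API key.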

The markdown format must always accompany changeTracking — comparisons run on the markdown body, not raw HTML. Each page in the response gets a changeTracking object with these fields:

changeStatus: new, same, changed, or removed
previousScrapeAt: ISO timestamp of the last scrape, or null on first run
visibility: visible (URL discovered through links on other pages or the sitemap) or hidden (URL still resolves but is no longer discoverable that way)
diff: line-level changes when git-diff mode is on
json: structured field comparison when JSON mode is on
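
One way to consume these fields is to validate them into a small typed record before acting on them. The sketch below assumes each page arrives as a dict with a changeTracking key; the ChangeInfo class and parse_change_tracking function are hypothetical names introduced here, not part of any Firecrawl SDK.

```python
from dataclasses import dataclass
from typing import Any, Optional

# The four statuses listed above.
VALID_STATUSES = {"new", "same", "changed", "removed"}


@dataclass
class ChangeInfo:
    status: str
    previous_scrape_at: Optional[str]  # None on the first run
    visibility: Optional[str]
    diff: Any = None         # present only when git-diff mode is on
    json_fields: Any = None  # present only when JSON mode is on


def parse_change_tracking(page: dict) -> ChangeInfo:
    """Extract and validate the changeTracking object of one page."""
    raw = page.get("changeTracking")
    if raw is None:
        raise ValueError("page has no changeTracking object")
    status = raw.get("changeStatus")
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected changeStatus: {status!r}")
    return ChangeInfo(
        status=status,
        previous_scrape_at=raw.get("previousScrapeAt"),
        visibility=raw.get("visibility"),
        diff=raw.get("diff"),
        json_fields=raw.get("json"),
    )
```

Failing loudly on an unknown status keeps downstream alerting from silently ignoring pages if the set of statuses ever grows.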

Two diff modes

Pick the mode that matches what you actually need to react to:

Mode	What you get	Cost	Best for
git-diff	Line-by-line text diff with add/del/normal markers + structured chunks	Free	Docs, blog posts, ToS, policy pages
json	Schema-extracted fields compared old vs new	5 credits/page	Pricing, stock, SKU, structured data

Combine both by passing modes: ["git-diff", "json"] when you want the full text diff and a structured field-level summary in one call.
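
To react to JSON mode, you typically reduce the comparison to the fields that actually moved. The sketch below assumes the json object maps each schema field to a pair of the form {"previous": ..., "current": ...}; that shape is an assumption for illustration and should be checked against a real response. The function name changed_fields is hypothetical.

```python
def changed_fields(json_comparison: dict) -> dict:
    """Return {field: (previous, current)} for fields whose value differs.

    Assumes each entry looks like {"previous": ..., "current": ...}.
    """
    deltas = {}
    for field, values in json_comparison.items():
        previous = values.get("previous")
        current = values.get("current")
        if previous != current:
            deltas[field] = (previous, current)
    return deltas
```

For example, a comparison where only price differs yields a single-entry dict, which is a convenient payload for an alert.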

Use cases that actually move the needle

Competitor pricing intel: daily crawl of a competitor's pricing page, JSON mode against a {plan, price, billing_period} schema, alert on any field delta.
Doc-driven RAG hygiene: weekly crawl of API docs, re-embedding only pages flagged new or changed instead of re-indexing the whole corpus.
Inventory and SKU tracking: monitor product feeds for in-stock / out-of-stock and price shifts at scale.
Compliance watch: track ToS, privacy policy, and regulatory page edits with git-diff for an auditable record.
News and market reports: financial tools fetch only the reports that changed, not the whole archive.
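
The RAG use case comes down to turning per-page statuses into an indexing plan. A minimal sketch follows, assuming each page dict carries its URL at metadata.sourceURL (an assumption about the response shape) and a changeTracking.changeStatus value; plan_reindex is a hypothetical helper name.

```python
def plan_reindex(pages: list[dict]) -> dict[str, list[str]]:
    """Split crawled pages into URLs to embed, delete, or skip."""
    plan: dict[str, list[str]] = {"embed": [], "delete": [], "skip": []}
    for page in pages:
        url = page["metadata"]["sourceURL"]  # assumed location of the page URL
        status = page["changeTracking"]["changeStatus"]
        if status in ("new", "changed"):
            plan["embed"].append(url)   # content is new or differs: re-embed
        elif status == "removed":
            plan["delete"].append(url)  # page is gone: drop its vectors
        else:
            plan["skip"].append(url)    # "same": the existing embedding is still valid
    return plan
```

Only the embed bucket costs embedding calls, so a mostly unchanged docs site produces a correspondingly small bill.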

Tagging: track the same URL on multiple cadences

The tag parameter lets you keep parallel histories for the same URL. Run one job tagged hourly and another tagged weekly against the same homepage if you want both signal levels; comparisons stay scoped to the matching tag, team, URL, and markdown config. The tag is set on the changeTracking format object:

{
  "type": "changeTracking",
  "modes": ["git-diff", "json"],
  "tag": "hourly",
  "schema": { "type": "object", "properties": { "price": { "type": "string" } } }
}
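
To keep cadences from drifting apart, it can help to generate the format object from one function and vary only the tag. This is a sketch under the assumption that the object above is passed as an entry in scrapeOptions.formats next to "markdown"; the helper names change_tracking_format and scrape_options_for are hypothetical.

```python
from typing import Optional, Sequence


def change_tracking_format(
    tag: str,
    modes: Sequence[str] = ("git-diff",),
    schema: Optional[dict] = None,
) -> dict:
    """Build a changeTracking format object like the one shown above."""
    fmt: dict = {"type": "changeTracking", "modes": list(modes), "tag": tag}
    if schema is not None:
        fmt["schema"] = schema  # only meaningful when "json" is among the modes
    return fmt


def scrape_options_for(tag: str, **kwargs) -> dict:
    """Pair the changeTracking object with markdown in scrapeOptions."""
    return {"formats": ["markdown", change_tracking_format(tag, **kwargs)]}
```

Calling scrape_options_for("hourly") and scrape_options_for("weekly") produces two configurations that differ only in tag, which keeps the rest of the config stable per cadence.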

How it compares

Plenty of tools watch web pages for changes — Visualping, Distill, Hexowatch — but they're built around a UI dashboard. Firecrawl's bet is different: change tracking is a format, not a product. You get the same primitive in the same call you already make for scraping, so it drops into existing RAG pipelines, agents, and ETL jobs without a separate vendor. Compared to rolling your own (markdown hash + state DB + diff renderer), you skip the snapshot store, the diff library, and the schema-extraction LLM call — Firecrawl already does all three.

Limitations & pricing

Snapshots are persistent and never expire — long gaps between scrapes still produce valid diffs.
Different includeTags, excludeTags, or onlyMainContent values across runs produce unreliable comparisons. Keep your config stable per tag.
Change tracking requests bypass index caching and ignore maxAge — expect a fresh fetch every time.
Pricing: change status and git-diff cost nothing beyond the standard scrape credits; JSON mode is 5 credits per page because it runs LLM extraction.

What's next

Pair the crawl with webhooks (already supported on async jobs) so you receive changed pages as they're processed instead of polling for the final result. Or skip the plumbing entirely and try Firecrawl Observer, the open-source monitoring dashboard built on top of this same API.
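
On the receiving side, a webhook handler mostly needs to filter each event down to the pages worth acting on. The sketch below assumes a per-page event whose type is "crawl.page" and whose data field is a list of page objects; both the event name and the payload shape are assumptions for illustration and should be confirmed against the webhook documentation. The function name pages_needing_action is hypothetical.

```python
# Statuses that usually warrant a reaction; "same" is ignored.
ACTIONABLE = ("new", "changed", "removed")


def pages_needing_action(event: dict) -> list[dict]:
    """Return pages from a per-page webhook event that are new, changed, or removed."""
    if event.get("type") != "crawl.page":  # assumed event name
        return []
    return [
        page
        for page in event.get("data", [])  # assumed: list of page objects
        if page.get("changeTracking", {}).get("changeStatus") in ACTIONABLE
    ]
```

Because the function is pure, it can sit behind any web framework's request handler and be tested with recorded payloads.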

Sources: Firecrawl docs, Launch Week III announcement, Changelog, Firecrawl on X.
