For most scraping workloads you don’t want raw HTML - you want clean JSON with the fields that matter. Pass parsed_data=true to /getPageSource and ScrapeUnblocker extracts structured data using the best available method for that page.
Request
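A minimal request sketch. The API host, key header, and url parameter name are assumptions for illustration; only parsed_data and the /getPageSource path come from this page. The network call itself is left commented out so the sketch stays offline.

```python
import json
import urllib.request

# Hypothetical API host; check your dashboard for the real endpoint.
API_URL = "https://api.scrapeunblocker.com/getPageSource"

payload = {
    "url": "https://example.com/some-product",  # page to scrape (assumed parameter name)
    "parsed_data": True,  # ask for structured extraction instead of raw HTML
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted to keep the sketch offline.
print(json.loads(req.data)["parsed_data"])  # True
```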
Response shape
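An illustrative response for a product page. The three top-level fields (page_type, source, data) are documented below; the concrete values are examples, not a guaranteed schema:

```json
{
  "page_type": "product",
  "source": "schema_org",
  "data": {
    "title": "Example Widget",
    "price": "19.99"
  }
}
```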
page_type
Detected category of the page. Common values:
- product - e-commerce product detail page
- listing - search results or category page
- article - news, blog, or editorial content
- job - job posting
- real_estate - property listing
- unknown - extractor could not classify the page
source
Which extraction strategy produced the data:
| Source | What it means |
|---|---|
| schema_org | The page exposed JSON-LD or microdata using schema.org vocabulary. Most reliable. |
| next_data | Extracted from a Next.js __NEXT_DATA__ script block. Common on modern e-commerce. |
| nuxt_data | Extracted from a Nuxt __NUXT__ block. |
| og_meta | Fell back to OpenGraph / Twitter Card meta tags. Limited fields but always normalized. |
| ai_rule | Custom selector rule generated by AI for this domain. Used when no structured data is available. |
data
The extracted fields. The schema depends on page_type. Field names are normalized across sources: a product always has title and price regardless of whether source is schema_org or ai_rule.
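Because field names are normalized, downstream code can ignore which extractor produced the data. A sketch, with a hand-written dict standing in for a real API response:

```python
def summarize_product(resp: dict) -> str:
    """Build a one-line summary from a parsed_data response.

    Works identically whether source was schema_org, next_data,
    or ai_rule, because field names are normalized.
    """
    if resp.get("page_type") != "product":
        raise ValueError(f"expected a product page, got {resp.get('page_type')}")
    data = resp["data"]
    return f"{data['title']} @ {data['price']} (via {resp['source']})"


# Stand-in for a real API response:
resp = {
    "page_type": "product",
    "source": "ai_rule",
    "data": {"title": "Example Widget", "price": "19.99"},
}
print(summarize_product(resp))  # Example Widget @ 19.99 (via ai_rule)
```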
When parsed data is the right choice
Use it when you’re scraping a known page type at scale - products, articles, listings, jobs. It saves you from writing per-site parsers.
Combining with get_cookies
You can set both parsed_data=true and get_cookies=true on the same request. The response gains a cookies field and a proxy field alongside data:
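A sketch of the combined shape. The request payload and the concrete cookie and proxy values are illustrative; the page only guarantees that cookies and proxy appear alongside data:

```python
# Both flags on one request (payload shape as documented above):
payload = {
    "url": "https://example.com/some-product",  # assumed parameter name
    "parsed_data": True,
    "get_cookies": True,
}

# Illustrative combined response: cookies and proxy sit alongside data.
resp = {
    "page_type": "product",
    "source": "next_data",
    "data": {"title": "Example Widget", "price": "19.99"},
    "cookies": [{"name": "session_id", "value": "abc123"}],
    "proxy": "203.0.113.7:8080",
}

# The extra top-level fields gained by get_cookies=true:
print(sorted(set(resp) - {"page_type", "source", "data"}))  # ['cookies', 'proxy']
```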

