
Extract Every URL from Any Website Instantly

Sitemap.xml Parsing | HTML Link Discovery | Up to 500 URLs | 1 credit

# pip install supacrawlx
from supacrawlx import Client

client = Client("YOUR_API_KEY")

# Map all URLs on a website
result = client.web.map(url="https://example.com")
print(f"Found {len(result.urls)} URLs")
for url in result.urls:
    print(url)

Features

What Makes This API Special

Automatic Sitemap Detection

The API fetches /sitemap.xml from the target domain automatically. No need to know the sitemap URL — just pass the root domain.

Nested Sitemap Index Support

Handles sitemap index files that link to multiple sub-sitemaps. All child sitemaps are parsed and URLs are merged into a single response.
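
To make the merge step concrete, here is a minimal local sketch of how a sitemap index can be flattened into one URL list. It is not the API's actual implementation. The helper names (`parse_sitemap`, `collect_urls`), the injected `fetch` callable, and the sample XML are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace: '{ns}loc' -> 'loc'."""
    return tag.rsplit("}", 1)[-1]


def parse_sitemap(xml_text):
    """Return (kind, locs), where kind is 'index' or 'urlset'."""
    root = ET.fromstring(xml_text)
    kind = "index" if local_name(root.tag) == "sitemapindex" else "urlset"
    locs = []
    # Each entry is a <sitemap> (index) or <url> (urlset) with a direct <loc> child.
    for entry in root:
        for child in entry:
            if local_name(child.tag) == "loc" and child.text:
                locs.append(child.text.strip())
    return kind, locs


def collect_urls(sitemap_url, fetch, seen=None):
    """Recursively merge page URLs from a sitemap or sitemap index.

    `fetch` is a hypothetical callable mapping a URL to its XML text.
    """
    seen = set() if seen is None else seen
    if sitemap_url in seen:  # guard against index files that reference each other
        return []
    seen.add(sitemap_url)
    kind, locs = parse_sitemap(fetch(sitemap_url))
    if kind == "urlset":
        return locs
    merged = []
    for child_sitemap in locs:
        merged.extend(collect_urls(child_sitemap, fetch, seen))
    return merged


NS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
# Sample documents standing in for real HTTP responses (illustrative only).
PAGES = {
    "https://example.com/sitemap.xml": (
        f"<sitemapindex {NS}>"
        "<sitemap><loc>https://example.com/sitemap-a.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/sitemap-b.xml</loc></sitemap>"
        "</sitemapindex>"
    ),
    "https://example.com/sitemap-a.xml": (
        f"<urlset {NS}><url><loc>https://example.com/</loc></url></urlset>"
    ),
    "https://example.com/sitemap-b.xml": (
        f"<urlset {NS}><url><loc>https://example.com/blog</loc></url></urlset>"
    ),
}

merged_urls = collect_urls("https://example.com/sitemap.xml", PAGES.__getitem__)
print(merged_urls)
```

Passing `fetch` in as a parameter keeps the sketch free of network calls. In practice it would be replaced by an HTTP request.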

HTML Link Discovery Fallback

When no sitemap exists, the API fetches the root page's HTML and extracts all internal links, so you still get URLs from sites that publish no sitemap.
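
As a rough illustration of link discovery, the sketch below pulls `href` values out of a page with Python's standard `html.parser` and resolves them against the page URL. It only approximates the idea and is not the API's code. `LinkCollector`, `extract_links`, and the sample HTML are hypothetical, and filtering to internal links is left as a separate step.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative paths, then drop any #fragment.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                is_web = absolute.startswith(("http://", "https://"))
                if is_web and absolute not in self.links:
                    self.links.append(absolute)


def extract_links(html, base_url):
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


SAMPLE_HTML = (
    '<a href="/about">About</a>'
    '<a href="blog/post#top">Post</a>'
    '<a href="mailto:hi@example.com">Mail</a>'
    '<a href="/about">About again</a>'
)
found = extract_links(SAMPLE_HTML, "https://example.com/")
print(found)
```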

Same-Domain Filtering

Only URLs on the same hostname are returned. External links and off-domain assets are automatically excluded from the result set.
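
If you want to apply the same kind of filter to your own URL lists, a hostname comparison along these lines is one way to do it. This is a sketch under assumptions: `filter_same_host` is a hypothetical helper, and the page does not state how the API treats subdomains or a `www.` prefix. Here they count as different hosts.

```python
from urllib.parse import urlsplit


def filter_same_host(urls, target):
    """Keep only URLs whose hostname exactly matches the target's hostname."""
    host = urlsplit(target).hostname  # lowercased by urlsplit
    if host is None:
        return []
    return [u for u in urls if urlsplit(u).hostname == host]


candidates = [
    "https://example.com/pricing",
    "https://EXAMPLE.com/docs",
    "https://cdn.example.com/app.js",  # subdomain: a different hostname
    "https://other.org/",
]
kept = filter_same_host(candidates, "https://example.com")
print(kept)
```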

SSRF Protection

Requests to loopback and private IP ranges (such as 127.x, 10.x, and 192.168.x) are blocked at the network level. Safe to use in multi-tenant environments.
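
For readers adding a similar guard in their own services, the check below shows the general idea using Python's standard `ipaddress` module. It is an application-level sketch, not the network-level blocking described above. `is_blocked_address` is a hypothetical helper, and a real guard would also resolve hostnames and check every resolved address.

```python
import ipaddress


def is_blocked_address(ip_text):
    """Return True for loopback, private, or link-local addresses."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_loopback or ip.is_private or ip.is_link_local


for candidate in ["127.0.0.1", "10.1.2.3", "192.168.0.5", "8.8.8.8"]:
    print(candidate, "blocked" if is_blocked_address(candidate) else "allowed")
```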

Limit Parameter

Control the maximum number of URLs returned (up to 500). Useful for sampling large sites or capping the scope of a downstream crawl.
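
If the limit comes from user input, it may be worth clamping it to the documented maximum of 500 before making the call. The helper below is a hypothetical client-side convenience. The keyword name `limit` in the trailing comment is an assumption based on the feature name, not a confirmed signature.

```python
MAX_URLS = 500  # maximum stated on this page


def clamp_limit(requested, maximum=MAX_URLS):
    """Force a requested limit into the range 1..maximum."""
    return max(1, min(int(requested), maximum))


print(clamp_limit(50), clamp_limit(10_000), clamp_limit(0))

# Assumed usage; the `limit` keyword is not confirmed by this page:
# result = client.web.map(url="https://example.com", limit=clamp_limit(10_000))
```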

Use Cases

Platform-Specific Workflows

SEO Auditing

Get a complete URL list before running an SEO audit. Feed the URLs into your crawler, broken link checker, or page speed tester.

SEO | Audit | Tools

Broken Link Detection

Collect all URLs from a site in one call, then probe each one for 404s, redirects, or slow response times at scale.

QA | Monitoring | Links
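
Once you have the URL list, the probing step might look something like the sketch below. Everything in it is an assumption made for illustration: the `classify` and `audit` helpers, the 2-second slow threshold, and the injected `probe` callable. In a real check, `probe` would issue an HTTP request and time it.

```python
def classify(status, seconds, slow_after=2.0):
    """Label one probe result as broken, redirect, slow, or ok."""
    if status is None or status >= 400:
        return "broken"
    if 300 <= status < 400:
        return "redirect"
    if seconds > slow_after:
        return "slow"
    return "ok"


def audit(urls, probe):
    """Map each URL to a label; `probe(url)` returns (status, seconds)."""
    report = {}
    for url in urls:
        status, seconds = probe(url)
        report[url] = classify(status, seconds)
    return report


# Canned results standing in for real HTTP probes (illustrative only).
FAKE_RESULTS = {
    "https://example.com/": (200, 0.3),
    "https://example.com/old": (301, 0.1),
    "https://example.com/missing": (404, 0.2),
    "https://example.com/heavy": (200, 4.5),
}
report = audit(list(FAKE_RESULTS), FAKE_RESULTS.__getitem__)
for url, label in report.items():
    print(label, url)
```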

Site Structure Mapping

Understand the architecture of any website before crawling it. Use the URL list to plan targeted scraping or selective content extraction.

Research | Planning | Crawl

FAQs

Web Map API Questions

What does the Map API return?

A single urls[] array containing the URLs discovered on the target website, up to the 500-URL limit. One API call, one array of links. Nothing else.

What happens if the site has no sitemap.xml?

The API automatically falls back to HTML link discovery — fetching the root page and extracting all internal href links it finds.

Does it handle sitemap index files?

Yes. If sitemap.xml is a sitemap index (containing links to sub-sitemaps), all child sitemaps are fetched and their URLs are merged into a single response.

Will it include external links?

No. Only URLs on the same domain as the target are returned. External links, CDN assets, and third-party resources are excluded automatically.

How is this different from the Web Crawling API?

The Map API returns a list of URLs only — it does not fetch or extract page content. Use it to plan a crawl, then pass the URLs to the Scraping API for content extraction.

What does a map request cost?

1 map request = 1 credit, regardless of how many URLs are discovered.

Ready to Build Something Extraordinary?

Start with 100 free requests. No credit card. No setup fee. Ship your first AI-powered feature today.

