Depth 2 crawling
Homepage plus relevant contact, about, team, location, legal, and support pages.
AI-first REST API that turns a website URL into structured company, contact, and lead data.
Website to Company data is an AI-first REST API service for lead aggregators, CRMs, enrichment workflows, and internal automations. It crawls a website with depth 2, prepares deterministic parser context from structured data and page content, validates discovered emails, and sends the gathered website context to OpenRouter AI for strict structured extraction on every successful request. The response uses fixed category keys so downstream systems can parse it safely.
/api/website-to-company-data/v1/extract
REST API service
Website to Company data is built for lead aggregators, enrichment products, CRM imports, and internal automations that start with only a website URL. The API crawls relevant pages, extracts structured company data, validates emails, and returns a stable JSON schema that can be parsed without guessing dynamic keys.
The extraction flow is AI-first. The crawler collects high-signal pages, deterministic parsing prepares support context, and selected HTML is sent to the default OpenRouter model with a strict extraction prompt on every successful request. The API still keeps fixed categories for sales, support, info, billing, careers, press, legal, technical, general, and other.
Homepage plus relevant contact, about, team, location, legal, and support pages.
Emails, phones, and people always use the same fixed category keys.
Invalid, example, demo, and no-MX email addresses are filtered out.
OpenRouter extraction is always used as the primary extraction layer.
How it works
The service checks the token, quota, URL shape, and blocks private/local network targets.
The crawler fetches the homepage and high-signal internal pages up to depth 2.
HTML, JSON-LD, meta tags, mailto links, tel links, and page text are parsed as support context.
Selected HTML and parser context are sent to OpenRouter for strict JSON extraction on every successful request.
Stable schema
Every response keeps the same category keys for email addresses, phone numbers, and people, even when a category is empty.
Use cases
AI-first extraction with deterministic parser context, fixed categories, and validated email results.
Fetches the homepage and relevant candidate pages such as contact, about, team, legal, and location pages.
Returns company name, industry, address, city, country, location, and source confidence in a stable JSON schema.
Emails, phones, and contact person names are returned under stable keys such as sales, support, info, billing, careers, legal, technical, general, and other.
Discovered email addresses are filtered, deduplicated, checked for syntax and MX records, and generic demo addresses are ignored.
Every successful extraction sends selected website HTML and parser context to the default OpenRouter model for structured JSON output.
Every request is tied to a service token and plan-based lifetime or daily website quota.
Website extraction quotas for enrichment workflows of different sizes.
Test the API with a one-time lifetime allowance.
25 websites lifetime
Daily capacity for small lead enrichment workflows and internal tools.
200 websites per day
Recommended
Higher daily capacity for production lead aggregators and CRM pipelines.
1000 websites per day
Large daily quota for high-volume enrichment systems.
20000 websites per day
Endpoint pattern
Use the documentation page to generate your token, inspect quotas, and test live responses.
GET https://ai.mihajlo.mk/api/website-to-company-data/v1/extract?website=https://example.com&token={serviceTokenHere}