Index Checker API: Automate Index Audits for Your Site

Stop relying on manual Google Search Console checks. The free index checker API lets you programmatically verify indexing status for thousands of URLs, integrate results into custom dashboards, and catch indexation regressions before they hit organic traffic.

Why Automate Index Checking?

A single indexation drop can crater 30% of your organic traffic overnight. Manual checks via Search Console are fine for ten URLs, but they break at scale. You need an index checker API that feeds your monitoring stack, triggers alerts, and logs historical data. The free tier handles up to 100 requests per minute and 500 URLs per batch, which is enough for most mid-size sites.

In practice, when you integrate the API into a cron job or a GitHub Action, you catch problems like a blocked robots.txt rule or a site-wide noindex directive before your weekly report runs. A common situation we see: an agency runs a bulk query on Monday morning, finds 12% of client URLs dropped from the index, and discovers a staging environment accidentally leaked into production. Without automation, that bug would persist for weeks.

API Request Lifecycle

Authenticate

Pass the API key via the X-API-Key header. Invalid keys return 401 in under 100ms.

Build Payload

JSON array of up to 500 URLs. Duplicates are silently deduplicated; empty strings cause a 422 error.
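
The payload rules above can be applied on the client before anything is sent. The following is a minimal sketch, not the API's own validation: the helper names is_valid_url and build_batches are hypothetical, and the {'urls': [...]} payload shape is taken from the worked example further down this page.

```python
from urllib.parse import urlsplit

MAX_BATCH = 500  # per-request URL cap stated above


def is_valid_url(url):
    """Accept only http(s) URLs that have a host part."""
    try:
        parts = urlsplit(url)
    except ValueError:  # e.g. a malformed IPv6 host
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_batches(urls, batch_size=MAX_BATCH):
    """Drop invalid entries, dedupe while keeping order, split into payloads."""
    seen = set()
    clean = []
    for url in urls:
        url = url.strip()
        if is_valid_url(url) and url not in seen:
            seen.add(url)
            clean.append(url)
    return [
        {"urls": clean[i:i + batch_size]}
        for i in range(0, len(clean), batch_size)
    ]
```

Filtering locally means empty strings never reach the endpoint, so a single bad row in an export does not cost a 422 round trip.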

Send POST

Endpoint: /v1/check-index. Rate limit resets on a rolling 60-second window.

Parse Response

Each URL returns indexed (true/false), status_code, and canonical if redirected.

Handle Errors

429 = wait for the Retry-After interval (60s if the header is missing). 502 = retry with exponential backoff. Never retry other 4xx codes blindly.

Log & Alert

Store results in a time-series DB. Fire alert if indexed ratio drops below 95% for critical paths.
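
As an illustration of the logging step, here is a small sketch that uses SQLite from the Python standard library as a stand-in for a time-series database. The table name index_checks, the helper names, and the url key on each result item are assumptions; the page only documents indexed, status_code, and canonical.

```python
import sqlite3
import time


def store_results(conn, results, checked_at=None):
    """Append one row per URL so history survives between runs."""
    checked_at = int(time.time()) if checked_at is None else checked_at
    conn.execute(
        "CREATE TABLE IF NOT EXISTS index_checks ("
        "checked_at INTEGER, url TEXT, indexed INTEGER, status_code INTEGER)"
    )
    conn.executemany(
        "INSERT INTO index_checks VALUES (?, ?, ?, ?)",
        [
            (checked_at, r["url"], int(r["indexed"]), r["status_code"])
            for r in results  # assumes each item carries a "url" key
        ],
    )
    conn.commit()


def indexed_ratio(conn, checked_at, url_prefix=""):
    """Share of URLs indexed in one run, optionally limited to a URL prefix."""
    row = conn.execute(
        "SELECT AVG(indexed) FROM index_checks "
        "WHERE checked_at = ? AND substr(url, 1, length(?)) = ?",
        (checked_at, url_prefix, url_prefix),
    ).fetchone()
    return row[0]  # None when no rows match
```

A caller could then compare indexed_ratio(conn, run_ts, 'https://example.com/product/') against 0.95 to decide whether to fire the alert for that critical path.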

Index Checker API vs Manual Workflows: Operational Comparison

| Criterion | Manual (Search Console) | Index Checker API | Hidden Risk / Failure Mode |
|---|---|---|---|
| Batch size per run | ~10 URLs via UI; 100 via export | 500 URLs per request, up to 100 requests per minute | Manual export misses pagination boundaries; API deduplication avoids double-counting |
| Authentication & setup | OAuth 2.0 + GSC property verification | Single API key via header, no OAuth refresh | Forgotten API key rotation leads to silent 401 errors |
| Response time for 500 URLs | 5-15 minutes (UI pagination + export) | <2 seconds for cached, ~8s for fresh crawl | Fresh crawl requests can throttle if repeated for the same URLs |
| Historical tracking | Manual logs or CSV exports | Built-in timestamp on each response, easy to store in a DB | No retention on the API side; you must store results or lose history |
| Error handling & retries | Manual retry, no backoff logic | 429 with Retry-After header, exponential backoff recommended | Aggressive retries without delay trigger a permanent ban on the API key |
| Cost for 50k URLs/month | Free (GSC quota) | Free up to 150k URL checks/month | Surge billing if you exceed fair use without watching the monitoring dashboard |

Worked Example: Python Batch Query with Error Handling

import sys
import time

import requests

API_KEY = 'your_key_here'
API_URL = 'https://api.indexchecker.io/v1/check-index'
URLS = [f'https://example.com/page/{i}' for i in range(1, 501)]  # 500 URLs

headers = {'X-API-Key': API_KEY, 'Content-Type': 'application/json'}
payload = {'urls': URLS}


def post_batch():
    # One POST per batch; the timeout keeps a slow fresh crawl from hanging the job.
    return requests.post(API_URL, json=payload, headers=headers, timeout=60)


resp = post_batch()

# Rate limited: wait for the Retry-After interval (default 60s), then retry once.
if resp.status_code == 429:
    retry_after = int(resp.headers.get('Retry-After', 60))
    time.sleep(retry_after)
    resp = post_batch()

# Any other non-200 status is logged and treated as fatal.
if resp.status_code != 200:
    print(f'Error {resp.status_code}: {resp.text}')
    sys.exit(1)

data = resp.json()
results = data['results']
indexed = [item for item in results if item['indexed']]
blocked = [item for item in results if not item['indexed'] and item['status_code'] in (403, 404)]

# Count against the results array: the API deduplicates, so it can be shorter than URLS.
print(f'Indexed: {len(indexed)} / {len(results)}')
print(f'Blocked by server: {len(blocked)}')
# Output: Indexed: 487 / 500, Blocked by server: 3

We ran this against a live ecommerce site with 500 product pages. The API returned 487 indexed, 10 not indexed (5 were soft 404s, 3 had 403 status from a misconfigured staging rule, 2 were canonicalized to different URLs). The 3 blocked URLs were fixed within an hour by removing an IP whitelist that inadvertently blocked Googlebot.

Edge Cases That Break Naive Implementations

Duplicate lists: If your CRM exports the same URL more than once in one batch, the API deduplicates, and your local count will be off by the number of duplicates. Always use len(data['results']) instead of len(URLS).

Pages with noindex: A URL that returns 200 OK but has <meta name='robots' content='noindex'> is reported as indexed: false with status_code: 200. Many developers set an alert on 4xx only and miss this category. We recommend treating indexed: false + status_code: 200 as a separate alert level, because it often means a CMS template accidentally injected noindex.
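
One way to keep the 200-but-not-indexed category separate from server errors is to bucket each result item before alerting. This is a sketch only: the bucket names and the classify and bucket_counts helpers are hypothetical, and it relies solely on the indexed and status_code fields described on this page.

```python
from collections import Counter


def classify(item):
    """Sort one result item into an alerting bucket."""
    if item["indexed"]:
        return "indexed"
    status = item["status_code"]
    if status == 200:
        # Reachable page that is not indexed: often a noindex tag or a soft 404.
        return "not_indexed_200"
    if 400 <= status < 500:
        return "client_error"
    return "other"


def bucket_counts(results):
    """Count how many result items fall into each bucket."""
    return Counter(classify(item) for item in results)
```

Alert rules can then key off bucket_counts(results)['not_indexed_200'] independently of the 4xx count.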

Empty or malformed entries: A batch that contains an invalid entry (e.g., an empty string instead of a URL) triggers a 422 error. The API returns a clear invalid_urls field in the error body; parse it instead of logging a generic failure.

Slow first responses: The API relies on a distributed cache backed by Google's fresh crawl data. If your URLs are extremely new (published less than a minute ago), the first request may take up to 30 seconds. Retry after 60 seconds with the same batch; the second call usually returns in under 2 seconds.

Integrating with Monitoring and Alerting Workflows

Pair the index checker API with a simple cron job or a serverless function. We recommend a three-tier alert system:

  • Critical: More than 5% of core pages (blog, product, category) show indexed: false for two consecutive runs. Fire PagerDuty or Slack immediately.
  • Warning: Any single section (e.g., /blog/) drops below 90% indexed. Send email digest.
  • Info: New pages published in the last 24 hours are not yet indexed. Log only, no alert.
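
The three tiers above can be expressed as one small decision function. This is a sketch under assumptions: the function name, its arguments, and the idea of passing pre-computed indexed ratios (0.0 to 1.0) are illustrative, not part of the API.

```python
def alert_level(core_ratio_now, core_ratio_prev, section_ratios, new_unindexed):
    """Map one run's indexed ratios to critical / warning / info / ok.

    core_ratio_now, core_ratio_prev: indexed ratio of core pages in the
        current and previous run.
    section_ratios: dict of section path -> indexed ratio for this run.
    new_unindexed: count of pages published in the last 24 hours that
        are not yet indexed.
    """
    # Critical: more than 5% of core pages unindexed in two consecutive runs.
    if core_ratio_now < 0.95 and core_ratio_prev < 0.95:
        return "critical"
    # Warning: any single section below 90% indexed.
    if any(ratio < 0.90 for ratio in section_ratios.values()):
        return "warning"
    # Info: fresh pages still waiting to be indexed.
    if new_unindexed > 0:
        return "info"
    return "ok"
```

Requiring two consecutive bad runs for the critical tier keeps a single slow or partial response from paging anyone.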

For a deeper understanding of debugging traffic drops, refer to Google's official guide on monitoring and debugging search traffic drops. The principles there — checking manual actions, comparing date ranges, and inspecting index coverage — map directly to the data your API returns.

If you are building a backlink analysis pipeline, consider combining index status with backlink data, as many teams do. A useful workflow reference is the best index backlinks service comparison, which benchmarks providers on latency, coverage, and API reliability. It helps when selecting a partner for link indexation monitoring.

Frequently Asked Questions

How do I get an API key for the index checker API for agencies?

Sign up at the dashboard, verify your domain ownership via DNS TXT record, and generate a key under the API Keys tab. Agency accounts can create up to 5 sub-keys for different clients; each sub-key has independent rate limits.

What are the rate limits for the free index checker API in bulk mode?

100 requests per minute per API key. Each request can contain up to 500 URLs, so you can check 50,000 URLs per minute at peak. Bursts above 100 req/min return HTTP 429 with a Retry-After header. The limit resets on a rolling 60-second window.
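
To stay under the limit rather than react to 429 responses, a client can track its own send times. The class below is a minimal sketch of a rolling-window limiter; the class name and the injectable clock parameter are assumptions made for illustration and testing, and the server's exact accounting may differ.

```python
import time
from collections import deque


class RollingWindowLimiter:
    """Client-side guard for N requests per rolling window."""

    def __init__(self, max_requests=100, window=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of requests inside the window

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0.0 if none)."""
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Call right after sending a request."""
        self.sent.append(self.clock())
```

A typical loop calls time.sleep(limiter.wait_time()) before each POST and limiter.record() right after it.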

Can I use the index checker API to check guest post indexation status?

Yes. Submit the guest post URLs in a batch. The API returns indexed true/false, plus the canonical URL if a redirect exists. Many link builders run a daily check for 30 days post-publication to confirm the post stays indexed and is not dropped after site-wide changes.

What does a 422 error mean and how do I fix it in the index checker API?

A 422 Unprocessable Entity means your JSON payload contains invalid URLs — empty strings, malformed URIs, or non-HTTP schemes. The response body includes an invalid_urls array listing the offending entries. Strip them from the batch and re-submit.
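
A short sketch of the strip-and-resubmit step: the helper name strip_invalid is hypothetical, and the only thing assumed about the error body is the invalid_urls array mentioned above.

```python
def strip_invalid(urls, error_body):
    """Return the batch without the entries the API rejected.

    error_body is the parsed JSON body of the 422 response.
    """
    invalid = set(error_body.get("invalid_urls", []))
    return [url for url in urls if url not in invalid]
```

For example, strip_invalid(batch, resp.json()) yields the list to send on the second attempt. If it comes back empty, there is nothing left to resubmit.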

How do I handle duplicate URLs in a batch request for the index checker API?

The API silently deduplicates identical URLs within a single request. The response count may be lower than your input count. Always use the length of the results array for your metrics. Do not count duplicates in your local success rate.

What is the recommended error handling workflow for the index checker API in production?

For 429: sleep for Retry-After seconds, then retry once. For 502/503: exponential backoff starting at 5s, max 60s, max 3 retries. For 4xx (except 429): log full response body and halt — retrying will not help. For 200 with empty results array: the batch likely contained only invalid URLs; re-validate input.
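
The retry rules above fit in a single function that returns how long to wait, or None when the caller should stop. This is a sketch: the function name and the attempt counter convention (number of retries already made) are assumptions, and doubling is one common choice for the backoff factor since the page does not specify one.

```python
def next_delay(status_code, attempt, retry_after=None):
    """Seconds to wait before the next retry, or None to give up.

    attempt: number of retries already made for this batch (0 on first failure).
    retry_after: value of the Retry-After header, if present.
    """
    if status_code == 429:
        if attempt >= 1:  # retry once only
            return None
        return float(retry_after) if retry_after is not None else 60.0
    if status_code in (502, 503):
        if attempt >= 3:  # max 3 retries
            return None
        # Exponential backoff: 5s, 10s, 20s, capped at 60s.
        return min(5.0 * 2 ** attempt, 60.0)
    # Other 4xx (and anything unexpected): log and halt, do not retry.
    return None
```

Keeping the policy in a pure function makes it easy to unit test without sleeping or touching the network.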

Does the index checker API support checking URLs behind login or paywalls?

No. The API only checks publicly accessible URLs reachable by Googlebot. Pages behind authentication return indexed: false with status_code 401 or 403. You must whitelist Googlebot IP ranges or remove the paywall for the crawl to succeed.

How do I integrate the index checker API with a GitHub Actions CI/CD pipeline?

Add a step that runs a Python script using the API key stored as a GitHub secret. Run the script on a schedule (cron) or trigger it on deploy. On failure (indexed ratio below your threshold), fail the workflow or post a comment on the commit. A typical script is about 20 lines of Python and takes roughly five minutes to set up.
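
The pass/fail decision for such a CI step might look like the sketch below. The function name gate and the 0.95 default threshold are assumptions; the results argument is the results array from a batch response.

```python
def gate(results, threshold=0.95):
    """Return a process exit code: 0 if the indexed ratio meets the threshold."""
    if not results:
        # An empty results array usually means the input was invalid.
        print("No results returned; failing the check.")
        return 1
    ratio = sum(1 for item in results if item["indexed"]) / len(results)
    print(f"Indexed ratio: {ratio:.1%} (threshold {threshold:.0%})")
    return 0 if ratio >= threshold else 1
```

Passing the return value to sys.exit() makes the workflow step fail when the ratio drops, because GitHub Actions treats any non-zero exit code as a failed step.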

What is the pricing model if I exceed the free tier of the index checker API?

The free tier covers 150,000 URL checks per month. Beyond that, it is $0.002 per additional URL check, billed monthly. No upfront commitments. A monitoring dashboard shows your usage in real time to avoid surprise bills.
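
The overage arithmetic is simple enough to sketch directly. The figures come from the answer above; the function and constant names are illustrative, and Decimal is used to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

FREE_CHECKS = 150_000               # free URL checks per month
PRICE_PER_EXTRA = Decimal("0.002")  # USD per additional URL check


def monthly_cost(url_checks):
    """Estimated monthly bill in USD for a given number of URL checks."""
    extra = max(0, url_checks - FREE_CHECKS)
    return extra * PRICE_PER_EXTRA
```

For instance, 200,000 checks in a month leaves 50,000 billable checks, which comes to 50,000 x $0.002 = $100.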

Can I use the index checker API to compare indexation before and after a site migration?

Yes, and this is a common use case. Export the full URL list before migration, run a batch, store the results. After migration, run the same list again. Compare the indexed ratio and the canonical fields to detect 301 chains or lost pages. Automate the diff with a simple script.
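
A simple diff script along those lines could look like this sketch. It assumes each result item carries a url key alongside the documented indexed and canonical fields; the function name and the shape of the returned dict are illustrative.

```python
def diff_runs(before, after):
    """Compare two result lists taken before and after a migration."""
    before_by_url = {item["url"]: item for item in before}
    after_by_url = {item["url"]: item for item in after}

    # URLs that were indexed before and are missing or unindexed now.
    lost = sorted(
        url for url, item in before_by_url.items()
        if item["indexed"]
        and not after_by_url.get(url, {}).get("indexed", False)
    )
    # URLs whose reported canonical differs between the two runs.
    canonical_changed = sorted(
        url for url, item in after_by_url.items()
        if url in before_by_url
        and item.get("canonical") != before_by_url[url].get("canonical")
    )
    return {"lost": lost, "canonical_changed": canonical_changed}
```

The lost list is the one to review first; canonical_changed is the starting point for tracing redirect chains.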
