This article is a follow-up to the first part, in which we introduced a starter kit for the Oncrawl MCP server.
In less than 20 minutes, you were able to create an agent/auditor capable of producing a comprehensive and actionable audit of your site’s structure, as well as generating natural-language OQL queries to run in the Data Explorer, without any prior technical expertise.
This second article is intended for users who are already comfortable with the MCP and have experience with Oncrawl. We’re taking it a step further: four advanced use cases that expand on what you can do natively within the platform.
- Use case 1: Cross-reference the page dataset with the link dataset. This is a join that the interface does not currently support.
- Use case 2: Set up log-based alerts to monitor bot behavior.
- Use case 3: Trigger Slack alerts for crawl-over-crawl variations, with thresholds you control.
- Use case 4: Combine the Oncrawl MCP with other MCPs (such as Google Search Console’s URL Inspection API) to answer questions that no single tool can address on its own.
A quick note before we begin. Large language models (LLMs) perform better on smaller datasets than on structures containing millions of URLs. We recommend working with a dataset of around 100,000 URLs and a link source that does not exceed one million links. Beyond that, token costs and computation time rise rapidly.
Use case 1: Join the page dataset with the link dataset
As of this writing, the interface does not allow for an in-app join between the pages dataset and the links dataset using the URL as a key. These two datasets are accessible via a radio button in the top-left corner of the Data Explorer, which lets you switch between them but does not allow you to cross-reference them.
If the join were available natively, you would get a unified table, queryable from the Data Explorer, that would cross-reference the metrics from the pages dataset (inrank, depth) with those from the links dataset (origin position, anchor text, etc.).
You could then ask questions such as:
“Which pages contain links originating from the body that point to pages with an inrank between 1 and 5?”
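To make the idea of the join concrete, here is a minimal Python sketch of what such a unified table would look like. It is an illustration only, not how the MCP or the platform performs the join: the rows, the field names (`url`, `inrank`, `depth`, `source`, `target`, `origin`, `anchor`) and the use of `main` as the origin value for body links are assumptions made for the example.

```python
# Hypothetical rows shaped like simplified exports of the two datasets.
pages = [
    {"url": "/", "inrank": 10, "depth": 0},
    {"url": "/blog/a", "inrank": 6, "depth": 2},
    {"url": "/products/x", "inrank": 4, "depth": 3},
    {"url": "/products/y", "inrank": 2, "depth": 4},
]
links = [
    {"source": "/", "target": "/blog/a", "origin": "nav", "anchor": "Blog"},
    {"source": "/blog/a", "target": "/products/x", "origin": "main", "anchor": "Product X"},
    {"source": "/blog/a", "target": "/products/y", "origin": "footer", "anchor": "Product Y"},
    {"source": "/", "target": "/products/y", "origin": "main", "anchor": "See Y"},
]


def join_links_with_target_pages(links, pages):
    """Inner join keyed on URL: attach the target page's metrics to each link."""
    pages_by_url = {page["url"]: page for page in pages}
    joined = []
    for link in links:
        target = pages_by_url.get(link["target"])
        if target is None:
            continue  # the target URL is absent from the pages dataset
        joined.append(
            {**link, "target_inrank": target["inrank"], "target_depth": target["depth"]}
        )
    return joined


joined = join_links_with_target_pages(links, pages)

# The example question: pages whose body links point to targets with inrank 1 to 5.
sources = sorted(
    {
        row["source"]
        for row in joined
        if row["origin"] == "main" and 1 <= row["target_inrank"] <= 5
    }
)
print(sources)
```

The join key is the link's target URL; joining on the source URL instead would answer questions about the linking page rather than the linked one.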
The MCP overcomes this constraint. Here are three useful prompts you can give to an LLM connected to Oncrawl’s MCP.
Cross-referencing inlinks and Google Search Console performance
If you have the GSC connector enabled, request a report on inbound links originating from editorial sections (origin: main) that point to pages in the bottom 25% of page views (75% of the site’s pages have more page views than these).
This prompt identifies pages that receive internal link equity without converting it into organic visibility. It is a classic sign of an underutilized editorial opportunity.
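The "bottom 25%" cutoff in this prompt is a first quartile. The sketch below shows one way that cutoff could be computed and applied, assuming hypothetical per-page view counts and link rows. It uses `statistics.quantiles` with its default "exclusive" method; an LLM running the prompt may well compute the percentile differently.

```python
from statistics import quantiles

# Hypothetical page views per URL (the metric named in the prompt).
page_views = {
    "/a": 0, "/b": 5, "/c": 10, "/d": 20,
    "/e": 40, "/f": 80, "/g": 160, "/h": 320,
}
# Hypothetical link rows; "main" is the editorial origin from the prompt.
links = [
    {"source": "/e", "target": "/a", "origin": "main"},
    {"source": "/f", "target": "/b", "origin": "nav"},
    {"source": "/g", "target": "/b", "origin": "main"},
    {"source": "/h", "target": "/c", "origin": "main"},
]

# quantiles(..., n=4) returns the three quartile cut points; the first one
# separates the bottom 25% from the rest.
first_quartile = quantiles(list(page_views.values()), n=4)[0]

low_traffic_pages = {url for url, views in page_views.items() if views <= first_quartile}

# Editorial links that point to a bottom-quartile page.
flagged = [
    link for link in links
    if link["origin"] == "main" and link["target"] in low_traffic_pages
]
for link in flagged:
    print(link["source"], "->", link["target"])
```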
Identifying wasted SEO equity
Which internal links are wasting SEO equity within the site structure, i.e., those pointing to non-indexable pages or pages returning 4XX, 5XX, or 3XX status codes?
The results provide a list that your development team can use directly to fix or redirect these links in bulk.
Target dead-end redirects
Which pages have a 3XX status code but a final redirect target with a 4XX or 5XX status code? Generate a table with the following columns: Source URL | Status code (origin) | Final redirect target | Final redirect target status code.
These are the most harmful redirects: they consume crawl budget without serving any purpose and are not visible in a standard audit.
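The logic behind this prompt is to follow each redirect to its final target and keep only the chains that end in an error. A minimal sketch, assuming a hypothetical mapping of crawled URLs to a status code and a redirect target (the field names `status` and `redirect_to` and the hop limit are illustrative choices, not Oncrawl field names):

```python
# Hypothetical crawl rows: status code plus redirect target, if any.
crawl = {
    "/old-a": {"status": 301, "redirect_to": "/mid-a"},
    "/mid-a": {"status": 302, "redirect_to": "/gone"},
    "/gone": {"status": 404, "redirect_to": None},
    "/old-b": {"status": 301, "redirect_to": "/live"},
    "/live": {"status": 200, "redirect_to": None},
    "/loop-1": {"status": 301, "redirect_to": "/loop-2"},
    "/loop-2": {"status": 301, "redirect_to": "/loop-1"},
}


def resolve_final_target(url, crawl, max_hops=10):
    """Follow redirects from url and return (final_url, final_status).

    Returns None when the chain loops, leaves the crawl, or exceeds max_hops.
    """
    seen = {url}
    current = url
    for _ in range(max_hops):
        row = crawl.get(current)
        if row is None:
            return None
        if not 300 <= row["status"] < 400:
            return current, row["status"]
        next_url = row["redirect_to"]
        if next_url is None or next_url in seen:
            return None
        seen.add(next_url)
        current = next_url
    return None


def dead_end_redirects(crawl):
    """Rows: source URL, origin status, final target, final target status."""
    rows = []
    for url, row in crawl.items():
        if not 300 <= row["status"] < 400:
            continue
        final = resolve_final_target(url, crawl)
        if final is not None and final[1] >= 400:
            rows.append((url, row["status"], final[0], final[1]))
    return rows


for row in dead_end_redirects(crawl):
    print(" | ".join(str(value) for value in row))
```

Redirect loops are skipped here rather than reported; they are a separate problem and would deserve their own column or report.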
Preview the results
The results can be displayed in a table. To make it easier to read large datasets, you can also connect a visualization skill to your LLM, which is a module that automatically converts the data into charts.
The open-source repository chart-visualization-skills offers several free options that are compatible with Claude Code.
Use case 2: Log alerts
One of our users recently spent six days investigating why entire sections of their site had disappeared from their site map.
The cause: an update to their JavaScript rendering had made certain groups of pages completely invisible in their site map, without triggering any monitoring alerts. With the right MCP setup, they would have been alerted within a few hours.
This type of silent failure is common, and setting up log alerts remains one of the areas where many of our users struggle. The MCP simplifies this process: once logged in, you write your alert logic in natural language, without having to code a system on top of the exports.
For this use case, your LLM must be connected to both the Slack MCP and the Oncrawl MCP.
Verify beforehand that your logs are being collected correctly by Oncrawl.
Pages vs. events: choosing the right object
The Oncrawl MCP provides a get_data_search_logs tool that allows you to query project logs for two types of objects:
- pages
- events
Use pages when you want to aggregate hits by URL (for example, to track which pages are crawled and how often). Use events when you need raw log entries rather than aggregated metrics.
When querying pages, you must specify a granularity: days, weeks or months.
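The difference between the two objects can be pictured as raw rows versus a per-URL rollup at a chosen granularity. The sketch below illustrates that relationship on hypothetical log entries. It does not call get_data_search_logs and does not reflect its actual parameters or response format; the field names are assumptions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw log entries, i.e. what "events" conceptually look like.
events = [
    {"timestamp": "2024-05-01T08:15:00", "url": "/a", "bot": "google_smartphone"},
    {"timestamp": "2024-05-01T09:40:00", "url": "/a", "bot": "google_smartphone"},
    {"timestamp": "2024-05-01T10:05:00", "url": "/b", "bot": "bing_web_search"},
    {"timestamp": "2024-05-02T07:30:00", "url": "/a", "bot": "google_smartphone"},
]


def hits_per_url_per_day(events):
    """Roll raw events up into hit counts per (URL, day), a 'pages'-style view."""
    counts = Counter()
    for event in events:
        day = datetime.fromisoformat(event["timestamp"]).date().isoformat()
        counts[(event["url"], day)] += 1
    return counts


for (url, day), hits in sorted(hits_per_url_per_day(events).items()):
    print(day, url, hits)
```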
Ready-to-use prompt
Aggregate hits by bot and by day to get a complete daily series per bot over 14 days (not a period-over-period comparison). Filter on bot hits only.
Bots to monitor: [comma-separated list, e.g. "claude_bot, openai_gpt_bot, google_smartphone, bing_web_search", or write "all available bots" to track every bot detected by Oncrawl].
For each bot, examine its daily curve over 14 days and flag:
- Sudden spike: a day clearly above the bot's normal level
- Sudden drop: a day clearly below the normal level
- Slow but steady decline: downward trend over several days
- Silent bot: 2+ consecutive days at zero after an active period
Send a Slack alert to [#CHANNEL or @USER] summarizing your findings: the bot concerned, the type of variation, the days impacted, and the affected volume compared to the bot's 14-day daily average.
If nothing unusual stands out, send a short note "all bots are stable over the past 14 days".
Run this check [daily | weekly].
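The prompt leaves the meaning of "clearly above" or "clearly below" to the LLM's judgment. If you want to sanity-check its findings, the four patterns can be approximated with explicit rules. The sketch below is one possible formalization, not the logic the LLM applies: the two-standard-deviation cutoff, the five-day decline window and the function name are all assumptions.

```python
from statistics import mean, pstdev


def flag_bot_series(daily_hits, z=2.0, decline_days=5, silent_days=2):
    """Return (pattern, day_index) flags for one bot's daily hit series."""
    flags = []
    average = mean(daily_hits)
    spread = pstdev(daily_hits)

    # Spike or drop: a day more than z standard deviations from the average.
    if spread > 0:
        for day, hits in enumerate(daily_hits):
            if hits > average + z * spread:
                flags.append(("spike", day))
            elif hits < average - z * spread:
                flags.append(("drop", day))

    # Steady decline: the last `decline_days` day-over-day changes are all negative.
    tail = daily_hits[-(decline_days + 1):]
    if len(tail) == decline_days + 1 and all(b < a for a, b in zip(tail, tail[1:])):
        flags.append(("steady_decline", len(daily_hits) - decline_days))

    # Silent bot: `silent_days` consecutive zero days after at least one active day.
    zero_run = 0
    active_seen = False
    for day, hits in enumerate(daily_hits):
        if hits == 0:
            zero_run += 1
            if active_seen and zero_run == silent_days:
                flags.append(("silent", day - silent_days + 1))
        else:
            active_seen = True
            zero_run = 0
    return flags


spiky = [100] * 6 + [400] + [100] * 7
silent = [100] * 10 + [0] * 4
print(flag_bot_series(spiky))
print(flag_bot_series(silent))
```

A single extreme day inflates the standard deviation of a 14-day window, so a more robust variant could compare each day against the median of the other days instead.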
Use case 3: Crawl-over-crawl alerts
Whereas the previous use case monitors bot behavior in your logs, this one monitors changes in your own structure. Our advanced users consistently report three key needs:
1. “Tell me what’s changed since the last crawl.”
A comparative view, not just absolute metrics. Ideally with a statistical baseline (standard deviation) to distinguish normal variation from anomalies.
2. “Notify me on the right channel, at a reasonable frequency, with a threshold that I control.”
Slack and Teams instead of email, daily updates on critical URLs, percentage changes rather than absolute values for large sites.
3. “Please don’t require me to build an alerting system on top of the exports.”
Today, many users rely on cron jobs, scripts, and Data Studio to fill this gap.
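For reference, the core of the first need is a set comparison between two crawls. This is roughly what a hand-rolled script has to do. A minimal sketch, assuming each crawl has been reduced to a hypothetical mapping of URL to status code:

```python
# Hypothetical crawls reduced to {url: status_code}.
previous = {"/a": 200, "/b": 200, "/c": 301, "/d": 404, "/old": 200}
current = {"/a": 200, "/b": 404, "/c": 301, "/d": 200, "/e": 200}


def diff_crawls(previous, current):
    """Compare two crawls: new pages, removed pages, status code transitions."""
    new_pages = sorted(current.keys() - previous.keys())
    removed_pages = sorted(previous.keys() - current.keys())
    transitions = {
        url: (previous[url], current[url])
        for url in sorted(previous.keys() & current.keys())
        if previous[url] != current[url]
    }
    return {"new": new_pages, "removed": removed_pages, "transitions": transitions}


changes = diff_crawls(previous, current)
print("new:", changes["new"])
print("removed:", changes["removed"])
for url, (before, after) in changes["transitions"].items():
    print(f"{url}: {before} -> {after}")
```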
The prompt template below addresses all three of these needs. It combines crawl-over-crawl comparisons, configurable thresholds (absolute, percentage, standard deviation), and Slack delivery.
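The three threshold types named above (absolute, percentage, standard deviation) can be read as independent conditions joined by OR. The sketch below shows one way to express them, as a reading aid rather than the template's actual implementation. The function name, parameter names and sample history are hypothetical.

```python
from statistics import mean, stdev


def crossed_thresholds(changed, scope_size, history, min_count=None, min_share=None, z=None):
    """Return the names of the thresholds crossed by today's count of changed pages.

    changed    -- number of changed pages in the latest comparison
    scope_size -- number of pages in the monitored scope
    history    -- previous daily counts for the same type of change
    """
    reasons = []
    if min_count is not None and changed > min_count:
        reasons.append("absolute")
    if min_share is not None and scope_size > 0 and changed / scope_size > min_share:
        reasons.append("percentage")
    if z is not None and len(history) >= 2:
        spread = stdev(history)
        if spread > 0 and changed > mean(history) + z * spread:
            reasons.append("standard_deviation")
    return reasons


# Hypothetical counts of status code changes over the previous 14 days.
history = [10, 12, 9, 11, 10, 13, 8, 11, 10, 12, 9, 10, 11, 12]

# 40 changes is tiny on a 100,000-page scope but unusual against the baseline.
print(crossed_thresholds(40, 100_000, history, min_count=500, min_share=0.05, z=2))
```

An alert fires when the returned list is non-empty. The example shows why the standard-deviation baseline is useful on large sites: a change that is far below the absolute and percentage thresholds can still be abnormal for that site.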
About CronCreate
The template below can be used on an ad hoc or recurring basis. When a recurring execution is requested, it relies on CronCreate, a scheduling tool provided by the MCP server.
CronCreate allows you to schedule a task so that it runs automatically at a defined interval (daily, weekly, etc.). In the implementation covered by this template, the schedule is preserved and continues to function even after a session restart.
You therefore do not need to manually restart the prompt for each execution. This is what transforms a one-time crawl-over-crawl analysis into an automated and recurring alert system.
Template (recommended content)
Run an Oncrawl crawl-over-crawl analysis on the project
[PROJECT NAME or PROJECT ID — find it via list_projects if unknown].
Compare the latest crawl with [the previous crawl | crawl ID X].
Track the following changes:
[pick one or more:
- new pages added since the last crawl ← "rogue content" case
- pages removed since the last crawl
- status code transitions (e.g. 200→404, 200→301, 4xx→2xx)
- new orphan pages (pages that have lost all their inlinks)
- depth shift greater than [N] levels
- inrank drop greater than [X%] on pages in [page group]
- traffic drop on indexable pages: gsc_clicks down by >[X%]
- custom OQL filter: [paste your OQL JSON or your filter in plain English]
]
Scope:
- segment by [page group | language | country | depth bucket | none]
- restrict to pages where [optional OQL: e.g. status_code equals 200
AND indexable equals true AND page_group equals "/products/"]
Trigger the alert only if:
- the number of changed pages exceeds [N in absolute terms]
OR
- the change exceeds [X%] of all pages in the scope ← mega-site threshold
OR
- today's count is more than [Z=2] standard deviations above
the daily average [over 14 days] for this type of change
← standard-deviation baseline
For each flagged change, return:
- the type of change
- the count + % of the scope
- the top [10] affected URLs (sorted by [inlinks desc | gsc_clicks
desc | inrank desc])
- the previous vs current values side by side
- the OQL query used (so it can be replayed in the UI)
Post the result to Slack:
- channel or DM: [#CHANNEL or @USER]
- format: short executive summary at the top, then one bullet per
flagged change, then a collapsible block with the affected URLs
- if nothing crosses the threshold: send a single line "all stable"
on the same channel (or skip — your choice: [send | skip])
- delivery: [draft for review before sending | send directly]
Cadence:
- run [every day at 09:00 Europe/Paris | every Monday at 09:00 |
one-shot]
- business days only: [yes | no]
- if recurring, register the schedule via CronCreate so it survives
session restarts; in the first response, remind me how to cancel it.
Use case 4: Combining multiple MCPs to answer questions that no single tool can address
The three previous use cases relied on a single MCP, the one from Oncrawl, sometimes combined with Slack for alert delivery.
But certain SEO issues lie at the crossroads of several data sources: the crawled structure, Search Console performance, and Google’s indexing status.
Connecting multiple MCPs to the same LLM allows you to address these issues without having to switch back and forth between five tools.
The following example illustrates this principle with a concrete case: cross-referencing data from Oncrawl with that from Google Search Console’s URL Inspection API.
Connect the GSC MCP as a complementary tool to the Oncrawl MCP
A use case frequently reported by our users involves connecting the Google Search Console MCP to their LLM, in addition to the Oncrawl MCP.
An open-source community project exposes GSC tools as an MCP. It allows you to merge Search Console data with Oncrawl data directly within the conversation.
Security note: This is not an official Google MCP. The project is open source and maintained by the community. We recommend conducting a security audit before running it on your device, especially if you are working on client projects.
Project link: https://github.com/AminForou/mcp-gsc
The project provides several tools, including:
- inspect_url_enhanced: returns the detailed crawl/index status for a URL.
- check_indexing_issues: checks for indexing issues on a list of URLs.
The benefits of combining the two sources
The main benefit of this MCP is that it surfaces GSC data that is not available in Oncrawl, so that you can compare it with your crawled site structure.
Specifically:
- Pages actually indexed by Google
- The date of Googlebot’s last visit
- Indexing issues reported in GSC
This provides you with an actionable overview that answers specific questions:
- How many pages are indexed in Google but considered orphaned by Oncrawl?
- How many pages are not fetched by Oncrawl even though they are crawled and indexed by Google?
- How many pages are excluded from Google’s index despite being present in the site structure?
Three diagnostic buckets
The prompt below organizes the analysis into three buckets, each corresponding to a distinct failure mode:
- Bucket A: pages present in the site structure but which Oncrawl cannot fetch. Indicates a technical blockage (robots.txt, intermittent server error, blocked resources).
- Bucket B: orphaned pages outside the structure that nevertheless show signs of discovery on the GSC side (clicks, presence in the sitemap). Indicates an internal linking issue.
- Bucket C: pages crawled by Oncrawl but not indexable. Indicates an on-page configuration issue to be addressed (canonical, noindex, redirects).
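The three buckets above amount to a small decision rule per page. A minimal sketch, assuming hypothetical page records with `depth`, `fetched`, `indexable`, `gsc_clicks` and `in_sitemap` fields. The order in which the buckets are tested is also an assumption: a page matching several conditions lands in the first one.

```python
def diagnostic_bucket(page):
    """Assign a page record to bucket "A", "B" or "C", or None if none applies."""
    has_depth = page.get("depth") is not None
    fetched = page.get("fetched", False)

    # Bucket A: in the structure (has a depth) but not fetched.
    if has_depth and not fetched:
        return "A"
    # Bucket B: outside the structure, yet with discovery signals.
    if not has_depth and (page.get("gsc_clicks", 0) > 0 or page.get("in_sitemap", False)):
        return "B"
    # Bucket C: fetched but not indexable.
    if fetched and not page.get("indexable", True):
        return "C"
    return None


# Hypothetical page records.
pages = [
    {"url": "/blocked", "depth": 3, "fetched": False},
    {"url": "/orphan", "depth": None, "fetched": True, "indexable": True, "gsc_clicks": 12},
    {"url": "/noindex", "depth": 2, "fetched": True, "indexable": False},
    {"url": "/healthy", "depth": 1, "fetched": True, "indexable": True},
]
for page in pages:
    print(page["url"], diagnostic_bucket(page))
```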
Note on the GSC quota: The URL Inspection API is subject to a quota of 2,000 inspections per day per property. Please take this into account when selecting your sample.
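In practice, respecting the quota means capping the day's sample and splitting it into batches. The sketch below shows that bookkeeping only. It makes no API call, and the batch size of 10 is taken from the prompt that follows rather than from a documented limit; the function and parameter names are hypothetical.

```python
from itertools import islice

DAILY_QUOTA = 2000  # daily limit stated in the note above
BATCH_SIZE = 10     # batch size used in the prompt below (an assumption, not an API limit)


def plan_batches(urls, already_used_today=0, daily_quota=DAILY_QUOTA, batch_size=BATCH_SIZE):
    """Split the URLs that still fit in today's quota into fixed-size batches."""
    remaining = max(daily_quota - already_used_today, 0)
    todays_urls = list(islice(urls, remaining))
    return [
        todays_urls[start:start + batch_size]
        for start in range(0, len(todays_urls), batch_size)
    ]


# Hypothetical sample of 2,500 URLs: only the first 2,000 fit in one day.
sample = [f"/page/{n}" for n in range(2500)]
batches = plan_batches(sample)
print(len(batches), "batches,", sum(len(batch) for batch in batches), "URLs today")
```

URLs that do not fit in today's quota are simply left out of the plan; a recurring run would need to remember where it stopped.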
Ready-to-use prompt
Cross-reference Google Search Console URL inspection data with
Oncrawl crawl data for {{GSC:property}} (project {{Project_IID}},
latest crawl).
Objective: understand why certain pages are uncrawled, less
crawled, or not indexed.
Use the Oncrawl MCP and the GSC MCP. Build three diagnostic
buckets:
- Bucket A: in-structure pages that Oncrawl cannot fetch
(`depth has_value AND fetched=false`)
- Bucket B: out-of-structure orphans with discovery signals
(no depth, but GSC clicks / SEO visits / sitemap presence)
- Bucket C: crawled but non-indexable
(`fetched=true AND indexable=false`)
For each bucket, sample the URLs via the GSC URL Inspection API,
then assign a diagnostic label per pair (Oncrawl state × GSC
verdict). Respect the GSC quota (10 URLs per batch, 2,000 per day).
Deliverable: a structured plan document with:
- objective
- prerequisites
- approach
- findings (counts, top orphans by SEO traffic, diagnostic
matrix)
- prioritized actions (P0 → P4)
- open questions
- dependencies.
Wrapping up
These four use cases are not set in stone, and each should be tailored to your specific context. There are three principles worth keeping in mind before implementing them.
Define your scope
LLMs perform better on smaller datasets than on structures containing millions of URLs. Sample the data, segment by page group, or restrict it using an OQL filter before running the analysis. This will improve accuracy and reduce token costs.
Test before scaling up
A one-shot prompt used to explore a hypothesis has different requirements than a prompt scheduled to run recurrently via CronCreate. Before pushing a daily Slack alert into production, run the prompt manually for a week or two, verify the thresholds, and confirm that the generated OQL queries match what you expect in the Data Explorer.
Consider multi-MCP for any metrics that cannot be generated by Oncrawl
The most interesting SEO questions lie at the intersection of multiple sources: crawled structure, GSC performance, bot behavior in logs, and Google’s indexing status. Connecting multiple MCPs to the same LLM is what allows you to answer these questions without switching back and forth between five tools.
If you set up your own prompts on top of the Oncrawl MCP and identify recurring patterns worth sharing, let us know.
It’s often the workflows you build in the field that give rise to the platform’s next native features.

