Throughout my SEO career I have used many crawlers to audit websites. Each crawler has its pros and cons, but in my opinion what makes a good crawler is the way you can “hack” it to fit your needs.
I have used OnCrawl for a while now and I always try to push the app’s boundaries in my analyses. I don’t believe we can build a “one size fits all” tool; there will always be that ONE thing missing. However, with a good SaaS crawler, you can bend the technology to your own needs.
I have been playing with the OnCrawl API lately in order to pull data out of the app. This has helped me go further than what is currently available in our existing dashboards. I spoke with our users to understand what is underused and what is missing in the app.
Today’s article focuses on quick filters, a feature with a lot of potential for creative people. Combining them with the API will help you achieve a lot of new things!
Let’s start with a quick introduction to quick filters and our data explorer.
The data explorer is where you can query all the data available in the app, whether it is crawl data, log data or combined data (crawl × analytics, crawl × log data, etc.).
It works in a simple way:
You can apply filters to the data using our OnCrawl Query Language (OQL), which allows you to drill down to what you need.
You can add extra columns to show the information you want in a single table.
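To make the idea concrete, here is a rough sketch of what a query-language filter does conceptually: it is a predicate applied to every crawled page. The filter structure and field names below (`status_code`, `depth`) are illustrative assumptions, not OnCrawl’s exact OQL syntax.

```python
# Hypothetical sketch: an OQL-style filter as a Python predicate.
# The filter structure and field names are invented for illustration;
# the real OQL syntax is documented in the OnCrawl app.

sample_pages = [
    {"url": "/", "status_code": 200, "depth": 0},
    {"url": "/blog", "status_code": 200, "depth": 1},
    {"url": "/old-page", "status_code": 404, "depth": 2},
]

# An OQL-like filter: pages returning 200 at depth >= 1
oql_filter = {"and": [
    {"field": ["status_code", "equals", 200]},
    {"field": ["depth", "gte", 1]},
]}

def matches(page, condition):
    """Evaluate a single field condition against a page."""
    name, op, value = condition["field"]
    if op == "equals":
        return page.get(name) == value
    if op == "gte":
        return page.get(name, 0) >= value
    raise ValueError(f"unsupported operator: {op}")

def apply_filter(pages, oql):
    """Keep pages matching every condition of an 'and' clause."""
    return [p for p in pages if all(matches(p, c) for c in oql["and"])]

filtered = apply_filter(sample_pages, oql_filter)
print([p["url"] for p in filtered])  # ['/blog']
```

Each condition narrows the dataset, exactly like stacking filters in the data explorer.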
Pretty easy, right?
Rebecca wrote a full tutorial on how to use our data explorer.
The next step is to use quick filters.
Once you have started playing around with OQL, you can create your own quick filters, which allow you to save the OQL you applied to your dataset so that you can use it again later.
All the charts available in the app are based on this OQL. You can click on any metric to go to the data explorer, where the OQL is already populated for you.
Time for some creative thinking. We are going to use these quick filters as our own KPI:
In the app you can find the following:
Now let’s say I want all the pages whose H1 is a single word. I could use the following OQL.
The field looks empty, but it actually contains a whitespace; see the following screenshot, where it is highlighted.
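In plain Python, the “single-word H1” logic boils down to checking that the H1 contains no whitespace. A minimal sketch, using invented sample data:

```python
# Sketch: find pages whose H1 is a single word, i.e. contains no space.
# The sample pages are invented for illustration.

pages = [
    {"url": "/pricing", "h1": "Pricing"},
    {"url": "/blog/seo-tips", "h1": "10 SEO tips"},
    {"url": "/contact", "h1": "Contact"},
]

def single_word_h1(pages):
    """Return pages whose H1 contains no whitespace (a single word)."""
    return [p for p in pages if " " not in p["h1"].strip()]

print([p["url"] for p in single_word_h1(pages)])  # ['/pricing', '/contact']
```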
Now let’s say you want to look at pages with near duplicate content issues which are crawled by Google.
How about looking at all the pages identified, but not crawled (many reasons can explain why a page was not crawled)?
You can also use this:
You can go even further with scraped data including:
And even use data ingestion in order to pull pages with:
You can also save metrics available in OnCrawl as your own quick filters.
Some of you may have realised by now that some of these quick filters are also used as page groups for segmentation. It’s up to you how you want to visualise and use the data.
You now have a list of handy quick filters, so what’s next?
We are now going to use OnCrawl’s API to pull all of our data! This will allow us to create a table with each quick filter and the number of URLs per filter for each live crawl.
Head to this Jupyter Notebook in order to see the code to pull all this data and send it to a Google Sheet.
To run the Notebook on your own, you will need to provide:
We will then automatically pull all your live crawls and your custom quick filters. For each crawl, we will count the number of URLs matching each filter.
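The core aggregation the notebook performs can be sketched as follows. The actual API calls (endpoints, authentication) live in the notebook; here, the crawls and quick filters are in-memory stand-ins invented for illustration, so only the counting logic is shown.

```python
# Sketch of the per-crawl aggregation: for each live crawl, count the
# number of URLs matching each quick filter. The crawl data and the
# filter predicates are illustrative stand-ins for API results.

crawls = {
    "crawl_2021_01": [
        {"url": "/", "status_code": 200, "h1": "Home"},
        {"url": "/a", "status_code": 404, "h1": "Page A"},
    ],
    "crawl_2021_02": [
        {"url": "/", "status_code": 200, "h1": "Home"},
        {"url": "/a", "status_code": 200, "h1": "Page A"},
        {"url": "/b", "status_code": 301, "h1": "B"},
    ],
}

quick_filters = {
    "status_200": lambda p: p["status_code"] == 200,
    "single_word_h1": lambda p: " " not in p["h1"],
}

def count_urls_per_filter(crawls, quick_filters):
    """Build {crawl_id: {filter_name: url_count}} for every crawl/filter pair."""
    return {
        crawl_id: {
            name: sum(1 for page in pages if predicate(page))
            for name, predicate in quick_filters.items()
        }
        for crawl_id, pages in crawls.items()
    }

table = count_urls_per_filter(crawls, quick_filters)
print(table["crawl_2021_02"]["status_200"])  # 2
```

The result is exactly the table described above: one row per crawl, one column per quick filter.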
Now that you have all this data in Google Sheets, what can you do?
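One common first step is reshaping the nested counts into flat rows before writing them to a sheet (a library such as gspread could then upload them). The helper below only shows that flattening step; the crawl and filter names are hypothetical examples.

```python
# Sketch: flatten a {crawl: {filter: count}} table into spreadsheet rows
# (a header row plus one row per crawl). Names are hypothetical examples.

def to_sheet_rows(table):
    """Turn nested counts into a list of rows ready for a spreadsheet."""
    filter_names = sorted({name for counts in table.values() for name in counts})
    rows = [["crawl"] + filter_names]
    for crawl_id in sorted(table):
        rows.append([crawl_id] + [table[crawl_id].get(n, 0) for n in filter_names])
    return rows

table = {
    "crawl_2021_01": {"status_200": 1, "single_word_h1": 1},
    "crawl_2021_02": {"status_200": 2, "single_word_h1": 2},
}

for row in to_sheet_rows(table):
    print(row)
```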
In a future article, I will share a template you can use directly in Google Spreadsheet. Keep an eye out for it!