The Search Console Coverage Report provides information on which pages of your site have been indexed and lists the URLs that have presented problems while Googlebot tried to crawl and index them.
The main page in the coverage report shows the URLs in your site grouped by status:
This coverage report gives a lot more information than the old Google Search Console did. Google has really improved the data it shares, but there are still some things that need improvement.
As you can see below, Google shows a graph with the number of URLs in each category. If there is a sudden increase in errors, you can see it in the bars and even correlate it with impressions to determine whether an increase in URLs with errors or warnings is driving impressions down.
After a site launches or you create new sections, you want to see an increasing count of valid indexed pages. It takes Google a few days to index new pages, but you can use the URL Inspection tool to request indexing and reduce the time it takes for Google to find your new page.
However, if you see a declining number of valid URLs or sudden spikes, it is important to identify the URLs in the Errors section and fix the issues listed in the report. Google provides a good summary of action items to carry out when there are increases in errors or warnings.
Google provides information about what the errors are and how many URLs have that problem:
Remember that Google Search Console doesn’t show 100% accurate information. In fact, there have been several reports of bugs and data anomalies. Furthermore, Google Search Console takes time to update; the data has been known to run 16 to 20 days behind. Also, the report will sometimes list more than 1,000 pages in the errors or warnings categories, as you can see in the image above, but it only allows you to see and download a sample of 1,000 URLs to audit and check.
Nevertheless, this is a great tool for finding indexation issues on your site:
When you click on a specific error, you’ll be able to see the details page that lists examples of URLs:
As you can see in the image above, this is the details page for all of the URLs responding with 404. Each report has a “Learn More” link that takes you to a Google documentation page with details about that specific error. Google also provides a graph that shows the count of affected pages over time.
You can click on each URL to inspect it, which is similar to the old “fetch as Googlebot” feature from the old Google Search Console. You can also test whether the page is blocked by your robots.txt.
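If you want to run a quick robots.txt check outside of Search Console, a few lines of Python reproduce the same kind of test. This is a minimal sketch: the domain and URL are placeholders, and Search Console’s own tools remain the authoritative reference for how Google actually interprets your rules.

```python
# Minimal sketch: check whether a URL is crawlable for Googlebot
# according to robots.txt, using only the Python standard library.
# The domain and URL below are placeholders.
from urllib.robotparser import RobotFileParser

robots = RobotFileParser("https://www.example.com/robots.txt")
robots.read()  # fetch and parse the live robots.txt file

url = "https://www.example.com/some-page/"
if robots.can_fetch("Googlebot", url):
    print(f"Crawl allowed: {url}")
else:
    print(f"Blocked by robots.txt: {url}")
```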
After you fix URLs, you can request Google to validate them so that the error disappears from your report. You should prioritize fixing issues that are in validation state “failed” or “not started”.
It is important to mention that you should not expect all of the URLs on your site to be indexed. Google states that the webmaster’s goal should be to get all of the canonical URLs indexed. Duplicate or alternate pages will be categorized as excluded since they have similar content to the canonical page.
It is normal for sites to have several pages included in the excluded category. Most websites will have several pages with noindex meta tags or pages blocked through robots.txt. When Google identifies a duplicate or alternate page, you should make sure those pages have a canonical tag pointing to the correct URL and try to find the canonical equivalent in the valid category.
Google has included a dropdown filter at the top left of the report so you can filter the report by all known pages, all submitted pages, or URLs in a specific sitemap. The default report includes all known pages, which covers all of the URLs discovered by Google. All submitted pages includes only the URLs you’ve submitted through a sitemap. If you’ve submitted several sitemaps, you can filter by the URLs in each sitemap.
If Google is correct and the URL was incorrectly blocked, you should update your robots.txt file to allow Google to crawl the page.
Working at an agency, I have access to a lot of different sites and their coverage reports. I’ve spent time analyzing the errors that Google reports in the different categories.
It has been helpful for finding issues with canonicalization and duplicate content; however, sometimes you encounter discrepancies, such as the one reported by @jroakes:
Looks like Google Search Console > URL Inspection > Live Test incorrectly reports all JS and CSS files as Crawl allowed: No: blocked by robots.txt. Test about 20 files across 3 domains. pic.twitter.com/fM3WAcvK8q
— JR Oakes 🍺 (@jroakes) July 16, 2019
AJ Kohn wrote a great article soon after the new Google Search Console became available, in which he explains that the real value of the data lies in using it to paint a picture of health for each type of content on your site:
As you can see in the image above, the URLs from the different categories in the coverage report have been classified by page template, such as blog, service page, etc. Using several sitemaps for different types of URLs can help with this task, since Google allows you to filter coverage information by sitemap. Then he included three columns with the following information: % of indexed and submitted pages, valid rate, and % discovered.
This table really gives you a great overview of the health of your site. Now, if you want to dig into the different sections, I recommend reviewing the reports and double-checking the errors Google presents.
You can download all of the URLs presented in the different categories and use OnCrawl to check their HTTP status, canonical tags, etc., and create a spreadsheet such as this one:
Organizing your data like this can help you keep track of issues and add action items for URLs that need to be improved or fixed. It also lets you check off URLs that are correct and need no action, such as URLs with parameters that have a correct canonical tag implementation.
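If you prefer to script part of this audit yourself, alongside a crawler, the sketch below shows one way to check HTTP status codes and canonical tags for the exported URL sample. The file names, the one-URL-per-line export format and the output columns are assumptions for illustration, not a fixed format.

```python
# Sketch of a small audit for the URL sample exported from the coverage
# report. Assumes coverage_urls.csv contains one URL per line (no header);
# requests and beautifulsoup4 are third-party libraries.
import csv

import requests
from bs4 import BeautifulSoup

results = []
with open("coverage_urls.csv", newline="") as f:
    for row in csv.reader(f):
        url = row[0].strip()
        try:
            resp = requests.get(url, timeout=10, allow_redirects=False)
        except requests.RequestException as exc:
            results.append({"url": url, "status": f"error: {exc}",
                            "canonical": "", "self_canonical": False})
            continue

        canonical = ""
        if "text/html" in resp.headers.get("Content-Type", ""):
            soup = BeautifulSoup(resp.text, "html.parser")
            link = soup.find("link", attrs={"rel": "canonical"})
            if link and link.get("href"):
                canonical = link["href"]

        results.append({
            "url": url,
            "status": resp.status_code,
            "canonical": canonical,
            "self_canonical": canonical == url,  # page canonicalizes to itself
        })

with open("coverage_audit.csv", "w", newline="") as f:
    writer = csv.DictWriter(
        f, fieldnames=["url", "status", "canonical", "self_canonical"])
    writer.writeheader()
    writer.writerows(results)
```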
You can even add more information to this spreadsheet from other sources, such as Ahrefs, Majestic, and Google Analytics, with OnCrawl integrations. This would allow you to extract link data as well as traffic and conversion data for each of the URLs in Google Search Console. All of this data can help you make better decisions about what to do for each page. For example, if you have a list of pages with 404s, you can tie this to backlinks to determine whether you are losing any link equity from domains linking to broken pages on your site. Or you can check indexed pages and how much organic traffic they are getting. You could identify indexed pages that do not get organic traffic and work on optimizing them (improving content and usability) to help drive more traffic to those pages.
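As a rough illustration of the 404/backlink example above, the short pandas sketch below joins the exported list of 404 URLs with a backlink export. The file names and the url, backlinks and referring_domains columns are assumptions; adjust them to whatever your tools actually export.

```python
# Join the coverage report's 404 sample with a backlink export to see
# which broken pages still attract link equity. File and column names
# are assumptions about your exports.
import pandas as pd

errors = pd.read_csv("coverage_404_urls.csv")    # assumed column: url
backlinks = pd.read_csv("backlinks_export.csv")  # assumed columns: url, backlinks, referring_domains

merged = errors.merge(backlinks, on="url", how="left").fillna(0)

# 404 pages that still have referring domains are the best candidates
# for redirects, since they are the ones leaking link equity.
to_redirect = merged[merged["referring_domains"] > 0].sort_values(
    "referring_domains", ascending=False
)
print(to_redirect.head(20))
```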
With this extra data, you can create a summary table in another spreadsheet. You can use the formula =COUNTIF(range, criteria) to count the URLs of each page type (this table can complement the table that AJ Kohn suggested above). You could also use the formula =SUMIF(range, criteria, [sum_range]) to add up the backlinks, visits or conversions that you extracted for each URL and show them in your summary table. You would get something like this:
I really like working with summary tables that give me a condensed view of the data and help me identify the sections I need to focus on fixing first.
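If you keep this data outside of a spreadsheet, the same summary table can be built with pandas: a groupby count stands in for COUNTIF and a groupby sum for SUMIF. The file name and the page_type, backlinks and visits columns below are assumptions about how you labeled your combined data.

```python
# Build the summary table in pandas instead of spreadsheet formulas.
# Assumed combined export with columns: url, page_type, backlinks, visits.
import pandas as pd

df = pd.read_csv("combined_audit.csv")

summary = df.groupby("page_type").agg(
    urls=("url", "count"),           # COUNTIF equivalent: URLs per page type
    backlinks=("backlinks", "sum"),  # SUMIF equivalent: total backlinks
    visits=("visits", "sum"),        # SUMIF equivalent: total visits
)
print(summary)
```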
When working on fixing issues and looking at the data in this report, you need to ask yourself: Is my site optimized for crawling? Are my indexed and valid pages increasing or decreasing? Are pages with errors increasing or decreasing? Am I allowing Google to spend time on the URLs that will bring more value to my users, or is it finding a lot of worthless pages? With the answers to these questions, you can start making improvements to your site so that Googlebot can spend its crawl budget on pages that provide value to your users instead of worthless ones. You can use your robots.txt to improve crawling efficiency, remove worthless URLs when possible, or use canonical or noindex tags to prevent duplicate content.
Google keeps adding functionality and improving data accuracy across the different reports in Google Search Console, so hopefully we will continue to see more data in each of the categories of the coverage report, as well as in other Google Search Console reports.