When it comes to improving your rankings, an XML sitemap can be a really good partner. This protocol helps Google and other major search engines understand your website's structure while crawling it. It was first introduced by Google in 2005, with MSN and Yahoo adding their support for the protocol a year later. Sitemaps are known as a URL inclusion protocol because they advise search engines on what to crawl. This is the opposite of robots.txt files, which are an exclusion protocol: they tell search engines what not to crawl.
The website Blue Corona made a good comparison between an XML sitemap and a blueprint for a house.
Think of your website as a house and each page of your site as a room. You can think of an XML Sitemap as a blueprint for your house: if your website were a house and each web page were a room, your XML Sitemap would be the blueprint, making it easy for Google, the proverbial home inspector of the web, to quickly and easily find all the rooms within your house.
In other words, an XML sitemap makes it easier for Google to find your pages when it crawls your website. This matters because each of your pages can be ranked, not just your website as a whole. The sitemap tells search engines which pages on your site are available for crawling. While having no XML sitemap is not penalized, creating one is highly recommended because it can improve your SEO.
Why should you get an XML sitemap?
As mentioned, an efficient XML sitemap can improve your rankings. It is particularly useful when:
- You have a website with a complicated structure or many internal links
- Your site is new or has only a few external links
- Your site is large and has archived content
- Your website has dynamic pages (common on e-commerce websites).
Benefits of having an XML sitemap
Having a sitemap on your site passes more data to search engines. It also:
- Lists all the URLs on your site, including pages that search engines would not otherwise have found.
- Gives engines page priority and thus crawl priority. You can add a “priority” tag to your XML sitemap to indicate which pages are the most important, so that bots can focus on these priority pages first.
- Gives temporal information. You can also include two other optional tags that pass extra data to search engines to help them crawl your website. The first one, “lastmod”, tells them when a page last changed. The second one, “changefreq”, tells them how often a page is likely to change.
- Gives you information back from Google Webmaster Central. You can review Googlebot activity, for instance.
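As a rough illustration of the optional tags described above, here is a minimal Python sketch that builds a single sitemap entry with “lastmod”, “changefreq” and “priority” filled in. The function name build_url_entry, the example date and the values are assumptions made for this illustration, and search engines are free to treat these tags as hints rather than instructions.

```python
import xml.etree.ElementTree as ET


def build_url_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a <url> element; the three hint tags are added only when given."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    if lastmod is not None:
        # Date of the last change, e.g. 2024-01-15 (illustrative value)
        ET.SubElement(url, "lastmod").text = lastmod
    if changefreq is not None:
        # How often the page is likely to change, e.g. daily or weekly
        ET.SubElement(url, "changefreq").text = changefreq
    if priority is not None:
        # Relative importance within your own site, from 0.0 to 1.0
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    return url


# Hypothetical home page entry: changes often, highest priority
entry = build_url_entry(
    "https://www.yoursite.com/",
    lastmod="2024-01-15",
    changefreq="daily",
    priority=1.0,
)
entry_xml = ET.tostring(entry, encoding="unicode")
print(entry_xml)
```

Because each entry carries its own values, the home page can be marked as changing daily while archive pages are marked as changing rarely.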
How to set up your XML sitemap
Creating your XML sitemap can be quite easy, as many content management systems can generate one automatically. If you use that option, make sure the output is in the correct format and error-free. For Google, the required format is Sitemap Protocol 0.9. Your sitemap should:
- Begin with an opening <urlset> tag and end with a closing </urlset> tag.
- Specify the namespace (protocol standard) within the <urlset> tag.
- Include a <url> entry for each URL, as a parent XML tag.
- Include a <loc> child entry for each <url> parent tag.
- Use UTF-8 encoding.
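The requirements above can be sketched as a small Python check. This is only an illustration, not an official validator: it assumes the tag names defined by the Sitemap protocol (<urlset>, <url>, <loc>) and its 0.9 namespace, and the function name check_sitemap and the sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by Sitemap Protocol 0.9
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_bytes):
    """Return a list of problems; an empty list means the basic checks pass."""
    try:
        xml_bytes.decode("utf-8")  # the file must be UTF-8 encoded
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]
    try:
        root = ET.fromstring(xml_bytes)  # opening and closing tags must match
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        problems.append("root element is not <urlset> in the 0.9 namespace")
    urls = root.findall(f"{{{SITEMAP_NS}}}url")
    if not urls:
        problems.append("no <url> entries found")
    for position, url in enumerate(urls, start=1):
        loc = url.find(f"{{{SITEMAP_NS}}}loc")
        if loc is None or not (loc.text or "").strip():
            problems.append(f"<url> entry {position} has no <loc>")
    return problems


# Illustrative sitemap: the second entry is missing its <loc> child
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.yoursite.com/</loc></url>
  <url><lastmod>2024-01-15</lastmod></url>
</urlset>"""

problems = check_sitemap(sample)
print(problems)
```

A check like this only catches structural mistakes in the file itself; it does not replace verifying the sitemap with the search engine's own tools.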
Once your sitemap is ready, you must verify it with Google Webmaster Tools to ensure it is in the right format and correctly uploaded to your web server.
For small websites that do not publish content very often, you can use the XML Sitemap Generator. It lets you define how often your pages are updated and which modification date is used. Once the generator has created your sitemap, you need to upload it to the root of your domain, e.g. www.yoursite.com/sitemap.xml.
However, this tool is limited in several ways. You can only add five hundred pages, and it applies the same “change frequency” to all URLs. That makes it unsuitable for any website that publishes content every week, since you want your home page spidered more frequently than other pages.
If you are on WordPress and already using the WordPress SEO by Yoast plugin, use it to create your sitemap because it is dead simple. The website Elegantthemes published a nice guide to setting up your sitemap with WordPress.
Oncrawl helps you monitor your sitemaps by giving you a clear overview of:
- pages in sitemaps
- compliant pages in sitemaps
- 3xx redirects in sitemaps
- 4xx errors in sitemaps
- types of sitemaps
- structured data in sitemaps
- orphan pages in sitemaps
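To make a few of these categories concrete, here is a hedged Python sketch that sorts sitemap URLs into 3xx redirects, 4xx errors and orphan pages. It is not how Oncrawl works internally. The function name summarize_sitemap, the sample URLs and the definition of an orphan page used here (a URL listed in the sitemap that no internal link points to) are assumptions made for the illustration.

```python
def summarize_sitemap(status_by_url, linked_urls):
    """Group sitemap URLs by issue type.

    status_by_url: dict mapping each sitemap URL to its HTTP status code.
    linked_urls: set of URLs reachable by following internal links.
    """
    report = {"redirects_3xx": [], "errors_4xx": [], "orphans": []}
    for url, status in sorted(status_by_url.items()):
        if 300 <= status < 400:
            report["redirects_3xx"].append(url)
        elif 400 <= status < 500:
            report["errors_4xx"].append(url)
        # Assumed definition: listed in the sitemap but never linked internally
        if url not in linked_urls:
            report["orphans"].append(url)
    return report


# Hypothetical crawl results for four sitemap URLs
status_by_url = {
    "https://www.yoursite.com/": 200,
    "https://www.yoursite.com/old-page": 301,
    "https://www.yoursite.com/missing": 404,
    "https://www.yoursite.com/landing": 200,
}
linked_urls = {
    "https://www.yoursite.com/",
    "https://www.yoursite.com/old-page",
    "https://www.yoursite.com/missing",
}

report = summarize_sitemap(status_by_url, linked_urls)
print(report)
```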