Use semantic HTML5 tags to focus your page content and avoid pollution and dilution of the page’s theme.

Before we get down to the nitty-gritty, a quick word for those of you who are not really sure what the difference is between semantic HTML5 tags and other HTML tags. Generic HTML tags like <div> and <span> are just containers that give no indication of the type of content they contain. In fact, a <div> can hold just about anything and is used as the most basic building block for structuring pages.

So what then are semantic HTML5 tags?

Semantic HTML5 tags have a specific role to play and tell us what kind of content we can expect them to contain. Two tags that have been around since forever work the same way: <head> and <body>. A browser knows for sure that everything in <head> is metadata about the page and everything in <body> is the visible part of the page it will show to the user.

The tags we’ll be looking at specifically in this article are <header>, <footer>, <main>, <article>, <aside>, <section> and, to a certain extent, <nav>.

Why just those seven tags? Because they are all we need to show the search engine algorithm where the important content is.

Why do we need to do this?

So why is it important to show the search engines where this content is? Can’t it just work it out for itself? I mean, Google is pretty smart, right? Yes, Google is smart, and getting smarter, but by flagging the important content you are not only saving it some work but YOU control the game!

Let’s consider a case where this can be particularly useful. A few years ago I worked on the SEO of a car leasing website. In theory, this is a really easy kind of site to structure, as the offers were categorized by manufacturer > range > model, so there was plenty of inherited context by the time you reached the model level.

The problem was that each model page contained similar offers and related blog articles. So, for example, a page about a BMW car would contain similar offers for Mercedes, Audi and Jaguar models, and blog articles about the car industry in general that may mention Renault, Volkswagen or any other brand. If you look at the text content of the page, there’s a lot of pollution: mentions of all kinds of things that have nothing to do with BMWs.

A case study

There is much debate about how important the semantic tags are and what Google actually does with them. So here is a screenshot from a relatively high-traffic site I worked on two years ago with Jason Barnard. The red line shows the moment the semantic HTML5 tags were integrated into the page templates, and you can see the 30% gain in traffic that resulted.


So how on earth do search engines know what your page is about?

As humans, we can analyse the page layout visually and we know instinctively from experience what the main content is. But how do search engines see your page? They see a jumble of text that is pretty much all about cars. OK so far, but remember that your site mentions BMW, Mercedes, Audi, Jaguar and Renault on the same page, along with car insurance and other news about the car industry in general.

A machine can “guess” by looking at signals like the <title> and the <h1> tags. It can also look at the number of times words appear in the text. This is all pretty standard analysis and it will probably reach the correct conclusion … but why not tell it very specifically where the only content it has to consider is located?
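To make the "counting words" idea concrete, here is a minimal sketch in Python using only the standard library. It is not how any search engine actually works; it just counts brand names in the raw text of a page. The sample page, the brand list and the brand_counts function are all made up for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects all text on the page, ignoring the page structure entirely."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def brand_counts(html, brands):
    """Count how often each brand name appears in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[a-z]+", " ".join(parser.chunks).lower())
    counts = Counter(words)
    return {brand: counts[brand.lower()] for brand in brands}


# Hypothetical model page: the main copy is about BMW, but the related
# offers and blog teasers mention other brands too.
page = """
<div id="main">
  <h1>BMW 3 Series leasing</h1>
  <p>Lease the BMW 3 Series from our BMW specialists.</p>
  <div>Similar offers: Mercedes C-Class, Audi A4, Jaguar XE</div>
  <div>From the blog: Renault and Audi announce new models</div>
</div>
"""

print(brand_counts(page, ["BMW", "Mercedes", "Audi", "Jaguar", "Renault"]))
# {'BMW': 3, 'Mercedes': 1, 'Audi': 2, 'Jaguar': 1, 'Renault': 1}
```

Even in this tiny example, BMW accounts for only three of the eight brand mentions. A simple count still picks the right brand here, but the signal is clearly diluted.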

How can we do this?

In the same way as we use the <head> and <body> tags to delimit areas of the HTML code, we’ll build a structure, an invisible structure, that will add only a few bytes to the weight of the page but will act like the administrative districts in a city. The bots will know exactly where they are and what the purpose of each area is.

* NOTE: do not apply classes or styles to the semantic elements. You need to be able to add them, remove them or move them around without it affecting the look of the page in any way!

The first thing we need to do is to separate the header bar stuff and the footer bar stuff from the main content.

We need to split the page up into smaller chunks to organise the content blocks and, until now, we’ve been using <div> tags to do this (shudders, remembering when page layouts were done using HTML tables). So what’s the problem with using <div> tags? Nothing, except that they tell us nothing about the role of their content.

You can give divs an id, like this:

<div id="header"></div>
<div id="main"></div>
<div id="footer"></div>

but this doesn’t actually tell the machines anything. You might as well call them:


<div id="john"></div>
<div id="paul"></div>
<div id="george"></div>
<div id="ringo"></div>

We need something that tells us what the role of each block is, just as if we wrote:

<beatles>
     <singer id="john"></singer>
     <bassist id="paul"></bassist>
     <guitarist id="george"></guitarist>
     <drummer id="ringo"></drummer>
</beatles>

Luckily, there are semantic HTML5 tags to do just this: we can use <header>, <main> and <footer> tags. Like this:

<body>
     <header></header>
     <main></main>
     <footer></footer>
</body>
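If you want to check whether a template already uses these three tags, a quick audit could look something like this. It's a minimal sketch in Python using only the standard library. The missing_landmarks function and the two template strings are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

# The three top-level semantic tags discussed above.
LANDMARKS = ("header", "main", "footer")


class TagCounter(HTMLParser):
    """Counts every opening tag it sees (tag names arrive lowercased)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def missing_landmarks(html):
    """Return the landmark tags that never appear in the given HTML."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return [tag for tag in LANDMARKS if parser.counts[tag] == 0]


# Hypothetical templates: one built with <div> ids, one with semantic tags.
old_template = (
    '<body><div id="header"></div><div id="main"></div>'
    '<div id="footer"></div></body>'
)
new_template = "<body><header></header><main></main><footer></footer></body>"

print(missing_landmarks(old_template))  # ['header', 'main', 'footer']
print(missing_landmarks(new_template))  # []
```

Notice that the ids in the first template make no difference to the result: a parser sees three <div> tags and nothing more.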

The <header> and <footer> will probably contain some navigation menus wrapped in <nav> tags, but that doesn’t concern us here.

So let’s look at the <main> block.

The <main> tag

Because there is a huge range of different types of content that we can put in the <main> block we need to be able to isolate the content that is specific to the current page and leave out everything else. To do this we can use the <article> tag, which will contain the <h1>, like this:

<main>
     <article>
          <h1></h1>
          Specific page content
     </article>
</main>

All the specific content for this page will go into the <article> tags.

Note here that “article” does not necessarily mean article in the sense of a newspaper article but just a thing, like an article of clothing, a product, a blog post, an “About Us” page, a recipe…

So far, so good. But what about all the other content in the page? We need to divide it into two groups: content items that are associated in some way with the main page content and content items that are more general to the site.

 

<main>
     <article>
          <h1></h1>
          Specific page content
          [Additional content directly associated with the article content]
     </article>
     [Additional content NOT associated with the article content]
</main>

 

Look at the table below for some ideas about what additional content needs to be inside the article tags and what needs to stay outside.

Page contents to include or exclude in the article tag

Page type: Blog article
Additional content inside the <article>: author information; comments; ratings; associated articles
Additional content outside the <article>: links to other blog categories; product promotions; sign-up form; any other unrelated content

Page type: Product page
Additional content inside the <article>: reviews and ratings of the product; comments about the product; mentions of the product elsewhere on the web; similar products; links to blog articles associated with the product
Additional content outside the <article>: links to other product ranges; products on special offer; latest blog articles; sign-up forms; any other unrelated content

How do we tell the machine that this content that we have just defined as “additional content” is just that? This is where the <aside> tag comes into play.

The <aside> tag

This is what our simplified code will look like when we have included the <aside> tags:

 

<main>
     <article>
          <h1></h1>
          Specific page content
          <aside>
               [Additional content directly associated with the article content]
          </aside>
     </article>
     <aside>
          [Additional content NOT associated with the article content]
     </aside>
</main>

 

Now we have told search engines that anything in an <aside> tag is secondary and should not be considered part of the main content.
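To see why this structure makes life easy for a machine, here is a minimal sketch in Python (standard library only) that keeps the text inside <article> and skips anything inside an <aside>. This illustrates the principle and is not a description of what Google actually does. The ArticleTextExtractor class, the core_text function and the sample page are made up for illustration.

```python
from html.parser import HTMLParser


class ArticleTextExtractor(HTMLParser):
    """Keeps text inside <article>, skipping anything nested in <aside>."""

    def __init__(self):
        super().__init__()
        self.article_depth = 0  # how many <article> tags are currently open
        self.aside_depth = 0    # how many <aside> tags are currently open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_depth += 1
        elif tag == "aside":
            self.aside_depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.article_depth:
            self.article_depth -= 1
        elif tag == "aside" and self.aside_depth:
            self.aside_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep text only when we are inside an article and outside any aside.
        if text and self.article_depth and not self.aside_depth:
            self.chunks.append(text)


def core_text(html):
    """Return the page-specific text: inside <article>, outside <aside>."""
    parser = ArticleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical model page following the structure shown above.
page = """
<main>
  <article>
    <h1>BMW 3 Series leasing</h1>
    <p>Lease the BMW 3 Series today.</p>
    <aside>Reviews of the BMW 3 Series</aside>
  </article>
  <aside>Similar offers: Mercedes, Audi, Jaguar</aside>
</main>
"""

print(core_text(page))
# BMW 3 Series leasing Lease the BMW 3 Series today.
```

With the tags in place, a dozen lines of logic are enough to isolate the BMW content, and the Mercedes, Audi and Jaguar mentions never enter the picture.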