
Website migrations: Thinking beyond your HTML pages

October 3, 2023 - 12 min reading time - by Ziggy Shtrosberg

You may have to migrate your site for a number of different reasons. Whether it’s for a domain change or a platform change, there are a lot of things to consider before implementing a migration. One point to keep in mind is that search engines do not merely index and rank HTML content but also files like PDFs, images and videos.

In this article, I will explore the importance of thinking about non-HTML assets as part of your SEO migration strategy and key opportunities that could be used along the way – especially when working with websites using a custom CMS or a complex enterprise technology stack.

Thinking about and optimizing these files will contribute to a successful migration and can improve search engine rankings and visibility in the search results.

PDF documents

PDF documents still significantly affect web traffic and conversions for many websites. However, the importance of PDFs can vary depending on the type of website, its content, and its target audience.

As part of your initial migration benchmarking process, it is critical to discover the PDFs available on your website and whether they generate organic traffic or contain valuable backlinks.

The most common method is to use an SEO tool, such as Oncrawl, in which you can enable the scraping feature to find any pages in your internal website structure that contain or link to PDFs.


However, you can also use Google Search Console, GA4 or Google’s site search operator to check if PDFs are found in your site structure.
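If you would rather script a quick check yourself, the discovery step can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a replacement for a full crawl: the `PdfLinkFinder` class, the `find_pdf_links` helper and the sample HTML are all hypothetical names made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkFinder(HTMLParser):
    """Collects absolute URLs of <a> links that point at .pdf files."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        absolute = urljoin(self.base_url, href)
        # Check the path only, so a query string does not hide the extension.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.pdf_links.append(absolute)


def find_pdf_links(html, base_url):
    """Return every PDF link found in one page's HTML source."""
    finder = PdfLinkFinder(base_url)
    finder.feed(html)
    return finder.pdf_links


# Hypothetical page source standing in for a crawled page.
sample_html = """
<a href="/docs/brochure.pdf">Brochure</a>
<a href="/about">About us</a>
<a href="https://example.com/files/Policy.PDF?v=2">Policy</a>
"""
pdf_links = find_pdf_links(sample_html, "https://example.com/")
```

Running the same helper over every crawled page gives you a starting inventory that you can then join with traffic and backlink data.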

Convert PDFs to HTML pages

Before a migration occurs, you have the opportunity to convert PDF documents to standard HTML pages. While PDFs have their place for specific types of content, there are a lot of advantages to replacing PDF content with HTML pages, such as:

  • The content is easier to crawl and index.
  • Accessibility improves.
  • The pages are more mobile friendly.
  • HTML pages tend to load faster.
  • It is better for link building and passing PageRank across the site.
  • The content is easier to update.
  • The pages can include structured data.

Converting PDFs to HTML pages could give your website a nice boost in post-migration SEO performance and is worth considering as part of your strategy.

Hosting PDF documents

Particularly for larger or non-traditional website builds, it is essential to consider where PDFs will be hosted after migration.

Developers often prefer alternative hosting solutions for non-HTML assets, such as AWS, Azure or CloudFront. There are many advantages to using such solutions, including cost savings, scalability, load balancing and CDN infrastructure.

At the same time, it’s equally important to consider whether the URL structure will change and whether the PDFs will still be served from your domain. After all, it can dramatically change the internal linking profile of PDF documents on your newly launched website.

Redirect mapping and PDF indexation

If PDFs generate organic traffic or contain valuable backlinks, they should be redirected as part of your SEO migration strategy.
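As a rough sketch of how that decision could be turned into a redirect map, the Python below filters a benchmark export down to the PDFs that have clicks or referring domains and pairs each with its new location. The CSV columns, the URLs and the `build_redirect_map` helper are assumptions for illustration; your own export will have a different shape.

```python
import csv
import io

# Hypothetical benchmark export: old PDF URL, organic clicks, referring domains.
benchmark_csv = """old_url,clicks,referring_domains
/files/guide.pdf,120,4
/files/old-flyer.pdf,0,0
/files/whitepaper.pdf,0,7
/files/pricing.pdf,35,0
"""

# Hypothetical old -> new locations agreed with the developers.
new_locations = {
    "/files/guide.pdf": "/resources/guide/",
    "/files/whitepaper.pdf": "/assets/whitepaper.pdf",
}


def build_redirect_map(rows, locations):
    """Return (redirects, unmapped) for PDFs that have traffic or backlinks."""
    redirects, unmapped = {}, []
    for row in rows:
        valuable = int(row["clicks"]) > 0 or int(row["referring_domains"]) > 0
        if not valuable:
            continue  # nothing to preserve, so no redirect is required
        target = locations.get(row["old_url"])
        if target:
            redirects[row["old_url"]] = target
        else:
            unmapped.append(row["old_url"])  # valuable, but no destination yet
    return redirects, unmapped


rows = list(csv.DictReader(io.StringIO(benchmark_csv)))
redirects, unmapped = build_redirect_map(rows, new_locations)
```

The `unmapped` list is the useful part before launch: it holds the valuable PDFs that still have no destination.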

Additionally, you may wish to remove some PDFs from the search results – either because they don’t generate meaningful traffic or don’t belong in search results.

For instance, I’ve recently worked with a client in the insurance industry that unknowingly had many of its insurance policy documents accessible to search bots and showing in the search results. The migration was a perfect opportunity to remove the PDFs from the search engine index.
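A PDF cannot carry a robots meta tag, so the usual way to keep one out of the index is the `X-Robots-Tag: noindex` HTTP response header. The sketch below shows only the decision logic in Python; the path prefixes and the `robots_headers_for` function are hypothetical, and how the header is actually attached depends on your server or CDN.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes for documents that should not appear in search.
PRIVATE_PDF_PREFIXES = ("/policies/", "/internal/")


def robots_headers_for(url):
    """Return extra response headers for a URL (empty dict if none apply)."""
    path = urlparse(url).path.lower()
    if path.endswith(".pdf") and path.startswith(PRIVATE_PDF_PREFIXES):
        return {"X-Robots-Tag": "noindex"}
    return {}


policy_headers = robots_headers_for(
    "https://www.example.com/policies/home-cover.pdf"
)
guide_headers = robots_headers_for("https://www.example.com/guides/buying.pdf")
```

Keep in mind that search bots must still be able to crawl the PDF to see the header, so the same URLs should not be blocked in robots.txt.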

Associated risks

If your website gets organic traffic or high-quality backlinks from PDF documents, setting redirects as part of your migration strategy will be an important step you shouldn’t omit.

Benchmarking and taking stock of your existing PDFs will be necessary when performing a pre/post analysis once the new website is migrated. Failing to carry out pre-migration checks may result in traffic and position drops.

At the same time, thinking beyond your website’s existing PDF strategy could provide a nice post-migration boost in SEO performance, so keeping an open mind in how you approach PDF documents as part of your migration strategy is worthwhile.

Images

Some websites receive a decent amount of organic traffic from images; therefore, thinking about an image SEO migration strategy is paramount. In my experience, I’ve come across a number of opportunities with different clients that you may want to factor into your migration plan.

One of the best ways I’ve found to speed up the indexation of images in your newly migrated website is to nest images in your XML sitemap or create a dedicated image XML sitemap.

Often, there can be a delay in search engines discovering and indexing new images. Yet, including image URLs in your XML sitemap can speed up the process.

What’s more, search engines can attribute images found in your XML sitemap to your content, thereby improving the visibility of your images in the image search results and driving additional organic traffic to your website.

However, it is important to remember that images don’t rank on their own. The image references in the sitemap must match the image URLs found on the page while also being relevant and adding value to the page’s topic.
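To make the nesting concrete, here is a minimal Python sketch that builds a sitemap with image references under each page URL, using the sitemap image namespace that Google documents. The `build_image_sitemap` function and the example URLs are hypothetical; in practice the page-to-image mapping would come from your CMS or a crawl.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs used on that page."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            # Each image is nested under the page it appears on.
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page and image URLs.
image_sitemap_xml = build_image_sitemap({
    "https://www.example.com/services/": [
        "https://media.example.com/img/team.jpg",
        "https://media.example.com/img/office.jpg",
    ],
})
```

Because the image URLs are taken from the same mapping as the pages, this approach makes it easier to keep the sitemap references identical to the URLs rendered on the page.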

Image alt text

When working with clients, I typically use the pre-launch phase of a migration process to assess whether image alt text is missing or needs optimization and subsequently plan out the work required.

Alt text provides an opportunity to add contextual relevance to a webpage’s HTML content and insert head and long tail keywords in the alt text description. Furthermore, it enhances the accessibility and the user experience of your website for people with visual impairments who rely on screen readers or other assistive technologies to access web content.
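A simple audit of this kind can be sketched with Python’s standard HTML parser. The `AltTextAuditor` class and the sample markup are hypothetical, and the check is deliberately blunt: it flags images whose alt attribute is missing or empty, which still needs a human review because an empty alt is legitimate for purely decorative images.

```python
from html.parser import HTMLParser


class AltTextAuditor(HTMLParser):
    """Records the src of every <img> with a missing or empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        if alt is None or not alt.strip():
            self.missing_alt.append(attributes.get("src") or "")


# Hypothetical page source standing in for a crawled page.
sample_page = """
<img src="/img/team.jpg" alt="Our support team at the Leeds office">
<img src="/img/banner.jpg">
<img src="/img/divider.png" alt="">
"""
auditor = AltTextAuditor()
auditor.feed(sample_page)
images_to_review = auditor.missing_alt
```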

Page speed performance

According to the 2022 Web Almanac, images make up a significant portion of a typical web page’s weight.

From experience, it is also an issue I encounter frequently, whether I work with small businesses or large enterprise clients. Poorly optimized images contribute to slow page loads and a poor user experience.

From a technical SEO perspective, large image sizes also negatively impact a search bot’s experience of a webpage and, as a byproduct, your overall SEO performance.

A website migration often includes re-platforming to a new technology stack or upgrading the existing platform to the latest version. In both instances, the developers frequently use the migration to improve the website’s infrastructure or add new services. It also serves as the perfect occasion to think about and discuss image optimization with developers.

Some of the more common image optimization solutions include:

  • Resizing and compressing images before uploading files to your CMS.
  • Using the srcset attribute on <img> elements to specify multiple versions of an image file, along with their respective widths or pixel densities.
  • Using the HTML or JS-based lazy loading functionality to defer the loading of images until they are needed.
  • Enabling next-generation image file formats such as WebP and AVIF.
  • Implementing network compression with GZIP or Brotli to efficiently transfer files over the network.
  • Implementing a media CDN to serve your images from a location closer to a user’s browser.
  • Using Priority Hints (the fetchpriority attribute) to improve image LCP (Largest Contentful Paint) in your Core Web Vitals scores.
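Several of the points above come together in the image markup itself. The Python sketch below is one way a template helper might emit an `<img>` tag with `srcset`, `sizes`, and either `loading="lazy"` or `fetchpriority="high"`. Those attributes are standard HTML, but the `responsive_img` helper, its parameters and the file paths are hypothetical, and the right breakpoints depend on your design.

```python
from html import escape


def responsive_img(src_by_width, alt, sizes, above_the_fold=False):
    """Build an <img> tag from a {width_in_px: url} mapping."""
    widths = sorted(src_by_width)
    srcset = ", ".join(f"{src_by_width[w]} {w}w" for w in widths)
    attrs = {
        "src": src_by_width[widths[-1]],  # fallback when srcset is ignored
        "srcset": srcset,
        "sizes": sizes,
        "alt": alt,
    }
    if above_the_fold:
        # Likely LCP candidate: fetch early and never lazy load it.
        attrs["fetchpriority"] = "high"
    else:
        # Below the fold: defer loading until the image is near the viewport.
        attrs["loading"] = "lazy"
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs.items()
    )
    return f"<img {rendered}>"


# Hypothetical image variants produced by the build pipeline.
variants = {480: "/img/hero-480.webp", 1200: "/img/hero-1200.webp"}
hero_tag = responsive_img(variants, "Team at work", "100vw", above_the_fold=True)
gallery_tag = responsive_img(variants, "Team at work", "50vw")
```

The design choice worth discussing with developers is the split in the `if` branch: the hero image gets priority, while everything else is lazy loaded.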

Text inside images

Search engines rely on text-based content to understand the relevance and context of a web page. They cannot reliably “read” or interpret text within images. Not yet, anyway. If your crucial textual content is embedded in images, search engines won’t be able to index and rank that content effectively.

Additionally, users with visual impairments who rely on screen readers or other assistive technologies won’t be able to access text within images.

Again, the migration process offers an opportunity to pull embedded text out of images and render it as part of the HTML.

I’ve previously migrated a website where many of its key service pages contained paragraphs of text embedded in images. We fixed the issue in the early pre-migration phase and once the new website was launched, we noticed that the service pages started to perform better in the search results for their target keywords.

Structured data

If images in the existing website contain structured data, it should also be implemented in the new website as part of your migration plan. You can add image-related structured data to improve your image SEO strategy and help search bots better understand the context of images on the new website.

For example, you can use the ImageObject schema type to describe various attributes of key images on a page in more granularity, such as their location, author, date published, description, name and EXIF data.

Additionally, you can use the ‘primaryImageOfPage’ property to indicate to search bots the primary image on the page. The property belongs to the WebPage type and is therefore also available on its more specific subtypes, such as AboutPage, ProfilePage, RealEstateListing and many others.
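As an illustration, the following Python sketch assembles JSON-LD for a WebPage whose primary image is described with ImageObject. The schema.org property names are real, but every value, URL and variable name here is a made-up placeholder, and which properties you include is a judgment call for your own site.

```python
import json

# Hypothetical values; in a real template these would come from the CMS.
page_markup = {
    "@context": "https://schema.org",
    "@type": "WebPage",
    "url": "https://www.example.com/services/",
    "primaryImageOfPage": {
        "@type": "ImageObject",
        "contentUrl": "https://media.example.com/img/team.jpg",
        "name": "Support team",
        "description": "Our support team at the Leeds office",
        "author": {"@type": "Person", "name": "Jane Doe"},
        "datePublished": "2023-10-03",
    },
}

# Serialize once and wrap in the script tag that goes into the page source.
json_ld = json.dumps(page_markup, indent=2)
script_tag = f'<script type="application/ld+json">\n{json_ld}\n</script>'
```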

The reality is, however, that for most migration projects, I would classify image structured data as a “nice-to-have” activity rather than a “must-have”. That said, if image SEO makes up a big part of a website’s traffic, it is worthwhile to implement and, ideally, to work with the developers to dynamically add structured data templates, including image schema, to the source HTML.

Associated risks

In my experience, most websites I’ve helped migrate receive minimal traffic from images. However, if your website does, it is vital to ensure that images are reindexed as soon as possible after migration if you don’t want to risk a potential loss of traffic.

You can identify image traffic in the Performance section of Google Search Console using the image search type. The dashboard is also the best place for pre/post-migration analysis. If you spot any migration issues, you can proactively take the necessary steps to resolve the matter.

Image tracking in GSC

If you use a media CDN (content delivery network) for your newly migrated website, ask the developers to use a branded subdomain, such as media.example.com and verify it in Google Search Console. The verification allows you to monitor indexation rates and check for organic image traffic for your CDN.

Videos

Unlike standard HTML pages, you cannot redirect videos. So, you’ll have to follow the most important rule of video SEO: only videos present on a page can “rank”.

As part of your migration planning, ensure that any video showing on the old site is embedded on the same page on the new website. Again, you can use a tool like Oncrawl to identify pages with videos pre-migration and monitor the same pages post-migration. Additionally, you can access the ‘Video pages’ section found in Google Search Console to take stock of video indexation levels.

[Figure: Video pages in GSC]

Video sitemap

You can use a video sitemap or a regular XML sitemap with nested video references to help search engines make sense of the video migration changes in your newly launched website. The sitemap video references will help search bots discover new video locations faster and speed up indexation.

[Figure: an example of the Guardian newspaper’s video sitemap]
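For a sense of what nested video references look like when generated rather than hand-written, here is a minimal Python sketch using the video sitemap namespace that Google documents. The `build_video_sitemap` function, the entry fields and the URLs are hypothetical, and only a handful of the available video tags are included.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"


def build_video_sitemap(entries):
    """entries is a list of dicts, one per page that embeds a video."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("video", VIDEO_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # The <loc> is the NEW page URL where the video is embedded.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = entry["page_url"]
        video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
        for tag in ("thumbnail_loc", "title", "description", "content_loc"):
            ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = entry[tag]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical post-migration entry.
video_sitemap_xml = build_video_sitemap([{
    "page_url": "https://www.example.com/guides/setup/",
    "thumbnail_loc": "https://media.example.com/video/setup-thumb.jpg",
    "title": "How to set up your account",
    "description": "A short walkthrough of the account setup steps.",
    "content_loc": "https://media.example.com/video/setup.mp4",
}])
```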

 

Media CDN for video files

Another golden opportunity to take advantage of during a migration is the chance to discuss the implementation of a media CDN with the developers to deliver video files. There are many advantages to using a CDN infrastructure for large video files, such as improved loading speed, reduced server load and global reach.

As mentioned earlier in the Images section of the article, I recommend getting the media CDN verified in Google Search Console so you can monitor indexation and analytics data. The verification typically requires a DNS change so that the CDN is served from a subdomain of your property, such as media.example.com. It is best to speak to your developers about the correct implementation.

Video platforms

Many websites opt to use video hosting platforms like YouTube. But if a user clicks on your video in the SERPs, they will be directed to YouTube instead of your website, reducing your chances of a conversion. Getting a user to convert to a customer from a third-party platform is considerably more difficult than if they were already on your website. In addition, you will get less analytics data if the user is directed to another website.

A site migration or a complete change in tech stack presents a great chance to review where your videos are hosted and rethink your strategy to get more users on your actual site.

Keep in mind, however, that not all video hosting platforms are the same. Vimeo, for example, offers some additional benefits including the addition of relevant structured data with the video embedded on your website.

Video structured data

Similar to the image structured data we discussed earlier, adding structured data to your videos allows you to include significantly more information about a video than is practical with standard HTML alone.

The VideoObject schema, for example, allows you to tell search engines about the video’s actors, director, music, copyright information and more.

To increase the visibility of your video in the search results, add the Clip schema type that acts as a timestamp for the various sections of your video. The Clip schema allows you to qualify for Google’s ‘In this video’ rich results that help to increase CTR and get more video views.
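To show how the two types fit together, the Python sketch below builds a VideoObject with one Clip per chapter, each clip ending where the next one starts. The schema.org types and properties are real, but the `video_markup` helper, the sample values and the `?t=` timestamp parameter are assumptions; check how your own video player links to a specific moment.

```python
import json


def video_markup(name, description, thumbnail_url, upload_date,
                 page_url, duration_s, chapters):
    """chapters is a list of (title, start_second) tuples in playback order."""
    clips = []
    for i, (title, start) in enumerate(chapters):
        # A clip ends where the next chapter starts, or at the video's end.
        end = chapters[i + 1][1] if i + 1 < len(chapters) else duration_s
        clips.append({
            "@type": "Clip",
            "name": title,
            "startOffset": start,
            "endOffset": end,
            "url": f"{page_url}?t={start}",  # assumed deep-link format
        })
    return {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": thumbnail_url,
        "uploadDate": upload_date,
        "hasPart": clips,
    }


# Hypothetical video with three chapters and a total length of 180 seconds.
markup = video_markup(
    "How to set up your account",
    "A short walkthrough of the account setup steps.",
    "https://media.example.com/video/setup-thumb.jpg",
    "2023-10-03",
    "https://www.example.com/guides/setup/",
    180,
    [("Intro", 0), ("Create a profile", 45), ("Invite your team", 110)],
)
video_json_ld = json.dumps(markup, indent=2)
```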

Associated risks

In the early stages of the migration planning, try to establish how many videos are available on your site and what traffic levels (or views) they generate. You want to ensure that a page containing a video on your pre-migration website also contains the same video on the newly launched website.

The key objective is to benchmark the existing video indexation levels and traffic so you can tell how the migration is performing once the new website is live. If you spot any downward trends, you have the data to determine what has gone wrong and how to fix the issue.
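The page-level check described above can be sketched as a comparison between two crawl snapshots. In this Python example, the snapshots, the redirect map and the `compare_video_pages` function are all hypothetical; the idea is simply to follow each old URL to its new location and list any videos that did not make the move.

```python
def compare_video_pages(before, after, redirects):
    """before/after map a page URL to the set of video IDs found on it."""
    issues = {}
    for old_url, videos in before.items():
        # Follow the redirect map; unchanged URLs map to themselves.
        new_url = redirects.get(old_url, old_url)
        missing = videos - after.get(new_url, set())
        if missing:
            issues[new_url] = sorted(missing)
    return issues


# Hypothetical pre- and post-migration crawl snapshots.
pre_migration = {
    "/guides/setup": {"vid-101", "vid-102"},
    "/about": {"vid-200"},
}
post_migration = {
    "/help/setup/": {"vid-101"},
    "/about": {"vid-200"},
}
redirect_map = {"/guides/setup": "/help/setup/"}

missing_videos = compare_video_pages(pre_migration, post_migration, redirect_map)
```

An empty result means every page kept its videos; anything else is a list of pages to raise with the developers.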

In conclusion

While HTML pages are the main focus of most site migrations, carrying one out effectively also means considering a number of non-HTML assets.

There can be a lot of moving pieces in a migration and you need to be sure you take into account all parts that can have a negative impact on your website’s visibility, user experience, and accessibility. Take the time to plan out a migration methodically and meticulously because ignoring these assets can result in missed SEO opportunities and a less competitive online presence.

Be sure to capture a snapshot of your existing website and use the data to measure the before/after migration performance. Even though PDFs, images and videos often play a secondary role in migration, it is essential to consider their impact on organic traffic.

Ziggy Shtrosberg is a Senior Technical SEO Analyst at 26. He works with SMBs and globally renowned brands to optimize the technical foundation of their websites for optimal search engine performance. With many years of hands-on experience in-house and agency settings, Ziggy has honed his skills across various sectors, such as e-commerce, medical, manufacturing, motorsport, nonprofit and the property market.