Custom Search for SEO: Analyzing Search Intent with Google’s Programmable Search Engine and its Custom Search JSON API

January 18, 2022 - 22 min reading time - by Johanna Maier

What is the lifeblood of search engine optimization these days? Exactly, search intent. Or phrased in another way: What do users want to see when they search for your target keyword?

Chances are high that a look at the current top 10 rankings can get you close to the right answer – or at least close to the best answer that is out there in the open. Of course, there is a chance that nobody has found the best way to meet users’ search intent yet – and that your new content might outperform everybody. So keeping an open mind is always advisable.

Especially if you are a smaller player, in-depth topical research might reveal content opportunities that your competition has missed so far. But even ‘only’ extracting search intent from current rankings requires manual analysis that is seldom scalable to large keyword sets.

In this article, we go another way: How can we assess aspects of our users’ search intent on a larger scale? And which fundamental questions relevant to our SEO strategy can we answer in bulk by analyzing search results?

A practical example to kick things off

Let’s start with a use case to make this abstract topic a bit more tangible. Ever noticed that there exists a “keyword gender bias” (article only in German) in the organic search results of the fashion industry? In the top rankings for terms like “hoodies” you mainly see “male” landing pages with products for men, while for “coats” there are mostly pages with female products in the rankings.

So it likely does not make sense to create a “neutral” page www.example.com/c/coats/ with men’s and women’s products, since the general term “coats” and its search intent are already served by the female page www.example.com/c/women/coats/.

As SEOs we want this kind of information before creating our landing pages – and ideally without running every single keyword through Google search manually. Google’s Programmable Search Engine and its Custom Search JSON API make exactly this possible.

But let’s start from the top. This is what we will cover throughout the article:

First, we will explore what a Programmable Search Engine (PSE) is and how we can access its API. A little spoiler: It is a free way to get a proxy of Google’s search results for your keyword terms in a standardized, accessible and scalable way.

We will take a step-by-step look into the setup of your own custom search engine, generate the API key to access its data and set up a Google Sheets template to analyze the output.

Then, we will put these skills into practice on three SEO use cases. This will show you how the API can answer very practical SEO questions revolving around keyword and search intent analysis.

And to conclude, we will have a look at the state of search intent and how this article adds a new perspective.

What is Google’s Programmable Search Engine and its Custom Search JSON API?

First things first, let’s explore what Programmable Search and the API are and how both can help us to assess search intent.

It is very likely that you have already come across Google’s Programmable Search Engine on your journey through the World Wide Web. The reason is simple: It provides webmasters with a search engine functionality that they can easily integrate into their websites.

The powerful thing about it: You do not just integrate an iframe of the regular Google search. No, you can fully customize and tweak the search experience with the manifold settings of the Programmable Search Engine (short: PSE).

  • Do you need an internal search functionality for your website? Then restrict the PSE to only deliver URLs from your domain.
  • Do you have an affiliate site on a certain topic? Then provide your users with additional value by creating a topical search engine that only delivers results on the topical entities (provided by Google’s Knowledge Graph) that your website focuses on.

Or an image search that only delivers URLs hosted by public domain image providers? Or … You get the point: The options are endless. And you can even make money from the ads displayed on your customized Programmable Search Engine.

Now let’s move on to the Custom Search JSON API and how you can use it as soon as you have set up your own Programmable Search Engine. I already gave you my personal definition of the API above: The API is a free way to get a proxy of Google’s search results for your keyword terms in a standardized, accessible and scalable way. Let’s break it down:

Why is the Custom Search JSON API scalable?

The API takes the results of your custom search – from your Programmable Search Engine – and delivers them to you in JSON format. JSON (JavaScript Object Notation) is a widely used format to exchange data in a simple textual form between applications. So you can use the API’s data, e.g. URLs and title tags ranking for a keyword, in any of your own applications, like in Google Sheets as we will do later.
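To make the JSON part more tangible, here is a minimal Python sketch that parses a response and pulls out the title and URL of each result. The sample payload is shortened and invented for illustration. The field names `items`, `title` and `link` follow the API's documented response format, while the function name `extract_results` is my own choice.

```python
import json

# Shortened, made-up sample of an API response (real responses carry many more fields).
SAMPLE_RESPONSE = """
{
  "items": [
    {"title": "Men's Hoodies | Example Shop", "link": "https://www.example.com/c/men/hoodies/"},
    {"title": "Women's Coats | Example Shop", "link": "https://www.example.com/c/women/coats/"}
  ]
}
"""


def extract_results(raw_json):
    """Return a list of (position, title, url) tuples from a response body."""
    data = json.loads(raw_json)
    # "items" is absent when a query returns no results, so fall back to an empty list.
    items = data.get("items", [])
    return [(pos, item["title"], item["link"]) for pos, item in enumerate(items, start=1)]


for position, title, url in extract_results(SAMPLE_RESPONSE):
    print(position, title, url)
```

The same structure is what a spreadsheet script has to walk through before it can write one row per ranking URL.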

Is the Custom Search JSON API really free?

Yes, with a certain limitation. You can submit 100 free requests per day per Google account or pay a fee ($5 per 1,000 queries) to increase this to up to 10,000 daily requests (more info).

In my experience, with a little planning, the ~3,000 free monthly requests (100 per day) should suffice for many SEO use cases. And if needed, you could always split tasks across several accounts of your team members.
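For planning purposes, the quota arithmetic can be sketched in a few lines of Python. This is a rough estimate under two assumptions of mine: each keyword costs exactly one request (one page of results), and on a paid setup only the requests beyond the free 100 are billed. Check the current pricing page before relying on the numbers.

```python
import math

FREE_DAILY_REQUESTS = 100     # free quota per day, as stated above
PRICE_PER_1000_QUERIES = 5.0  # US dollars for additional queries, as stated above


def days_on_free_plan(keyword_count):
    """Days needed if each keyword costs one request and only the free quota is used."""
    return math.ceil(keyword_count / FREE_DAILY_REQUESTS)


def cost_for_single_day(keyword_count):
    """Rough cost in dollars for running all keywords on one day.

    Assumption: only requests beyond the free daily quota are billed.
    """
    billable = max(0, keyword_count - FREE_DAILY_REQUESTS)
    return billable * PRICE_PER_1000_QUERIES / 1000


print(days_on_free_plan(2500))    # 25
print(cost_for_single_day(2500))  # 12.0
```

So a list of 2,500 keywords either takes 25 days on the free plan or roughly $12 if you need the data in one go.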

Why is Programmable Search only a proxy of real Google search results?

A powerful feature of the Programmable Search Engine: You can choose not to restrict your custom search at all and search the entire web. This basically gets you a proxy of the results that a real Google search would deliver – but only the raw organic results without fancy integrations like rich results or featured snippets and with differences in the exact positions of the ranking URLs.

Here another question might pop into your mind: Why not just pull the same data directly from the real Google search results?

What are the advantages of working with the Programmable Search Engine?

Two things: Firstly, the Custom Search API is free and, secondly, there is no gray area involved. Officially, Google does not allow any scraping of its search results (see info on automated queries). Since many did it anyway, Google restricts automated access via CAPTCHAs.

If you have ever tried to crawl Google’s search results with a web crawling tool, you will have noticed that you run into a status code 403 or 302. This is usually a CAPTCHA verifying that you are human or a block due to unusual traffic (see Figure 1).


Figure 1: Output from Crawling Google Search Result URLs with a Web Crawling Tool and Block of Requests

In contrast, the results of the Programmable Search Engine are freely accessible via the Custom Search JSON API. Still, as so often in life, there is a tradeoff here too: The search results from the Programmable Search Engine are only a proxy, so not a 100% representation of the real Google search results – even with matching location and language settings.

Which raises the next question: Is the output good enough to inform our SEO decision-making?

How do the results of the Programmable Search Engine compare to real Google search results?

Is the output good enough to inform our SEO decision-making? My short answer: Yes it is.

Let’s take a look at our previous examples of “hoodies” and “coats”. Below, I listed the top 10 rankings that I manually extracted from the real UK Google search (localized via Valentin App) vs. a Programmable Search Engine set to the location UK and the language English.


Figure 2: Results Comparison – Real Google Search vs. Programmable Search Engine


Figure 3: Results Comparison Overview with Matching Colors for Same URL on the First Page

In Figure 3, I took the first page results and marked all overlapping URLs between both search engines in the same color. The URLs in white are the ones that do not have a match, so only appeared in one of the search engines.

The bottom line: There is a strong overlap between both search engines, although the precise ranking positions differ, as well as the look and feel (see Figure 2). On top, all the fancy SERP integrations like rich results, featured snippets, title rewrites or indented snippets are missing in the results of the Programmable Search Engine.
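If you want to put a number on this overlap for your own keywords, a small Python sketch like the following can help. The URLs are placeholders and not the rankings from the figures. The helper names `normalize` and `overlap_share` are my own, and lowercasing the whole URL is a simplification, since URL paths can technically be case-sensitive.

```python
def normalize(url):
    """Drop protocol, 'www.' and trailing slash so that equivalent URLs compare equal."""
    url = url.lower().strip()
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    if url.startswith("www."):
        url = url[4:]
    return url.rstrip("/")


def overlap_share(google_urls, pse_urls):
    """Share of the Google URLs that also appear in the PSE results (0.0 to 1.0)."""
    google_set = {normalize(u) for u in google_urls}
    pse_set = {normalize(u) for u in pse_urls}
    if not google_set:
        return 0.0
    return len(google_set & pse_set) / len(google_set)


# Placeholder result lists for one keyword; ranking positions are deliberately ignored.
google_results = [
    "https://www.example.com/hoodies/",
    "https://www.example.org/mens-hoodies",
    "https://shop.example.net/hoodies/",
    "https://www.example.com/sale/hoodies/",
]
pse_results = [
    "https://shop.example.net/hoodies/",
    "http://example.com/hoodies",
    "https://www.example.org/mens-hoodies/",
    "https://www.example.com/brands/hoodies/",
]

print(overlap_share(google_results, pse_results))  # 0.75
```

The measure only looks at which URLs appear on the first page, which mirrors the color matching in Figure 3.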

So far, this has been accurate enough for the use cases that I worked on. But of course, you have to make your own assessment depending on your project.

The results of the Programmable Search Engine are not an exact representation of Google’s real search results. But they still are a good proxy and help us to analyze a project-specific search intent which goes beyond classic categorisations like navigational, transactional and informational keywords.

But now let’s get started with the fun part and set the whole thing up.

Setup Steps: Custom Search and Google Sheets Template

We start with setting up your own Programmable Search Engine and accessing its API. Then we continue to the Google Sheets template to work with the API’s output.

Programmable Search Engine: How to set it up and access its API?

Our first goal is to retrieve two credentials: (1) the ID number of your Programmable Search Engine and (2) the API key to access the JSON output of the Custom Search API.

You can also find all the setup steps that we go through in the intro section of the documentation of the Custom Search JSON API.

 

Setup Step 1: Log into your Google Mail Account.

 

Setup Step 2: Fill in the initial setup page of your Programmable Search Engine.

  • Go to https://programmablesearchengine.google.com/cse/create/new. This leads you to the setup page (see Figure 4) for your Programmable Search Engine (PSE).
  • Fill in a random domain in the field “Sites to search”. For some reason, this is mandatory, even if we intend to search the entire web with our PSE. We can remove this input later on.
  • Set the language that your PSE should be localized in. You can change this later.
  • Now you have already created your own PSE. Try it out on its “Public URL” or go to the “Control Panel” for the next settings (see Figure 5).


Figure 4: Setup Step 2 – Programmable Search Engine Setup Page (Old Layout)


Figure 5: Setup Step 2 – Programmable Search Engine Setup Completed (Old Layout)

 

Setup Step 3: Adjust the settings in the Control Panel of the Programmable Search Engine.

  1. This is where the magic happens. As explained in the intro to the PSE there are countless use cases and settings to choose from.
  2. For our proxy of Google search results, we only need to change four basic settings (see Figure 7):
    • Region: Google’s search is always localized. To create an adequate proxy, you also need to select a region for the results of your PSE.
    • Language: The same holds true for the language settings. Change the language setting if it differs from your initial choice in the setup page.
    • Sites to search: To be safe, delete the restriction to the initially entered domain.
    • Search the entire web: Switch it to ON.
  3. All done: Copy your “Search engine ID” and note it down for later (see Figure 6).


Figure 6: Setup Step 3 – Programmable Search Engine – Search Engine ID (Old Layout)

Figure 7: Setup Step 3 – Programmable Search Engine – Control Panel Settings (Old Layout)

Sidenote: New Control Panel Layout of the Programmable Search Engine

When creating a new Programmable Search Engine, you will notice the prompt to “Preview the new Control Panel!”. I gave it a spin: Currently, only the first setup page is different in the new panel (see Figure 8).

When you then want to customize your PSE, it takes you back to the same settings page of the already familiar control panel (see Figure 9). One upside of the new layout: You can directly select the option to “Search the entire web” on the first page and do not have to enter a domain restriction first. Either way, be ready for an update of the look and feel of this tool very soon.


Figure 8: Setup Step 2 – Programmable Search Engine Setup Page (New Layout)


Figure 9: Setup Step 2 – Programmable Search Engine Setup Completed (New Layout)

Setup Step 4: Get an API key for the Custom Search JSON API.

  • To access the JSON output of your newly created Programmable Search Engine, you need an API key. This way, Google can also check how often you query the API.
  • Go to the custom search documentation and click on “Get a Key.” Enter a name for your project and accept the terms and conditions (see Figures 10-11).
  • All done: Copy your “API KEY” and note it down for later (see Figure 12).


Figure 10: Setup Step 4 – Custom Search JSON API – Documentation with “Get a Key”


Figure 11: Setup Step 4 – Custom Search JSON API – Enable API


Figure 12: Setup Step 4 – Custom Search JSON API – Copy API Key

Sidenote: What happens in the background when you click on “Get a Key”?
Your Google Account is connected to the Google Cloud Platform, where the API key is generated for the Custom Search API in a newly created project.

You can achieve the same by signing up and logging into Google Cloud. But clicking the button is definitely faster and more convenient. Either way, you should find your API key again here in the Google Cloud Console: https://console.cloud.google.com/apis/credentials (Make sure to select the right project in the top left bar.)
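With both credentials at hand, a request to the API is just a URL with a few query parameters. The following Python sketch only builds that URL and does not send anything. The parameters `key`, `cx`, `q`, `gl`, `lr` and `num` come from the Custom Search JSON API reference. The environment variable names `CSE_API_KEY` and `CSE_ID`, as well as the UK and English defaults, are my own assumptions for this example.

```python
import os
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_request_url(keyword, api_key, search_engine_id, country="uk", language="lang_en"):
    """Assemble the request URL for one keyword (top 10 results)."""
    params = {
        "key": api_key,          # API key from Setup Step 4
        "cx": search_engine_id,  # search engine ID from Setup Step 3
        "q": keyword,            # the keyword to look up
        "gl": country,           # country code for geolocation (assumed: UK)
        "lr": language,          # language restriction (assumed: English)
        "num": 10,               # number of results per request
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


# Hypothetical environment variables keep the credentials out of the code itself.
url = build_request_url(
    "hoodies",
    api_key=os.environ.get("CSE_API_KEY", "YOUR_API_KEY"),
    search_engine_id=os.environ.get("CSE_ID", "YOUR_SEARCH_ENGINE_ID"),
)
print(url)
```

Opening the printed URL in a browser, with real credentials filled in, should show the raw JSON output and count as one request against your daily quota.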

Google Sheet Template: How to analyze search intent with help of the API?

Full disclosure up front: Many credits go to the folks from www.pemavor.com and their great ranking checker script. Their Google Apps Script connects to the Custom Search JSON API, extracts the top 10 search results from the Programmable Search Engine for each keyword on your list and even has a built-in scheduler that allows you to schedule your free 100 daily requests for several days in advance.

This tool helped us solve a very practical problem that we had in our day-to-day SEO tasks: how to assess the previously introduced “keyword gender bias” on a large scale. That was my starting point to dive into the topic and now I want to share some use cases where I applied the script.

But first, how can you set up the Google Sheet including its script yourself? If you have completed the previous setup steps, you already have all the needed ingredients to get the script running: your Google Mail account, the search engine ID of your Programmable Search Engine and your API key to access the Custom Search JSON API.

If you did not write down the ID and the API key before, you can retrieve them here:

  • Search Engine ID: https://programmablesearchengine.google.com/controlpanel/all
    Again, make sure that your PSE has the right settings: “Search the entire web” plus your desired region and language.
  • Custom Search JSON API Key: https://console.cloud.google.com/apis/credentials
    • Go to the credentials page of the Google Cloud. Here you should find an overview of all the keys that you created for any Google Cloud product.
    • Make sure to select the right project (top left navigation bar) that you used to create the API key for the Custom Search JSON API.
    • If this was your first API key, the project should be selected by default when you click the link.

Now, with the credentials ready, make a copy of my adapted version of the Google Sheet:
Custom Search for SEO | pemavor.com | adapted by Johanna Maier

(If you want to get the original Google Sheet without my adaptations, head over to the original blog post)

Complete the following steps to finish the setup:

 

Setup Step 5: Make your own copy of the Google Sheet and authorize the script.
Custom Search for SEO | pemavor.com | adapted by Johanna Maier


Figure 13: Setup Step 5 – Google Sheet Custom Search for SEO – Script Initialization


Figure 14: Setup Step 5 – Google Sheet Custom Search for SEO – Script Authorization

 

Setup Step 6: Enter your credentials and keyword list.

  • Tab “Settings”: Input your Search Engine ID & API Key.
    Set your preferred language and location.
    The daily limit can remain at 100 since this is the maximum for the free plan.
  • Tab “Keywords”: In the first column, enter your keyword list for which you want to fetch the SERPs of the Programmable Search Engine.


Figure 15: Setup Step 6 – Google Sheet Custom Search for SEO – Keyword List

 

Setup Step 7: Run the script to retrieve results of the Programmable Search Engine.

  • Menu Option “Check Rankings”: You find it in the menu next to “Help” (see Figure 16).
    Initialize the script and give it permission to run in your account.
    Depending on your keyword amount, choose “Run Manually” (ad hoc – up to 100 keywords) or “Enable Scheduler” (100 keywords per day).
  • Tab “SERP Rankings”: After starting the script, you can observe how the API output gets pulled in here one after another (see Figure 17).
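The idea behind the scheduler can be sketched in Python as follows. This is not the code of the Google Apps Script, only an illustration of how a keyword list can be split into daily batches that respect the free limit. The function name `daily_batches` and the placeholder keywords are made up.

```python
def daily_batches(keywords, daily_limit=100):
    """Split a keyword list into consecutive chunks of at most `daily_limit` keywords."""
    return [keywords[i:i + daily_limit] for i in range(0, len(keywords), daily_limit)]


keywords = [f"keyword {n}" for n in range(1, 251)]  # 250 placeholder keywords

for day, batch in enumerate(daily_batches(keywords), start=1):
    print(f"Day {day}: {len(batch)} keywords, starting with '{batch[0]}'")
```

With 250 keywords this yields three batches: two full days of 100 keywords and a third day with the remaining 50.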


Figure 16: Setup Step 7 – Google Sheet Custom Search for SEO


Figure 17: Demo of Google Sheet Script

Custom Search API in Action: SEO Use Cases

So far, you have seen the basic functionality of the script as published by www.pemavor.com. In the last three tabs (“Gender”, “Page Type”, “Domain”), I added demonstrations of my personal use cases on how to work with the API data in practice.

Example Fashion Industry | Gender

Again, let’s take the fashion industry as the first example. If you are active in this field, you certainly will have noticed that there exists a certain keyword gender bias when it comes to search intent, for instance in Google’s UK search results.

  • If you search for “hoodies” you will find landing pages with male products in the top 10.
  • For a term like “coats” it is mainly landing pages with female products.
  • For “bathrobes” you encounter more general pages with products for both genders.

What that means for us SEOs: If you want to rank for those terms, you are more likely to achieve your desired result by setting up landing pages that follow the search intent patterns.

If you have been doing SEO for a specific project for a while, you probably have a good gut feeling and know which topic should be targeted with which landing page. But it always helps to verify our intuitions. How? You run it through Google.

  • The old school way: You have to check and assess the “keyword gender” of every single fashion keyword manually.
  • With the API: You can automate that. Get the search titles and URLs from the Custom Search JSON API and match them with terms like “women” | “female” | “men” | “male”.

In the tab “Gender” of the Google Sheet, you find the idea in action. To get the same output as in the GIF, you first have to run the script with the “fashion keywords” in the tab “Keywords”.


Figure 18: Demo Google Sheet Script – Tab “Gender”

How does it work? A filter formula pulls the URLs, snippet title and keywords from the “SERP Rankings” tab. RegEx formulas check if there are matches with terms like “women” | “female” | “men” | “male” | “for her” | “for him”. The resulting “gender count” is summed up on a keyword basis via a pivot table.
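For readers who prefer code over spreadsheet formulas, the same logic can be sketched in Python. The rows below are invented examples, not real rankings, and the names `FEMALE`, `MALE` and `gender_counts` are my own. The term lists mirror the ones mentioned above and will likely need extending for your market (e.g. “ladies” or “womens” in URL slugs).

```python
import re
from collections import Counter

# Word boundaries (\b) matter here: without them, "men" would also match inside
# "women" and "male" inside "female".
FEMALE = re.compile(r"\b(?:women|female|for her)\b", re.IGNORECASE)
MALE = re.compile(r"\b(?:men|male|for him)\b", re.IGNORECASE)


def gender_counts(rows):
    """rows: iterable of (keyword, title, url). Returns {keyword: Counter}."""
    counts = {}
    for keyword, title, url in rows:
        text = f"{title} {url}"
        counter = counts.setdefault(keyword, Counter())
        if FEMALE.search(text):
            counter["female"] += 1
        if MALE.search(text):
            counter["male"] += 1
    return counts


# Invented sample rows in the shape of the "SERP Rankings" tab.
rows = [
    ("hoodies", "Men's Hoodies | Shop A", "https://www.example.com/c/men/hoodies/"),
    ("hoodies", "Hoodies for Him | Shop B", "https://www.example.org/hoodies/"),
    ("coats", "Women's Coats & Jackets | Shop A", "https://www.example.com/c/women/coats/"),
    ("coats", "Coats | Shop C", "https://www.example.net/coats/"),
]

for keyword, counter in gender_counts(rows).items():
    print(keyword, "female:", counter["female"], "male:", counter["male"])
```

A result that matches both patterns counts for both genders, which is one simple way to spot the more general pages.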

Sidenote: Another (SEO) Fashion Advice
Re-check your keyword gender if you combine your category keywords with other terms, like for example with characteristics from your available filters. Take the term “socks” where you should find SERPs filled with landing pages for men’s socks. Once you add the color “pink” though, the SERPs for “pink socks” are geared towards women.

Example Beauty Industry | Page Types

Now we switch to another industry and go from fashion to perfumes, which changes the search intent landscape instantly. Think about it: Your average coat has a generic name and a random article number that nobody ever would search for. In other areas – like perfumes – product names have real search potential.

With the featured snippet for “most famous perfumes” as our source of inspiration, we find perfume names with substantial search volumes in the UK:

  • Chanel N°5 (“chanel no 5” – 33,100 SV)
  • Dolce&Gabbana Light Blue (“light blue dolce and gabbana” – 14,800 SV)
  • Opium Yves Saint Laurent (“opium perfume” – 9,900 SV)
  • Calvin Klein CK ONE (“ck one” – 9,900 SV)
  • Acqua di Parma Colonia (“acqua di parma colonia” – 4,400 SV)

But how can we be certain that we should use our product URLs to target each of these terms? After all, Chanel also has a brandline with the same name Chanel N°5 and Acqua di Parma sells various colognes and not just one. Optimizing both – product URL and brandline URL – will likely lead to cannibalization.

Again, a look into the SERPs helps us make the decision: Intuitively, a selection of products on a brand(line) page might lead to better conversions. But if search results are dominated by product URLs, we should optimize a product URL to compete for the rankings.

What is necessary to do this assessment on a large scale? Identify the patterns of your competitors’ page types, add them to a RegEx and let the script do the rest of the work for you.

Admittedly, this initial research covering the URL patterns of the page types of your competitors – and coming up with the corresponding RegEx patterns – will take a bit of time. But it is worth the effort and also a good insight to have up your sleeve for other analyses.
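As an illustration of what such a page type check could look like, here is a short Python sketch. The URL patterns are purely hypothetical. Every shop structures its URLs differently, so you have to replace them with the patterns you found in your own competitor research. The URLs are placeholders and not real rankings.

```python
import re
from collections import Counter

# Hypothetical URL patterns; the first matching pattern decides the page type.
PAGE_TYPE_PATTERNS = [
    ("product", re.compile(r"/p/|/product/")),
    ("brand", re.compile(r"/brands?/")),
    ("category", re.compile(r"/c/|/category/")),
]


def classify_url(url):
    """Return the page type of a URL, or 'other' if no pattern matches."""
    for page_type, pattern in PAGE_TYPE_PATTERNS:
        if pattern.search(url):
            return page_type
    return "other"


# Placeholder result URLs for one keyword.
serp_urls = [
    "https://www.example.com/p/chanel-no-5-eau-de-parfum/",
    "https://www.example.org/brands/chanel/no-5/",
    "https://www.example.net/product/chanel-no5-100ml/",
    "https://www.example.com/magazine/history-of-chanel-no-5/",
]

page_types = Counter(classify_url(url) for url in serp_urls)
print(page_types.most_common(1)[0])  # ('product', 2)
```

If product URLs dominate like in this made-up example, the product URL would be the page to optimize for this keyword.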

In the tab “Page Types” of the Google Sheet, you find some examples from the perfumes’ industry. To get the same output as in the GIF, you first have to run the script with the “perfume keywords” in the tab “Keywords”:


Figure 19: Demo Google Sheet Script – Tab “Page Types”

Example Niche Products | Competitor Domains

Last but not least, a final use case. To shake things up a little, let’s leave the world of perfumes and fashion. Imagine you are an SEO at a high-end manufacturer of specialized or niche products, for example design furniture or road racing bikes.

In such cases, you are well advised to carefully pick the keywords that you try to target before you invest time and effort. Ranking for a generic keyword like “bikes” might be out of reach – also because users searching for “women’s bikes” might not look to conclude their customer journey with a £5,000 road racing bike.

Again, the same question: How do you weed out suitable keywords from the large pool of keyword potentials – without checking the search intent manually every single time?

A first step can be to come up with two lists of domains:

  • Domain List 1: Your competitors/partners and all domains that likely serve a search intent similar to the one that you want to rank for.
  • Domain List 2: Everybody else that you cannot or do not want to compete with.

Looking into the UK SERPs for “road racing bikes” vs. “bikes”: If you are a high-end bike shop like www.ribblecycles.co.uk you should put www.decathlon.co.uk on domain list number two. Or, if you are a shop for luxury furniture, you should put all generic furniture shops like www.habitat.co.uk on the second list. Often this can also be used to separate B2B keywords from B2C keywords or even product offers from service offers like “home office furniture” vs. “office interior design”.

If Google chose to rank only domains from Domain List 2 that are not similar to your product offer or business model, then this keyword is not worth your time and effort – or at least not your highest priority.
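This check can be expressed as a simple share per keyword. The Python sketch below uses placeholder domains and invented result lists, so none of it reflects real rankings. The names `SIMILAR_DOMAINS`, `OUT_OF_REACH_DOMAINS` and `similar_share` are my own.

```python
from urllib.parse import urlparse

# Placeholder domains standing in for the two lists described above.
SIMILAR_DOMAINS = {"specialist-bikes.example", "roadbike-pros.example"}  # Domain List 1
OUT_OF_REACH_DOMAINS = {"generic-sports.example"}                        # Domain List 2


def domain_of(url):
    """Extract the host name without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def similar_share(urls):
    """Share of Domain List 1 among all results found on either list, or None if no result is on a list."""
    domains = [domain_of(u) for u in urls]
    list1 = sum(d in SIMILAR_DOMAINS for d in domains)
    list2 = sum(d in OUT_OF_REACH_DOMAINS for d in domains)
    classified = list1 + list2
    return list1 / classified if classified else None


# Invented result lists per keyword.
serps = {
    "road racing bikes": [
        "https://www.specialist-bikes.example/road/",
        "https://roadbike-pros.example/racing-bikes/",
        "https://www.generic-sports.example/bikes/road/",
    ],
    "bikes": [
        "https://www.generic-sports.example/bikes/",
        "https://www.generic-sports.example/bikes/sale/",
        "https://www.unlisted-shop.example/bikes/",
    ],
}

for keyword, urls in serps.items():
    print(keyword, similar_share(urls))
```

A share close to zero signals a keyword where only Domain List 2 ranks, so it can move down the priority list. Domains on neither list are ignored in this sketch, which is a design choice you may want to revisit.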

In the tab “Domains” of the Google Sheet, I put together two domain lists of bike shops to show how this would work in practice for the bike industry. To get the same output as in the GIF, you first have to run the script with the “bike keywords” in the tab “Keywords”:


Figure 20: Demo Google Sheet Script – Tab “Domains”

Please note that I did not go all too deep into the bike industry and some domain groupings might not make sense to a real industry expert. But the example should make the principle clear enough: Make an analysis of your competitive environment, figure out the domains that serve a search intent suitable to what you have to offer and then evaluate your keyword potentials based on this.

Some final thoughts: What is search intent?

To conclude this article, I want to take a step back and look at the big picture: What is search intent? There are many definitions and worthwhile articles that take a deep dive into this question.

The SEO toolbox Sistrix distinguishes four main categories of search intent, which are very close to the classic distinction between navigational, informational and transactional queries.

  • Know: The user is looking for passive informational content on a specific topic.
  • Do: The user actively wants to do something, be it to buy, download or install a product.
  • Website: The user wants to navigate to a specific website.
  • Visit: The user wants to visit a local business.

But such classifications only take you so far when it comes to deciding the layout and content of your landing page. Olaf Kopp went beyond this in his in-depth article on how to map your content decisions according to micro search intents (article only in German).

One example: He splits the informational search intent into micro intents like “entertainment”, “definition” or “enablement”, depending on whether the content should entertain users, give them a definition of a previously unknown concept or teach them how to do something.

Today, we added another dimension of search intent analysis: your project-specific search intent. No matter the niche of your project, it is likely that you have noticed some recurring SERP patterns. You can leverage these personal industry insights and apply them to larger keyword sets with the help of the Programmable Search Engine and its Custom Search JSON API.

Google strives to provide users with the best available search results – and spends a fortune to do so. For us SEOs this is an invaluable data source that we should not leave untapped. Especially since keywords’ search intent is also changing all the time, be it due to seasonalities or because Google is constantly refining its natural language processing capabilities. With the methods that we introduced, you can approach search intent analysis programmatically.

On a final note, as usual, one thing holds true: You learn the most when you have a direct practical application. The keyword gender bias in the fashion industry was the use case that made me explore the API in the first place. So ask yourself: Which of your SEO problems can you solve with the Programmable Search Engine and its Custom Search JSON API?

___

PS: If you need assistance with setting up the Google Sheet or get stuck with a RegEx for your personal use case: Feel free to reach out! I am not a RegEx expert myself but can always try and help to figure it out.

Johanna Maier is a Junior SEO Consultant at the international digital agency Dept. Her fascination: the interdisciplinary nature of search engine optimization and the interplay of technology, creativity and data.