
Keyword clustering using Python and the SERP API

June 3, 2025 - 10 min reading time - by Morteza Najafi

A lot of recent publications focus on AI, AI Overviews, and how to make sure your content is found in this newer iteration of search. Still, the basics remain valuable: in SEO, success often hinges on how effectively you organize and target keywords.

Keyword clustering—the process of grouping similar search terms based on user intent and search engine results—is somewhat overlooked these days, but it’s a step in your on-page SEO strategy that can dramatically improve your site’s performance.

In this article, I will walk you through the importance of keyword clustering, demonstrate how to perform this task both manually and through automation, and provide you with practical tools to implement these techniques immediately.

Whether you’re managing a small website or a large enterprise platform, you’ll see how proper keyword organization can eliminate content cannibalization, streamline your content creation, and ultimately drive more targeted traffic to your site.

Why should you categorize keywords?

After conducting keyword research, you typically end up with a list of dozens—or even hundreds—of keywords, each with its own search volume, user intent, and level of competition.

Simply having this list isn’t enough, though. The next step is to categorize these keywords strategically. Why does this step matter so much?

Consider the following two keywords:

  • “Printer for sale”
  • “Buy printer”

 

At first glance, these phrases might seem different enough to warrant separate landing pages. However, both reflect the same underlying user intent: purchasing a printer.

In reality, users searching for either term are likely expecting to land on a product category or e-commerce page. If you target these keywords with separate pages, you risk keyword cannibalization—where multiple pages from your own website compete against each other for the same or very similar search terms, ultimately hurting your rankings.

Proper keyword categorization allows you to:

  • Avoid duplicate or overlapping content
  • Define a clear goal (e.g., landing page or blog post) for each group of keywords
  • Improve your website’s structure and information architecture
  • Design more precise SEO and advertising campaigns
  • Increase conversion rates by more effectively addressing user intent

 

In short, categorizing keywords doesn’t just bring clarity to your SEO and content strategy; it directly affects your ability to attract the right audience and drive high-quality, targeted traffic to your website.


How to manually categorize keywords

When it comes to keyword categorization—or keyword clustering—the goal is to group keywords that share similar search intent and return similar results in Google’s search engine results pages (SERPs). But how can we manually identify these similarities?

One of the traditional, yet effective, methods is to analyze Google SERPs directly.

Here’s how it works:

  1. Start by searching each of your target keywords individually on Google.
  2. Carefully examine the URLs that appear on the first page of results for each keyword.
  3. If at least 5 or 6 of the first-page URLs are the same for two different keywords (that is, the same pages rank for both), Google likely considers those keywords similar and expects them to be targeted on the same page.
  4. In such cases, those keywords should be grouped together and targeted within the same page, whether that’s a blog post, a product page, a category page, or another type of content. Creating separate pages for each would be unnecessary and could lead to keyword cannibalization.

 

By analyzing SERP overlap, you can decide which keywords belong together and which clusters deserve their own dedicated pages. This process is manual and can be time-consuming for large keyword lists, but it remains one of the more accurate methods and is trusted by professional SEOs.
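
The overlap check described above can be sketched in a few lines of Python. This is a minimal illustration, not the script used later in this article: the function names, the `min_shared` parameter, and the placeholder URLs are all assumptions made for the example.

```python
def shared_urls(urls_a, urls_b):
    """Return the set of URLs that appear in both result lists."""
    return set(urls_a) & set(urls_b)


def belong_together(urls_a, urls_b, min_shared=5):
    """True when two keywords share at least `min_shared` first-page URLs."""
    return len(shared_urls(urls_a, urls_b)) >= min_shared


# Hypothetical first-page results for two keywords (placeholder URLs).
buy_printer = [f"https://example.com/page-{i}" for i in range(1, 11)]
printer_for_sale = [f"https://example.com/page-{i}" for i in range(4, 14)]

print(len(shared_urls(buy_printer, printer_for_sale)))  # 7
print(belong_together(buy_printer, printer_for_sale))   # True
```

Comparing sets ignores ranking position, which matches the manual rule: what matters is whether the same pages show up, not the order in which they appear.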

Now that we’ve explored the traditional manual method of keyword clustering, it’s time to scale the process and automate it.

How to categorize keywords using an API

When dealing with hundreds or thousands of keywords, manually reviewing Google search results becomes virtually impossible. That’s where a Google SERP API comes in: it can automate and streamline the process.

A SERP API is essentially a service that gives you programmatic access to Google search results data.

With a Python script built on this kind of API, you can automatically:

  • Retrieve Google search results for each keyword
  • Extract the top-ranking URLs (typically from the first page of Google)
  • Compare the overlap and similarity between keyword results
  • Automatically and intelligently group keywords into similar clusters

 

Let’s walk through this automated clustering process step-by-step.

Step 1: Data collection

The first step is preparing your keyword dataset as an Excel file. The file should include two essential columns:

  • Keyword: The search term or phrase
  • Volume: The monthly search volume for each keyword

 

This data is typically gathered from keyword research tools like Google Keyword Planner, Ahrefs, Semrush, or similar platforms.

As shown in the example below, these two columns are the minimum requirement for the script to function correctly.


Figure 1: Example of a properly formatted keyword dataset with required columns

Important note: Your file may contain additional columns; they won’t cause any issues. However, the two required columns must be named exactly Keyword and Volume, with the first letter capitalized, not lowercase or mixed case.
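
If you want to check a file's header before running anything, a small case-sensitive check like the one below can help. It is only a sketch: the helper names and the sample header are hypothetical, and it works on a plain list of column names (with pandas, that list would be `list(df.columns)` after loading the file with `pd.read_excel`).

```python
REQUIRED_COLUMNS = ("Keyword", "Volume")


def missing_columns(header):
    """Return the required column names that are absent from `header`.

    The comparison is case-sensitive on purpose: "keyword" does not
    satisfy "Keyword".
    """
    present = set(header)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def near_misses(header):
    """Map each missing column to a header that differs only by case or spacing."""
    lookup = {col.strip().lower(): col for col in header}
    return {
        name: lookup[name.lower()]
        for name in missing_columns(header)
        if name.lower() in lookup
    }


header = ["keyword", "Volume", "CPC"]  # hypothetical header row
print(missing_columns(header))  # ['Keyword']
print(near_misses(header))      # {'Keyword': 'keyword'}
```

The second helper is there to make the most likely mistake easy to spot: a column that exists but is written in the wrong case.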

Step 2: Register and get your API key

To automate the retrieval of Google search results, you’ll need access to a SERP API.

In this guide, we’ll be using Serper.dev, a reliable service that allows bulk access to Google SERP data.

Getting started is simple:

  1. Sign up at Serper.dev.
  2. Obtain your API key—either a free version or a paid one. The free plan typically supports up to 2,500 test queries.

 

Of course, you’re free to use alternative APIs such as Google Custom Search API, ValueSerp, or other similar services. The output format is generally similar and the main differences lie in pricing, request limits, and data accuracy.

We’ve chosen Serper.dev for this tutorial because of its simplicity and ease of access.
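
To give a sense of what a single request might look like, here is a minimal sketch using only the Python standard library. The endpoint, the `X-API-KEY` header, the `q`/`gl`/`hl`/`num` parameters, and the `organic`/`link` response fields reflect my understanding of Serper's API and should be treated as assumptions; confirm them in the service's playground before relying on them. The function names are hypothetical and are not taken from the Colab notebook.

```python
import json
import urllib.request

SERPER_URL = "https://google.serper.dev/search"  # assumed endpoint


def build_request(keyword, api_key, country="us", language="en", num=10):
    """Build (but do not send) a POST request for one keyword."""
    payload = {"q": keyword, "gl": country, "hl": language, "num": num}
    return urllib.request.Request(
        SERPER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-KEY": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def extract_urls(response_json, limit=10):
    """Pull the top organic result URLs out of a parsed JSON response."""
    organic = response_json.get("organic", [])
    return [item["link"] for item in organic if "link" in item][:limit]


def fetch_urls(keyword, api_key, **options):
    """Send the request and return the top-ranking URLs (uses one API query)."""
    request = build_request(keyword, api_key, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_urls(json.load(response))
```

Calling `fetch_urls("buy printer", "YOUR_API_KEY")` would consume one query from your plan. Keeping request building and response parsing in separate functions makes it easier to swap in another provider, since usually only those two pieces change.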


Step 3: Running the Python script

At this stage, you’ll use a prebuilt Python script to automate the keyword clustering process. The easiest way to run this is via this Google Colab notebook.

Google Colab is a free, cloud-based platform that lets you run Python code without installing anything on your computer.

If you’ve never used Colab before:

  1. You’ll need a Google account to access it.
  2. The document is saved automatically to your Google Drive.
  3. Each code cell can be run by clicking the “play” button next to it or pressing Shift+Enter.
  4. Code cells should be run in sequence from top to bottom.

 

Don’t worry if you’re not familiar with Python—our script is designed to work without requiring you to write or modify code yourself.

The script will:

  • Read keywords from your Excel file
  • Send a request to the API for each keyword
  • Retrieve and store Google SERP data
  • Analyze overlapping URLs between keyword results
  • Automatically cluster related keywords

 

To get started, you’ll need to follow these steps:

1. Make a copy of the Colab file to your own Google Drive by using the File menu and Save a copy in Drive.

2. In the first code cell, allow the required libraries to install and import.

3. Upload your Excel file (containing the Keyword and Volume columns) to the Colab environment.

4. Paste your Serper API key into the designated section of the script.

You can also customize key parameters such as your email, target location, country, and language of the search queries. To better understand the available options, explore the “playground” or demo section provided by the API service you’re using.


5. Run the next cell to initiate API calls and fetch the SERP data.


6. Run the clustering functions, which analyze SERP URL overlap and group keywords accordingly.


7. This is the section where you define the clustering threshold (common_num) and run the clustering function to group keywords based on shared URLs.

This step is critical: it determines how strict or lenient the clustering process will be, and it directly affects the final output.

Clustering Threshold Tip: The `common_num` parameter is basically your similarity meter.

  • A lower threshold (2-3) creates larger, more inclusive clusters.
  • A medium threshold (4-6) requires that keywords have meaningful similarity to be grouped together.
  • A higher threshold (7+) creates very strict, smaller clusters. Only keywords with extremely similar search results will be grouped.

 

You can adjust this value to suit your needs, but it’s generally recommended to choose 4, 5, or 6. The ideal threshold depends entirely on your SEO strategy and goals.


8. Run the final cell to process your data and generate the output Excel file, which is downloaded automatically.


This approach eliminates the need to manually review thousands of keywords, helping you uncover semantic relationships and organize your keyword list with precision.
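
For readers curious about what threshold-based clustering can look like in code, here is one possible greedy approach. It is a sketch under stated assumptions, not the notebook's actual implementation: the function name and data shapes are hypothetical, and only the `common_num` parameter name is borrowed from the script. Keywords are visited from highest to lowest volume, and each unassigned keyword becomes the head of a new cluster that absorbs every remaining keyword sharing at least `common_num` URLs with it.

```python
def cluster_keywords(serps, volumes, common_num=5):
    """Greedy clustering by shared URLs.

    serps:   dict mapping keyword -> list of top-ranking URLs
    volumes: dict mapping keyword -> monthly search volume
    Returns a dict mapping cluster name -> list of member keywords.
    """
    url_sets = {kw: set(urls) for kw, urls in serps.items()}
    # Highest-volume keywords come first, so each cluster is named
    # after its highest-volume member.
    ordered = sorted(url_sets, key=lambda kw: volumes.get(kw, 0), reverse=True)

    clusters = {}
    assigned = set()
    for head in ordered:
        if head in assigned:
            continue
        members = [head]
        assigned.add(head)
        for other in ordered:
            if other in assigned:
                continue
            if len(url_sets[head] & url_sets[other]) >= common_num:
                members.append(other)
                assigned.add(other)
        clusters[head] = members
    return clusters


def urls(*ids):
    """Build placeholder URLs for the example below."""
    return [f"https://example.com/{i}" for i in ids]


# Hypothetical SERP data and volumes, for illustration only.
serps = {
    "buy printer": urls(1, 2, 3, 4, 5, 6),
    "printer for sale": urls(2, 3, 4, 5, 6, 7),
    "printer ink": urls(20, 21, 22, 23, 24, 25),
}
volumes = {"buy printer": 900, "printer for sale": 400, "printer ink": 700}

print(cluster_keywords(serps, volumes, common_num=5))
# {'buy printer': ['buy printer', 'printer for sale'], 'printer ink': ['printer ink']}
```

In this example the two purchase-intent keywords share five URLs, so they land in one cluster at a threshold of 5. Raising `common_num` to 6 would split them into separate clusters, which shows how sensitive the output is to this setting.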

Step 4: Using the data

The final output is an Excel file where keywords are organized into categories (clusters). This data can now be used for content planning, website architecture design and SEO strategy development.

For better analysis and visualization of your keyword clusters, you can use Pivot Tables in Excel. With just a few clicks, you can view keyword groups by cluster name and search volume.


Figure 2: Example of a Pivot Table displaying keyword clusters with search volumes
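
If you prefer to stay in Python, the same kind of summary a Pivot Table gives you can be approximated with a short aggregation. This is a sketch that assumes each row carries `Keyword`, `Volume`, and `Cluster Name` fields; the function name and the sample rows (including their volumes) are made up for illustration.

```python
from collections import defaultdict


def summarize_clusters(rows):
    """Return (cluster name, keyword count, total volume), largest total first."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        entry = totals[row["Cluster Name"]]
        entry[0] += 1              # number of keywords in the cluster
        entry[1] += row["Volume"]  # combined search volume
    summary = [(name, count, volume) for name, (count, volume) in totals.items()]
    return sorted(summary, key=lambda item: item[2], reverse=True)


# Hypothetical rows; the volumes are invented for this example.
rows = [
    {"Keyword": "buy printer", "Volume": 900, "Cluster Name": "buy printer"},
    {"Keyword": "printer for sale", "Volume": 400, "Cluster Name": "buy printer"},
    {"Keyword": "printer ink", "Volume": 700, "Cluster Name": "printer ink"},
]

for name, count, volume in summarize_clusters(rows):
    print(f"{name}: {count} keyword(s), total volume {volume}")
# buy printer: 2 keyword(s), total volume 1300
# printer ink: 1 keyword(s), total volume 700
```

Sorting by total volume puts the clusters with the largest combined demand at the top, which is often a useful starting order for content planning.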

Running a real-world example

To better understand how keyword clustering works in practice, let’s walk through a real-world scenario.

Suppose we’ve used a keyword research tool to extract a list of 2,000 keywords related to the term “keyword research.”

Our goal is to group these keywords into clusters, so we can determine which sets of keywords should be targeted on the same page of our website. After preparing the Excel file with the required columns, we upload it into the Python script.

In this example, we’ve set the clustering threshold (common_num) to 5—meaning that if two keywords share at least 5 common URLs in their Google SERP results, they’ll be grouped into the same cluster.

Once the script has run, it generates an Excel output file with the following columns:

  • Keyword: The list of all keywords
  • Volume: Search volume for each keyword
  • Cluster Name: The main representative keyword for each cluster (the one with the highest search volume)
  • Number of Keywords in Cluster: How many keywords are grouped in each cluster
  • URLs: The list of pages that rank in Google’s SERP for the given keyword
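
As an illustration of how such an output could be assembled, the sketch below flattens clusters into one row per keyword using the column names listed above. The function names, the `" | "` separator for the URLs column, and the sample data are assumptions, and CSV is used here only as a standard-library stand-in for the Excel export.

```python
import csv
import io

COLUMNS = ["Keyword", "Volume", "Cluster Name", "Number of Keywords in Cluster", "URLs"]


def build_rows(clusters, volumes, serps):
    """Flatten clusters into one row per keyword."""
    rows = []
    for name, members in clusters.items():
        for keyword in members:
            rows.append({
                "Keyword": keyword,
                "Volume": volumes[keyword],
                "Cluster Name": name,
                "Number of Keywords in Cluster": len(members),
                "URLs": " | ".join(serps[keyword]),  # assumed separator
            })
    return rows


def to_csv(rows):
    """Serialize rows to CSV text (a stand-in for the Excel export)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical inputs, shaped like the result of a clustering step.
clusters = {"buy printer": ["buy printer", "printer for sale"]}
volumes = {"buy printer": 900, "printer for sale": 400}
serps = {
    "buy printer": ["https://example.com/a", "https://example.com/b"],
    "printer for sale": ["https://example.com/b", "https://example.com/c"],
}

rows = build_rows(clusters, volumes, serps)
print(to_csv(rows))
```

Repeating the cluster name and cluster size on every row is what makes the file easy to group and filter afterwards.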

 

To analyze this data more effectively, you can use Pivot Tables in Excel.

  1. Go to the Insert tab and choose Pivot Table.
  2. Configure the fields as follows:
    • Add Cluster Name and Keyword to the Rows section
    • Add Volume to the Values section

Figure 3: Setting up a Pivot Table to analyze keyword clusters

 

The output will display a list of keywords along with their individual search volumes, clearly grouped by their respective clusters.


Figure 4: Example of clustered keywords with their respective search volumes

 

In this example, we can see that the keywords “advanced keyword research”, “advanced keyword research and analysis” and “advanced keyword research technique” have all been grouped into a single cluster. The total combined search volume for this cluster is 780.

A separate cluster titled “adwords keyword research” was also formed, with a total search volume of 1,140. The table lists every keyword included in this cluster.

Important note: After the clustering process is complete, it’s highly recommended to perform a final manual review to ensure the accuracy of the results. While the provided script is quite precise, like any automated process, there is always a small margin of error—especially in large-scale projects.

You can access the clustering script using this link.

Key takeaways

Keyword clustering transforms an overwhelming list of search terms into a manageable roadmap for your content creation efforts. By properly grouping keywords with similar intent, you can improve your site’s architecture and create targeted content that addresses user needs.

Whether you choose the manual method for smaller keyword sets or scale your efforts with the automated Python approach, the goal remains the same: create content that aligns with how search engines understand semantic relationships between terms.

Remember these key takeaways:

  • Keyword clustering prevents cannibalization where multiple pages compete for the same rankings
  • A clustering threshold of 4-6 common URLs works well for most websites
  • Always perform a final manual review of your clusters, regardless of how they were generated
  • Use your clusters to inform not just individual pages, but your overall site architecture

As search engines continue to evolve toward understanding topics and intent rather than just keywords, this clustering approach will become more valuable.

Morteza Najafi
Morteza Najafi is an SEO specialist with a passion for automating tasks and analyzing data with Python. He focuses on technical and semantic SEO and is also interested in creating scripts to automate repetitive SEO tasks.