Crafting SEO-winning content means understanding Google’s perception of your keywords. Do they stand alone, or can they coexist on a single page? This script sheds light on that very question!

import pandas as pd
from googlesearch import search
from urllib.parse import urlparse

# Load the CSV file with keywords
keywords_df = pd.read_csv("ORGANIC POSITIONS FROM SEMRUSH EXPORT.csv")

# Create a dictionary to store SERP results for each keyword
serps_dict = {}

# Function to get SERP results for a keyword
def get_serps(keyword):
serps = list(search(keyword, num=10, stop=10))
return [urlparse(url).netloc for url in serps]

# Iterate through the keywords and get SERP results
for keyword in keywords_df["Keyword"]:
serps_dict[keyword] = get_serps(keyword)

# Create a matrix to store the overlap percentages
matrix = pd.DataFrame(index=keywords_df["Keyword"], columns=keywords_df["Keyword"])

# Function to calculate the overlap percentage between two keywords
def calculate_overlap(keyword1, keyword2):
serps1 = set(serps_dict[keyword1])
serps2 = set(serps_dict[keyword2])
overlap_count = len(serps1.intersection(serps2))
if len(serps1) == 0 or len(serps2) == 0:
return 0 # Handle division by zero
overlap_percentage = (overlap_count / max(len(serps1), len(serps2))) * 100
return overlap_percentage

# Fill in the matrix with overlap percentages
for i in range(len(keywords_df)):
for j in range(i, len(keywords_df)):
keyword1 = keywords_df.iloc[i]["Keyword"]
keyword2 = keywords_df.iloc[j]["Keyword"]
overlap_percentage = calculate_overlap(keyword1, keyword2)
matrix.at[keyword1, keyword2] = overlap_percentage
matrix.at[keyword2, keyword1] = overlap_percentage

# Save the matrix as a CSV file
matrix.to_csv("serp_overlap_matrix.csv")

What It Does:

Why It Matters:

How to Use It:

  1. Prepare: Ensure your Semrush export includes a “Keyword” column.
  2. Customize: Replace “ORGANIC POSITIONS FROM SEMRUSH EXPORT.csv” with your file path.
  3. Run: Execute the script (using Python and libraries).
  4. Analyze: Explore the “serp_overlap_matrix.csv” file.

Beyond Numbers:

Remember, overlap percentage is just one piece of the puzzle. Consider:

This script is a valuable tool, but SEO decisions require human expertise and judgment. Use it to inform, not dictate, your content strategy.

Happy keyword analysis!

A little extra:

The below script will then pair up similar keywords discovered in the anlaysis based on the percentage parameters you set.

import pandas as pd

# Load the CSV with SERP overlap data
df = pd.read_csv('serp_overlap_matrix.csv', index_col=0)

# Define a function to filter URLs with a minimum threshold of 60% and a maximum of 99%
def filter_overlap(row):
return [keyword for keyword, overlap_percentage in row.items() if 39 <= overlap_percentage <= 99]

# Apply the filtering function to each row and store the results in a new column
df['Overlapping Keywords'] = df.apply(filter_overlap, axis=1)

# Create a dictionary to store matching keywords and their URLs
matching_data = {}

# Iterate through each keyword and its overlapping keywords
for keyword, overlapping_keywords in df['Overlapping Keywords'].items():
for overlapping_keyword in overlapping_keywords:
if overlapping_keyword not in matching_data:
matching_data[overlapping_keyword] = []
matching_data[overlapping_keyword].append(keyword)

# Create a DataFrame from the matching data
result_df = pd.DataFrame.from_dict(matching_data, orient='index').T

# Save the resulting CSV
result_df.to_csv('filtered_overlap_results.csv', index=False)

Leave a Reply

Your email address will not be published. Required fields are marked *