Crafting SEO-winning content means understanding Google’s perception of your keywords. Do they stand alone, or can they coexist on a single page? This script sheds light on that very question!
import pandas as pd
from googlesearch import search
from urllib.parse import urlparse
# Load the CSV file with keywords
keywords_df = pd.read_csv("ORGANIC POSITIONS FROM SEMRUSH EXPORT.csv")
# Create a dictionary to store SERP results for each keyword
serps_dict = {}
# Function to get SERP results for a keyword
def get_serps(keyword):
serps = list(search(keyword, num=10, stop=10))
return [urlparse(url).netloc for url in serps]
# Iterate through the keywords and get SERP results
for keyword in keywords_df["Keyword"]:
serps_dict[keyword] = get_serps(keyword)
# Create a matrix to store the overlap percentages
matrix = pd.DataFrame(index=keywords_df["Keyword"], columns=keywords_df["Keyword"])
# Function to calculate the overlap percentage between two keywords
def calculate_overlap(keyword1, keyword2):
serps1 = set(serps_dict[keyword1])
serps2 = set(serps_dict[keyword2])
overlap_count = len(serps1.intersection(serps2))
if len(serps1) == 0 or len(serps2) == 0:
return 0 # Handle division by zero
overlap_percentage = (overlap_count / max(len(serps1), len(serps2))) * 100
return overlap_percentage
# Fill in the matrix with overlap percentages
for i in range(len(keywords_df)):
for j in range(i, len(keywords_df)):
keyword1 = keywords_df.iloc[i]["Keyword"]
keyword2 = keywords_df.iloc[j]["Keyword"]
overlap_percentage = calculate_overlap(keyword1, keyword2)
matrix.at[keyword1, keyword2] = overlap_percentage
matrix.at[keyword2, keyword1] = overlap_percentage
# Save the matrix as a CSV file
matrix.to_csv("serp_overlap_matrix.csv")
What It Does:
- Analyzes your existing keywords from a Semrush export.
- Scrapes the top 10 Google search results for each keyword.
- Calculates the percentage of overlapping websites in those results.
- Outputs a matrix showing the overlap between all keyword pairs.
Below is an example of how the spreadsheet will look with the addition of some conditional formatting. (please note I have switched to dummy keywords in this instance.)
Why It Matters:
- Unveiling Intent: High overlap suggests Google views the keywords as similar, potentially justifying combining them on a single page.
- Content Planning: Optimize content based on Google’s perceived relationship between keywords.
- Improved Efficiency: Group related keywords for efficient content creation and management.
How to Use It:
- Prepare: Ensure your Semrush export includes a “Keyword” column.
- Customize: Replace “ORGANIC POSITIONS FROM SEMRUSH EXPORT.csv” with your file path.
- Run: Execute the script (using Python and libraries).
- Analyze: Explore the “serp_overlap_matrix.csv” file.
Beyond Numbers:
Remember, overlap percentage is just one piece of the puzzle. Consider:
- Search Intent: Do keywords have informational, transactional, or navigational intent? Can they be addressed together?
- Content Uniqueness: Can you create unique, valuable content even for overlapping keywords?
- User Experience: Will grouping keywords confuse users or enhance their experience?
This script is a valuable tool, but SEO decisions require human expertise and judgment. Use it to inform, not dictate, your content strategy.
Happy keyword analysis!
A little extra:
The below script will then pair up similar keywords discovered in the anlaysis based on the percentage parameters you set.
import pandas as pd
# Load the CSV with SERP overlap data
df = pd.read_csv('serp_overlap_matrix.csv', index_col=0)
# Define a function to filter URLs with a minimum threshold of 60% and a maximum of 99%
def filter_overlap(row):
return [keyword for keyword, overlap_percentage in row.items() if 39 <= overlap_percentage <= 99]
# Apply the filtering function to each row and store the results in a new column
df['Overlapping Keywords'] = df.apply(filter_overlap, axis=1)
# Create a dictionary to store matching keywords and their URLs
matching_data = {}
# Iterate through each keyword and its overlapping keywords
for keyword, overlapping_keywords in df['Overlapping Keywords'].items():
for overlapping_keyword in overlapping_keywords:
if overlapping_keyword not in matching_data:
matching_data[overlapping_keyword] = []
matching_data[overlapping_keyword].append(keyword)
# Create a DataFrame from the matching data
result_df = pd.DataFrame.from_dict(matching_data, orient='index').T
# Save the resulting CSV
result_df.to_csv('filtered_overlap_results.csv', index=False)