Machine Learning SEO

Python for SEO using Google Search Console

In this post we will learn how to use Python to improve your site's SEO and spot opportunities. We will start by exploring Google Search Console data to find topics that convert well, as well as topics that get impressions but don't convert.

Let's start by opening Google Search Console and exporting the performance data for our website.

Next, upload the Queries file exported from Google Search Console to Google Drive.

Next we will look at the queries and build topic clusters, which will help us figure out the next course of action (note: we will not use LDA here). We will use a simple TfidfVectorizer to convert the query text into vectors, then run the KMeans clustering algorithm on those vectors to generate clusters.

Python and Machine Learning

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from google.colab import drive

# Mount the drive.
drive.mount('/content/drive')

# Read the performance data into Pandas data frame.

df = pd.read_csv("/content/drive/My Drive/ColabData SearchConsole/Queries.csv")

documents = list(df.Query.values)
vectorizer = TfidfVectorizer(stop_words='english', ngram_range=(1,3))
X = vectorizer.fit_transform(documents)

true_k = 20
model = KMeans(n_clusters=true_k, init='k-means++', max_iter=1000, n_init=42, random_state=42)
model.fit(X)

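print("Top terms per cluster:")
# Sort each centroid's feature weights in descending order.
order_centroids = model.cluster_centers_.argsort()[:, ::-1]
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out().
terms = vectorizer.get_feature_names_out()
for i in range(true_k):
    # Keep the 10 highest-weighted terms for this cluster.
    cluster_terms = [' %s' % terms[ind] for ind in order_centroids[i, :10]]
    print("Cluster {}, Terms : {}".format(i, cluster_terms))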

Top terms per cluster:
Cluster 0, Terms : [' classification', ' multiclass', ' keras', ' multiclass classification', ' classification keras', ' lstm', ' multiclass classification keras', ' keras classification', ' keras multiclass', ' text classification']
Cluster 1, Terms : [' uiactivityviewcontroller', ' uiactivityviewcontroller swift', ' uiactivityviewcontroller share', ' share', ' swift', ' uiactivityviewcontroller ipad', ' ipad', ' url', ' swift uiactivityviewcontroller', ' uiactivityviewcontroller delegate']
Cluster 2, Terms : [' ios', ' xcode', ' ios swift', ' charts', ' swift', ' charts ios', ' tutorial', ' ios charts', ' s3', ' app']
Cluster 3, Terms : [' inventory', ' iphone', ' inventory app', ' app', ' app iphone', ' inventory app iphone', ' barcode', ' ios inventory', ' management', ' inventory management']
Cluster 4, Terms : [' multi class', ' class', ' multi', ' class classification', ' multi class classification', ' classification', ' keras', ' keras multi class', ' python', ' classification keras']
Cluster 5, Terms : [' uitableview', ' delegate', ' swift', ' delegate swift', ' uitableview delegate', ' datasource', ' grouped', ' delegate datasource', ' swift uitableview', ' uitableview tutorial']
Cluster 6, Terms : [' make', ' like', ' uber', ' app', ' like app', ' uber like app', ' uber like', ' make app', ' like uber', ' make uber']
Cluster 7, Terms : [' swift', ' share', ' share swift', ' swift share', ' charts', ' uipageviewcontroller', ' charts swift', ' uipageviewcontroller swift', ' coredata', ' chart']
Cluster 8, Terms : [' tutorial', ' apple', ' pay', ' apple pay', ' swift', ' swift tutorial', ' xcode', ' serverless', ' aws', ' app tutorial']
Cluster 9, Terms : [' uisearchbar', ' uisearchbar swift', ' swift', ' swift uisearchbar', ' uisearchbar example', ' uisearchbar delegate', ' ios uisearchbar', ' uisearchbar ios', ' uisearchbar tutorial', ' example']
Cluster 10, Terms : [' uitableviewcell', ' uitableviewcell swift', ' init', ' custom', ' uitableviewcell init', ' swift uitableviewcell', ' swift', ' uitableview uitableviewcell', ' custom uitableviewcell', ' uitableviewcell custom']
Cluster 11, Terms : [' counter', ' step', ' step counter', ' counter app', ' iphone', ' step counter app', ' app', ' counter iphone', ' iphone step counter', ' iphone step']
Cluster 12, Terms : [' core', ' core data', ' data', ' data swift', ' core data swift', ' swift', ' core data tutorial', ' data tutorial', ' swift core', ' tutorial']
Cluster 13, Terms : [' uber', ' uber app', ' app', ' build uber', ' build uber app', ' build', ' tutorial', ' iphone', ' create uber', ' create uber app']
Cluster 14, Terms : [' pedometer', ' iphone', ' pedometer app', ' iphone pedometer', ' pedometer iphone', ' app', ' pedometer app iphone', ' app iphone', ' pedometer swift', ' swift pedometer']
Cluster 15, Terms : [' tableview', ' tableview delegate', ' tableview datasource', ' delegate', ' swift tableview', ' datasource', ' swift', ' methods', ' grouped', ' grouped tableview']
Cluster 16, Terms : [' search', ' bar', ' search bar', ' swift search', ' swift', ' bar swift', ' search bar swift', ' swift search bar', ' search bar ios', ' bar ios']
Cluster 17, Terms : [' example', ' word2vec', ' sagemaker', ' serverless', ' framework', ' cloudkit', ' python', ' uivisualeffectview', ' serverless framework', ' view']
Cluster 18, Terms : [' uialertcontroller', ' uialertcontroller swift', ' uialertcontroller image', ' image', ' swift uialertcontroller', ' ios uialertcontroller', ' uialertcontroller ios', ' uialertcontroller example', ' swift', ' ios']
Cluster 19, Terms : [' lsa', ' gensim', ' vs', ' gensim lsa', ' lsa vs', ' word2vec', ' lda', ' lsa python', ' lsa word2vec', ' lsa gensim']
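The value true_k = 20 above is picked by hand. One possible way to sanity-check it, sketched here on a handful of made-up queries, is to compare the silhouette score for a few candidate values of k. This is not part of the original workflow; the query list and the candidate range are assumptions for illustration only.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score

# Hypothetical queries; in practice this would be df.Query from the export.
queries = [
    "core data swift", "core data tutorial", "swift core data example",
    "uber like app", "build uber app", "make app like uber",
    "step counter app", "iphone step counter",
]

X = TfidfVectorizer(stop_words="english", ngram_range=(1, 3)).fit_transform(queries)

# Fit KMeans for each candidate k and record the silhouette score
# (higher means tighter, better separated clusters).
scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores)
print("Best k by silhouette score:", best_k)
```

Short search queries produce very sparse vectors, so the scores tend to be low and close together; treat the result as a hint rather than a rule, and still read the top terms per cluster.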

Google Search Console lets us download around 1000 queries, and with the method above we have been able to cluster similar search queries together. Now it's time to map the clusters to well-defined topics.

Next we will assign a topic label to each cluster by looking at the terms the clustering algorithm has given us. Here are the manually assigned topics.

topic_map = {
    0: 'Multi Class Classification',
    1: 'iOS Sharing (How To)',
    2: 'iOS Charts',
    3: 'iOS BarCode and Inventory Management',
    4: 'Multi Class Classification',
    5: 'iOS UI TableViews',
    6: 'How to make an app series',
    7: 'Core Data & Swift',
    8: 'Apple Pay',
    9: 'Implement Search in iOS',
    10: 'UITableViews & Layouts',
    11: 'iOS HealthKit',
    12: 'Core Data',
    13: 'Build by example: Uber',
    14: 'iOS Step Counter App',
    15: 'iOS UI TableViews',
    16: 'Swift Search Bar',
    17: 'AWS Serverless',
    18: 'Notifications / Alerts Swift',
    19: 'Word Vectors'    
}
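Because the labels are assigned by hand, it can help to check them once every query has a cluster id (that step comes next in this post). The sketch below uses a small made-up frame and a deliberately incomplete map; the column names follow this post, everything else is a placeholder. It reports cluster ids that have no label and shows a few sample queries per cluster so the labels can be checked against real queries rather than only the top terms.

```python
import pandas as pd

# Hypothetical data: in the post, 'cluster' comes from model.predict(...).
df = pd.DataFrame({
    "Query": [
        "core data swift", "core data tutorial",
        "build uber app", "uber like app",
        "step counter app",
    ],
    "cluster": [0, 0, 1, 1, 2],
})

# Hypothetical, deliberately incomplete map: cluster 2 has no label yet.
topic_map = {0: "Core Data", 1: "Build by example: Uber"}

# Cluster ids that appear in the data but are missing from the map.
missing = sorted(set(df["cluster"].unique().tolist()) - set(topic_map))
print("Clusters without a label:", missing)

# Up to three sample queries per cluster, to review the labels by eye.
samples = df.groupby("cluster")["Query"].apply(lambda s: list(s.head(3)))
print(samples)
```

Any cluster id missing from the map would otherwise become NaN when the map is applied, and would then silently drop out of a groupby on the label.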

Next we will use the model to predict the cluster for each search query we got from Google Search Console.

# This will generate labels from 0 to 19 and we will use the map above to map it to readable cluster labels

df['cluster'] = model.predict(vectorizer.transform(df.Query))

df['cluster_label'] = df.cluster.map(topic_map)

# Next, compute the CTR from the click and impression data.
df['ctr'] = df.Clicks/df.Impressions

# Find the topic clusters that convert best, and the ones that get impressions but don't convert well.
df.groupby('cluster_label')[['ctr', 'Impressions']].mean().reset_index().sort_values('ctr', ascending=False)


How to use this data

Clusters with a low CTR but high impressions are the opportunity zone: you can either write new posts or improve the existing pages to make them more useful. Start by updating the meta description, i.e. the snippet that Google shows for your link.

  • Add more topic pages and improve the existing ones.
  • Find related keywords for these topic clusters and add new pages.
  • Link to these new pages from existing high-traffic pages.
  • Once we start building more content and improving existing pages, we can reuse the model built above to see how the CTR of each topic improves over time, by automating the Google Search Console export and the clustering method described above.
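The "opportunity zone" described above can be pulled out programmatically. The following is a rough sketch on made-up numbers: it sums clicks and impressions per topic, recomputes the CTR from those totals, and keeps topics with above-median impressions and below-median CTR. The median thresholds are an assumption chosen for illustration, not a rule from this post.

```python
import pandas as pd

# Hypothetical labelled data, using the same column names as this post.
df = pd.DataFrame({
    "cluster_label": ["Core Data", "Core Data", "Apple Pay", "Apple Pay", "iOS Charts"],
    "Clicks": [30, 10, 2, 1, 5],
    "Impressions": [300, 100, 900, 600, 50],
})

# Aggregate per topic, then compute CTR from the totals so that
# high-impression queries weigh more than rare ones.
by_topic = df.groupby("cluster_label")[["Clicks", "Impressions"]].sum()
by_topic["ctr"] = by_topic["Clicks"] / by_topic["Impressions"]

# Assumed thresholds: above-median impressions and below-median CTR.
high_impressions = by_topic["Impressions"] > by_topic["Impressions"].median()
low_ctr = by_topic["ctr"] < by_topic["ctr"].median()

opportunities = by_topic[high_impressions & low_ctr].sort_values(
    "Impressions", ascending=False
)
print(opportunities)
```

Note that this computes CTR from summed clicks and impressions, which differs from averaging the per-query CTR as done earlier; either can be used, but the summed version is less sensitive to queries with only a handful of impressions.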

I am not an SEO pro, so please share any feedback. If you are interested in collaborating on areas related to SEO and machine learning, feel free to ping me.

About the author

Shrikar

Backend/Infrastructure Engineer by Day. iOS Developer for the rest of the time.