
Semantic Keyword Clustering

Do you frequently question if two keywords can be targeted together on a page, or struggle with a large list of keywords that ChatGPT or other tools can't cluster due to token limits or cost?


Last updated 5 months ago

What Are the Differences Between SEO Utils' Keyword Clustering and Other Tools?

Here are two of the main differences:

Flexible to Switch the Embedding Model

Embedding models in Natural Language Processing (NLP) are designed to convert words, phrases, sentences, or entire documents into numerical vectors. These vectors represent the linguistic features of the text, allowing machines to process and analyze language in a meaningful way.
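To make the idea of "numerical vectors" concrete, here is a minimal sketch that compares keywords by the cosine similarity of their vectors. The three-dimensional vectors are made-up toy values for illustration (real embedding models output hundreds of dimensions), and this is not necessarily how SEO Utils computes similarity internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings"; a real model would produce these vectors.
toy_vectors = {
    "buy running shoes": [0.9, 0.1, 0.0],
    "running shoes for sale": [0.8, 0.2, 0.1],
    "how to bake bread": [0.0, 0.2, 0.9],
}

close = cosine_similarity(
    toy_vectors["buy running shoes"], toy_vectors["running shoes for sale"]
)
far = cosine_similarity(
    toy_vectors["buy running shoes"], toy_vectors["how to bake bread"]
)
print(f"related keywords: {close:.3f}, unrelated keywords: {far:.3f}")
```

Keywords with similar meaning end up with vectors that point in a similar direction, so their score is closer to 1.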

To do keyword clustering well, you need a good pre-trained model. With AI advancing quickly, new models are released almost every day. You can visit HuggingFace to get a free model and try it in SEO Utils to find the one that works best for your type of business.

You can also take one of these models and fine-tune it on vocabulary specific to your niche or industry. Then, use this customized model in SEO Utils for even better keyword clustering, which can improve your SEO results.

Unlimited Keywords for Clustering

With SEO Utils, you're not restricted in the number of keywords you can cluster. This is a big advantage over other tools that limit you to between 5,000 and 10,000 keywords at a time. Since SEO Utils runs on your computer, it can handle as many keywords as you need, going way beyond these limits.

There's also no credit-based system, meaning you don't have to pay extra no matter how many keywords you cluster. This can mean big savings, especially in large niches like Gym or Fitness where you might need to cluster a million keywords.

You might think, "Can't I just cluster keywords with ChatGPT or the OpenAI API?" You can cluster a few hundred keywords that way, but token limits get in the way once you go beyond 10,000 keywords. Even with GPT-4 Turbo, which allows more tokens, clustering quality drops as the list grows: it often loses context, doesn't follow instructions well, and misses keywords. In ChatGPT you also cannot control the temperature parameter. You can set it through the OpenAI API, but the cost is too high.

That's where a dedicated keyword clustering tool like SEO Utils makes a big difference.

Semantic Clustering vs SERP Clustering

In my experience, SERP Clustering always gives the best clustering results. However, it comes with technical overhead such as proxy rotation, long processing times, and heavy server resource usage.

Take Larseo's SERP Clustering, for example. It lets you cluster unlimited keywords, but clustering 1 million keywords takes a really long time and can cost about $2,900 (at 0.5 credit per keyword).
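The arithmetic behind that figure can be sketched as follows. The per-credit price is not stated above; it is only the value implied by the two quoted numbers, so treat it as an assumption and not as the provider's actual price list.

```python
# Figures quoted in the paragraph above.
keywords = 1_000_000
credits_per_keyword = 0.5
quoted_total_usd = 2900

credits_needed = keywords * credits_per_keyword
# Assumption: the price per credit is derived from the quoted total.
implied_usd_per_credit = quoted_total_usd / credits_needed


def serp_clustering_cost(keyword_count, per_keyword=credits_per_keyword,
                         usd_per_credit=implied_usd_per_credit):
    """Estimate the cost of a credit-based clustering run (hypothetical helper)."""
    return keyword_count * per_keyword * usd_per_credit


print(f"credits needed: {credits_needed:,.0f}")
print(f"implied price per credit: ${implied_usd_per_credit:.4f}")
print(f"estimated cost for 100,000 keywords: ${serp_clustering_cost(100_000):,.2f}")
```

The cost grows linearly with the keyword count, which is why large niches get expensive quickly under a credit model.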

On the other hand, using the Semantic Clustering feature in SEO Utils is a different story. You don't have to pay extra, and you can get results as good as SERP Clustering. You can achieve this by fine-tuning your model to suit your specific needs.

SEO Utils will support fine-tuning soon!
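For intuition, semantic clustering boils down to grouping keywords whose vectors are close to each other. The sketch below uses a simple greedy threshold approach on made-up toy vectors. It is an illustration only, not the algorithm SEO Utils uses, and the `threshold` value of 0.8 is an arbitrary assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def greedy_cluster(embeddings, threshold=0.8):
    """Put each keyword into the first cluster whose seed keyword is similar
    enough; otherwise start a new cluster with this keyword as the seed."""
    clusters = []  # list of (seed_vector, [keywords])
    for keyword, vector in embeddings.items():
        for seed_vector, members in clusters:
            if cosine_similarity(seed_vector, vector) >= threshold:
                members.append(keyword)
                break
        else:
            clusters.append((vector, [keyword]))
    return [members for _, members in clusters]


# Hypothetical toy vectors; a real embedding model would produce these.
toy_embeddings = {
    "gym membership cost": [0.9, 0.1, 0.0],
    "how much is a gym membership": [0.85, 0.15, 0.05],
    "protein powder for beginners": [0.1, 0.9, 0.1],
    "best protein powder": [0.05, 0.95, 0.0],
}

groups = greedy_cluster(toy_embeddings)
for group in groups:
    print(group)
```

A model that is better suited to your niche moves related keywords closer together, so the same grouping step produces cleaner clusters.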

How to Download Embedding Models and Use Them in SEO Utils?

This documentation is for Semantic Clustering v1. For a better experience and improved performance, please refer to the documentation for Version 2. Also, version 1 is not available for Linux.

First, you can visit this leaderboard: https://huggingface.co/spaces/mteb/leaderboard

  1. Click on the "Clustering" tab, and then select the language that matches your keywords.

  2. You will see the top embedding models based on their clustering task performance. Select one model, for example, https://huggingface.co/thenlper/gte-large

Only select a model that can be used with Sentence Transformers.

  3. Click on "Clone repository" to download the model with Git and git-lfs:

# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/thenlper/gte-large

# if you want to clone without large files – just their pointers
# prepend your git clone with the following env var:
GIT_LFS_SKIP_SMUDGE=1

  4. After downloading a model, open SEO Utils on your machine.

  5. Click on the App dropdown, and go to the Settings page.

  6. Scroll down to the Keyword Clustering section and enter the path to the downloaded model on your machine. Then hit the Save button.

  7. That's all. Now, you can go to the Keyword Clustering page and kick off the process.
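If clustering fails after you set the model path, one thing worth checking is whether the clone actually contains the model weights. The sketch below is a hypothetical helper, not part of SEO Utils. It relies on two assumptions: models saved by Sentence Transformers usually include a `modules.json` file at the top level, and files that Git LFS did not download are small text stubs starting with the LFS spec line. The path in the usage part is a placeholder.

```python
from pathlib import Path

# First line of a Git LFS pointer file (left behind when large files are skipped).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return files that are still Git LFS pointer stubs instead of real content."""
    stubs = []
    for path in Path(model_dir).rglob("*"):
        if ".git" in path.parts or not path.is_file():
            continue
        with path.open("rb") as handle:
            if handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                stubs.append(path)
    return sorted(stubs)


def check_model_dir(model_dir):
    """Return a list of human-readable problems; an empty list means none found."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    # Heuristic only: some compatible models may not ship this file.
    if not (root / "modules.json").is_file():
        problems.append("modules.json not found; this may not be a Sentence Transformers model")
    for stub in find_lfs_pointers(root):
        problems.append(f"{stub.name} is a Git LFS pointer; run 'git lfs pull' in the model folder")
    return problems


if __name__ == "__main__":
    # Placeholder path; replace it with the folder you cloned.
    for problem in check_model_dir("gte-large"):
        print(problem)
```

An empty result does not guarantee the model will load, but it rules out the two most common download mistakes.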

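Popular Models

I will provide a list of popular models on Google Drive so that you can easily download them.

English

thenlper/gte-large
Download: https://drive.google.com/file/d/1O4lm4hnqXoCDloqaw_exWiz9a-Vy0ZZK/view?usp=sharing
More info: https://huggingface.co/thenlper/gte-large

BAAI/bge-large-en-v1.5
Download: https://drive.google.com/file/d/1e3C2t3r1UrYgiAJf8GipmEUpXFK7zThz/view?usp=sharing
More info: https://huggingface.co/BAAI/bge-large-en-v1.5

sentence-transformers/sentence-t5-xl
Download: https://drive.google.com/file/d/15ijnzpEfbZG0dTiXq6Gyyesqj5DCDaEU/view?usp=sharing
More info: https://huggingface.co/sentence-transformers/sentence-t5-xl

Dutch

NetherlandsForensicInstitute/robbert-2022-dutch-sentence-transformers
Download: https://drive.google.com/file/d/1sbQ60drQeuqVTFYlbeuDp-Ds8Eoxts8D/view?usp=sharing
More info: https://huggingface.co/NetherlandsForensicInstitute/robbert-2022-dutch-sentence-transformers

textgain/allnli-GroNLP-bert-base-dutch-cased
Download: https://drive.google.com/file/d/1MKDmybaQihStFcV6N4hxaRz3TINZ5B_4/view?usp=sharing
More info: https://huggingface.co/textgain/allnli-GroNLP-bert-base-dutch-cased

Swedish

KBLab/sentence-bert-swedish-cased
Download: https://drive.google.com/file/d/1-eyV7KURYSd16pJeTWgMg9Sm_2DeEyDG/view
More info: https://huggingface.co/KBLab/sentence-bert-swedish-cased

Spanish

hiiamsid/sentence_similarity_spanish_es
Download: https://drive.google.com/file/d/1a59Ld6LA_HCWwcEd7PQPogSCGr8_8yRU/view?usp=sharing
More info: https://huggingface.co/hiiamsid/sentence_similarity_spanish_es

Japanese

colorfulscoop/sbert-base-ja
Download: https://drive.google.com/file/d/1IMLy6YlVM1irIFwS3eNTrL9GZEdwfcJC/view?usp=sharing
More info: https://huggingface.co/colorfulscoop/sbert-base-ja

sonoisa/sentence-bert-base-ja-mean-tokens
Download: https://drive.google.com/file/d/1VmRKpCjnEpY-jY6o_6SYsqUPFajGhbmN/view?usp=sharing
More info: https://huggingface.co/sonoisa/sentence-bert-base-ja-mean-tokens

Filipino

meedan/paraphrase-filipino-mpnet-base-v2
Download: https://drive.google.com/file/d/1PToqNZKmc1GrrNxPIf2yDR9pC05kyfoD/view?usp=sharing
More info: https://huggingface.co/meedan/paraphrase-filipino-mpnet-base-v2

danjohnvelasco/filipino-sentence-roberta-v1
Download: https://drive.google.com/file/d/1GVzkYOCRuL2QHFOLCDKPRZlQqbGH5wqZ/view?usp=sharing
More info: https://huggingface.co/danjohnvelasco/filipino-sentence-roberta-v1

Chinese

uer/sbert-base-chinese-nli
Download: https://drive.google.com/file/d/16d_5UUxM8cUGy7TXCE0FJitg2NQNnpVc/view?usp=sharing
More info: https://huggingface.co/uer/sbert-base-chinese-nli

Thai

mrp/simcse-model-m-bert-thai-cased
Download: https://drive.google.com/file/d/16AukQ0XCBbWyBTTPl7mh6bTb7_chQtjz/view?usp=sharing
More info: https://huggingface.co/mrp/simcse-model-m-bert-thai-cased

Vietnamese

keepitreal/vietnamese-sbert
Download: https://drive.google.com/file/d/1xMx7x78Dgyv7HcFOYZJALDDEMX3p3an3/view?usp=sharing
More info: https://huggingface.co/keepitreal/vietnamese-sbert

Arabic

medmediani/Arabic-KW-Mdel
Download: https://drive.google.com/file/d/1dtvn7L4ItcCr3G5M0SOYIKyR4k8hkz1C/view?usp=sharing
More info: https://huggingface.co/medmediani/Arabic-KW-Mdel

Indonesian

firqaaa/indo-sentence-bert-base
Download: https://drive.google.com/file/d/1W6df3Ij-NPCgmdwgv9DyBO7XEe2-MBvM/view?usp=sharing
More info: https://huggingface.co/firqaaa/indo-sentence-bert-base

Universal

These models are pre-trained on multiple languages. If you cannot find a model that is pre-trained for a specific language, you can use one of these universal models.

sentence-transformers/all-mpnet-base-v2
Download: https://drive.google.com/file/d/15ijnzpEfbZG0dTiXq6Gyyesqj5DCDaEU/view?usp=sharing
More info: https://huggingface.co/sentence-transformers/all-mpnet-base-v2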

Updated Nov 26, 2024: Semantic Keyword Clustering v2

With the release of SEO Utils v1.23.2, I've introduced Semantic Keyword Clustering v2, which leverages a consistent Docker environment for running the clustering Python script. Here's how this upgrade improves your experience:

  • Streamlined Model Selection: No more manual downloads from HuggingFace. Simply select your model from a dropdown menu, and SEO Utils takes care of the rest.

  • Faster Clustering with GPU Support: Speed up keyword clustering with GPU support for Windows and Linux. While macOS only supports CPU, version 2 still delivers faster clustering compared to version 1!

  • Polished Output: All formatting issues in the output file have been resolved.

  • Improved Reliability: Minimize unexpected issues with a more stable setup.

  • Simplified Updates: Updating the clustering script is now easier for me, as I no longer need to build separate executable scripts for each platform.

Why do I still keep version 1? Some users run SEO Utils on a VPS, and not every VPS can support Docker. That's why version 1 is still available as a backup option. If your setup supports Docker, I highly recommend using version 2!

How to Install Docker

  • Visit https://www.docker.com/, download it, and install it as you normally would.

  • Open Docker and follow the app's instructions to start it as recommended.

You can close Docker when you're not using Semantic Clustering version 2.
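If you are not sure whether Docker is installed and running before you switch versions, a quick check like the sketch below can help. It is a hypothetical helper, not part of SEO Utils, and it assumes that the `docker info` command exits with a non-zero status when the Docker daemon cannot be reached.

```python
import shutil
import subprocess


def docker_is_running(timeout_seconds=10):
    """Return True if the docker CLI exists and can reach the Docker daemon."""
    if shutil.which("docker") is None:
        return False  # docker CLI is not installed or not on PATH
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if docker_is_running():
        print("Docker is running.")
    else:
        print("Docker is not running. Start Docker first.")
```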

Switch to Version 2

Step 1: Go to Settings > Services from the left sidebar.

Step 2: Select Version 2 from the dropdown and click the Save button.

Step 3: Open the Semantic Clustering tool as you normally would. You'll notice a new field called "Embedding Model". Simply choose a model from the dropdown based on your language to start clustering keywords.

No more complicated setups or manual downloads! SEO Utils will automatically download the model and cache it for future use, so you won't need to wait for it to re-download every time.


