TF-IDF (Term Frequency-Inverse Document Frequency)

This method helps determine the importance of a keyword in a text compared to other articles on the topic. The “database” for TF-IDF analysis usually consists of texts from competitors’ pages that occupy high positions in Google.

Here’s how to use the TF-IDF method for an article:

1. Collect competitors’ texts for analysis. Let’s say there are five of them.

2. Calculate Term Frequency (TF), which determines the frequency of a term in a text:

TF = (number of times the term appears in the text) / (total number of words in the text)

Example: If the term “green car” appears ten times in the text and the total number of words is 200, its TF = 10/200 = 0.05.

3. Calculate the Inverse Document Frequency (IDF), which measures the importance of a term in a collection of documents:

IDF = log10(total number of documents / number of documents containing the term) + 1

Example: For a term that appears in three out of five documents, IDF = log10(5/3) + 1 ≈ 1.22.

4. Calculate TF-IDF using the numbers obtained above:

TF-IDF = TF × IDF

Example: TF-IDF = 0.05 × 1.22 = 0.061.

If the TF-IDF for most keywords on the key competitors’ pages is within 0.05-0.1, a value of 0.061 falls within that range and indicates that the keyword is important.

The TF-IDF score can vary widely, from very low values (0.001) to high values (10 or more, typically when a tool counts raw occurrences instead of dividing by the text length), depending on the frequency of the term in the document and its use in competitors’ articles.
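The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any particular SEO tool: the function names, the simple word tokenizer, the base-10 logarithm, and the sample texts are all assumptions made for the example. The sample data is built so that it reproduces the numbers from the “green car” walkthrough.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into words (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_term(term, text):
    """Count occurrences of a term, which may consist of several words."""
    words, key = tokenize(text), tokenize(term)
    if not key:
        return 0
    return sum(
        1 for i in range(len(words) - len(key) + 1)
        if words[i:i + len(key)] == key
    )


def term_frequency(term, text):
    """Step 2: TF = occurrences of the term / total number of words."""
    total_words = len(tokenize(text))
    return count_term(term, text) / total_words if total_words else 0.0


def inverse_document_frequency(term, documents):
    """Step 3: IDF = log10(total documents / documents with the term) + 1."""
    containing = sum(1 for doc in documents if count_term(term, doc) > 0)
    if containing == 0:
        return 0.0  # assumption: a term found in no document gets a score of 0
    return math.log10(len(documents) / containing) + 1


def tf_idf(term, text, documents):
    """Step 4: TF-IDF = TF * IDF."""
    return term_frequency(term, text) * inverse_document_frequency(term, documents)


# Made-up sample data: a 200-word text with "green car" used ten times,
# and five competitor texts (step 1), three of which mention the term.
my_text = "green car " * 10 + "filler " * 180
competitors = [
    "buy a green car today",
    "green car reviews and prices",
    "why a green car saves money",
    "used trucks for sale",
    "motorcycle maintenance tips",
]

print(round(term_frequency("green car", my_text), 3))                  # 0.05
print(round(inverse_document_frequency("green car", competitors), 2))  # 1.22
print(round(tf_idf("green car", my_text, competitors), 3))             # 0.061
```

Real tools usually add stemming, stop-word removal, and other IDF variants, so their scores will not match this sketch exactly.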

So that you don’t have to calculate logarithms manually, I’m sharing services that automate the process:

  • Remycarem;
  • Seobility.

Key Density Services

To quickly determine the density of keywords in the text and analyze it for spam, use online services. Here are the most popular and convenient ones.

Serpstat Text Analytics

The service provides detailed text analysis: determination of keyword density, competitor analysis, and recommendations for improving the text.

Serpstat analyzes the occurrence of important keywords in the Title, the H1, and the text on the page. It shows whether there is overspam, how many queries are covered, which keywords competitors use, and whether a word is significant for the analyzed topic.
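To make the idea of such a check concrete, here is a rough Python sketch of a keyword density report for a single page. It is not how Serpstat or any other service works internally: the function names, the page structure with "title", "h1" and "body" keys, and the 3% overspam threshold are all hypothetical and chosen only for illustration.

```python
import re


def words_of(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_keyword(keyword, text):
    """Count occurrences of a keyword, which may consist of several words."""
    words, key = words_of(text), words_of(keyword)
    if not key:
        return 0
    return sum(
        1 for i in range(len(words) - len(key) + 1)
        if words[i:i + len(key)] == key
    )


def analyze_page(keyword, page, max_density=3.0):
    """Report where the keyword occurs and how dense it is in the body.

    max_density is a hypothetical overspam threshold in percent,
    not a value recommended by any tool.
    """
    body_words = words_of(page["body"])
    body_hits = count_keyword(keyword, page["body"])
    density = 100.0 * body_hits / len(body_words) if body_words else 0.0
    return {
        "in_title": count_keyword(keyword, page["title"]) > 0,
        "in_h1": count_keyword(keyword, page["h1"]) > 0,
        "body_density_percent": density,
        "overspam": density > max_density,
    }


# Made-up page: a 100-word body that uses the keyword twice.
page = {
    "title": "Green car buying guide",
    "h1": "How to choose a green car",
    "body": "green car " * 2 + "word " * 96,
}
report = analyze_page("green car", page)
print(report)
```

Density here is counted as keyword occurrences per 100 words of body text, the same ratio as TF expressed in percent.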
