Posted on: October 20, 2017. Updated: October 30, 2017.

Basic free keyword density checker (SEO tool)

Today we are going to build a basic keyword density tool in Python that you can run whenever you need it. The only requirement is to have Python 3 installed on your machine.

Here are some of the features of the tool:

- Reads the content from a text file

- Strips common punctuation signs and line breaks from the text before processing

- No word filter (every word is counted)

- Analyzes multiple-word queries (two and three words)

- Keeps the word order consistent when building multiple-word queries

The last feature is easy to illustrate. If a, b, c and d are words, a 2-word query produces (a b), (b c), (c d), and a 3-word query produces (a b c), (b c d). Every group of consecutive words is counted, so nothing is skipped.
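The sliding window described above can be sketched in a few lines. The helper name `windows` is hypothetical and is not part of the tool; it only illustrates how consecutive words are grouped.

```python
def windows(words, n):
    """Return every run of n consecutive words, in order, skipping none."""
    # zip stops at the shortest slice, so the last window ends on the last word
    return list(zip(*(words[i:] for i in range(n))))


words = ['a', 'b', 'c', 'd']
print(windows(words, 2))  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
print(windows(words, 3))  # [('a', 'b', 'c'), ('b', 'c', 'd')]
```

Each word takes part in every window it can belong to, which is why the counts for 2 and 3 word queries overlap.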

Before we start, note that the tutorial is split into 3 parts, but all of the code goes into a single file. The first part covers how the input text from a file is handled. The second part defines the 1, 2 and 3 word queries, and the third and final part contains the main function.

PART 1 - Process and prepare the text input

# Import sys and re modules, they are needed
import sys, re

# Create a function to open the text file and process all the text to be added to a list
def textinput():

	# Open the file from the path provided. Additionally, catch IOError (the file could not be read)
	try:
		with open(input('Path to the file: ')) as source:
			inputs = source.read()
	except IOError:

		# Restart the main function (defined in Part 3) if Y or y is entered, else exit. \n is used for a line break
		files = input('\nNo such file or directory! Press [Y] to reload or any other key to exit the program.\n')
		if files == 'Y' or files == 'y':
			print('\n' * 30)
			keydensity()

		# Either way, stop here so the code below never runs without file contents
		sys.exit()

	# Transform all text to lowercase
	transformed = inputs.lower()

	# Remove the listed signs by replacing each one with an empty string
	# A raw string is used so the pattern contains no invalid escape sequence
	clean = re.sub(r'[.,!?;:()*&^%$#@_+~"]', '', transformed)

	# Get rid of the line breaks
	reformat = ' '.join(clean.split())

	# Get the words into a list
	allwords = reformat.split(' ')

	# Return the list
	return allwords
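To see what the cleaning steps do without needing a file, here is a minimal sketch that runs the same steps on a string in memory. The function name `clean_words` and the sample text are made up for illustration.

```python
import re


def clean_words(text):
    """Lowercase the text, drop the listed signs, and split on whitespace."""
    lowered = text.lower()
    # Same character set as the tutorial's pattern, written as a raw string
    stripped = re.sub(r'[.,!?;:()*&^%$#@_+~"]', '', lowered)
    # split() with no argument collapses spaces, tabs and line breaks
    return stripped.split()


sample = 'Hello, World!\nHello again (and again).'
print(clean_words(sample))
# ['hello', 'world', 'hello', 'again', 'and', 'again']
```

Apostrophes and hyphens are not in the character set, so words such as don't or well-known stay in one piece.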

PART 2 - Define query analysis process

Now that we have a list prepared, we can proceed to define the functions for one, two and three word queries.

def oneword():
	# Get the list from the above(Part 1) function and save it to a variable
	allwords = textinput()

	# Create a set of unique words
	uniquewords = set(allwords)

	# Count total of all words
	total = len(allwords)

	# Prepare a new empty list (product = []) and loop over the unique words, counting how many times each one repeats in allwords
	# Each count is then formatted as a percentage of the total. The word and its percentage are added to the new list as a single element.
	product = []
	for word in uniquewords:
		result = allwords.count(word)
		percentage = '{:.2%}'.format(result / total) # Percentage with 2 decimal places
		product.append(percentage + ' - ' + word)

	# Loop through the list in descending order of percentage and print each item
	# The sort key turns the percentage back into a number, because as plain strings '9.00%' would sort above '10.00%'
	for item in sorted(product, key=lambda item: float(item.split('%')[0]), reverse=True):
		print(item)
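Percentage strings compare character by character when they are sorted as plain text, so '9.00% - tool' would rank above '10.00% - seo'. One way to sidestep this is to sort on the counts and format the percentage only when printing. The sketch below uses `collections.Counter` from the standard library; the function name `density` and the sample list are assumptions for illustration, not part of the tool.

```python
from collections import Counter


def density(words):
    """Return (word, count, share) tuples, most frequent first."""
    total = len(words)
    counts = Counter(words)  # one pass instead of list.count() per word
    # most_common() orders by count, so no string comparison is involved
    return [(word, count, count / total) for word, count in counts.most_common()]


words = ['seo'] * 10 + ['tool'] * 9 + ['python'] * 81
for word, count, share in density(words):
    print('{:.2%} - {}'.format(share, word))
# 81.00% - python
# 10.00% - seo
# 9.00% - tool
```

Counting everything in one pass also helps on longer texts, since calling `count()` for every unique word rescans the whole list each time.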

For multiple-word queries there is one small difference: a list of joined words has to be created first. The rest of the process is the same.

def twowords():
	allwords = textinput()

	# Create a new list of two words joined together but separated with a space.
	# The built-in zip() function pairs each word with the one after it; each pair is then added to the list as one element.
	twowordlist = []
	for x, y in zip(allwords, allwords[1:]):
		twowordlist.append(x + ' ' + y)

	# Same as before but now the new list of 2 joined words is used instead
	uniquetwowords = set(twowordlist)
	totals = len(twowordlist)

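	product = []
	for word in uniquetwowords:
		result = twowordlist.count(word)
		percentage = '{:.2%}'.format(result / totals)
		product.append(percentage + ' - ' + word)

	# Sort on the numeric percentage (not the string) and print each item
	for item in sorted(product, key=lambda item: float(item.split('%')[0]), reverse=True):
		print(item)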

def threewords():
	allwords = textinput()

	# Same principle as two words, but now three consecutive words are joined together
	threewordlist = []
	for x, y, z in zip(allwords, allwords[1:], allwords[2:]):
		threewordlist.append(x + ' ' + y + ' ' + z)

	uniquethreewords = set(threewordlist)
	total = len(threewordlist)

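	product = []
	for word in uniquethreewords:
		result = threewordlist.count(word)
		percentage = '{:.2%}'.format(result / total)
		product.append(percentage + ' - ' + word)

	# Sort on the numeric percentage (not the string) and print each item
	for item in sorted(product, key=lambda item: float(item.split('%')[0]), reverse=True):
		print(item)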
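Since the one, two and three word versions differ only in how many words are joined, the same idea could be folded into a single function that takes the query length as a parameter. This is a sketch of that alternative, not the tutorial's code; the name `ngram_density` and the sample sentence are hypothetical.

```python
from collections import Counter


def ngram_density(words, n):
    """Return (phrase, share) pairs for every n-word phrase, most frequent first."""
    # Build the phrases the same way the two and three word versions do
    phrases = [' '.join(group) for group in zip(*(words[i:] for i in range(n)))]
    total = len(phrases)
    return [(phrase, count / total) for phrase, count in Counter(phrases).most_common()]


words = 'the cat and the cat sat'.split()
for phrase, share in ngram_density(words, 2):
    print('{:.2%} - {}'.format(share, phrase))
# first line printed: 40.00% - the cat
```

If the text has fewer than n words, the phrase list is empty and the function returns an empty list without dividing by zero.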

PART 3 - Create interface and assemble the program

All that remains now is to create an interface for the program and define some conditions. First, we will add another function that offers a restart once one of the functions above has finished.

def restart():

	# Offer to restart the main function, otherwise exit
	again = input('\nDo you want to run the program again? Press [Y] to reload or any other key to exit the program.\n')
	if again == 'Y' or again == 'y':
		print('\n' * 30)
		keydensity()
	else:
		sys.exit()

And finally, the main function that ties the program together. It prints a simple interface and uses if, elif and else conditions to let us choose a 1 word, 2 word or 3 word analysis, or exit the program.

# The main function
def keydensity():

	print('\n' * 30)
	print('Keyword analyzer v0.01\n')
	print('Simple keyword density tool!\n')
	print('[1] 1 word analysis')
	print('[2] 2 words analysis')
	print('[3] 3 words analysis')
	print('[4] Exit the program\n')

	# Make a choice input and define the conditions
	choice = input('Choose: ')
	if choice == '1':

		# Run the oneword function and then the restart function
		oneword()
		restart()

	elif choice == '2':

		# Run the twowords function and then the restart function
		twowords()
		restart()

	elif choice == '3':

		# Run the threewords function and then the restart function
		threewords()
		restart()

	elif choice == '4':

		# Exit the program
		sys.exit()

	else:

		# If none of the above is true, ask for a reload
		invalid = input('\nThe program does not understand your command! Please select one of the options. Press [Y] to reload or any other key to exit the program.\n')
		if invalid == 'Y' or invalid == 'y':
			print('\n' * 30)
			keydensity()
		else:
			sys.exit()

# All that's left is to execute the main function
keydensity()
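As a side note, a menu like this can also be expressed as a dictionary that maps each choice to a function, which keeps the branching in one place. The sketch below is only an illustration of that design; `run_choice` and the stand-in actions are hypothetical, and in the tool the keys '1', '2' and '3' would map to oneword, twowords and threewords.

```python
def run_choice(choice, actions):
    """Run the action registered for a menu choice; return False if unknown."""
    action = actions.get(choice)
    if action is None:
        return False
    action()
    return True


# Stand-in actions so the sketch runs on its own
log = []
actions = {
    '1': lambda: log.append('one word'),
    '2': lambda: log.append('two words'),
    '3': lambda: log.append('three words'),
}

print(run_choice('2', actions), log)  # True ['two words']
print(run_choice('9', actions), log)  # False ['two words']
```

An unknown choice returns False, which is where the reload prompt would go.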

That's it. Now you can run it and see the results. Just make sure to save it as a .py file. Alternatively, you can download the tool here.

