How to Calculate Levenshtein Distance in Python

The Levenshtein distance between two strings is the minimum number of single-character edits required to turn one word into the other.

The word “edits” includes substitutions, insertions, and deletions.

For example, suppose we have the following two words:

  • PARTY
  • PARK

The Levenshtein distance between the two words (i.e. the number of edits we have to make to turn one word into the other) would be 2:

Levenshtein distance example: substitute K for T (PARTY becomes PARKY), then delete the Y (PARKY becomes PARK).
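To make the definition concrete, here is a minimal pure-Python sketch of the standard dynamic-programming approach to this distance. The function name `levenshtein` and the table name `dist` are illustrative choices, not part of any library, and this is not necessarily how the python-Levenshtein module computes the result internally.

```python
def levenshtein(s, t):
    """Return the Levenshtein distance between strings s and t (illustrative helper)."""
    rows, cols = len(s) + 1, len(t) + 1
    # dist[i][j] holds the number of edits needed to turn s[:i] into t[:j]
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete all i characters of s[:i]
    for j in range(cols):
        dist[0][j] = j  # insert all j characters of t[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution (or match)
            )
    return dist[-1][-1]

print(levenshtein("PARTY", "PARK"))  # 2
```

Each cell depends only on its three neighbours (above, left, and diagonal), so the table can be filled row by row in time proportional to the product of the two string lengths.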

In practice, the Levenshtein distance is used in many different applications including approximate string matching, spell-checking, and natural language processing.

This tutorial explains how to calculate the Levenshtein distance between strings in Python by using the python-Levenshtein module.

You can use the following command to install this module:

pip install python-Levenshtein

You can then load the function to calculate the Levenshtein distance:

from Levenshtein import distance as lev

The following examples show how to use this function in practice.

Example 1: Levenshtein Distance Between Two Strings

The following code shows how to calculate the Levenshtein distance between the two strings “party” and “park”:

#calculate Levenshtein distance
lev('party', 'park')

2

The Levenshtein distance turns out to be 2.

Example 2: Levenshtein Distance Between Two Arrays

The following code shows how to calculate the Levenshtein distance between corresponding pairs of strings (first with first, second with second, and so on) in two different arrays:

#define arrays
a = ['Mavs', 'Spurs', 'Lakers', 'Cavs']
b = ['Rockets', 'Pacers', 'Warriors', 'Celtics']

#calculate Levenshtein distance between two arrays
for i,k in zip(a, b):
  print(lev(i, k))

6
4
5
5

The way to interpret the output is as follows:

  • The Levenshtein distance between ‘Mavs’ and ‘Rockets’ is 6.
  • The Levenshtein distance between ‘Spurs’ and ‘Pacers’ is 4.
  • The Levenshtein distance between ‘Lakers’ and ‘Warriors’ is 5.
  • The Levenshtein distance between ‘Cavs’ and ‘Celtics’ is 5.
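The loop above uses zip, so it compares the strings position by position only. If you instead want the distance for every combination of a string from the first array with a string from the second, one option is `itertools.product`. The sketch below is a rough illustration under stated assumptions: `edit_distance` is a hypothetical stand-in defined here so that the snippet runs without the package installed, and the names `all_pairs` and `closest` are also illustrative. With the module installed, the `lev` function imported earlier could be used in its place.

```python
from itertools import product

def edit_distance(s, t):
    """Two-row Levenshtein distance (hypothetical stand-in for the library function)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (cs != ct),  # substitution (or match)
            ))
        prev = curr
    return prev[-1]

a = ['Mavs', 'Spurs', 'Lakers', 'Cavs']
b = ['Rockets', 'Pacers', 'Warriors', 'Celtics']

# distance for every (x, y) combination: 4 x 4 = 16 pairs
all_pairs = {(x, y): edit_distance(x, y) for x, y in product(a, b)}

# for each string in a, the string in b with the smallest distance
closest = {x: min(b, key=lambda y: all_pairs[(x, y)]) for x in a}

for x in a:
    print(x, closest[x], all_pairs[(x, closest[x])])
```

Picking the candidate with the smallest distance, as `closest` does, is the basic idea behind the approximate string matching and spell-checking uses mentioned earlier.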

Additional Resources

How to Calculate Hamming Distance in Python
How to Calculate Euclidean Distance in Python
How to Calculate Mahalanobis Distance in Python
