Compare sequences in Python using the difflib module

The difflib module in Python's standard library provides classes and functions for comparing sequences. It can be used to compare files and to report the differences between them in several formats, including HTML, context diffs, and unified diffs.

Its main classes and functions for comparing sequences are described below.



Class SequenceMatcher

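It is a flexible class for comparing pairs of sequences of any type, as long as the sequence elements are hashable. Its commonly used methods are discussed below.

ratio(): Returns a measure of the similarity of the two sequences as a float in the range [0, 1]. The value is computed as

2*X/Y

where X is the number of matching elements and Y is the total number of elements in both sequences.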

Example 1:




# import required module
import difflib
  
# assign parameters
par1 = ['g', 'f', 'g']
par2 = 'gfg'
  
# compare
print(difflib.SequenceMatcher(None, par1, par2).ratio())

Output:

1.0

Example 2:




# import required module
import difflib
  
# assign parameters
par1 = 'Geeks for geeks!'
par2 = 'geeks'
  
# compare
print(difflib.SequenceMatcher(None, par1, par2).ratio())

Output:

0.47619047619047616
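To see where this value comes from, the formula 2*X/Y can be evaluated by hand. The following is a minimal sketch that takes X from the sizes of the matching blocks and Y from the lengths of the two strings; the variable names are chosen only for illustration.

```python
import difflib

a = 'Geeks for geeks!'
b = 'geeks'

matcher = difflib.SequenceMatcher(None, a, b)

# X: total size of all matching blocks
# (the terminating dummy block has size 0, so it adds nothing)
matched = sum(block.size for block in matcher.get_matching_blocks())

# Y: total number of elements in both sequences
total = len(a) + len(b)

manual_ratio = 2 * matched / total

print(matched, total)    # 5 21
print(manual_ratio)      # same value as matcher.ratio()
print(matcher.ratio())
```

Only the substring 'geeks' matches, so X is 5 and Y is 16 + 5 = 21, which gives 10/21.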

Example 3:




# import required module
import difflib
  
# assign parameters
par1 = 'gfg'
par2 = 'GFG'
  
# compare
print(difflib.SequenceMatcher(None, par1, par2).ratio())

Output:

0.0
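The result is 0.0 because the comparison is case-sensitive: 'g' and 'G' are different elements. If case should be ignored, one option (an assumption about the use case, not something difflib does automatically) is to normalize both strings before comparing them:

```python
import difflib

par1 = 'gfg'
par2 = 'GFG'

# Compare the strings exactly as given (case-sensitive)
case_sensitive = difflib.SequenceMatcher(None, par1, par2).ratio()

# Normalize case first, then compare
case_insensitive = difflib.SequenceMatcher(
    None, par1.casefold(), par2.casefold()).ratio()

print(case_sensitive)    # 0.0
print(case_insensitive)  # 1.0
```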

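get_matching_blocks(): Returns a list of triples (a, b, size) describing the matching subsequences, where a and b are the start positions of the match in the first and second sequence. The last triple is always a dummy block with size 0.

Example 1: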




# import required module
import difflib
  
# assign parameters
par1 = 'Geeks for geeks!'
par2 = 'geeks'
  
# compare
matches = difflib.SequenceMatcher(
    None, par1, par2).get_matching_blocks()
  
for ele in matches:
    print(par1[ele.a:ele.a + ele.size])

Output:

geeks

Example 2:




# import required module
import difflib
  
# assign parameters
par1 = 'GFG'
par2 = 'gfg'
  
# compare
matches = difflib.SequenceMatcher(
    None, par1, par2).get_matching_blocks()
  
for ele in matches:
    print(par1[ele.a:ele.a + ele.size])

Output:

 

GFG and gfg have no matching subsequences, so get_matching_blocks() returns only the terminating dummy block of size 0, and the loop prints a single empty line.
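Each block is a named tuple with the fields a, b and size, and the final block always has size 0. A small sketch of how the dummy block can be skipped so that only real matches are collected (the name real_matches is illustrative):

```python
import difflib

par1 = 'Geeks for geeks!'
par2 = 'geeks'

blocks = difflib.SequenceMatcher(None, par1, par2).get_matching_blocks()

# Each entry is a Match(a, b, size) named tuple
for block in blocks:
    print(block.a, block.b, block.size)

# Keep only blocks that actually match something
real_matches = [par1[m.a:m.a + m.size] for m in blocks if m.size > 0]
print(real_matches)  # ['geeks']
```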

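get_close_matches(word, possibilities, n=3, cutoff=0.6): This module-level function returns a list of the best matches for word among possibilities, sorted from most to least similar. At most n matches are returned, and candidates scoring below cutoff are ignored.

Example: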




# import required module
import difflib
  
# assign parameters
string = "Geeks4geeks"
listOfStrings = ["for", "Gks", "G4g", "geeks"]
  
# find common strings
print(difflib.get_close_matches(string, listOfStrings))

Output:

['geeks']
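The optional n and cutoff arguments control how many matches are returned and how similar a candidate must be. The sketch below reuses the same inputs; the cutoff values 0.9 and 0.4 are arbitrary choices for illustration.

```python
import difflib

word = "Geeks4geeks"
candidates = ["for", "Gks", "G4g", "geeks"]

# Defaults: n=3, cutoff=0.6
default_matches = difflib.get_close_matches(word, candidates)

# A stricter cutoff rejects every candidate here
strict_matches = difflib.get_close_matches(word, candidates, cutoff=0.9)

# A looser cutoff admits weaker candidates; n caps the result length
loose_matches = difflib.get_close_matches(word, candidates, n=2, cutoff=0.4)

print(default_matches)  # ['geeks']
print(strict_matches)   # []
print(loose_matches)    # best match first, at most 2 items
```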

Class Differ

This class compares sequences of lines of text and produces human-readable differences, or deltas. Every line of a Differ delta starts with a two-letter code:

Code    Meaning
'- '    line unique to sequence 1
'+ '    line unique to sequence 2
'  '    line common to both sequences
'? '    line not present in either input sequence

The main method of this class is:

compare(a, b): Compares two sequences of lines and generates the delta, one line at a time.

Example 1:




# import required module
from difflib import Differ
  
# assign parameters
par1 = 'Geeks'
par2 = 'geeks!'
  
# compare parameters
for ele in Differ().compare(par1, par2):
    print(ele)

Output:

- G
+ g
  e
  e
  k
  s
+ !

Example 2:




# import required module
from difflib import Differ
  
# assign parameters
par1 = ['Geeks','for','geeks!']
par2 = 'geeks!'
  
# compare parameters
for ele in Differ().compare(par1, par2):
    print(ele)

Output:

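- Geeks
- for
- geeks!
+ g
+ e
+ e
+ k
+ s
+ !

Here par1 is a list, so each of its strings is treated as one line, while par2 is a plain string and is still compared character by character. No element is common to both, so every line is marked as unique.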
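The '? ' code from the table does not appear in the examples above because their lines are either identical or completely different. It shows up when two lines are similar but not equal. A minimal sketch with made-up sample lines:

```python
from difflib import Differ

# Two versions of a short text; only one character differs in the first line
old = ['The quick brown fox\n', 'jumps over the dog\n']
new = ['The quick brown fix\n', 'jumps over the dog\n']

delta = list(Differ().compare(old, new))
print(''.join(delta), end='')

# Lines starting with '? ' are hints that point at the changed characters;
# they are not part of either input
hint_lines = [line for line in delta if line.startswith('? ')]
print(len(hint_lines) > 0)
```

The hint lines are meant for a human reader and can be filtered out when only the actual lines of the two inputs are needed.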

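ndiff(a, b): This module-level function compares two sequences of lines and returns a Differ-style delta as a generator.

Example 1: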




# import required module
import difflib
  
# assign parameters
par1 = 'Geeks'
par2 = 'geeks!'
  
# compare parameters
for ele in difflib.ndiff(par1, par2):
    print(ele)

Output:

- G
+ g
  e
  e
  k
  s
+ !

Example 2:




# import required module
import difflib
  
# assign parameters
par1 = ['Geeks','for','geeks!']
par2 = 'geeks!'
  
# compare parameters
for ele in difflib.ndiff(par1, par2):
    print(ele)

Output:

- Geeks
- for
- geeks!
+ g
+ e
+ e
+ k
+ s
+ !
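As Example 2 shows, a bare string is compared character by character, so for line-oriented text both arguments are normally lists of lines. A delta produced this way can also be turned back into either input with difflib.restore(). A short sketch, with sample lines invented for illustration:

```python
import difflib

first = ['Geeks\n', 'for\n', 'geeks!\n']
second = ['geeks\n', 'for\n', 'Geeks!\n']

# Build the delta once and keep it as a list so it can be reused
delta = list(difflib.ndiff(first, second))
print(''.join(delta), end='')

# restore(delta, 1) yields the lines of the first sequence,
# restore(delta, 2) the lines of the second
recovered_first = list(difflib.restore(delta, 1))
recovered_second = list(difflib.restore(delta, 2))

print(recovered_first == first)    # True
print(recovered_second == second)  # True
```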

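context_diff(a, b): This module-level function compares two sequences of lines and returns the delta in context diff format as a generator. Changed lines are shown together with a few lines of surrounding context.

Example 1: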




# import required module
import difflib
  
# assign parameters
par1 = 'Geeks'
par2 = 'geeks!'
  
# compare parameters
for ele in difflib.context_diff(par1, par2):
    print(ele)

Output:

*** 

--- 

***************

*** 1,5 ****

! G
  e
  e
  k
  s
--- 1,6 ----

! g
  e
  e
  k
  s
+ !

Example 2:




# import required module
import difflib
  
# assign parameters
par1 = ['Geeks', 'for', 'geeks!']
par2 = 'geeks!'
  
# compare parameters
for ele in difflib.context_diff(par1, par2):
    print(ele)

Output:

*** 

--- 

***************

*** 1,3 ****

! Geeks
! for
! geeks!
--- 1,6 ----

! g
! e
! e
! k
! s
! !
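The empty '***' and '---' header lines in the outputs above are where file names would normally appear. The sketch below passes lists of lines together with the fromfile and tofile arguments, and then produces the same comparison as a unified diff with difflib.unified_diff(). The names old.txt and new.txt are hypothetical labels; no files are read.

```python
import difflib

old_lines = ['Geeks', 'for', 'geeks!']
new_lines = ['geeks', 'for', 'Geeks!']

# lineterm='' because the input lines carry no trailing newlines
context = list(difflib.context_diff(
    old_lines, new_lines,
    fromfile='old.txt', tofile='new.txt',  # hypothetical labels
    lineterm=''))
print('\n'.join(context))

print()

# Same comparison in unified diff format
unified = list(difflib.unified_diff(
    old_lines, new_lines,
    fromfile='old.txt', tofile='new.txt',  # hypothetical labels
    lineterm=''))
print('\n'.join(unified))
```

Setting lineterm='' also avoids the blank lines that print() otherwise adds after each header line.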
