How to compare two text files in python?

  • Last Updated : 24 Jan, 2021

Python provides concise tools for working with files. This article discusses one application of Python's file handling features: comparing files.

Files in use: intro.txt and intro1.txt, at the paths shown in the programs.

Method 1: Comparing complete file at once

Python's standard library includes the filecmp module, whose filecmp.cmp() function compares two files and returns True if they seem equal and False otherwise. It can operate in two modes:

  • shallow mode (the default): the os.stat() signatures of the files (file type, size and modification time) are compared first. If the signatures are identical, the files are taken to be equal without reading them; otherwise the sizes and contents are compared.
  • deep mode (shallow=False): the files are always compared by size and content, regardless of their signatures.

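Syntax:

filecmp.cmp(f1, f2, shallow=True)



Parameters:

  • f1 and f2 are the paths of the two files being compared.
  • shallow selects the comparison mode; pass shallow=False for a deep comparison.

Returns:

  • True if the files seem equal
  • False otherwise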

Program:

Python3




import filecmp
  
f1 = "C:/Users/user/Documents/intro.txt"
f2 = "C:/Users/user/Desktop/intro1.txt"
  
# shallow comparison
result = filecmp.cmp(f1, f2)
print(result)
# deep comparison
result = filecmp.cmp(f1, f2, shallow=False)
print(result)

Output:

False

False
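
As a rough illustration of how the two modes can disagree, the sketch below builds two temporary files of the same size but with different contents, then forces their modification times to match. The file names a.txt and b.txt and the timestamp value are made up for this example and are not part of the article's setup.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path_a = os.path.join(tmp, "a.txt")  # hypothetical file name
    path_b = os.path.join(tmp, "b.txt")  # hypothetical file name

    # Same length, different content.
    with open(path_a, "w") as f:
        f.write("hello world\n")
    with open(path_b, "w") as f:
        f.write("hello wurld\n")

    # Force identical access/modification times so that the os.stat()
    # signatures (file type, size, modification time) of both files match.
    same_time = (1_000_000_000, 1_000_000_000)  # arbitrary timestamp
    os.utime(path_a, same_time)
    os.utime(path_b, same_time)

    # Shallow mode trusts the matching signatures and does not read the files.
    shallow_result = filecmp.cmp(path_a, path_b)
    # Deep mode reads the files and notices the differing byte.
    deep_result = filecmp.cmp(path_a, path_b, shallow=False)

print("shallow:", shallow_result)
print("deep:", deep_result)
```

When the content of the files matters more than speed, passing shallow=False is the safer choice.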



Method 2: Comparing files line by line

The drawback of the above approach is that it does not tell us which lines differ. Although this is not always required, we often want to locate the lines where the files differ and work with them. The basic approach is to read both files line by line and compare the line at each position in one file with the line at the same position in the other.

Approach:  

  • Open the files to be compared
  • Loop through the files and compare each line of the two files.
  • If lines are identical, output IDENTICAL on the output screen.
  • Else, output the differing lines from both the files on the output screen.

Program:

Python3




# open both files; the with statement closes them automatically
with open("C:/Users/user/Documents/intro.txt", "r") as f1, \
     open("C:/Users/user/Desktop/intro1.txt", "r") as f2:

    # zip pairs line i of the first file with line i of the second
    # and stops at the end of the shorter file
    for i, (line1, line2) in enumerate(zip(f1, f2), start=1):

        # matching line i from both files
        if line1 == line2:
            # print IDENTICAL if the lines are the same
            print("Line ", i, ": IDENTICAL")
        else:
            print("Line ", i, ":")
            # else print that line from both files
            print("\tFile 1:", line1, end='')
            print("\tFile 2:", line2, end='')

Output: for each line number, the program prints either IDENTICAL or the differing line from each file.
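
The program above stops at the end of the shorter file, so extra lines in the longer file are never reported. One possible variation, sketched here under the assumption that a list of differences is more useful than printed text, uses itertools.zip_longest from the standard library. The function name diff_lines and the sample file contents are made up for this example.

```python
import os
import tempfile
from itertools import zip_longest


def diff_lines(path1, path2):
    """Return (line_number, text1, text2) for every position that differs.

    A line that exists in only one file is reported as None for the other.
    """
    def clean(line):
        # Drop the trailing newline; keep None for a missing line.
        return None if line is None else line.rstrip("\n")

    differences = []
    with open(path1, "r") as f1, open(path2, "r") as f2:
        # zip_longest keeps going until the longer file is exhausted,
        # filling the shorter side with None.
        for number, (line1, line2) in enumerate(zip_longest(f1, f2), start=1):
            text1, text2 = clean(line1), clean(line2)
            if text1 != text2:
                differences.append((number, text1, text2))
    return differences


# Small demonstration with temporary files (contents are invented).
with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, "intro.txt")
    p2 = os.path.join(tmp, "intro1.txt")
    with open(p1, "w") as f:
        f.write("one\ntwo\nthree\n")
    with open(p2, "w") as f:
        f.write("one\n2\nthree\nfour\n")
    result = diff_lines(p1, p2)

print(result)  # [(2, 'two', '2'), (4, None, 'four')]
```

Returning the differences instead of printing them lets the caller decide how to display or process them.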

Method 3: Comparing complete directory

The filecmp module also provides filecmp.cmpfiles(), which returns three lists: the names of the files that matched, the names of the files that did not match, and the names of the files that could not be compared. It is similar to the first approach, but it compares files with the same names in two different directories.

Program:

Python3




import filecmp
  
d1 = "C:/Users/user/Documents/"
d2 = "C:/Users/user/Desktop/"
files = ['intro.txt']
  
# shallow comparison
match, mismatch, errors = filecmp.cmpfiles(d1, d2, files)
print('Shallow comparison')
print("Match:", match)
print("Mismatch:", mismatch)
print("Errors:", errors)
  
# deep comparison
match, mismatch, errors = filecmp.cmpfiles(d1, d2, files, shallow=False)
print('Deep comparison')
print("Match:", match)
print("Mismatch:", mismatch)
print("Errors:", errors)

Output:



Shallow comparison

Match: []

Mismatch: ['intro.txt']

Errors: []

Deep comparison

Match: []

Mismatch: ['intro.txt']

Errors: []
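
To see all three lists populated at once, the following sketch sets up two temporary directories. The file names same.txt, changed.txt and only_in_d1.txt, the helper write_file and the file contents are invented for this example; a name that is missing from one of the directories is expected to land in the errors list.

```python
import filecmp
import os
import tempfile


def write_file(directory, name, text):
    """Hypothetical helper: create a small text file in the given directory."""
    with open(os.path.join(directory, name), "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
    # Identical in both directories.
    write_file(d1, "same.txt", "alpha\n")
    write_file(d2, "same.txt", "alpha\n")

    # Present in both directories, but with different contents.
    write_file(d1, "changed.txt", "alpha\n")
    write_file(d2, "changed.txt", "beta\n")

    # Present only in the first directory, so it cannot be compared.
    write_file(d1, "only_in_d1.txt", "alpha\n")

    names = ["same.txt", "changed.txt", "only_in_d1.txt"]
    match, mismatch, errors = filecmp.cmpfiles(d1, d2, names, shallow=False)

print("Match:", match)
print("Mismatch:", mismatch)
print("Errors:", errors)
```

Checking the errors list matters in practice, because a missing or unreadable file is reported there rather than raising an exception.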
