
How to compare two text files in python?

Last Updated : 07 Jan, 2023

Python's standard library makes file handling concise. This article covers one application of those file-handling features: comparing two files.

Files in use:

Method 1: Comparing complete file at once

Python's standard library includes the filecmp module, whose filecmp.cmp() function compares two files and returns True if they appear to be equal and False otherwise. It can operate in two modes:

  • shallow mode (the default): only the os.stat() signatures of the files (file type, size and modification time) are compared. If the signatures are identical, the files are taken to be equal without reading them; otherwise the sizes and contents are compared.
  • deep mode (shallow=False): the contents of the files are always compared.

Syntax:

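filecmp.cmp(f1, f2, shallow=True)

Parameters:

  • f1, f2: paths of the two files to compare.
  • shallow: True (the default) for a shallow comparison, False for a deep comparison.

Returns:

  • True if the files seem equal
  • False otherwise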

Program:

Python3




import filecmp
 
f1 = "C:/Users/user/Documents/intro.txt"
f2 = "C:/Users/user/Desktop/intro1.txt"
 
# shallow comparison
result = filecmp.cmp(f1, f2)
print(result)
# deep comparison
result = filecmp.cmp(f1, f2, shallow=False)
print(result)


Output:

False

False
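To make the difference between the two modes concrete, here is a minimal sketch that builds its own input instead of using the files above. The temporary directory, the names a.txt and b.txt, and their contents are assumptions made for illustration only. The two files have the same size and are given the same modification time, so a shallow comparison should treat them as equal even though their contents differ.

```python
import filecmp
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, used only for this demonstration.
    path_a = os.path.join(tmp, "a.txt")
    path_b = os.path.join(tmp, "b.txt")

    # Same length, different content.
    with open(path_a, "w") as f:
        f.write("hello world\n")
    with open(path_b, "w") as f:
        f.write("hello wordl\n")

    # Copy a.txt's timestamps onto b.txt so that file type, size and
    # modification time (the os.stat() signature) all match.
    st = os.stat(path_a)
    os.utime(path_b, ns=(st.st_atime_ns, st.st_mtime_ns))

    # Shallow: signatures match, so the contents are never read.
    shallow_result = filecmp.cmp(path_a, path_b)
    # Deep: the contents are read and compared.
    deep_result = filecmp.cmp(path_a, path_b, shallow=False)

print("shallow:", shallow_result)
print("deep:", deep_result)
```

On a typical filesystem the shallow call reports the files as equal and the deep call reports them as different, which is why shallow=False is the safer choice when the contents are what matters.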

Method 2: Comparing files line by line

The drawback of the approach above is that it does not report the lines where the files differ. Although this is not always required, we often want to see exactly which lines differ and act on them. The basic approach is to read the lines of each file into a separate list, one list per file, and then compare the two lists position by position.

Approach:  

  • Open the files to be compared.
  • Read the lines of each file and loop through them in parallel, comparing each pair of lines.
  • If the lines are identical, print IDENTICAL.
  • Otherwise, print the differing lines from both files.
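The steps above can also be sketched as a small reusable function. This is an illustrative variant, not the article's program: the name differing_lines and the sample lists are hypothetical, it works on in-memory lists of lines (such as those returned by readlines()), and it uses itertools.zip_longest so that extra lines in the longer input are reported as well.

```python
from itertools import zip_longest


def differing_lines(lines1, lines2):
    """Return (line_number, line1, line2) for every position where the inputs differ.

    If one input is shorter, its missing lines are reported as None.
    """
    differences = []
    pairs = zip_longest(lines1, lines2)  # pads the shorter input with None
    for number, (line1, line2) in enumerate(pairs, start=1):
        if line1 != line2:
            differences.append((number, line1, line2))
    return differences


# Hypothetical sample data standing in for f1.readlines() and f2.readlines().
first = ["alpha\n", "beta\n", "gamma\n"]
second = ["alpha\n", "BETA\n"]

result = differing_lines(first, second)
for number, line1, line2 in result:
    print(f"Line {number}: File 1: {line1!r} File 2: {line2!r}")
```

Returning the differences as a list, rather than printing inside the loop, leaves it to the caller to decide whether to display them, count them or write them to a report.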

Program:

Python3




# reading files; the with statement closes them automatically
with open("C:/Users/user/Documents/intro.txt", "r") as f1, \
        open("C:/Users/user/Desktop/intro1.txt", "r") as f2:
    f1_data = f1.readlines()
    f2_data = f2.readlines()

# compare the files pairwise, line by line;
# zip() stops at the end of the shorter file
for i, (line1, line2) in enumerate(zip(f1_data, f2_data), start=1):
    if line1 == line2:
        # print IDENTICAL if the lines match
        print("Line ", i, ": IDENTICAL")
    else:
        # else print that line from both files
        print("Line ", i, ":")
        print("\tFile 1:", line1, end='')
        print("\tFile 2:", line2, end='')


Output: 

Method 3: Comparing complete directory

The filecmp module also provides filecmp.cmpfiles(), which returns three lists: the files that match, the files that do not match, and the files that could not be compared. It works like the first approach, but compares files of the same name in two different directories. Its third argument is the list of file names to compare.

Program:

Python3




import filecmp
 
d1 = "C:/Users/user/Documents/"
d2 = "C:/Users/user/Desktop/"
files = ['intro.txt']
 
# shallow comparison
match, mismatch, errors = filecmp.cmpfiles(d1, d2, files)
print('Shallow comparison')
print("Match:", match)
print("Mismatch:", mismatch)
print("Errors:", errors)
 
# deep comparison
match, mismatch, errors = filecmp.cmpfiles(d1, d2, files, shallow=False)
print('Deep comparison')
print("Match:", match)
print("Mismatch:", mismatch)
print("Errors:", errors)


Output:

Shallow comparison

Match: []

Mismatch: ['intro.txt']

Errors: []

Deep comparison

Match: []

Mismatch: ['intro.txt']

Errors: []
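To show all three returned lists populated, the following self-contained sketch creates two temporary directories instead of using the paths above. The helper write_text and the file names same.txt, diff.txt and only_in_d1.txt are assumptions made for illustration. A file that exists in only one directory cannot be compared, so it ends up in the errors list.

```python
import filecmp
import os
import tempfile


def write_text(directory, name, text):
    """Hypothetical helper: create a small text file inside a directory."""
    with open(os.path.join(directory, name), "w") as f:
        f.write(text)


with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
    write_text(d1, "same.txt", "hello\n")
    write_text(d2, "same.txt", "hello\n")        # identical content
    write_text(d1, "diff.txt", "hello\n")
    write_text(d2, "diff.txt", "world\n")        # different content
    write_text(d1, "only_in_d1.txt", "hello\n")  # no counterpart in d2

    names = ["same.txt", "diff.txt", "only_in_d1.txt"]
    # shallow=False forces a comparison of the file contents.
    match, mismatch, errors = filecmp.cmpfiles(d1, d2, names, shallow=False)

print("Match:", match)
print("Mismatch:", mismatch)
print("Errors:", errors)
```

Checking the errors list is worthwhile in practice, because a missing or unreadable file is reported there rather than raising an exception.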


