Given a number of input files in a source directory, write a Python program that reads the data from all of them and writes it to a single master file.
The source directory contains n files, all with the same structure. The objective is to read the files one by one and append their contents to a single master file that has the same structure as the source files.
Taking three input files as an example, named emp_1.txt, emp_2.txt, and emp_3.txt, the output will contain the data from all three.
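To make the expected input and output concrete, the sketch below builds three small files in a temporary directory and prints what the master file would look like. The employee rows are made-up sample values, and `make_sample_files` and `expected_master` are hypothetical helpers; only the file names and the `empno, ename, sal` header are taken from this article.

```python
import tempfile
from pathlib import Path

HEADER = "empno, ename, sal"

# Hypothetical sample rows; real files may hold any number of records
SAMPLE_ROWS = {
    "emp_1.txt": ["101, Asha, 5000", "102, Ben, 5500"],
    "emp_2.txt": ["103, Chen, 6000"],
    "emp_3.txt": ["104, Dina, 6500", "105, Emil, 7000"],
}


def make_sample_files(src_dir):
    """Write each sample file as a header line followed by its rows."""
    for name, rows in SAMPLE_ROWS.items():
        (Path(src_dir) / name).write_text("\n".join([HEADER] + rows) + "\n")


def expected_master():
    """Return the master content: one header, then the rows of every file in name order."""
    rows = [row for name in sorted(SAMPLE_ROWS) for row in SAMPLE_ROWS[name]]
    return "\n".join([HEADER] + rows) + "\n"


with tempfile.TemporaryDirectory() as src:
    make_sample_files(src)
    # List the input files, then show the combined content
    print(sorted(p.name for p in Path(src).iterdir()))
    print(expected_master())
```

Every input file repeats the header, while the master file carries it only once, followed by the rows of all inputs.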
Method #1: Using os module
import os

# source directory and path of the master file
file_dir = 'D:\\python\\data_files\\data_files'
out_file = 'D:\\python\\data_files\\target_file\\master.txt'

# list the files in directory
lis = os.listdir(file_dir)
print(lis)
tgt = os.listdir('D:\\python\\data_files\\target_file')
print('target file :', tgt)
ct = 0

# check if the master file exists
# if yes, delete the file
# otherwise data will be appended to the existing file
if os.path.exists(out_file):
    os.remove(out_file)

# create the master file and write the header to it
with open(out_file, 'a+') as head:
    head.write('empno, ename, sal')

# below loop writes data to the output file
for line1 in lis:
    f_dir = file_dir + '\\' + line1
    # open the input file in read mode and the output in append mode
    with open(f_dir, 'r') as in_file, open(out_file, 'a+') as w:
        # skip the header line, then read the remaining lines
        in_file.readline()
        d = in_file.readlines()
        # start this file's data on a new line
        w.write('\n')
        for line2 in d:
            print(line2)
            w.write(line2)
    # count the files processed
    ct = ct + 1
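The same idea can be wrapped in a reusable function that does not depend on hard-coded Windows paths. The following is a minimal sketch, not the article's original code: `merge_files` and its `pattern` parameter are hypothetical names, and it assumes every input file starts with the same header line. It uses `pathlib` instead of `os`, and opens the master file in `'w'` mode so that an existing file is overwritten rather than deleted first.

```python
import tempfile
from pathlib import Path


def merge_files(src_dir, master_path, pattern="emp_*.txt"):
    """Write the data rows of every matching file to a fresh master file.

    Returns the number of source files merged.
    """
    # Collect the inputs first, in name order
    sources = sorted(Path(src_dir).glob(pattern))
    count = 0
    # Mode 'w' truncates an existing master, so no explicit delete is needed
    with open(master_path, "w") as master:
        for src in sources:
            lines = src.read_text().splitlines()
            if not lines:
                continue  # skip empty files
            if count == 0:
                # Header is taken from the first non-empty file
                master.write(lines[0] + "\n")
            for row in lines[1:]:
                master.write(row + "\n")
            count += 1
    return count


# Demonstration with two throwaway files (sample values are made up)
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    (src / "emp_1.txt").write_text("empno, ename, sal\n101, Asha, 5000\n")
    (src / "emp_2.txt").write_text("empno, ename, sal\n102, Ben, 5500")
    master = src / "master.txt"
    print(merge_files(src, master))
    print(master.read_text())
```

Splitting each file into lines and writing every row with its own newline keeps the output free of blank or joined lines, whether or not an input file ends with a trailing newline.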
Method #2: Using pandas
import pandas as pd
# pd.read_csv creates dataframes
# raw strings keep the backslashes in the Windows paths literal
df1 = pd.read_csv(r'D:\python\data_files\data_files\emp_1.txt')
df2 = pd.read_csv(r'D:\python\data_files\data_files\emp_2.txt')
df3 = pd.read_csv(r'D:\python\data_files\data_files\emp_3.txt')
frames = [df1, df2, df3]
# concat function concatenates the frames
result = pd.concat(frames)
# to_csv function writes output to file
result.to_csv(r'D:\python\data_files\target_file\master.txt',
              encoding='utf-8', index=False)
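The pandas version above names each of the three files explicitly. For a directory with n files, the file list can be discovered instead. The sketch below is one possible way to do that, assuming the inputs match the pattern `emp_*.txt`; `concat_files` and its parameters are hypothetical names, not part of pandas.

```python
import glob
import os

import pandas as pd


def concat_files(src_dir, master_path, pattern="emp_*.txt"):
    """Read every matching file into a DataFrame and write one combined file."""
    paths = sorted(glob.glob(os.path.join(src_dir, pattern)))
    if not paths:
        raise FileNotFoundError(f"no files matching {pattern!r} in {src_dir!r}")
    # One DataFrame per input file; the first line of each file is its header
    frames = [pd.read_csv(p) for p in paths]
    # ignore_index=True renumbers the rows 0..n-1 across all files
    result = pd.concat(frames, ignore_index=True)
    result.to_csv(master_path, encoding="utf-8", index=False)
    return result
```

It could be called as `concat_files(r'D:\python\data_files\data_files', r'D:\python\data_files\target_file\master.txt')`. Because `to_csv` overwrites the target by default, no separate clean-up of an old master file is needed.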