Open In App

Python | Create Archives and Find Files by Name

Last Updated : 29 Dec, 2020
Improve
Improve
Like Article
Like
Save
Share
Report

In this article, we will learn how to create or unpack archives in common formats (e.g., .tar, .tgz, or .zip) using shutil module.

The shutil module has two functions — make_archive() and unpack_archive() — that can exactly be the solution.

Code #1 :




import shutil
shutil.unpack_archive('Python-3.3.0.tgz')
shutil.make_archive('py33', 'zip', 'Python-3.3.0')


Output :

'/Users/Dell/Downloads/py33.zip'

The second argument to make_archive() is the desired output format. To get a list of supported archive formats, use get_archive_formats().
 
Code #2 :




shutil.get_archive_formats()


Output :

[('bztar', "bzip2'ed tar-file"), 
 ('gztar', "gzip'ed tar-file"), 
 ('tar', 'uncompressed tar file'), 
 ('zip', 'ZIP file')]

Python has other library modules for dealing with the low-level details of various archive formats (e.g., tarfile, zipfile, gzip, bz2, etc.). However, to make or extract an archive, there’s really no need to go so low level.
One can just use these high-level functions in shutil instead. The functions have a variety of additional options for logging, dry-runs, file permissions, and so forth.
 
Let’s write a script that involves finding files, like a file renaming script or a log archiver utility, but rather not have to call shell utilities from within the Python script, or to provide specialized behavior not easily available by “shelling out.”

To search for files, use the os.walk() function, supplying it with the top-level directory.

Code #3 : Function that finds a specific filename and prints out the full path of all matches.




import os
  
def findfile(start, name):
    for relpath, dirs, files in os.walk(start):
  
        if name in files:
            full_path = os.path.join(start, relpath, name)
            print(os.path.normpath(os.path.abspath(full_path)))
  
if __name__ == '__main__':
    findfile(sys.argv[1], sys.argv[2])


Save this script as abc.py and run it from the command line, feeding in the starting point and the name as positional arguments as –




bash % ./abc.py .myfile.txt


 
How it works ?

  • The os.walk() method traverses the directory hierarchy for us, and for each directory it enters, it returns a 3-tuple, containing the relative path to the directory it’s inspecting, a list containing all of the directory names in that directory, and a list of filenames in that directory.
  • For each tuple, simply check if the target filename is in the files list. If it is, os.path.join() is used to put together a path.
  • To avoid the possibility of weird looking paths like ././foo//bar, two additional functions are used to fix the result.
  • The first is os.path.abspath(), which takes a path that might be relative and forms the absolute path.
  • The second is os.path.normpath(), which will normalize the path, thereby resolving issues with double slashes, multiple references to the current directory, and so on.

Although the code is pretty simple compared to the features of the find utility found on UNIX platforms, it has the benefit of being cross-platform. Furthermore, a lot of additional functionality can be added in a portable manner without much more work.

Code #4 : Function that prints out all of the files that have a recent modification time




import os
import time
  
def modified_within(top, seconds):
    now = time.time()
  
    for path, dirs, files in os.walk(top):
        for name in files:
            fullpath = os.path.join(path, name)
  
            if os.path.exists(fullpath):
                mtime = os.path.getmtime(fullpath)
                if mtime > (now - seconds):
                    print(fullpath)
                      
if __name__ == '__main__':
    import sys
  
    if len(sys.argv) != 3:
        print('Usage: {} dir seconds'.format(sys.argv[0]))
        raise SystemExit(1)
          
    modified_within(sys.argv[1], float(sys.argv[2]))


It wouldn’t take long for you to build far more complex operations on top of this little function using various features of the os, os.path, glob, and similar modules.



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads