C Program to Read and Print All Files From a Zip File

Last Updated : 05 Sep, 2022

To understand how to write a C program for reading and printing zip files, it’s important to know what exactly a zip file is.

  • At its core, a zip file contains one or more files, each usually compressed with a specific compression algorithm (most commonly deflate).
  • In addition to the compressed data, the zip file contains metadata and header information about every file inside it: file names, modification dates, signatures, compression methods, etc.
  • A zip file is also called a zip archive because it organizes multiple files into a single structured container.
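To make the header information concrete, here is a minimal sketch that decodes a few fields of a zip local file header from a byte buffer, using only the C standard library. The struct `local_header` and the function `parse_local_header` are hypothetical names invented for this illustration; they are not part of libzip or zlib. The sketch assumes the classic 30-byte fixed header layout and ignores Zip64 archives and entries that defer their sizes to a trailing data descriptor.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical struct for this sketch (not a libzip type).
struct local_header {
    uint16_t method;    // compression method: 0 = stored, 8 = deflate
    uint32_t comp_size; // compressed size in bytes
    uint32_t size;      // uncompressed size in bytes
    uint16_t name_len;  // length of the file name after the header
};

// Zip stores integers in little-endian order. Reading them
// byte by byte avoids alignment and endianness problems.
static uint16_t read_le16(const unsigned char* p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
           | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Returns 1 and fills *out if buf starts with a local file
// header (signature bytes 'P' 'K' 0x03 0x04), else returns 0.
int parse_local_header(const unsigned char* buf, size_t len,
                       struct local_header* out)
{
    if (len < 30)
        return 0; // the fixed part of the header is 30 bytes
    if (read_le32(buf) != 0x04034b50u)
        return 0; // wrong signature
    out->method = read_le16(buf + 8);
    out->comp_size = read_le32(buf + 18);
    out->size = read_le32(buf + 22);
    out->name_len = read_le16(buf + 26);
    return 1;
}
```

In practice a library such as libzip reads these fields for you (and prefers the central directory at the end of the archive), which is why the programs in this article never parse headers by hand.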

With this in mind, we need only two things to read and print the contents of a zip file programmatically:

  1. A decompression method that decompresses the data so that we can read it.
  2. A library for interacting with zip files, which makes the job much easier.

We will be using two libraries that also provide decompression functions to write this program.

Library 1: libzip

Library 2: zlib

Both of these libraries are prerequisites for running the code. Libzip is a higher-level library that already uses parts of zlib internally. Zlib is lower level and therefore more technical to use. There are two code examples: one that uses only libzip, and one that uses libzip to access the zip file while calling zlib directly for the decompression.

  • For help installing libzip: check here
  • For help installing zlib: check here

If these libraries are installed, you can compile the programs with gcc by passing the linker flag -lzip for Method 1 and the flags -lzip -lz for Method 2.

Method 1: Reading and Printing All Files from a Zip File using libzip

C




// C program to read and print
// all files in a zip file
// uses library libzip
#include <stdio.h>
#include <stdlib.h>
#include <zip.h>

// this is run from the command line with the zip file
// passed in; example usage: ./program zipfile.zip
int main(int argc, char* argv[])
{
    // the program expects exactly one argument
    // (argc counts the program name as well)
    if (argc != 2) {
        fprintf(stderr, "usage: %s zipfile.zip\n", argv[0]);
        return -1;
    }

    // stores the error code set by zip_open
    int errorp = 0;

    // open the archive read-only; zip_open returns NULL
    // if the file is missing or is not a zip archive
    zip_t* arch = zip_open(argv[1], ZIP_RDONLY, &errorp);
    if (arch == NULL) {
        fprintf(stderr, "cannot open %s (libzip error %d)\n",
                argv[1], errorp);
        return -2;
    }

    // the zip_stat structure contains information
    // such as file name, size, comp size
    struct zip_stat finfo;

    // initializes the structure as the
    // libzip documentation requires
    zip_stat_init(&finfo);

    // count = index of the file inside the archive,
    // 0 = first file
    zip_uint64_t count = 0;

    // loop over every file in the archive and print its
    // name and contents; zip_stat_index returns non-zero
    // once count is past the last file, ending the loop
    while (zip_stat_index(arch, count, 0, &finfo) == 0) {

        // allocate room for the entire file contents
        // plus a terminating '\0' (calloc zero-fills)
        char* txt = calloc(finfo.size + 1, sizeof(char));

        // opens the file at index count for reading
        zip_file_t* fd = zip_fopen_index(arch, count, 0);
        if (txt == NULL || fd == NULL) {
            if (fd != NULL)
                zip_fclose(fd);
            free(txt);
            zip_close(arch);
            return -3;
        }

        // reads up to finfo.size decompressed bytes
        // from fd into the txt buffer
        if (zip_fread(fd, txt, finfo.size) < 0)
            fprintf(stderr, "read error in %s\n", finfo.name);

        // prints the file name
        printf("file #%llu: %s\n\n",
               (unsigned long long)(count + 1), finfo.name);

        // prints the file contents; %s stops at the first
        // '\0', so binary files may appear truncated
        printf("%s\n\n", txt);

        // close the entry and free the buffer; both are
        // set up again on the next iteration of the loop
        zip_fclose(fd);
        free(txt);

        // move on to the next file in the archive
        count++;
    }

    // release the archive
    zip_close(arch);
    return 0;
}


 

Example output for a zip file that contained this C source file:

method 1 output
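Method 1 prints each entry with the `%s` format, which stops at the first zero byte, so binary entries such as images would be cut short or print garbage. As a rough sketch of one way around this, the helpers below (hypothetical names `looks_like_text` and `print_entry`, not part of libzip) check whether a buffer holds only printable ASCII and common whitespace before writing it. This is a deliberately simple heuristic under the default "C" locale; it would also classify UTF-8 text with non-ASCII characters as binary.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: returns 1 if every byte is printable
// ASCII or common whitespace, 0 otherwise.
int looks_like_text(const unsigned char* buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (!isprint(c) && c != '\n' && c != '\r' && c != '\t')
            return 0;
    }
    return 1;
}

// Hypothetical helper: writes exactly len bytes to out when
// the buffer looks like text, otherwise writes a one-line
// summary. Returns the number of content bytes written.
size_t print_entry(FILE* out, const unsigned char* buf,
                   size_t len)
{
    if (!looks_like_text(buf, len)) {
        fprintf(out, "(binary data, %zu bytes)\n", len);
        return 0;
    }
    // fwrite takes an explicit length, so it does not rely
    // on a terminating '\0' the way printf("%s") does
    return fwrite(buf, 1, len, out);
}
```

Inside the loop of Method 1, a call such as `print_entry(stdout, (const unsigned char*)txt, finfo.size)` could stand in for the second `printf`.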

Method 2: Using libzip with zlib for Decompression

Zlib cannot read the zip archive format on its own, which is why we still use libzip to open the zip file and locate each entry. In this method libzip only hands back the compressed bytes, and zlib is called directly to decompress them.
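One reason zlib's convenience functions do not work on zip entries is that they expect a zlib-wrapped stream, while a zip entry holds deflate data with no zlib or gzip header in front of it. The sketch below shows how the two wrapper formats can be recognized from their first two bytes; the enum `stream_kind` and function `classify_stream` are hypothetical names for this illustration. It is only a heuristic: raw deflate data could by chance begin with bytes that pass the zlib check.

```c
#include <stddef.h>

// Hypothetical result type for this sketch.
enum stream_kind { STREAM_UNKNOWN, STREAM_ZLIB, STREAM_GZIP };

// Inspects the first two bytes of a compressed buffer.
enum stream_kind classify_stream(const unsigned char* buf,
                                 size_t len)
{
    if (len < 2)
        return STREAM_UNKNOWN;

    // gzip streams start with the magic bytes 0x1f 0x8b
    if (buf[0] == 0x1f && buf[1] == 0x8b)
        return STREAM_GZIP;

    // zlib streams start with a two-byte header: the low
    // nibble of the first byte is 8 (deflate), the high
    // nibble is at most 7, and the two bytes read as a
    // big-endian number are a multiple of 31
    if ((buf[0] & 0x0f) == 8 && (buf[0] >> 4) <= 7
        && ((buf[0] << 8) | buf[1]) % 31 == 0)
        return STREAM_ZLIB;

    // anything else, including headerless (raw) deflate
    return STREAM_UNKNOWN;
}
```

Compressed bytes read from a zip entry would normally fall into the last case, which is why the program in this method has to tell zlib explicitly to expect headerless data.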

C




// C program to read and print
// all files in a zip file
// uses libraries libzip and zlib
#include <stdio.h>
#include <stdlib.h>
#include <zip.h>
#include <zlib.h>

// Zip entries hold raw deflate data with no zlib or gzip
// header, so zlib's uncompress() rejects them. This is a
// copy of the zlib library function uncompress2() with one
// line changed: err = inflateInit(&stream); becomes
// err = inflateInit2(&stream, -MAX_WBITS);
// The negative window size tells zlib that it is given a
// pure stream of compressed data. The copy has its own name
// so that it cannot clash with the uncompress2() that zlib
// already exports.
static int uncompress_raw(Bytef* dest, uLongf* destLen,
                          const Bytef* source,
                          uLong* sourceLen)
{
    z_stream stream;
    int err;
    const uInt max = (uInt)-1;
    uLong len, left;

    // for detection of incomplete stream when
    // *destLen == 0
    Byte buf[1];

    len = *sourceLen;
    if (*destLen) {
        left = *destLen;
        *destLen = 0;
    }
    else {
        left = 1;
        dest = buf;
    }

    stream.next_in = (z_const Bytef*)source;
    stream.avail_in = 0;
    stream.zalloc = (alloc_func)0;
    stream.zfree = (free_func)0;
    stream.opaque = (voidpf)0;

    err = inflateInit2(&stream,
                       -MAX_WBITS); // THIS LINE IS CHANGED
    if (err != Z_OK)
        return err;

    stream.next_out = dest;
    stream.avail_out = 0;

    // feed input and output space in chunks that fit in a
    // uInt until inflate reports the end or an error
    do {
        if (stream.avail_out == 0) {
            stream.avail_out
                = left > (uLong)max ? max : (uInt)left;
            left -= stream.avail_out;
        }
        if (stream.avail_in == 0) {
            stream.avail_in
                = len > (uLong)max ? max : (uInt)len;
            len -= stream.avail_in;
        }
        err = inflate(&stream, Z_NO_FLUSH);
    } while (err == Z_OK);

    *sourceLen -= len + stream.avail_in;
    if (dest != buf)
        *destLen = stream.total_out;
    else if (stream.total_out && err == Z_BUF_ERROR)
        left = 1;

    inflateEnd(&stream);
    return err == Z_STREAM_END
               ? Z_OK
               : err == Z_NEED_DICT
                     ? Z_DATA_ERROR
                     : (err == Z_BUF_ERROR
                        && left + stream.avail_out)
                           ? Z_DATA_ERROR
                           : err;
}

int main(int argc, char* argv[])
{
    // Command line program that takes exactly one
    // argument; example usage: ./program zipfile.zip will
    // print the name and contents of every file inside the
    // zip archive
    if (argc != 2)
        return -1;

    int errorp = 0; // error code variable
    zip_t* arch = zip_open(argv[1], ZIP_RDONLY, &errorp);
    if (arch == NULL)
        return -2;

    // file information, filled in by zip_stat_index
    struct zip_stat finfo;
    zip_stat_init(&finfo);

    zip_uint64_t index = 0;
    while (zip_stat_index(arch, index, 0, &finfo) == 0) {

        printf("FILE #%llu: %s\n",
               (unsigned long long)(index + 1), finfo.name);

        // uncompress_raw only understands deflate, so skip
        // entries stored with any other method
        if (finfo.comp_method != ZIP_CM_DEFLATE) {
            printf("\n(skipped: not deflate-compressed)\n");
            index++;
            continue;
        }

        // ZIP_FL_COMPRESSED makes libzip hand back the
        // compressed bytes of the entry at this index
        zip_file_t* fd = zip_fopen_index(arch, index,
                                         ZIP_FL_COMPRESSED);
        Bytef* txt = calloc(finfo.comp_size + 1, 1);
        Bytef* outp = calloc(finfo.size + 1, 1);
        if (fd == NULL || txt == NULL || outp == NULL) {
            if (fd != NULL)
                zip_fclose(fd);
            free(txt);
            free(outp);
            zip_close(arch);
            return -3;
        }

        // read compressed data to buffer txt
        zip_fread(fd, txt, finfo.comp_size);
        zip_fclose(fd);

        // uncompress from the txt buffer to the outp buffer
        // using the function defined at the top
        uLong inLen = (uLong)finfo.comp_size;
        uLongf outLen = (uLongf)finfo.size;
        if (uncompress_raw(outp, &outLen, txt, &inLen) == Z_OK)
            printf("\n%s\n", (char*)outp);
        else
            printf("\n(decompression failed)\n");

        // free memory every iteration
        free(txt);
        free(outp);
        index++;
    }

    zip_close(arch);
    return 0;
}


Here is the output of the above code for a zip file that contains 2 files:

Method 2 output
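Because Method 2 decompresses the data itself, nothing in the program confirms that the output matches what was originally archived. The zip format records a CRC-32 checksum for every entry, and libzip exposes it as the `crc` field of `struct zip_stat`. The following is a minimal, unoptimized sketch of the CRC-32 calculation in plain C; the function name `crc32_bytes` is hypothetical and chosen so that it does not collide with zlib's own `crc32()` function, which could be used instead.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: bitwise CRC-32 as used by zip
// (reflected polynomial 0xEDB88320, initial value and
// final XOR of 0xFFFFFFFF).
uint32_t crc32_bytes(const unsigned char* buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        // process the byte one bit at a time
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;
            else
                crc >>= 1;
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

After a successful decompression, comparing the checksum of the output buffer (over the decompressed length) with the `crc` value reported for that entry would catch corrupted or truncated output before it is printed.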


