
Reading binary files in Python

Last Updated : 25 Aug, 2023

Reading binary files is an important skill for working with non-textual data such as images, audio, and video. By opening a file in binary mode and calling its read() method, you can read the file's raw bytes. Whether you are dealing with multimedia files, compressed data, or custom binary formats, Python's built-in support for binary data lets you work with them directly. In this article, you will learn what binary files are, how to read their data into a byte array, how to read binary data in chunks, and more.

What are Binary files?

Generally, binary means two. In computer science, a binary file stores its data as a sequence of binary digits, 0s and 1s. For example, the number 9 is represented in binary as '1001'. Every file on a computer is ultimately stored this way, as a machine-readable sequence of bits; what sets a binary file apart from a text file is that its bytes are not meant to be interpreted as characters. The structure of a binary file depends on its type: image files are structured differently from audio files, and how hard a file is to decode depends on the complexity of its format. This article focuses on reading binary files.
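As a quick illustration of the paragraph above, the following sketch shows the binary digits behind the number 9 and behind a short byte string. The variable names are chosen for illustration only.

```python
# The number 9 written in binary digits
nine_in_binary = format(9, "b")
print(nine_in_binary)  # 1001

# A bytes object is a sequence of integers in the range 0-255
raw = bytes([71, 102, 71])
print(raw)        # b'GfG'
print(list(raw))  # [71, 102, 71]

# Each byte can be shown as eight binary digits
bits = [format(byte, "08b") for byte in raw]
print(bits)  # ['01000111', '01100110', '01000111']
```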

Python Read A Binary File

To read a binary file, follow these steps:

Step 1: Open the binary file in binary mode

To read a binary file in Python, we first need to open it in binary read mode ("rb") using the open() function.

Step 2: Create a binary file

To create a binary file in Python, open the file in binary write mode ("wb"). For more details, refer to the article below.

Python – Write Bytes to File

Step 3: Read the binary data

After opening the file in binary mode, we can use the read() method to read its content into a variable. The read() method returns a bytes object, a sequence of bytes that represents the binary data.

Step 4: Process the binary data

Once we have read the binary data into a variable, we can process it according to our specific requirements. Processing the binary data could involve various tasks such as decoding binary data, analyzing the content, or writing the data to another binary file.

Step 5: Close the file

After reading and processing the binary data, it is essential to close the file using the “close()” method to release system resources and avoid potential issues with file access.

Python3




# Opening the binary file in binary mode as rb(read binary)
f = open("files.zip", mode="rb")
 
# Reading file data with read() method
data = f.read()
 
# Knowing the Type of our data
print(type(data))
 
# Printing our byte sequenced data
print(data)
 
# Closing the opened file
f.close()


Output:

The output first shows the type of the data, bytes, and then the raw byte sequence itself, since bytes are the fundamental unit of binary data.

<class 'bytes'>
b'PK\x03\x04\x14\x00\x00\x00\x08\x00U\xbd\xebV\xc2=j\x87\x1e\x00\x00\x00!\x00\x00\x00\n\x00\x00\x00TODO11.txt\xe3\xe5JN,\xceH-/\xe6\xe5\x82\xc0\xcc\xbc\x92\xd4\x9c\x9c\xcc\x82\xc4\xc4\x12^.w7w\x00PK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00U\xbd\xebV\xc2=j\x87\x1e\x00\x00\x00!\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x01\x00 \x00\x00\x00\x00\x00\x00\x00TODO11.txtPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x008\x00\x00\x00F\x00\x00\x00\x00\x00'
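As an alternative to calling close() by hand, the five steps can also be written with a with statement, which closes the file automatically. The sketch below is one possible arrangement, not the only one. It builds its own small ZIP archive in a temporary folder (the folder and the archive contents are assumptions made for the example) and then, as a simple processing step, checks the four-byte signature that a ZIP file starts with.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as folder:
    # Build a small ZIP archive so the example does not need an existing file
    zip_path = os.path.join(folder, "files.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("TODO11.txt", "example content")

    # The with statement closes the file automatically, even if an error occurs
    with open(zip_path, "rb") as f:
        data = f.read()

print(type(data))  # <class 'bytes'>
print(f.closed)    # True

# Processing step: inspect the signature at the start of the archive
signature = data[:4]
print(signature == b"PK\x03\x04")  # True
```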

Reading binary data into a byte array

The following code reads binary data from a file a few bytes at a time and prints each piece using a while loop. Note that read() returns immutable bytes objects; wrap the result in bytearray() if you need a mutable byte array. Let's walk through the code step by step:

Open the Binary File

This line opens the binary file named "string.bin" in binary read mode ("rb") and stores the file object in the variable file.

Python3




# Open the binary file
file = open("string.bin", "rb")


Reading the first three bytes

This line reads the first three bytes from the binary file and stores them in the variable data. The read(3) call reads up to three bytes from the file and advances the file pointer accordingly.

Python3




data = file.read(3)
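To watch the file pointer move, its position can be checked with tell() before and after a read. The sketch below is separate from the walkthrough: it uses an in-memory io.BytesIO object as a stand-in for a real file, and the name stream and the sample bytes are assumptions made for the example.

```python
import io

# io.BytesIO stands in for a file opened with open(..., "rb")
stream = io.BytesIO(b"GeeksForGeeks")

start = stream.tell()
data = stream.read(3)
position_after_read = stream.tell()

print(start)                # 0, the pointer starts at the beginning
print(data)                 # b'Gee'
print(position_after_read)  # 3, the pointer moved past the bytes just read

# seek(0) moves the pointer back to the start, so the same bytes are read again
stream.seek(0)
first_again = stream.read(3)
print(first_again)          # b'Gee'
stream.close()
```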


Print data using a “while” Loop

The loop will keep reading and printing three bytes at a time until the end of the file is reached. Once the end of the file is reached, the read() method will return an empty bytes object, which evaluates to False in the while loop condition, and the loop will terminate.

Python3




while data:
    print(data)
    data = file.read(3)


Close the Binary File

Finally, after the loop has finished reading and printing the data, we close the binary file using the close() method to release system resources.

Python3




file.close()


Putting the above steps together gives the complete program:

The output depends on the content of the "string.bin" binary file. The code reads and prints the data in chunks of three bytes until the end of the file is reached. Each iteration of the loop prints the bytes just read, and the final chunk may be shorter than three bytes.

Python




# Open the binary file
file = open("string.bin", "rb")
# Reading the first three bytes from the binary file
data = file.read(3)
# Printing data by iterating with while loop
while data:
    print(data)
    data = file.read(3)
# Close the binary file
file.close()


For example, if the content of "string.bin" is b'GeeksForGeeks' (a sequence of 13 bytes), the output will be:

Output:

b'Gee'
b'ksF'
b'orG'
b'eek'
b's'
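The same three-byte loop can be written more compactly with the two-argument form of iter(), which keeps calling a function until it returns a sentinel value. This is an alternative sketch rather than the article's original approach; it writes its own "string.bin" into a temporary folder so that it runs on its own.

```python
import os
import tempfile
from functools import partial

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "string.bin")
    with open(path, "wb") as f:
        f.write(b"GeeksForGeeks")

    with open(path, "rb") as f:
        # iter() calls f.read(3) repeatedly until it returns b"" (end of file)
        chunks = list(iter(partial(f.read, 3), b""))

for chunk in chunks:
    print(chunk)
# Prints b'Gee', b'ksF', b'orG', b'eek' and b's' on separate lines
```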

Read Binary files in Chunks

To read a binary file in chunks, we use a while loop that reads a fixed number of bytes (chunk_size) at a time. The loop continues until the end of the file is reached, and each chunk is processed as soon as it is read.

In the code below, chunk_size = 10 specifies the size of each chunk. The line file = open("binary_file.bin", "rb") opens the binary file named "binary_file.bin" in binary read mode ("rb"). The while True statement sets up a loop that keeps reading until the end of the file is reached. Inside the loop, chunk = file.read(chunk_size) reads the next chunk of binary data; when read() returns an empty bytes object, the loop ends with break.

Python3




# Specify the size of each chunk to read
chunk_size = 10

file = open("binary_file.bin", "rb")
# Using while loop to iterate the file data
while True:
    chunk = file.read(chunk_size)
    if not chunk:
        break
    # Processing the chunk of binary data
    print(f"Read {len(chunk)} bytes: {chunk}")
# Close the file once all chunks have been read
file.close()


The output of the code depends on the content of the "binary_file.bin" binary file and on the specified chunk_size. For example, if the file contains the binary data b'Hello, this is binary data!' and chunk_size is set to 10, the output will be:

Output:

Read 10 bytes: b'Hello, thi'
Read 10 bytes: b's is binar'
Read 7 bytes: b'y data!'

Outputs vary depending on the binary file data we are reading and also on the chunk size we are specifying.
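Chunked reading is most useful when each chunk is processed and then discarded, so the whole file never has to sit in memory. The sketch below wraps the loop in a generator and feeds each chunk to a SHA-256 hash. The function name read_in_chunks and the in-memory io.BytesIO stream are assumptions made for the example; a file opened with "rb" could be passed in the same way.

```python
import hashlib
import io


def read_in_chunks(stream, chunk_size=10):
    """Yield successive chunks from a binary stream until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


payload = b"Hello, this is binary data!"
stream = io.BytesIO(payload)  # stands in for open("binary_file.bin", "rb")

digest = hashlib.sha256()
sizes = []
for chunk in read_in_chunks(stream):
    digest.update(chunk)      # process the chunk, then let it go
    sizes.append(len(chunk))

print(sizes)  # [10, 10, 7]
# Hashing chunk by chunk gives the same result as hashing everything at once
print(digest.hexdigest() == hashlib.sha256(payload).hexdigest())  # True
```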

Read Binary file Data into Array

Before reading an array from a binary file, we first need a file that contains one. Here the file is named "array" and is opened in "wb" mode for writing. The list num = [3, 6, 9, 12, 18] is converted to byte format with bytearray(), which accepts integers from 0 to 255.

To write the array to the file we use:

Python3




file = open("array", "wb")
num = [3, 6, 9, 12, 18]
array = bytearray(num)
file.write(array)
file.close()


To read the array back, we open the same file in "rb" mode with file = open("array", "rb"). The call file.read(3) reads the first three bytes from the file, and list() converts them into a list of integers: number = list(file.read(3)).

Python3




file = open("array", "rb")
number = list(file.read(3))
print(number)
file.close()


Output:

[3, 6, 9]
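The write and read steps can be combined into one self-contained round trip. The sketch below is a variation on the code above, not a replacement for it: it stores the file in a temporary folder (an assumption made so the example cleans up after itself), reads the remaining values after the first three, and shows what happens when a value outside the 0 to 255 range is passed to bytearray().

```python
import os
import tempfile

num = [3, 6, 9, 12, 18]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "array")
    with open(path, "wb") as f:
        f.write(bytearray(num))

    with open(path, "rb") as f:
        first_three = list(f.read(3))  # the first three values
        remaining = list(f.read())     # everything after the file pointer

print(first_three)  # [3, 6, 9]
print(remaining)    # [12, 18]

# bytearray() only accepts integers from 0 to 255
try:
    bytearray([256])
except ValueError as error:
    print("ValueError:", error)
```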

Read Binary files in Python using NumPy

To read a binary file into a NumPy array, import the NumPy module and use np.fromfile(). The dtype here is np.uint8, which stands for "unsigned 8-bit integer". This means that each item in the array is an 8-bit (1 byte) integer, with values that can range from 0 to 255.

Python3




import numpy as np
 
# Open the file in binary mode
with open('myfile.bin', 'rb') as f:
    # Read the data into a NumPy array
    array = np.fromfile(f, dtype=np.uint8)  # Change dtype according to your data


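Remember to replace 'myfile.bin' with the path to your own binary file.

Output:

The contents of the array depend on the file. For a file containing the bytes 1 through 10, the array is:

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10], dtype=uint8)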
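The dtype passed to np.fromfile() decides how the raw bytes are grouped into array items. The sketch below assumes NumPy is installed. It writes the bytes 1 through 10 to a temporary "myfile.bin" (the temporary folder is an assumption made for the example) and then reads them back twice: once as single bytes and once as little-endian 16-bit unsigned integers.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "myfile.bin")

    # Write the bytes 1..10 so there is something to read back
    np.arange(1, 11, dtype=np.uint8).tofile(path)

    # One array item per byte
    array = np.fromfile(path, dtype=np.uint8)
    # The same bytes grouped in pairs as little-endian 16-bit unsigned integers
    pairs = np.fromfile(path, dtype="<u2")

print(array)        # [ 1  2  3  4  5  6  7  8  9 10]
print(array.dtype)  # uint8
print(pairs)        # [ 513 1027 1541 2055 2569]
```

The first pair, for instance, is the bytes 1 and 2, which combine to 1 + 2 * 256 = 513 in little-endian order.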

Related Article

Python | Convert String to bytes

Python Array

Read a file line by line in Python

Reading and Writing a text file in Python


