How to convert a tab-separated file into a DataFrame using Python

Last Updated : 27 Feb, 2024

In this article, we will learn how to convert a TSV file into a data frame using Python and the Pandas library.

A TSV (Tab-Separated Values) file is a plain text file where data is organized in rows and columns, with each column separated by a tab character.

It is a type of delimiter-separated file, similar to CSV (Comma-Separated Values).
Tab-separated files are commonly used in data manipulation and analysis, and being able to convert them into a data frame can greatly enhance our ability to work with structured data efficiently.
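
To make the format concrete, here is a minimal sketch that writes a small tab-separated file. The rows are the ones that appear in the outputs later in this article; the temporary directory and the variable names (rows, tsv_text, sample_path) are assumptions made for illustration, not part of the original example.

```python
import tempfile
from pathlib import Path

# Sample rows copied from the outputs shown in this article (no header row).
rows = [
    (0, 50, 5, 881250949),
    (0, 172, 5, 881250949),
    (0, 133, 1, 881250949),
    (196, 242, 3, 881250949),
    (186, 302, 3, 891717742),
    (22, 377, 1, 878887116),
]

# Fields within a row are joined by a tab; rows are joined by a newline.
tsv_text = "\n".join("\t".join(str(value) for value in row) for row in rows) + "\n"

# Write into a temporary directory so no existing file.tsv is overwritten.
sample_path = Path(tempfile.mkdtemp()) / "file.tsv"
sample_path.write_text(tsv_text, encoding="utf-8")

# repr() makes the tab characters visible as \t.
print(repr(tsv_text.splitlines()[0]))
```

Passing str(sample_path) as the file path in the methods below should let you follow along without a data file of your own.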

Methods to Convert Tab-Separated File into a Data Frame

Method 1: Using pandas ‘read_csv()’ with ‘sep’ parameter

In this method, we use the Pandas library to read a tab-separated file (file.tsv) into a DataFrame.

Look at the following code snippet.

We import the pandas library and define the path of the tab-separated file.
Then we call ‘pd.read_csv()’ to read the contents of the file into a DataFrame, passing sep='\t' to specify that the fields are tab-separated.
‘read_csv()’ does not detect the tab delimiter on its own: its default separator is a comma, so the ‘sep’ argument is what makes it split on tabs. By default the first line of the file is used as the column names, which is why the first data row appears as the header in the output below.

Python

import pandas as pd
# Path to the tab-separated file
file_path = "file.tsv"
# sep='\t' tells read_csv() to split each line on tab characters
df = pd.read_csv(file_path, sep='\t')
df.head()

Output:

    0    50    5    881250949
0    0    172    5    881250949
1    0    133    1    881250949
2    196    242    3    881250949
3    186    302    3    891717742
4    22    377    1    878887116
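
In the output above, the first line of the file has been used as the column labels. If the file has no header row, the following sketch shows one way to keep that line as data, using the header and names parameters of read_csv(). The in-memory sample text and the column names (col_a to col_d) are hypothetical and chosen only for illustration.

```python
import io
import pandas as pd

# Two header-less rows in the same shape as the article's data.
tsv_text = "0\t50\t5\t881250949\n0\t172\t5\t881250949\n"

# Default behaviour: the first line is consumed as column labels.
df_default = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# header=None keeps every line as data; names= supplies the labels.
# The names below are hypothetical placeholders.
df_named = pd.read_csv(
    io.StringIO(tsv_text),
    sep="\t",
    header=None,
    names=["col_a", "col_b", "col_c", "col_d"],
)

# The default read has one row fewer than the header=None read.
print(len(df_default), len(df_named))
```

Omitting names= while keeping header=None gives integer column labels 0, 1, 2, 3 instead.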

Method 2: Using pandas ‘read_table()’ function

In the following code snippet, we again use the pandas library to read the contents of a tab-separated file named ‘file.tsv’ into a DataFrame named ‘df’. This time we use the pd.read_table() function. Its default separator is the tab character, so no ‘sep’ argument is needed.

Python

import pandas as pd
df = pd.read_table('file.tsv')
df.head()

Output:

    0    50    5    881250949
0    0    172    5    881250949
1    0    133    1    881250949
2    196    242    3    881250949
3    186    302    3    891717742
4    22    377    1    878887116
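
As a quick check of how the two functions relate, this sketch reads the same in-memory sample with read_table() and with read_csv(sep='\t') and compares the results. The sample text and variable names are assumptions for illustration; header=None is passed only because the sample has no header row.

```python
import io
import pandas as pd

# Header-less sample in the same shape as the article's data.
tsv_text = "0\t50\t5\t881250949\n0\t172\t5\t881250949\n196\t242\t3\t881250949\n"

# read_table() splits on tabs by default.
df_table = pd.read_table(io.StringIO(tsv_text), header=None)

# read_csv() needs sep='\t' to do the same.
df_csv = pd.read_csv(io.StringIO(tsv_text), sep="\t", header=None)

# DataFrame.equals() compares both values and dtypes.
same = df_table.equals(df_csv)
print(same)
```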

Method 3: Using csv module

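The code example begins by importing the csv module, which provides functionality for reading and writing CSV files.

It uses the open() function to open the file specified by file_path in read-only mode ('r'), inside a with statement so that the file is closed properly after reading.
It then creates a reader object using csv.reader(file, delimiter='\t'), specifying that the values in the file are tab-separated, and passes it to pd.DataFrame(). The csv module returns every field as a string and does not treat the first line as a header, so the columns are labelled 0, 1, 2, 3 and hold text rather than numbers.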

Python

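import csv
import pandas as pd  # needed for pd.DataFrame below

file_path = "file.tsv"
# newline='' is how the csv documentation recommends opening files
with open(file_path, 'r', newline='') as file:
    reader = csv.reader(file, delimiter='\t')
    # Build the DataFrame while the file is still open
    df = pd.DataFrame(reader)
df.head()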

Output:

    0    1    2    3
0    0    50    5    881250949
1    0    172    5    881250949
2    0    133    1    881250949
3    196    242    3    881250949
4    186    302    3    891717742
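
Because csv.reader() yields every field as a string, a DataFrame built this way holds text even when the file contains only numbers. One possible way to convert the columns afterwards is pd.to_numeric(), sketched here on an in-memory sample; the sample text and the names df_text and df_num are assumptions for illustration.

```python
import csv
import io
import pandas as pd

# Header-less sample in the same shape as the article's data.
tsv_text = "0\t50\t5\t881250949\n196\t242\t3\t881250949\n"

reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")

# Each row is a list of strings, so every cell in df_text is a str.
df_text = pd.DataFrame(list(reader))

# Apply pd.to_numeric column by column to get numeric dtypes.
df_num = df_text.apply(pd.to_numeric)

print(type(df_text.iloc[0, 1]).__name__)
print(df_num.dtypes.tolist())
```

If a column contains values that are not numbers, pd.to_numeric() raises an error by default, so this sketch only suits fully numeric files.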

Method 4: Use ‘numpy’ to load the data and then convert to a DataFrame

This code segment uses NumPy’s ‘genfromtxt()’ function to load tab-separated data from ‘file.tsv’ into a NumPy array. The delimiter='\t' argument sets the tab separator, and dtype=None asks NumPy to infer the type of each column instead of treating everything as a float. The array is then passed to pd.DataFrame(), which gives a DataFrame with integer column labels that can be used for further analysis and manipulation.

Python

import numpy as np
import pandas as pd
data = np.genfromtxt('file.tsv', delimiter='\t', dtype=None, encoding=None)
df = pd.DataFrame(data)
df.head()

Output:

     0    1  2          3
0    0   50  5  881250949
1    0  172  5  881250949
2    0  133  1  881250949
3  196  242  3  881250949
4  186  302  3  891717742
