Read And Write Tabular Data using Pandas

Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental, high-level building block for doing practical, real-world data analysis in Python.
The two primary data structures of Pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides, and much more. Pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries.

Data structures

Dimension	Name	Description
1	Series	1D-labeled homogeneously-typed array
2	DataFrame	General 2D labeled, size-mutable tabular structure with potentially heterogeneously-typed columns
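
As a minimal sketch of the two structures in the table above, the snippet below builds a Series and a DataFrame by hand. The values are borrowed from the sample output shown in the reading section of this article, and the variable names `tips` and `df` are only illustrative.

```python
import pandas as pd

# A Series: one-dimensional, labeled, with a single dtype for all values
tips = pd.Series([1.01, 1.66, 3.50], name="tip")

# A DataFrame: two-dimensional, and each column may have its own dtype
df = pd.DataFrame({
    "total_bill": [16.99, 10.34, 21.01],
    "tip": [1.01, 1.66, 3.50],
    "sex": ["Female", "Male", "Male"],
})

print(tips.dtype)  # dtype shared by every element of the Series
print(df.dtypes)   # one dtype per column
print(df.shape)    # (rows, columns)
```

Selecting a single column of a DataFrame, for example `df["tip"]`, returns a Series.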

Reading Tabular Data

Pandas provides the read_csv() function to read data stored in a CSV file into a Pandas DataFrame. Pandas supports many different file formats and data sources out of the box (CSV, Excel, SQL, JSON, Parquet, …), each read by a function whose name starts with the prefix read_ (for example read_excel() or read_json()).

Importing the Necessary Libraries

Python

import pandas as pd

1. Reading a CSV file

Dataset link: dataset.csv

Python

# Load the dataset from the 'dataset.csv' file using Pandas
data = pd.read_csv('dataset.csv')

# Display the first five rows of the loaded dataset
print(data.head())

Output:

   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
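
read_csv() also accepts optional parameters that control what is loaded. The sketch below uses two of them, `usecols` and `nrows`, on an in-memory copy of the rows shown above; the `csv_text` string is an assumption standing in for dataset.csv so the example runs without the file.

```python
import io
import pandas as pd

# Stand-in for dataset.csv, built from the rows printed above
csv_text = """total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,1.66,Male,No,Sun,Dinner,3
21.01,3.50,Male,No,Sun,Dinner,3
23.68,3.31,Male,No,Sun,Dinner,2
24.59,3.61,Female,No,Sun,Dinner,4
"""

# usecols keeps only the named columns; nrows stops after that many data rows
subset = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["total_bill", "tip", "size"],
    nrows=3,
)

print(subset)
```

Limiting columns and rows this way can be useful for a quick look at a large file before loading it in full.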

2. Reading an Excel file

Dataset link: data.xlsx

Python

# Load the dataset from the 'data.xlsx' file using Pandas
data = pd.read_excel('data.xlsx')

# Display the first five rows of the loaded dataset
print(data.head())

Output:

   Column1 Column2  Column3
0        1       A     10.5
1        2       B     20.3
2        3       C     15.8
3        4       D      8.2

Writing Tabular Data

1. Writing to an Excel file

Python

# Read the data from a CSV file named 'dataset.csv' into a pandas DataFrame
data = pd.read_csv('dataset.csv')

# Path of the new Excel file to be created
excel_file_path = 'newDataset.xlsx'

# Write the DataFrame to the Excel file, excluding the index column
data.to_excel(excel_file_path, index=False)

# Report that the data has been written to the Excel file
print(f'Data written to Excel file: {excel_file_path}')

Output:

Data written to Excel file: newDataset.xlsx
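
One way to check an Excel export is to read the file back. The sketch below writes a small DataFrame to a temporary .xlsx file and loads it again with read_excel(). The helper name `excel_round_trip` is hypothetical, and the example assumes an optional Excel engine such as openpyxl is installed; if none is available, Pandas raises ImportError and the helper returns None.

```python
import os
import tempfile
import pandas as pd

# Sample data matching the Excel output shown earlier
data = pd.DataFrame({
    "Column1": [1, 2, 3, 4],
    "Column2": ["A", "B", "C", "D"],
    "Column3": [10.5, 20.3, 15.8, 8.2],
})

def excel_round_trip(df, sheet_name="Sheet1"):
    """Write df to a temporary .xlsx file and read it back.

    Returns None when no Excel engine (for example openpyxl) is installed.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "newDataset.xlsx")
        try:
            df.to_excel(path, index=False, sheet_name=sheet_name)
            return pd.read_excel(path, sheet_name=sheet_name)
        except ImportError:
            # Excel support in Pandas relies on an optional dependency
            return None

restored = excel_round_trip(data)
print(restored)
```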

2. Writing to a CSV file

Python

# Read the data from a CSV file named 'dataset.csv' into a pandas DataFrame
data = pd.read_csv('dataset.csv')

# Path of the new CSV file to be created
csv_file_path = 'newDataset.csv'

# Write the DataFrame to the CSV file, excluding the index column
data.to_csv(csv_file_path, index=False)

# Report that the data has been written to the CSV file
print(f'Data written to CSV file: {csv_file_path}')

Output:

Data written to CSV file: newDataset.csv
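
To see what index=False changes, the following sketch calls to_csv() without a path, in which case the CSV text is returned as a string instead of being written to disk. The two-row DataFrame is an illustrative stand-in for the dataset used above.

```python
import pandas as pd

# Small stand-in for the dataset loaded from dataset.csv
data = pd.DataFrame({"total_bill": [16.99, 10.34], "tip": [1.01, 1.66]})

# With no path argument, to_csv() returns the CSV content as a string
with_index = data.to_csv()
without_index = data.to_csv(index=False)

# By default the row index is written as an unnamed first column
print(with_index.splitlines()[0])     # ,total_bill,tip
print(without_index.splitlines()[0])  # total_bill,tip
```

Leaving the index out avoids an extra unnamed column appearing when the file is read back with read_csv().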

Conclusion

In conclusion, Pandas provides the essential tools for reading and writing tabular data across a range of file formats. Its key functions, such as read_csv, read_excel, to_csv, and to_excel, load data into a DataFrame and export it again, regardless of the original format.

These functions also accept optional parameters for handling details such as missing values or the index column. Whether the data comes from CSV, Excel, SQL, JSON, or another supported source, Pandas offers a consistent interface for working with it.
