CSV stands for Comma-Separated Values. CSV files are plain text files that store tabular data: each line typically holds one row, and commas (,) separate the fields. Files in this format conventionally use the .csv extension. It is a popular, widely used format for structured data, and CSV files are a common input for machine learning and statistical modeling. Python's standard library includes the csv module for reading and writing CSV files. The following is an example of what a CSV file looks like.
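A minimal sketch of such a file can be produced with the csv module itself. The header names match the output shown later in this article; the data rows are made-up values for illustration only, and the file is written to a temporary directory (an assumption made here so that no existing data.csv is overwritten).

```python
import csv
import os
import tempfile

# Hypothetical sample data: only the header names come from this article.
rows = [
    ["Column1", "Column2", "Column3"],
    ["1", "2", "3"],
    ["4", "5", "6"],
]

# Write to a temporary directory so that no existing file is overwritten.
sample_path = os.path.join(tempfile.mkdtemp(), "data.csv")
with open(sample_path, "w", newline="") as csv_file:
    csv.writer(csv_file).writerows(rows)

# Read the raw text back to show how the table is stored on disk.
with open(sample_path, newline="") as csv_file:
    sample_text = csv_file.read()

print(sample_text)
```

Passing newline="" to open() is what the csv module documentation recommends, so that the module controls line endings itself.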
This article covers different ways to get the column names from a CSV file using Python. The following approaches can be used:
- Using Python's csv module to read the CSV file line by line and printing the header row as the column names
- Reading the CSV file as dictionaries using DictReader and then printing the keys of a row's dictionary
- Loading the CSV file into a data frame using the pandas library
Method 1:
In this approach, we read the CSV file with Python's csv module and output the first row, which contains the column names.
# importing the csv library
import csv

# opening the csv file by specifying
# the location
# with the variable name as csv_file
with open('data.csv') as csv_file:

    # creating an object of csv reader
    # with the delimiter as ,
    csv_reader = csv.reader(csv_file, delimiter=',')

    # list to store the names of columns
    list_of_column_names = []

    # loop to iterate through the rows of csv
    for row in csv_reader:

        # adding the first row
        list_of_column_names.append(row)

        # breaking the loop after the
        # first iteration itself
        break

# printing the result
print("List of column names : ", list_of_column_names[0])
Output:
List of column names : ['Column1', 'Column2', 'Column3']
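The loop that breaks after its first iteration can also be expressed with the built-in next(), which takes a default value to return when the reader is exhausted. The sketch below is one possible variant, not part of the original method: read_header is a hypothetical helper name, and an in-memory io.StringIO object with a made-up data row stands in for data.csv so the example runs on its own.

```python
import csv
import io


def read_header(csv_file):
    """Return the first row of an open CSV file, or [] if the file is empty."""
    # next() pulls a single row from the reader; the default avoids
    # a StopIteration error when there are no rows at all.
    return next(csv.reader(csv_file), [])


# In-memory stand-in for data.csv; the data row is made up.
sample = io.StringIO("Column1,Column2,Column3\n1,2,3\n")
header = read_header(sample)
print("List of column names :", header)
```

With a real file, the same helper would be called inside a with open('data.csv', newline='') block.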
Method 2:
In the second approach, we use the DictReader class of the csv module, which reads each row of the CSV file as a dictionary keyed by the column names. Calling the keys() method on any row then gives the column names.
Steps:
- Open the CSV file and pass it to DictReader.
- Convert the reader into a list of rows.
- Convert the first row of the list to a dictionary.
- Call the keys() method of the dictionary and convert the result into a list.
- Display the list.
# importing the csv library
import csv

# opening the csv file
with open('data.csv') as csv_file:

    # reading the csv file using DictReader
    csv_reader = csv.DictReader(csv_file)

    # converting the file to dictionary
    # by first converting to list
    # and then converting the list to dict
    dict_from_csv = dict(list(csv_reader)[0])

    # making a list from the keys of the dict
    list_of_column_names = list(dict_from_csv.keys())

    # displaying the list of column names
    print("List of column names : ", list_of_column_names)
Output:
List of column names : ['Column1', 'Column2', 'Column3']
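Converting the whole reader to a list loads every row into memory, and indexing the first element raises an IndexError when the file contains a header but no data rows. As a possible alternative, DictReader exposes a fieldnames attribute that reads only the header row on first access. The sketch below uses an in-memory io.StringIO object as a stand-in for data.csv (an assumption made here so the example is self-contained).

```python
import csv
import io

# In-memory stand-in for data.csv: a header row and no data rows.
sample = io.StringIO("Column1,Column2,Column3\n")
csv_reader = csv.DictReader(sample)

# fieldnames reads the header row on first access; it is None when the
# file is completely empty, hence the fallback to an empty list.
list_of_column_names = list(csv_reader.fieldnames or [])

print("List of column names :", list_of_column_names)
```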
Method 3:
In this approach, we read the CSV file into a data frame using the pandas library. Then we access the columns attribute of the data frame, which holds the column names.
# importing the pandas library
import pandas as pd

# reading the csv file using read_csv
# storing the data frame in variable called df
df = pd.read_csv('data.csv')

# creating a list of column names by
# calling the .columns
list_of_column_names = list(df.columns)

# displaying the list of column names
print('List of column names : ', list_of_column_names)
Output:
List of column names : ['Column1', 'Column2', 'Column3']
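When only the column names are needed, parsing the entire file into a data frame can be avoided. One option, sketched below, is to pass nrows=0 to read_csv so that pandas parses the header and no data rows. An in-memory io.StringIO object with made-up data rows stands in for data.csv here; with a real file, the path would be passed instead.

```python
import io

import pandas as pd

# In-memory stand-in for data.csv; the data rows are made up.
sample = io.StringIO("Column1,Column2,Column3\n1,2,3\n4,5,6\n")

# nrows=0 asks pandas to parse the header only, giving an empty
# data frame that still carries the column names.
header_only = pd.read_csv(sample, nrows=0)
list_of_column_names = header_only.columns.tolist()

print("List of column names :", list_of_column_names)
```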
The data frame looks as follows: