Ways to import CSV files in Google Colab

Colab (short for Colaboratory) is Google’s free, cloud-hosted Jupyter Notebook service for writing and running Python code. It lets us train Machine Learning models directly in the cloud at no cost. Google Colab does whatever a local Jupyter Notebook does and a bit more: for example, you can use GPUs and TPUs for free. Other advantages include quick setup and real-time sharing of notebooks between users.

However, loading a CSV file in Colab takes a few extra lines of code. In this article, we will discuss three different ways to load a CSV file and store it in a pandas dataframe. To get started, sign in to your Google Account, go to “https://colab.research.google.com”, and click on “New Notebook”.
 

Ways to import CSV

Load data from local drive 

To upload a file from your local drive, write the following code in a cell and run it:

from google.colab import files
  
  
uploaded = files.upload()

Running the cell displays a file upload prompt.

Click on “Choose Files”, then select the CSV file on your local drive to upload it. Next, run the following code snippet to import it into a pandas dataframe, replacing 'file.csv' with the name of the file you uploaded.

import pandas as pd
import io
  
df = pd.read_csv(io.BytesIO(uploaded['file.csv']))
print(df)
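
The key used to index `uploaded` must match the name of the file you selected, so a hard-coded 'file.csv' fails for any other name. Here is a minimal sketch, assuming `uploaded` is the dictionary returned by files.upload() (file names mapped to raw bytes). The helper name frames_from_upload is hypothetical, and the sample dictionary stands in for a real upload so the snippet also runs outside Colab:

```python
import io

import pandas as pd


def frames_from_upload(uploaded):
    """Return {file name: DataFrame} for every CSV in an upload dictionary.

    `uploaded` maps file names to raw bytes, the shape files.upload() returns.
    (Hypothetical helper, not part of google.colab.)
    """
    return {
        name: pd.read_csv(io.BytesIO(data))
        for name, data in uploaded.items()
        if name.lower().endswith(".csv")
    }


# Stand-in for the result of files.upload(); in Colab, pass the real dictionary.
sample_upload = {"scores.csv": b"name,score\nana,90\nben,85\n"}
frames = frames_from_upload(sample_upload)
print(frames["scores.csv"])
```

In a notebook you would call frames_from_upload(uploaded) right after the upload cell and pick the dataframe you need by file name.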


From GitHub

This is the easiest way to load a CSV file in Colab. Go to the dataset in your GitHub repository and click on “View Raw”. Copy the link to the raw dataset and pass it as a parameter to read_csv() in pandas to get the dataframe.
 

import pandas as pd

url = 'copied_raw_github_link'  # raw link copied from GitHub
df = pd.read_csv(url)
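
If you copy the address of the regular GitHub file page instead of the raw link, read_csv() receives an HTML page rather than CSV data. Below is a small sketch that rewrites such an address into its raw.githubusercontent.com form, assuming the usual github.com/<user>/<repo>/blob/<branch>/<path> layout. The function name and the example repository are hypothetical:

```python
from urllib.parse import urlparse


def to_raw_github_url(url):
    """Convert a github.com 'blob' page URL to its raw-content URL.

    URLs that are not github.com blob pages are returned unchanged.
    """
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    # Expected path layout: <user>/<repo>/blob/<branch>/<path to file>
    if parsed.netloc == "github.com" and len(parts) >= 5 and parts[2] == "blob":
        user, repo, _, *rest = parts
        return "https://raw.githubusercontent.com/" + "/".join([user, repo, *rest])
    return url


# Hypothetical repository, used only to show the transformation.
page_url = "https://github.com/example-user/example-repo/blob/main/data/file.csv"
raw_url = to_raw_github_url(page_url)
print(raw_url)
```

The resulting address can be passed to pd.read_csv() in the same way as a link copied by hand.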


From your Google Drive

We can import datasets stored on our Google Drive in two ways:

1. Using PyDrive
This is the most complex of the methods covered here. First install the PyDrive library with pip, then authenticate and create the Drive client, as in the following code.

!pip install -U -q PyDrive
  
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
  
  
# Authenticate and create the PyDrive client.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)


Click on the link that appears to authorize access to your Drive. You will see a screen with “Google Cloud SDK wants to access your Google Account” at the top. After you grant permission, copy the verification code and paste it into the box in Colab.

Now go to the CSV file in your Drive, get its shareable link, and store it in a string variable named link in Colab. Then run the following code to load the file into a dataframe.



  
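import pandas as pd

# 'link' holds the shareable link copied from Drive;
# the file ID is its second-to-last path segment
file_id = link.split("/")[-2]

# Download the file from Drive into the Colab runtime
downloaded = drive.CreateFile({'id': file_id})
downloaded.GetContentFile('xclara.csv')

df = pd.read_csv('xclara.csv')
print(df)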
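
Splitting on "/" works for links of the form https://drive.google.com/file/d/<id>/view, but Drive has also issued links that carry the ID in an id= query parameter. This hedged sketch handles both forms; the function name and the IDs are made up for illustration:

```python
from urllib.parse import parse_qs, urlparse


def drive_file_id(link):
    """Extract the file ID from a Google Drive shareable link.

    Handles .../file/d/<id>/... paths and ...?id=<id> query strings;
    raises ValueError for anything else.
    """
    parsed = urlparse(link)
    parts = parsed.path.split("/")
    # Path form: the ID is the segment right after "d"
    if "d" in parts and parts.index("d") + 1 < len(parts):
        return parts[parts.index("d") + 1]
    # Query form: the ID is the value of the "id" parameter
    ids = parse_qs(parsed.query).get("id")
    if ids:
        return ids[0]
    raise ValueError(f"No file ID found in {link!r}")


# Made-up IDs, for illustration only.
print(drive_file_id("https://drive.google.com/file/d/abc123XYZ/view?usp=sharing"))
print(drive_file_id("https://drive.google.com/open?id=abc123XYZ"))
```

The returned string can be used as the 'id' value passed to drive.CreateFile().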


2. Mounting the drive
This method is simpler and cleaner than the previous one.

  • Create a folder in your Google Drive. 
  • Upload the CSV file to this folder.
  • Write the following code in your Colab Notebook:
     
from google.colab import drive

drive.mount('/content/drive')

As with the previous method, this command takes you through a Google authentication step; complete the verification as before. Then, in the notebook, open the File menu at the top left, click on Locate in Drive, and find your data. Copy the path of the CSV file into a variable in your notebook and read the file using read_csv().

path = "copied path"
df_bonus = pd.read_csv(path)
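
The copied path normally starts with the mount point passed to drive.mount(). As a sketch, assuming your files appear under a MyDrive directory inside the mount point (check the copied path if your layout differs), the path can also be assembled in code. The helper name and the folder and file names are placeholders:

```python
from pathlib import PurePosixPath

MOUNT_POINT = PurePosixPath("/content/drive")  # same path given to drive.mount()


def drive_csv_path(folder, filename, root="MyDrive"):
    """Build the runtime path of a file stored in a Drive folder.

    `root` is an assumption about how the mounted Drive is laid out.
    """
    return str(MOUNT_POINT / root / folder / filename)


# Placeholder names; replace them with your own folder and file.
path = drive_csv_path("datasets", "file.csv")
print(path)
```

Building the path this way keeps the mount point in one place if you later mount the drive somewhere else.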

To read the file and print the resulting dataframe, run the following code, replacing file_path with the copied path.

import pandas as pd
  
df = pd.read_csv("file_path")
print(df)
