How to Import Kaggle Datasets Directly into Google Colab

In this article, we will see how to import Kaggle Datasets into Google Colab.

Getting Started

This article covers three ways to get a Kaggle dataset into Colab. The first uses the opendatasets library, which downloads a dataset through the Kaggle API from its URL. The second installs the Kaggle API command-line tool in the notebook and downloads the data with it. The third downloads the dataset manually from the Kaggle website and uploads it to Colab. In every case, first log in to your Google account, then go to https://colab.research.google.com.

Method 1: Downloading Kaggle Dataset in Google Colab Notebook

Step 1: Open your Google Colab Notebook 

Google Colab interface

Step 2: Install the required packages.

! pip install opendatasets
! pip install pandas

Installing opendatasets in Colab

Step 3: Visit www.kaggle.com. Go to your profile and click on account. 

Download Kaggle token

Step 4: On the following page you will find an API section with a “Create New API Token” button. Click it to download a kaggle.json file, which contains your username and key. We will use the username and key in the next step.
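To make the contents of the token file concrete, here is a minimal sketch that reads the username and key from kaggle.json. The helper names `load_kaggle_credentials` and `export_kaggle_credentials` are hypothetical and only for illustration. The sketch assumes the file has been uploaded to the notebook's working directory, and it relies on the Kaggle API client reading the `KAGGLE_USERNAME` and `KAGGLE_KEY` environment variables.

```python
import json
import os


def load_kaggle_credentials(token_path="kaggle.json"):
    """Return (username, key) read from a Kaggle API token file."""
    with open(token_path, "r", encoding="utf-8") as token_file:
        token = json.load(token_file)
    # The token file is a small JSON object with "username" and "key" fields.
    return token["username"], token["key"]


def export_kaggle_credentials(token_path="kaggle.json"):
    """Copy the credentials into environment variables (hypothetical helper)."""
    username, key = load_kaggle_credentials(token_path)
    os.environ["KAGGLE_USERNAME"] = username
    os.environ["KAGGLE_KEY"] = key
    return username


# Example (run only after kaggle.json is in the working directory):
# export_kaggle_credentials("kaggle.json")
```

Keep the key private: avoid printing it in a notebook that you plan to share.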

 

Step 5: Import the opendatasets library and download your Kaggle dataset by passing its link to od.download(). When prompted, enter the username and key from kaggle.json.

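Python3

import opendatasets as od

# Download the dataset; adjacent string literals are joined,
# so the URL stays one unbroken string with no stray spaces
od.download(
    "https://www.kaggle.com/datasets/"
    "muratkokludataset/acoustic-extinguisher-fire-dataset")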



Step 6: Now we are ready to use our dataset. 

Python3

import pandas as pd

# Path to the XLSX file inside the downloaded dataset folder
file = ('Acoustic_Extinguisher_Fire_Dataset/'
        'Acoustic_Extinguisher_Fire_Dataset.xlsx')

# Read the XLSX file into a DataFrame
newData = pd.read_excel(file)

# Display the first five rows
newData.head()


Output:

First 5 rows of the dataset 
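If the file path above does not match what was actually downloaded, it can help to list the data files first. The following is a small sketch using only the standard library. The function name `find_data_files` and the default extension list are assumptions chosen for illustration, not part of opendatasets or pandas.

```python
from pathlib import Path


def find_data_files(root=".", extensions=(".csv", ".xlsx", ".xls")):
    """Return sorted paths of files under root whose suffix is in extensions."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )


# Example: print every CSV/Excel file below the current directory
# for path in find_data_files("."):
#     print(path)
```

Any path it prints can be passed directly to the matching pandas reader, such as `read_excel` or `read_csv`.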

Method 2: Installing the Kaggle API in the Colab Notebook

Step 1: Select any dataset from Kaggle

Kaggle dataset

Step 2: Download Dataset API Token

Download the Kaggle API token from the Account page of your Kaggle profile. The token file is named kaggle.json.

Kaggle token file 

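Step 3: Set Up the Colab Notebook

To download a dataset into the Google Colab notebook, first install the kaggle package in the Colab runtime and upload kaggle.json to the notebook's working directory. Then copy the token to the location where the Kaggle client looks for it and restrict its permissions, so that the client can authenticate your download requests.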

  • Install the Kaggle library
 ! pip install kaggle
  • Make a directory named “.kaggle”
 ! mkdir -p ~/.kaggle
  • Copy “kaggle.json” into this new directory
 ! cp kaggle.json ~/.kaggle/
  • Restrict the file's permissions so that only your user can read and write it
 ! chmod 600 ~/.kaggle/kaggle.json
Command to install Kaggle API in Colab notebook 
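The directory, copy, and permission steps above can also be done from a Python cell. This is a sketch under the same assumptions as the shell commands: kaggle.json is already in the working directory and the runtime is a POSIX system such as Colab. The function name `install_kaggle_token` and its `kaggle_dir` parameter are hypothetical.

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(token_path="kaggle.json", kaggle_dir=None):
    """Copy the token into the .kaggle directory and restrict its permissions."""
    # Default to ~/.kaggle, the same location used by the shell commands
    kaggle_dir = Path(kaggle_dir) if kaggle_dir else Path.home() / ".kaggle"
    kaggle_dir.mkdir(parents=True, exist_ok=True)

    target = kaggle_dir / "kaggle.json"
    shutil.copyfile(token_path, target)

    # Equivalent of `chmod 600`: read and write for the owner only
    os.chmod(target, 0o600)
    return target


# Example (run only after kaggle.json has been uploaded):
# install_kaggle_token("kaggle.json")
```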

Step 4: Download the Dataset into Colab

To download the data into Colab, run the kaggle download command followed by the dataset or competition name taken from its URL.

  • For downloading a dataset

Suppose our dataset web link is

https://www.kaggle.com/datasets/gauravduttakiit/cassava-leaf-disease-classification

then we will type 

! kaggle datasets download gauravduttakiit/cassava-leaf-disease-classification
Kaggle dataset download command 

  • For downloading competitions data 

Suppose our competition data link is 

https://www.kaggle.com/competitions/playground-series-s3e14

then we will type 

! kaggle competitions download playground-series-s3e14
Kaggle competition download command 
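The mapping from a Kaggle URL to the command typed above can be sketched in a few lines of Python. The helper names `kaggle_identifier` and `download_command` are hypothetical. The sketch assumes URLs of the two shapes shown in this step and only builds the command string; it does not run it.

```python
from urllib.parse import urlparse


def kaggle_identifier(url):
    """Return (kind, identifier) for a Kaggle dataset or competition URL."""
    parts = [part for part in urlparse(url).path.split("/") if part]
    if len(parts) >= 3 and parts[0] == "datasets":
        # Datasets are identified as "<owner>/<dataset-name>"
        return "datasets", parts[1] + "/" + parts[2]
    if len(parts) >= 2 and parts[0] == "competitions":
        # Competitions are identified by their name alone
        return "competitions", parts[1]
    raise ValueError(f"Not a Kaggle dataset or competition URL: {url}")


def download_command(url):
    """Build the kaggle CLI download command for the given URL."""
    kind, identifier = kaggle_identifier(url)
    return f"kaggle {kind} download {identifier}"


print(download_command(
    "https://www.kaggle.com/datasets/gauravduttakiit/cassava-leaf-disease-classification"))
print(download_command(
    "https://www.kaggle.com/competitions/playground-series-s3e14"))
```

In a Colab cell, prefix the resulting command with `!` to run it.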

Method 3: Manually Downloading the Kaggle Dataset

Step 1: Visit the Kaggle website and select the Datasets tab.

 

Step 2: Select any dataset and click Download.

 

Step 3: The downloaded file will be a ZIP archive. Unzip it.
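If you would rather upload the ZIP archive as it is and extract it inside the notebook, Python's standard zipfile module can do the unzipping. The sketch below assumes the archive is already present in the working directory. The function name `extract_archive` and the example file name are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir=None):
    """Extract a ZIP archive and return the sorted names of its members."""
    zip_path = Path(zip_path)
    # Default destination: a folder named after the archive, without ".zip"
    dest = Path(dest_dir) if dest_dir else zip_path.with_suffix("")
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return sorted(archive.namelist())


# Example with a hypothetical archive name:
# extract_archive("archive.zip")
```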

Step 4: Upload your dataset file or folder to the Google Colab notebook. Clicking the upload option lets you choose the file or folder to upload, as illustrated in the image.

 

Step 5: The dataset is now uploaded to the Google Colab notebook.

Step 6: Now you are ready to use your Kaggle dataset.

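Python3

import pandas as pd

# Path to the uploaded XLSX file; adjacent string literals are joined,
# so no stray spaces end up inside the path
file = ('Acoustic_Extinguisher_Fire_Dataset/'
        'Acoustic_Extinguisher_Fire_Dataset.xlsx')

# Read the XLSX file into a DataFrame
newData = pd.read_excel(file)

# Display the first five rows
newData.head()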





Last Updated : 23 May, 2023