
How to Import Kaggle Datasets Directly into Google Colab

In this article, we will see how to import Kaggle Datasets into Google Colab.

Getting Started

Here, we cover three methods for getting a Kaggle dataset into Colab. In the first, we use the opendatasets library to download the dataset through the Kaggle API. In the second, we install the Kaggle command-line tool in the notebook and download the dataset with it. In the third, we download the dataset manually from the Kaggle website and upload it to Colab. To begin, log in to your Google account and go to https://colab.research.google.com.



Method 1: Downloading Kaggle Dataset in Google Colab Notebook

Step 1: Open your Google Colab Notebook 

Google Colab interface 

Step 2: Install the required packages.

!pip install opendatasets
!pip install pandas

Install open dataset in Colab 

Step 3: Visit www.kaggle.com, open your profile, and click Account.

Download Kaggle token 

Step 4: On the following page you will see an API section. Click “Create New API Token” to download a kaggle.json file, which contains your username and key. We will use the username and key in the next step.
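To check that the token file contains the two fields mentioned above, it can be read with Python's standard json module. This is a minimal sketch: the helper name load_kaggle_credentials is made up for illustration, and the demo uses a fake token so that it runs anywhere.

```python
import json
import tempfile
from pathlib import Path


def load_kaggle_credentials(path):
    """Return (username, key) from a Kaggle API token file."""
    data = json.loads(Path(path).read_text())
    return data["username"], data["key"]


# Demo with a fake token (hypothetical values, not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    token = Path(tmp) / "kaggle.json"
    token.write_text(json.dumps({"username": "example_user", "key": "0123abcd"}))
    username, key = load_kaggle_credentials(token)
    print(username)        # example_user
    print("*" * len(key))  # mask the key instead of printing it
```

With a real token, pass the path of the downloaded kaggle.json and avoid printing the key in a shared notebook.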

 

Step 5: Import the opendatasets library and download your Kaggle dataset by passing its link to od.download. When prompted, enter the username and key from kaggle.json.




import opendatasets as od

# Adjacent string literals are joined, so the URL contains no stray spaces
od.download(
    "https://www.kaggle.com/datasets/"
    "muratkokludataset/acoustic-extinguisher-fire-dataset")

Output:

 

Step 6: Now we are ready to use our dataset. 
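Before reading a file, it can help to confirm where the download landed, because the folder and file names depend on the dataset. The sketch below lists every file under a folder using only the standard library. The helper name list_files is made up for illustration, and the demo runs on a throwaway folder rather than a real download.

```python
import os
import tempfile


def list_files(root):
    """Return sorted paths (relative to root) of every file under root."""
    found = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


# Demo on a temporary folder; in Colab, pass the download folder instead.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "data"))
    open(os.path.join(tmp, "data", "example.xlsx"), "w").close()
    print(list_files(tmp))
```

In a notebook, calling list_files(".") after the download shows the relative path to pass to pandas.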




import pandas as pds

# path to the downloaded XLSX file
file = ('Acoustic_Extinguisher_Fire_Dataset/'
        'Acoustic_Extinguisher_Fire_Dataset.xlsx')

# reading the XLSX file
newData = pds.read_excel(file)

# displaying the first five rows of the XLSX file
newData.head()

Output:

First 5 rows of the dataset 

Method 2: Installing the Kaggle API in the Colab Notebook

Step 1: Select any dataset from Kaggle

Kaggle dataset 

Step 2: Download the Kaggle API Token

Download the Kaggle API token from the Account page of your Kaggle profile. The token file is named kaggle.json.

Kaggle token file 


Step 3: Setup the Colab Notebook

To download the dataset into the Colab notebook, we first install the kaggle package in the Colab runtime and upload kaggle.json to the notebook's working directory. We then copy the token to ~/.kaggle and restrict its permissions so that the Kaggle API can use it to authenticate downloads.

!pip install kaggle
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

Command to install Kaggle API in Colab notebook 
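The copy and permission steps can also be written in Python, which avoids shell commands in the notebook. This is a sketch under assumptions: the helper name install_kaggle_token is made up, and the demo points it at a temporary folder standing in for the home directory so that no real token is touched.

```python
import os
import shutil
import stat
import tempfile
from pathlib import Path


def install_kaggle_token(token_path, home=None):
    """Copy kaggle.json into <home>/.kaggle and restrict it to the owner."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # like mkdir -p
    target = target_dir / "kaggle.json"
    shutil.copyfile(token_path, target)            # like cp
    os.chmod(target, 0o600)                        # like chmod 600
    return target


# Demo with a fake token and a temporary "home" folder.
with tempfile.TemporaryDirectory() as tmp:
    fake_token = Path(tmp) / "kaggle.json"
    fake_token.write_text('{"username": "example_user", "key": "0123abcd"}')
    installed = install_kaggle_token(fake_token, home=Path(tmp) / "home")
    print(oct(stat.S_IMODE(installed.stat().st_mode)))
```

In Colab, calling install_kaggle_token("kaggle.json") with no home argument would place the token under the real home directory.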

Step 4: Download the Dataset into Colab

To download the dataset into Colab, we run a kaggle command followed by the dataset identifier, which is the owner/dataset-name part of the dataset URL.

Suppose our dataset web link is

https://www.kaggle.com/datasets/gauravduttakiit/cassava-leaf-disease-classification

then we will type 

! kaggle datasets download gauravduttakiit/cassava-leaf-disease-classification

Kaggle dataset download command 

Suppose our competition data link is 

https://www.kaggle.com/competitions/playground-series-s3e14

then we will type 

! kaggle competitions download playground-series-s3e14

Kaggle competition download command 
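The mapping from a Kaggle link to the matching command follows the pattern of the two examples above and can be sketched in Python. The helper name kaggle_download_command is made up for illustration, and it only handles the two URL shapes shown in this article.

```python
from urllib.parse import urlparse


def kaggle_download_command(url):
    """Build the kaggle CLI command for a dataset or competition URL."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) >= 3 and parts[0] == "datasets":
        # /datasets/<owner>/<dataset-name>
        return f"kaggle datasets download {parts[1]}/{parts[2]}"
    if len(parts) >= 2 and parts[0] == "competitions":
        # /competitions/<competition-name>
        return f"kaggle competitions download {parts[1]}"
    raise ValueError(f"Not a dataset or competition URL: {url}")


print(kaggle_download_command(
    "https://www.kaggle.com/datasets/"
    "gauravduttakiit/cassava-leaf-disease-classification"))
# kaggle datasets download gauravduttakiit/cassava-leaf-disease-classification

print(kaggle_download_command(
    "https://www.kaggle.com/competitions/playground-series-s3e14"))
# kaggle competitions download playground-series-s3e14
```

In a notebook cell, the returned string is run with a leading !, as in the commands above.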

Method 3: Manually Downloading the Kaggle Dataset

Step 1: Visit the Kaggle website and Select the Dataset tab.

 

Step 2: Select any dataset and click Download.

 

Step 3: The downloaded file is a ZIP archive; unzip it.
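Unzipping can also be scripted, for example when the archive is uploaded to Colab as is. The sketch below uses Python's standard zipfile module. The helper name extract_archive and the file names in the demo are made up for illustration.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest):
    """Extract zip_path into dest and return the names of its members."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Demo: build a tiny archive, then extract it.
with tempfile.TemporaryDirectory() as tmp:
    zip_path = Path(tmp) / "example.zip"
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr("data/train.csv", "id,label\n1,0\n")
    names = extract_archive(zip_path, Path(tmp) / "unzipped")
    print(names)  # ['data/train.csv']
```

For a real download, pass the path of the ZIP file from Kaggle and the folder where the files should go.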

Step 4: Upload your dataset file or folder to the Google Colab Notebook. Clicking the upload button opens a dialog for choosing the file or folder, as the image illustrates.

 

Step 5: The dataset is now uploaded to the Google Colab Notebook.

 

Step 6: Now you are ready to use your Kaggle dataset.




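import pandas as pds

# path to the uploaded XLSX file; adjacent string literals are joined,
# so the path contains no stray spaces
file = ('Acoustic_Extinguisher_Fire_Dataset/'
        'Acoustic_Extinguisher_Fire_Dataset.xlsx')

# reading the XLSX file
newData = pds.read_excel(file)

# displaying the first five rows of the XLSX file
newData.head()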

Output:

 

