How to check if a DataFrame is empty in Scala?

Last Updated : 27 Mar, 2024

In this article, we will learn how to check whether a Spark DataFrame is empty in Scala. We can do this either with the isEmpty method (available since Spark 2.4) or by checking whether the row count is zero.

Syntax:

val isEmpty = dataframe.isEmpty

OR,

val isEmpty = dataframe.count() == 0

Here’s how you can do it:

Example #1: Using the isEmpty method

Scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameEmptyCheck {
  def main(args: Array[String]): Unit = {
    
    // Create SparkSession
    val spark = SparkSession.builder()
      .appName("DataFrameEmptyCheck")
      .master("local[*]")
      .getOrCreate()

    // Sample DataFrame (replace this with 
    // your actual DataFrame)
    val dataframe: DataFrame = spark.emptyDataFrame

    // Check if DataFrame is empty
    val isEmpty = dataframe.isEmpty
    if (isEmpty) {
      println("DataFrame is empty")
    } else {
      println("DataFrame is not empty")
    }

    // Stop SparkSession
    spark.stop()
  }
}

Output:

DataFrame is empty

Explanation:

  1. The code creates a SparkSession, which is the entry point to Spark functionality.
  2. It defines a sample DataFrame using spark.emptyDataFrame, which creates an empty DataFrame. You would typically replace this with your actual DataFrame.
  3. The code then checks if the DataFrame is empty using the isEmpty method. Since we initialized it as an empty DataFrame, the condition isEmpty will evaluate to true.
  4. If the DataFrame is empty, it prints “DataFrame is empty”; otherwise, it prints “DataFrame is not empty”.
  5. Finally, the SparkSession is stopped to release resources.
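
The example above only exercises the empty branch. As a minimal sketch, assuming Spark 2.4 or later (where isEmpty is available on DataFrames), the following compares a DataFrame that has rows with one that has been filtered down to nothing. The object name IsEmptyBothCases, the describe helper, and the sample data are illustrative assumptions, not part of the original example.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object IsEmptyBothCases {

  // Hypothetical helper: turns the result of isEmpty into a message
  def describe(df: DataFrame): String =
    if (df.isEmpty) "DataFrame is empty" else "DataFrame is not empty"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("IsEmptyBothCases")
      .master("local[*]")
      .getOrCreate()

    // Needed for toDF and the $"column" syntax
    import spark.implicits._

    // Illustrative sample data (an assumption for this sketch)
    val people = Seq(("Alice", 30), ("Bob", 25)).toDF("name", "age")

    // No one in the sample is older than 100, so this filter leaves no rows
    val nobody = people.filter($"age" > 100)

    println(describe(people)) // prints "DataFrame is not empty"
    println(describe(nobody)) // prints "DataFrame is empty"

    spark.stop()
  }
}
```

Note that a DataFrame can be empty even when it has columns: nobody still has the name and age columns, but no rows.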

Example #2: Using the count method

Scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameEmptyCheck {
  def main(args: Array[String]): Unit = {
    
    // Create SparkSession
    val spark = SparkSession.builder()
      .appName("DataFrameEmptyCheck")
      .master("local[*]")
      .getOrCreate()

    // Sample DataFrame (replace this with 
    // your actual DataFrame)
    val dataframe: DataFrame = spark.emptyDataFrame

    // Check if DataFrame is empty
    val isEmpty = dataframe.count() == 0
    if (isEmpty) {
      println("DataFrame is empty")
    } else {
      println("DataFrame is not empty")
    }

    // Stop SparkSession
    spark.stop()
  }
}

Output:

DataFrame is empty

Explanation:

  1. The code creates a SparkSession with the master set to “local[*]”, which means it runs locally using all available CPU cores.
  2. It defines a sample DataFrame using spark.emptyDataFrame, which creates a DataFrame with no rows and no columns.
  3. The code then checks if the DataFrame is empty using the count() method, which returns the number of rows in the DataFrame. Unlike isEmpty, count() has to process the whole dataset, so it can be slow on large DataFrames.
  4. The condition dataframe.count() == 0 evaluates to true because the DataFrame has 0 rows.
  5. Therefore, it prints “DataFrame is empty”.
  6. Finally, the SparkSession is stopped to release resources.
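
Because count() processes every row, it may be worth fetching at most one row instead when isEmpty is not available (for example, on Spark versions before 2.4). The sketch below shows one possible way to do that with head(1), next to the count-based check for comparison. The object name EmptyCheckAlternatives and the helpers isEmptyByHead and isEmptyByCount are hypothetical names introduced for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object EmptyCheckAlternatives {

  // Hypothetical helper: asks Spark for at most one row,
  // so it does not need to count the whole dataset
  def isEmptyByHead(df: DataFrame): Boolean = df.head(1).isEmpty

  // Hypothetical helper: the count-based check from Example #2
  def isEmptyByCount(df: DataFrame): Boolean = df.count() == 0

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("EmptyCheckAlternatives")
      .master("local[*]")
      .getOrCreate()

    val empty: DataFrame = spark.emptyDataFrame
    // Illustrative non-empty DataFrame with ids 0 to 4
    val nonEmpty: DataFrame = spark.range(5).toDF("id")

    println(isEmptyByHead(empty))     // prints "true"
    println(isEmptyByCount(empty))    // prints "true"
    println(isEmptyByHead(nonEmpty))  // prints "false"
    println(isEmptyByCount(nonEmpty)) // prints "false"

    spark.stop()
  }
}
```

Both helpers return the same answer; the difference is only in how much work Spark may have to do to get it.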

