
Apply a function to a single column of a csv in Spark

Last Updated : 17 Apr, 2023

Apache Spark is an open-source distributed computing system for fast, large-scale data processing. One common task in Spark is applying a function to a single column of a CSV file. This article demonstrates how to do that in Python with PySpark.

Syntax: withColumn(colName : String, col : Column) 

Parameters:
colName : str – name of the new column, or of an existing column to replace.

col : Column –  a Column expression for the new column.

First, let's create a sample CSV file and load it into Spark. The sample CSV file, sample.csv, contains three columns: "id", "name", and "age".
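The article's own data file is not reproduced in this text, so one way to follow along is to generate a stand-in with the same three columns. The rows below are invented for illustration; only the column names and the file name sample.csv come from the article.

```python
import csv

# Hypothetical rows; the values are made up for this sketch
rows = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Carol", "age": 41},
]

# Write a header line followed by one line per row
with open("sample.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["id", "name", "age"])
    writer.writeheader()
    writer.writerows(rows)
```

Because the file starts with a header line, it can be read with header=True, and inferSchema=True should let Spark treat id and age as integers rather than strings.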

 

To load the CSV file into Spark, we use the spark.read.csv method:


Python3




from pyspark.sql import SparkSession
  
spark = SparkSession.builder.appName("Apply function to CSV").getOrCreate()
  
# Loading the csv file
df = spark.read.csv("sample.csv", header=True, inferSchema=True)
# Displaying the csv file
df.show()


Output:

 

Now that we have loaded the CSV file into Spark, we can apply a function to a single column. In this example, we will apply a function that increments the age column by 1. To apply the function, we use the withColumn method:

Python3




from pyspark.sql.functions import col
  
# Incrementing each value in the age column by 1
df = df.withColumn("age", col("age") + 1)
  
df.show()


Output:

 

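In this example, we used the col function to reference the "age" column and added 1 to it. Because the column name passed to withColumn already exists, the original "age" column is replaced with the incremented values. Any valid Column expression can be passed as the second argument to withColumn, so the same pattern works for other transformations of a single column.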

In conclusion, applying a function to a single column of a CSV file in Spark is straightforward. Load the file with spark.read.csv, then use withColumn to transform the selected column as needed.


