Add a column with the literal value in PySpark DataFrame
In this article, we are going to see how to add a column with the literal value in PySpark Dataframe.
Creating dataframe for demonstration:
Method 1: Using Lit() function
Here we can add the constant column ‘literal_values_1’ with value 1 by Using the select method. The lit() function will insert constant values to all the rows.
Select table by using select() method and pass the arguments first one is the column name, or “*” for selecting the whole table and second argument pass the lit() function with constant values.
Method 2: Using SQL clause
In this method first, we have to create the temp view of the table with the help of createTempView we can create the temporary view. The life of this temp is up to the life of the sparkSession. CreateOrReplace will create the temp table if it is not available or if it is available then replace it.
Then after creating the table select the table by SQL clause which will take all the values as a string
Method 3: Using UDF(User-defined Functions) Method
This function allows us to create the new function as per our requirements that’s why this is also called a user-defined function. Now we define the datatype of the UDF function and create the functions which will return the values in the form of a new column
Attention geek! Strengthen your foundations with the Python Programming Foundation Course and learn the basics.
To begin with, your interview preparations Enhance your Data Structures concepts with the Python DS Course. And to begin with your Machine Learning Journey, join the Machine Learning – Basic Level Course