Converting Row into list RDD in PySpark

  • Last Updated : 18 Jul, 2021

In this article, we are going to convert Row objects into a list RDD in PySpark, so that each Row in the RDD becomes a plain Python list.

Creating RDD from Row for demonstration:


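# import Row and SparkSession
from pyspark.sql import SparkSession, Row

# create sparksession
spark = SparkSession.builder.appName('').getOrCreate()

# create student data with Row function
data = [Row(name="sravan kumar", subjects=["Java", "python", "C++"], state="AP"),
        Row(name="Ojaswi", lang=["Spark", "Java", "C++"], state="Telangana"),
        Row(name="rohith", subjects=["DS", "PHP", ".net"], state="AP"),
        Row(name="bobby", lang=["Python", "C", "sql"], state="Delhi"),
        Row(name="rohith", lang=["CSharp", "VB"], state="Telangana")]

# create an RDD from the list of Row objects
rdd = spark.sparkContext.parallelize(data)

# display actual rdd
print(rdd.collect())

Output:
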
[Row(name='sravan kumar', subjects=['Java', 'python', 'C++'], state='AP'),
Row(name='Ojaswi', lang=['Spark', 'Java', 'C++'], state='Telangana'),
Row(name='rohith', subjects=['DS', 'PHP', '.net'], state='AP'),
Row(name='bobby', lang=['Python', 'C', 'sql'], state='Delhi'),
Row(name='rohith', lang=['CSharp', 'VB'], state='Telangana')]

Using the map() function, we can convert each Row in the RDD into a list, which gives a list RDD.

Syntax: rdd_data.map(list)

where rdd_data is the data of type RDD.

Finally, we can use the collect() method to display the data in the list RDD.


# convert rdd to list by using map() method
b = rdd.map(list)

# display the data in b with collect method
for i in b.collect():
    print(i)

Output:

['sravan kumar', ['Java', 'python', 'C++'], 'AP']
['Ojaswi', ['Spark', 'Java', 'C++'], 'Telangana']
['rohith', ['DS', 'PHP', '.net'], 'AP']
['bobby', ['Python', 'C', 'sql'], 'Delhi']
['rohith', ['CSharp', 'VB'], 'Telangana']
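
The conversion works because a PySpark Row is a subclass of Python's tuple, so the built-in list() can turn it into a list of its values. The sketch below illustrates the same idea in plain Python, without a Spark session. It uses collections.namedtuple as a stand-in for Row; the Student type and the rows and as_lists names are assumptions made for illustration and are not part of PySpark.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.Row: a namedtuple is also a tuple subclass
Student = namedtuple("Student", ["name", "subjects", "state"])

rows = [
    Student("sravan kumar", ["Java", "python", "C++"], "AP"),
    Student("rohith", ["DS", "PHP", ".net"], "AP"),
]

# Plain-Python analogue of rdd.map(list): apply list() to every record
as_lists = list(map(list, rows))

for item in as_lists:
    print(item)
```

Each record keeps the order of its fields, and nested values such as the subjects list are carried over unchanged, which matches the shape of the list RDD output shown above.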
