A Series is a one-dimensional labeled array in Pandas that can hold integers, strings, floating-point numbers, and other Python objects. Each value is paired with an index label; by default the index runs from 0 to n-1, where n is the number of values in the Series. Later in this article we discuss DataFrames, but first we need to understand the main difference between a Series and a DataFrame.
A Series holds a single column of values with an index, whereas a DataFrame is made up of one or more Series that share an index. In other words, a DataFrame is a collection of Series that can be used to analyze data.
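To make the distinction concrete, here is a minimal sketch. The variable names `authors`, `df`, and `column` are chosen for illustration only; the sample values mirror the ones used later in this article.

```python
import pandas as pd

# A Series: one column of values plus an index of default integer labels
authors = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
print(authors.index.tolist())

# A DataFrame: several Series sharing one index, one Series per column
df = pd.DataFrame({'Author': authors,
                   'Article': pd.Series([210, 211, 114, 178])})

# Selecting a single column gives back a Series
column = df['Author']
print(type(column))
print(df.shape)
```

Selecting one column and getting a Series back is the practical sense in which a DataFrame is a collection of Series.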
Creating Pandas DataFrames from Series
Python3
import pandas as pd
# Build a Series from a plain Python list
author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']
auth_series = pd.Series(author)
print(auth_series)
Output:
0 Jitender
1 Purnima
2 Arpit
3 Jyoti
dtype: object
Let’s check the type of Series:
Output:
<class 'pandas.core.series.Series'>
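The snippet that produced this output is not shown above. A likely reconstruction, assuming the same `auth_series` as in the first example, is a call to the built-in `type()`:

```python
import pandas as pd

author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']
auth_series = pd.Series(author)

# type() reports the class of the object returned by pd.Series()
print(type(auth_series))
```

The exact module path printed may differ between Pandas versions, so `isinstance(auth_series, pd.Series)` is a more robust check inside code.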
Create DataFrame From Multiple Series
We create two lists, 'author' and 'article', and pass them to pd.Series() to create two Series. We then build a dictionary whose values are the Series objects; the keys of the dictionary become the column names of the DataFrame.
Python3
import pandas as pd
author = ['Jitender', 'Purnima', 'Arpit', 'Jyoti']
article = [210, 211, 114, 178]
# One Series per list
auth_series = pd.Series(author)
article_series = pd.Series(article)
# Dictionary keys become the column names
frame = {'Author': auth_series,
         'Article': article_series}
result = pd.DataFrame(frame)
print(result)
Output:
Author Article
0 Jitender 210
1 Purnima 211
2 Arpit 114
3 Jyoti 178
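The dictionary is one way to combine Series into a DataFrame. As an alternative sketch (not the approach used in this article), `pd.concat` with `axis=1` can place named Series side by side. The `name=` arguments here are an addition for illustration.

```python
import pandas as pd

auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'], name='Author')
article_series = pd.Series([210, 211, 114, 178], name='Article')

# axis=1 places the Series side by side; each Series name becomes a column label
result = pd.concat([auth_series, article_series], axis=1)
print(result)
```

Both approaches align the Series on their index, so the result should match the dictionary version when the indexes are identical.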
Add a Column in Pandas Dataframe
We create one more Series holding the ages of the authors and add it to the DataFrame directly as a new column.
Python3
import pandas as pd
auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
article_series = pd.Series([210, 211, 114, 178])
frame = {'Author': auth_series,
         'Article': article_series}
result = pd.DataFrame(frame)
age = [21, 21, 24, 23]
# Assigning to a new column label adds the column
result['Age'] = pd.Series(age)
print(result)
Output:
Author Article Age
0 Jitender 210 21
1 Purnima 211 21
2 Arpit 114 24
3 Jyoti 178 23
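One detail worth seeing in isolation: when the new column is a Series, Pandas matches values to rows by index label, not by position. The sketch below uses the same ages as above but deliberately lists them in reverse order with reversed labels; the reordering is an illustrative assumption, not part of the original example.

```python
import pandas as pd

result = pd.DataFrame({'Author': ['Jitender', 'Purnima', 'Arpit', 'Jyoti'],
                       'Article': [210, 211, 114, 178]})

# Same ages, listed in reverse order with matching reversed labels
age = pd.Series([23, 24, 21, 21], index=[3, 2, 1, 0])

# Assignment matches on index labels, so label 0 still receives 21
result['Age'] = age
print(result)
```

Because of this alignment, the resulting Age column is the same as in the example above even though the values were supplied in a different order.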
Missing value in Pandas Dataframe
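Remember that if a value is missing, Pandas fills it with NaN (its marker for null) by default. Here the age list has only three values, so the fourth row gets NaN, and the Age column is displayed as floats because NaN is a floating-point value.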
Python3
import pandas as pd
auth_series = pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'])
article_series = pd.Series([210, 211, 114, 178])
frame = {'Author': auth_series,
         'Article': article_series}
result = pd.DataFrame(frame)
# Only three ages: there is no value for the fourth row
age = [21, 21, 24]
result['Age'] = pd.Series(age)
print(result)
Output:
Author Article Age
0 Jitender 210 21.0
1 Purnima 211 21.0
2 Arpit 114 24.0
3 Jyoti 178 NaN
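To inspect or handle the missing value, a sketch along these lines may help. The placeholder `-1` and the name `filled` are arbitrary choices for illustration, not a recommendation from this article.

```python
import pandas as pd

result = pd.DataFrame({'Author': ['Jitender', 'Purnima', 'Arpit', 'Jyoti'],
                       'Article': [210, 211, 114, 178]})
result['Age'] = pd.Series([21, 21, 24])   # no value for label 3

# NaN is a float, so the whole column is stored as a float dtype
print(result['Age'].dtype)

# Count the missing entries per column
print(result.isna().sum())

# One possible treatment: fill the gap with a placeholder and restore integers
filled = result['Age'].fillna(-1).astype(int)
print(filled.tolist())
```

Whether to fill, drop, or keep NaN depends on the analysis; `fillna` is only one of the options Pandas offers.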
Creating a Dataframe using a dictionary of Series
Here we build a dictionary whose values are Series and pass it to pd.DataFrame(). When a DataFrame is created from a dictionary this way, the keys become the column names and each Series supplies the values of its column, with the Series index providing the row labels.
Python3
import pandas as pd
# Each key is a column name; each Series holds that column's values
dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
         'Author_Book_No': pd.Series([210, 211, 114, 178]),
         'Age': pd.Series([21, 21, 24, 23])}
df = pd.DataFrame(dict1)
print(df)
Output:
Auth_Name Author_Book_No Age
0 Jitender 210 21
1 Purnima 211 21
2 Arpit 114 24
3 Jyoti 178 23
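In the example above all three Series share the same default index. When the Series carry different labels, the DataFrame should contain every label that appears in any of them, with NaN where a Series has no value. The data and the names `s1` and `s2` below are illustrative assumptions.

```python
import pandas as pd

# Two Series with partly different labels (illustrative data)
s1 = pd.Series([10, 20], index=['x', 'y'])
s2 = pd.Series([1.5, 2.5], index=['y', 'z'])

# Rows are matched by label; labels missing from a Series yield NaN
df = pd.DataFrame({'A': s1, 'B': s2})
print(df)
```

Only the row labeled 'y' has a value in both columns, since it is the only label the two Series have in common.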
Explicit Indexing in Pandas Dataframe
Here we pass an explicit index to the DataFrame, and every cell comes out as NaN. Each Series was created with its own default index (0, 1, 2, 3), and the DataFrame looks up the labels 'SNo1' to 'SNo4' in each Series. Since none of those labels exist in any of the Series, no value matches and every cell is NaN.
Python3
import pandas as pd
# Each Series keeps its default index 0, 1, 2, 3
dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
         'Author_Book_No': pd.Series([210, 211, 114, 178]),
         'Age': pd.Series([21, 21, 24, 23])}
# These labels do not exist in any of the Series
df = pd.DataFrame(dict1, index=['SNo1', 'SNo2', 'SNo3', 'SNo4'])
print(df)
Output:
Auth_Name Author_Book_No Age
SNo1 NaN NaN NaN
SNo2 NaN NaN NaN
SNo3 NaN NaN NaN
SNo4 NaN NaN NaN
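The same effect can be reproduced on a single Series with `reindex`, which behaves much like what the DataFrame constructor does to each Series when an index is supplied. This is a sketch for illustration; the names `ages` and `labels` are not from the original example.

```python
import pandas as pd

ages = pd.Series([21, 21, 24, 23])          # default labels 0, 1, 2, 3
labels = ['SNo1', 'SNo2', 'SNo3', 'SNo4']

# reindex looks up each requested label; none exist, so every slot is NaN
print(ages.reindex(labels))

# Count how many of the Series labels appear among the requested labels
print(ages.index.isin(labels).sum())
```

A quick check like the `isin` call can help spot a label mismatch before building the DataFrame.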
We can fix this by giving every Series the same index labels that we pass to the DataFrame.
Python3
import pandas as pd
# Every Series now carries the labels SNo1 to SNo4
dict1 = {'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti'],
                                index=['SNo1', 'SNo2', 'SNo3', 'SNo4']),
         'Author_Book_No': pd.Series([210, 211, 114, 178],
                                     index=['SNo1', 'SNo2', 'SNo3', 'SNo4']),
         'Age': pd.Series([21, 21, 24, 23],
                          index=['SNo1', 'SNo2', 'SNo3', 'SNo4'])}
# The DataFrame index matches the labels in each Series
df = pd.DataFrame(dict1, index=['SNo1', 'SNo2', 'SNo3', 'SNo4'])
print(df)
Output:
Auth_Name Author_Book_No Age
SNo1 Jitender 210 21
SNo2 Purnima 211 21
SNo3 Arpit 114 24
SNo4 Jyoti 178 23
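Repeating the index on every Series is verbose. A shorter alternative, offered here as a sketch rather than the method this article uses, is to build the DataFrame with the default index and then replace the row labels in one step.

```python
import pandas as pd

df = pd.DataFrame({'Auth_Name': pd.Series(['Jitender', 'Purnima', 'Arpit', 'Jyoti']),
                   'Author_Book_No': pd.Series([210, 211, 114, 178]),
                   'Age': pd.Series([21, 21, 24, 23])})

# Replace the default labels; the list length must equal the number of rows
df.index = ['SNo1', 'SNo2', 'SNo3', 'SNo4']
print(df)
```

This relabels rows by position, so it is only appropriate when the rows are already in the intended order.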