Reindexing in Pandas DataFrame
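Reindexing in Pandas changes the row labels, the column labels, or both, of a DataFrame so that they conform to a new set of labels. The new labels can be given as a list or as an Index object taken from another Pandas Series or DataFrame. Data is matched by label: rows and columns whose labels appear in the new index keep their values, and labels with no match are filled with NaN. Let's see how to reindex the rows and columns of a Pandas DataFrame.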
Reindexing the Rows
One can reindex a single row or multiple rows by using the reindex() method. Labels in the new index that are not present in the DataFrame are assigned NaN by default. reindex() returns a new DataFrame and leaves the original unchanged.
Example #1:
Python3
# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5), columns=column, index=index)

print(df1)

# reorder the existing rows
print('\n\nDataframe after reindexing rows: \n', df1.reindex(['B', 'D', 'A', 'C', 'E']))
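Output: the values are random and differ on each run. The second table contains the same rows as the first, in the order B, D, A, C, E.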
Example #2:
Python3
# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5), columns=column, index=index)

# create the new index for rows
new_index = ['U', 'A', 'B', 'C', 'Z']

print(df1.reindex(new_index))
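Output: rows A, B, and C keep their values, the new rows U and Z contain NaN in every column, and rows D and E are dropped because they are not in the new index.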
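Because the examples above use random values, the effect of reindexing rows can be easier to check on a small fixed table. The following is a minimal sketch using a hypothetical DataFrame named scores (not part of the original examples). It illustrates that reindex() returns a new object, keeps rows whose labels match, drops rows that are not requested, and fills unmatched labels with NaN.

```python
import pandas as pd

# Hypothetical fixed data, chosen so the result is reproducible
scores = pd.DataFrame(
    {"a": [1, 2, 3], "b": [10, 20, 30]},
    index=["A", "B", "C"],
)

# 'Z' is not in the original index; 'B' is left out on purpose
result = scores.reindex(["C", "A", "Z"])

print(result)                 # rows C and A keep their values, row Z is all NaN
print(scores.index.tolist())  # the original DataFrame still has rows A, B, C
```

In this sketch the integer columns come back as float64, since NaN is a floating-point value and the columns must be able to hold it.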
Reindexing the columns using axis keyword
One can reindex a single column or multiple columns by using the reindex() method and specifying the axis to reindex with axis='columns'. Column labels in the new index that are not present in the DataFrame are assigned NaN by default.
Example #1:
Python3
# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5), columns=column, index=index)

# create the new index for columns
column = ['e', 'a', 'b', 'c', 'd']

print(df1.reindex(column, axis='columns'))
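Output: the values are random and differ on each run. The columns appear in the order e, a, b, c, d, each with its original values.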
Example #2:
Python3
# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5), columns=column, index=index)

# create the new index for columns
column = ['a', 'b', 'c', 'g', 'h']

print(df1.reindex(column, axis='columns'))
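Output: columns a, b, and c keep their values, the new columns g and h contain NaN in every row, and columns d and e are dropped because they are not in the new index.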
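As an alternative to axis='columns', reindex() also accepts the index and columns keywords, which makes it possible to reindex both axes in a single call. The following is a minimal sketch on a hypothetical fixed DataFrame named df (not part of the original examples).

```python
import pandas as pd

# Hypothetical fixed data, chosen so the result is reproducible
df = pd.DataFrame(
    {"a": [1, 2], "b": [3, 4], "c": [5, 6]},
    index=["A", "B"],
)

# Two equivalent ways to reindex the columns
by_axis = df.reindex(["c", "a", "g"], axis="columns")
by_keyword = df.reindex(columns=["c", "a", "g"])
print(by_axis.equals(by_keyword))  # True: both calls give the same result

# Reindex rows and columns in one call
both = df.reindex(index=["B", "A"], columns=["c", "a"])
print(both)
```

The keyword form tends to read more clearly when both axes change, since each list is named by the axis it applies to.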
Replacing the missing values
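Code #1: Values for the labels that reindexing introduces can be filled by passing a value to the fill_value keyword, which is used in place of the default NaN. Note that fill_value applies only to the newly added labels; NaN values already present in the data are left as they are.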
Python3
# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5), columns=column, index=index)

# create the new index for columns
column = ['a', 'b', 'c', 'g', 'h']

print(df1.reindex(column, axis='columns', fill_value=1.5))
Output: columns a, b, and c keep their values, and the new columns g and h contain 1.5 in every row.
Code #2: Replacing the missing data with a string.
Python3
# import numpy and pandas module
import pandas as pd
import numpy as np

column = ['a', 'b', 'c', 'd', 'e']
index = ['A', 'B', 'C', 'D', 'E']

# create a DataFrame of random values
df1 = pd.DataFrame(np.random.rand(5, 5), columns=column, index=index)

# create the new index for columns
column = ['a', 'b', 'c', 'g', 'h']

print(df1.reindex(column, axis='columns', fill_value='data missing'))
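Output: columns a, b, and c keep their values, and the new columns g and h contain the string 'data missing' in every row.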
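One point that is easy to miss is that fill_value only affects the labels that reindexing adds. The sketch below uses a hypothetical DataFrame named df that already contains a NaN (an assumption made for illustration, not part of the original examples) and shows that this existing NaN survives the reindex. A separate fillna() call is one way to replace it afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one NaN already present at row 'B', column 'a'
df = pd.DataFrame(
    {"a": [1.0, np.nan], "b": [3.0, 4.0]},
    index=["A", "B"],
)

# 'g' is a new column, so it is filled with 0
filled = df.reindex(["a", "b", "g"], axis="columns", fill_value=0)
print(filled)  # the NaN at ('B', 'a') is still there

# fillna() replaces the remaining NaN values as a separate step
fully_filled = filled.fillna(0)
print(fully_filled)
```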