Python | Pandas Index.factorize()
Python is a great language for doing data analysis, primarily because of its ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
The Index.factorize() function encodes the object as an enumerated type or categorical variable. This method is useful for obtaining a numeric representation of an array when all that matters is identifying distinct values. factorize is available both as a top-level function, pandas.factorize(), and as the methods Series.factorize() and Index.factorize().
Syntax: Index.factorize(sort=False, na_sentinel=-1)
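sort : If True, sort the unique values and reorder the codes so that each code still points to the same value.
na_sentinel : Value used in the codes to mark missing values (default -1). This parameter was deprecated in pandas 1.5 and removed in pandas 2.0 in favor of use_na_sentinel.
Returns : A tuple (codes, uniques). codes is an integer ndarray that is an indexer into uniques, and uniques holds the distinct values. When there are no missing values, uniques.take(codes) has the same values as the original Index.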
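The relationship between the two returned objects can be checked directly. The following is a minimal sketch, assuming small made-up Index objects (the names idx and na_idx and their values are illustrative, not from the original article) and using only the default arguments:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
idx = pd.Index(["b", "a", "b", "c"])

# factorize() returns a pair: integer codes and the distinct values.
codes, uniques = idx.factorize()

# Each code is a position in uniques, so take() rebuilds the original values.
restored = uniques.take(codes)

# With the defaults, a missing value is marked with -1 in the codes
# and is left out of the uniques.
na_idx = pd.Index([1.0, np.nan, 2.0, 1.0])
na_codes, na_uniques = na_idx.factorize()

print(codes, list(uniques))
print(list(restored))
print(na_codes, list(na_uniques))
```

Note that the round trip through take() only holds when no code is -1, since a negative position would be read as "count from the end" rather than "missing".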
Example #1: Use the Index.factorize() function to encode the given Index values into categorical form.
Let’s factorize the given Index.
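The original listing for this example is not available here, so the following is a minimal sketch of the call, assuming a hypothetical Index of dog-breed labels (the values are made up for illustration):

```python
import pandas as pd

# Hypothetical Index with repeated labels (illustrative data only).
idx = pd.Index(["Labrador", "Beagle", "Labrador", "Lhasa", "Husky", "Beagle"])

# Without sorting, codes are assigned in order of first appearance.
codes, uniques = idx.factorize()

print(codes)          # integer codes: 0, 1, 0, 2, 3, 1
print(list(uniques))  # ['Labrador', 'Beagle', 'Lhasa', 'Husky']
```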
As we can see in the output, the Index.factorize() function has treated each distinct label in the Index as a category and has assigned it an integer code, numbered in order of first appearance.
Example #2: Use the Index.factorize() function to factorize the index values based on their sorted order.
Let’s factorize it based on sorted order by passing sort=True. The unique values are sorted first, and the integer codes are then assigned according to that sorted order.
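As a minimal sketch of the sorted variant, again assuming a hypothetical Index of dog-breed labels rather than the article's original data:

```python
import pandas as pd

# Hypothetical Index with repeated labels (illustrative data only).
idx = pd.Index(["Labrador", "Beagle", "Labrador", "Lhasa", "Husky", "Beagle"])

# sort=True sorts the unique values, then numbers them in that order.
codes, uniques = idx.factorize(sort=True)

print(list(uniques))  # ['Beagle', 'Husky', 'Labrador', 'Lhasa']
print(codes)          # integer codes: 2, 0, 2, 3, 1, 0
```

Here "Beagle" receives code 0 because it sorts first alphabetically, even though "Labrador" appears first in the Index.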
As we can see in the output, sorting has been performed on the Index values before assigning them numerical values.