How to Convert a Dataframe Column to Numpy Array
Last Updated: 03 Feb, 2024
NumPy and Pandas are two powerful libraries in the Python ecosystem for data manipulation and analysis. Converting a DataFrame column to a NumPy array is a common operation when you need to perform array-based operations on the data. This article walks through three ways to do it: the values attribute, the to_numpy() method, and NumPy's asarray() function.
Prerequisites
- NumPy Arrays: NumPy arrays are the core data structure in the NumPy library. They provide a way to store and manipulate numerical data efficiently. Converting a DataFrame column to a NumPy array allows you to leverage the array’s functionality for various mathematical operations.
- DataFrame Column Selection: In Pandas, selecting a single column of a DataFrame by its label, as in df['column_name'], returns a Pandas Series. Converting this Series to a NumPy array is a straightforward process.
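To make the column-selection point concrete, here is a minimal sketch. The frame `demo` and its columns `a` and `b` are hypothetical and exist only for this illustration. It contrasts single-bracket selection, which yields a Series, with double-bracket selection, which yields a one-column DataFrame.

```python
import pandas as pd

# Hypothetical two-column frame, used only for illustration.
demo = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

single = demo["a"]      # single brackets: a 1-D Series
double = demo[["a"]]    # double brackets: a one-column DataFrame

print(type(single).__name__, single.shape)
print(type(double).__name__, double.shape)

# The difference carries over to the converted arrays.
print(single.to_numpy().shape, double.to_numpy().shape)
```

The Series converts to a one-dimensional array of shape (3,), while the one-column DataFrame converts to a two-dimensional array of shape (3, 1). The examples in this article use the single-bracket form.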
Step 1: Creating a sample dataset for demonstration:
Python
import pandas as pd
import numpy as np
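# Fix the seed so the random values are reproducible
np.random.seed(42)
# Draw 5 integers from [0, 100); a lower bound of 0 matches the output shown below
df = pd.DataFrame({'Numeric_Column': np.random.randint(0, 100, 5)})
print("Original DataFrame:")
print(df)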
Output:
Original DataFrame:
Numeric_Column
0 51
1 92
2 14
3 71
4 60
Step 2: Converting the column to a NumPy array
a. Using the values Attribute:
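The values attribute in Pandas returns the underlying data of a Series, which for a plain numeric column like this one is a NumPy array. This is a simple and direct way to convert a DataFrame column to a NumPy array. For some extension dtypes (for example, categorical), values can return a pandas array type rather than a NumPy array, which is why the pandas documentation recommends to_numpy() for new code.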
Python
numpy_array_values = df['Numeric_Column'].values
print(numpy_array_values)
Output: numpy_array_values is a NumPy array containing the values from the 'Numeric_Column' of the DataFrame df.
[51 92 14 71 60]
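As a quick check of what the conversion produces, the sketch below inspects the result and applies a few array operations. It assumes a frame named `df_check` (a hypothetical name) that holds the same five values written out literally, so it runs on its own.

```python
import numpy as np
import pandas as pd

# Same values as the sample column, written out literally.
df_check = pd.DataFrame({"Numeric_Column": [51, 92, 14, 71, 60]})
arr = df_check["Numeric_Column"].values

print(type(arr))             # the NumPy ndarray class
print(arr.ndim, arr.shape)   # one dimension, five elements
print(arr.dtype)             # an integer dtype

# Typical array-based operations now work directly.
print(arr.mean(), arr.max())
print(arr * 2)               # element-wise arithmetic
```

Once the column is an ndarray, reductions such as mean() and max() and element-wise arithmetic are available without going through Pandas.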
b. to_numpy() method:
The to_numpy() method in Pandas converts a DataFrame or Series to a NumPy array. It accepts optional dtype, copy, and na_value arguments that control the element type of the result, whether a copy is forced, and which value stands in for missing data.
Python
numpy_array_to_numpy = df['Numeric_Column'].to_numpy()
print(numpy_array_to_numpy)
Output:
The output shows numpy_array_to_numpy now holds the NumPy array representation of the ‘Numeric_Column’.
[51 92 14 71 60]
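The optional arguments of to_numpy() can be sketched as follows. The Series names `ints` and `nullable` and their contents are assumptions made for illustration; the nullable column uses the pandas "Int64" extension dtype so that it can hold a missing value.

```python
import numpy as np
import pandas as pd

ints = pd.Series([51, 92, 14, 71, 60], name="Numeric_Column")

# dtype: cast to floating point while converting.
as_float = ints.to_numpy(dtype="float64")

# na_value: choose what replaces missing entries (hypothetical data).
nullable = pd.Series([1, None, 3], dtype="Int64")
filled = nullable.to_numpy(dtype="float64", na_value=np.nan)

# copy=True: the result is independent of the Series,
# so modifying it leaves the original data untouched.
independent = ints.to_numpy(copy=True)
independent[0] = -1

print(as_float.dtype)
print(filled)
print(ints.iloc[0])
```

Requesting a copy is a reasonable default when the array will be modified in place, since without it the array may share memory with the DataFrame.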
c. asarray() function:
The asarray() function in NumPy converts the input to an array. It can be applied to a Pandas Series to convert it into a NumPy array.
Python
numpy_array_asarray = np.asarray(df['Numeric_Column'])
print("NumPy Array using asarray() function:", numpy_array_asarray, sep="\n")
Output: numpy_array_asarray holds the NumPy array representation of the 'Numeric_Column' obtained through the asarray() function.
NumPy Array using asarray() function:
[51 92 14 71 60]
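As a closing check, the sketch below runs the three approaches side by side on one column and confirms they yield equal arrays. The Series `col` is a stand-in for the sample column, with its values written out literally so the snippet is self-contained.

```python
import numpy as np
import pandas as pd

col = pd.Series([51, 92, 14, 71, 60], name="Numeric_Column")

via_values = col.values
via_to_numpy = col.to_numpy()
via_asarray = np.asarray(col)

# All three results hold the same elements in the same order.
same = (np.array_equal(via_values, via_to_numpy)
        and np.array_equal(via_to_numpy, via_asarray))
print(same)

# np.asarray also accepts a dtype, used here to request floats.
as_float = np.asarray(col, dtype=np.float64)
print(as_float.dtype)
```

For a plain numeric column the choice among the three is mostly a matter of style; to_numpy() is the one that exposes Pandas-specific options such as na_value.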