A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). It can be converted to a NumPy ndarray with the DataFrame.to_numpy() method. In this article, we will see how to convert a DataFrame to a NumPy array.
Syntax of Pandas DataFrame.to_numpy()
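Syntax: DataFrame.to_numpy(dtype=None, copy=False, na_value=<no_default>)
Parameters:
- dtype: [str or numpy.dtype, optional] The data type of the returned array, for example str or 'float32'. If omitted, pandas picks a common type that can hold the values of every column.
- copy: [bool, default False] Whether to ensure that the returned value is not a view on another array. copy=False does not guarantee a no-copy conversion; it only means that no extra copy is forced.
- na_value: [optional] The value to use for missing values. The default depends on dtype and on the column types of the DataFrame.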
Returns: numpy.ndarray
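As a rough illustration of the dtype and copy parameters, the sketch below uses a small made-up DataFrame (the column names and values are assumptions chosen for the example). It requests a float64 array through dtype, then shows that an array obtained with copy=True can be edited without affecting the DataFrame.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# dtype: ask for a specific element type in the result
as_float = df.to_numpy(dtype="float64")
print(as_float.dtype)  # float64

# copy=True: the result owns its data, so editing it leaves df untouched
copied = df.to_numpy(copy=True)
copied[0, 0] = 99
print(df.loc[0, "a"])  # 1
print(np.shares_memory(copied, df.to_numpy()))  # False
```

Whether copy=False actually returns a view depends on the DataFrame's internal layout and the pandas version, so the sketch only demonstrates the copy=True guarantee.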
Convert DataFrame to Numpy Array
Here, we convert an entire DataFrame to a NumPy array.
Python3
import pandas as pd

# Build a small all-integer DataFrame
df = pd.DataFrame(
    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]],
    columns=['a', 'b', 'c'])

# Convert the whole DataFrame to a 2-D NumPy array
arr = df.to_numpy()
print('\nNumpy Array\n----------\n', arr)
print(type(arr))
Output:
Numpy Array
----------
 [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]
<class 'numpy.ndarray'>
Here we convert only specific columns (a and c) into a NumPy array.
Python3
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]],
    columns=['a', 'b', 'c'])

# Select columns 'a' and 'c' with a list of names, then convert
arr = df[['a', 'c']].to_numpy()
print('\nNumpy Array\n----------\n', arr)
print(type(arr))
Output:
Numpy Array
----------
 [[ 1  3]
 [ 4  6]
 [ 7  9]
 [10 12]]
<class 'numpy.ndarray'>
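One related detail: selecting with a list of column names keeps the result two-dimensional, while selecting a single column by name yields a Series, whose to_numpy() returns a one-dimensional array. A minimal sketch with the same sample values:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]],
    columns=["a", "b", "c"],
)

# Single label -> Series -> 1-D array
one_d = df["a"].to_numpy()
# List with one label -> DataFrame -> 2-D array with one column
two_d = df[["a"]].to_numpy()

print(one_d.shape)  # (4,)
print(two_d.shape)  # (4, 1)
```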
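Here we are converting a DataFrame whose columns hold different data types (integers and floats). A NumPy array has a single dtype, so all values are upcast to float64.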
Python3
import pandas as pd

# Column 'a' holds integers; 'b' and 'c' each contain a float
df = pd.DataFrame(
    [[1, 2, 3],
     [4, 5, 6.5],
     [7, 8.5, 9],
     [10, 11, 12]],
    columns=['a', 'b', 'c'])

arr = df.to_numpy()
print('Numpy Array', arr)
print('Numpy Array Datatype :', arr.dtype)
Output:
Numpy Array [[ 1.   2.   3. ]
 [ 4.   5.   6.5]
 [ 7.   8.5  9. ]
 [10.  11.  12. ]]
Numpy Array Datatype : float64
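When a DataFrame mixes numbers with text, no numeric type can hold every value, and the resulting array falls back to the generic object dtype. The sketch below illustrates this with made-up columns (name, score, and ratio are assumptions for the example).

```python
import pandas as pd

# Hypothetical data mixing text, integers, and floats
df = pd.DataFrame({"name": ["x", "y"], "score": [1, 2], "ratio": [0.5, 1.5]})

# Numeric columns only: integers and floats share float64
numeric = df[["score", "ratio"]].to_numpy()
# All columns: text forces the common type to object
mixed = df.to_numpy()

print(numeric.dtype)  # float64
print(mixed.dtype)    # object
```

Object arrays lose most of NumPy's speed advantages, so it is usually worth selecting the numeric columns before converting.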
The examples below read their data from a CSV file named nba.csv.
Example 1:
Here, we read a CSV file into a DataFrame, drop the rows that contain missing values, and take the first five values of the Weight column with head(). We then convert that result to a NumPy array with DataFrame.to_numpy() and print it.
Python3
import pandas as pd

data = pd.read_csv("nba.csv")
# Drop rows that contain any missing value
data.dropna(inplace=True)

# Keep the first five Weight values as a one-column DataFrame
df = pd.DataFrame(data['Weight'].head())
print(df.to_numpy())
Output:
[[180.]
[235.]
[185.]
[235.]
[238.]]
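If nba.csv is not at hand, the same steps can be tried on an in-memory stand-in. The sketch below assumes a tiny CSV with Name and Weight columns; the rows are invented for illustration and are not taken from the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for nba.csv (invented rows, one missing Weight)
csv_text = """Name,Weight
Player A,180
Player B,
Player C,235
Player D,185
"""

data = pd.read_csv(io.StringIO(csv_text))
# dropna() returns a new DataFrame, leaving the original untouched
clean = data.dropna()

# Double brackets keep a one-column DataFrame, so the array is 2-D
arr = clean[["Weight"]].head().to_numpy()
print(arr.shape)  # (3, 1)
print(arr.dtype)  # float64
```

The Weight column is read as float64 because the missing entry is stored as NaN, which is why the values print with a trailing dot.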
Example 2:
In this example, we use the same code but pass the dtype parameter to set the data type of the resulting array. The printed values look the same as before; only the element type changes.
Python3
import pandas as pd

data = pd.read_csv("nba.csv")
data.dropna(inplace=True)

df = pd.DataFrame(data['Weight'].head())
# Request 32-bit floats instead of the default float64
print(df.to_numpy(dtype='float32'))
Output:
[[180.]
[235.]
[185.]
[235.]
[238.]]
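Since the printed values are identical for float64 and float32, the effect of dtype is easier to see by inspecting the array itself. The sketch below builds a DataFrame directly from the five Weight values shown in the output (rather than reading nba.csv) and compares the dtype and memory size of both conversions.

```python
import pandas as pd

# Stand-in for the first five Weight values shown in the output
df = pd.DataFrame({"Weight": [180.0, 235.0, 185.0, 235.0, 238.0]})

default_arr = df.to_numpy()
small_arr = df.to_numpy(dtype="float32")

# Five elements at 8 bytes each versus 4 bytes each
print(default_arr.dtype, default_arr.nbytes)  # float64 40
print(small_arr.dtype, small_arr.nbytes)      # float32 20
```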
Example 3:
Validating the type of the array after conversion.
Python3
import pandas as pd

data = pd.read_csv("nba.csv")
data.dropna(inplace=True)

df = pd.DataFrame(data['Weight'].head())
# The converted object should be a numpy.ndarray
print(type(df.to_numpy()))
Output:
<class 'numpy.ndarray'>
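In a script, the same validation is often written as a check rather than a print. A minimal sketch, assuming a small stand-in DataFrame instead of nba.csv, that also inspects the array's dimensions and dtype:

```python
import numpy as np
import pandas as pd

# Stand-in data; the real values would come from nba.csv
df = pd.DataFrame({"Weight": [180.0, 235.0, 185.0]})
arr = df.to_numpy()

# isinstance is the usual way to test the type programmatically
print(isinstance(arr, np.ndarray))     # True
print(arr.ndim, arr.shape, arr.dtype)  # 2 (3, 1) float64
```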