NumPy is a Python library for numerical computing. It provides a robust multidimensional array object along with a variety of mathematical functions. In this article, we will go through the essential NumPy functions used in the descriptive analysis of an array. Let's start by initializing a sample array for our analysis.
The following code initializes a NumPy array:
Python3
import numpy as np

# Sample data used throughout the article
arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])
print(arr)
Output:
[4 5 8 5 6 4 9 2 4 3 6]
In order to describe our NumPy array, we need to find two types of statistics:
- Measures of central tendency.
- Measures of dispersion.
Measures of central tendency
The following methods are used to find measures of central tendency in NumPy:
- mean()- takes a NumPy array as an argument and returns the arithmetic mean of the data.
np.mean(arr)
- median()- takes a NumPy array as an argument and returns the median of the data.
np.median(arr)
The following example illustrates the usage of the mean() and median() methods.
Example:
Python3
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# Measures of central tendency
mean = np.mean(arr)
median = np.median(arr)

print("Array =", arr)
print("Mean =", mean)
print("Median =", median)
Output:
Array = [4 5 8 5 6 4 9 2 4 3 6]
Mean = 5.09090909091
Median = 5.0
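As a rough cross-check, both values can also be worked out by hand: the mean is the sum of the elements divided by their count, and the median is the middle value of the sorted data. The sketch below assumes the same sample array; the names manual_mean and manual_median are illustrative only.

```python
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# Mean: sum of the elements divided by the number of elements.
manual_mean = arr.sum() / arr.size

# Median: middle value of the sorted data (this array has an odd length, 11).
sorted_arr = np.sort(arr)
manual_median = sorted_arr[arr.size // 2]

print(np.isclose(manual_mean, np.mean(arr)))   # True
print(manual_median == np.median(arr))         # True
```

Note that this shortcut for the median only holds for arrays of odd length; for an even number of elements, np.median() returns the average of the two middle values.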
Measures of dispersion
The following methods are used to find measures of dispersion in NumPy:
- amin()- it takes a NumPy array as an argument and returns the minimum.
np.amin(arr)
- amax()- it takes a NumPy array as an argument and returns the maximum.
np.amax(arr)
- ptp()- it takes a NumPy array as an argument and returns the range of the data.
np.ptp(arr)
- var()- it takes a NumPy array as an argument and returns the variance of the data.
np.var(arr)
- std()- it takes a NumPy array as an argument and returns the standard deviation of the data.
np.std(arr)
Example: The following code illustrates amin(), amax(), ptp(), var() and std() methods.
Python3
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# Names such as min_value avoid shadowing the built-ins min(), max() and range().
min_value = np.amin(arr)
max_value = np.amax(arr)
data_range = np.ptp(arr)
variance = np.var(arr)
sd = np.std(arr)

print("Array =", arr)
print("Measures of Dispersion")
print("Minimum =", min_value)
print("Maximum =", max_value)
print("Range =", data_range)
print("Variance =", variance)
print("Standard Deviation =", sd)
Output:
Array = [4 5 8 5 6 4 9 2 4 3 6]
Measures of Dispersion
Minimum = 2
Maximum = 9
Range = 7
Variance = 3.90082644628
Standard Deviation = 1.9750509984
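These measures are related to one another, which the following sketch tries to make explicit: the range is the maximum minus the minimum, and the standard deviation is the square root of the variance. It also shows the ddof parameter of np.var(). By default (ddof=0) NumPy computes the population variance, dividing by N, while ddof=1 gives the sample variance, dividing by N - 1. The variable names are illustrative.

```python
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# Range is the difference between the maximum and the minimum.
print(np.ptp(arr) == np.amax(arr) - np.amin(arr))       # True

# Standard deviation is the square root of the variance.
print(np.isclose(np.std(arr), np.sqrt(np.var(arr))))    # True

# Population variance (default, ddof=0) divides by N;
# sample variance (ddof=1) divides by N - 1.
population_var = np.var(arr)
sample_var = np.var(arr, ddof=1)
print(population_var < sample_var)                      # True
```

The values reported in this article use the default, so they are population statistics. Pass ddof=1 to np.var() or np.std() when the array is a sample drawn from a larger population.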
Example: Now we can combine the above-mentioned examples to get a complete descriptive analysis of our array.
Python3
import numpy as np

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])

# Measures of central tendency
mean = np.mean(arr)
median = np.median(arr)

# Measures of dispersion (names avoid shadowing the built-ins min, max and range)
min_value = np.amin(arr)
max_value = np.amax(arr)
data_range = np.ptp(arr)
variance = np.var(arr)
sd = np.std(arr)

print("Descriptive analysis")
print("Array =", arr)
print("Measures of Central Tendency")
print("Mean =", mean)
print("Median =", median)
print("Measures of Dispersion")
print("Minimum =", min_value)
print("Maximum =", max_value)
print("Range =", data_range)
print("Variance =", variance)
print("Standard Deviation =", sd)
Output:
Descriptive analysis
Array = [4 5 8 5 6 4 9 2 4 3 6]
Measures of Central Tendency
Mean = 5.09090909091
Median = 5.0
Measures of Dispersion
Minimum = 2
Maximum = 9
Range = 7
Variance = 3.90082644628
Standard Deviation = 1.9750509984
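If the same analysis is needed for several arrays, the calls can be wrapped in a small helper. The following is only a sketch: describe() is a hypothetical function written for this article, not part of NumPy, and it assumes a non-empty one-dimensional array of numbers.

```python
import numpy as np

def describe(data):
    """Return common descriptive statistics of a 1-D array as a dict.

    Hypothetical helper for illustration; not a NumPy function.
    """
    data = np.asarray(data)
    return {
        "mean": np.mean(data),
        "median": np.median(data),
        "min": np.amin(data),
        "max": np.amax(data),
        "range": np.ptp(data),
        "variance": np.var(data),   # population variance (ddof=0)
        "std": np.std(data),        # population standard deviation
    }

arr = np.array([4, 5, 8, 5, 6, 4, 9, 2, 4, 3, 6])
stats = describe(arr)
for name, value in stats.items():
    print(name, "=", value)
```

Returning a dictionary keeps each statistic accessible by name, for example stats["median"], instead of relying on the order of printed lines.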