Expression for mean and variance in a running stream
Suppose we have a running stream of numbers $x_1, x_2, x_3, \dots, x_n$.
The mean and variance at any given point are defined as:
- Mean = $E(x) = u = \frac{1}{n}\sum_{i=1}^{n} x_i$
- Variance = $s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - u)^2$
- Standard deviation = $s = \sqrt{s^2}$
However, recomputing these expressions by looping through all the numbers each time a new number comes in is slow: every update costs O(n) time, and all previous numbers have to be stored.
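For comparison, the slow approach can be sketched as follows. This is only an illustration: the function name `naive_mean_variance` and the list `seen` are hypothetical names introduced here, and the input values are the same five numbers used in the sample run later in this article.

```python
def naive_mean_variance(values):
    """Recompute mean and population variance by looping over every value."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / n
    return mean, variance


seen = []  # every number received so far has to stay in memory
for x in [1, 2, 5, 4, 3]:
    seen.append(x)
    # Each call walks the whole list again: O(n) work per new number.
    mean, variance = naive_mean_variance(seen)
    print(x, mean, variance)
```

Both the running time per update and the memory used grow with the length of the stream, which is what the derivation that follows avoids.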
Efficient solution
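$$
\begin{aligned}
s^2 &= \frac{1}{n}\sum_{i=1}^{n} (x_i - u)^2 \\
    &= \frac{1}{n}\left(\sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} u^2 - 2u\sum_{i=1}^{n} x_i\right) \\
    &= \frac{1}{n}\left(\sum x_i^2 + n u^2 - 2u\sum x_i\right) \\
    &= \frac{\sum x_i^2}{n} + u^2 - 2u\,\frac{\sum x_i}{n} \\
    &= \frac{\sum x_i^2}{n} - u^2 \\
    &= E(x^2) - u^2 \\
    &= E(x^2) - [E(x)]^2
\end{aligned}
$$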
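As a quick check of the identity, here is a worked example on the first three numbers of the sample run used later in this article (1, 2, 5). Both the definition and the derived formula give the same value:

```latex
\begin{aligned}
n &= 3, \qquad \sum x_i = 1 + 2 + 5 = 8, \qquad \sum x_i^2 = 1 + 4 + 25 = 30 \\
u &= \frac{8}{3}, \qquad E(x^2) = \frac{30}{3} = 10 \\
E(x^2) - [E(x)]^2 &= 10 - \frac{64}{9} = \frac{26}{9} \approx 2.889 \\
\frac{1}{n}\sum (x_i - u)^2 &= \frac{1}{3}\left(\frac{25}{9} + \frac{4}{9} + \frac{49}{9}\right) = \frac{26}{9} \approx 2.889
\end{aligned}
```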
Therefore, the implementation only has to maintain three variables: sum, the sum of all the numbers seen so far (for the mean); sumsq, the sum of their squares (for $E(x^2)$); and n, the count of numbers seen so far.
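The same bookkeeping can also be wrapped in a small class, which makes it easy to test without keyboard input. This is a minimal sketch, not part of the original implementation: the class name `RunningStats` and its methods `push`, `mean` and `variance` are hypothetical, and it assumes at least one number has been pushed before `mean` or `variance` is called (otherwise a division by zero occurs).

```python
class RunningStats:
    """Hypothetical helper: keeps only n, the sum, and the sum of squares."""

    def __init__(self):
        self.n = 0      # count of numbers seen so far
        self.total = 0  # running sum, used for the mean
        self.sumsq = 0  # running sum of squares, used for E(x^2)

    def push(self, x):
        """Account for one new number in O(1) time."""
        self.n += 1
        self.total += x
        self.sumsq += x * x

    def mean(self):
        return self.total / self.n

    def variance(self):
        # Population variance: E(x^2) - [E(x)]^2
        m = self.mean()
        return self.sumsq / self.n - m * m


stats = RunningStats()
for x in [1, 2, 5, 4, 3]:
    stats.push(x)
print(stats.mean(), stats.variance())  # prints: 3.0 2.0
```

No list of past values is kept, so memory use stays constant no matter how long the stream runs.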
Python code for the implementation :
Python3
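# Running totals; all three are updated in O(1) time per new number.
sum = 0      # sum of all numbers seen so far (note: shadows the built-in sum)
sumsq = 0    # sum of the squares of all numbers seen so far
n = 0        # count of numbers seen so far

while True:
    x = int(input("Enter a number : "))
    n += 1
    sum += x
    sumsq += x * x
    mean = sum / n
    # Variance = E(x^2) - [E(x)]^2
    var = (sumsq / n) - (mean * mean)
    print("Mean : ", mean)
    print("Variance : ", var)
    print()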
Input and corresponding output :
Enter a number : 1
Mean : 1.0
Variance : 0.0
Enter a number : 2
Mean : 1.5
Variance : 0.25
Enter a number : 5
Mean : 2.6666666666666665
Variance : 2.8888888888888893
Enter a number : 4
Mean : 3.0
Variance : 2.5
Enter a number : 3
Mean : 3.0
Variance : 2.0
Thus, we can update the mean and variance of a running stream in constant time for each new number, while storing only three values.
Last Updated :
22 Feb, 2021