Average of a stream of numbers
Difficulty Level: Rookie
Given a stream of numbers, print the average (or mean) of the stream at every point. For example, let us consider the stream 10, 20, 30, 40, 50, 60, …
Average of 1 numbers is 10.00
Average of 2 numbers is 15.00
Average of 3 numbers is 20.00
Average of 4 numbers is 25.00
Average of 5 numbers is 30.00
Average of 6 numbers is 35.00
..................
To print the mean of a stream, we need a way to update the average when a new number arrives. All we need is the count of numbers seen so far, the previous average, and the new number. Let n be the count, prev_avg the previous average, and x the new number. The average after including x is (prev_avg*n + x)/(n+1).
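The update rule can also be rearranged so that the product prev_avg*n never has to be formed. A short derivation, writing \bar{a}_n for prev_avg (notation introduced here for illustration, not used in the original):

```latex
\bar{a}_{n+1} = \frac{n\,\bar{a}_n + x}{n+1}
             = \frac{(n+1)\,\bar{a}_n + (x - \bar{a}_n)}{n+1}
             = \bar{a}_n + \frac{x - \bar{a}_n}{n+1}
```

For the example stream, with \bar{a}_2 = 15 and x = 30, this gives 15 + (30 - 15)/3 = 20, which matches the third line of the expected output.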
#include <stdio.h>

// Returns the new average after including x.
// n is the count of numbers seen before x.
float getAvg(float prev_avg, float x, int n)
{
    return (prev_avg*n + x)/(n+1);
}

// Prints average of a stream of numbers
void streamAvg(float arr[], int n)
{
    float avg = 0;
    for(int i = 0; i < n; i++)
    {
        avg = getAvg(avg, arr[i], i);
        printf("Average of %d numbers is %f \n", i+1, avg);
    }
    return;
}

// Driver program to test above functions
int main()
{
    float arr[] = {10, 20, 30, 40, 50, 60};
    int n = sizeof(arr)/sizeof(arr[0]);
    streamAvg(arr, n);
    return 0;
}
The function getAvg() above can be simplified as follows. We can avoid passing prev_avg and the number of elements by keeping the state in static variables (assuming that only this function is used to average the stream, and that only one stream is averaged). Following is the optimized version.
#include <stdio.h>

// Returns the new average after including x
float getAvg (int x)
{
    // Static variables start at zero and keep their values between calls
    static int sum, n;
    sum += x;
    return (((float)sum)/++n);
}

// Prints average of a stream of numbers
void streamAvg(float arr[], int n)
{
    float avg = 0;
    for(int i = 0; i < n; i++)
    {
        avg = getAvg(arr[i]);
        printf("Average of %d numbers is %f \n", i+1, avg);
    }
    return;
}

// Driver program to test above functions
int main()
{
    float arr[] = {10, 20, 30, 40, 50, 60};
    int n = sizeof(arr)/sizeof(arr[0]);
    streamAvg(arr, n);
    return 0;
}
Thanks to Abhijeet Deshpande for suggesting this optimized version.
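The static-variable version can serve only one stream per program, as the assumption above notes. A minimal sketch of one way around that is to keep the state in a struct owned by the caller. The names RunningAvg, running_avg_init and running_avg_add are hypothetical and not from the original post, and the sum is held in a double here rather than an int:

```c
#include <stddef.h>

/* Hypothetical helper type: running state for one stream. */
typedef struct {
    double sum;   /* sum of all values seen so far */
    size_t count; /* number of values seen so far */
} RunningAvg;

/* Resets the state so the next value starts a new stream. */
void running_avg_init(RunningAvg *s)
{
    s->sum = 0.0;
    s->count = 0;
}

/* Adds x to the stream and returns the average so far. */
double running_avg_add(RunningAvg *s, double x)
{
    s->sum += x;
    s->count++;
    return s->sum / (double)s->count;
}
```

A caller would declare one RunningAvg per stream, call running_avg_init on it once, and then call running_avg_add for every incoming number.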
Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.
float getAvg (int x)
{
    static int sum, n;
    sum += x;
    return (((float)sum)/++n);
}
Here sum can overflow. To avoid this problem you can use the following approach.
float getAvg (int x)
{
    static float oldAvg, n;
    float balance = n - oldAvg;
    oldAvg += (balance/++n);
    return oldAvg;
}
It doesn't seem to work. The following program prints
Average of 1 numbers is 0.000000
Average of 2 numbers is 0.500000
Average of 3 numbers is 1.000000
Average of 4 numbers is 1.500000
Average of 5 numbers is 2.000000
Average of 6 numbers is 2.500000
#include <stdio.h>

float getAvg (int x)
{
    static float oldAvg, n;
    float balance = n - oldAvg;
    oldAvg += (balance/++n);
    return oldAvg;
}

// Prints average of a stream of numbers
void streamAvg(float arr[], int n)
{
    float avg = 0;
    for(int i = 0; i < n; i++)
    {
        avg = getAvg(arr[i]);
        printf("Average of %d numbers is %f \n", i+1, avg);
    }
    return;
}

// Driver program to test above functions
int main()
{
    float arr[] = {10, 20, 30, 40, 50, 60};
    int n = sizeof(arr)/sizeof(arr[0]);
    streamAvg(arr, n);
    return 0;
}
I think you meant
Sorry for the typo. Yes, it should be float balance = x - oldAvg;
With the typo fixed:
float getAvg (int x)
{
    static float oldAvg, n;
    float balance = x - oldAvg;
    oldAvg += (balance/++n);
    return oldAvg;
}
The getAvg(prev_avg, x, n) code in the post does not exploit the inherent property of the problem, that is, the average is over a "stream" of numbers. So, the possible optimizations are:
1. You need not pass the number of elements every time.
2. You need not pass the previous average (again, assuming this function is called from the first element of the stream onward).
3. Perhaps not relevant here, but n++ is more optimal compared to n+1
Hence, the code:
float get_incremental_avg (int x)
{
    static int sum, n;
    sum += x;
    return (((float)sum)/++n);
}
@Abhijeet Deshpande: Great! Your approach looks good to be added to the original post.
Could you explain why n++ is more optimal than n+1? Any link or other reference would be helpful.
@Kartik: Thank you.
Well, in C, n = n + 1 and n++ produce the same result. (I assign to n here only for the comparison; in the original code above, n+1 is not assigned to n.) However, when the code is compiled and assembly is generated, n+1 compiles to
Load R1, n /* Load n in some register */
Load_Imm R2, 1 /* Load an immediate value 1 in reg R2 */
Add R1, R2 /* Add R1 and R2 and store result in R1 - Intel-style operand order, destination first */
You need 2 registers and 3 instructions (~3 cycles for a RISC processor)
Whereas n++ translates to..
Load R1, n /* Load n in some register */
Incr R1 /* Increment R1 */
You need 1 register and 2 instructions here.
Typically Increment (and decrement) operations are available in most MPU/MCU instruction sets and are (again typically) single cycle instructions.
For reference, look up any datasheet, say for the Intel 8086, and you will find a recommendation to use INC/DEC instead of addition wherever possible.
@Abhijeet Deshpande: Thanks!!
But Abhijeet, an optimizing C/C++ compiler does this for you regardless of whether you write n = n + 1 or n++, so the two statements are effectively equivalent and one need not worry about it.
An average over the last X numbers would be interesting, where X is a function parameter (input). If the stream has fewer than X numbers so far, return the average so far.
@Venki: Does this answer your question? What you ask for is an FIR filter (rather, a specific case of an FIR filter: a moving average filter, in which all the coefficients of the filter are 1) with "n_of_taps" taps.
I have not compiled the code. There might be errors, but this code should convey the algorithm
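#include <stdlib.h> /* for calloc */

float get_incremental_avg (int x, int n_of_taps)
{
    static int *p, *base_addr;
    static int sum, n;
    if (p == NULL) /* first call: allocate the zero-filled circular buffer */
    {
        p = (int*)calloc (n_of_taps, sizeof(int));
        base_addr = p;
    }
    sum += x - *p; /* add the new value, drop the one it overwrites */
    *p = x;
    if (++p >= base_addr+n_of_taps)
        p = base_addr; /* wrap around */
    if (n < n_of_taps)
        n++; /* divide by the stream size until the window fills */
    return (((float)sum)/n);
}
PS: Code gets simpler if you fix the number of taps. Then you do not need calloc.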
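As the PS suggests, fixing the number of taps at compile time removes the need for calloc. A rough sketch under that assumption follows. The TAPS constant and the moving_avg name are made up for illustration, and like the code above it keeps its state in static variables, so it serves a single stream:

```c
#define TAPS 4 /* assumed window size, chosen only for illustration */

/* Returns the average of the last TAPS values, or of all values
   seen so far while fewer than TAPS have arrived. */
float moving_avg(int x)
{
    static int window[TAPS]; /* circular buffer, zero-initialized */
    static int pos;          /* slot that holds the oldest value */
    static int count;        /* values currently in the window */
    static long sum;         /* sum of the values in the window */

    sum += x - window[pos];  /* add the new value, drop the oldest */
    window[pos] = x;
    pos = (pos + 1) % TAPS;
    if (count < TAPS)
        count++;
    return (float)sum / count;
}
```

Because the buffer starts out zero-filled, subtracting window[pos] is harmless until the window has filled and the oldest value really needs to be dropped.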
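How could we take care of the overflow (if it occurs) in prev_avg*n?
I think the following expression can be used to avoid overflow (the casts keep n/(n+1) and x/(n+1) from being evaluated as integer division, which would give 0 for the first term).
prev_avg*((float)n/(n+1)) + (float)x/(n+1);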
Either way, the error will be large when n is large. If you want something serious, you should use the Kahan summation algorithm and avoid the *(n/(n+1)) step.
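For reference, a minimal sketch of a running mean built on Kahan (compensated) summation. The KahanAvg struct and the kahan_avg_add function are hypothetical names, and the sketch assumes the compiler is not run with value-unsafe floating-point optimizations (such as -ffast-math in GCC or Clang), which can remove the compensation term:

```c
/* Hypothetical state for a Kahan-compensated running mean. */
typedef struct {
    float sum;           /* running sum */
    float comp;          /* compensation for lost low-order bits */
    unsigned long count; /* number of values seen so far */
} KahanAvg;

/* Adds x to the stream and returns the mean so far.
   The state must be zero-initialized before the first call. */
float kahan_avg_add(KahanAvg *s, float x)
{
    float y = x - s->comp;      /* apply the correction carried over */
    float t = s->sum + y;       /* low-order bits of y may be lost here */
    s->comp = (t - s->sum) - y; /* recover what was lost (negated) */
    s->sum = t;
    s->count++;
    return s->sum / (float)s->count;
}
```

This targets the rounding error that accumulates in a long float sum. It does not by itself prevent the sum from exceeding the range of float.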