
Design a Hit Counter


Design a hit counter that counts the number of hits received in the past 5 minutes.

Source: Microsoft Interview Experience

The “Design hit counter” problem has recently been asked by many companies, including Dropbox, and it is harder than it looks. It touches several topics: basic data structure design, various optimizations, concurrency, and distributed counters.

The counter should support the following two operations:

  • hit(timestamp) – Records a hit at the given timestamp.
  • getHits(timestamp) – Returns the number of hits received in the past 5 minutes (300 seconds), counting back from the given timestamp.

Each function accepts a timestamp parameter (in seconds granularity), and you may assume that calls are made to the system in chronological order (i.e. the timestamp is monotonically increasing). You may assume that the earliest timestamp starts at 1.

Examples:

HitCounter counter = new HitCounter();

// hit at timestamp 1.
counter.hit(1);

// hit at timestamp 2.
counter.hit(2);

// hit at timestamp 3.
counter.hit(3);

// get hits at timestamp 4, should return 3.
counter.getHits(4);

// hit at timestamp 300.
counter.hit(300);

// get hits at timestamp 300, should return 4.
counter.getHits(300);

// get hits at timestamp 301, should return 3.
counter.getHits(301);

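Asked In: Microsoft, Amazon, Dropbox and many more companies.

1. Simple Solution (Brute-force Approach): We can use a vector to store all the hits. hit() appends the timestamp, and getHits() scans for the first stored timestamp that falls inside the 300-second window and counts every entry from there on.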

CPP




vector<int> v;
 
/* Record a hit.
@param timestamp - The current timestamp (in
                    seconds granularity). */
 
void hit(int timestamp)
{
    v.push_back(timestamp);
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). */
int getHits(int timestamp)
{
    int i, j;
    for (i = 0; i < v.size(); ++i) {
        if (v[i] > timestamp - 300) {
            break;
        }
    }
    return v.size() - i;
}
 
// Time Complexity : O(n)


Java




static ArrayList<Integer> v = new ArrayList<>();
 
/* Record a hit.
   @param timestamp - The current timestamp (in
                      seconds granularity).  */
 
static void hit(int timestamp)
{
    v.add(timestamp);
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). */
static int getHits(int timestamp)
{
    int i;
    for (i = 0; i < v.size(); ++i) {
        if (v.get(i) > timestamp - 300) {
            break;
        }
    }
    return v.size() - i;
}
 
// Time Complexity : O(n)
 
 
// This code is contributed by aadityaburujwale.


Python3




v = []
 
# Record a hit.
#   @param timestamp - The current timestamp (in
#                     seconds granularity).
 
def hit(timestamp):
    v.append(timestamp)
 
# Time Complexity : O(1)
 
# Return the number of hits in the past 5 minutes.
#    @param timestamp - The current timestamp (in
#    seconds granularity).
def getHits(timestamp):
    # Skip every hit that is at least 300 seconds old. A while loop
    # keeps i well defined even when v is empty or every hit has expired.
    i = 0
    while i < len(v) and v[i] <= timestamp - 300:
        i += 1
    return len(v) - i
 
# Time Complexity : O(n)
 
 
# This code is contributed by akashish__


C#




static List<int> v = new List<int>();
 
 
/* Record a hit.
   @param timestamp - The current timestamp (in
                      seconds granularity).  */
 
static void hit(int timestamp)
{
    v.Add(timestamp);
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). */
static int getHits(int timestamp)
{
    int i, j;
    for (i = 0; i < v.Count; ++i) {
        if (v[i] > timestamp - 300) {
            break;
        }
    }
    return v.Count - i;
}
 
// Time Complexity : O(n)
 
// This code is contributed by sourabhdalal0001.


Javascript




let v = [];
 
/* Record a hit.
   @param timestamp - The current timestamp (in
                      seconds granularity).  */
 
function hit(timestamp)
{
    v.push(timestamp);
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). */
function getHits(timestamp)
{
    let i, j;
    for (i = 0; i < v.length; ++i) {
        if (v[i] > timestamp - 300) {
            break;
        }
    }
    return v.length - i;
}
 
// Time Complexity : O(n)
 
 
// This code is contributed by akashish__
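
Because the problem guarantees that timestamps arrive in chronological order, the stored list is always sorted, so the linear scan in getHits could be replaced by a binary search. The following is a minimal Python sketch of that idea using the standard bisect module. The class name HitCounterBisect is illustrative and is not part of the original solution.

```python
from bisect import bisect_right


class HitCounterBisect:
    """Brute-force storage, with getHits done by binary search (illustrative name)."""

    WINDOW = 300  # seconds in 5 minutes

    def __init__(self):
        self.v = []  # every hit timestamp, in arrival order (assumed sorted)

    def hit(self, timestamp):
        # Assumes timestamps arrive in non-decreasing order.
        self.v.append(timestamp)

    def getHits(self, timestamp):
        # Index of the first stored timestamp that is > timestamp - WINDOW.
        first = bisect_right(self.v, timestamp - self.WINDOW)
        return len(self.v) - first


counter = HitCounterBisect()
for t in (1, 2, 3):
    counter.hit(t)
print(counter.getHits(4))    # 3
counter.hit(300)
print(counter.getHits(300))  # 4
print(counter.getHits(301))  # 3
```

This brings getHits down to O(log n) time, but the list still grows without bound, which is what the next approach addresses.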


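2. Space Optimized Solution: We can use a queue to store the hits and delete the entries that have fallen outside the 5-minute window, since they can never be counted again. This keeps memory proportional to the number of hits inside the window rather than to the total number of hits ever received.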

CPP




queue<int> q;
 
/** Record a hit.
    @param timestamp - The current timestamp
                (in seconds granularity). */
void hit(int timestamp)
{
    q.push(timestamp);
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
@param timestamp - The current timestamp (in seconds
granularity). */
int getHits(int timestamp)
{
    while (!q.empty() && timestamp - q.front() >= 300) {
        q.pop();
    }
    return q.size();
}
// Time Complexity : O(n)


Python3




q = []
 
''' Record a hit.
    @param timestamp - The current timestamp
                (in seconds granularity). '''
def hit(timestamp):
    q.append(timestamp)
     
 
# Time Complexity : O(1)
 
''' Return the number of hits in the past 5 minutes.
@param timestamp - The current timestamp (in seconds
granularity). '''
def getHits(timestamp):
    while (len(q) > 0 and timestamp - q[0] >= 300):
        q.pop(0)
    return len(q)
# Time Complexity : O(n)
 
# This code is contributed by akashish__


Java




import java.util.LinkedList;
import java.util.Queue;
 
class GFG {
    Queue<Integer> q = new LinkedList<>();
 
    /**
     * Record a hit.
     *
     * @param timestamp - The current timestamp (in seconds
     *     granularity).
     */
    public void hit(int timestamp) { q.offer(timestamp); }
 
    /**
     * Return the number of hits in the past 5 minutes.
     *
     * @param timestamp - The current timestamp (in seconds
     *     granularity).
     */
    public int getHits(int timestamp)
    {
        while (!q.isEmpty()
               && timestamp - q.peek() >= 300) {
            q.poll();
        }
        return q.size();
    }
 
    public static void main(String[] args) {}
}
 
// Time Complexity : O(1) for hit() method, O(n) for
// getHits() method
 
// This code is contributed by akashish__


C#




static Queue<int> q = new Queue<int>();
 
/** Record a hit.
    @param timestamp - The current timestamp
                (in seconds granularity). */
public static void hit(int timestamp)
{
    q.Enqueue(timestamp);
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
@param timestamp - The current timestamp (in seconds
granularity). */
public static int getHits(int timestamp)
{
    while (q.Count > 0 && timestamp - q.Peek() >= 300) {
        q.Dequeue();
    }
    return q.Count;
}
// Time Complexity : O(n)
// This code is contributed by akashish__


Javascript




let q = [];
 
/** Record a hit.
    @param timestamp - The current timestamp
                (in seconds granularity). */
function hit(timestamp)
{
    q.push(timestamp);
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
@param timestamp - The current timestamp (in seconds
granularity). */
function getHits(timestamp)
{
    while (q.length > 0 && timestamp - q[0] >= 300) {
        q.shift();
    }
    return q.length;
}
// Time Complexity : O(n)
 
// This code is contributed by akashish__
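
One possible refinement of the queue approach, sketched below in Python, is to collapse hits that share the same second into a single [timestamp, count] entry and to evict expired entries on every call. Under the chronological-order assumption the queue then holds at most 300 entries. This is a variant for illustration rather than the code above, and the names HitCounterQueue, total and _evict are assumptions.

```python
from collections import deque

WINDOW = 300  # seconds in 5 minutes


class HitCounterQueue:
    """Queue of [timestamp, count] pairs, oldest first (illustrative variant)."""

    def __init__(self):
        self.q = deque()
        self.total = 0  # sum of the counts currently held in the queue

    def _evict(self, timestamp):
        # Drop entries that are 300 or more seconds old.
        while self.q and timestamp - self.q[0][0] >= WINDOW:
            self.total -= self.q.popleft()[1]

    def hit(self, timestamp):
        self._evict(timestamp)
        if self.q and self.q[-1][0] == timestamp:
            self.q[-1][1] += 1  # same second as the newest entry
        else:
            self.q.append([timestamp, 1])
        self.total += 1

    def getHits(self, timestamp):
        self._evict(timestamp)
        return self.total


counter = HitCounterQueue()
for t in (1, 1, 2):
    counter.hit(t)
print(counter.getHits(4))    # 3
counter.hit(300)
print(counter.getHits(300))  # 4
print(counter.getHits(301))  # 2 (both hits at second 1 have expired)
print(len(counter.q))        # 2
```

Keeping a running total means getHits no longer needs to walk the queue after eviction.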


3. Most optimized solution: What if the data arrives out of order and several hits carry the same timestamp? Since the queue approach does not work without ordered data, this time we use arrays that store the hit count for each unit of time. If we are tracking hits in the past 5 minutes at one-second granularity, which is 300 seconds, create 2 arrays of size 300.

int[] hits = new int[300];
TimeStamp[] times = new TimeStamp[300]; // timestamp of the last counted hit

Given an incoming hit, take its timestamp modulo 300 to see where it lands in the hits array.

int idx = timestamp % 300; // hits[idx] keeps the count of hits that took place in this second

Before we increase the hit count at idx by 1, we must check that the timestamp really belongs to the second that hits[idx] is tracking. times[idx] stores the timestamp of the last counted hit.

  • If times[idx] > timestamp, this hit should be discarded since it did not happen in the past 5 minutes.
  • If times[idx] == timestamp, then hits[idx] increases by 1.
  • If times[idx] < timestamp, the slot holds an expired second, so reset hits[idx] to 1 and set times[idx] = timestamp.

For getHits, sum hits[i] over every slot whose times[i] > currentTime – 300. Note that the implementations below simplify the rule above: they treat any mismatch between times[idx] and timestamp as a reset.

CPP




// Both vectors start with 300 zero-initialized slots.
vector<int> times(300), hits(300);
 
/** Record a hit.
@param timestamp - The current timestamp
(in seconds granularity). */
void hit(int timestamp)
{
    int idx = timestamp % 300;
    if (times[idx] != timestamp) {
        times[idx] = timestamp;
        hits[idx] = 1;
    }
    else {
        ++hits[idx];
    }
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). */
int getHits(int timestamp)
{
    int res = 0;
    for (int i = 0; i < 300; ++i) {
        if (timestamp - times[i] < 300) {
            res += hits[i];
        }
    }
    return res;
}
// Time Complexity : O(300) == O(1)


Java




static int[] times = new int[300];
static int[] hits = new int[300];
 
/** Record a hit.
@param timestamp - The current timestamp
(in seconds granularity). */
static void hit(int timestamp)
{
    int idx = timestamp % 300;
    if (times[idx] != timestamp) {
        times[idx] = timestamp;
        hits[idx] = 1;
    }
    else {
        ++hits[idx];
    }
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). */
static int getHits(int timestamp)
{
    int res = 0;
    for (int i = 0; i < 300; ++i) {
        if (timestamp - times[i] < 300) {
            res += hits[i];
        }
    }
    return res;
}
 
// Time Complexity : O(300) == O(1)
// This code is contributed by akashish__


Python3




# Every slot starts with timestamp 0 and zero hits.
times = [0] * 300
hits = [0] * 300
 
''' Record a hit.
@param timestamp - The current timestamp
(in seconds granularity). '''
def hit(timestamp):
  idx = timestamp % 300
  if (times[idx] != timestamp):
    times[idx] = timestamp
    hits[idx] = 1
  else:
    hits[idx]+=1
 
# Time Complexity : O(1)
 
''' Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). '''
def getHits(timestamp):
  res = 0
  for i in range(0,300):
    if (timestamp - times[i] < 300):
      res += hits[i]
  return res
 
# Time Complexity : O(300) == O(1)
# This code is contributed by akashish__


Javascript




const times = new Array(300).fill(0);
const hits = new Array(300).fill(0);
 
/** Record a hit.
 * @param {number} timestamp - The current timestamp (in seconds granularity).
 */
function hit(timestamp) {
  const idx = timestamp % 300;
  if (times[idx] !== timestamp) {
    times[idx] = timestamp;
    hits[idx] = 1;
  } else {
    hits[idx]++;
  }
}
 
/** Return the number of hits in the past 5 minutes.
 * @param {number} timestamp - The current timestamp (in seconds granularity).
 * @return {number}
 */
function getHits(timestamp) {
  let res = 0;
  for (let i = 0; i < 300; ++i) {
    if (timestamp - times[i] < 300) {
      res += hits[i];
    }
  }
  return res;
}
 
// Time Complexity : O(300) == O(1)
 
// This code is contributed by akashish__


C#




static int[] times = new int[300];
static int[] hits = new int[300];
 
/** Record a hit.
@param timestamp - The current timestamp
(in seconds granularity). */
static void hit(int timestamp)
{
    int idx = timestamp % 300;
    if (times[idx] != timestamp) {
        times[idx] = timestamp;
        hits[idx] = 1;
    }
    else {
        ++hits[idx];
    }
}
 
// Time Complexity : O(1)
 
/** Return the number of hits in the past 5 minutes.
    @param timestamp - The current timestamp (in
    seconds granularity). */
static int getHits(int timestamp)
{
    int res = 0;
    for (int i = 0; i < 300; ++i) {
        if (timestamp - times[i] < 300) {
            res += hits[i];
        }
    }
    return res;
}
// Time Complexity : O(300) == O(1)
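
To see the slot-reuse rule of this approach in action, the short Python trace below records hits at seconds 1, 1, 2 and 301. Seconds 1 and 301 map to the same slot (301 % 300 == 1), so the later hit overwrites the expired count. This is a sketch that mirrors the implementations above. The constant WINDOW is an illustrative name.

```python
WINDOW = 300  # seconds in 5 minutes

times = [0] * WINDOW  # times[i]: the second currently tracked by slot i
hits = [0] * WINDOW   # hits[i]: number of hits recorded for times[i]


def hit(timestamp):
    idx = timestamp % WINDOW
    if times[idx] != timestamp:
        # The slot tracks a different second: reuse it for this one.
        times[idx] = timestamp
        hits[idx] = 1
    else:
        hits[idx] += 1


def getHits(timestamp):
    # Only slots whose second lies inside the window contribute.
    return sum(h for t, h in zip(times, hits) if timestamp - t < WINDOW)


hit(1)
hit(1)
hit(2)
print(getHits(300))       # 3
hit(301)                  # lands in slot 1, replacing the two hits at second 1
print(times[1], hits[1])  # 301 1
print(getHits(301))       # 2 (the hit at second 2 and the hit at second 301)
```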


Approach: Most Efficient Way using TreeMap

  • We can use a TreeMap to keep each timestamp and the count of hits at that timestamp. The TreeMap maintains the timestamps in sorted order and lets us quickly find the hits at a given timestamp.
  • To handle the case where several hits arrive at the same timestamp, we can use a LinkedHashMap to store the hits at a given timestamp. This maintains the order of the hits and allows us to count the hits in O(1) time.
  • To deal with the case where hits arrive out of order, we can use a Queue to store the hits. When we receive a hit, we add it to the Queue and then discard any hits that are more than 5 minutes old. We can then update the TreeMap and LinkedHashMap accordingly.
  • For the hit() operation, we can simply add the hit to the Queue.
  • For the getHits() operation, we can iterate over the TreeMap and count the hits in the LinkedHashMap for each timestamp that falls in the past 5 minutes.
  • To handle concurrency, we can use a synchronized block to make sure that only one thread can update the data structures at a time.
  • The final implementation would have the hit() and getHits() methods synchronized to ensure thread safety.

The Java code below implements only the TreeMap part of this design.

Java




import java.util.TreeMap;
 
class HitCounter {
    // Use TreeMap to store hits at different timestamps
    TreeMap<Integer, Integer> map;
 
    /** Initialize your data structure here. */
    public HitCounter() {
        map = new TreeMap<>();
    }
 
    /** Record a hit.
        @param timestamp - The current timestamp (in seconds granularity). */
    public void hit(int timestamp) {
        // Put the hit in the TreeMap with its timestamp as the key
        // and the number of hits as the value
        map.put(timestamp, map.getOrDefault(timestamp, 0) + 1);
    }
 
    /** Return the number of hits in the past 5 minutes.
        @param timestamp - The current timestamp (in seconds granularity). */
    public int getHits(int timestamp) {
        // Hits at or before this timestamp are at least 5 minutes old
        int start = timestamp - 300;
        // Iterate over the entries strictly after the start timestamp
        // (exclusive tail map) and add up the number of hits at each one
        int count = 0;
        for (int key : map.tailMap(start, false).keySet()) {
            count += map.get(key);
        }
        // Return the total number of hits
        return count;
    }

    public static void main(String[] args) {
        HitCounter counter = new HitCounter();
        counter.hit(1);
        counter.hit(2);
        counter.hit(3);
        int hits1 = counter.getHits(4); // should return 3
        counter.hit(300);
        int hits2 = counter.getHits(300); // should return 4
        int hits3 = counter.getHits(301); // should return 3
        System.out.println(hits1);
        System.out.println(hits2);
        System.out.println(hits3);
    }
}


Output

3
4
3

Time complexity: O(M log N) for getHits(), where M is the number of distinct timestamps with hits in the last 5 minutes and N is the total number of distinct timestamps stored. This is because we iterate over a subset of the TreeMap and each lookup takes logarithmic time. hit() takes O(log N).

Space complexity: O(N), where N is the total number of distinct timestamps stored in the TreeMap.

How to handle concurrent requests?

When two requests update the list simultaneously, there can be race conditions. It is possible that the update made by the request that wrote first is lost. The most common solution is to use a lock to protect the list. Whenever someone wants to update the list (by either adding new elements or removing expired ones), a lock is placed on the container. After the operation finishes, the list is unlocked. This works well when you do not have a large volume of requests or when performance is not a concern. However, locking can be costly, and when there are too many concurrent requests the lock may block the system and become the performance bottleneck.

Distribute the counter

When a single machine gets too much traffic and performance becomes an issue, it is time to think about a distributed solution. A distributed system significantly reduces the burden on any single machine by scaling the system to multiple nodes, at the cost of added complexity. Let’s say we distribute visit requests across multiple machines equally. Equal distribution matters: if particular machines get much more traffic than the rest, the system is not fully utilized, so it is important to take this into consideration when designing the system. In our case, we can take a hash of the user’s email and distribute by the hash (it is not a good idea to use the email directly, as some letters may appear much more frequently than others). To count the number, each machine works independently to count its own users from the past 5 minutes. When we request the global number, we just need to add all the counters together.

References: I had published it originally here. https://aonecode.com/getArticle/211
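
The locking and sharding ideas discussed above can be sketched together in Python. The sketch assumes in-process shards standing in for separate machines and SHA-256 as the hash of the user's email. The class names LockedHitCounter and ShardedHitCounter and the num_shards parameter are illustrative and do not come from the original article.

```python
import hashlib
import threading
from collections import deque

WINDOW = 300  # seconds in 5 minutes


class LockedHitCounter:
    """Queue-based counter whose updates are guarded by a single lock."""

    def __init__(self):
        self._q = deque()
        self._lock = threading.Lock()

    def hit(self, timestamp):
        with self._lock:
            self._q.append(timestamp)

    def getHits(self, timestamp):
        with self._lock:
            # Assumes hits reach each counter in chronological order.
            while self._q and timestamp - self._q[0] >= WINDOW:
                self._q.popleft()
            return len(self._q)


class ShardedHitCounter:
    """Stand-in for several machines: each shard counts its own users."""

    def __init__(self, num_shards):
        self._shards = [LockedHitCounter() for _ in range(num_shards)]

    def _shard_for(self, email):
        # Hash the email instead of using it directly, so that users
        # spread roughly evenly across the shards.
        digest = hashlib.sha256(email.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]

    def hit(self, email, timestamp):
        self._shard_for(email).hit(timestamp)

    def getHits(self, timestamp):
        # The global count is the sum of the per-shard counts.
        return sum(shard.getHits(timestamp) for shard in self._shards)


counter = ShardedHitCounter(num_shards=4)
threads = [
    threading.Thread(target=counter.hit, args=(f"user{i}@example.com", 1))
    for i in range(100)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.getHits(2))    # 100
print(counter.getHits(301))  # 0
```

Each shard has its own lock, so threads that land on different shards do not contend with each other. Only the final summation touches every shard.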



Last Updated : 20 May, 2023