Count distinct emails present in a given array
  • Last Updated : 23 Mar, 2021

Given an array arr[] of N strings, where each string is an email address made up of English letters, ‘.’, ‘+’ and ‘@’, the task is to count the number of distinct emails in the array under the following rules:

  • An email address is split at ‘@’ into a prefix and a suffix, which are the local name and the domain name respectively.
  • Every ‘.’ character in the local name is ignored.
  • In the local name, the first ‘+’ and every character after it are ignored.
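
The three rules can be sketched as a small helper for a single address. This is only an illustration: the name normalize_email is hypothetical (it does not appear in the article), and the sketch assumes an address without ‘@’ should be returned unchanged.

```python
def normalize_email(email: str) -> str:
    """Return the canonical form of one address under the rules above.

    Hypothetical helper for illustration; not part of the original article.
    """
    # Rule 1: split at '@' into the local name and the domain name.
    local, at, domain = email.partition("@")
    if not at:
        # Assumption: with no '@' the rules do not apply, so keep the input.
        return email
    # Rule 3: drop the first '+' and everything after it in the local name.
    local = local.split("+", 1)[0]
    # Rule 2: ignore every '.' in the local name (the domain is untouched).
    local = local.replace(".", "")
    return local + at + domain
```

Under these assumptions, "av.ruty+gfg@geeksforgeeks.com" would normalize to "avruty@geeksforgeeks.com", while the dots in the domain name are preserved.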

Examples:

Input: arr[] = {“raghav.agg@geeksforgeeks.com”, “raghavagg@geeksforgeeks.com”}
Output: 1
Explanation: Removing all the ‘.’ characters before ‘@’ changes the strings to {“raghavagg@geeksforgeeks.com”, “raghavagg@geeksforgeeks.com”}. Therefore, the number of distinct emails in the array is 1.

Input: arr[] = {“avruty+dhir+gfg@geeksforgeeks.com”, “avruty+gfg@geeksforgeeks.com”, “av.ruty@geeksforgeeks.com”}
Output: 1
Explanation: Ignoring everything from the first ‘+’ up to ‘@’ changes the first two strings to “avruty@geeksforgeeks.com”, and removing the ‘.’ from the local name of the third string gives the same result. Therefore, the number of distinct emails in the array is 1.

Approach: The given problem can be solved by normalizing each email according to the given rules, storing the result in a HashSet, and printing the size of the HashSet. Follow the steps below to solve the problem:

  • Initialize a HashSet, say S, to store the distinct strings obtained after applying the given rules.
  • Traverse the given array arr[] and perform the following steps:
    • Find the position of ‘@’ and store it in a variable, say pos2.
    • Delete all the ‘.’ characters before pos2 using erase() function.
    • Since the string is now shorter, update the position of ‘@’, i.e., pos2 = find(‘@’), then find the position of the first ‘+’ and store it in a variable, say pos1.
    • If pos1 lies before pos2, erase the characters from pos1 up to, but not including, pos2.
    • Insert the updated string into the HashSet S.
  • After completing the above steps, print the size of HashSet S as the result.
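
The steps above can be sketched compactly as follows. This is a minimal sketch, not the article's implementation: the function name canonical_emails is hypothetical, a Python set stands in for the HashSet S, and the function returns the set itself so its contents can be inspected.

```python
def canonical_emails(emails):
    """Return the set of normalized addresses (the HashSet S in the steps).

    Hypothetical helper for illustration; not part of the original article.
    """
    seen = set()
    for email in emails:
        # Split at the first '@'; 'at' is empty when no '@' is present.
        local, at, domain = email.partition("@")
        if at:
            # Cut the local name at the first '+', then drop every '.'.
            local = local.split("+", 1)[0].replace(".", "")
        seen.add(local + at + domain)
    return seen


emails = ["avruty+dhir+gfg@geeksforgeeks.com",
          "avruty+gfg@geeksforgeeks.com",
          "av.ruty@geeksforgeeks.com"]
print(len(canonical_emails(emails)))  # prints 1
```

Returning the set rather than only its size is a design choice made for this sketch; the required answer is still the size of the set.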

Below is the implementation of the above approach:

C++14

// C++ program for the above approach
#include <bits/stdc++.h>
using namespace std;
 
// Function to count all the distinct
// emails after preprocessing according
// to the given rules
int distinctEmails(vector<string>& emails)
{
    // Traverse the given array of
    // strings arr[]
    for (auto& x : emails) {
 
        // Stores the position of '@'
        // in the string
        auto pos2 = x.find('@');
 
        // If '@' is present
        if (pos2 != string::npos)
 
            // Erase all the occurrences
            // of '.' before pos2
            x.erase(
                remove(x.begin(),
                    x.begin() + pos2, '.'),
                x.begin() + pos2);
 
        // Stores the position of the
        // first '+'
        auto pos1 = x.find('+');
 
        // Update the position pos2
        pos2 = x.find('@');
 
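        // If '+' exists in the local
        // name, erase the characters
        // from '+' up to '@'. The check
        // pos1 < pos2 avoids unsigned
        // underflow in pos2 - pos1 when
        // '+' appears only after '@'
        if (pos2 != string::npos
            and pos1 < pos2) {
            x.erase(pos1, pos2 - pos1);
        }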
    }
 
    // Insert all the updated strings
    // inside the set
    unordered_set<string> ans(
        emails.begin(),
        emails.end());
 
    // Return the size of set ans
    return ans.size();
}
 
// Driver Code
int main()
{
    vector<string> arr
        = { "raghav.agg@geeksforgeeks.com",
            "raghavagg@geeksforgeeks.com" };
 
    // Function Call
    cout << distinctEmails(arr);
 
    return 0;
}

Python3

# Python3 program for the above approach
 
# Function to count all the distinct
# emails after preprocessing according
# to the given rules
def distinctEmails(emails):
   
  ans = set()
 
  # Traverse the given array of
  # strings arr[]
  for x in emails:
 
    # Stores the position of '@'
    # in the string
    pos2 = x.find('@')
 
    # If '@' is present (find()
    # returns -1 when it is not)
    if pos2 != -1:
 
      # Erase all the occurrences
      # of '.' before pos2
      p = x[:pos2]
      p = p.replace(".", "")
      x = p + x[pos2:]
 
    # Stores the position of the
    # first '+'
    pos1 = x.find('+')
 
    # Update the position pos2
    pos2 = x.find('@')
 
    # If '+' exists in the local
    # name, erase the characters
    # from '+' up to '@'
    if pos1 != -1 and pos1 < pos2:
      x = x[:pos1] + x[pos2:]
 
    # Insert the updated string
    # inside the set
    ans.add(x)
 
  # Return the size of set ans
  return len(ans)
 
# Driver Code
if __name__ == "__main__":
 
    arr = ["raghav.agg@geeksforgeeks.com",
           "raghavagg@geeksforgeeks.com"]
 
    # Function Call
    print(distinctEmails(arr))
 
# This code is contributed by ukasp
Output: 
1

 

Time Complexity: O(N * M), where M is the maximum length of an email address
Auxiliary Space: O(N * M)
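
Each address can be normalized in a single left-to-right pass, so the work per string is linear in its length. The following sketch illustrates this with a character scan; normalize_single_pass is a hypothetical name, and the sketch assumes every address contains an ‘@’.

```python
def normalize_single_pass(email: str) -> str:
    """Normalize one address by visiting each character exactly once.

    Hypothetical helper for illustration; assumes the address contains '@'.
    """
    out = []
    in_domain = False  # becomes True once '@' has been seen
    skipping = False   # becomes True once '+' is seen in the local name
    for ch in email:
        if in_domain:
            out.append(ch)      # domain characters are copied unchanged
        elif ch == "@":
            in_domain = True
            out.append(ch)
        elif skipping or ch == ".":
            continue            # ignored part of the local name
        elif ch == "+":
            skipping = True     # ignore '+' and the rest of the local name
        else:
            out.append(ch)
    return "".join(out)
```

Collecting the characters in a list and joining once at the end keeps the pass linear, since repeated string concatenation may copy the partial result each time.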



