Related Articles
Count distinct emails present in a given array
• Last Updated : 23 Mar, 2021

Given an array arr[] consisting of N strings where each string represents an email address consisting of English alphabets, ‘.’, ‘+’ and ‘@’, the task is to count the number of distinct emails present in the array according to the following rules:

• An email address can be split into two substrings, the prefix and suffix of ‘@’, which are the local name and domain name respectively.
• The ‘.’ character in the string in the local name is ignored.
• In the local name, every character after ‘+‘ is ignored.

Examples:

Input: arr[] = {“raghav.agg@geeksforgeeks.com”, “raghavagg@geeksforgeeks.com”}
Output: 1
Explanation: Removing all the ‘.’s before ‘@’ modifies the strings to {“raghavagg@geeksforgeeks.com”, “raghavagg@geeksforgeeks.com”}. Therefore, the total number of distinct emails present in the string are 1.

Input: arr[] = {“avruty+dhir+gfg@geeksforgeeks.com”, “avruty+gfg@geeksforgeeks.com”, “av.ruty@geeksforgeeks.com”}
Output: 1

Approach: The given problem can be solved by storing each email in a HashSet after populating it according to the given rule and print the size of the HashSet obtained. Follow the steps below to solve the problem:

• Initialize a HashSet, say S, to store all the distinct strings after populating according to the given rules.
• Traverse the given array arr[] and perform the following steps:
• Find the position of ‘@’ and store it in a variable, say pos2.
• Delete all the ‘.’ characters before pos2 using erase() function.
• Update the position of ‘@’ i.e., pos2 = find(‘@’) and find the position of ‘+’ and store it in a variable say pos1 as S.find(‘+’).
• Now, erase all the characters after pos1 and before pos2.
• Insert all the updated strings in a HashSet S.
• After completing the above steps, print the size of HashSet S as the result.

Below is the implementation of the above approach:

## C++14

 // C++ program for the above approach#include using namespace std; // Function to count all the distinct// emails after preprocessing according// to the given rulesint distinctEmails(vector& emails){    // Traverse the given array of    // strings arr[]    for (auto& x : emails) {         // Stores the position of '@'        // in the string        auto pos2 = x.find('@');         // If pos2 < x.size()        if (pos2 < x.size())             // Erases all the ocurrences            // of '.' before pos2            x.erase(                remove(x.begin(),                    x.begin() + pos2, '.'),                x.begin() + pos2);         // Stores the position of the        // first '+'        auto pos1 = x.find('+');         // Update the position pos2        pos2 = x.find('@');         // If '+' exists then erase        // charcters after '+' and        // before '@'        if (pos1 < x.size()            and pos2 < x.size()) {            x.erase(pos1, pos2 - pos1);        }    }     // Insert all the updated strings    // inside the set    unordered_set ans(        emails.begin(),        emails.end());     // Return the size of set ans    return ans.size();} // Driver Codeint main(){    vector arr        = { "raghav.agg@geeksforgeeks.com",            "raghavagg@geeksforgeeks.com" };     // Function Call    cout << distinctEmails(arr);     return 0;}

## Python3

 # Python3 program for the above approach # Function to count all the distinct# emails after preprocessing according# to the given rulesdef distinctEmails(emails):     ans = set([])   # Traverse the given array of  # strings arr[]  for x in emails:     # Stores the position of '@'    # in the string    pos2 = x.find('@')     # If pos2 < x.size()    if (pos2 < len(x)):       # Erases all the ocurrences      # of '.' before pos2      p = x[:pos2]      p = p.replace(".", "")      x = p + x[pos2:]       # Stores the position of the      # first '+'      pos1 = x.find('+')       # Update the position pos2      pos2 = x.find('@')       # If '+' exists then erase      # charcters after '+' and      # before '@'      if (pos1 > 0 and pos1 < len(x) and          pos2 < len(x)):        x = x[:pos1] + x[pos2:]       # Insert all the updated strings      # inside the set      ans.add(x)   # Return the size of set ans  return len(ans) # Driver Codeif __name__ == "__main__":     arr = ["raghav.agg@geeksforgeeks.com",           "raghavagg@geeksforgeeks.com"]     # Function Call    print(distinctEmails(arr)) # This code is contributed by ukasp
Output:
1

Time Complexity: O(N2)
Auxiliary Space: O(N)

My Personal Notes arrow_drop_up