
182. Duplicate Emails

Description

Table: Person

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| id          | int     |
| email       | varchar |
+-------------+---------+
id is the primary key (column with unique values) for this table.
Each row of this table contains an email. The emails will not contain uppercase letters.

 

Write a solution to report all the duplicate emails. Note that it's guaranteed that the email field is not NULL.

Return the result table in any order.

The result format is in the following example.

 

Example 1:

Input: 
Person table:
+----+---------+
| id | email   |
+----+---------+
| 1  | a@b.com |
| 2  | c@d.com |
| 3  | a@b.com |
+----+---------+
Output: 
+---------+
| Email   |
+---------+
| a@b.com |
+---------+
Explanation: a@b.com is repeated two times.

Solutions

Solution 1: Group By + Having

We can use a GROUP BY clause to group the rows by the email field, and then use a HAVING clause to keep only the email addresses that appear more than once.
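The grouping approach can be tried locally with a minimal sketch like the one below. It assumes an in-memory SQLite database as a stand-in for MySQL; the helper name `duplicate_emails_group_by` and the sample rows are illustrative, not part of the original solution.

```python
import sqlite3


def duplicate_emails_group_by(rows):
    """Return the emails that occur more than once, using GROUP BY + HAVING.

    `rows` is a list of (id, email) tuples. SQLite is only an assumed
    stand-in for MySQL here; the query itself is the same in both.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
        )
        conn.executemany("INSERT INTO Person (id, email) VALUES (?, ?)", rows)
        # Each group holds all rows sharing one email; HAVING keeps only
        # the groups with more than one row.
        cursor = conn.execute(
            "SELECT email FROM Person GROUP BY email HAVING COUNT(*) > 1"
        )
        # Sort only to make the result deterministic for the caller.
        return sorted(email for (email,) in cursor)
    finally:
        conn.close()


sample_rows = [(1, "a@b.com"), (2, "c@d.com"), (3, "a@b.com")]
print(duplicate_emails_group_by(sample_rows))  # ['a@b.com']
```

Because each email forms exactly one group, this query reports every duplicate once without needing DISTINCT.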

Solution 2: Self-Join

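We can join the Person table with itself on the email field and keep only the pairs of rows where the email is the same but the id is different. An email that appears several times produces several matching pairs, so DISTINCT is needed to report each duplicate email once.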
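A minimal sketch of the self-join, again assuming an in-memory SQLite database in place of MySQL. The helper name `duplicate_emails_self_join` and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3


def duplicate_emails_self_join(rows):
    """Return the emails that occur more than once, using a self-join.

    `rows` is a list of (id, email) tuples. SQLite is only an assumed
    stand-in for MySQL here.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Person (id INTEGER PRIMARY KEY, email TEXT NOT NULL)"
        )
        conn.executemany("INSERT INTO Person (id, email) VALUES (?, ?)", rows)
        # Pair every row with every other row that shares its email.
        # The id inequality stops a row from matching itself, and
        # DISTINCT collapses the repeated pairs into one result row.
        cursor = conn.execute(
            "SELECT DISTINCT p1.email FROM Person p1 "
            "JOIN Person p2 ON p1.email = p2.email "
            "WHERE p1.id <> p2.id"
        )
        # Sort only to make the result deterministic for the caller.
        return sorted(email for (email,) in cursor)
    finally:
        conn.close()


# "a@b.com" appears three times and still shows up once in the result.
sample_rows = [(1, "a@b.com"), (2, "c@d.com"), (3, "a@b.com"), (4, "a@b.com")]
print(duplicate_emails_self_join(sample_rows))  # ['a@b.com']
```

The join compares rows pairwise, so on large tables it usually does more work than the grouping approach unless the email column is indexed.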

  • import pandas as pd
    
    
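    def duplicate_emails(person: pd.DataFrame) -> pd.DataFrame:
        # duplicated() marks every repeat of an email after its first occurrence.
        repeats = person.loc[person.duplicated(subset=["email"]), ["email"]]
    
        # An email seen three or more times is marked more than once, so deduplicate.
        return repeats.drop_duplicates()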
    
    
  • # Write your MySQL query statement below
    
    SELECT Email FROM Person GROUP BY Email
    HAVING COUNT(*) > 1;
    
    -- Solution 2: self-join
    
    SELECT DISTINCT p1.Email FROM Person p1
    JOIN Person p2 ON p1.Email = p2.Email
    WHERE p1.Id <> p2.Id;
    
    -- Solution 1 again, grouping by column position
    
    SELECT email
    FROM Person
    GROUP BY 1
    HAVING COUNT(1) > 1;
    
    
