Welcome to Subscribe On Youtube
3657. Find Loyal Customers
Description
Table: customer_transactions
+++ \| transaction_id \| int \| \| customer_id \| int \| \| transaction_date \| date \| \| amount \| decimal \| \| transaction_type \| varchar \| ++-+-++--++ \| transaction_id \| customer_id \| transaction_date \| amount \| transaction_type \| +-+-++--++
Output:
+-+ \| customer_id \| +-+ \| 101 \| \| 104 \| +-+
Explanation:
- Customer 101:
- Purchase transactions: 4 (IDs: 1, 2, 3, 4)
- Refund transactions: 0
- Refund rate: 0/4 = 0% (less than 20%)
- Active period: Jan 5 to Feb 20 = 46 days (at least 30 days)
- Qualifies as loyal
- Customer 102:
- Purchase transactions: 3 (IDs: 5, 6, 9)
- Refund transactions: 2 (IDs: 7, 8)
- Refund rate: 2/5 = 40% (exceeds 20%)
- Not loyal
- Customer 103:
- Purchase transactions: 3 (IDs: 10, 11, 12)
- Refund transactions: 0
- Refund rate: 0/3 = 0% (less than 20%)
- Active period: Jan 1 to Jan 3 = 2 days (less than 30 days)
- Not loyal
- Customer 104:
- Purchase transactions: 5 (IDs: 13, 14, 15, 16, 17)
- Refund transactions: 1 (ID: 18)
- Refund rate: 1/6 = 16.67% (less than 20%)
- Active period: Jan 1 to Mar 15 = 73 days (at least 30 days)
- Qualifies as loyal
The result table is ordered by customer_id in ascending order.
</div>
Solutions
Solution 1
-
import pandas as pd def find_loyal_customers(customer_transactions: pd.DataFrame) -> pd.DataFrame: customer_transactions["transaction_date"] = pd.to_datetime( customer_transactions["transaction_date"] ) grouped = customer_transactions.groupby("customer_id") agg_df = grouped.agg( total_transactions=("transaction_type", "size"), refund_count=("transaction_type", lambda x: (x == "refund").sum()), min_date=("transaction_date", "min"), max_date=("transaction_date", "max"), ).reset_index() agg_df["date_diff"] = (agg_df["max_date"] - agg_df["min_date"]).dt.days agg_df["refund_ratio"] = agg_df["refund_count"] / agg_df["total_transactions"] result = ( agg_df[ (agg_df["total_transactions"] >= 3) & (agg_df["refund_ratio"] < 0.2) & (agg_df["date_diff"] >= 30) ][["customer_id"]] .sort_values("customer_id") .reset_index(drop=True) ) return result
-
# Write your MySQL query statement below SELECT customer_id FROM customer_transactions GROUP BY 1 HAVING COUNT(1) >= 3 AND SUM(transaction_type = 'refund') / COUNT(1) < 0.2 AND DATEDIFF(MAX(transaction_date), MIN(transaction_date)) >= 30 ORDER BY 1;