
3230. Customer Purchasing Behavior Analysis 🔒

Description

Table: Transactions

+------------------+---------+
| Column Name      | Type    |
+------------------+---------+
| transaction_id   | int     |
| customer_id      | int     |
| product_id       | int     |
| transaction_date | date    |
| amount           | decimal |
+------------------+---------+
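
Table: Products

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| product_id  | int     |
| category    | varchar |
| price       | decimal |
+-------------+---------+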
product_id is the unique identifier for this table.
Each row of this table contains information about a product, including its category and price.

Write a solution to analyze customer purchasing behavior. For each customer, calculate:

  • The total amount spent.
  • The number of transactions.
  • The number of unique product categories purchased.
  • The average amount spent. 
  • The most frequently purchased product category (if there is a tie, choose the one with the most recent transaction).
  • A loyalty score defined as: (Number of transactions * 10) + (Total amount spent / 100).

Round total_amount, avg_transaction_amount, and loyalty_score to 2 decimal places.

Return the result table ordered by loyalty_score in descending order, then by customer_id in ascending order.

The query result format is in the following example.

 

Example:

Input:

Transactions table:

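+----------------+-------------+------------+------------------+--------+
| transaction_id | customer_id | product_id | transaction_date | amount |
+----------------+-------------+------------+------------------+--------+
| 1              | 101         | 1          | 2023-01-01       | 100.00 |
| 2              | 101         | 2          | 2023-01-15       | 150.00 |
| 3              | 102         | 1          | 2023-01-01       | 100.00 |
| 4              | 102         | 3          | 2023-01-22       | 200.00 |
| 5              | 101         | 3          | 2023-02-10       | 200.00 |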
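+----------------+-------------+------------+------------------+--------+

Products table:

+------------+----------+--------+
| product_id | category | price  |
+------------+----------+--------+
| 1          | A        | 100.00 |
| 2          | B        | 150.00 |
| 3          | C        | 200.00 |
+------------+----------+--------+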

Output:

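+-------------+--------------+-------------------+-------------------+------------------------+--------------+---------------+
| customer_id | total_amount | transaction_count | unique_categories | avg_transaction_amount | top_category | loyalty_score |
+-------------+--------------+-------------------+-------------------+------------------------+--------------+---------------+
| 101         | 450.00       | 3                 | 3                 | 150.00                 | C            | 34.50         |
| 102         | 300.00       | 2                 | 2                 | 150.00                 | C            | 23.00         |
+-------------+--------------+-------------------+-------------------+------------------------+--------------+---------------+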

Explanation:

  • For customer 101:
    • Total amount spent: 100.00 + 150.00 + 200.00 = 450.00
    • Number of transactions: 3
    • Unique categories: A, B, C (3 categories)
    • Average transaction amount: 450.00 / 3 = 150.00
    • Top category: C (Customer 101 made 1 purchase each in categories A, B, and C. Since the count is the same for all categories, we choose the most recent transaction, which is category C on 2023-02-10)
    • Loyalty score: (3 * 10) + (450.00 / 100) = 34.50
  • For customer 102:
    • Total amount spent: 100.00 + 200.00 = 300.00
    • Number of transactions: 2
    • Unique categories: A, C (2 categories)
    • Average transaction amount: 300.00 / 2 = 150.00
    • Top category: C (Customer 102 made 1 purchase each in categories A and C. Since the count is the same for both categories, we choose the most recent transaction, which is category C on 2023-01-22)
    • Loyalty score: (2 * 10) + (300.00 / 100) = 23.00

Note: The output is ordered by loyalty_score in descending order, then by customer_id in ascending order.

Solutions

Solution 1: Grouping + Window Functions + Join

First, we join the Transactions table with the Products table on product_id, so that every transaction carries its product's category, and store the result in a common table expression (CTE) T.
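To make the join step concrete, here is a minimal sketch that inlines the example rows as CTEs so it runs on its own (assuming MySQL 8+). The product-to-category mapping is inferred from the Explanation section, and the price column is left out because the solution never reads it.

```sql
WITH
    -- Example transactions, inlined instead of read from a real table
    Transactions AS (
        SELECT 1 AS transaction_id, 101 AS customer_id, 1 AS product_id,
               DATE '2023-01-01' AS transaction_date, 100.00 AS amount
        UNION ALL SELECT 2, 101, 2, DATE '2023-01-15', 150.00
        UNION ALL SELECT 3, 102, 1, DATE '2023-01-01', 100.00
        UNION ALL SELECT 4, 102, 3, DATE '2023-01-22', 200.00
        UNION ALL SELECT 5, 101, 3, DATE '2023-02-10', 200.00
    ),
    -- Category per product, as implied by the Explanation (price omitted)
    Products AS (
        SELECT 1 AS product_id, 'A' AS category
        UNION ALL SELECT 2, 'B'
        UNION ALL SELECT 3, 'C'
    ),
    -- The join step: attach a category to every transaction
    T AS (
        SELECT *
        FROM Transactions JOIN Products USING (product_id)
    )
SELECT transaction_id, customer_id, category, transaction_date, amount
FROM T
ORDER BY transaction_id;
-- Rows returned:
--   1 | 101 | A | 2023-01-01 | 100.00
--   2 | 101 | B | 2023-01-15 | 150.00
--   3 | 102 | A | 2023-01-01 | 100.00
--   4 | 102 | C | 2023-01-22 | 200.00
--   5 | 101 | C | 2023-02-10 | 200.00
```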

Next, we group T by customer and category to get the transaction count and the most recent transaction date for each customer in each category, and store the result in a CTE P.
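As an illustration of the grouping step, the sketch below starts from the joined example rows (inlined here as a stand-in for T, with only the columns this step needs) and aggregates them per customer and category. It assumes MySQL 8+.

```sql
WITH
    -- Stand-in for the joined rows of the example
    T AS (
        SELECT 101 AS customer_id, 'A' AS category, DATE '2023-01-01' AS transaction_date
        UNION ALL SELECT 101, 'B', DATE '2023-01-15'
        UNION ALL SELECT 102, 'A', DATE '2023-01-01'
        UNION ALL SELECT 102, 'C', DATE '2023-01-22'
        UNION ALL SELECT 101, 'C', DATE '2023-02-10'
    )
SELECT
    customer_id,
    category,
    COUNT(1) AS cnt,                   -- purchases in this category
    MAX(transaction_date) AS max_date  -- latest purchase in this category
FROM T
GROUP BY customer_id, category
ORDER BY customer_id, category;
-- Rows returned:
--   101 | A | 1 | 2023-01-01
--   101 | B | 1 | 2023-01-15
--   101 | C | 1 | 2023-02-10
--   102 | A | 1 | 2023-01-01
--   102 | C | 1 | 2023-01-22
```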

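Then, for each customer, we rank the categories in P by transaction count in descending order, breaking ties by the most recent transaction date in descending order, and store the ranks in a CTE R. The category with rank 1 is the customer's top category.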
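The ranking step can be sketched on the example data as follows. The per-category counts and dates are inlined as a stand-in for P, and the snippet assumes MySQL 8+ for window function support.

```sql
WITH
    -- Stand-in for the per-customer, per-category aggregates of the example
    P AS (
        SELECT 101 AS customer_id, 'A' AS category, 1 AS cnt, DATE '2023-01-01' AS max_date
        UNION ALL SELECT 101, 'B', 1, DATE '2023-01-15'
        UNION ALL SELECT 101, 'C', 1, DATE '2023-02-10'
        UNION ALL SELECT 102, 'A', 1, DATE '2023-01-01'
        UNION ALL SELECT 102, 'C', 1, DATE '2023-01-22'
    )
SELECT
    customer_id,
    category,
    -- Highest count first; among equal counts, the latest purchase first
    RANK() OVER (
        PARTITION BY customer_id
        ORDER BY cnt DESC, max_date DESC
    ) AS rk
FROM P
ORDER BY customer_id, rk;
-- Rows returned:
--   101 | C | 1
--   101 | B | 2
--   101 | A | 3
--   102 | C | 1
--   102 | A | 2
```

All counts are equal in the example, so the date decides and category C ranks first for both customers. Note that if two categories tied on both count and latest date, RANK() would give both of them rank 1; ROW_NUMBER() with an extra tiebreaker is one way to guard against that, though the problem statement does not say which category should win in that case.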

Finally, we join T with the rank-1 rows of R and group by customer to compute the total amount spent, transaction count, number of unique categories, average transaction amount, top category, and loyalty score. The result is ordered by loyalty score in descending order, then by customer ID in ascending order.

    # Write your MySQL query statement below
    WITH
        -- T: each transaction together with its product's category
        T AS (
            SELECT *
            FROM
                Transactions
                JOIN Products USING (product_id)
        ),
        -- P: purchase count and latest purchase date per customer and category
        P AS (
            SELECT
                customer_id,
                category,
                COUNT(1) cnt,
                MAX(transaction_date) max_date
            FROM T
            GROUP BY 1, 2
        ),
        -- R: rank each customer's categories by count, then by recency
        R AS (
            SELECT
                customer_id,
                category,
                RANK() OVER (
                    PARTITION BY customer_id
                    ORDER BY cnt DESC, max_date DESC
                ) rk
            FROM P
        )
    SELECT
        t.customer_id,
        ROUND(SUM(amount), 2) total_amount,
        COUNT(1) transaction_count,
        COUNT(DISTINCT t.category) unique_categories,
        ROUND(AVG(amount), 2) avg_transaction_amount,
        r.category top_category,
        ROUND(COUNT(1) * 10 + SUM(amount) / 100, 2) loyalty_score
    FROM
        T t
        -- keep only each customer's top-ranked category
        JOIN R r ON t.customer_id = r.customer_id AND r.rk = 1
    -- r.category is grouped as well so the query also runs under ONLY_FULL_GROUP_BY
    GROUP BY t.customer_id, r.category
    ORDER BY 7 DESC, 1;
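
To try the query locally, the example data can be loaded with a script along these lines. This is only a sketch: the DECIMAL and VARCHAR sizes are assumptions (the problem gives just int, date, and decimal), and the price values are assumed to equal the transaction amount of each product. The query never reads price, so those values do not affect the result.

```sql
-- Column sizes below are assumptions, not part of the problem statement.
CREATE TABLE Transactions (
    transaction_id   INT PRIMARY KEY,
    customer_id      INT,
    product_id       INT,
    transaction_date DATE,
    amount           DECIMAL(10, 2)
);

CREATE TABLE Products (
    product_id INT PRIMARY KEY,
    category   VARCHAR(20),
    price      DECIMAL(10, 2)
);

-- Rows from the example's Transactions table
INSERT INTO Transactions VALUES
    (1, 101, 1, '2023-01-01', 100.00),
    (2, 101, 2, '2023-01-15', 150.00),
    (3, 102, 1, '2023-01-01', 100.00),
    (4, 102, 3, '2023-01-22', 200.00),
    (5, 101, 3, '2023-02-10', 200.00);

-- Categories follow the Explanation; prices are assumed (unused by the query)
INSERT INTO Products VALUES
    (1, 'A', 100.00),
    (2, 'B', 150.00),
    (3, 'C', 200.00);
```

With these rows in place, running the solution query should reproduce the two rows of the example Output: customer 101 with a loyalty score of 34.50, followed by customer 102 with 23.00.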
    
    
