Welcome to Subscribe On Youtube

3808. Find Emotionally Consistent Users

Description

Table: reactions

+--++
\| user_id      \| int     \|
\| content_id   \| int     \|
\| reaction     \| varchar \|
+--++++++++-++-++-+----+

Explanation:

  • User 1:
    • Total reactions = 5
    • like appears 4 times
    • reaction_ratio = 4 / 5 = 0.80
    • Meets the 60% consistency requirement
  • User 2:
    • Total reactions = 5
    • Most frequent reaction appears only 2 times
    • reaction_ratio = 2 / 5 = 0.40
    • Does not meet the consistency requirement
  • User 3:
    • Total reactions = 5
    • 'love' appears 5 times
    • reaction_ratio = 5 / 5 = 1.00
    • Meets the consistency requirement

The Results table is ordered by reaction_ratio in descending order, then by user_id in ascending order.

</div>

Solutions

Solution 1: Grouping Statistics + Join Query

We first count the number of each reaction for every user and record it in a temporary table $t$. Then, based on the temporary table $t$, we calculate the maximum reaction count and total reaction count for each user, compute the reaction ratio, and filter out the users who meet the conditions, recording them in a temporary table $s$. Finally, we join the temporary tables $s$ and $t$ to find the dominant reaction for each user and sort the results as required.

  • import pandas as pd
    from decimal import Decimal, ROUND_HALF_UP
    
    
    def find_emotionally_consistent_users(reactions: pd.DataFrame) -> pd.DataFrame:
        t = reactions.groupby(["user_id", "reaction"]).size().reset_index(name="cnt")
    
        s = (
            t.groupby("user_id")
            .agg(mx_cnt=("cnt", "max"), total_cnt=("cnt", "sum"))
            .reset_index()
        )
    
        s["reaction_ratio"] = (
            s["mx_cnt"]
            .div(s["total_cnt"])
            .apply(
                lambda x: float(
                    Decimal(str(x)).quantize(Decimal("0.00"), rounding=ROUND_HALF_UP)
                )
            )
        )
    
        s = s[(s["reaction_ratio"] >= 0.60) & (s["total_cnt"] >= 5)]
    
        merged = pd.merge(
            s[["user_id", "mx_cnt", "reaction_ratio"]],
            t,
            left_on=["user_id", "mx_cnt"],
            right_on=["user_id", "cnt"],
        )
    
        result = (
            merged[["user_id", "reaction", "reaction_ratio"]]
            .rename(columns={"reaction": "dominant_reaction"})
            .sort_values(by=["reaction_ratio", "user_id"], ascending=[False, True])
            .reset_index(drop=True)
        )
    
        return result
    
    
  • # Write your MySQL query statement below
    WITH
        t AS (
            SELECT
                user_id,
                reaction,
                COUNT(1) cnt
            FROM reactions
            GROUP BY 1, 2
        ),
        s AS (
            SELECT
                user_id,
                MAX(cnt) mx_cnt,
                ROUND(MAX(cnt) / SUM(cnt), 2) reaction_ratio
            FROM t
            GROUP BY 1
            HAVING reaction_ratio >= 0.60 AND SUM(cnt) >= 5
        )
    SELECT user_id, reaction dominant_reaction, reaction_ratio
    FROM
        s
        JOIN t USING (user_id)
    WHERE cnt = mx_cnt
    ORDER BY 3 DESC, 1;
    
    

All Problems

All Solutions