Welcome to Subscribe On Youtube

3055. Top Percentile Fraud

Description

Table: Fraud

+-------------+---------+
\| Column Name \| Type    \|
+-------------+---------+
\| policy_id   \| int     \|
\| state       \| varchar \|
\| fraud_score \| int     \|
+-------------+---------+
policy_id is column of unique values for this table.
This table contains policy id, state, and fraud score.

The Leetcode Insurance Corp has developed an ML-driven predictive model to detect the likelihood of fraudulent claims. Consequently, they allocate their most seasoned claim adjusters to address the top 5% of claims flagged by this model.

Write a solution to find the top 5 percentile of claims from each state.

Return the result table ordered by state in ascending order, fraud_score in descending order, and policy_id in ascending order.

The result format is in the following example.

 

Example 1:

Input: 
Fraud table:
+-----------+------------+-------------+
\| policy_id \| state      \| fraud_score \| 
+-----------+------------+-------------+
\| 1         \| California \| 0.92        \| 
\| 2         \| California \| 0.68        \|   
\| 3         \| California \| 0.17        \| 
\| 4         \| New York   \| 0.94        \| 
\| 5         \| New York   \| 0.81        \| 
\| 6         \| New York   \| 0.77        \|  
\| 7         \| Texas      \| 0.98        \|  
\| 8         \| Texas      \| 0.97        \| 
\| 9         \| Texas      \| 0.96        \| 
\| 10        \| Florida    \| 0.97        \|  
\| 11        \| Florida    \| 0.98        \| 
\| 12        \| Florida    \| 0.78        \| 
\| 13        \| Florida    \| 0.88        \| 
\| 14        \| Florida    \| 0.66        \| 
+-----------+------------+-------------+
Output: 
+-----------+------------+-------------+
\| policy_id \| state      \| fraud_score \|
+-----------+------------+-------------+
\| 1         \| California \| 0.92        \| 
\| 11        \| Florida    \| 0.98        \| 
\| 4         \| New York   \| 0.94        \| 
\| 7         \| Texas      \| 0.98        \|  
+-----------+------------+-------------+
Explanation
- For the state of California, only policy ID 1, with a fraud score of 0.92, falls within the top 5 percentile for this state.
- For the state of Florida, only policy ID 11, with a fraud score of 0.98, falls within the top 5 percentile for this state. 
- For the state of New York, only policy ID 4, with a fraud score of 0.94, falls within the top 5 percentile for this state. 
- For the state of Texas, only policy ID 7, with a fraud score of 0.98, falls within the top 5 percentile for this state. 
Output table is ordered by state in ascending order, fraud score in descending order, and policy ID in ascending order.

Solutions

Solution 1: Using Window Function

We can use the RANK() window function to calculate the ranking of fraud scores for each state, then filter out the records with a rank of 1, and sort them as required by the problem.

  • # Write your MySQL query statement below
    WITH
        T AS (
            SELECT
                *,
                RANK() OVER (
                    PARTITION BY state
                    ORDER BY fraud_score DESC
                ) AS rk
            FROM Fraud
        )
    SELECT policy_id, state, fraud_score
    FROM T
    WHERE rk = 1
    ORDER BY 2, 3 DESC, 1;
    
    

All Problems

All Solutions