Welcome to Subscribe On Youtube

3278. Find Candidates for Data Scientist Position II 🔒

Description

Table: Candidates

+--++ 
\| candidate_id \| int     \| 
\| skill        \| varchar \|
\| proficiency  \| int     \|
+--++ 
\| Column Name  \| Type    \| 
+--++
(project_id, skill) is the primary key for this table.
Each row includes project_id, required skill, and its importance (1-5) for the project.

Leetcode is staffing for multiple data science projects. Write a solution to find the best candidate for each project based on the following criteria:

  1. Candidates must have all the skills required for a project.
  2. Calculate a score for each candidate-project pair as follows:
    • Start with 100 points
    • Add 10 points for each skill where proficiency > importance
    • Subtract 5 points for each skill where proficiency < importance

Include only the top candidate (highest score) for each project. If there’s a tie, choose the candidate with the lower candidate_id. If there is no suitable candidate for a project, do not return that project.

Return a result table ordered by project_id in ascending order.

The result format is in the following example.

 

Example:

Input:

Candidates table:

+--+--+-+
\| 101          \| Python    \| 5           \|
\| 101          \| Tableau   \| 3           \|
\| 101          \| PostgreSQL\| 4           \|
\| 101          \| TensorFlow\| 2           \|
\| 102          \| Python    \| 4           \|
\| 102          \| Tableau   \| 5           \|
\| 102          \| PostgreSQL\| 4           \|
\| 102          \| R         \| 4           \|
\| 103          \| Python    \| 3           \|
\| 103          \| Tableau   \| 5           \|
\| 103          \| PostgreSQL\| 5           \|
\| 103          \| Spark     \| 4           \|
+--+--++
\| project_id  \| skill     \| importance \|
+-+--++

Output:

+-+--+-+
\| project_id  \| candidate_id \| score \|
+-+--+-+
\| 501         \| 101          \| 105   \|
\| 502         \| 102          \| 130   \|
+-+--+-+

Explanation:

  • For Project 501, Candidate 101 has the highest score of 105. All other candidates have the same score but Candidate 101 has the lowest candidate_id among them.
  • For Project 502, Candidate 102 has the highest score of 130.

The output table is ordered by project_id in ascending order.

Solutions

Solution 1: Equi-Join + Group Statistics + Window Function

We can perform an equi-join of the Candidates table and the Projects table on the skill column, counting the number of matched skills and calculating the total score for each candidate in each project, which is recorded in table S.

Next, we count the required number of skills for each project, recording the results in table T.

Then, we perform an equi-join of tables S and T on the project_id column, filtering out candidates whose number of matched skills equals the required number of skills, and recording them in table P. We calculate the rank (rk) for each candidate within each project.

Finally, we filter out the candidates with rank $rk = 1$ for each project, identifying them as the best candidates.

  • WITH
        S AS (
            SELECT
                candidate_id,
                project_id,
                COUNT(*) matched_skills,
                SUM(
                    CASE
                        WHEN proficiency > importance THEN 10
                        WHEN proficiency < importance THEN -5
                        ELSE 0
                    END
                ) + 100 AS score
            FROM
                Candidates
                JOIN Projects USING (skill)
            GROUP BY 1, 2
        ),
        T AS (
            SELECT project_id, COUNT(1) required_skills
            FROM Projects
            GROUP BY 1
        ),
        P AS (
            SELECT
                project_id,
                candidate_id,
                score,
                RANK() OVER (
                    PARTITION BY project_id
                    ORDER BY score DESC, candidate_id
                ) rk
            FROM
                S
                JOIN T USING (project_id)
            WHERE matched_skills = required_skills
        )
    SELECT project_id, candidate_id, score
    FROM P
    WHERE rk = 1
    ORDER BY 1;
    
    

All Problems

All Solutions