Welcome to Subscribe On Youtube

3586. Find COVID Recovery Patients

Description

Table: patients

+-++
\| patient_id  \| int     \|
\| patient_name\| varchar \|
\| age         \| int     \|
+-++
\| Column Name \| Type    \|
+-++
test_id is the unique identifier for this table.
Each row represents a COVID test result. The result can be Positive, Negative, or Inconclusive.

Write a solution to find patients who have recovered from COVID - patients who tested positive but later tested negative.

  • A patient is considered recovered if they have at least one Positive test followed by at least one Negative test on a later date
  • Calculate the recovery time in days as the difference between the first positive test and the first negative test after that positive test
  • Only include patients who have both positive and negative test results

Return the result table ordered by recovery_time in ascending order, then by patient_name in ascending order.

The result format is in the following example.

 

Example:

Input:

patients table:

++--+--+
\| 1          \| Alice Smith  \| 28  \|
\| 2          \| Bob Johnson  \| 35  \|
\| 3          \| Carol Davis  \| 42  \|
\| 4          \| David Wilson \| 31  \|
\| 5          \| Emma Brown   \| 29  \|
++--++++--+
\| test_id \| patient_id \| test_date  \| result       \|
++++--+

Output:

++--++
\| patient_id \| patient_name \| age \| recovery_time \|
++--++
\| 1          \| Alice Smith  \| 28  \| 10            \|
\| 3          \| Carol Davis  \| 42  \| 10            \|
\| 2          \| Bob Johnson  \| 35  \| 11            \|
++--++

Explanation:

  • Alice Smith (patient_id = 1):
    • First positive test: 2023-01-15
    • First negative test after positive: 2023-01-25
    • Recovery time: 25 - 15 = 10 days
  • Bob Johnson (patient_id = 2):
    • First positive test: 2023-02-01
    • Inconclusive test on 2023-02-05 (ignored for recovery calculation)
    • First negative test after positive: 2023-02-12
    • Recovery time: 12 - 1 = 11 days
  • Carol Davis (patient_id = 3):
    • Had negative test on 2023-01-20 (before positive test)
    • First positive test: 2023-02-10
    • First negative test after positive: 2023-02-20
    • Recovery time: 20 - 10 = 10 days
  • Patients not included:
    • David Wilson (patient_id = 4): Only has positive tests, no negative test after positive
    • Emma Brown (patient_id = 5): Only has negative tests, never tested positive

Output table is ordered by recovery_time in ascending order, and then by patient_name in ascending order.

Solutions

Solution 1: Group Statistics + Equi-join

We can first find the date of the first positive test for each patient and record this in table first_positive. Next, we can find the date of the first negative test for each patient after their first positive test in the covid_tests table, and record this in table first_negative_after_positive. Finally, we join these two tables with the patients table, calculate the recovery time, and sort according to requirements.

  • import pandas as pd
    
    
    def find_covid_recovery_patients(
        patients: pd.DataFrame, covid_tests: pd.DataFrame
    ) -> pd.DataFrame:
        covid_tests["test_date"] = pd.to_datetime(covid_tests["test_date"])
    
        pos = (
            covid_tests[covid_tests["result"] == "Positive"]
            .groupby("patient_id", as_index=False)["test_date"]
            .min()
        )
        pos.rename(columns={"test_date": "first_positive_date"}, inplace=True)
    
        neg = covid_tests.merge(pos, on="patient_id")
        neg = neg[
            (neg["result"] == "Negative") & (neg["test_date"] > neg["first_positive_date"])
        ]
        neg = neg.groupby("patient_id", as_index=False)["test_date"].min()
        neg.rename(columns={"test_date": "first_negative_date"}, inplace=True)
    
        df = pos.merge(neg, on="patient_id")
        df["recovery_time"] = (
            df["first_negative_date"] - df["first_positive_date"]
        ).dt.days
    
        out = df.merge(patients, on="patient_id")[
            ["patient_id", "patient_name", "age", "recovery_time"]
        ]
        return out.sort_values(by=["recovery_time", "patient_name"]).reset_index(drop=True)
    
    
  • # Write your MySQL query statement below
    WITH
        first_positive AS (
            SELECT
                patient_id,
                MIN(test_date) AS first_positive_date
            FROM covid_tests
            WHERE result = 'Positive'
            GROUP BY patient_id
        ),
        first_negative_after_positive AS (
            SELECT
                t.patient_id,
                MIN(t.test_date) AS first_negative_date
            FROM
                covid_tests t
                JOIN first_positive p
                    ON t.patient_id = p.patient_id AND t.test_date > p.first_positive_date
            WHERE t.result = 'Negative'
            GROUP BY t.patient_id
        )
    SELECT
        p.patient_id,
        p.patient_name,
        p.age,
        DATEDIFF(n.first_negative_date, f.first_positive_date) AS recovery_time
    FROM
        first_positive f
        JOIN first_negative_after_positive n ON f.patient_id = n.patient_id
        JOIN patients p ON p.patient_id = f.patient_id
    ORDER BY recovery_time ASC, patient_name ASC;
    
    

All Problems

All Solutions