Welcome to Subscribe On Youtube

3601. Find Drivers with Improved Fuel Efficiency

Description

Table: drivers

+-++
\| driver_id   \| int     \|
\| driver_name \| varchar \|
+-++++--+--+--+++
\| trip_id \| driver_id \| trip_date  \| distance_km \| fuel_consumed \|
+--++-+++

Output:

+++-++
\| driver_id \| driver_name   \| first_half_avg   \| second_half_avg   \| efficiency_improvement \|
+++-++
\| 2         \| Bob Smith     \| 11.24            \| 13.33             \| 2.10                   \|
\| 1         \| Alice Johnson \| 11.97            \| 14.02             \| 2.05                   \|
+++-++

Explanation:

  • Alice Johnson (driver_id = 1):
    • First half trips (Jan-Jun): Feb 15 (120.5/10.2 = 11.81), Mar 20 (200.0/16.5 = 12.12)
    • First half average efficiency: (11.81 + 12.12) / 2 = 11.97
    • Second half trips (Jul-Dec): Aug 10 (150.0/11.0 = 13.64), Sep 25 (180.0/12.5 = 14.40)
    • Second half average efficiency: (13.64 + 14.40) / 2 = 14.02
    • Efficiency improvement: 14.02 - 11.97 = 2.05
  • Bob Smith (driver_id = 2):
    • First half trips: Jan 10 (100.0/9.0 = 11.11), Apr 15 (250.0/22.0 = 11.36)
    • First half average efficiency: (11.11 + 11.36) / 2 = 11.24
    • Second half trips: Oct 5 (200.0/15.0 = 13.33)
    • Second half average efficiency: 13.33
    • Efficiency improvement: 13.33 - 11.24 = 2.10 (rounded to 2 decimal places)
  • Drivers not included:
    • Carol Davis (driver_id = 3): Only has trips in first half (Mar, May)
    • David Wilson (driver_id = 4): Only has trips in second half (Jul, Nov)
    • Emma Brown (driver_id = 5): Only has trips in first half (Feb)

The output table is ordered by efficiency improvement in descending order then by name in ascending order.

</div>

Solutions

Solution 1: Group Aggregation + Join Query

First, we perform group aggregation on the trips table to calculate the average fuel efficiency for each driver in the first half and the second half of the year.

Then, we join the results with the drivers table, filter out the drivers whose fuel efficiency has improved, and calculate the amount of improvement.

  • import pandas as pd
    
    
    def find_improved_efficiency_drivers(
        drivers: pd.DataFrame, trips: pd.DataFrame
    ) -> pd.DataFrame:
        trips = trips.copy()
        trips["trip_date"] = pd.to_datetime(trips["trip_date"])
        trips["half"] = trips["trip_date"].dt.month.apply(lambda m: 1 if m <= 6 else 2)
        trips["efficiency"] = trips["distance_km"] / trips["fuel_consumed"]
        half_avg = (
            trips.groupby(["driver_id", "half"])["efficiency"]
            .mean()
            .reset_index(name="half_avg")
        )
        pivot = half_avg.pivot(index="driver_id", columns="half", values="half_avg").rename(
            columns={1: "first_half_avg", 2: "second_half_avg"}
        )
        pivot = pivot.dropna()
        pivot = pivot[pivot["second_half_avg"] > pivot["first_half_avg"]]
        pivot["efficiency_improvement"] = (
            pivot["second_half_avg"] - pivot["first_half_avg"]
        ).round(2)
        pivot["first_half_avg"] = pivot["first_half_avg"].round(2)
        pivot["second_half_avg"] = pivot["second_half_avg"].round(2)
        result = pivot.reset_index().merge(drivers, on="driver_id")
        result = result.sort_values(
            by=["efficiency_improvement", "driver_name"], ascending=[False, True]
        )
        return result[
            [
                "driver_id",
                "driver_name",
                "first_half_avg",
                "second_half_avg",
                "efficiency_improvement",
            ]
        ]
    
    
  • # Write your MySQL query statement below
    WITH
        T AS (
            SELECT
                driver_id,
                AVG(distance_km / fuel_consumed) half_avg,
                CASE
                    WHEN MONTH(trip_date) <= 6 THEN 1
                    ELSE 2
                END half
            FROM trips
            GROUP BY driver_id, half
        )
    SELECT
        t1.driver_id,
        d.driver_name,
        ROUND(t1.half_avg, 2) first_half_avg,
        ROUND(t2.half_avg, 2) second_half_avg,
        ROUND(t2.half_avg - t1.half_avg, 2) efficiency_improvement
    FROM
        T t1
        JOIN T t2 ON t1.driver_id = t2.driver_id AND t1.half < t2.half AND t1.half_avg < t2.half_avg
        JOIN drivers d ON t1.driver_id = d.driver_id
    ORDER BY efficiency_improvement DESC, d.driver_name;
    
    

All Problems

All Solutions