Welcome to Subscribe On Youtube
3601. Find Drivers with Improved Fuel Efficiency
Description
Table: drivers
+-++ \| driver_id \| int \| \| driver_name \| varchar \| +-++++--+--+--+++ \| trip_id \| driver_id \| trip_date \| distance_km \| fuel_consumed \| +--++-+++
Output:
+++-++ \| driver_id \| driver_name \| first_half_avg \| second_half_avg \| efficiency_improvement \| +++-++ \| 2 \| Bob Smith \| 11.24 \| 13.33 \| 2.10 \| \| 1 \| Alice Johnson \| 11.97 \| 14.02 \| 2.05 \| +++-++
Explanation:
- Alice Johnson (driver_id = 1):
- First half trips (Jan-Jun): Feb 15 (120.5/10.2 = 11.81), Mar 20 (200.0/16.5 = 12.12)
- First half average efficiency: (11.81 + 12.12) / 2 = 11.97
- Second half trips (Jul-Dec): Aug 10 (150.0/11.0 = 13.64), Sep 25 (180.0/12.5 = 14.40)
- Second half average efficiency: (13.64 + 14.40) / 2 = 14.02
- Efficiency improvement: 14.02 - 11.97 = 2.05
- Bob Smith (driver_id = 2):
- First half trips: Jan 10 (100.0/9.0 = 11.11), Apr 15 (250.0/22.0 = 11.36)
- First half average efficiency: (11.11 + 11.36) / 2 = 11.24
- Second half trips: Oct 5 (200.0/15.0 = 13.33)
- Second half average efficiency: 13.33
- Efficiency improvement: 13.33 - 11.24 = 2.10 (rounded to 2 decimal places)
- Drivers not included:
- Carol Davis (driver_id = 3): Only has trips in first half (Mar, May)
- David Wilson (driver_id = 4): Only has trips in second half (Jul, Nov)
- Emma Brown (driver_id = 5): Only has trips in first half (Feb)
The output table is ordered by efficiency improvement in descending order then by name in ascending order.
</div>
Solutions
Solution 1: Group Aggregation + Join Query
First, we perform group aggregation on the trips
table to calculate the average fuel efficiency for each driver in the first half and the second half of the year.
Then, we join the results with the drivers
table, filter out the drivers whose fuel efficiency has improved, and calculate the amount of improvement.
-
import pandas as pd def find_improved_efficiency_drivers( drivers: pd.DataFrame, trips: pd.DataFrame ) -> pd.DataFrame: trips = trips.copy() trips["trip_date"] = pd.to_datetime(trips["trip_date"]) trips["half"] = trips["trip_date"].dt.month.apply(lambda m: 1 if m <= 6 else 2) trips["efficiency"] = trips["distance_km"] / trips["fuel_consumed"] half_avg = ( trips.groupby(["driver_id", "half"])["efficiency"] .mean() .reset_index(name="half_avg") ) pivot = half_avg.pivot(index="driver_id", columns="half", values="half_avg").rename( columns={1: "first_half_avg", 2: "second_half_avg"} ) pivot = pivot.dropna() pivot = pivot[pivot["second_half_avg"] > pivot["first_half_avg"]] pivot["efficiency_improvement"] = ( pivot["second_half_avg"] - pivot["first_half_avg"] ).round(2) pivot["first_half_avg"] = pivot["first_half_avg"].round(2) pivot["second_half_avg"] = pivot["second_half_avg"].round(2) result = pivot.reset_index().merge(drivers, on="driver_id") result = result.sort_values( by=["efficiency_improvement", "driver_name"], ascending=[False, True] ) return result[ [ "driver_id", "driver_name", "first_half_avg", "second_half_avg", "efficiency_improvement", ] ]
-
# Write your MySQL query statement below WITH T AS ( SELECT driver_id, AVG(distance_km / fuel_consumed) half_avg, CASE WHEN MONTH(trip_date) <= 6 THEN 1 ELSE 2 END half FROM trips GROUP BY driver_id, half ) SELECT t1.driver_id, d.driver_name, ROUND(t1.half_avg, 2) first_half_avg, ROUND(t2.half_avg, 2) second_half_avg, ROUND(t2.half_avg - t1.half_avg, 2) efficiency_improvement FROM T t1 JOIN T t2 ON t1.driver_id = t2.driver_id AND t1.half < t2.half AND t1.half_avg < t2.half_avg JOIN drivers d ON t1.driver_id = d.driver_id ORDER BY efficiency_improvement DESC, d.driver_name;