Welcome to Subscribe On Youtube
Question
Formatted question description: https://leetcode.ca/all/295.html
Level
Hard
Description
Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.
For example,
[2,3,4]
, the median is 3
[2,3]
, the median is (2 + 3) / 2 = 2.5
Design a data structure that supports the following two operations:
- void addNum(int num) - Add a integer number from the data stream to the data structure.
- double findMedian() - Return the median of all elements so far.
Example:
addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2
Follow up:
- If all integer numbers from the stream are between 0 and 100, how would you optimize it?
- If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?
Algorithm
Since the data in the data stream is not in order
, we should first think of a way to make it in ordered. If we use a vector to save the data stream, we must sort the array every time a new data comes in, which is not efficient.
Use large and small heaps
to solve the problem, where the large heap holds the larger numbers in the right half, and the small heap holds the smaller numbers in the left half. In this way, the entire array is divided
into two sections in the middle. Since the heap is saved from large to small, we hope that the data in the large pile is from small to large, so it is convenient to take the first one to calculate the median.
Follow up
Q-1: If all integer numbers from the stream are between 0 and 100, how would you optimize it?
- We can maintain an integer array of length 100 to store the count of each number along with a total count. Then, we can iterate over the array to find the middle value to get our median. Time and space complexity would be
O(100) = O(1)
.
Q-2: If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?
- In this case, we can keep a counter for
lessThanHundred
andgreaterThanHundred
. As we know the solution will be definitely in 0-100 we don’t need to know those number which are >100 or <0, only count of them will be enough.
Code
-
public class Find_Median_from_Data_Stream { public static void main(String[] args) { Find_Median_from_Data_Stream out = new Find_Median_from_Data_Stream(); MedianFinder mf = out.new MedianFinder(); mf.addNum(1); mf.addNum(2); System.out.println( mf.findMedian() ); // 1.5 mf.addNum(3); System.out.println( mf.findMedian() ); // 2 } /** * Your MedianFinder object will be instantiated and called as such: * * obj.addNum(num); * double param_2 = obj.findMedian(); */ // ref: https://leetcode.com/problems/find-median-from-data-stream/solution/ class MedianFinder { PriorityQueue<Integer> smallerHalf; // one more than larger half, or equal to larger half PriorityQueue<Integer> largerHalf; /** initialize your data structure here. */ public MedianFinder() { smallerHalf = new PriorityQueue<Integer>(new MaxComparator()); largerHalf = new PriorityQueue<Integer>(new MinComparator()); } // below addNum() with too many condition checks // @note: simple logic: always add to larger half, and if larger half 1 more than smaller half, then re-balance public void addNum(int num) { // 2 polls() and 3 offer(), total 5 operations, each logN => 5 * logN ~= logN // https://stackoverflow.com/questions/28819327/time-complexity-of-inserting-in-to-a-heap // always to smaller half first // can check if num bigger than smallerHalf.peek, to directly offer to biggerHalf smallerHalf.offer(num); // now smaller half get additional int, re-balance by smaller half giving its max to smaller half largerHalf.offer(smallerHalf.poll()); if (smallerHalf.size() < largerHalf.size()) { // only for the 1st time initial num being added smallerHalf.offer(largerHalf.poll()); } } public double findMedian() { if (smallerHalf.size() == largerHalf.size()) { return (smallerHalf.peek() + largerHalf.peek()) / 2.0; // @note: double division } else { // smaller half is one more return smallerHalf.peek(); } } class MaxComparator implements Comparator<Integer> { @Override public int compare(Integer o1, Integer o2) { return o2 - o1; //max first } } class MinComparator implements Comparator<Integer> { @Override public int compare(Integer o1, Integer o2) { return o1 - o2; // min first } } } class MedianFinder_NlogN { List<Integer> list; /** initialize your data structure here. */ public MedianFinder_NlogN() { list = new ArrayList<>(); } public void addNum(int num) { list.add(num); Collections.sort(list); // maintain a sorted list } public double findMedian() { int length = list.size(); if (length % 2 == 1) { return list.get(length / 2); } else { return (list.get(length / 2) + list.get(length / 2 -1)) / 2.0; // @note: if x / 2, then will be an int, 3/2==1, 3/2.0==1.5 } } } } ############ class MedianFinder { private PriorityQueue<Integer> q1 = new PriorityQueue<>(); private PriorityQueue<Integer> q2 = new PriorityQueue<>(Collections.reverseOrder()); /** initialize your data structure here. */ public MedianFinder() { } public void addNum(int num) { q1.offer(num); q2.offer(q1.poll()); if (q2.size() - q1.size() > 1) { q1.offer(q2.poll()); } } public double findMedian() { if (q2.size() > q1.size()) { return q2.peek(); } return (q1.peek() + q2.peek()) * 1.0 / 2; } } /** * Your MedianFinder object will be instantiated and called as such: * MedianFinder obj = new MedianFinder(); * obj.addNum(num); * double param_2 = obj.findMedian(); */
-
// OJ: https://leetcode.com/problems/find-median-from-data-stream/ // Time: // MedianFinder: O(1) // addNum: O(logN) // findMedian: O(1) // Space: O(N) class MedianFinder { priority_queue<int> sm; priority_queue<int, vector<int>, greater<>> gt; public: MedianFinder() {} void addNum(int num) { if (gt.size() && num > gt.top()) { gt.push(num); if (gt.size() > sm.size()) { sm.push(gt.top()); gt.pop(); } } else { sm.push(num); if (sm.size() > gt.size() + 1) { gt.push(sm.top()); sm.pop(); } } } double findMedian() { return sm.size() > gt.size() ? sm.top() : ((double)sm.top() + gt.top()) / 2; } };
-
from heapq import heappush, heappop class MedianFinder: def __init__(self): self.smallerHalf = [] # the smaller half of the list, max heap (invert min-heap) self.largerHalf = [] # the larger half of the list, min heap def addNum(self, num: int) -> None: # trick for smaller half, use -1*val # heapq in python does NOT have comparator like in Java heappush(self.smallerHalf, -num) heappush(self.largerHalf, -heappop(self.smallerHalf)) # note: not self.smallerHalf.pop() if len(self.smallerHalf) < len(self.largerHalf): heappush(self.smallerHalf, -heappop(self.largerHalf)) def findMedian(self) -> float: if len(self.smallerHalf) == len(self.largerHalf): return (-self.smallerHalf[0] + self.largerHalf[0]) / 2.0 else: return float(-self.smallerHalf[0]) # Your MedianFinder object will be instantiated and called as such: # obj = MedianFinder() # obj.addNum(num) # param_2 = obj.findMedian() # follow up class MedianFinder: def __init__(self): self.counts = [0] * 101 self.total = 0 def addNum(self, num: int) -> None: self.counts[num] += 1 self.total += 1 def findMedian(self) -> float: if self.total % 2 == 0: # even number of elements middle1 = self.total // 2 middle2 = middle1 + 1 count = 0 i1 = i2 = 0 for i in range(101): count += self.counts[i] if count >= middle1 and i1 == 0: i1 = i if count >= middle2: i2 = i break return (i1 + i2) / 2 else: # odd number of elements middle = self.total // 2 + 1 count = 0 for i in range(101): count += self.counts[i] if count >= middle: return i ############ class MedianFinder: def __init__(self): """ initialize your data structure here. """ self.h1 = [] self.h2 = [] def addNum(self, num: int) -> None: heappush(self.h1, num) heappush(self.h2, -heappop(self.h1)) if len(self.h2) - len(self.h1) > 1: heappush(self.h1, -heappop(self.h2)) def findMedian(self) -> float: if len(self.h2) > len(self.h1): return -self.h2[0] return (self.h1[0] - self.h2[0]) / 2 # Your MedianFinder object will be instantiated and called as such: # obj = MedianFinder() # obj.addNum(num) # param_2 = obj.findMedian()
-
type MedianFinder struct { q1 hp q2 hp } /** initialize your data structure here. */ func Constructor() MedianFinder { return MedianFinder{hp{}, hp{} } } func (this *MedianFinder) AddNum(num int) { heap.Push(&this.q1, num) heap.Push(&this.q2, -heap.Pop(&this.q1).(int)) if this.q2.Len()-this.q1.Len() > 1 { heap.Push(&this.q1, -heap.Pop(&this.q2).(int)) } } func (this *MedianFinder) FindMedian() float64 { if this.q2.Len() > this.q1.Len() { return -float64(this.q2.IntSlice[0]) } return float64(this.q1.IntSlice[0]-this.q2.IntSlice[0]) / 2.0 } /** * Your MedianFinder object will be instantiated and called as such: * obj := Constructor(); * obj.AddNum(num); * param_2 := obj.FindMedian(); */ type hp struct{ sort.IntSlice } func (h hp) Less(i, j int) bool { return h.IntSlice[i] < h.IntSlice[j] } func (h *hp) Push(v interface{}) { h.IntSlice = append(h.IntSlice, v.(int)) } func (h *hp) Pop() interface{} { a := h.IntSlice v := a[len(a)-1] h.IntSlice = a[:len(a)-1] return v }
-
class MedianFinder { private nums: number[]; constructor() { this.nums = []; } addNum(num: number): void { const { nums } = this; let l = 0; let r = nums.length; while (l < r) { const mid = (l + r) >>> 1; if (nums[mid] < num) { l = mid + 1; } else { r = mid; } } nums.splice(l, 0, num); } findMedian(): number { const { nums } = this; const n = nums.length; if ((n & 1) === 1) { return nums[n >> 1]; } return (nums[n >> 1] + nums[(n >> 1) - 1]) / 2; } } /** * Your MedianFinder object will be instantiated and called as such: * var obj = new MedianFinder() * obj.addNum(num) * var param_2 = obj.findMedian() */
-
/** * initialize your data structure here. */ var MedianFinder = function () { this.val = []; }; /** * @param {number} num * @return {void} */ MedianFinder.prototype.addNum = function (num) { let left = 0; let right = this.val.length; while (left < right) { let mid = left + ~~((right - left) / 2); if (num > this.val[mid]) { left = mid + 1; } else { right = mid; } } this.val.splice(left, 0, num); }; /** * @return {number} */ MedianFinder.prototype.findMedian = function () { let mid = ~~(this.val.length / 2); return this.val.length % 2 ? this.val[mid] : (this.val[mid - 1] + this.val[mid]) / 2; };
-
public class MedianFinder { private List<int> nums; private int curIndex; /** initialize your data structure here. */ public MedianFinder() { nums = new List<int>(); } private int FindIndex(int val) { int left = 0; int right = nums.Count - 1; while (left <= right) { int mid = left + (right - left) / 2; if (val > nums[mid]) { left = mid + 1; } else { right = mid - 1; } } return left; } public void AddNum(int num) { if (nums.Count == 0) { nums.Add(num); curIndex = 0; } else { curIndex = FindIndex(num); if (curIndex == nums.Count) { nums.Add(num); } else { nums.Insert(curIndex, num); } } } public double FindMedian() { if (nums.Count % 2 == 1) { return (double)nums[nums.Count / 2]; } else { if (nums.Count == 0) { return 0; } else { return (double) (nums[nums.Count / 2 - 1] + nums[nums.Count / 2]) / 2; } } } } /** * Your MedianFinder object will be instantiated and called as such: * MedianFinder obj = new MedianFinder(); * obj.AddNum(num); * double param_2 = obj.FindMedian(); */
-
struct MedianFinder { nums: Vec<i32>, } /** * `&self` means the method takes an immutable reference. * If you need a mutable reference, change it to `&mut self` instead. */ impl MedianFinder { /** initialize your data structure here. */ fn new() -> Self { Self { nums: Vec::new() } } fn add_num(&mut self, num: i32) { let mut l = 0; let mut r = self.nums.len(); while l < r { let mid = l + r >> 1; if self.nums[mid] < num { l = mid + 1; } else { r = mid; } } self.nums.insert(l, num); } fn find_median(&self) -> f64 { let n = self.nums.len(); if (n & 1) == 1 { return f64::from(self.nums[n >> 1]); } f64::from(self.nums[n >> 1] + self.nums[(n >> 1) - 1]) / 2.0 } } /** * Your MedianFinder object will be instantiated and called as such: * let obj = MedianFinder::new(); * obj.add_num(num); * let ret_2: f64 = obj.find_median(); */