Formatted question description: https://leetcode.ca/all/692.html

# 692. Top K Frequent Words

Medium

## Description

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.
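The ordering rule above can be written as a single sort key. The sketch below is only an illustration: the name `rank_key` and the sample counts are made up for this example.

```python
def rank_key(item):
    """Sort key for a (word, frequency) pair: higher frequency first, then alphabetical."""
    word, freq = item
    # Negating the frequency makes an ascending sort put frequent words first;
    # the word itself breaks ties in ascending alphabetical order.
    return (-freq, word)


# Hypothetical counts, chosen so that two pairs of words tie on frequency.
counts = [("love", 2), ("coding", 1), ("i", 2), ("leetcode", 1)]
ranked = [word for word, _ in sorted(counts, key=rank_key)]
print(ranked)  # prints ['i', 'love', 'coding', 'leetcode']
```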

Example 1:

Input: [“i”, “love”, “leetcode”, “i”, “love”, “coding”], k = 2

Output: [“i”, “love”]

Explanation: “i” and “love” are the two most frequent words. Note that “i” comes before “love” due to a lower alphabetical order.

Example 2:

Input: [“the”, “day”, “is”, “sunny”, “the”, “the”, “the”, “sunny”, “is”, “is”], k = 4

Output: [“the”, “is”, “sunny”, “day”]

Explanation: “the”, “is”, “sunny” and “day” are the four most frequent words, with the numbers of occurrences being 4, 3, 2 and 1 respectively.

Note:

1. You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
2. Input words contain only lowercase letters.

Follow up:

1. Try to solve it in O(n log k) time and O(n) extra space.
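One way to aim for this bound in Python is `heapq.nsmallest`, which selects k items without fully sorting the input. This is a sketch rather than a reference solution: the function name `top_k_with_nsmallest` is invented here, and the running time is what one would typically expect from a size-k selection, not a guarantee stated by the problem.

```python
import heapq
from collections import Counter


def top_k_with_nsmallest(words, k):
    """Return the k most frequent words, ties broken alphabetically."""
    count = Counter(words)  # one counting pass over the input
    # The k smallest keys under (-frequency, word) are the k best-ranked words,
    # and nsmallest returns them already ordered from best to worst.
    best = heapq.nsmallest(k, count.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in best]


words = ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"]
print(top_k_with_nsmallest(words, 4))  # prints ['the', 'is', 'sunny', 'day']
```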

## Solution

First, loop over words and count each word’s occurrences in a map. Then maintain a priority queue of words, ordered so that the word polled first is the one with the fewest occurrences or, among words with the same number of occurrences, the one that comes last alphabetically. For each entry of the map, offer it to the priority queue; if the queue’s size then exceeds k, poll one element. The polled element is always the worst-ranked word currently in the queue, so the queue never holds more than k elements, and once every entry has been processed it holds exactly the k most frequent words.
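The counting and bounded-queue steps can be sketched in Python as follows. Python's `heapq` always pops the smallest item, so the wrapper class `Entry` defines "smaller" as "should be evicted first". Both `Entry` and `keep_top_k` are hypothetical names made up for this illustration.

```python
import heapq
from collections import Counter


class Entry:
    """A word with its count; the 'smallest' entry is the first to be evicted."""

    def __init__(self, word, freq):
        self.word = word
        self.freq = freq

    def __lt__(self, other):
        # Fewer occurrences means evicted earlier.
        if self.freq != other.freq:
            return self.freq < other.freq
        # Same count: the alphabetically later word is evicted earlier.
        return self.word > other.word


def keep_top_k(words, k):
    """Return a heap (list of Entry) holding the k best-ranked words."""
    count = Counter(words)
    heap = []
    for word, freq in count.items():
        heapq.heappush(heap, Entry(word, freq))
        if len(heap) > k:
            heapq.heappop(heap)  # drop the worst-ranked entry
    return heap


heap = keep_top_k(["i", "love", "leetcode", "i", "love", "coding"], 2)
print(sorted(entry.word for entry in heap))  # prints ['i', 'love']
```

The heap itself is not in answer order yet; it only guarantees that its first element is the next one to evict.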

After all the map entries have been processed, poll the words from the priority queue one by one and append each to a list. The words come out from worst-ranked to best-ranked, so reverse the list and return it.
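The poll-then-reverse step on its own looks roughly like this in Python. The helper name `drain_best_first` is an assumption for this sketch, and plain integers stand in for the queue entries.

```python
import heapq


def drain_best_first(heap):
    """Pop every item from a heapq-style list and return them largest first."""
    polled = []
    while heap:
        polled.append(heapq.heappop(heap))  # items come out smallest (worst) first
    polled.reverse()  # flip to largest (best) first
    return polled


# Plain integers stand in for the queue entries here.
heap = [3, 1, 4, 2]
heapq.heapify(heap)
print(drain_best_first(heap))  # prints [4, 3, 2, 1]
```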

import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class Top_K_Frequent_Words {

    public static void main(String[] args) {
        Top_K_Frequent_Words out = new Top_K_Frequent_Words();
        Solution s = out.new Solution();

        System.out.println(s.topKFrequent(new String[]{"the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"}, 4));
        System.out.println(s.topKFrequent(new String[]{"i", "love", "leetcode", "i", "love", "coding"}, 2));
    }

    // ref: https://leetcode.com/articles/top-k-frequent-words/
    // using a heap (PQ)
    // Time Complexity: O(Nlogk), where N is the length of words.
    //          We count the frequency of each word in O(N) time, then we add at most N distinct words to the heap, each in O(logk) time.
    //          Finally, we pop from the heap up to k times. As k≤N, this is O(Nlogk) in total.
    // Space Complexity: O(N), the space used to store our count.
    class Solution {
        public List<String> topKFrequent(String[] words, int k) {
            // Count the occurrences of each word.
            Map<String, Integer> count = new HashMap<>();
            for (String word : words) {
                count.put(word, count.getOrDefault(word, 0) + 1);
            }

            // Min-heap: the head is the least frequent word; among equal counts,
            // the alphabetically later word is the head.
            PriorityQueue<String> heap = new PriorityQueue<>(
                (w1, w2) -> count.get(w1).equals(count.get(w2)) ?
                    w2.compareTo(w1) : count.get(w1) - count.get(w2)
            );

            // Keep at most k words in the heap, evicting the worst-ranked one.
            for (String word : count.keySet()) {
                heap.offer(word);
                if (heap.size() > k) heap.poll();
            }

            // Words are polled worst-first, so reverse to get best-first order.
            List<String> result = new ArrayList<>();
            while (!heap.isEmpty()) {
                result.add(heap.poll());
            }
            Collections.reverse(result);

            return result;

            // // below also working (the loop body was missing and is filled in here as an assumption)
            // List<String> result = new ArrayList<>();
            // for (int i = k - 1; i >= 0; i--) {
            //     result.add(0, heap.poll());
            // }
            //
            // return result;
        }
    }

}

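// OJ: https://leetcode.com/problems/top-k-frequent-words/
// Time: O(NlogK)
// Space: O(N)
#include <algorithm>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

class Solution {
public:
    vector<string> topKFrequent(vector<string>& A, int k) {
        unordered_map<string, int> m;
        for (auto &s : A) m[s]++;
        // cmp(a, b) is true when a ranks ahead of b, so the top of the
        // priority queue is the worst-ranked word currently kept.
        auto cmp = [&](const string &a, const string &b) { return m[a] == m[b] ? a < b : m[a] > m[b]; };
        priority_queue<string, vector<string>, decltype(cmp)> pq(cmp);
        for (auto &[s, cnt] : m) {
            pq.push(s);
            if ((int)pq.size() > k) {
                pq.pop(); // evict the worst-ranked word
            }
        }
        // Words come out worst-first, so reverse at the end.
        vector<string> ans;
        while (pq.size()) {
            ans.push_back(pq.top());
            pq.pop();
        }
        reverse(begin(ans), end(ans));
        return ans;
    }
};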

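'''
>>> import collections
>>> words = ["the","day","is","sunny","the","the","the","sunny","is","is"]
>>> count = collections.Counter(words)
>>> count
Counter({'the': 4, 'is': 3, 'sunny': 2, 'day': 1})
>>>
>>> count.items()
dict_items([('the', 4), ('day', 1), ('is', 3), ('sunny', 2)])
>>>
'''

import collections
import functools


class Solution(object):
    def topKFrequent(self, words, k):
        """
        :type words: List[str]
        :type k: int
        :rtype: List[str]
        """
        count = collections.Counter(words)

        def compare(x, y):
            # x and y are (word, count) pairs.
            if x[1] == y[1]:
                # Same count: the alphabetically smaller word comes first.
                return (x[0] > y[0]) - (x[0] < y[0])
            # Different counts: reversed order, bigger counts at front.
            return y[1] - x[1]

        # Python 3 has no cmp argument, so wrap the comparison with cmp_to_key.
        # This sorts every distinct word, so it does not meet the O(n log k) follow-up.
        ordered = sorted(count.items(), key=functools.cmp_to_key(compare))
        return [word for word, _ in ordered[:k]]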