Formatted question description: https://leetcode.ca/all/692.html

692. Top K Frequent Words

Level

Medium

Description

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Example 1:

Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2

Output: ["i", "love"]

Explanation: “i” and “love” are the two most frequent words. Note that “i” comes before “love” due to a lower alphabetical order.

Example 2:

Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4

Output: ["the", "is", "sunny", "day"]

Explanation: “the”, “is”, “sunny” and “day” are the four most frequent words, with the number of occurrences being 4, 3, 2 and 1, respectively.

Note:

You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.

Follow up:

Try to solve it in O(n log k) time and O(n) extra space.

Solution

This problem is similar to 347. Top K Frequent Elements.

First, iterate over the words array to calculate the frequency of each word.
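
As a minimal sketch of this counting step, here is a Python version using the standard library's `collections.Counter`. The variable names are illustrative and not taken from the solutions below.

```python
from collections import Counter

words = ["i", "love", "leetcode", "i", "love", "coding"]

# Counter is a dict subclass that maps each distinct word
# to the number of times it occurs in the input.
count = Counter(words)

# Print the counts in alphabetical order of the words.
for word in sorted(count):
    print(word, count[word])
```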

Store the counts in a map from each word to its frequency. Then use a priority queue (a min-heap) over the words, ordered so that its head is always the word that should be removed first: the one with the lowest frequency or, among words with equal frequency, the one that comes later alphabetically.
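
The ordering described above can be sketched in Python as a small wrapper class. `Entry` is a hypothetical helper introduced only for illustration; Python's `heapq` and `min` rely on `<`, so defining `__lt__` is enough to express "lowest frequency first, and on a tie the alphabetically later word first".

```python
class Entry:
    """Hypothetical wrapper: a word plus its frequency, ordered for a min-heap."""

    def __init__(self, freq, word):
        self.freq = freq
        self.word = word

    def __lt__(self, other):
        # A lower frequency sorts first, so it is removed first.
        if self.freq != other.freq:
            return self.freq < other.freq
        # On a frequency tie, the alphabetically later word sorts first.
        return self.word > other.word


entries = [Entry(2, "i"), Entry(2, "love"), Entry(1, "coding"), Entry(1, "leetcode")]

# min() picks the element that a min-heap would place at its head.
weakest = min(entries)
print(weakest.word)  # leetcode
```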

For every entry in the map, insert the word (or an object representing the entry) into the priority queue. Whenever the queue holds more than k elements, remove its head. Because the head is always the weakest candidate, the queue never holds more than k elements, and once every entry has been processed it holds exactly the k most frequent words.
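
The eviction rule is easier to see in isolation. The sketch below simplifies the problem to plain integers (an assumption made purely for illustration) and keeps the k largest values with a min-heap that is never allowed to grow past k elements. `k_largest` is a hypothetical helper name.

```python
import heapq


def k_largest(values, k):
    """Keep the k largest values using a min-heap capped at k elements."""
    heap = []
    for v in values:
        heapq.heappush(heap, v)
        if len(heap) > k:
            # The heap head is the smallest value seen so far; evict it.
            heapq.heappop(heap)
    # The heap now holds the k largest values in heap order.
    return sorted(heap, reverse=True)


print(k_largest([4, 1, 3, 2, 5], 2))  # [5, 4]
```

Each push and pop works on a heap of at most k + 1 elements, which is where the O(log k) cost per element comes from.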

Once all words have been processed, poll the elements from the priority queue into a list. They come out from least to most frequent, so reverse the list to obtain the required order and return it.
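
Putting the steps together, the following is a hedged Python sketch of the same heap-based approach. It is not one of the original solutions: `Entry` and `top_k_frequent` are hypothetical names, and the wrapper class exists only because `heapq` has no comparator argument.

```python
import heapq
from collections import Counter


class Entry:
    """Hypothetical heap item: lowest frequency first, later word first on ties."""

    def __init__(self, freq, word):
        self.freq = freq
        self.word = word

    def __lt__(self, other):
        if self.freq != other.freq:
            return self.freq < other.freq
        return self.word > other.word


def top_k_frequent(words, k):
    count = Counter(words)

    # Keep at most k entries; the heap head is always the weakest candidate.
    heap = []
    for word, freq in count.items():
        heapq.heappush(heap, Entry(freq, word))
        if len(heap) > k:
            heapq.heappop(heap)

    # Entries pop from weakest to strongest, so reverse at the end.
    result = []
    while heap:
        result.append(heapq.heappop(heap).word)
    result.reverse()
    return result


print(top_k_frequent(["i", "love", "leetcode", "i", "love", "coding"], 2))
```

For the first example this prints `['i', 'love']`: both words occur twice, and the tie is broken alphabetically.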

import java.util.*;

public class Top_K_Frequent_Words {

    public static void main(String[] args) {
        Top_K_Frequent_Words out = new Top_K_Frequent_Words();
        Solution s = out.new Solution();

        System.out.println(s.topKFrequent(new String[]{"the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"}, 4));
        System.out.println(s.topKFrequent(new String[]{"i", "love", "leetcode", "i", "love", "coding"}, 2));
    }

    // ref: https://leetcode.com/articles/top-k-frequent-words/
    // using a heap (PQ)
    // Time Complexity: O(Nlogk), where N is the length of words.
    //          We count the frequency of each word in O(N) time, then we add N words to the heap, each in O(logk) time.
    //          Finally, we pop from the heap up to k times. As k≤N, this is O(Nlogk) in total.
    // Space Complexity: O(N), the space used to store our count.
    class Solution {
        public List<String> topKFrequent(String[] words, int k) {
            Map<String, Integer> count = new HashMap<>();
            for (String word: words) {
                count.put(word, count.getOrDefault(word, 0) + 1);
            }
            PriorityQueue<String> heap = new PriorityQueue<String>(
                (w1, w2) -> count.get(w1).equals(count.get(w2)) ?
                    w2.compareTo(w1) : count.get(w1) - count.get(w2)
            );

            for (String word: count.keySet()) {
                heap.offer(word);
                if (heap.size() > k) heap.poll();
            }

            List<String> result = new ArrayList<>();
            while (!heap.isEmpty()) result.add(heap.poll());
            Collections.reverse(result);

            return result;

//            // The following alternative also works:
//            List<String> result = new ArrayList<>();
//            for (int i = k - 1; i >= 0; i--) {
//                result.add(0, heap.poll());
//            }
//
//            return result;
        }
    }

}

############

import java.util.*;

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        Map<String, Integer> cnt = new HashMap<>();
        for (String v : words) {
            cnt.put(v, cnt.getOrDefault(v, 0) + 1);
        }
        PriorityQueue<String> q = new PriorityQueue<>((a, b) -> {
            int d = cnt.get(a) - cnt.get(b);
            return d == 0 ? b.compareTo(a) : d;
        });
        for (String v : cnt.keySet()) {
            q.offer(v);
            if (q.size() > k) {
                q.poll();
            }
        }
        LinkedList<String> ans = new LinkedList<>();
        while (!q.isEmpty()) {
            ans.addFirst(q.poll());
        }
        return ans;
    }
}

// OJ: https://leetcode.com/problems/top-k-frequent-words/
// Time: O(NlogK)
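// Space: O(N)
#include <algorithm>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;
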
class Solution {
public:
    vector<string> topKFrequent(vector<string>& A, int k) {
        unordered_map<string, int> m;
        for (auto &s : A) m[s]++;
        auto cmp = [&](auto &a, auto &b) { return m[a] == m[b] ? a < b : m[a] > m[b]; };
        priority_queue<string, vector<string>, decltype(cmp)> pq(cmp);
        for (auto &[s, cnt] : m) {
            pq.push(s);
            if (pq.size() > k) {
                pq.pop();
            }
        }
        vector<string> ans;
        while (pq.size()) {
            ans.push_back(pq.top());
            pq.pop();
        }
        reverse(begin(ans), end(ans));
        return ans;
    }
};

'''
>>> sorted(student_objects, key=attrgetter('grade', 'age'))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]

ref: https://docs.python.org/3/howto/sorting.html#operator-module-functions
'''
from collections import Counter
from typing import List


class Solution:
    def topKFrequent(self, words: List[str], k: int) -> List[str]:
        cnt = Counter(words)
        # Composite sort key: descending count first, then ascending word.
        # Iterating a Counter (a dict subclass) yields its keys, so sorted(cnt)
        # returns words only, not (word, count) tuples. Equivalent:
        #   sorted(cnt.keys(), key=lambda x: (-cnt[x], x))[:k]
        return sorted(cnt, key=lambda x: (-cnt[x], x))[:k]

'''
>>> words = ["i","love","leetcode","i","love","coding"]
>>> k = 2
>>> cnt = Counter(words)
>>> cnt
Counter({'i': 2, 'love': 2, 'leetcode': 1, 'coding': 1})

>>> sorted(cnt, key=lambda x: (-cnt[x], x))
['i', 'love', 'coding', 'leetcode']

# default, only for keys()
>>> sorted(cnt)
['coding', 'i', 'leetcode', 'love']
>>> sorted(cnt.keys())
['coding', 'i', 'leetcode', 'love']
>>> sorted(cnt.values())
[1, 1, 2, 2]
>>> sorted(cnt.items())
[('coding', 1), ('i', 2), ('leetcode', 1), ('love', 2)]
'''
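
The `sorted` version above sorts every distinct word, which costs O(n log n). As a possible way to approach the follow-up's O(n log k) target in Python, the sketch below swaps in the standard library's `heapq.nsmallest` with the same composite key. This is an alternative technique, not part of the original solutions, and `top_k_frequent` is a hypothetical function name.

```python
import heapq
from collections import Counter
from typing import List


def top_k_frequent(words: List[str], k: int) -> List[str]:
    cnt = Counter(words)
    # nsmallest returns the k keys with the smallest (-count, word) pairs,
    # already ordered: highest count first, ties broken alphabetically.
    return heapq.nsmallest(k, cnt, key=lambda w: (-cnt[w], w))


print(top_k_frequent(["i", "love", "leetcode", "i", "love", "coding"], 2))
```

The result matches the `sorted(...)[:k]` version; only the amount of work for small k is expected to differ.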


############

'''
>>> words = ["the","day","is","sunny","the","the","the","sunny","is","is"]
>>> count = collections.Counter(words)
>>> count
Counter({'the': 4, 'is': 3, 'sunny': 2, 'day': 1})
>>>
>>> count.items()
dict_items([('the', 4), ('day', 1), ('is', 3), ('sunny', 2)])
>>>
'''

'''
cmp doesn't exist in Python 3. If you really want it, you could define it yourself:

def cmp(a, b):
    return (a > b) - (a < b)

'''
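import collections
import functools


def cmp(a, b):
    # Python 3 replacement for the removed built-in cmp()
    return (a > b) - (a < b)


class Solution(object):  # originally Python 2; adapted to run on Python 3
    def topKFrequent(self, words, k):
        """
        :type words: List[str]
        :type k: int
        :rtype: List[str]
        """
        count = collections.Counter(words)

        # https://docs.python.org/3/howto/sorting.html#comparison-functions
        def compare(x, y):
            if x[1] == y[1]:
                return cmp(x[0], y[0])  # same count: alphabetical order
            return -cmp(x[1], y[1])  # negated, so larger counts come first

        # sorted() no longer accepts cmp=, so wrap the comparator with cmp_to_key
        return [x[0] for x in sorted(count.items(), key=functools.cmp_to_key(compare))[:k]]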

func topKFrequent(words []string, k int) []string {
	cnt := map[string]int{}
	for _, v := range words {
		cnt[v]++
	}
	ans := []string{}
	for v := range cnt {
		ans = append(ans, v)
	}
	sort.Slice(ans, func(i, j int) bool {
		a, b := ans[i], ans[j]
		return cnt[a] > cnt[b] || cnt[a] == cnt[b] && a < b
	})
	return ans[:k]
}
