Formatted question description: https://leetcode.ca/all/2156.html

2156. Find Substring With Given Hash Value (Medium)

The hash of a 0-indexed string s of length k, given integers p and m, is computed using the following function:

  • hash(s, p, m) = (val(s[0]) * p0 + val(s[1]) * p1 + ... + val(s[k-1]) * pk-1) mod m.

Where val(s[i]) represents the index of s[i] in the alphabet from val('a') = 1 to val('z') = 26.

You are given a string s and the integers power, modulo, k, and hashValue. Return sub, the first substring of s of length k such that hash(sub, power, modulo) == hashValue.

The test cases will be generated such that an answer always exists.

A substring is a contiguous non-empty sequence of characters within a string.

 

Example 1:

Input: s = "leetcode", power = 7, modulo = 20, k = 2, hashValue = 0
Output: "ee"
Explanation: The hash of "ee" can be computed to be hash("ee", 7, 20) = (5 * 1 + 5 * 7) mod 20 = 40 mod 20 = 0. 
"ee" is the first substring of length 2 with hashValue 0. Hence, we return "ee".

Example 2:

Input: s = "fbxzaad", power = 31, modulo = 100, k = 3, hashValue = 32
Output: "fbx"
Explanation: The hash of "fbx" can be computed to be hash("fbx", 31, 100) = (6 * 1 + 2 * 31 + 24 * 312) mod 100 = 23132 mod 100 = 32. 
The hash of "bxz" can be computed to be hash("bxz", 31, 100) = (2 * 1 + 24 * 31 + 26 * 312) mod 100 = 25732 mod 100 = 32. 
"fbx" is the first substring of length 3 with hashValue 32. Hence, we return "fbx".
Note that "bxz" also has a hash of 32 but it appears later than "fbx".

 

Constraints:

  • 1 <= k <= s.length <= 2 * 104
  • 1 <= power, modulo <= 109
  • 0 <= hashValue < modulo
  • s consists of lowercase English letters only.
  • The test cases are generated such that an answer always exists.

Similar Questions:

Solution 1. Rolling Hash

In Rabin-Karp algorithm, the hash function is “reversed” hash(s, p, m) = (val(s[0]) * p^(k-1) + val(s[1]) * p^(k-2) + ... + val(s[k-1]) * p^0) mod m.

During the contest, I felt that if I generate the hash from left to right, I need to do division and it might cause trouble with modulo.

To be safe, I reversed the string and calculated the hashes using the Rabin-Karp algorithm.

  • // OJ: https://leetcode.com/problems/find-substring-with-given-hash-value/
    
    // Time: O(N)
    // Space: O(N) which can be reduced to `O(1)` if we really calculate from right to left
    class Solution {
    public:
        string subStrHash(string s, int p, int mod, int k, int target) {
            long h = 0, N = s.size(), pp = 1; // `pp` = p^k
            vector<long> hash(N);
            string r(rbegin(s), rend(s));
            for (int i = 0; i < N; ++i) {
                if (i < k) pp = pp * p % mod;
                h = (h * p + (r[i] - 'a' + 1)) % mod; // push r[i] into the window
                if (i - k >= 0) { // pop r[i-k] out of the window
                    h = (h - (r[i - k] - 'a' + 1) * pp % mod + mod) % mod;
                }
                if (i >= k - 1) hash[i] = h;
            }
            reverse(begin(hash), end(hash));
            for (int i = 0; i < N; ++i) {
                if (hash[i] == target) return s.substr(i, k); // hash[i] is the hash of `s[i .. (i+k-1)]`
            }
            return "";
        }
    };
    
  • // Todo
    
  • # 2156. Find Substring With Given Hash Value
    # https://leetcode.com/problems/find-substring-with-given-hash-value
    
    class Solution:
        def subStrHash(self, s: str, power: int, M: int, k: int, hashValue: int) -> str:
            def val(x):
                return ord(x) - ord('a') + 1
            
            curr = 0
            b = 1
            res = n = len(s)
            
            for i in range(n - 1, -1, -1):
                curr = (curr * power + val(s[i])) % M
                
                if i + k >= n:
                    b = b * power % M
                else:
                    curr = (curr - val(s[i + k]) * b) % M
                
                if curr == hashValue:
                    res = i
            
            return s[res: res + k]
    
    

FAQ

1. Why could division + modulo be problematic?

Because when we do division to a modulo-ed number, we might get the wrong answer.

Assume mod = 100, a hash h’s real value is 123 and we want to divide it by 3. So the expected result should be 41. But in fact we need to do 123 % 100 / 3 which incorrectly returns 7.

2. Why do we need division if we calculate from left to right?

Assume s = "abcd", k = 3.

If we calculate from left to right, hash("abc") = a * p^0 + b * p^1 + c * p^2. To get hash("bcd"), we need to subtract a then divide by p then add d * p^2. Here comes the division.

If we calculate from right to left, we can leverage the Rabin-Karp algorithm which only requires multiplication. We get hash("dcb") = d * p^2 + c * p^1 + b * p^0 first. To get hash("abc"), we just need to multiply by p then add a then subtract d * p^3.

Discuss

https://leetcode.com/problems/find-substring-with-given-hash-value/discuss/1730114/

All Problems

All Solutions