Welcome to Subscribe On Youtube
Formatted question description: https://leetcode.ca/all/940.html
940. Distinct Subsequences II (Hard)
Given a string S
, count the number of distinct, non-empty subsequences of S
.
Since the result may be large, return the answer modulo 10^9 + 7
.
Example 1:
Input: “abc”
Output: 7
Explanation: The 7 distinct subsequences are “a”, “b”, “c”, “ab”, “ac”, “bc”, and “abc”.
Example 2:
Input: “aba”
Output: 6
Explanation: The 6 distinct subsequences are “a”, “b”, “ab”, “ba”, “aa” and “aba”.
Example 3:
Input: “aaa”
Output: 3
Explanation: The 3 distinct subsequences are “a”, “aa” and “aaa”.
Note:
S
contains only lowercase letters.1 <= S.length <= 2000
Thoughts
There are 2^N - 1
subsequences because for each letter you can either choose or not choose it which results in 2^N
and -1
is to remove empty string.
One naive solution is to generate all the subsequences and deduplicate using unordered_set
. But this is too time and space inefficient.
Can we reduce the complexity?
Try divide and conquer?
Split S
into S1
and S2
.
Example:
aaab
=> aa
and ab
R(aaab)
= a, aa, aaa, ab, aab, aaab
R(aa)
= a, aa
R(ab)
= a, b, ab
R(aaab)
is not a simple addition or multiplication of R(aa)
and R(ab)
.
This idea doesn’t seem to work :(
Solution 1. DP
Try consume the letters one by one?
When we visit a new letter S[i]
, we have two options – pick it or don’t pick it. Denote the results as P(i)
and N(i)
respectively.
- If don’t pick, then
N(i) = N(i - 1) + P(i - 1)
. - If pick, we can try to append
S[i]
to all the subsequences previously generated, yieldingN(i - 1) + P(i - 1)
subsequences. But all the previously generated subsequences ending with the same letter asS[i]
will be duplicate with the subsequences ending with exactly thisS[i]
. SoP(i) = N(i - 1) + P(i - 1) - Sum{P(k) | where S[k] == S[i]}
.
Let’s use 2-d array dp
to store the results. dp[1][i]
means “pick S[i - 1]
” while dp[0][i]
means “don’t pick S[i - 1]
”, 1 <= i <= N
.
// For 1 <= i <= N
dp[0][i] = dp[0][i - 1] + dp[1][i - 1]
dp[1][i] = dp[0][i] - Sum{ dp[1][k] | where S[k - 1] == S[i - 1] }
dp[0][0] = 1
dp[1][0] = 0
It doesn’t matter dp[0][0]
or dp[0][1]
as long as only one equals 1.
// OJ: https://leetcode.com/problems/distinct-subsequences-ii/
// Time: O(N^2)
// Space: O(N)
class Solution {
public:
int distinctSubseqII(string S) {
int N = S.size(), mod = pow(10, 9) + 7;
vector<vector<long long>> dp(2, vector<long long>(N + 1, 0));
dp[0][0] = 1;
for (int i = 1; i <= N; ++i) {
dp[0][i] = (dp[0][i - 1] + dp[1][i - 1]) % mod;
dp[1][i] = dp[0][i];
for (int j = 1; j < i; ++j) {
if (S[i - 1] == S[j - 1]) dp[1][i] = (dp[1][i] - dp[1][j] + mod) % mod;
}
}
return (dp[0][N] + dp[1][N] - 1) % mod;
}
};
Solution 2.
In the formula of Solution 1, we notice that dp[?][i]
almost just rely on the dp[?][i - 1]
.
The dp
array is mainly for computing the the sum of duplicates. And the iterative summation
If we store the sum of previous dp[1][k]
where S[k - 1] == S[i - 1]
, we can reduce the 2N
extra space to constant space.
// OJ: https://leetcode.com/problems/distinct-subsequences-ii/
// Time: O(N)
// Space: O(1)
class Solution {
public:
int distinctSubseqII(string S) {
int N = S.size(), mod = pow(10, 9) + 7;
int picks[26] = {0};
int pick = 0, noPick = 1;
for (int i = 0; i < N; ++i) {
noPick = (noPick + pick) % mod;
pick = (noPick - picks[S[i] - 'a'] + mod) % mod;
picks[S[i] - 'a'] = (picks[S[i] - 'a'] + pick) % mod;
}
return (pick + noPick - 1) % mod;
}
};
Solution 3.
Same idea as Solution 2, but shorter and unnoticeably slower (it’s O(26N)
while solution 2 is real O(N)
).
// OJ: https://leetcode.com/problems/distinct-subsequences-ii/
// Time: O(N)
// Space: O(1)
// Ref: https://leetcode.com/problems/distinct-subsequences-ii/discuss/192017/C%2B%2BJavaPython-4-lines-O(N)-Time-O(1)-Space
class Solution {
public:
int distinctSubseqII(string S) {
long endsWith[26] = {}, mod = 1e9 + 7;
for (char c : S)
endsWith[c - 'a'] = accumulate(begin(endsWith), end(endsWith), 1L) % mod;
return accumulate(begin(endsWith), end(endsWith), 0L) % mod;
}
};
Solution 4
In solution 2 we notice that we always sum pick
and noPick
, meaning we only care about the sum.
// OJ: https://leetcode.com/problems/distinct-subsequences-ii/
// Time: O(N)
// Space: O(1)
// Ref: https://leetcode.com/problems/distinct-subsequences-ii/discuss/192095/C%2B%2B-O(n)-or-O-(n)-Geeks4Geeks-improved-to-O(n)-or-O(1)
class Solution {
public:
int distinctSubseqII(string S) {
int mod = pow(10, 9) + 7, picks[26] = {0}, sum = 1;
for (char c : S) {
int oldSum = sum;
sum = (sum * 2 % mod - picks[c - 'a'] + mod) % mod;
picks[c - 'a'] = oldSum;
}
return sum - 1;
}
};
-
class Solution { public int distinctSubseqII(String S) { final int MODULO = 1000000007; int length = S.length(); int[] dp = new int[length + 1]; dp[0] = 1; int[] last = new int[26]; Arrays.fill(last, -1); for (int i = 0; i < length; i++) { int index = S.charAt(i) - 'a'; dp[i + 1] = dp[i] * 2 % MODULO; if (last[index] >= 0) dp[i + 1] -= dp[last[index]]; dp[i + 1] = (dp[i + 1] + MODULO) % MODULO; last[index] = i; } int distinctSubsequences = (dp[length] - 1 + MODULO) % MODULO; return distinctSubsequences; } } ############ class Solution { private static final int MOD = (int) 1e9 + 7; public int distinctSubseqII(String s) { int[] dp = new int[26]; int ans = 0; for (int i = 0; i < s.length(); ++i) { int j = s.charAt(i) - 'a'; int add = (ans - dp[j] + 1) % MOD; ans = (ans + add) % MOD; dp[j] = (dp[j] + add) % MOD; } return (ans + MOD) % MOD; } }
-
// OJ: https://leetcode.com/problems/distinct-subsequences-ii/ // Time: O(N^2) // Space: O(N) class Solution { public: int distinctSubseqII(string S) { int N = S.size(), mod = pow(10, 9) + 7; vector<vector<long long>> dp(2, vector<long long>(N + 1, 0)); dp[0][0] = 1; for (int i = 1; i <= N; ++i) { dp[0][i] = (dp[0][i - 1] + dp[1][i - 1]) % mod; dp[1][i] = dp[0][i]; for (int j = 1; j < i; ++j) { if (S[i - 1] == S[j - 1]) dp[1][i] = (dp[1][i] - dp[1][j] + mod) % mod; } } return (dp[0][N] + dp[1][N] - 1) % mod; } };
-
class Solution: def distinctSubseqII(self, s: str) -> int: mod = 10**9 + 7 dp = [0] * 26 ans = 0 for c in s: i = ord(c) - ord('a') add = ans - dp[i] + 1 ans = (ans + add) % mod dp[i] += add return ans ############ class Solution(object): def distinctSubseqII(self, S): """ :type S: str :rtype: int """ nums = [0] * 26 for s in S: nums[ord(s) - ord("a")] = (sum(nums) + 1) % (10 ** 9 + 7) return sum(nums) % (10 ** 9 + 7)
-
func distinctSubseqII(s string) int { const mod int = 1e9 + 7 dp := make([]int, 26) ans := 0 for _, c := range s { c -= 'a' add := ans - dp[c] + 1 ans = (ans + add) % mod dp[c] = (dp[c] + add) % mod } return (ans + mod) % mod }
-
function distinctSubseqII(s: string): number { const mod = 1e9 + 7; const dp = new Array(26).fill(0); for (const c of s) { dp[c.charCodeAt(0) - 'a'.charCodeAt(0)] = dp.reduce((r, v) => (r + v) % mod, 0) + 1; } return dp.reduce((r, v) => (r + v) % mod, 0); }
-
impl Solution { pub fn distinct_subseq_ii(s: String) -> i32 { const MOD: i32 = 1e9 as i32 + 7; let mut dp = [0; 26]; for u in s.as_bytes() { let i = (u - &b'a') as usize; dp[i] = { let mut sum = 0; dp.iter().for_each(|&v| sum = (sum + v) % MOD); sum } + 1; } let mut res = 0; dp.iter().for_each(|&v| res = (res + v) % MOD); res } }