Welcome to Subscribe On Youtube
Question
Formatted question description: https://leetcode.ca/all/392.html
Given two strings s and t, return true if s is a subsequence of t, or false otherwise.
A subsequence of a string is a new string that is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., "ace" is a subsequence of "abcde" while "aec" is not).
Example 1:
Input: s = "abc", t = "ahbgdc" Output: true
Example 2:
Input: s = "axc", t = "ahbgdc" Output: false
Constraints:
0 <= s.length <= 1000 <= t.length <= 104sandtconsist only of lowercase English letters.
Follow up: Suppose there are lots of incoming s, say s1, s2, ..., sk where k >= 109, and you want to check one by one to see if t has its subsequence. In this scenario, how would you change your code?
Algorithm
We define two pointers $i$ and $j$ to point to the initial position of the string $s$ and $t$ respectively. Each time we compare the two characters pointed to by the two pointers, if they are the same, both pointers move right at the same time; if they are not the same, only $j$ moves right. When the pointer $i$ moves to the end of the string $s$, it means that $s$ is the subsequence of $t$.
The time complexity is $O(m + n)$, where $m$ and $n$ are the lengths of the strings $s$ and $t$ respectively. The space complexity is $O(1)$.
Follow up - lots of incoming s
The follow-up question in the topic suggests optimizing the process for handling a large number of strings s, given that the string t remains constant. To enhance efficiency, we can preprocess t by establishing a direct mapping from each character in t to its positions within the string. This approach is particularly useful because, although subsequences do not need to be contiguous, the order of characters must be maintained. Here’s an improved explanation:
Preprocessing t
Given that t may contain repeated characters, we store all occurrences of each character in arrays, maintaining their order. This is achieved by using a HashMap, where each key is a character from t, and the corresponding value is an array listing all positions of that character in t.
Processing Strings s
When processing each character in a string s, we perform the following steps:
-
Character Mapping: For each character
cins, we refer to the HashMap to find the array of positions wherecoccurs int. -
Binary Search: Given that these position arrays are sorted, we utilize binary search to expedite the lookup process. This is crucial for efficiently determining whether
ccan be part of a subsequence that matchest. -
Order Maintenance: To ensure the subsequence’s order is preserved, we maintain a variable
prethat records the last matched position int. For the current charactercins, it’s essential that it followspreint. We employupper_bound()in the binary search to find the smallest position int’s position array that is greater thanpre. If no such position exists, the function returnsfalse, indicating thatscannot form a subsequence oft. Otherwise, we updatepreto this new position and proceed.
Complexity Analysis
-
Preprocessing Complexity: The preprocessing of
tincurs a time complexity ofO(m), wheremis the length oft, to establish the mapping from characters to their positions. -
Processing Complexity: For each string
s, the time complexity isO(n log k), wherenis the length ofsandkis the average number of occurrences of each character int. Thelog kfactor arises from the binary search operation within the position arrays. -
Space Complexity: The space complexity is
O(m + u), whereuis the total number of unique characters int. This space is required to store the position arrays for each character. Note that the description previously mentionedO(m*k), which might have been misleading, askwould represent the number of stringss, which doesn’t directly influence the space needed for storing positions fromt.
This optimized approach significantly improves efficiency by reducing the need for repeated computations, especially beneficial when t is fixed, and a large number of strings s need to be checked for being subsequences of t.
Code
-
import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map; public class Is_Subsequence { class Solution { public boolean isSubsequence(String s, String t) { if (s == null || s.length() == 0) { return true; } int i = 0; for (int j = 0; j < t.length(); j++) { if (s.charAt(i) == t.charAt(j)) { i++; if (i == s.length()) { // last char is matched return true; } } } return false; } } class Solution_followup { public boolean isSubsequence(String s, String t) { // step 1: save all the index for the t Map<Character, List<Integer>> map = new HashMap<>(); for (int i = 0; i < t.length(); i++) { char c = t.charAt(i); if (!map.containsKey(c)) { List<Integer> pos = new ArrayList<>(); pos.add(i); map.put(c, pos); } else { List<Integer> pos = map.get(c); pos.add(i); // map.put(c, pos); } } // step 2: for each char in s, find the first index int prevIndex = -1; for (int i = 0; i < s.length(); i++) { char c = s.charAt(i); List<Integer> pos = map.get(c); if (pos == null || pos.size() == 0) { return false; } int currentIndex = getNextIndexGreaterThanTarget(pos, prevIndex); if (currentIndex == -1) { return false; } prevIndex = currentIndex; } return true; } // find next number greater than target // if not found, return -1 private int getNextIndexGreaterThanTarget(List<Integer> pos, int targetIndex) { int lo = 0; int hi = pos.size() - 1; while (lo + 1 <= hi) { int mid = lo + (hi - lo) / 2; if (pos.get(mid) == targetIndex) { lo = mid + 1; } else if (pos.get(mid) > targetIndex) { hi = mid; } else if (pos.get(mid) < targetIndex) { lo = mid + 1; } } if (pos.get(lo) > targetIndex) { return pos.get(lo); } if (pos.get(hi) > targetIndex) { return pos.get(hi); } return -1; } } } ////// class Solution { public boolean isSubsequence(String s, String t) { int m = s.length(), n = t.length(); int i = 0, j = 0; while (i < m && j < n) { if (s.charAt(i) == t.charAt(j)) { ++i; } ++j; } return i == m; } } -
// OJ: https://leetcode.com/problems/is-subsequence/ // Time: O(M + N) // Space: O(1) class Solution { public: bool isSubsequence(string s, string t) { int i = 0, j = 0, M = s.size(), N = t.size(); for (; i < M && j < N; ++j) { if (s[i] == t[j]) ++i; } return i == M; } }; -
''' >>> a=[1,2,3,1,2,3,1,2,3] >>> bisect.bisect_left(a, 2) 1 >>> bisect.bisect_right(a, 2) 8 ''' class Solution: def isSubsequence(self, s: str, t: str) -> bool: i, j, m, n = 0, 0, len(s), len(t) while i < m and j < n: if s[i] == t[j]: i += 1 j += 1 return i == m # follow up class Solution: def isSubsequence(self, s: str, t: str) -> bool: # pre-processing step index = defaultdict(list) for i, c in enumerate(t): index[c].append(i) # iterate through each incoming string s j = 0 for c in s: if c not in index: return False pos_list = index[c] pos = bisect_left(pos_list, j) if pos == len(pos_list): return False j = pos_list[pos] + 1 return True -
func isSubsequence(s string, t string) bool { i, j, m, n := 0, 0, len(s), len(t) for i < m && j < n { if s[i] == t[j] { i++ } j++ } return i == m } -
function isSubsequence(s: string, t: string): boolean { let m = s.length, n = t.length; let i = 0; for (let j = 0; j < n && i < m; ++j) { if (s.charAt(i) == t.charAt(j)) { ++i; } } return i == m; } -
impl Solution { pub fn is_subsequence(s: String, t: String) -> bool { let (s, t) = (s.as_bytes(), t.as_bytes()); let n = t.len(); let mut i = 0; for &c in s.iter() { while i < n && t[i] != c { i += 1; } if i == n { return false; } i += 1; } true } } -
public class Solution { public bool IsSubsequence(string s, string t) { int m = s.Length, n = t.Length; int i = 0, j = 0; for (; i < m && j < n; ++j) { if (s[i] == t[j]) { ++i; } } return i == m; } }