Welcome to Subscribe On Youtube
Question
Formatted question description: https://leetcode.ca/all/392.html
Given two strings s
and t
, return true
if s
is a subsequence of t
, or false
otherwise.
A subsequence of a string is a new string that is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., "ace"
is a subsequence of "abcde"
while "aec"
is not).
Example 1:
Input: s = "abc", t = "ahbgdc" Output: true
Example 2:
Input: s = "axc", t = "ahbgdc" Output: false
Constraints:
0 <= s.length <= 100
0 <= t.length <= 104
s
andt
consist only of lowercase English letters.
Follow up: Suppose there are lots of incoming s
, say s1, s2, ..., sk
where k >= 109
, and you want to check one by one to see if t
has its subsequence. In this scenario, how would you change your code?
Algorithm
We define two pointers $i$ and $j$ to point to the initial position of the string $s$ and $t$ respectively. Each time we compare the two characters pointed to by the two pointers, if they are the same, both pointers move right at the same time; if they are not the same, only $j$ moves right. When the pointer $i$ moves to the end of the string $s$, it means that $s$ is the subsequence of $t$.
The time complexity is $O(m + n)$, where $m$ and $n$ are the lengths of the strings $s$ and $t$ respectively. The space complexity is $O(1)$.
Follow up - lots of incoming s
The follow-up question in the topic suggests optimizing the process for handling a large number of strings s
, given that the string t
remains constant. To enhance efficiency, we can preprocess t
by establishing a direct mapping from each character in t
to its positions within the string. This approach is particularly useful because, although subsequences do not need to be contiguous, the order of characters must be maintained. Here’s an improved explanation:
Preprocessing t
Given that t
may contain repeated characters, we store all occurrences of each character in arrays, maintaining their order. This is achieved by using a HashMap, where each key is a character from t
, and the corresponding value is an array listing all positions of that character in t
.
Processing Strings s
When processing each character in a string s
, we perform the following steps:
-
Character Mapping: For each character
c
ins
, we refer to the HashMap to find the array of positions wherec
occurs int
. -
Binary Search: Given that these position arrays are sorted, we utilize binary search to expedite the lookup process. This is crucial for efficiently determining whether
c
can be part of a subsequence that matchest
. -
Order Maintenance: To ensure the subsequence’s order is preserved, we maintain a variable
pre
that records the last matched position int
. For the current characterc
ins
, it’s essential that it followspre
int
. We employupper_bound()
in the binary search to find the smallest position int
’s position array that is greater thanpre
. If no such position exists, the function returnsfalse
, indicating thats
cannot form a subsequence oft
. Otherwise, we updatepre
to this new position and proceed.
Complexity Analysis
-
Preprocessing Complexity: The preprocessing of
t
incurs a time complexity ofO(m)
, wherem
is the length oft
, to establish the mapping from characters to their positions. -
Processing Complexity: For each string
s
, the time complexity isO(n log k)
, wheren
is the length ofs
andk
is the average number of occurrences of each character int
. Thelog k
factor arises from the binary search operation within the position arrays. -
Space Complexity: The space complexity is
O(m + u)
, whereu
is the total number of unique characters int
. This space is required to store the position arrays for each character. Note that the description previously mentionedO(m*k)
, which might have been misleading, ask
would represent the number of stringss
, which doesn’t directly influence the space needed for storing positions fromt
.
This optimized approach significantly improves efficiency by reducing the need for repeated computations, especially beneficial when t
is fixed, and a large number of strings s
need to be checked for being subsequences of t
.
Code
-
import java.util.ArrayList; import java.util.HashMap; import java.util.List; import java.util.Map; public class Is_Subsequence { class Solution { public boolean isSubsequence(String s, String t) { if (s == null || s.length() == 0) { return true; } int i = 0; for (int j = 0; j < t.length(); j++) { if (s.charAt(i) == t.charAt(j)) { i++; if (i == s.length()) { // last char is matched return true; } } } return false; } } class Solution_followup { public boolean isSubsequence(String s, String t) { // step 1: save all the index for the t Map<Character, List<Integer>> map = new HashMap<>(); for (int i = 0; i < t.length(); i++) { char c = t.charAt(i); if (!map.containsKey(c)) { List<Integer> pos = new ArrayList<>(); pos.add(i); map.put(c, pos); } else { List<Integer> pos = map.get(c); pos.add(i); // map.put(c, pos); } } // step 2: for each char in s, find the first index int prevIndex = -1; for (int i = 0; i < s.length(); i++) { char c = s.charAt(i); List<Integer> pos = map.get(c); if (pos == null || pos.size() == 0) { return false; } int currentIndex = getNextIndexGreaterThanTarget(pos, prevIndex); if (currentIndex == -1) { return false; } prevIndex = currentIndex; } return true; } // find next number greater than target // if not found, return -1 private int getNextIndexGreaterThanTarget(List<Integer> pos, int targetIndex) { int lo = 0; int hi = pos.size() - 1; while (lo + 1 <= hi) { int mid = lo + (hi - lo) / 2; if (pos.get(mid) == targetIndex) { lo = mid + 1; } else if (pos.get(mid) > targetIndex) { hi = mid; } else if (pos.get(mid) < targetIndex) { lo = mid + 1; } } if (pos.get(lo) > targetIndex) { return pos.get(lo); } if (pos.get(hi) > targetIndex) { return pos.get(hi); } return -1; } } } ////// class Solution { public boolean isSubsequence(String s, String t) { int m = s.length(), n = t.length(); int i = 0, j = 0; while (i < m && j < n) { if (s.charAt(i) == t.charAt(j)) { ++i; } ++j; } return i == m; } }
-
// OJ: https://leetcode.com/problems/is-subsequence/ // Time: O(M + N) // Space: O(1) class Solution { public: bool isSubsequence(string s, string t) { int i = 0, j = 0, M = s.size(), N = t.size(); for (; i < M && j < N; ++j) { if (s[i] == t[j]) ++i; } return i == M; } };
-
''' >>> a=[1,2,3,1,2,3,1,2,3] >>> bisect.bisect_left(a, 2) 1 >>> bisect.bisect_right(a, 2) 8 ''' class Solution: def isSubsequence(self, s: str, t: str) -> bool: i, j, m, n = 0, 0, len(s), len(t) while i < m and j < n: if s[i] == t[j]: i += 1 j += 1 return i == m # follow up class Solution: def isSubsequence(self, s: str, t: str) -> bool: # pre-processing step index = defaultdict(list) for i, c in enumerate(t): index[c].append(i) # iterate through each incoming string s j = 0 for c in s: if c not in index: return False pos_list = index[c] pos = bisect_left(pos_list, j) if pos == len(pos_list): return False j = pos_list[pos] + 1 return True
-
func isSubsequence(s string, t string) bool { i, j, m, n := 0, 0, len(s), len(t) for i < m && j < n { if s[i] == t[j] { i++ } j++ } return i == m }
-
function isSubsequence(s: string, t: string): boolean { let m = s.length, n = t.length; let i = 0; for (let j = 0; j < n && i < m; ++j) { if (s.charAt(i) == t.charAt(j)) { ++i; } } return i == m; }
-
impl Solution { pub fn is_subsequence(s: String, t: String) -> bool { let (s, t) = (s.as_bytes(), t.as_bytes()); let n = t.len(); let mut i = 0; for &c in s.iter() { while i < n && t[i] != c { i += 1; } if i == n { return false; } i += 1; } true } }
-
public class Solution { public bool IsSubsequence(string s, string t) { int m = s.Length, n = t.Length; int i = 0, j = 0; for (; i < m && j < n; ++j) { if (s[i] == t[j]) { ++i; } } return i == m; } }