# Question

Formatted question description: https://leetcode.ca/all/392.html

Given two strings s and t, return true if s is a subsequence of t, or false otherwise.

A subsequence of a string is a new string that is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (i.e., "ace" is a subsequence of "abcde" while "aec" is not).

Example 1:

Input: s = "abc", t = "ahbgdc"
Output: true


Example 2:

Input: s = "axc", t = "ahbgdc"
Output: false


Constraints:

• 0 <= s.length <= 100
• 0 <= t.length <= 104
• s and t consist only of lowercase English letters.

Follow up: Suppose there are lots of incoming s, say s1, s2, ..., sk where k >= 109, and you want to check one by one to see if t has its subsequence. In this scenario, how would you change your code?

# Algorithm

We define two pointers $i$ and $j$ to point to the initial position of the string $s$ and $t$ respectively. Each time we compare the two characters pointed to by the two pointers, if they are the same, both pointers move right at the same time; if they are not the same, only $j$ moves right. When the pointer $i$ moves to the end of the string $s$, it means that $s$ is the subsequence of $t$.

The time complexity is $O(m + n)$, where $m$ and $n$ are the lengths of the strings $s$ and $t$ respectively. The space complexity is $O(1)$.

## Follow up - lots of incoming s

The follow-up question in the topic suggests optimizing the process for handling a large number of strings s, given that the string t remains constant. To enhance efficiency, we can preprocess t by establishing a direct mapping from each character in t to its positions within the string. This approach is particularly useful because, although subsequences do not need to be contiguous, the order of characters must be maintained. Here’s an improved explanation:

### Preprocessing t

Given that t may contain repeated characters, we store all occurrences of each character in arrays, maintaining their order. This is achieved by using a HashMap, where each key is a character from t, and the corresponding value is an array listing all positions of that character in t.

### Processing Strings s

When processing each character in a string s, we perform the following steps:

1. Character Mapping: For each character c in s, we refer to the HashMap to find the array of positions where c occurs in t.

2. Binary Search: Given that these position arrays are sorted, we utilize binary search to expedite the lookup process. This is crucial for efficiently determining whether c can be part of a subsequence that matches t.

3. Order Maintenance: To ensure the subsequence’s order is preserved, we maintain a variable pre that records the last matched position in t. For the current character c in s, it’s essential that it follows pre in t. We employ upper_bound() in the binary search to find the smallest position in t’s position array that is greater than pre. If no such position exists, the function returns false, indicating that s cannot form a subsequence of t. Otherwise, we update pre to this new position and proceed.

### Complexity Analysis

• Preprocessing Complexity: The preprocessing of t incurs a time complexity of O(m), where m is the length of t, to establish the mapping from characters to their positions.

• Processing Complexity: For each string s, the time complexity is O(n log k), where n is the length of s and k is the average number of occurrences of each character in t. The log k factor arises from the binary search operation within the position arrays.

• Space Complexity: The space complexity is O(m + u), where u is the total number of unique characters in t. This space is required to store the position arrays for each character. Note that the description previously mentioned O(m*k), which might have been misleading, as k would represent the number of strings s, which doesn’t directly influence the space needed for storing positions from t.

This optimized approach significantly improves efficiency by reducing the need for repeated computations, especially beneficial when t is fixed, and a large number of strings s need to be checked for being subsequences of t.

# Code

• import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Is_Subsequence {
class Solution {
public boolean isSubsequence(String s, String t) {
if (s == null || s.length() == 0) {
return true;
}

int i = 0;
for (int j = 0; j < t.length(); j++) {
if (s.charAt(i) == t.charAt(j)) {
i++;

if (i == s.length()) { // last char is matched
return true;
}
}
}

return false;
}
}

class Solution_followup {
public boolean isSubsequence(String s, String t) {
// step 1: save all the index for the t
Map<Character, List<Integer>> map = new HashMap<>();
for (int i = 0; i < t.length(); i++) {
char c = t.charAt(i);
if (!map.containsKey(c)) {
List<Integer> pos = new ArrayList<>();
map.put(c, pos);
} else {
List<Integer> pos = map.get(c);
// map.put(c, pos);
}
}

// step 2: for each char in s, find the first index
int prevIndex = -1;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
List<Integer> pos = map.get(c);
if (pos == null || pos.size() == 0) {
return false;
}

int currentIndex = getNextIndexGreaterThanTarget(pos, prevIndex);

if (currentIndex == -1) {
return false;
}

prevIndex = currentIndex;
}

return true;
}

// find next number greater than target
private int getNextIndexGreaterThanTarget(List<Integer> pos, int targetIndex) {
int lo = 0;
int hi = pos.size() - 1;

while (lo + 1 <= hi) {
int mid = lo + (hi - lo) / 2;
if (pos.get(mid) == targetIndex) {
lo = mid + 1;
} else if (pos.get(mid) > targetIndex) {
hi = mid;
} else if (pos.get(mid) < targetIndex) {
lo = mid + 1;
}
}

if (pos.get(lo) > targetIndex) {
return pos.get(lo);
}

if (pos.get(hi) > targetIndex) {
return pos.get(hi);
}

return -1;
}
}
}

//////

class Solution {
public boolean isSubsequence(String s, String t) {
int m = s.length(), n = t.length();
int i = 0, j = 0;
while (i < m && j < n) {
if (s.charAt(i) == t.charAt(j)) {
++i;
}
++j;
}
return i == m;
}
}

• // OJ: https://leetcode.com/problems/is-subsequence/
// Time: O(M + N)
// Space: O(1)
class Solution {
public:
bool isSubsequence(string s, string t) {
int i = 0, j = 0, M = s.size(), N = t.size();
for (; i < M && j < N; ++j) {
if (s[i] == t[j]) ++i;
}
return i == M;
}
};

• '''
>>> a=[1,2,3,1,2,3,1,2,3]
>>> bisect.bisect_left(a, 2)
1
>>> bisect.bisect_right(a, 2)
8
'''
class Solution:
def isSubsequence(self, s: str, t: str) -> bool:
i, j, m, n = 0, 0, len(s), len(t)
while i < m and j < n:
if s[i] == t[j]:
i += 1
j += 1
return i == m

class Solution:
def isSubsequence(self, s: str, t: str) -> bool:
# pre-processing step
index = defaultdict(list)
for i, c in enumerate(t):
index[c].append(i)

# iterate through each incoming string s
j = 0
for c in s:
if c not in index:
return False
pos_list = index[c]
pos = bisect_left(pos_list, j)
if pos == len(pos_list):
return False
j = pos_list[pos] + 1
return True


• func isSubsequence(s string, t string) bool {
i, j, m, n := 0, 0, len(s), len(t)
for i < m && j < n {
if s[i] == t[j] {
i++
}
j++
}
return i == m
}

• function isSubsequence(s: string, t: string): boolean {
let m = s.length,
n = t.length;
let i = 0;
for (let j = 0; j < n && i < m; ++j) {
if (s.charAt(i) == t.charAt(j)) {
++i;
}
}
return i == m;
}


• impl Solution {
pub fn is_subsequence(s: String, t: String) -> bool {
let (s, t) = (s.as_bytes(), t.as_bytes());
let n = t.len();
let mut i = 0;
for &c in s.iter() {
while i < n && t[i] != c {
i += 1;
}
if i == n {
return false;
}
i += 1;
}
true
}
}


• public class Solution {
public bool IsSubsequence(string s, string t) {
int m = s.Length, n = t.Length;
int i = 0, j = 0;
for (; i < m && j < n; ++j) {
if (s[i] == t[j]) {
++i;
}
}
return i == m;
}
}