# Question

Formatted question description: https://leetcode.ca/all/392.html

 392	Is Subsequence

Given a string s and a string t, check if s is subsequence of t.

You may assume that there is only lower case English letters in both s and t.
t is potentially a very long (length ~= 500,000) string, and
s is a short string (<=100).

A subsequence of a string is a new string which is formed from the original string
by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters.
(ie, "ace" is a subsequence of "abcde" while "aec" is not).

Example 1:

s = "abc", t = "ahbgdc"

Return true.

Example 2:

s = "axc", t = "ahbgdc"

Return false.

If there are lots of incoming S, say S1, S2, ... , Sk where k >= 1B,
and you want to check one by one to see if T has its subsequence.
In this scenario, how would you change your code?

@tag-dp


# Algorithm

The two pointers point to the strings s and t respectively, and then if the characters are equal, i and j are incremented by 1, otherwise only j increments by 1.

Finally, see if i is equal to the length of s, which means that s has been traversed, and the characters are all appeared in t.

Follow up in the topic said that if there are a large number of strings s, ask us how to optimize.

So since the string t is always the same, we can do some pre-processing on t. Although the subsequence does not need to be a contiguous substring, but the order between characters is required, then we can establish a direct mapping

• from: each character in the string t
• to: its position

Since there may be repeated characters in t, all positions where the same character appears are added to an array in order.

Therefore, HashMap is used to establish the mapping between each character and its position array.

Then iterate over each character in the string s:

• For each traversed character c, we search in the corresponding character array in HashMap.
• Since the position array is ordered, we use binary search to speed up the search.

It should be noted here that because the subsequence is ordered, a variable pre is needed to record the position of the current match in the t string. For the character c in the current s string, even if it exists in the t string, if It cannot be matched before the position pre.

So we can use uppper_bound() to find the first position greater than pre in binary, If it does not exist, return false directly, otherwise, update pre to the result of binary search and continue the loop.

# Code

Java

• import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Is_Subsequence {
class Solution {
public boolean isSubsequence(String s, String t) {
if (s == null || s.length() == 0) {
return true;
}

int i = 0;
for (int j = 0; j < t.length(); j++) {
if (s.charAt(i) == t.charAt(j)) {
i++;

if (i == s.length()) { // last char is matched
return true;
}
}
}

return false;
}
}

class Solution_followup {
public boolean isSubsequence(String s, String t) {
// step 1: save all the index for the t
Map<Character, List<Integer>> map = new HashMap<>();
for (int i = 0; i < t.length(); i++) {
char c = t.charAt(i);
if (!map.containsKey(c)) {
List<Integer> pos = new ArrayList<>();
map.put(c, pos);
} else {
List<Integer> pos = map.get(c);
//                    map.put(c, pos);
}
}

// step 2: for each char in s, find the first index
int prevIndex = -1;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
List<Integer> pos = map.get(c);
if (pos == null || pos.size() == 0) {
return false;
}

int currentIndex = getNextIndexGreaterThanTarget(pos, prevIndex);

if (currentIndex == -1) {
return false;
}

prevIndex = currentIndex;
}

return true;
}

// find next number greater than target
private int getNextIndexGreaterThanTarget(List<Integer> pos, int targetIndex) {
int lo = 0;
int hi = pos.size() - 1;

while (lo + 1 <= hi) {
int mid = lo + (hi - lo) / 2;
if (pos.get(mid) == targetIndex) {
lo = mid + 1;
} else if (pos.get(mid) > targetIndex) {
hi = mid;
} else if (pos.get(mid) < targetIndex) {
lo = mid + 1;
}
}

if (pos.get(lo) > targetIndex) {
return pos.get(lo);
}

if (pos.get(hi) > targetIndex) {
return pos.get(hi);
}

return -1;
}
}
}

• // OJ: https://leetcode.com/problems/is-subsequence/
// Time: O(M + N)
// Space: O(1)
class Solution {
public:
bool isSubsequence(string s, string t) {
int i = 0, j = 0, M = s.size(), N = t.size();
for (; i < M && j < N; ++j) {
if (s[i] == t[j]) ++i;
}
return i == M;
}
};

• class Solution(object):
def isSubsequence(self, s, t):
"""
:type s: str
:type t: str
:rtype: bool
"""
d = collections.defaultdict(list)
for i, c in enumerate(t):
d[c].append(i)
start = 0
for c in s:
idx = bisect.bisect_left(d[c], start)
if len(d[c]) == 0 or idx >= len(d[c]):
return False
start = d[c][idx] + 1
return True