Question

Formatted question description: https://leetcode.ca/all/44.html

Implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "*") → true
isMatch("aa", "a*") → true
isMatch("ab", "?*") → true
isMatch("aab", "c*a*b") → false

Algorithm

Pay attention to distinguish it from another Regular Expression Matching question. The functions of the asterisks in the two questions are different. Pay attention to contrast and distinguish.

The biggest difficulty of this question is that for the processing of asterisks, it can match any string, which is almost like a hang, that is, before the corresponding position of the asterisk, no matter if there is any string in your s, I have a big asterisk to match. But there is a problem that it cannot handle, that is, once there is a character in p that does not exist in s, then it must not be matched, because the asterisk can only add characters, not eliminate characters, and if there are characters, it will be a star.

Once the character string to be matched is determined, the matching situation after the asterisk position is beyond reach. Therefore, the position of the asterisk in the p string is very important. It is represented by jStar, and the position where the asterisk matches in the s string is represented by iStart. Here iStar and jStar are initialized to -1, which means that there is no star by default. Number. Then use two variables i and j to point to the traversed position in the current s string and p string, respectively.

Start matching, if i is less than the length of the s string, perform a while loop.

  • If the current two characters are equal, or the character in p is a question mark, then i and j are increased by 1.
  • If p[j] is an asterisk, to record the position of the asterisk, jStar is assigned to j, at this time j is incremented by 1, and iStar is assigned to i.
  • If the current p[j] is not an asterisk and cannot match p[i], the asterisk must be used at this time. If the asterisk has not appeared before, then return false directly, such as s = “aa” and p = “c*”, at this time s[0] and p[0] cannot match. Although p[1] is an asterisk, it cannot match still.
  • If the asterisk has appeared before, such as s = “aa” and p = “*c”, when it is found that s[1] and p[1] cannot match, but fortunately before p[0 ] When an asterisk appears, give s[1] to the asterisk of p[0] to match. As for how to know if there is an asterisk before, you can see the role of iStar at this time, because it is initialized to -1, and when it encounters an asterisk, it will be updated to i. Just check the value of iStar, to decide if you can use asterisks to continue your life.

Although all the characters in s were matched, the p string must be checked afterwards. At this time, only asterisks are left in the unmatched p string, and no other characters are allowed. The continuous asterisks are filtered out , If j is not equal to the length of p, return false.

Note

  1. Wildcard Matching

Implement wildcard pattern matching with support for ‘?’ and ‘’. ‘?’ Matches any single character. ‘’ Matches any sequence of characters (including the empty sequence).

  1. Regular Expression Matching

Implement regular expression matching with support for ‘.’ and ‘’. ‘.’ Matches any single character. ‘’ Matches zero or more of the preceding element.

The function prototypes for both topics are bool isMatch(const char *s, const char *p);

At first glance, these two questions thought they were the same, but they were not. The first question was actually wildcard matching, which means that the ‘’ sign can match any substring. The second question is actually regular expression matching, that is, the ‘’ sign means that the character before it can appear any number of times (including 0). This is actually the Klin closure described in the Long Book of Compilation Principles.

Code

Java

public class Wildcard_Matching {

    public static void main(String[] args) {
        Wildcard_Matching out = new Wildcard_Matching();
        Solution_dp sdp = out.new Solution_dp();

        System.out.println(sdp.isMatch("", "*"));
        System.out.println(sdp.isMatch(null, "*"));
    }

    public class Solution_dp {

        public boolean isMatch(String s, String p) {

            if (s == null || p == null) {
                return false;
            }

            if (p.length() == 0) {
                return s.length() == 0;
            }

            // @note: need to comment out below, or else fail test case
            //          s="", p="******"
//            if (s.length() == 0) {
//                return p == "*"; // * matching emtpy string
//            }

            int m = s.length();
            int n = p.length();

            // dp[i][j]: up to i-th char in p, and up to j-th char in s, are matching.
            // note: i,j are not index
            boolean[][] dp = new boolean[m + 1][n + 1];
            dp[0][0] = true; // facilitate loop

            for (int i = 1; i <= n; ++i) {
                // if p starting with "*", then all true for dp[0][i]
                // or else, all false for dp[0][i]
                if (p.charAt(i - 1) == '*') dp[0][i] = dp[0][i - 1];
            }
            for (int i = 1; i <= m; ++i) {
                for (int j = 1; j <= n; ++j) {
                    if (p.charAt(j - 1) == '*') {
                        dp[i][j] = dp[i - 1][j] || dp[i][j - 1];
                    } else {
                        dp[i][j] = (s.charAt(i - 1) == p.charAt(j - 1) || p.charAt(j - 1) == '?')
                            && dp[i - 1][j - 1];
                    }
                }
            }
            return dp[m][n];
        }
    }


    public class Solution_iteration {

        public boolean isMatch(String s, String p) {

            if (s == null || p == null) {
                return false;
            }

            if (p.length() == 0) {
                return s.length() == 0;
            }

            if (s.length() == 0) {
                return p == "*"; // * matching emtpy string
            }

            int i = 0;
            int j = 0;
            int iMark = -1;
            int jMark = -1;

            while (i < s.length()) {
                if (j < p.length() && (s.charAt(i) == p.charAt(j) || p.charAt(j) == '?')) {
                    i++;
                    j++;
                } else if (j < p.length() && p.charAt(j) == '*') {

                    // record i,j before trying different i positions
                    iMark = i;
                    jMark = j;

                    // skip * to match its next
                    j++;

                    //这一步是关键,匹配s中当前字符与p中‘*’后面的字符,如果匹配,则在第一个if中处理,如果不匹配,则继续比较s中的下一个字符。
                } else if (jMark != -1) { // if reaching here, meaning one attempt of start at position iMkark failed, try another attempt

                    // another attempt, move iMark next, also move j to marked matching position
                    iMark++;
                    j = jMark + 1;

                    i = iMark;
                } else {
                    return false;
                }
            }

            // now scanned s, but possibly p is not fully scanned. Only remaining of p is all * then return true
            while (j < p.length() && p.charAt(j) == '*') {
                j++;
            }

            return j == p.length();

        }
    }


    public class Solution_over_time {
        /*
        test case, over time:

        "aaaabaaaabbbbaabbbaabbaababbabbaaaababaaabbbbbbaabbbabababbaaabaabaaaaaabbaabbbbaababbababaabbbaababbbba"
        "*****b*aba***babaa*bbaba***a*aaba*b*aa**a*b**ba***a*a*"

        */
        public boolean isMatch(String s, String p) {

            if (s == null || p == null) {
                return false;
            }

            if (p.length() == 0) {
                return s.length() == 0;
            }

            if (s.length() == 0) {
                return p == "*"; // * matching emtpy string
            }

            // now length of p and s are at least 1
            if (p.charAt(0) == '?') {
                return isMatch(s.substring(1), p.substring(1));
            } else if (p.charAt(0) == '*') {
                char prev = s.charAt(0);
                for (int i = 0; i < s.length(); i++) {
                    // when i=0, it's case: "b" matching "*b"
                    if (prev == s.charAt(i) && isMatch(s.substring(i + 1), p.substring(1))) { // @note: here i+1 for substring
                        return true;
                    }
                }
            } else { // normal cases
                return s.charAt(0) == p.charAt(0) && isMatch(s.substring(1), p.substring(1));
            }

            return false;
        }
    }
}

All Problems

All Solutions