Question

Formatted question description: https://leetcode.ca/all/1410.html

 1410. HTML Entity Parser
 HTML entity parser is the parser that takes HTML code as input and replace all the entities of the special characters by the characters itself.

 The special characters and their entities for HTML are:

     Quotation Mark: the entity is " and symbol character is ".
     Single Quote Mark: the entity is ' and symbol character is '.
     Ampersand: the entity is & and symbol character is &.
     Greater Than Sign: the entity is > and symbol character is >.
     Less Than Sign: the entity is &lt; and symbol character is <.
     Slash: the entity is &frasl; and symbol character is /.

 Given the input text string to the HTML parser, you have to implement the entity parser.

 Return the text after replacing the entities by the special characters.


 Example 1:

 Input: text = "&amp; is an HTML entity but &ambassador; is not."
 Output: "& is an HTML entity but &ambassador; is not."
 Explanation: The parser will replace the &amp; entity by &


 Example 2:

 Input: text = "and I quote: &quot;...&quot;"
 Output: "and I quote: \"...\""


 Example 3:

 Input: text = "Stay home! Practice on Leetcode :)"
 Output: "Stay home! Practice on Leetcode :)"


 Example 4:

 Input: text = "x &gt; y &amp;&amp; x &lt; y is always false"
 Output: "x > y && x < y is always false"


 Example 5:

 Input: text = "leetcode.com&frasl;problemset&frasl;all"
 Output: "leetcode.com/problemset/all"


 Constraints:
     1 <= text.length <= 10^5
     The string may contain any possible characters out of all the 256 ASCII characters.

Algorithm

substring check

The beginning of the first character of each special character is &, so you can decide target substring based on this.

In addition, why not select the next character after & as further judgment, is because for example, &apos; and &amp;, two additional characters need to be judged, the code is not very elegant, so directly call substring.

2 pointers

This question can also use the 2-pointer method, l and r are initialized to -1, when encountering &, use l to record the left boundary, when encountering ;, use h to record the right boundary, when the left boundary and the right boundary.

If both exist, take out the string to judge whether it is in the dictionary and replace it directly, otherwise, keep the original string, and set l and r to -1 to facilitate the recording of the next left and right boundary.

Code

Java

public class HTML_Entity_Parser {

    public String entityParser(String text) {

        if (text == null || text.length() == 0) {
            return text;
        }

        StringBuilder sb = new StringBuilder();
        StringBuilder cur = new StringBuilder();

        Map<String, String> dic = new HashMap<>();
        dic.put("&quot;", "\"");
        dic.put("&apos;", "'");
        dic.put("&amp;", "&");
        dic.put("&gt;", ">");
        dic.put("&lt;", "<");
        dic.put("&frasl;", "/");

        char[] txt = text.toCharArray();
        for (char aTxt : txt) {
            if (aTxt == '&') {
                sb.append(cur);
                cur.setLength(0);
                cur.append("&");
            } else if (aTxt == ';') {
                cur.append(";");
                String s = cur.toString();
                if (dic.containsKey(s)) sb.append(dic.get(s));
                else sb.append(cur);
                cur.setLength(0);
            } else {
                cur.append(aTxt);
            }
        }
        return sb.append(cur).toString();
    }
}

Java

class Solution {
    public String entityParser(String text) {
        StringBuffer parse = new StringBuffer();
        StringBuffer temp = new StringBuffer();
        boolean flag = false;
        int length = text.length();
        for (int i = 0; i < length; i++) {
            char c = text.charAt(i);
            if (flag) {
                temp.append(c);
                if (c == ';') {
                    flag = false;
                    String specialCharacter = getSpecialCharacter(temp.toString());
                    parse.append(specialCharacter);
                    temp.setLength(0);
                }
            } else {
                if (c == '&') {
                    flag = true;
                    temp.append(c);
                } else
                    parse.append(c);
            }
        }
        if (temp.length() > 0)
            parse.append(temp.toString());
        return parse.toString();
    }

    public String getSpecialCharacter(String str) {
        if (str.equals("&quot;"))
            return "\"";
        else if (str.equals("&apos;"))
            return "\'";
        else if (str.equals("&amp;"))
            return "&";
        else if (str.equals("&gt;"))
            return ">";
        else if (str.equals("&lt;"))
            return "<";
        else if (str.equals("&frasl;"))
            return "/";
        else
            return str;
    }
}

All Problems

All Solutions