Given a url startUrl and an interface HtmlParser, implement
a Multi-threaded web crawler to crawl all links that are under the same
hostname as startUrl.
Return all urls obtained by your web crawler in any order.
Your crawler should:
startUrlHtmlParser.getUrls(url) to get all urls from a webpage of given url.
startUrl.

As shown in the example url above, the hostname is example.org. For simplicity
sake, you may assume all urls use http protocol without
any port specified. For example, the urls http://leetcode.com/problems
and http://leetcode.com/contest are under the same hostname, while urls
http://example.org/test and http://example.com/abc are not under
the same hostname.
The HtmlParser interface is defined as such:
interface HtmlParser {
// Return a list of all urls from a webpage of given url.
// This is a blocking call, that means it will do HTTP request and return when this request is finished.
public List<String> getUrls(String url);
}
Note that getUrls(String url) simulates performing a HTTP
request. You can treat it as a blocking function call which waits for a HTTP request to
finish. It is guaranteed that getUrls(String url) will return the urls
within 15ms. Single-threaded solutions will exceed the time limit so,
can your multi-threaded web crawler do better?
Below are two examples explaining the functionality of the problem, for custom testing
purposes you'll have three variables urls, edges and startUrl. Notice that you will only have access to startUrl in your code, while urls and edges are not directly accessible to you in
code.
Follow up:
Example 1:

Input: urls = [ "http://news.yahoo.com", "http://news.yahoo.com/news", "http://news.yahoo.com/news/topics/", "http://news.google.com", "http://news.yahoo.com/us" ] edges = [[2,0],[2,1],[3,2],[3,1],[0,4]] startUrl = "http://news.yahoo.com/news/topics/" Output: [ "http://news.yahoo.com", "http://news.yahoo.com/news", "http://news.yahoo.com/news/topics/", "http://news.yahoo.com/us" ]
Example 2:

Input: urls = [ "http://news.yahoo.com", "http://news.yahoo.com/news", "http://news.yahoo.com/news/topics/", "http://news.google.com" ] edges = [[0,2],[2,1],[3,2],[3,1],[3,0]] startUrl = "http://news.google.com" Output: ["http://news.google.com"] Explanation: The startUrl links to all other pages that do not share the same hostname.
Constraints:
1 <= urls.length <= 10001 <= urls[i].length <= 300startUrl is one of the urls.