Welcome to Subscribe On Youtube
3451. Find Invalid IP Addresses
Description
Table: logs
+-++ \| log_id \| int \| \| ip \| varchar \| \| status_code \| int \| +-++-+ \| log_id \| ip \| status_code \| +--++-+
Output:
++--+ \| 256.1.2.3 \| 2 \| \| 192.168.001.1 \| 2 \| \| 192.168.1 \| 1 \| +-----+
Explanation:
- 256.1.2.3 is invalid because 256 > 255
- 192.168.001.1 is invalid because of leading zeros
- 192.168.1 is invalid because it has only 3 octets
The output table is ordered by invalid_count, ip in descending order respectively.
</div>
Solutions
Solution 1: Simulation
We can determine if an IP address is invalid based on the following conditions:
- The number of
.
in the IP address is not equal to ;3 - Any octet in the IP address starts with
0
; - Any octet in the IP address is greater than
.255
Then we group the invalid IP addresses and count the occurrences of each invalid IP address invalid_count
, and finally sort by invalid_count
and ip
in descending order.
-
import pandas as pd def find_invalid_ips(logs: pd.DataFrame) -> pd.DataFrame: def is_valid_ip(ip: str) -> bool: octets = ip.split(".") if len(octets) != 4: return False for octet in octets: if not octet.isdigit(): return False value = int(octet) if not 0 <= value <= 255 or octet != str(value): return False return True logs["is_valid"] = logs["ip"].apply(is_valid_ip) invalid_ips = logs[~logs["is_valid"]] invalid_count = invalid_ips["ip"].value_counts().reset_index() invalid_count.columns = ["ip", "invalid_count"] result = invalid_count.sort_values( by=["invalid_count", "ip"], ascending=[False, False] ) return result
-
SELECT ip, COUNT(*) AS invalid_count FROM logs WHERE LENGTH(ip) - LENGTH(REPLACE(ip, '.', '')) != 3 OR SUBSTRING_INDEX(ip, '.', 1) REGEXP '^0[0-9]' OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) REGEXP '^0[0-9]' OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) REGEXP '^0[0-9]' OR SUBSTRING_INDEX(ip, '.', -1) REGEXP '^0[0-9]' OR SUBSTRING_INDEX(ip, '.', 1) > 255 OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) > 255 OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) > 255 OR SUBSTRING_INDEX(ip, '.', -1) > 255 GROUP BY 1 ORDER BY 2 DESC, 1 DESC;