Welcome to Subscribe On Youtube

3451. Find Invalid IP Addresses

Description

Table: logs

+-++
\| log_id      \| int     \|
\| ip          \| varchar \|
\| status_code \| int     \|
+-++-+
\| log_id \| ip            \| status_code \| 
+--++-+

Output:

++--+
\| 256.1.2.3     \| 2            \|
\| 192.168.001.1 \| 2            \|
\| 192.168.1     \| 1            \|
+-----+

Explanation:

  • 256.1.2.3 is invalid because 256 > 255
  • 192.168.001.1 is invalid because of leading zeros
  • 192.168.1 is invalid because it has only 3 octets

The output table is ordered by invalid_count, ip in descending order respectively.

</div>

Solutions

Solution 1: Simulation

We can determine if an IP address is invalid based on the following conditions:

  1. The number of . in the IP address is not equal to 3;
  2. Any octet in the IP address starts with 0;
  3. Any octet in the IP address is greater than 255.

Then we group the invalid IP addresses and count the occurrences of each invalid IP address invalid_count, and finally sort by invalid_count and ip in descending order.

  • import pandas as pd
    
    
    def find_invalid_ips(logs: pd.DataFrame) -> pd.DataFrame:
        def is_valid_ip(ip: str) -> bool:
            octets = ip.split(".")
            if len(octets) != 4:
                return False
            for octet in octets:
                if not octet.isdigit():
                    return False
                value = int(octet)
                if not 0 <= value <= 255 or octet != str(value):
                    return False
            return True
    
        logs["is_valid"] = logs["ip"].apply(is_valid_ip)
        invalid_ips = logs[~logs["is_valid"]]
        invalid_count = invalid_ips["ip"].value_counts().reset_index()
        invalid_count.columns = ["ip", "invalid_count"]
        result = invalid_count.sort_values(
            by=["invalid_count", "ip"], ascending=[False, False]
        )
        return result
    
    
  • SELECT
        ip,
        COUNT(*) AS invalid_count
    FROM logs
    WHERE
        LENGTH(ip) - LENGTH(REPLACE(ip, '.', '')) != 3
    
        OR SUBSTRING_INDEX(ip, '.', 1) REGEXP '^0[0-9]'
        OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) REGEXP '^0[0-9]'
        OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) REGEXP '^0[0-9]'
        OR SUBSTRING_INDEX(ip, '.', -1) REGEXP '^0[0-9]'
    
        OR SUBSTRING_INDEX(ip, '.', 1) > 255
        OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) > 255
        OR SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) > 255
        OR SUBSTRING_INDEX(ip, '.', -1) > 255
    
    GROUP BY 1
    ORDER BY 2 DESC, 1 DESC;
    
    

All Problems

All Solutions