705. Design HashSet

Description

Design a HashSet without using any built-in hash table libraries.

Implement MyHashSet class:

void add(key) Inserts the value key into the HashSet.
bool contains(key) Returns whether the value key exists in the HashSet or not.
void remove(key) Removes the value key in the HashSet. If key does not exist in the HashSet, do nothing.

Example 1:

Input
["MyHashSet", "add", "add", "contains", "contains", "add", "contains", "remove", "contains"]
[[], [1], [2], [1], [3], [2], [2], [2], [2]]
Output
[null, null, null, true, false, null, true, null, false]

Explanation
MyHashSet myHashSet = new MyHashSet();
myHashSet.add(1);      // set = [1]
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(1); // return True
myHashSet.contains(3); // return False, (not found)
myHashSet.add(2);      // set = [1, 2]
myHashSet.contains(2); // return True
myHashSet.remove(2);   // set = [1]
myHashSet.contains(2); // return False, (already removed)

Constraints:

0 <= key <= 10⁶
At most 10⁴ calls will be made to add, remove, and contains.

Solutions

There are multiple ways to implement a hash set, but one straightforward method is to use a list of lists (also known as bucketing or chaining) to handle collisions (when two different elements have the same hash).

List of lists, ie. bucketing or chaining

Since all values are in the range of [0, 1000000], create an array with length 1000001, which has boolean type.

If a value is added, set the element at the corresponding index in the array to true. If a value is removed, set the element at the corresponding index in the array to false.

For the contains function, simply return the element at the corresponding index in the array, where true means the element exists in the HashSet and false means the element does not exist in the HashSet.

Open Addressing with Linear Probing

In open addressing, all elements occupy the hash table array itself. When a collision occurs (two keys hash to the same index), the hash table searches for the next empty slot in the array by probing subsequent indices until an empty slot is found.

For simplicity, below code assumes keys are positive integers and uses deletion markers for removed elements.

Explanation

Initialization: Sets initial capacity and creates an array (self.data) to store the elements. It also defines a load factor to determine when to resize the hash table to maintain efficiency.
_hash Method: Computes the hash index of a given key.
_resize Method: Doubles the capacity of the hash table and rehashes all existing elements when the load factor is exceeded.
add Method: Inserts a new key if it’s not already present, resizing the table if necessary. It linearly probes for an empty slot if there’s a collision.
remove Method: Searches for the key and marks its slot as “DEL” if found. It uses linear probing to handle collisions.
contains Method: Checks if a key exists in the hash table, returning True if found.

Differences from Chaining

Space Efficiency: Open addressing might be more space-efficient since it doesn’t use additional data structures for chaining. However, it requires careful management of load factors and resizing to maintain performance.
Handling of Deletions: Open addressing uses special markers to indicate deleted elements (“DEL” in this case), which is necessary to ensure subsequent elements found by probing can still be located.
Resizing Logic: To maintain efficient access times, the hash table size is increased based on the load factor, requiring all elements to be rehashed to new positions.

This open addressing implementation is just one of many ways to design a hash set. The choice between chaining, open addressing, or other methods depends on the specific requirements, including the expected number of elements, access patterns, and performance criteria.

class MyHashSet {
    private boolean[] data = new boolean[1000001];

    public MyHashSet() {
    }

    public void add(int key) {
        data[key] = true;
    }

    public void remove(int key) {
        data[key] = false;
    }

    public boolean contains(int key) {
        return data[key];
    }
}

/**
 * Your MyHashSet object will be instantiated and called as such:
 * MyHashSet obj = new MyHashSet();
 * obj.add(key);
 * obj.remove(key);
 * boolean param_3 = obj.contains(key);
 */

class MyHashSet {
public:
    bool data[1000001];

    MyHashSet() {
        memset(data, false, sizeof data);
    }

    void add(int key) {
        data[key] = true;
    }

    void remove(int key) {
        data[key] = false;
    }

    bool contains(int key) {
        return data[key];
    }
};

/**
 * Your MyHashSet object will be instantiated and called as such:
 * MyHashSet* obj = new MyHashSet();
 * obj->add(key);
 * obj->remove(key);
 * bool param_3 = obj->contains(key);
 */

class MyHashSet:
    def __init__(self):
        self.data = [False] * 1000001

    def add(self, key: int) -> None:
        self.data[key] = True

    def remove(self, key: int) -> None:
        self.data[key] = False

    def contains(self, key: int) -> bool:
        return self.data[key]


# Your MyHashSet object will be instantiated and called as such:
# obj = MyHashSet()
# obj.add(key)
# obj.remove(key)
# param_3 = obj.contains(key)

###############

class MyHashSet: # hash to bucket

    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.size = 1000  # Choosing a size for the outer list
        self.buckets = [[] for _ in range(self.size)]  # List of lists

    def _hash(self, key: int) -> int:
        """
        Generate a hash for a given key.
        """
        return key % self.size
    
    def add(self, key: int) -> None:
        """
        Insert a value into the HashSet.
        """
        hash_key = self._hash(key)
        if key not in self.buckets[hash_key]:
            self.buckets[hash_key].append(key)

    def remove(self, key: int) -> None:
        """
        Remove a value in the HashSet. If the value does not exist, do nothing.
        """
        hash_key = self._hash(key)
        if key in self.buckets[hash_key]:
            self.buckets[hash_key].remove(key)

    def contains(self, key: int) -> bool:
        """
        Returns true if this set contains the specified element.
        """
        hash_key = self._hash(key)
        return key in self.buckets[hash_key]

# Your MyHashSet object will be instantiated and called as such:
# obj = MyHashSet()
# obj.add(key)
# obj.remove(key)
# param_3 = obj.contains(key)

############

# Collision handling, optimized
class MyHashSet:
    def __init__(self):
        self.capacity = 1000
        self.data = [None] * self.capacity
        self.load_factor = 0.7
        self.size = 0
    
    def _hash(self, key):
        return key % self.capacity
    
    def _resize(self):
        if self.size / self.capacity >= self.load_factor:
            self.capacity *= 2
            old_data = self.data
            self.data = [None] * self.capacity
            self.size = 0
            for item in old_data:
                if item is not None:
                    self.add(item)
    
    def add(self, key: int) -> None:
        if self.contains(key):
            return
        self._resize()
        idx = self._hash(key)
        while self.data[idx] is not None:
            idx = (idx + 1) % self.capacity # Collision handling
        self.data[idx] = key
        self.size += 1
    
    def remove(self, key: int) -> None:
        idx = self._hash(key)
        while self.data[idx] is not None:
            if self.data[idx] == key:
                self.data[idx] = "DEL"  # Mark as deleted
                self.size -= 1
                return
            idx = (idx + 1) % self.capacity
    
    def contains(self, key: int) -> bool:
        idx = self._hash(key)
        while self.data[idx] is not None:
            if self.data[idx] == key:
                return True
            idx = (idx + 1) % self.capacity
        return False

type MyHashSet struct {
	data []bool
}

func Constructor() MyHashSet {
	data := make([]bool, 1000010)
	return MyHashSet{data}
}

func (this *MyHashSet) Add(key int) {
	this.data[key] = true
}

func (this *MyHashSet) Remove(key int) {
	this.data[key] = false
}

func (this *MyHashSet) Contains(key int) bool {
	return this.data[key]
}

/**
 * Your MyHashSet object will be instantiated and called as such:
 * obj := Constructor();
 * obj.Add(key);
 * obj.Remove(key);
 * param_3 := obj.Contains(key);
 */

class MyHashSet {
    data: Array<boolean>;
    constructor() {
        this.data = new Array(10 ** 6 + 1).fill(false);
    }

    add(key: number): void {
        this.data[key] = true;
    }

    remove(key: number): void {
        this.data[key] = false;
    }

    contains(key: number): boolean {
        return this.data[key];
    }
}

/**
 * Your MyHashSet object will be instantiated and called as such:
 * var obj = new MyHashSet()
 * obj.add(key)
 * obj.remove(key)
 * var param_3 = obj.contains(key)
 */

705 - Design HashSet

705. Design HashSet

Description

Solutions

List of lists, ie. bucketing or chaining

Open Addressing with Linear Probing

Explanation

Differences from Chaining

All Problems

All Solutions

Welcome to Subscribe On Youtube

705. Design HashSet

Description

Solutions

List of lists, ie. bucketing or chaining

Open Addressing with Linear Probing

Explanation

Differences from Chaining

All Problems

All Solutions