Skip to content
This repository has been archived by the owner on Sep 19, 2024. It is now read-only.

Latest commit

 

History

History

heaps

heaps



  • heap is a data structure capable of giving you the smallest (or the largest) element in constant time, while adding or removing the smallest (or the largest) element in logarithmic time.

  • a heap is like a binary tree with these properties:

    • it must have all of its nodes in a specific order, and
    • its shape must be complete (all the levels of the tree must be completely filled except maybe for the last one and the last level must have the left-most nodes filled, always).
    • a max heap's root node must have all its children either greater than or equal to its children.
  • a min heap is the opposite. duplicate values are allowed.

  • heaps can be represented with linked lists, queues (arrays), or binary trees.

  • in the case of an array heap:

    • the parent node index is given by n / 2
    • the left children index is 2 * n
    • the right children index is 2 * n + 1
    • a node is a leaf when its index > n / 2
  • common applications of heap are: sort (heap sort), getting the top-k elements, and finding the kth element.

  • in python you can use heapq.heapify(array) with heapq.heappush(array, value) and heapq.heappop().


class Heap:

    def __init__(self):

        self.heap = []

    def heapify(self, n, i):

        largest = i
        left_children = 2 * i + 1
        right_children = 2 * i + 2

        if left_children < n and self.heap[i] < self.heap[left_children]:
            largest = left_children

        if right_children < n and self.heap[largest] < self.heap[right_children]:
            largest = right_children

        if largest != i:
            self.heap[i], self.heap[largest] = self.heap[largest], self.heap[i]
            self.heapify(n, largest)


    def insert(self, num):

        size = len(self.heap)

        if size == 0:
            self.heap.append(num)

        else:
            self.heap.append(num)
            for i in range((size // 2) - 1, -1, -1):
                self.heapify(size, i)


    def delete_node(self, num):

        size = len(self.heap)

        i = 0
        for i in range(size):
            if num == self.heap[i]:
                break

        self.heap[i], self.heap[size - 1] = self.heap[size - 1], self.heap[i]

        self.heap.remove(size - 1)

        for i in range((len(self.heap) // 2) - 1, -1, -1):
            self.heapify(len(self.heap), i)


heapify


  • it's cheaper to heapify an array of data (O(N)) than create an empty heap and inserting each element (O(N log(N))).

    • heapify means create a binary tree and then comparing each nodes with their children (and swapping when necessary).
    • parents node can simply exchange with their smallest child (so the max number of exchanges is N/2) and leaves are left out.
  • python's built-in heap differs from the standard implementation of a heap in two ways:

    • firstly, it uses zero-based indexing, so it stores the root node at index zero instead of the size of the heap.
    • secondly, the built-in module does not offer a direct way to create a max heap, instead, we must modify the values of each element when inserting in the heap, and when removing it from the heap.

import heapq

min_heap = [3,1,2]
heapq.heapify(min_heap)

max_heap = [-x for x in min_heap]
heapq.heapify(max_heap)

heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, -5)

min_elem = min_heap[0]
max_elem = -1 * max_heap[0

heapq.heappop(min_heap)
heapq.heappop(max_heap)

size_min_heap = len(min_heap)
size_max_heap = len(max_heap)


priority queues


  • a priority queue is an abstract data type with the following properties:

    1. every item has a priority (usually an integer).
    2. an item with a high priority is dequeued before an item with low priority.
    3. two items with an equal priority are dequeued based on their order in the queue.
  • priority queues can be implemented with a stack, queue, or linked list data structures.



min heaps


  • a min heap is a complete binary tree where each node is smaller than its children (the root is the min element).

  • insert:

    • insert the element at the bottom, by finding the most rightmost node and checking its children: if left is empty, insert there, otherwise, insert on right.
    • then compare this node to each parent, exchanging them until the tree's properties are correct.
  • extract_min:

    • first, remove/return the top and then replace the tree's top with its latest element (the bottom-most rightmost).
    • then bubble down, swapping it with one of its children until the min-heap is properly restored
    • there is no need for order between right and left, so this operation would only take O(log N) runtime.
  • the code below is an example of an ad-hoc heap class in python.


class MinHeap:

    def __init__(self, size):

        self.heapsize = size
        self.minheap = [0] * (size + 1)
        self.realsize = 0

    def add(self, element):

        if self.realsize + 1 > self.heapsize:
            print("Too many elements!")
            return False

        self.realsize += 1
        self.minheap[self.realsize] = element

        index = self.realsize
        parent = index // 2

        while self.minheap[index] < self.minheap[parent] and index > 1:

            self.minheap[parent], self.minheap[index] = self.minheap[index], self.minheap[parent]
            index = parent
            parent = index // 2
    
    def peek(self):

        return self.minheap[1]
    
    def pop(self):

        if self.realsize < 1:
            print("Heap is empty.")
            return False

        else:
            remove_element = self.minheap[1]
            self.minheap[1] = self.minheap[self.realsize]
            self.realsize -= 1
            index = 1

            while index <= self.realsize // 2:

                left_children = index * 2
                right_children = (index * 2) + 1

                if self.minheap[index] > self.minheap[left_children] or \
                   self.minheap[index] > self.minheap[right_children]:

                    if self.minheap[left_children] < self.minheap[right_children]:

                        self.minheap[left_children], self.minheap[index] = self.minheap[index], self.minheap[left_children]
                        index = left_children
                    
                    else:

                        self.minheap[right_children], self.minheap[index] = self.minheap[index], self.minheap[right_children]
                        index = right_children
                else:
                    break

            return remove_element
    
    def size(self):
        return self.realsize
    
    def __str__(self):
        return str(self.minheap[1 : self.realsize + 1])


max heaps


  • a max heap is a complete binary tree where each node is larger than its children (the root is the max element).

  • insert:

    • insert the element at the bottom, at the leftmost node.
    • then compare the node to each parent, exchanging them until the tree's properties are correct.
  • extract_max:

    • remove/return the top and then replace the tree's top with its bottom rightmost element.
    • swap up until the max element is on the top.
  • the code below is an example of a max heap class built in python:


class MaxHeap:
    def __init__(self, heapsize):

        self.heapsize = heapsize
        self.maxheap = [0] * (heapsize + 1)
        self.realsize = 0

    def add(self, element):

        self.realsize += 1
        if self.realsize > self.heapsize:
            print("Too many elements!")
            self.realsize -= 1
            return False

        self.maxheap[self.realsize] = element
        index = self.realsize
        parent = index // 2

        while self.maxheap[index] > self.maxheap[parent] and index > 1:
            self.maxheap[parent], self.maxheap[index] = self.maxheap[index], self.maxheap[parent]
            index = parent
            parent = index // 2
            
    def peek(self):

        return self.maxheap[1]
    
    def pop(self):

        if self.realsize < 1:
            print("Heap is empty.")
            return False
        else:
            remove_element = self.maxheap[1]
            self.maxheap[1] = self.maxheap[self.realsize]
            self.realsize -= 1
            index = 1

            while (index <= self.realsize // 2):

                left_children = index * 2
                right_children = (index * 2) + 1

                if (self.maxheap[index] < self.maxheap[left_children] or self.maxheap[index] < self.maxheap[right_children]):
                    if self.maxheap[left_children] > self.maxheap[right_children]:
                        self.maxheap[left_children], self.maxheap[index] = self.maxheap[index], self.maxheap[left_children]
                        index = left_children
                    else:
                        self.maxheap[right_children], self.maxheap[index] = self.maxheap[index], self.maxheap[right_children]
                        index = right_children
                else:
                    break
            return remove_element
    
    def size(self):
        return self.realsize
    
    def __str__(self):
        return str(self.maxheap[1 : self.realsize + 1])


heap sort


  • the core concept of the heap sort involves constructing a heap from our input and then repeatedly removing the min/max element to sort the array.

  • the key idea for in-place heap sort involves a balance of these two ideas:

    • building a heap from an unsorted array through a "bottom-up heapification" process, and
    • using the heap to sort the input array
  • heapsort traditionally uses a max-heap to sort the array, although a min-heap also works.

  • this is not a stable sort.


def heap_sort(self, array) -> None:

      def max_heapify(heap_size, index):
        
            left, right = 2 * index + 1, 2 * index + 2
            largest = index
        
            if left < heap_size and array[left] > array[largest]:
                largest = left
            elif if right < heap_size and array[right] > array[largest]:
                largest = right
            elif largest != index:
                array[index], array[largest] = array[largest], array[index]
                max_heapify(heap_size, largest)

      for i in range(len(lst) // 2 - 1, -1, -1):
            max_heapify(len(array), i)

      for i in range(len(array) - 1, 0, -1):
            array[i], array[0] = array[0], array[i]
            max_heapify(i, 0)

      return array


compare two tops


def compare_two_tops(array) -> int:

        for i in range(len(array)):
            array[i] *= -1

        heapq.heapify(array)

        while len(array) > 1:

            val1 = heapq.heappop(array)
            val2 = heapq.heappop(array)

            if val1 != val2:
                heapq.heappush(array, val1 - val2)

        if array:
            return -heapq.heappop(array)
            
        return 0


find non-overlapping intervals


  • given an array of intervals[i] = [start_i, end_i], return the minimum the non-overlapping intervals:

def non_overlapping_invervals(intervals):
        
        if not intervals:
            return 0

        result = []
        intervals.sort(key=lambda x: x[0])
        heapq.heappush(result, intervals[0][-1])

        for interval in intervals[1:]:

            if result[0] <= interval[0]:
                heapq.heappop(result)
            
            heapq.heappush(result, interval[1])
        
        return len(result)


top k frequent values


def top_k_frequent_values(list, k):
        
        if k == len(nums):
                return nums

        # hashmap element: frequency
        counter = Counter(nums)
        return heapq.nlargest(k, counter.keys(), key=counter.get)


kth largest element in a stream


class KthLargest:

    def __init__(self, k, nums):

        self.k = k
        self.heap = nums
        heapq.heapify(self.heap)

        while len(self.heap) > k:
            heapq.heappop(self.heap)
        
    def add(self, val: int) -> int:

        heapq.heappush(self.heap, val)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)
        
        return self.heap[0]


kth smallest element in a matrix


  • given an n x n matrix where each of the rows and columns is sorted in ascending order, return the kth smallest element in the matrix.

def kth_smallest(matrix, k) -> int:

        min_heap = []

        for row in range(min(k, len(matrix))):
            min_heap.append((matrix[row][0], row, 0))

        heapq.heapify(min_heap)

        while k:

            element, row, col = heapq.heappop(min_heap)
            if col < len(matrix) - 1:
                heapq.heappush(min_heap, (matrix[row][cow + 1], row, col + 1))
            k -= 1
        
        return element