Merge Intervals

Author

Pham Nguyen Hung

Published

July 21, 2023

Merge Intervals

Definition

Merge intervals is applicable to problems that entail overlapping intervals. Each interval is represented by a start and an end time. This typically involves a scheduling problem. The best way to think about this is noticing the 3 ways 2 intervals can interact.

To reduce the number of comparisons to make, the list of intervals is usually required to be sorted. Afterwards, it is about iterating the list and compare the elements.

The pattern is usually tested by companies with system scheduling problems, such as TikTok/ByteDance.

Problems

1. Merge Intervals

Intuition

As mentioned, first, I need to sort the array. But how exactly should I sort it? The hint for me are the examples - the array there is sorted based on the start time. The problem then is about updating the end of each interval. If the interval with a smaller start time is 1 and the next one is 2, append 2 as is to the return array first if they are non-overlapping. If 1 envelops 2, do nothing. If 2 ends later than 1, update the entry in the return array.

The question comes to “Can I think of such a thing on my own from first principles in an interview?” I think that I can, but it may take some time. I don’t think that I will encounter this problem in an interview though - if anything, it will be Merge Intervals II or Meeting Room II at least.

Algorithm

  1. Sort the input array intervals.
  2. Initialize the return array sorted_intervals with the first element of the sorted input array.
  3. Iterate the rest of the input array, updating or appending to the return array each time.
  4. Return the result.

Complexity

  • Time complexity: \(O(n\log n)\): The usual time required for sorting.

  • Space complexity: \(O(n)\): Python .sort() uses Timsort, which uses between n//2 to no extra memory (on top of the heap, which is related to heapsort, and is another matter).

timsort can require a temp array containing as many as N//2 pointers, which means as many as 2*N extra bytes on 32-bit boxes. It can be expected to require a temp array this large when sorting random data; on data with significant structure, it may get away without using any extra heap memory.

Code

class Solution:
    def merge(self, intervals: List[List[int]]) -> List[List[int]]:
        intervals.sort(key=lambda x:x[0]) # Not really necessary
        sorted_intervals = [intervals[0]]
        for interval in intervals[1:]:
            if interval[0] > sorted_intervals[-1][1]:
                sorted_intervals.append(interval)
            else:
                sorted_intervals[-1][1] = max(interval[1], sorted_intervals[-1][1])
        return sorted_intervals

2. Insert Interval

Intuition

I got overwhelmed at first. In such a case, it is important to relax and deconstruct. Firstly, modifying the array in place is hard - it may be better to just create a new output array. Next, for any interval in the input array intervals, the new interval can come before it, can come after it, or overlaps with it. The first two cases are trivial, while the third case requires modification. I was just thinking about modifying the output array, while I can modify the new interval instead. #### Algorithm 1. Initialize the output array answer. 2. Iterate the input array intervals. For three cases: 1. If the new interval comes after the current one, append it to answer and return. 2. If the new interval comes before the current one, append the current one to answer. 3. Else, update new interval with the earlier start time and the later end time between the new interval and the current one. 3. Append the new interval into the output array and return.

Complexity

  • Time complexity: \(O(n)\): I need to iterate the input array once.

  • Space complexity: \(O(n)\): I need to store the output array. If it is not counted as part of the extra memory used, it is \(O(1)\).

Code

class Solution:
    def insert(self, intervals: List[List[int]], newInterval: List[int]) -> List[List[int]]:
        answer = []

        for i in range(len(intervals)):
            # Easy case: newInterval is non-overlapping the rest
            if newInterval[1] < intervals[i][0]:
                answer.append(newInterval)
                return answer + intervals[i:]
            elif newInterval[0] > intervals[i][1]:
                answer.append(intervals[i])
            else:
                newInterval = [min(newInterval[0], intervals[i][0]),
                               max(newInterval[1], intervals[i][1])]
        
        answer.append(newInterval)
        return answer

3. Meeting Rooms

Intuition

Welcome to the Meeting Room series! This is a big series much tested by TikTok in merge intervals. The requirement is completing at least Meeting Rooms II. For this series, I use a different platform called LintCode. It allows access to premium questions for free under certain circumstances.

For the problem, it is straightforward. The person cannot attend the meetings if the end time of one is after the start time of another. Hence, if I sort the input array according to the start time, I can iterate and compare two adjacent arrays. If I encounter an overlapping pair, I return False. If I reach the end, I return True.

Algorithm

  1. Iterate the input array intervals, one adjacent pair at a time.
  2. If I encounter an overlapping, return False.
  3. If I reach the end, return True.

Complexity

  • Time complexity: \(O(n)\): I need to iterate the input array once.

  • Space complexity: \(O(1)\): I do not use any extra memory.

Code

from typing import (
    List,
)
from lintcode import (
    Interval,
)

"""
Definition of Interval:
class Interval(object):
    def __init__(self, start, end):
        self.start = start
        self.end = end
"""

class Solution:
    """
    @param intervals: an array of meeting time intervals
    @return: if a person could attend all meetings
    """
    def can_attend_meetings(self, intervals: List[Interval]) -> bool:
        # Write your code here
        intervals.sort(key=lambda x:x.start)
        for i in range(len(intervals)-1):
            if intervals[i].end > intervals[i+1].start:
                return False
        return True

4. Meeting Rooms II

Intuition

For Meeting Rooms II, it appears harder. The answer is the maximum number of meetings going on at a time. The problem appears hard. Frankly, I don’t think I can do it should I encounter it for the first time in an interview. But well, that’s why I need to practice.

The problem can be solved in 2 ways: Min Heap, and “Scanning Line Algorithm” (扫描线). As far as I know, this algorithm is used in computer graphics to process images faster, line by line instead of pixel by pixel. In any case, for this particular problem, I will iterate the input array, storing the start and end time into an intermediate array. Afterwards, I scan the array, keep track of the meeting going on count and maximum number of meeting going on max_count. I can update count by incrementing every time I encounter a start time and decrementing every time I encounter an end time. Once done, I return max_count.

Another solution for those who know is using min heap to store the end time of the array.

Algorithm

  1. Iterate the input array intervals, one adjacent pair at a time.
  2. If I encounter an overlapping, return False.
  3. If I reach the end, return True.

Complexity

Min Heap - Time complexity: \(O(n\log n)\): I need to sort the input array, plus iterate through the array and push the element into a heap.

  • Space complexity: \(O(n)\): This is the worst size of the heap, for all the meetings go on at the same time.

Scan Line - Time complexity: \(O(n)\): I need to iterate the input array equivalent to three times.

  • Space complexity: \(O(n)\): This is the size of the intermediate array.

Code

Min Heap solution
from typing import (
    List,
)
from lintcode import (
    Interval,
)
import heapq
"""
Definition of Interval:
class Interval(object):
    def __init__(self, start, end):
        self.start = start
        self.end = end
"""

class Solution:
    """
    @param intervals: an array of meeting time intervals
    @return: the minimum number of conference rooms required
    """
    def min_meeting_rooms(self, intervals: List[Interval]) -> int:
        # Write your code here
        intervals.sort(key=lambda x:x.start)
        heap = []
        heapq.heappush(heap, intervals[0].end)
        for interval in intervals[1:]:
            if heap[0] <= interval.start:
                heapq.heappop(heap)
            heapq.heappush(heap, interval.end)
        return len(heap)
“Scanning Line Algorithm” (扫描线)
from typing import (
    List,
)
from lintcode import (
    Interval,
)
"""
Definition of Interval:
class Interval(object):
    def __init__(self, start, end):
        self.start = start
        self.end = end
"""

class Solution:
    """
    @param intervals: an array of meeting time intervals
    @return: the minimum number of conference rooms required
    """
    def min_meeting_rooms(self, intervals: List[Interval]) -> int:
        # Write your code here
        times = []
        for interval in intervals:
            times.extend([(interval.start, 1), (interval.end, -1)])
        times.sort(key=lambda x: (x[0], x[1]))

        count = max_count = 0
        for time in times:
            count += time[1]
            max_count = max(max_count, count)
        return max_count