Classical Problems

Steven Halim gives solutions for several different "classical" problems which have Greedy or DP solutions, or at least solutions that could be considered Greedy or DP.

  1. Binary Search (special)
  2. Max 1D Range Sum (greedy/special)
  3. Max 2D Range Sum (greedy/special)
  4. Longest Increasing Subsequence (DP)
  5. Maximum Constrained Subset Sum (DP)
  6. Interval Covering (greedy)
  7. Traveling Salesman Problem (DP)
  8. Load Balancing (greedy)
  9. Coin Change Problem (greedy/DP; from last week)

Here are summaries and code for some of them. Refer to the book for the rest. Anyone who wants to be competitive in a programming contest should know all of these problems by heart.

Binary Search

This is more of a problem-solving strategy than a specific classical algorithm. I'm putting it here because it's often overlooked or forgotten in programming competitions, but it's a powerful tool that can solve a whole lot of problems without needing to be very clever.

Imagine that you are an explorer trying to cross a desert. you use a jeep with a "large enough" fuel tank: initially full. You encounter a series of events throughout your journey such as "drive" (which consumes fuel), "encounter gas station", "experience gas leak", "encounter mechanic" (which fixes gas leaks), and "reach goal". You need to determine the smallext possible fuel tank capacity for your jeep to be able to cross the desert. The answer must be precise to three decimal places.

It turns out that there is a greedy solution to this problem. However, under a time crunch, you may not have time to figure out the greedy solution. Guess what? We can binary search the answer!

  1. Start by picking a very large and very small number (that will definitely be out of range of the solution)
  2. Compute their average, and see if that average is sufficient to cross the desert. If it is sufficient, then continue your search in the bottom half. If it is insufficient, continue your search in the top half.
  3. Continue until you have converged on an answer to the desired precision.

In order for this technique to work, we need to be able to "guess and check" a possible solution, and given the result of the "guess and check", we need to be able to tell whether our final solution is larger or smaller.

Binary Search feels like cheating. But when it works, it's a huge time saver!

Max 1D Range Sum

Alice takes the bus to work every morning. However, she also likes to ride her bike. Different segments along the route to work have higher value than others: she likes riding through quiet, wooded neighborhoods, but not on busy thoroughfares. You are given an array of the values between each bus stop. If Alice is allowed to ride her bike between any two bus stops, what is the maximum value she could get?

We are interested in finding the contiguous sub-list with the maximal sum. For example, suppose that Alice's values were [4, -5, 4, -3, 4, 4, -4, 3, -5]. The highest such sub-list is [4, -3, 4, 4], summing to 9.

The brute force solution is to pick every pair i and j, compute the sum, and pick the highest. That is a cubic time algorithm, which is awfully slow.

If we perform the summation concurrently with iterating over i and j, we can be clever and reduce this to a quadratic solution. However, quadratic is still a bit slow.

The optimal linear-time solution keeps a running total while iterating over the array. If at any point that running total dips below zero, it starts counting the sum over again. It returns the highest sum that occurred at any point during this iteration.

def max_range_sum(nums):
    best = 0
    sum = 0
    for x in nums:
        sum += x
        if sum < 0: sum = 0
        if sum > best: best = sum
    return best

# Run an example
nums = [4, -5, 4, -3, 4, 4, -4, 3, -5]
print(max_range_sum(nums))  # => 9

This algorithm is known as Kadane's algorithm, could be considered either Greedy or DP, but personally it feels more like a Greedy algorithm to me.

We could also write this algorithm using functional programming constructs in one line as shown below.

from itertools import accumulate

def max_range_sum(nums):
    return max(accumulate(nums, lambda s,x: max(s+x, 0)))

# Run an example
nums = [4, -5, 4, -3, 4, 4, -4, 3, -5]
print(max_range_sum(nums))  # => 9

The functional construct "accumulate" in Python is an implementation of what is called various things in other languages (scanl in Haskell, reductions in Clojure, FoldList in Mathematica, etc). It works like reduce, but it returns a list with the "running total" of the reduction at every element in the list.

Max 2D Range Sum (Maximum Submatrix)

You are interested in cutting yourself a rectangular piece of a rectangular cake. Some pieces have higher value than others: for instance, you like sections that have nuts, but you don't like sections that have blue icing. Given a matrix of your values at every point in the cake, find the highest value piece that you could cut from the cake.

In this case, we are interested in the maximum sub-matrix from a larger matrix. For example, given the matrix

 0  -2  -7   0
 9   2  -6   2
-4   1  -4   1
-1   8   0  -2

The maximum sub-matrix, which sums to 15, happens to be the one from rows 2-4 and columns 1-2:

 9   2
-4   1
-1   8

Suppose we have a matrix with m rows and n columns. There are O(m2n2) different sub-matrices, and summing up each sub-matrix takes O(mn) time, making the brute-force solution take O(m3n3) time. For a square matrix, that's O(n6) time.

The best known solution to this problem is O(m2n), which is O(n3) for a square matrix.

Step 1: Construct a matrix having the cumulative sums in each column. For example, in the input matrix above, such a column prefix sum matrix would be

 0   0    0   0
 0  -2   -7   0
 9   0  -13   2
 5   1  -17   3
 4   9  -17   1

I have added the row of zeros at the top for simplicity in the next step. This matrix allows us to find the sum of any column between two arbitrary rows in O(1) time.

Step 2: Perform an O(m2) search over each pair of rows. On each step of the search, do two things: compute a list of the column sums between the two rows in O(n) time; and then perform the Max 1D Range Sum algorithm on the column sum list, also in O(n) time.

For example, given the column prefix sum matrix above, suppose we were interested in rows i through j. We take rows i and j+1 from the column sum matrix, and take their difference. If i=1 and j=2,

Row 1 of prefix sum matrix: {0, 0, 0, 0}
Row 3 of prefix sum matrix: {9, 0, -13, 2}

Difference: {9, 0, -13, 2}

Notice that the list we get out is the same thing as the sum of each of the columns in the original input matrix between rows 1 and 2 (inclusive). The important part is that we do it in O(n) time, when normally that would take O(mn) time. We now run the 1D Range Sum algorithm on that list, and we get 9.

However, if i=2 and j=4, our column sum list would be

Row 2 of prefix sum matrix: {0, -2, -7, 0}
Row 5 of prefix sum matrix: {4, 9, -17, 1}

Difference: {4, 11, -10, 1}

and 1D Range Sum gives us 15, which is the correct answer. We just have to check all O(m2) of the i, j pairs, giving us a total runtime of O(m2n).

The technique of using prefix sums to trade off time for memory is something that you'll see a lot in competitive programming.

In the book, Halim gives an O(m2n2) solution, which is similar to the algorithm above, but is a bit easier to code. Refer to the book if you are interested.

Longest Increasing Subsequence (LIS)

In the Max Range Sum problems, we were interested in contiguous segments. How about noncontiguous subsequences?

You've taken a lot of tests this semester. You've scored really well on some, and really badly on others. It's almost time for one-on-ones with your advisor, and you want to pick out test scores to show that you've been improving. You want to show your advisor as many tests as you can, but each test has to have a higher score than the one before it chronologically. What is the maximum number of tests you can pick out?

For example, if our input array were [10, 100, 90, 20, 30, 80, 80, 10], then we could pick out the four tests I bolded.

A brute-force approach is to test all 2n possible subsequences, but of course that will take exponential time.

There is a nifty O(n2) solution that uses DP. The idea is that we have a recurrence where, for each element in the list, we look through the rest of the list coming before that element, and pick the largest candidate solution in that part of the list whose highest value is less than the element we are currently looking at.

Here's some code. Again, the most important part conceptually is the recurrence on lines 10-14:

# Helper function for LIS.
# Returns a tuple with two elements:
#  Index 0: The length of the LIS.
#  Index 1: The elements in the LIS.
def LIS_helper(nums, i, cache):
    # Check the cache
    if i not in cache:
            # Recursive Step
            soln = max(
                LIS_helper(nums, j, cache)
                for j in range(0,i)
                if nums[j] < nums[i]
            cache[i] = (soln[0]+1, soln[1]+[nums[i]])

        except ValueError:
            # Base Case (no smaller elements in array)
            cache[i] = (1, [nums[i]])

    # Return Value
    return cache[i]

# Entry point to the LIS.
def LIS(nums):
    cache = {}
    return max(LIS_helper(nums, i, cache) for i in range(len(nums)))

# Example
nums = [-7, 10, 9, 2, 3, 8, 8, 1]
length, sequence = LIS(nums)
print("LIS Length: %d" % length)  # => LIS Length: 4
print("LIS Sequence: %s" % str(sequence))  # => LIS Sequence: [-7, 2, 3, 8]

It's quadratic because although there are only O(n) DP states, each state takes O(n) time to compute, leading to O(n2) total.

In a lot of cases, the quadratic solution shown above will be fine. It turns out however that there is an O(n log n) solution to this problem. For this algorithm, in Steven Halim's words, we maintain an array such that element i of the array represents the smallest ending value of all length-i candidate LIS solutions found so far. Note that the array is always sorted, and it is therefore amenable to binary search. If we were in CSE 241, you would be writing a proof right now to show this fact is true. :)

The code for this approach would be:

def LIS(nums):
    if len(nums) == 0: return None
    tracker = []
    for n in nums:
        # Perform the binary search
        i = binary_search(n, tracker)

        # Case 1: We've found the longest known subsequence
        # (this case also handles the initialization of the tracker array)
        if len(tracker)-i == 1:

        # Case 2: We can improve on an existing subsequence
        if tracker[i+1] > n:
            tracker[i+1] = n

    # Return the solution, which is the length of the list.
    # Using this method, we don't get to see the actual subsequence as easilly
    # as we did with the quadratic soluion.
    return len(tracker)

# Binary search implementation: return the largest element in haystack
# that is less than needle
def binary_search(needle, haystack):
    i = 0
    j = len(haystack)
    while j-i > 1:
        k = (i+j)//2
        if haystack[k] < needle:
            i = k
            j = k

    # If no element was found, return -1
    if len(haystack) == 0 or haystack[0] >= needle: return -1

    # Return solution
    return i

This is O(n log n) because there are n operations, and each operation takes O(log n) time for the binary search.

Try running both solutions in your Python shell. You can tell right away that the second solution is way faster than the first as soon as n > 1,000 or so.

In practice, you should always use the second solution. I only showed the first approach because it's a really cute application of top-down DP.

Maximum Constrained Subset Sum

The goal with this family of questions is to find the maximum subset of an array that satisfies some condition. For example,

You are a peddler heading out for the day. You have n items, each with a particular value v and weight w. Your knapsack can only carry at most S pounds. What is the highest value of items that you can put in your knapsack?

For example, suppose our knapsack can carry 12 pounds, and we had the following items:

  1. value=10, weight=10
  2. value=70, weight=4
  3. value=50, weight=6
  4. value=10, weight=12

The optimal solution here is to take items 2 and 3, which have total weight 10 and value 120.

The raw recurrence is as follows. Read the comments for an explanation.

# Brute force solution for the Knapsack Problem
def knapsack(values, weights, i, capacity):
    # Base Case 1: Our knapsack is full
    if capacity == 0: return 0

    # Base Case 2: We have considered all of the items
    if len(values) == i: return 0

    # Condition: Does this item fit in our knapsack?  If not, ignore it and
    # move on to the next item.
    if weights[i] > capacity:
        return knapsack(values, weights, i+1, capacity)

    # Recursive Step: Try either taking or ignoring this item, and return the
    # best result.
    return max(
        knapsack(values,weights,i+1,capacity-weights[i]) + values[i], # take it
        knapsack(values,weights,i+1,capacity) # ignore it

# Example
values = [100, 70, 50, 10]
weights = [10, 4, 6, 12]
capacity = 12
print(knapsack(values, weights, 0, capacity))  # => 120

The raw recurrence will have O(2n) runtime in the worst case. We can improve this to O(nS) (where S is the capacity of the knapsack) by adding a DP cache. The key we'll use is the pair (i,capacity).

# Top-down DP solution for the Knapsack Problem
def knapsack(values, weights, i, capacity, cache=None):
  # Base Cases
  if capacity == 0: return 0
  if len(values) == i: return 0

  # Memoize Step
  if cache is None: cache = {}
  key = (i,capacity)
  if key not in cache:

    if weights[i] > capacity:
      # Condition Case
      cache[key] = knapsack(values, weights, i+1, capacity, cache)

      # Recursive Step
      cache[key] = max(
        knapsack(values, weights, i+1, capacity - weights[i], cache) + values[i],
        knapsack(values, weights, i+1, capacity, cache)

  # Return Value
  return cache[key]

You will see this kind of Dynamic Programming question a lot in competitions.

Note: The problem illustrated here is known as the Knapsack Problem. If you look up the Subset Sum Problem on Wikipedia and elsewhere, the formulation is a bit different than the Knapsack Problem. Like Knapsack, that problem is another special case of the more general "constrained" subset sum problem. A similar top-down DP approach will solve Wikipedia's version of the problem.

Interval Covering

This is one of my favorite problems with a greedy solution, because the greedy solution actually works in all cases!

You are hosting an event, and a bunch of volunteers have each given you a start time and an end time of when they are available to help. What is the fewest number of volunteers required in order to staff the entire duration of the event? If there is a section of time for which no person volunteered, you should return -1.

Here are the steps.

  1. Sort the volunteers by increasing start time. You can break ties arbitrarily.
  2. Take the first volunteer from the list (the one with the earliest start time) and add them to your roster.
  3. Greedy Step: Look through all volunteers whose start time is before the first volunteer's end time. Pick the one that has the latest end time and add them to your roster.
  4. Repeat the process until you have reached the end of the list.

This solution takes O(n log n) time, because we need to sort the list. The iteration step takes O(n) total time, because the greedy condition is evaluated only once per volunteer.

In step 3, you will need to check for the edge case where there are no volunteers whose end time is later than the previous volunteer's end time. If that happens, then there is no solution to the problem and you can exit the function.

Here is some code in Python.

# Greedy Solution for the Interval Covering Problem
# Takes two equal-length lists as input, the start times and end times for the
# pool of volunteers.
def interval_covering(startingtimes, endingtimes):
    # Sort the input into a list of pairs
    intervals = sorted(
        list(zip(startingtimes, endingtimes)),
        key=lambda v: (v[0], -v[1])
    ) + [(float("inf"), float("inf"))]
    if not intervals: return 0  # edge case

    # Set up initial states
    cnt = 1
    endtime = intervals[0][1]
    best = float("-inf")
    del intervals[0]

    # Perform the iteration
    for v in intervals:
        if v[0] <= endtime:
            # Case 1: Still in the same time segment as
            # the previous iteration
            best = max(v[1], best)

        elif endtime < best:
            # Case 2: Add the volunteer and start looking
            # at the next time segment
            cnt += 1
            endtime = best
            best = v[1]

            # Case 3: There is a gap in service.  Return -1
            # as the problem asks.
            return -1

    # Return Result
    return cnt

For example, suppose we had volunteers were signed up for the following intervals:

When we sort them, we get:

I add the last element only to make the iteration a little bit cleaner at the end.

Our first segment is going to be 1 thru 8, and we initialize our "best" to -inf and our "last end time" to 8. The algorithm proceeds as follows:

  1. 4 thru 13: The start time for this volunteer is before the last end time (case 1). Update our "best" to 13.
  2. 4 thru 21: The start time for this volunteer is still before the last end time (case 1). Update our "best" to 21.
  3. 13 thru 21: The start time for this volunteer is after the last end time, so we need to add a volunteer (case 2). The best volunteer from the previous interval had an end time at 21, so we set "last end time" to 21. Now, we would normally also update the "best" counter here, but it is already set at 21, so there is no need.
  4. 18 thru 25: Case 1. Update "best" to 25.
  5. 20 thru 24: Case 1. Do not update "best", because the current value of 25 is better than our value of 24.
  6. inf thru inf: Case 2. Add the volunteer with end time 25.

We now return 3, the number of volunteers required to cover the full time interval.

Traveling Salesman Problem (TSP)

You've probably heard about this problem before.

You are a salesman, and you need to visit several different cities. Given the flight times between each city, what is the minimum flying time such that you visit each city exactly once?

The DP solution, which gets us an exact solution, actually isn't all that great in terms of time or space complexity: it's O(2n*n2). Yuck! Nonetheless, the authors of programming questions know that, and they might throw a TSP at you which does fit inside those bounds. The recurrence is:

  1. tsp(pos, 2n-1) => dist[pos][0]
  2. tsp(pos, mask) => min(dist[pos][nxt] + tsp(nxt, mask | (1 << nxt)))

We will talk more about bitmasks next week.

More often than not, in practice you use an approximation algorithm to solve TSP. There are a lot of good polynomial-time approximation algorithms out there. However, approximation algorithms are not usually something you find in programming competitions, and they are beyond the scope of this class. (Take CSE 541 if you want to learn more.)

Load Balancing

You may be familiar with this problem:

You have k boxes to pack your dorm room, and n items to pack, with each item having a certain weight w. How should you pack your boxes such that each box weighs as close to the average box weight as possible?

There is a greedy solution. First sort the input items by weight (most to least), and then incrementally add each item to the box which currently has the lowest weight.

One can prove that this algorithm gives the exact solution when n ≤ 2k. However, it breaks when n > 2k; in that case, it only gives an approximate solution. Here is a question on Stack Exchange with an example of when this algorithm breaks.