Unit 4 - Notes

CSE408 7 min read

Unit 4: Dynamic Programming and Greedy Techniques

1. Dynamic Programming (DP) Overview

Dynamic Programming is an algorithmic paradigm that solves a given complex problem by breaking it into subproblems and storing the results of subproblems to avoid computing the same results again.

  • Key Properties:
    • Overlapping Subproblems: The problem can be broken down into subproblems which are reused several times.
    • Optimal Substructure: The optimal solution to a problem can be constructed from optimal solutions of its subproblems.

Computing a Binomial Coefficient

The binomial coefficient or represents the number of ways to choose elements from a set of elements.

  • Recurrence Relation:
  • Base Cases: and
  • DP Approach: Construct a 2D array (Pascal's triangle) bottom-up to avoid exponential recursive calls.
  • Time & Space Complexity:

Warshall's and Floyd's Algorithm

These algorithms operate on directed graphs represented by adjacency matrices.

  • Warshall's Algorithm (Transitive Closure): Computes if there is a path between any two vertices and .
    • Recurrence:
    • Time Complexity:
  • Floyd's Algorithm (All-Pairs Shortest Path): Finds the shortest paths between all pairs of vertices.
    • Recurrence:
    • Time Complexity:

Optimal Binary Search Trees (OBST)

Given a sorted array of keys and their search probabilities, an OBST minimizes the expected search cost.

  • DP Formulation: Let be the cost of an OBST containing keys .
  • Recurrence:
  • Time Complexity: using standard DP, optimizable to using Knuth's optimization.

Knapsack Problem and Memory Functions

The 0/1 Knapsack problem asks to maximize the value of items placed in a knapsack of capacity , where each item has a weight and value .

  • Recurrence:
    • if
    • if
  • Memory Functions: Combines top-down recursion with bottom-up DP. It uses a recursive function but caches (memoizes) calculated values in a table. It only solves subproblems strictly necessary for the final solution, often outperforming the strict bottom-up approach in practice.

Matrix-Chain Multiplication

Finds the most efficient way to multiply a given sequence of matrices. Matrix multiplication is associative, so parenthesization matters.

  • Problem: Minimize the number of scalar multiplications. Multiplying an matrix by a matrix costs .
  • Recurrence: Let be the minimum cost to multiply matrices .
    • if
  • Time Complexity:

Longest Common Subsequence (LCS)

Finds the longest subsequence present in both of two given sequences.

  • Recurrence: Let of length and of length .
    • If :
    • If :
  • Time Complexity:

2. Greedy Technique and Graph Algorithms

The Greedy technique makes a locally optimal choice at each stage with the hope of finding a global optimum. It does not look back or revise previous choices.

Minimum Spanning Trees (MST)

An MST of a connected, undirected, edge-weighted graph is a spanning tree whose sum of edge weights is minimized.

Prim's Algorithm

  • Approach: Builds the tree one vertex at a time. Starts with an arbitrary vertex and greedily adds the minimum weight edge that connects a vertex in the tree to a vertex outside the tree.
  • Data Structure: Priority Queue (Min-Heap).
  • Time Complexity: using a binary heap.

Kruskal's Algorithm

  • Approach: Builds the tree one edge at a time. Sorts all edges by weight, then iterates through them, adding the edge to the MST if it does not form a cycle.
  • Data Structure: Disjoint-Set (Union-Find) to detect cycles.
  • Time Complexity: or due to sorting.

Dijkstra's Algorithm (Single-Source Shortest Paths)

Finds the shortest paths from a single source vertex to all other vertices in a weighted graph with non-negative edge weights.

  • Approach: Maintains a set of vertices whose shortest distance is finalized. Greedily picks the unvisited vertex with the minimum distance, then relaxes all its outgoing edges.
  • Edge Relaxation: if (dist[u] + weight(u, v) < dist[v]) then dist[v] = dist[u] + weight(u, v)
  • Time Complexity: using an adjacency list and Min-Heap.

Huffman Code

A lossless data compression algorithm.

  • Approach: Assigns variable-length codes to input characters based on their frequencies. Most frequent characters get the shortest codes.
  • Algorithm:
    1. Create a leaf node for each character and build a min-heap based on frequency.
    2. Extract two nodes with the lowest frequencies.
    3. Create a new internal node with a frequency equal to the sum of the two nodes.
    4. Insert the new node into the heap.
    5. Repeat until only one node remains (the root of the Huffman tree).
  • Time Complexity: where is the number of unique characters.

3. Shortest Paths (Overview)

Single-Source Shortest Paths

  • Dijkstra's Algorithm: Used for graphs with non-negative weights (Greedy approach).
  • Bellman-Ford Algorithm: Used for graphs with negative weights (DP approach). Runs in and can detect negative weight cycles.

All-Pairs Shortest Paths

  • Floyd-Warshall Algorithm: (Covered in DP section). Computes shortest paths between all pairs in .
  • Johnson's Algorithm: Efficient for sparse graphs. Uses Bellman-Ford to reweight edges to be non-negative, then runs Dijkstra from each vertex. Time Complexity: .

4. Iterative Improvement

The Maximum-Flow Problem

Finds the maximum flow possible in a flow network from a source node () to a sink node (), subject to edge capacity constraints.

  • Ford-Fulkerson Method:
    1. Start with initial flow as 0.
    2. While there is an augmenting path from to in the residual graph, add this path-flow to the overall flow.
    3. Update the residual graph by decreasing forward capacities and increasing reverse capacities.
  • Max-Flow Min-Cut Theorem: The maximum flow passing from source to sink is equal to the minimum cut (capacity) that separates the source and the sink.
  • Edmonds-Karp Algorithm: An implementation of Ford-Fulkerson using BFS to find augmenting paths, ensuring a time complexity of .

5. Limitations of Algorithm Power

Lower-Bound Theory

Lower-bound theory attempts to establish the minimum amount of time or space any algorithm requires to solve a specific problem. It proves that no algorithm can do better than this bound.

  • Trivial Lower Bounds: Based on the size of the input/output. E.g., multiplying two matrices must take at least time because there are entries in the output matrix to compute.
  • Information-Theoretic Arguments: Often used in decision tree models. E.g., sorting elements requires comparisons because there are possible permutations, and each comparison splits the search space in half.
  • Adversary Arguments: An adversary provides the worst-possible input dynamically as the algorithm runs, ensuring the algorithm is forced to do a specific minimum amount of work (e.g., finding the maximum and minimum in an array requires at least comparisons).