Unit 6 - Notes

CSE408

Unit 6: Number-Theoretic Algorithms and Complexity Classes

1. Number Theory Problems

Number theory plays a fundamental role in modern cryptography, hashing algorithms, and randomized algorithms. This section covers the essential number-theoretic tools and algorithms used in computer science.

1.1 Modular Arithmetic

Modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" upon reaching a certain value, called the modulus.

Definition: If $a$ and $b$ are integers and $n$ is a positive integer, then $a$ is congruent to $b$ modulo $n$ , denoted as $a \equiv b \pmod n$ , if $n$ divides $(a - b)$ . Equivalently, $a$ and $b$ have the same remainder when divided by $n$ .
Properties:
- Addition: $(a + b) \pmod n = [(a \pmod n) + (b \pmod n)] \pmod n$
- Subtraction: $(a - b) \pmod n = [(a \pmod n) - (b \pmod n)] \pmod n$
- Multiplication: $(a \cdot b) \pmod n = [(a \pmod n) \cdot (b \pmod n)] \pmod n$
- Exponentiation: $(a^b) \pmod n$ is computed efficiently with the Modular Exponentiation algorithm (Repeated Squaring), which uses $O(\log b)$ modular multiplications instead of the $b - 1$ needed by naive repeated multiplication.
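Repeated squaring can be sketched as follows. This is a minimal illustration, not a reference implementation: the name `mod_pow` is made up for this example, and a non-negative exponent is assumed. Python's built-in three-argument `pow(base, exponent, modulus)` computes the same value.

```python
def mod_pow(base, exponent, modulus):
    """Compute (base ** exponent) % modulus by repeated squaring.

    Assumes exponent >= 0 and modulus >= 1 (assumptions of this sketch).
    """
    if modulus == 1:
        return 0
    result = 1
    base %= modulus
    while exponent > 0:
        # If the current lowest bit of the exponent is set, multiply it in.
        if exponent & 1:
            result = (result * base) % modulus
        # Square the base and move on to the next bit.
        base = (base * base) % modulus
        exponent >>= 1
    return result


print(mod_pow(3, 13, 7) == pow(3, 13, 7))  # agrees with the built-in
```

The loop runs once per bit of the exponent, which is where the $O(\log b)$ multiplication count comes from.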
Modular Multiplicative Inverse: An integer $x$ such that $(a \cdot x) \equiv 1 \pmod n$ . It exists if and only if $\text{gcd}(a, n) = 1$ (i.e., $a$ and $n$ are coprime). It can be found using the Extended Euclidean Algorithm.

1.2 Greatest Common Divisor (GCD)

The Greatest Common Divisor of two integers $a$ and $b$ , not both zero, is the largest positive integer that divides both $a$ and $b$ without a remainder.

Euclidean Algorithm

The Euclidean algorithm is an efficient method for computing the GCD. It is based on the principle that the GCD of two numbers does not change if the larger number is replaced by its difference with the smaller number. In practice, the repeated subtractions are replaced by a single remainder (modulo) operation, which gives the same result in far fewer steps.

Theorem: $\text{gcd}(a, b) = \text{gcd}(b, a \pmod b)$
Base Case: $\text{gcd}(a, 0) = a$

Algorithm (Recursive):

PYTHON

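def gcd(a, b):
    """Return gcd(a, b) for non-negative integers a and b."""
    if b == 0:
        return a  # base case: gcd(a, 0) = a
    return gcd(b, a % b)  # gcd(a, b) = gcd(b, a mod b)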

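Time Complexity: $O(\log(\min(a, b)))$ division steps. Lamé's Theorem bounds the number of steps by five times the number of decimal digits of the smaller input.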
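As a quick hand trace of the theorem and base case, consider the inputs 252 and 105 (chosen arbitrarily for illustration):

$\text{gcd}(252, 105) = \text{gcd}(105, 42)$ since $252 \pmod{105} = 42$
$\text{gcd}(105, 42) = \text{gcd}(42, 21)$ since $105 \pmod{42} = 21$
$\text{gcd}(42, 21) = \text{gcd}(21, 0)$ since $42 \pmod{21} = 0$
$\text{gcd}(21, 0) = 21$

Three division steps suffice, and indeed $252 = 21 \cdot 12$ and $105 = 21 \cdot 5$ .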

Extended Euclidean Algorithm

This algorithm finds the GCD of $a$ and $b$ , and also finds the coefficients $x$ and $y$ (Bézout coefficients) such that:
$a \cdot x + b \cdot y = \text{gcd}(a, b)$
This is heavily used in finding modular multiplicative inverses.
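One possible iterative sketch of the extended algorithm, together with the modular inverse built on top of it, is shown below. The function names `extended_gcd` and `mod_inverse` are illustrative choices, and non-negative inputs with $n \ge 1$ are assumed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b).

    Assumes a and b are non-negative integers (assumption of this sketch).
    """
    old_r, r = a, b   # remainders
    old_x, x = 1, 0   # coefficients of a
    old_y, y = 0, 1   # coefficients of b
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def mod_inverse(a, n):
    """Return x in [0, n) with (a * x) % n == 1, or raise if none exists."""
    g, x, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("inverse does not exist: gcd(a, n) != 1")
    return x % n


g, x, y = extended_gcd(240, 46)
print(g, 240 * x + 46 * y)  # both values equal gcd(240, 46)
print(mod_inverse(3, 11))   # 4, because 3 * 4 = 12 ≡ 1 (mod 11)
```

The inverse falls out of Bézout's identity: if $a \cdot x + n \cdot y = 1$ , then reducing both sides modulo $n$ gives $a \cdot x \equiv 1 \pmod n$ .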

1.3 Chinese Remainder Theorem (CRT)

The Chinese Remainder Theorem states that if one knows the remainders of the Euclidean division of an integer $x$ by several integers, then one can determine uniquely the remainder of the division of $x$ by the product of these integers, provided the divisors are pairwise coprime.

Statement: Given pairwise coprime positive integers $n_1, n_2, \dots, n_k$ and arbitrary integers $a_1, a_2, \dots, a_k$ , the system of simultaneous congruences:
$x \equiv a_1 \pmod{n_1}$
$x \equiv a_2 \pmod{n_2}$
$\dots$
$x \equiv a_k \pmod{n_k}$
has a unique solution modulo $N = n_1 \cdot n_2 \cdots n_k$ .
Construction/Algorithm:
1. Compute $N = \prod_{i=1}^{k} n_i$ .
2. For each $i$ , compute $N_i = \frac{N}{n_i}$ .
3. Compute $y_i$ , the modular inverse of $N_i$ modulo $n_i$ (i.e., $N_i \cdot y_i \equiv 1 \pmod{n_i}$ ).
4. The solution is $x = \sum_{i=1}^{k} a_i \cdot N_i \cdot y_i \pmod N$ .
Applications: Speeding up arithmetic modulo a large composite number (e.g., RSA decryption) by doing the work modulo its smaller, pairwise coprime factors and then recombining the results.
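The four construction steps translate almost line for line into code. The sketch below is illustrative only: the function name `crt` is made up here, the moduli are assumed to be pairwise coprime (this is not checked), and it relies on `math.prod` and the three-argument `pow` with exponent `-1`, both available from Python 3.8.

```python
from math import prod


def crt(remainders, moduli):
    """Solve x ≡ a_i (mod n_i) for pairwise coprime moduli n_i.

    Returns the unique solution in the range [0, N), where N is the
    product of the moduli. Coprimality is assumed, not verified.
    """
    N = prod(moduli)                 # step 1
    x = 0
    for a_i, n_i in zip(remainders, moduli):
        N_i = N // n_i               # step 2
        y_i = pow(N_i, -1, n_i)      # step 3: inverse of N_i modulo n_i
        x += a_i * N_i * y_i         # step 4: accumulate the sum
    return x % N


# x ≡ 2 (mod 3), x ≡ 3 (mod 5), x ≡ 2 (mod 7)
print(crt([2, 3, 2], [3, 5, 7]))  # 23
```

Checking by hand: $23 = 3 \cdot 7 + 2 = 5 \cdot 4 + 3 = 7 \cdot 3 + 2$ , so all three congruences hold.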

2. Optimization Problems vs. Decision Problems

Before introducing complexity classes, it is crucial to distinguish between problem types.

Optimization Problem: Asks for the "best" (minimum or maximum) solution among all feasible solutions.
- Example: Traveling Salesperson Problem (TSP) - Find the shortest route visiting all cities.
Decision Problem: Asks a "Yes/No" question.
- Example: TSP Decision Version - Is there a route visiting all cities with a length $\le K$ ?
Relationship: Every optimization problem can be recast as a decision problem by adding a bound on the objective value (e.g., "is there a solution of cost $\le K$ ?"). If an optimization problem is easy, the corresponding decision problem is easy, because one can compute the optimum and compare it with the bound. Conversely, if a decision problem is hard, the corresponding optimization problem is at least as hard. Complexity theory usually focuses on Decision Problems for mathematical standardization.
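The link between the two problem types can be made concrete with TSP. The sketch below recovers the optimal tour length using only yes/no answers from the decision version, by binary search on the bound $K$ . Everything here is a hypothetical illustration: the function names are invented, distances are assumed to be non-negative integers, and the decision procedure is a brute-force search that is only usable on tiny inputs.

```python
from itertools import permutations


def tour_length(dist, order):
    """Length of the closed tour that visits cities in the given order."""
    n = len(order)
    return sum(dist[order[i]][order[(i + 1) % n]] for i in range(n))


def tsp_decision(dist, K):
    """Decision version: is there a tour of length <= K? (brute force)"""
    n = len(dist)
    # Fix city 0 as the start so rotations of one tour are not repeated.
    return any(tour_length(dist, (0,) + p) <= K
               for p in permutations(range(1, n)))


def tsp_optimum_via_decision(dist):
    """Find the minimum tour length using only calls to tsp_decision."""
    lo = 0
    # A tour leaves each city exactly once, so this is a valid upper bound.
    hi = sum(max(row) for row in dist)
    while lo < hi:
        mid = (lo + hi) // 2
        if tsp_decision(dist, mid):
            hi = mid        # a tour of length <= mid exists
        else:
            lo = mid + 1    # every tour is longer than mid
    return lo


dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
print(tsp_optimum_via_decision(dist))  # 7, the tour 0 -> 1 -> 2 -> 3 -> 0
```

The binary search makes only $O(\log(\text{hi}))$ calls to the decision procedure, so a fast decision algorithm would yield a fast way to compute the optimal value.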

3. Basic Concepts of Complexity Classes

Computational Complexity Theory classifies problems based on the resources (time and space) required to solve them using an algorithm.

3.1 Class P (Polynomial Time)

Definition: The class P consists of all decision problems that can be solved by a Deterministic Turing Machine (a standard computer) in polynomial time.
Meaning: An algorithm exists with a worst-case time complexity of $O(n^k)$ , where $n$ is the input size and $k$ is a constant.
Characteristics: These problems are considered mathematically "tractable" or "easy" to solve.
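Examples: Sorting, finding the shortest path in a graph with non-negative edge weights (Dijkstra's algorithm), determining if a number is prime (AKS algorithm), computing GCD. Strictly speaking, P contains the decision versions of such problems.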

3.2 Class NP (Nondeterministic Polynomial Time)

Definition: The class NP consists of all decision problems for which a given proposed solution ("certificate") can be verified by a Deterministic Turing Machine in polynomial time.
Alternative Definition: Problems that can be solved by a Non-deterministic Turing Machine in polynomial time. (A non-deterministic machine can be thought of as "guessing" a certificate and then verifying it in polynomial time.)
Relationship to P: $P \subseteq NP$ . Any problem that can be solved in polynomial time can certainly have its solution verified in polynomial time.
Examples:
- Subset Sum: Given a set of integers, is there a non-empty subset whose sum is zero? (Hard to find, but given a subset, easy to verify by adding).
- Graph Coloring: Can a graph be colored with $k$ colors such that no adjacent vertices share a color?
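To illustrate what "verifiable in polynomial time" means, here is a minimal sketch of a verifier for the Subset Sum example. The function name and the choice of certificate format (a list of distinct indices into the input) are assumptions made for this illustration.

```python
def verify_subset_sum(numbers, certificate):
    """Check a proposed solution to zero-sum Subset Sum.

    numbers:     the input list of integers
    certificate: indices of the claimed non-empty zero-sum subset
    Runs in time linear in the input size.
    """
    if len(certificate) == 0:
        return False  # the subset must be non-empty
    if len(set(certificate)) != len(certificate):
        return False  # each element may be used at most once
    if any(not (0 <= i < len(numbers)) for i in certificate):
        return False  # indices must refer to actual elements
    return sum(numbers[i] for i in certificate) == 0


numbers = [-7, -3, -2, 5, 8]
print(verify_subset_sum(numbers, [1, 2, 3]))  # True: -3 + -2 + 5 = 0
print(verify_subset_sum(numbers, [0, 4]))     # False: -7 + 8 = 1
```

Checking one certificate is cheap, whereas the obvious way to find one tries up to $2^n - 1$ non-empty subsets.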

3.3 Class NP-Hard

Definition: A problem $X$ is NP-hard if every problem $Y$ in NP can be reduced to $X$ in polynomial time ( $Y \le_p X$ ).
Meaning: NP-hard problems are at least as hard as the hardest problems in NP. If you could solve an NP-hard problem in polynomial time, you could solve all NP problems in polynomial time.
Characteristics:
- They do not have to be decision problems (they can be optimization problems).
- They do not have to be in NP (their solutions might not be verifiable in polynomial time).
Examples: The Halting Problem (undecidable, not in NP), Optimization version of TSP.

3.4 Class NP-Complete

Definition: A problem is NP-complete if it satisfies two conditions:
1. It is in NP (solutions can be verified in polynomial time).
2. It is NP-hard (every problem in NP can be reduced to it in polynomial time).
Significance: NP-complete problems are the "hardest" problems in NP. They form the core of the P vs. NP debate.
Polynomial Reduction ( $A \le_p B$ ): A process of transforming any instance of problem $A$ into an instance of problem $B$ in polynomial time, such that the transformed instance of $B$ has the answer "yes" if and only if the original instance of $A$ has the answer "yes".
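A small concrete reduction may help. The sketch below maps Independent Set to Vertex Cover; Independent Set (is there a set of at least $k$ vertices with no edge between any two of them?) is not otherwise covered in these notes and is brought in purely for illustration. It relies on the standard fact that a vertex set is independent exactly when its complement touches every edge. All function names are hypothetical.

```python
def independent_set_to_vertex_cover(num_vertices, edges, k):
    """Map an Independent Set instance (G, k) to a Vertex Cover instance.

    G has an independent set of size >= k if and only if
    G has a vertex cover of size <= num_vertices - k.
    The transformation itself takes constant extra work.
    """
    return num_vertices, edges, num_vertices - k


def is_independent_set(edges, vertices):
    """True if no edge has both endpoints inside the given vertex set."""
    chosen = set(vertices)
    return not any(u in chosen and v in chosen for u, v in edges)


def is_vertex_cover(edges, vertices):
    """True if every edge has at least one endpoint in the given set."""
    chosen = set(vertices)
    return all(u in chosen or v in chosen for u, v in edges)


# Path graph 0 - 1 - 2 - 3
n, edges = 4, [(0, 1), (1, 2), (2, 3)]
independent = {0, 2}
cover = set(range(n)) - independent  # the complement, {1, 3}

print(is_independent_set(edges, independent))          # True
print(is_vertex_cover(edges, cover))                   # True
print(independent_set_to_vertex_cover(n, edges, 2))    # bound becomes 4 - 2 = 2
```

The two checker functions are not part of the reduction; they are included only to confirm the "if and only if" property on this one example.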
Cook-Levin Theorem: Proved that the Boolean Satisfiability Problem (SAT) is NP-complete. This was the first problem proven to be NP-complete, providing a baseline to prove others via reduction.
Classic Examples of NP-Complete Problems:
- SAT (Satisfiability): Given a Boolean formula, is there an assignment of True/False values that makes the formula True?
- 3-SAT: A special case of SAT in which the formula is in conjunctive normal form and each clause has exactly 3 literals.
- Decision TSP: Is there a tour of length $\le K$ ?
- Vertex Cover: Is there a set of at most $k$ vertices that touches every edge in a graph?
- Knapsack Problem (Decision version): Can a value of at least $V$ be achieved without exceeding weight $W$ ?

3.5 The P vs NP Problem

The Question: Does $P = NP$?
- If $P = NP$ , every problem whose solutions can be checked in polynomial time can also be solved in polynomial time. (A practical algorithm of this kind would undermine much of modern public-key cryptography.)
- If $P \neq NP$ (which is widely believed), there are problems whose solutions can be verified quickly but for which no polynomial-time algorithm can find a solution. The best known algorithms for NP-complete problems take super-polynomial (typically exponential) time in the worst case.
Proving any single NP-complete problem to have a polynomial-time algorithm would instantly prove $P = NP$ .
