JosephFerano/Notes-TheAlgorithmDesignManual

Joseph Ferano 3e61af6cac Chapter 3 with one exercise; reversing words in a sentence in-place

2023-01-09 13:17:55 +07:00

Notes & Exercises: The Algorithm Design Manual

Chapter 1
Chapter 2
Chapter 3

Chapter 1

1.1 Robots

An algorithm is a procedure that takes any of the possible input instances and transforms it to the desired output.

The robot arm problem is presented: the arm must solder a set of contact points, visiting every point along the shortest possible tour.

The first algorithm considered is NearestNeighbor. However, this is naïve: on some inputs the arm hopscotches back and forth instead of sweeping across the points.

Next we consider ClosestPair, but that too fails to find the shortest tour on certain instances.

Next is OptimalTSP, which always gives the correct result because it enumerates all possible orderings of the points and returns the one with the shortest path. However, its running time grows at a rate of O(n!), which is already infeasible for 20 points. TSP stands for Traveling Salesman Problem.
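
To make the hopscotching concrete, here is a minimal sketch of the NearestNeighbor heuristic. It is not the book's pseudocode verbatim: for brevity it assumes the points lie on a line (1-D integer positions), and the names nearest_neighbor_tour, gap, and MAX_POINTS are made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_POINTS 64  /* illustrative cap so no heap allocation is needed */

/* Distance between two positions on a line. */
static long gap(long a, long b) { return a > b ? a - b : b - a; }

/* Starts at pos[0] and always moves to the closest unvisited point.
   Writes the visiting order (as indices) into tour[] and returns the
   length of the closed tour, or -1 if n is 0 or exceeds MAX_POINTS. */
long nearest_neighbor_tour(const long *pos, size_t n, size_t *tour) {
    bool visited[MAX_POINTS] = { false };
    if (n == 0 || n > MAX_POINTS) return -1;

    size_t current = 0;
    long total = 0;
    visited[0] = true;
    tour[0] = 0;

    for (size_t step = 1; step < n; step++) {
        size_t best = 0;
        long best_gap = -1;  /* -1 means "no candidate seen yet" */
        for (size_t j = 0; j < n; j++) {
            if (visited[j]) continue;
            long g = gap(pos[current], pos[j]);
            if (best_gap < 0 || g < best_gap) { best_gap = g; best = j; }
        }
        visited[best] = true;
        tour[step] = best;
        total += best_gap;
        current = best;
    }
    /* Close the tour by returning to the starting point. */
    return total + gap(pos[current], pos[0]);
}
```

On the positions { 0, 1, -1, 3, -5, 11, -21 } this sketch bounces left and right and returns a tour of length 84, whereas sweeping from one end to the other and back costs only 64.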

TODO Implement NearestNeighbor

TODO Implement ClosestPair

TODO Implement OptimalTSP for N < 8

1.2 Right Jobs

Here we are introduced to the Movie Scheduling Problem, where we try to pick the largest number of mutually non-overlapping movies an actor can accept. Algorithms considered are EarliestJobFirst, which takes whichever job starts soonest, and ShortestJobFirst, which takes the shortest job first, but both fail to find optimal solutions on some inputs.

ExhaustiveScheduling tries every subset of jobs and grows at a rate of O(2ⁿ), which is much better than O(n!) as in the previous problem but still impractical for large n. Finally, OptimalScheduling repeatedly accepts the job that finishes earliest and then discards every job that overlaps it, so those candidates are never considered again.
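
The earliest-finish-first idea can be sketched as follows. This is one possible rendering, not the book's code: the Interval type and the function names are hypothetical, and it assumes a job that starts exactly when the previous one ends does not overlap it.

```c
#include <stdlib.h>

/* Hypothetical job type: half-open time span [start, end). */
typedef struct { int start, end; } Interval;

/* qsort comparator: order jobs by finishing time. */
static int by_end(const void *a, const void *b) {
    const Interval *x = a, *y = b;
    return (x->end > y->end) - (x->end < y->end);
}

/* Sorts jobs[] by end time (reordering the caller's array), then greedily
   keeps each job that starts no earlier than the last accepted job ends.
   Accepted jobs are copied to chosen[], which must hold n entries.
   Returns how many jobs were accepted. */
size_t optimal_scheduling(Interval *jobs, size_t n, Interval *chosen) {
    qsort(jobs, n, sizeof jobs[0], by_end);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (count == 0 || jobs[i].start >= chosen[count - 1].end) {
            chosen[count++] = jobs[i];
        }
    }
    return count;
}
```

Sorting dominates the cost, so this sketch runs in O(n log n) instead of the O(2ⁿ) of the exhaustive search.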

1.3 Correctness

It's important to be clear about the steps in pseudocode when designing algorithms on paper. There are important things to consider about algorithm correctness:

Verifiability
Simplicity
Think small
Think exhaustively
Hunt for the weakness
Go for a tie
Seek extremes

Other techniques include induction, recursion, and summations.

1.4 Modeling

Most algorithms are designed to work on rigorously defined abstract structures. These fundamental structures include:

Permutations
Subsets
Trees
Graphs
Points
Polygons
Strings

1.5-1.6 War Story about Psychics

Chapter 2

2.1 RAM Model of Computation

This is a simple, machine-independent model for counting the steps an algorithm takes, where:

Each simple operation is 1 step
Loops and Subroutines are composition of simple operations
Each memory access is one time step

Like the flat-earth model, it is not strictly true, but it is accurate enough in practice: when engineering most structures we do not take the curvature of the Earth into account either.

We can already apply the concept of worst, average, and best case to this model.

2.2 Big Oh

Counting exact steps in the RAM model requires a fully specified implementation, so Big Oh instead gives us a simpler framework for discussing the relative performance of algorithms. It ignores constant factors and lower-order terms, which don't impact how algorithms scale.

2.3 Growth Rates and Dominance Relations

These are the functions that occur in algorithm analyses:

Constant	O(1)	Hashtable lookup, array lookup, consing a list
Logarithmic	O(log n)	Binary Search
Linear	O(n)	Iterating over a list
Superlinear	O(n log n)	Quicksort and Mergesort
Quadratic	O(n²)	Insertion Sort and Selection Sort
Cubic	O(n³)	Some dynamic programming problems
Exponential	O(cⁿ) for any constant c > 1	Enumerate all subsets
Factorial	O(n!)	Generating all permutations or orderings

Notes:

O(n!) algorithms become useless for anything n >= 20
O(2ⁿ) algorithms become impractical for anything n > 40
O(n²) algorithms start deteriorating after n > 10,000; a million is hopeless
O(n) and O(n log n) algorithms are fine up to 1 billion

2.4 Working with Big Oh

You can do arithmetic on Big Oh functions: when adding, the dominant term wins, and when multiplying, the functions multiply.
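
Written out, the standard rules look roughly like this, followed by one worked example (the example function is made up for illustration):

```latex
\begin{align*}
O(f(n)) + O(g(n)) &\rightarrow O(\max(f(n), g(n))) \\
O(f(n)) \cdot O(g(n)) &\rightarrow O(f(n) \cdot g(n)) \\
O(c \cdot f(n)) &\rightarrow O(f(n)) \quad \text{for any constant } c > 0 \\[4pt]
3n^2 + 10\,n \log n + 5n &= O(n^2)
\end{align*}
```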

2.5 Efficiency

Selection Sort

C

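#include <stdio.h>

/* Print the array on one line, comma-separated. */
void print_nums(int *nums, int length) {
    for (int i = 0; i < length; i++) {
        printf("%d,", nums[i]);
    }
    printf("\n");
}

/* Sort nums in place, printing the array at the start of each pass. */
void selection_sort(int *nums, int length) {
    int i, j;
    int min_idx;
    for (i = 0; i < length; i++) {
        print_nums(nums, length);
        /* Find the smallest element in the unsorted suffix nums[i..]. */
        min_idx = i;
        for (j = i+1; j < length; j++) {
            if (nums[j] < nums[min_idx]) {
                min_idx = j;
            }
        }
        /* Swap it into position i, growing the sorted prefix by one. */
        int temp = nums[min_idx];
        nums[min_idx] = nums[i];
        nums[i] = temp;
    }
}

int main(void) {
    int nums[9] = { 2, 4, 9, 1, 3, 8, 5, 7, 6 };
    selection_sort(nums, 9);
    return 0;
}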

2	4	9	1	3	8	5	7	6
1	4	9	2	3	8	5	7	6
1	2	9	4	3	8	5	7	6
1	2	3	4	9	8	5	7	6
1	2	3	4	9	8	5	7	6
1	2	3	4	5	8	9	7	6
1	2	3	4	5	6	9	7	8
1	2	3	4	5	6	7	9	8
1	2	3	4	5	6	7	8	9

Insertion Sort

C

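#include <stdio.h>

/* Sort nums in place by sinking each element into the sorted prefix. */
void insertion_sort(int *nums, int len) {
    int i, j;
    for (i = 1; i < len; i++) {
        j = i;
        /* Check j > 0 first so nums[j - 1] is never read out of bounds. */
        while (j > 0 && nums[j] < nums[j - 1]) {
            int temp = nums[j];
            nums[j] = nums[j - 1];
            nums[j - 1] = temp;
            j--;
        }
    }
}

int main(void) {
    int nums[8] = {1,4,5,2,8,3,7,9};
    insertion_sort(nums, 8);
    for (int i = 0; i < 8; i++) {
        printf("%d", nums[i]);
    }
    return 0;
}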

12345789

TODO String Pattern Matching

TODO Matrix Multiplication

2.6 Logarithms

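Logarithms are the inverse of exponentiation: bˣ = y is equivalent to x = log_b y. Binary search is great for sorted lists because each comparison halves the remaining range, giving O(log n) steps. There are also applications related to fast exponentiation, binary trees, harmonic numbers, and criminal sentencing.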
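
As a small illustration of where the logarithm comes from, here is a sketch of binary search over a sorted int array. The function name and signature are chosen for this example only.

```c
#include <stddef.h>

/* Returns the index of key in the ascending array sorted[0..n-1],
   or -1 if it is absent. The window [lo, hi) halves on every pass,
   so the loop body runs at most about log2(n) + 1 times. */
long binary_search(const int *sorted, size_t n, int key) {
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;  /* avoids overflow of lo + hi */
        if (sorted[mid] == key) return (long)mid;
        if (sorted[mid] < key) lo = mid + 1;
        else hi = mid;
    }
    return -1;
}
```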

2.7 Properties of Logarithms

Common bases for logarithms include 2, e, and 10. The base of the logarithm has no real impact on the growth rate; log₂ n and log₃ n differ only by a constant factor.
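
The change-of-base identity shows why the base drops out of Big Oh; the numeric constant below is rounded:

```latex
\[
\log_a n = \frac{\log_b n}{\log_b a}
\qquad\Longrightarrow\qquad
\log_2 n = \frac{\log_3 n}{\log_3 2} \approx 1.585 \, \log_3 n
\]
```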

2.8 War Story Pyramids

Cool story bro

2.9 Advanced Analysis

Some advanced stuff

Inverse Ackermann function	Union-Find data structure
log log n	Binary search on a sorted array of only log n items
log n / log log n
log² n
√n

There are also limits and dominance relations

Chapter 3

3.1 Contiguous vs Linked Data Structures

Advantages of Arrays

Constant-time access given the index
Space efficiency
Memory locality

The downside is that a plain array cannot grow, but dynamic arrays fix this by allocating a new, bigger array and copying the contents over when needed.
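
A minimal sketch of the growing step, assuming a doubling policy and an initial capacity of 4 (both arbitrary choices for this example). The DynArray type and function names are hypothetical; realloc may extend the block in place or allocate a bigger one and copy.

```c
#include <stdlib.h>

/* Hypothetical growable int array. Start from { NULL, 0, 0 }. */
typedef struct {
    int *items;
    size_t len;  /* elements in use */
    size_t cap;  /* elements allocated */
} DynArray;

/* Appends value, doubling the capacity when the array is full.
   Returns 0 on success, -1 if allocation fails (array left unchanged). */
int dyn_push(DynArray *a, int value) {
    if (a->len == a->cap) {
        size_t new_cap = a->cap ? a->cap * 2 : 4;
        int *grown = realloc(a->items, new_cap * sizeof *grown);
        if (!grown) return -1;
        a->items = grown;
        a->cap = new_cap;
    }
    a->items[a->len++] = value;
    return 0;
}

/* Releases the storage and resets the array to empty. */
void dyn_free(DynArray *a) {
    free(a->items);
    a->items = NULL;
    a->len = a->cap = 0;
}
```

Because the capacity doubles, n appends trigger only about log n reallocations, so the total copying work stays O(n).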

Advantages of Linked Structures

No overflow, can keep growing
Insertions/Deletions are simpler
With large records, moving pointers is easier and faster than moving the items themselves

However, linked structures require extra space for storing pointer fields.

3.2 Stacks and Queues

Stacks

(PUSH, POP) LIFO, useful in executing recursive algorithms.

Queues

(ENQUEUE, DEQUEUE) FIFO, useful for breadth-first searches in graphs.
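
Both containers can be sketched over a fixed-size array. The capacity, type names, and function names below are assumptions made for this example; the queue uses a circular buffer so that dequeuing does not shift elements.

```c
#include <stdbool.h>
#include <stddef.h>

#define CAPACITY 8  /* arbitrary fixed size for the sketch */

typedef struct { int items[CAPACITY]; size_t top; } Stack;
typedef struct { int items[CAPACITY]; size_t head, count; } Queue;

/* LIFO: push and pop both work at the top of the array. */
bool stack_push(Stack *s, int v) {
    if (s->top == CAPACITY) return false;  /* full */
    s->items[s->top++] = v;
    return true;
}

bool stack_pop(Stack *s, int *out) {
    if (s->top == 0) return false;         /* empty */
    *out = s->items[--s->top];
    return true;
}

/* FIFO: enqueue writes after the last element, dequeue reads at head.
   Indices wrap around modulo CAPACITY. */
bool queue_enqueue(Queue *q, int v) {
    if (q->count == CAPACITY) return false;
    q->items[(q->head + q->count) % CAPACITY] = v;
    q->count++;
    return true;
}

bool queue_dequeue(Queue *q, int *out) {
    if (q->count == 0) return false;
    *out = q->items[q->head];
    q->head = (q->head + 1) % CAPACITY;
    q->count--;
    return true;
}
```

Pushing 1, 2, 3 and popping yields 3 first, while enqueuing 1, 2, 3 and dequeuing yields 1 first.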

3.3 Dictionaries

Not just hashtables but anything that can provide access to data by content. Some dictionaries are implemented with trees instead of hashing. Both contiguous and linked structures can be used, with tradeoffs between them.

3.4 Binary Search Trees

Each node in a BST has a parent and up to two children, left and right, with every key in the left subtree smaller and every key in the right subtree larger than the node's own key. They support insertion, deletion, and traversal. Interestingly, Min and Max can be found by seeking the leftmost and rightmost node respectively; this works whether or not the tree is balanced. The operations take O(h) time, where h is the height of the BST, so performance is good in most cases so long as the tree remains balanced and h stays close to log n.
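
A compact sketch of insertion and the leftmost/rightmost walk, under a few assumptions of this example: int keys, duplicates ignored, no parent pointer, and no rebalancing. All names are illustrative.

```c
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

/* Inserts key and returns the (possibly new) subtree root. Duplicates
   are ignored; if malloc fails the tree is left unchanged. */
Node *bst_insert(Node *root, int key) {
    if (!root) {
        Node *n = malloc(sizeof *n);
        if (!n) return NULL;
        n->key = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->key) root->left = bst_insert(root->left, key);
    else if (key > root->key) root->right = bst_insert(root->right, key);
    return root;
}

/* Minimum is the leftmost node, maximum the rightmost: O(h) each. */
Node *bst_min(Node *root) {
    while (root && root->left) root = root->left;
    return root;
}

Node *bst_max(Node *root) {
    while (root && root->right) root = root->right;
    return root;
}

/* Height counted in nodes; an empty tree has height 0. */
int bst_height(const Node *root) {
    if (!root) return 0;
    int l = bst_height(root->left), r = bst_height(root->right);
    return 1 + (l > r ? l : r);
}

void bst_free(Node *root) {
    if (!root) return;
    bst_free(root->left);
    bst_free(root->right);
    free(root);
}
```

Inserting 5, 3, 8, 1, 4, 9 gives a tree of height 3, while inserting 1, 2, 3, 4, 5 in order degenerates into a chain of height 5, which is why O(h) is only as good as the balance of the tree.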

3.5 Priority Queues

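Priority queues allow new elements to enter a system at arbitrary intervals, which is more flexible than re-sorting everything on each arrival. The basic operations are Insert, Find-Minimum (or Find-Maximum), and Delete-Minimum (or Delete-Maximum).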

3.6 War Story

Rather than storing all of the vertices of a mesh, you can share them between the different triangles, but connecting all vertices requires visiting each vertex once, which is a Hamiltonian path problem, and that is NP-complete. Instead, they used a greedy heuristic that always tries to grab the best possible thing first. Then, by using a priority queue, they were able to reduce the running time by several orders of magnitude compared to the naïve approach.

3.7 Hashing and Strings

A hash function maps each key to a big integer, then takes it modulo the table size m so that it wraps around into a bucket index; if m is a large prime, you'll get a fairly uniform distribution. The two main ways to resolve collisions are chaining and open addressing. With chaining, each bucket holds a linked list and colliding items are added to that list. Open addressing stores items directly in the table and, on a collision, probes for a nearby empty bucket.
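
Chaining can be sketched like this. Several details are assumptions of the example rather than anything from the notes: the bucket count 101, the multiplier 31, string keys with int values, keys borrowed from the caller rather than copied, and new entries placed at the head of the chain instead of the tail.

```c
#include <stdlib.h>
#include <string.h>

#define NUM_BUCKETS 101  /* a prime, chosen arbitrarily for the sketch */

typedef struct Entry {
    const char *key;     /* borrowed: the caller keeps the string alive */
    int value;
    struct Entry *next;
} Entry;

typedef struct { Entry *buckets[NUM_BUCKETS]; } Table;

/* Treat the string as a big base-31 number (unsigned wrap-around is
   well defined), then reduce it modulo the number of buckets. */
static size_t hash(const char *s) {
    unsigned long h = 0;
    while (*s) h = h * 31 + (unsigned char)*s++;
    return (size_t)(h % NUM_BUCKETS);
}

/* Inserts or updates key. Returns 0 on success, -1 if malloc fails. */
int table_put(Table *t, const char *key, int value) {
    size_t b = hash(key);
    for (Entry *e = t->buckets[b]; e; e = e->next) {
        if (strcmp(e->key, key) == 0) { e->value = value; return 0; }
    }
    Entry *e = malloc(sizeof *e);
    if (!e) return -1;
    e->key = key;
    e->value = value;
    e->next = t->buckets[b];  /* link in at the head of the chain */
    t->buckets[b] = e;
    return 0;
}

/* Returns a pointer to the stored value, or NULL if key is absent. */
const int *table_get(const Table *t, const char *key) {
    for (const Entry *e = t->buckets[hash(key)]; e; e = e->next) {
        if (strcmp(e->key, key) == 0) return &e->value;
    }
    return NULL;
}

/* Frees every chain node (not the borrowed keys). */
void table_free(Table *t) {
    for (size_t b = 0; b < NUM_BUCKETS; b++) {
        Entry *e = t->buckets[b];
        while (e) { Entry *next = e->next; free(e); e = next; }
        t->buckets[b] = NULL;
    }
}
```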

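Hashing is also useful when dealing with strings, in particular substring pattern matching. Overlaying pattern p over every position in text t takes O(mn) time, where n is the length of t and m is the length of p. With hashing, you can hash each length-m slice of t and compare it to the hash of p, updating the hash in constant time as the window slides, which gives expected linear time. This is called the Rabin-Karp algorithm. False positives may occur when two different strings hash to the same value, so a hash match should be confirmed by comparing the strings directly; a good hashing function makes such collisions rare.

Hashing is so important that Yahoo! employs it extensively.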
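
A sketch of the rolling-hash search, with assumed constants (base 256 and the prime modulus 1000003) that are small enough for the arithmetic to fit in a 32-bit unsigned long. The function name is made up, and every hash match is confirmed with memcmp to rule out false positives.

```c
#include <stddef.h>
#include <string.h>

#define BASE 256u
#define MOD 1000003u  /* a prime; illustrative choice */

/* Returns the index of the first occurrence of pattern in text, or -1. */
long rabin_karp(const char *text, const char *pattern) {
    size_t n = strlen(text), m = strlen(pattern);
    if (m == 0) return 0;
    if (m > n) return -1;

    /* high = BASE^(m-1) mod MOD, the weight of the window's first char. */
    unsigned long high = 1;
    for (size_t i = 1; i < m; i++) high = (high * BASE) % MOD;

    /* Hash the pattern and the first window of the text. */
    unsigned long hp = 0, ht = 0;
    for (size_t i = 0; i < m; i++) {
        hp = (hp * BASE + (unsigned char)pattern[i]) % MOD;
        ht = (ht * BASE + (unsigned char)text[i]) % MOD;
    }

    for (size_t i = 0; ; i++) {
        /* Only compare characters when the hashes agree. */
        if (hp == ht && memcmp(text + i, pattern, m) == 0) return (long)i;
        if (i + m >= n) break;
        /* Slide the window: drop text[i], append text[i + m]. */
        ht = (ht + MOD - ((unsigned char)text[i] * high) % MOD) % MOD;
        ht = (ht * BASE + (unsigned char)text[i + m]) % MOD;
    }
    return -1;
}
```

For example, searching "the quick brown fox" for "brown" returns index 10.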

3.8 Specialized Data Structures

These include:

String	Characters in an array
Geometric	Collection of points and regions/polygons
Graph	Using adjacency matrices
Set	Dictionaries and bit vectors

3.9 War Story

They were trying to implement sequencing by hybridization (SBH), but ran into issues when they used a BST. Then they tried a hashtable, then a trie. Finally what worked was a compressed suffix tree.

Exercises

3.42

Reverse the words in a sentence—that is, “My name is Chris” becomes “Chris is name My.” Optimize for time and space.

#include <stdio.h>
#include <string.h>

/* Reverse the first `length` characters of `string` in place. */
void reverse_word(char *string, int length) {
    for (int i = 0; i < length / 2; i++) {
        char temp = string[i];
        string[i] = string[length - 1 - i];
        string[length - 1 - i] = temp;
    }
}

/* Reverse the whole sentence, then reverse each word back into reading
   order. Runs in O(n) time with O(1) extra space. */
void reverse_words(char *string, int length) {
    printf("Before: %s\n", string);
    reverse_word(string, length);
    printf("After: %s\n", string);
    int start = 0;
    for (int i = 0; i <= length; i++) {
        /* A word ends at a space or at the end of the string. */
        if (i == length || string[i] == ' ') {
            reverse_word(&string[start], i - start);
            start = i + 1;
        }
    }
}

int main(void) {
    char str[] = "My name is Chris";
    reverse_words(str, strlen(str));
    printf("Final: %s\n", str);
    return 0;
}

Before:	My	name	is	Chris
After:	sirhC	si	eman	yM
Final:	Chris	is	name	My
