Saturday, July 30, 2016

DP - Longest increasing subsequence

The problem

Given a sequence of numbers S1, S2, ..., Sn
we have to find the longest sub-sequence

R1, R2, ..., Rk

such that

R1 < R2 < ... < Rk


The solution

First we create a table by scanning the numbers from start to end. For each number, we look among the earlier numbers that are smaller than the current number and pick the one with the longest sequence ending at it. We set the current item's sequence length to be one more than that number's length, and we set the predecessor of the current item to be that number's index.
Index:        0   1   2   3   4   5   6   7   8   9  10  11
Sequence:     6   1   3   2   8   4   9   5   7  12  16  11
Length:       1   1   2   2   3   3   4   4   5   6   7   6
Predecessor: -1  -1   1   1   2   2   4   5   7   8   9   8
From the table we see that the maximum length is 7, at index 10. So we start at index 10, which will be the last item of the sub-sequence. The value is 16. Then we find its predecessor from the predecessor list at index 10, which is 9. So the next-to-last item will be the item at index 9, which is 12. That has predecessor index 8, and so on.
The longest increasing sub-sequence is:
1, 3, 4, 5, 7, 12, 16

Here is the code in python:

 

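A minimal sketch of the O(n^2) algorithm described above: it builds the length and predecessor tables, then walks the predecessor chain backwards.

def longest_increasing_subsequence(seq):
    n = len(seq)
    if n == 0:
        return []

    length = [1] * n    # length[i]: longest increasing subsequence ending at seq[i]
    pred = [-1] * n     # pred[i]: index of the previous item in that subsequence

    for i in range(n):
        for j in range(i):
            # Among the smaller earlier numbers, pick the one with the longest sequence
            if seq[j] < seq[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                pred[i] = j

    # The sub-sequence ends where the length table has its maximum
    best = max(range(n), key=lambda i: length[i])

    result = []
    i = best
    while i != -1:
        result.append(seq[i])
        i = pred[i]
    return result[::-1]

print(longest_increasing_subsequence([6, 1, 3, 2, 8, 4, 9, 5, 7, 12, 16, 11]))
# prints [1, 3, 4, 5, 7, 12, 16]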

Tuesday, April 15, 2014

CS131: Quicksort and quickselect

Quick sort is a divide-and-conquer, in-place sorting algorithm. It is a preferred sorting algorithm for many practical applications because its average-case performance is very good: O(n log n).

Quick sort takes an array and partitions it around a pivot item X such that elements left of X are smaller than X and elements right of X are greater than or equal to X. Partitioning an array takes O(n) time.

It then recursively sorts the subarrays left of X and right of X. If the partitions land near the middle on average, the total runtime will be O(n log n).

partition(items, left, right)
    X = items[left]
    I = left
    FOR J = left+1 TO right 
        IF items[J] <= X
            I = I + 1 
            swap(items[I], items[J])

    swap(items[left], items[I])

    RETURN I
quicksort(items, left, right)
    if left < right
        r = partition(items, left, right)
        quicksort(items, left, r-1)
        quicksort(items, r+1, right)

We call the quicksort function passing the array and the entire range initially. For a 0-indexed array of n items:

quicksort(items, 0, n-1);

Lets take an array of numbers

items = [27, 54, 43, 56, 9, 23, 4, 13, 17]

We start by calling

quicksort(items, 0, 8)

Let's execute the partition statement of the quicksort function, since left < right, or (0 < 8), is TRUE

r = partition(items, 0, 8)

Initially left = 0, right = 8

X = 27, I = 0

FOR J = 1

item[1] = 54 (54 <= 27) is FALSE so we move to the next iteration

FOR J = 2

item[2] = 43 (43 <= 27) is FALSE so we move to the next iteration

FOR J = 3

item[3] = 56 (56 <= 27) is FALSE so we move to the next iteration

FOR J = 4

item[4] = 9 (9 <= 27) is TRUE. I = 0 + 1 = 1. We swap items[1] and items[4], so 9 and 54 swap their places and items becomes

items = [27, 9, 43, 56, 54, 23, 4, 13, 17]

FOR J = 5

item[5] = 23 (23 <= 27) is TRUE. I = 1 + 1 = 2. We swap items[2] and items[5], so 43 and 23 swap their places and items becomes

items = [27, 9, 23, 56, 54, 43, 4, 13, 17]

FOR J = 6

item[6] = 4 (4 <= 27) is TRUE. I = 2 + 1 = 3. We swap items[3] and items[6], so 4 and 56 swap their places and items becomes

items = [27, 9, 23, 4, 54, 43, 56, 13, 17]

FOR J = 7

item[7] = 13 (13 <= 27) is TRUE. I = 3 + 1 = 4. We swap items[4] and items[7], so 13 and 54 swap their places and items becomes

items = [27, 9, 23, 4, 13, 43, 56, 54, 17]

FOR J = 8

item[8] = 17 (17 <= 27) is TRUE. I = 4 + 1 = 5. We swap items[5] and items[8], so 17 and 43 swap their places and items becomes

items = [27, 9, 23, 4, 13, 17, 56, 54, 43]

The FOR loop is now complete (J has reached right).

We now swap items[left] = items[0] and items[5] so 17 and 27 swap their places and items becomes

items = [17, 9, 23, 4, 13, 27, 56, 54, 43]

Now if you look at the items array, any number that is smaller than 27 is on the left of 27 and any number greater is on the right side of 27.

Now if we recursively partition left of 27 and right of 27 we will have our sorted items.

Here r = 5, and the pivot at index 5 is already in its final place. So we call quicksort with

quicksort(items, 0, 4)
quicksort(items, 6, 8)

We continue this, and recursion along any branch stops when left < right is no longer true. Then the subarray is completely sorted and there is nothing to do.
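The pseudocode translates to runnable Python as follows; a direct transcription using 0-based indices:

def partition(items, left, right):
    # Lomuto-style partition using the leftmost element as the pivot
    x = items[left]
    i = left
    for j in range(left + 1, right + 1):
        if items[j] <= x:
            i += 1
            items[i], items[j] = items[j], items[i]
    items[left], items[i] = items[i], items[left]   # put the pivot in its place
    return i

def quicksort(items, left, right):
    if left < right:
        r = partition(items, left, right)
        quicksort(items, left, r - 1)
        quicksort(items, r + 1, right)

items = [27, 54, 43, 56, 9, 23, 4, 13, 17]
quicksort(items, 0, len(items) - 1)
print(items)   # [4, 9, 13, 17, 23, 27, 43, 54, 56]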

Quickselect

The quickselect algorithm finds the k-th smallest number in an array containing at least k items (you cannot find the 6th smallest item in an array of 5 items).

This algorithm is very similar to the quicksort algorithm and uses the same partition function. After a partition is made, if the partition position r is greater than k we recurse into the left subarray (left, r-1), and if r is smaller than k we recurse into the right subarray (r+1, right).

quickselect(items, left, right, k)
    if left == right
        return items[left]

    r = partition(items, left, right)

    if r == k
        return items[k]
    if r > k
        return quickselect(items, left, r-1, k)
    else
        return quickselect(items, r+1, right, k)

Please note that the algorithm does not check any error conditions.
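A Python sketch of quickselect along the same lines, reusing the partition function from the quicksort code above; here k is a 0-based index, so k = 3 asks for the 4th smallest item:

def quickselect(items, left, right, k):
    # Returns the item that would sit at index k if items[left..right] were sorted
    if left == right:
        return items[left]

    r = partition(items, left, right)

    if r == k:
        return items[k]
    if r > k:
        return quickselect(items, left, r - 1, k)
    return quickselect(items, r + 1, right, k)

items = [27, 54, 43, 56, 9, 23, 4, 13, 17]
print(quickselect(items, 0, len(items) - 1, 3))   # 17, the 4th smallest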

Sunday, April 6, 2014

CS131: Graph

A graph G is a set of vertices V and a set of edges E that connect the vertices.
G = (V,E)
This notation means V is a set of vertices and E is a set of edges. For example
V = {A,B,C,D} E =

Directed and undirected graph

A graph can be either directed or undirected. In a directed graph, or digraph, edges have a direction.
Undirected graph
Fig: Undirected Graph
Directed graph or digraph
Fig: Directed graph
Multiple copies of an edge are not allowed in a graph. Digraphs may have two edges between the same vertices in opposite directions.

Weighted graph

A graph G(V,E), either directed or undirected, where each edge e in E has an associated weight.
Fig: Weighted graph
Weights can be either positive or negative.

Terminology

Path

A sequence of vertices where each pair of adjacent vertices is connected by an edge. For a directed graph, each edge that connects adjacent vertices must have a direction aligned with the path.
The number of edges in the path is the length of the path.

Strongly connected

If a graph has edges such that there is a path from any vertex to any other vertex, then the graph is a strongly connected graph.

Degree of vertex

The number of edges that connect a vertex to other vertices, including itself (a self edge), is the degree of that vertex. A self edge is counted once.

Indegree

For a digraph, the number of edges that are directed toward a vertex is the indegree of that vertex.

Outdegree

For a digraph, the number of edges that are directed away from a vertex is the outdegree of that vertex.

Planar graph

A graph that can be drawn on a plane without crossing edges.

Sparse graph

A graph that has far fewer edges than the maximum possible number of edges is called a sparse graph.

Graph representation

Graphs, whether directed or undirected, weighted or not, are generally represented in memory in one of two standard ways: using an adjacency list or using an adjacency matrix.
If a graph is sparse, an adjacency list is used to save memory. For a dense graph, an adjacency matrix saves memory and gives fast edge lookups.
To illustrate the two types of representation we'll use the following graphs.
Fig: Undirected graph
Fig: Directed graph

Adjacency list

Uses an array of lists, one list per vertex, where each list contains the vertices that are adjacent to the vertex the list represents.
[A|*]-->[B|*]-->[C|*]-->[D|$]
[B|*]-->[A|*]-->[E|$]
[C|*]-->[A|*]-->[D|*]-->[E|$]
[D|*]-->[A|*]-->[C|*]-->[F|$]
[E|*]-->[B|*]-->[C|*]-->[F|$]
[F|*]-->[D|*]-->[E|$]

Fig: Adjacency list representation for undirected graph
[A|*]-->[A|*]-->[B|*]-->[D|$]
[B|*]-->[A|$]
[C|*]-->[A|*]-->[E|$]
[D|*]-->[C|*]-->[F|$]
[E|*]-->[B|$]
[F|*]-->[E|$]

Fig: Adjacency list representation for digraph

Adjacency matrix

To represent a graph G(V,E), a matrix of size |V|x|V|, requiring O(V^2) memory, is used. Each item a[i,j] in the matrix is 1 (or the actual weight of the edge in the case of a weighted graph) if the edge (i,j) exists in E, and NIL (or any convenient value like zero) otherwise.
    A B C D E F
   _           _
A | 0 1 1 1 0 0 |
B | 1 0 0 0 1 0 |
C | 1 0 0 1 1 0 |
D | 1 0 1 0 0 1 |
E | 0 1 1 0 0 1 |
F | 0 0 0 1 1 0 |
   -           -

Fig: Adjacency matrix representation for undirected graph
    A B C D E F
   _           _
A | 1 1 0 1 0 0 |
B | 1 0 0 0 0 0 |
C | 1 0 0 0 1 0 |
D | 0 0 1 0 0 1 |
E | 0 1 0 0 0 0 |
F | 0 0 0 0 1 0 |
   -           -

Fig: Adjacency matrix representation for digraph
An adjacency matrix is simpler to use and gives faster edge lookups than an adjacency list. If a graph is small we may prefer to use an adjacency matrix.
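To make the two representations concrete, here is a Python sketch that builds both from an edge list; the edge list is read off the undirected example graph above:

vertices = ['A', 'B', 'C', 'D', 'E', 'F']
edges = [('A','B'), ('A','C'), ('A','D'), ('B','E'),
         ('C','D'), ('C','E'), ('D','F'), ('E','F')]

# Adjacency list: a dict mapping each vertex to its neighbours
adj_list = {v: [] for v in vertices}
for u, v in edges:
    adj_list[u].append(v)
    adj_list[v].append(u)    # undirected: record the edge both ways

# Adjacency matrix: nested lists, 1 where an edge exists
index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)
adj_matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    adj_matrix[index[u]][index[v]] = 1
    adj_matrix[index[v]][index[u]] = 1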

Graph traversals

The process of visiting each vertex of the graph once is called graph traversal.

Depth-first search (DFS)

Visits deeper level nodes first before visiting any sibling of the current node.
Fig: DFS node visit order
Uses a stack or recursion to keep track of adjacent nodes that are not visited yet.
DFS(G)
    distance = 0
    for each vertex u in G
        if u.visited != true
            DFS-Visit(G, u)

DFS-Visit(G, u)
    u.visited = true
    distance = distance + 1
    u.distance = distance
    for each vertex v in Adjacent(u)
        if v.visited != true
            v.prev = u
            DFS-Visit(G, v)
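A runnable Python sketch of DFS using an explicit stack instead of recursion; the graph dictionary mirrors the adjacency list shown earlier:

def dfs(graph, source):
    # graph: dict mapping each vertex to a list of adjacent vertices
    visited = set()
    order = []
    stack = [source]
    while stack:
        u = stack.pop()
        if u in visited:
            continue
        visited.add(u)
        order.append(u)
        for v in graph[u]:
            if v not in visited:
                stack.append(v)   # the last vertex pushed is explored first
    return order

graph = {
    'A': ['B', 'C', 'D'], 'B': ['A', 'E'], 'C': ['A', 'D', 'E'],
    'D': ['A', 'C', 'F'], 'E': ['B', 'C', 'F'], 'F': ['D', 'E'],
}
print(dfs(graph, 'A'))   # ['A', 'D', 'F', 'E', 'C', 'B']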

BFS

Visits all current level nodes first before visiting any deeper level nodes.
Fig: BFS node visit order
Uses a queue to keep track of adjacent nodes.
Finds the distance of each node from the source vertex, or root.
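A minimal Python sketch of BFS using a queue, recording each vertex's distance from the source (using the same graph dictionary as the DFS example above):

from collections import deque

def bfs(graph, source):
    distance = {source: 0}        # doubles as the visited set
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)
    return distance

print(bfs(graph, 'A'))   # {'A': 0, 'B': 1, 'C': 1, 'D': 1, 'E': 2, 'F': 2}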

Topological Sort

A topological sort orders the nodes of a directed acyclic graph G such that each directed edge (u,v) indicates that u must come before v in the ordering. The nodes may represent ordered events. If G has a cycle, topological sorting is not possible.
Implementation:
a. Using DFS. O(V+E)
b. Repeat: find a vertex of indegree 0 and output it, then remove the vertex and all its outgoing edges. Until: no vertex remains. O(V+E)
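Approach b translates to the following Python sketch (often called Kahn's algorithm); the small DAG at the end is a made-up example:

from collections import deque

def topological_sort(graph):
    # graph: dict mapping each vertex to the vertices it points to; must be a DAG
    indegree = {u: 0 for u in graph}
    for u in graph:
        for v in graph[u]:
            indegree[v] += 1

    queue = deque(u for u in graph if indegree[u] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in graph[u]:        # "remove" u and its outgoing edges
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)

    if len(order) != len(graph):
        raise ValueError("graph has a cycle")
    return order

dag = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(topological_sort(dag))   # ['A', 'B', 'C', 'D']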

Minimum Spanning Tree

Kruskal's Algorithm

Kruskal's algorithm first creates a new graph G' with all the nodes of graph G. The new graph initially has no edges. The algorithm then takes edges in increasing order of weight and joins trees together to form bigger trees. An edge is selected only if the two nodes it connects are not already part of the same tree. The process continues until no edge remains that can connect two different trees.
We can use a disjoint set (union-find) data structure to determine whether two nodes are in the same set.

The algorithm

Take all nodes and create disjoint sets that initially contain one node each.
Create a list of all the edges sorted by weight in increasing order.

Selected Edges = empty
While the list is not empty
    Remove the smallest weighted edge E from the list.
    If the nodes N1 and N2 that are connected by E are not in the same set
       Put E into Selected Edges
       Union the sets that contains N1 and N2 
After the algorithm completes, the edges in the Selected Edges list form the minimum spanning tree if the graph is connected, or otherwise a minimum spanning forest.
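A compact Python sketch of the algorithm, with a small disjoint-set (union-find) built in; the weighted edge list at the end is a made-up example:

def kruskal(nodes, edges):
    # edges: list of (weight, n1, n2) tuples
    parent = {n: n for n in nodes}      # disjoint-set forest: one set per node

    def find(n):                        # representative of n's set
        while parent[n] != n:
            parent[n] = parent[parent[n]]    # path halving
            n = parent[n]
        return n

    selected = []
    for w, n1, n2 in sorted(edges):     # increasing order of weight
        r1, r2 = find(n1), find(n2)
        if r1 != r2:                    # different trees, so no cycle is formed
            selected.append((w, n1, n2))
            parent[r1] = r2             # union the two sets
    return selected

nodes = ['A', 'B', 'C', 'D']
edges = [(1, 'A', 'B'), (4, 'B', 'C'), (3, 'A', 'C'), (2, 'C', 'D')]
print(kruskal(nodes, edges))   # [(1, 'A', 'B'), (2, 'C', 'D'), (3, 'A', 'C')]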

CS131: 2-3-4 Tree [Draft]

A 2-3-4 tree is a perfectly balanced tree (all leaves are on the same level) with O(log n) worst-case time for find, insert and remove operations. As the name suggests, every internal node of the tree has 2, 3 or 4 children.

Operations

Find

Search as in a binary search tree, except that each node holds one to three keys: at each node, follow the child pointer determined by where k falls among the node's keys.

Insert

Search for a node with key k. If a node X with key k is found, proceed to X's left until a leaf node is reached. On the way down, whenever a node with 3 keys is found, we move its middle key up to its parent; the parent has at most two keys, because we split on the way down.

Fig:

Insertion may increase the depth of the tree by one level.

Remove

Find the node X with key k to remove. If X is a leaf, remove it and return. If X is an internal node, search for the leaf node Y with the smallest key k' such that k' > k, and replace X's key with k' so that k is removed and Y's key takes its place.

If we find a one-key node on the path to the leaf node we want to remove, we repeatedly apply the following steps (the root node is exempt):

  1. If X is a one-key node that is not the root, move one key from an adjacent sibling by rotation.

Fig:

  2. If there is no adjacent sibling with more than one key, take one key down from the parent. Internal nodes always have two or three keys. Moving down one key may trigger further operations if the parent is left with one key (verify).

Fig:

  3. If the parent is the root and there is no adjacent sibling with more than one key (steps 1 and 2 fail), create a 3-key node to form the new root. This is called a fusion operation. The tree is decreased by one level.

Fig:

Remove may decrease the height of the tree by one level.

CS131: Splay Tree

A splay tree is a balanced binary search tree. On average, all operations take O(log n) time.
Any sequence of k operations starting from an empty tree and never exceeding n items in the tree will take O(k log n) worst-case time.
The data structure ensures fast access to recently accessed data and is much easier to program than other balanced trees like the 2-3-4 tree.
A splay tree is not kept perfectly balanced at all times.

Tree rotation
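
A rotation moves a child up to its parent's position while preserving the binary-search-tree order. A minimal Python sketch, assuming a simple Node class:

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_left(x):
    # x's right child y moves up; x becomes y's left child
    y = x.right
    x.right = y.left    # y's old left subtree becomes x's right subtree
    y.left = x
    return y            # y is the new root of this subtree

def rotate_right(x):
    # Mirror image: x's left child moves up
    y = x.left
    x.left = y.right
    y.right = x
    return y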


Splay operation

We do a splay operation whenever an item is accessed, to move that item up toward the root. Repeatedly applying the operation moves the item all the way up to the root of the tree.
We repeatedly rotate the tree so that the item takes the position of the node above it (parent, grandparent, great-grandparent, etc.) until it becomes the root.
Given an item X to splay up there are three possible situations:

a. X is a left child of a right child or a right child of a left child
For a left child of a right child we rotate right first and then rotate left to move the item two levels up. For a right child of a left child we rotate left first and then rotate right to move the item two levels up.
b. X is a left child of a left child or a right child of a right child
For a left child of a left child we rotate right twice to move the item two levels up. For a right child of a right child we rotate left twice to move the item two levels up.
c. X is a child of the root
We rotate once as appropriate (left or right) to move the item to the root.
In all cases each rotation moves the item up one level.
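The three cases translate into the following recursive Python sketch, reusing the Node class and the rotate_left / rotate_right helpers from the sketch above. It brings the node with key k, or the last node on the search path, to the root:

def splay(root, k):
    if root is None or root.key == k:
        return root

    if k < root.key:
        if root.left is None:
            return root
        if k < root.left.key:            # case b: left child of a left child
            root.left.left = splay(root.left.left, k)
            root = rotate_right(root)
        elif k > root.left.key:          # case a: right child of a left child
            root.left.right = splay(root.left.right, k)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        # final rotation covers case c (child of root)
        return root if root.left is None else rotate_right(root)
    else:
        if root.right is None:
            return root
        if k > root.right.key:           # case b: right child of a right child
            root.right.right = splay(root.right.right, k)
            root = rotate_left(root)
        elif k < root.right.key:         # case a: left child of a right child
            root.right.left = splay(root.right.left, k)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)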

Splay tree operations

Find operation

find(k)
    Search for the key k as in a binary search tree, until k is found or a leaf node is reached
    Let X be the last node visited during the search, whether it contains k or not
    Splay X up to the root using a sequence of tree rotations

Insert Operation

insert(k)
    Insert k as in a binary search tree
    Splay the new node up to the root through tree rotations

Remove operation

remove(k)
    Remove the node as in an ordinary binary search tree
    Splay the removed node's parent up to the root using tree rotations

CS131: Binary Search Tree

A binary search tree is a binary tree such that for any node n, if a node x is in n's left subtree, x's key is smaller than n's key, and if x is in n's right subtree, x's key is bigger than n's key.

Fig: Binary Search Tree

From the above figure, let us consider the node with key 15. Its left subtree has nodes with keys 10, 5 and 12, all of which are less than 15. Its right subtree has keys 20, 17, and 22, all of which are bigger than 15. A node with an equal key can be in either subtree.

Searching a binary search tree for the minimum or maximum key is fast, and we can do inexact matches, like finding the largest key smaller than 57 or the smallest key bigger than 57. Or we can find a word with close but not exact spelling. While searching down the tree for a key k that is not present, we encounter both the node with the smallest key greater than k and the node with the largest key smaller than k.

If we do an inorder traversal we traverse the nodes in sorted order of the keys.

Operations

Min()

Start at the root node and go to the left child until no longer possible, and return the last node encountered.

Max()

Start at the root node and go to the right child until no longer possible, and return the last node encountered.

struct node
{
    int key;
    node *left, *right;
};

node* search(node* root, int k, node* &parent)
{
    // parent reports the last node visited before the match
    // (or before falling off the tree); used by remove below
    node* cur_node = root;
    while(cur_node != NULL)
    {
        if(cur_node->key == k)
        {
            return cur_node;
        }

        parent = cur_node;

        if(cur_node->key > k)
        {
            // current key is bigger than k, so k can only be in the left subtree
            cur_node = cur_node->left;
        }
        else
        {
            cur_node = cur_node->right;
        }
    }
    return NULL;
}
Insert(k)

Walk down from the root comparing k with each node's key, as in search, until a null link is reached, then attach a new node with key k as a left or right child at that position.

Remove(k)

Find the node n with key k.
If none found return.
If n has no child remove it.
If n has one child replace n with its child.
If n has two children
  - Find the node d with the smallest key in n's right subtree
  - Remove d from its location and place it at n's location

Example

Fig: Binary search tree

Remove(15)

Node 15 has two children, and the smallest key in its right subtree is 17. We replace node 15 with node 17 and the tree becomes:

Fig: Binary search tree after removing 15

void remove(node* root, int k)
{
    node* parent = NULL;
    node* target_node = search(root, k, parent);

    if(target_node == NULL)
    {
        return;
    }

    // Case: the node is a leaf, so unlink it from its parent and delete it
    // (assumes the target is not the root, i.e. parent != NULL)
    if(target_node->right == NULL && target_node->left == NULL)
    {
        if(parent->left == target_node)
        {
            parent->left = NULL;
        }
        else
        {
            parent->right = NULL;
        }
        delete target_node;
    }

    //todo: handle the one-child and two-children cases
}

Running time

The running time of operations on a binary search tree depends on the depth of the tree. If the tree is balanced, the depth of the tree is O(log n), where n is the number of nodes in the tree. In that case the complexity of the operations is O(log n).

So, if the tree has O(log n) depth, all operations take O(log n) time.

But it is possible that the tree is not balanced. If we insert items into a binary search tree in sorted order, the tree degenerates into a linked list, which gives a running time of O(n) for all operations. This is the worst-case running time.

Worst case complexity O(n)

CS131: Tree

A tree is a collection of nodes and edges. Edges connect the nodes together. In a tree there is one and only one path between any two nodes.

Fig: A tree - here we have nodes A, B, C, D, E, F and edges l, m, n, o, p.

In a tree we define a path as a connected sequence of edges. The number of edges on the path is called the length of the path. In the above figure, from B to F there is exactly one path, and that is (m, n, p).

Rooted tree

In a tree we can pick any node and call it the root of the tree. We call this a rooted tree.

We choose C to be the root and redraw the tree, keeping the root on top and the other nodes at different levels based on their distance from the root.

Fig1: Original tree Fig2: Redrawn tree

If we look carefully we see that nothing has changed in the tree; we have only chosen C to be the root and placed the nodes differently on paper. We draw it like figure 2 for convenience in understanding levels and relations.

If we choose any node and follow the path from that node to the root, the first node we encounter is called the parent of that node. All nodes except the root have exactly one parent. The root has no parent.

On the path from node F to root node C we encounter D first. So D is the parent of F, and F is called a child of D. A node may have any number of children, including zero. A node without any children is called a leaf node. A, F, E are leaf nodes.

All the nodes on the path to the root, including F itself and the root, are called ancestors of F. So F's ancestors are F, D and C, and F is called their descendant.

The length of the path from a node to the root is the depth of that node. The depth of F is 2. The length of the path from a node to its deepest descendant is the height of the node. The height of D is 1 and the height of root C is 2. The height of the root is also the height of the tree. So the tree has a height of 2.

Nodes having the same parent are called siblings. B and D are siblings. Similarly, E and F are siblings.

Binary tree

A binary tree is a special type of tree where no node is allowed to have more than two children.

Fig: Binary tree - a node may have zero, one or two children.

Each child of a node is either a left child or a right child. Even if there is only one child, that child must be either a left child or a right child.

Representing rooted tree in memory

-----------------------
|  data   | Parent ref|
-----------------------
|Children list        |
-----------------------

Fig: Node with a parent reference and a list of children

-----------------------------
|  data   |     Parent ref  |
-----------------------------
|First child | Next sibling |
-----------------------------

Fig: Node with first-child and next-sibling references

Traversal

The process of visiting each node in a tree once is called traversal. Depending on the order of visiting the nodes we may traverse in a few different ways:

  • Pre-order traversal
  • Post-order traversal
  • Binary tree inorder traversal
  • Level order traversal

We will discuss them in detail.

Preorder traversal

We visit each node starting from the root, then recursively visit its children from left to right.

Preorder(node):
    Visit(node)
    for each child c of node
        Preorder(c)

Fig: Preorder traversal - numbers show the order of nodes getting visited

Postorder traversal

We visit each node's children left to right and then visit the node itself.

Postorder(node):
    for each child c of node
        Postorder(c)
    Visit(node)

Fig: Postorder traversal

Binary tree inorder traversal

We visit the left child, then the node itself, and then the right child.

Inorder(node)
    if node.left exists
        Inorder(node.left)

    Visit(node)

    if node.right exists
        Inorder(node.right)

Fig: Inorder traversal

Level order traversal

We visit each level of the tree left to right before visiting any deeper level.

Fig: Level order traversal

We use a queue to keep track of the visited nodes' children.

LevelOrder(tree)
   Q = new empty Queue
   Q.enqueue(tree.root)

   while Q is not empty
       node = Q.dequeue()
       Visit(node)
       for each child c of node from left to right
           Q.enqueue(c)
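
The traversals above can be sketched in Python with a minimal node type; the small tree at the end is a made-up example:

from collections import deque

class TreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def preorder(node, order):
    order.append(node.value)      # visit the node first
    for c in node.children:
        preorder(c, order)        # then recurse into children, left to right

def level_order(root):
    order = []
    q = deque([root])
    while q:
        node = q.popleft()
        order.append(node.value)
        for c in node.children:   # enqueue children left to right
            q.append(c)
    return order

# Example: A has children B and C; B has children D and E
tree = TreeNode('A', [TreeNode('B', [TreeNode('D'), TreeNode('E')]),
                      TreeNode('C')])

order = []
preorder(tree, order)
print(order)                # ['A', 'B', 'D', 'E', 'C']
print(level_order(tree))    # ['A', 'B', 'C', 'D', 'E']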