Sunday, April 6, 2014

CS131: Graph

A graph G is a set of of vertices V and a set of edges E that connects the vertices.
G = (V,E)
This notation means V is a set of vertices and E is a set of edges. For example
V = (A,B,C,D) E =

Directed and undirected graph

Graph can be either directed or undirected. For directed graph or digraph edges has a direction.
Undirected graph
Fig: Undirected Graph
Directed graph or digraph
Fig: Directed graph
Multiple copies of and edge is not allowed in a graph. Digraphs may have two edges in opposite direction.

Weighted graph

A graph G(V,E), either directed or undirected, where each edge u in E has an associated weight with it.
Fig: Weighted graph
Weights can be either positive or negative.

Terminology

Path

A sequence of vertices with where each two adjacent vertices are connected by an edge. For a directed graph each edges that connects the adjacend vertices must have a direction aligned to the path.
The number of edges in the path is the length of the path.

Strongly connected

If a graph has edges such that is is a path from any vertex to any other vertex then the graph is strongly connected graph.

Degree of vertex

Number of edges that connects a vertex to other verteices including itself (self edge) is the degree of theat vertex. Self edge is counted once.

Indegree

For digraph number of edges that are directed toward a vertex is the indegree of that vertex.

Outdeegree

For digraph number of edges that are directed away a vertex is the outdegree of that vertex.

Planner graph

A graph that can be drawn on a plane without crossing edges.

Sparse graph

If a graph has fewer number of edges than maximum number of edges possible is called sparse graph.

Graph representation

Graphs, either directed or undirected whether weighted or not, are generally represented in memory as one of two standard ways to represent a graph- using adjacency list or using adjacency matrix.
If a graph is sparse adjacency list is used to save memory. For a dense graph adjacency matrix will save memory and will be fast access representation.
To illustrate the two types of representation we'll use the following graphs.
Fig: Undirected graph
Fig: Undirected graph
Fig: Directed graph
Fig: Directed graph

Adjacency list

Uses array of lists, one list per vertex, where each list contains the items that is adjacent to the vertex that the list represents.
[A|*]-->[B|*]-->[C|*]-->[D|$]
[B|*]-->[A|*]-->[E|$]
[C|*]-->[A|*]-->[D|*]-->[E|$]
[D|*]-->[A|*]-->[C|*]-->[F|$]
[E|*]-->[B|*]-->[C|*]-->F|$]
[F|*]-->[D|*]-->[E|$]

Fig: Adjacency list representation for undirected graph
[A|*]-->[A|*]-->[B|*]-->[D|$]
[B|*]-->[A|$]
[C|*]-->[A|*]-->[E|$]
[D|*]-->[C|*]-->[F|$]
[E|*]-->[B|$]
[F|*]-->[E|$]

Fig: Adjacency list representation for digraph

Adjacency matrix

To represent a graph G(V,E) a matrix of size VxV, requiring O(V^2) memory, is used. Each item a[i,j] in the matrix is 1 (or actual weight for the edge in case of weighted graph) if the edge (i,j) exists in E and NIL (or any convenient value like zero) otherwise.
    A B C D E F
   _           _
A | 0 1 1 1 0 0 |
B | 1 0 0 0 1 0 |
C | 1 0 0 1 1 0 |
D | 1 0 1 0 0 1 |
E | 0 1 1 0 0 1 |
F | 0 0 0 1 1 0 |
   -           -

Fig: Adjacency matrix representation for undirected graph
    A B C D E F
   _           _
A | 1 1 0 1 0 0 |
B | 1 0 0 0 0 0 |
C | 1 0 0 0 1 0 |
D | 0 0 1 0 0 1 |
E | 0 1 0 0 0 0 |
F | 0 0 0 0 1 0 |
   -           -

Fig: Adjacency matrix representation for digraph
Adjacency matrix are simpler to use and faster compared to adjacency list. If a graph is small we may prefer to use adjacency matrix.

Graph traversals

The process of visiting each vertex of the graph once is called graph traversal.

Depth-first search (DFS)

Visits deeper level nodes first before visiting any sibling of the current node.
DFS node visit order
Uses stack or recursive method to keep track of adjacent nodes that are not visited yet.
DFS(G, u)
  for each vertex u in G
    if u.visited != true
        v.visited = true
        distance = distance + 1
        u.distance = distance  
        for each vertex v in Adjacent(u)
            if v.visited = false
                v.prev = u
                DFS(G, v)

BFS

Visits all current level nodes first before visiting any deeper level nodes.
BFS node visit order
Uses a queue to keep track of adjacent nodes.
Finds the distances from source vertex or the root.

Topological Sort

A directed acyclic graph that defines the order of edges. In a graph G each directed edge (u,v) indicates that e mush come before v in the ordering in G. In G nodes may indicate ordered events. If G has a cycle topological sorting is not possible.
Implementation: a. Using DFS. O(V+E) b. Repeat: Find a vertex of indegree = 0 and output it- remove the vertex and all its outgoing edges Until: No edge exists. O(V+E)

Minimum Spanning Tree

Kruskal's Algorithm

Kruskal's algorithm first creates a new graph G' with all the nodes of graph G. The hew graph initially has no edge. Then it takes edges in increasing order and adds two trees together to form a new bigger tree. An edge is only selected if the two nodes it is connecting is not already part of a tree. The process continues until there is no edge available to connect two trees.
We can use a disjoint set determine if two nodes are in same set.

The algorithm

Take all nodes and create disjoint sets those initially contains one node.
Create a sorted list of all the edges by their weight in increasing order.

Selected Edges = empty
While the list is not empty
    Remove the smallest weighted edge E from the list.
    If the nodes N1 and N2 those are connected by E is not in same set
       Put E into Selected Edges
       Union the sets that contains N1 and N2 
After the algorithm completes we have a set of edges in Selected Edges list that forms the minimum spanning three if the graph is connected or otherwise a minimum spanning forest.