Lab 2: Graphs and Tours
Overview
There are few better ways of getting acquainted with a language than to simply sit down and write some reasonably complex code, so for this next lab we'll continue to forgo the "systems" aspect of the course — which we haven't begun to cover in earnest yet anyway — and jump back into data structures territory.
One of the common data structures most ripe with related algorithms and open research problems is the graph. A graph, as you probably know, is a data structure that consists of a set of vertices (aka nodes) and edges between those vertices. Edges may be directed or undirected, and often have weights associated with them. The following graph is an undirected, weighted graph that represents the distances between Chicago and some nearby suburbs.

Some things we like to do with graphs are:
- to find the shortest distance between two vertices
- to find a minimum spanning tree, i.e., the subgraph of least total edge weight that connects all the vertices
- to solve the "traveling salesman problem" (TSP), where we search for the shortest path that passes through all the vertices, with the stipulation that we visit each vertex only once
Note that although the last two seem very similar, there's a fairly straightforward and efficient "greedy" solution to the minimum spanning tree problem, but we still don't have a good (fast) solution for the TSP — and that's after nearly a century of trying!
It's likely that you've already implemented the graph data structure and some related algorithms in your data structures and algorithms course, but we're going to go ahead and do it this time using C.
There are a number of different approaches to representing graphs as data structures — two common ones are the adjacency matrix and the adjacency list.
For the graph above, we would have the following adjacency matrix:

Note that if a graph has a large number of vertices but each vertex is only connected to a few others (we call this a sparse graph) an adjacency matrix is a pretty inefficient representation. We'd be allocating a huge two-dimensional array, essentially, to store very little information.
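To make the storage cost concrete, here is a rough sketch of how the Chicago suburb graph could be held as an adjacency matrix in C. The vertex ordering, the names `NUM_VERTICES`, `vertex_names`, `adj_matrix`, `count_used_cells` and `print_matrix`, and the convention that 0 means "no edge" are all assumptions made for this illustration; they are not part of the lab's provided code.

```c
#include <stdio.h>

#define NUM_VERTICES 5

/* Hypothetical vertex ordering chosen for this sketch. */
static const char *vertex_names[NUM_VERTICES] = {
    "Chicago", "Plainfield", "OakPark", "Schaumburg", "Evanston"
};

/* adj_matrix[i][j] holds the weight of the edge between vertex i and
 * vertex j, or 0 if there is no such edge (an assumption of this sketch).
 * The matrix is symmetric because the graph is undirected. */
static const int adj_matrix[NUM_VERTICES][NUM_VERTICES] = {
    /*                 Chi  Pla  Oak  Sch  Eva */
    /* Chicago    */ {   0,  30,   8,  30,  12 },
    /* Plainfield */ {  30,   0,   0,   0,   0 },
    /* OakPark    */ {   8,   0,   0,   0,  14 },
    /* Schaumburg */ {  30,   0,   0,   0,  24 },
    /* Evanston   */ {  12,   0,  14,  24,   0 }
};

/* Counts the cells that actually store an edge weight. */
int count_used_cells(void)
{
    int used = 0;
    for (int i = 0; i < NUM_VERTICES; i++)
        for (int j = 0; j < NUM_VERTICES; j++)
            if (adj_matrix[i][j] != 0)
                used++;
    return used;
}

/* Prints one labeled row per vertex. */
void print_matrix(void)
{
    for (int i = 0; i < NUM_VERTICES; i++) {
        printf("%-10s", vertex_names[i]);
        for (int j = 0; j < NUM_VERTICES; j++)
            printf(" %3d", adj_matrix[i][j]);
        printf("\n");
    }
}
```

Even in this small graph, fewer than half of the 25 cells hold an edge weight (each of the six edges appears twice). For a sparse graph with thousands of vertices, the proportion of wasted cells is far larger.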
In that case, an adjacency list is a better choice. The following is an adjacency list representation of the Chicago suburb graph:

In an adjacency list, each vertex is associated with a linked list of those vertices (and corresponding edge weights) adjacent to it.
Your task for this lab is to construct an adjacency list representation of a graph specified on the command line, and to print out the following:
- each vertex along with all its adjacent vertices and edge weights (to verify your graph representation)
- an ordered list of vertices that constitute a "tour" of the graph, i.e., a path that visits every vertex exactly once; if no tour exists, your program should print a message to that effect
- the total distance of the tour (if it exists)
Your program would be invoked at the command line as follows to construct the above Chicago suburb graph:
./graph Chicago Plainfield 30 Chicago OakPark 8 Chicago Schaumburg 30 \
Chicago Evanston 12 OakPark Evanston 14 Schaumburg Evanston 24
Note that use of the '\' character allows us to spread the command line entry over multiple lines.
The command line arguments are used to specify each edge (and its weight)
present in the graph to be constructed. Spaces aren't allowed in vertex names
(so "Oak Park" is written OakPark), and the program automatically assumes
strings that match in separate edges refer to the same vertex. Edge weights are
integers (and fit in 32-bit ints). The graph specified is also assumed to be
connected — i.e., any two vertices can be connected via some path through the
graph.
Another important point to mention is that, because the graph is undirected, an edge between two vertices is only specified once. E.g., the Chicago-Plainfield edge is not specified again as Plainfield-Chicago. That said, the command line invocation above could be written in several other equivalent ways. Here's one of them:
./graph Plainfield Chicago 30 Chicago OakPark 8 OakPark Evanston 14 \
Evanston Schaumburg 24 Schaumburg Chicago 30 Evanston Chicago 12
Your program should handle any equivalent permutation.
The output for either of the above invocations would be as follows (the order in which vertices are listed is not important):
Adjacency list:
Chicago: OakPark(8) Evanston(12) Schaumburg(30) Plainfield(30)
OakPark: Chicago(8) Evanston(14)
Evanston: Chicago(12) OakPark(14) Schaumburg(24)
Schaumburg: Chicago(30) Evanston(24)
Plainfield: Chicago(30)
Tour path:
Plainfield Chicago Schaumburg Evanston OakPark
Tour length: 98
Note that there may very well be multiple possible tours — your program just needs to find one of them (and not necessarily the shortest).
Preliminaries
In this lab you'll be working in the labs/2_graphlab directory.
As before, don't forget to commit your previous work and pull the latest changes from the central repository before starting!
Implementation Details
You'll be working on the following three files for this lab:
- graph.h: This is the header file where you'll declare all the functions you use to implement your graph in addition to the structures used to create the adjacency list (some declarations are already provided that you can augment or remove)
- graph.c: The functions that deal with your graph data structure and search routines should go in here; most of these functions should have corresponding prototypes in the graph.h file
- main.c: You'll do all the command line parsing and calls to graph construction and search functions here.
We advise you to partition your work into phases so that you don't get overwhelmed. Your code should compile and run correctly before you move onto each subsequent phase.
First, you should make sure you're comfortable with the processing of command
line arguments and with string handling. You should look into the atoi
standard library function for converting strings to integers (and thereby easily
accessing edge weights). For starters, have your program parse the command line
arguments and echo the vertex names and edge weights. Comment out the code
currently in the main function as you do this.
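As a starting point for this first phase, the following is a minimal sketch of walking the arguments three at a time and echoing each edge. The helper name `echo_edges`, its return convention, and the exact output format are assumptions made for illustration; nothing in the provided files requires them.

```c
#include <stdio.h>
#include <stdlib.h>

/* Echoes each (vertex, vertex, weight) triple found in argv[1..argc-1].
 * Returns the number of edges echoed, or -1 if the number of arguments
 * after the program name is not a multiple of three.
 * `echo_edges` is a hypothetical helper name, not part of the lab files. */
int echo_edges(int argc, char *argv[])
{
    if ((argc - 1) % 3 != 0) {
        fprintf(stderr, "usage: %s [V1 V2 WEIGHT]...\n", argv[0]);
        return -1;
    }

    int count = 0;
    for (int i = 1; i + 2 < argc; i += 3) {
        /* atoi does no error checking: a non-numeric weight yields 0. */
        int weight = atoi(argv[i + 2]);
        printf("%s -- %s (weight %d)\n", argv[i], argv[i + 1], weight);
        count++;
    }
    return count;
}
```

In main.c, this could be called as `echo_edges(argc, argv)` directly from `main`, with a negative return value treated as a usage error.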
Next, check out the structure declarations given to you in the graph.h file —
there are two, duplicated below:
typedef struct vertex vertex_t;
typedef struct adj_vertex adj_vertex_t;

struct vertex {
    char *name;
    adj_vertex_t *adj_list;
    vertex_t *next;
};

struct adj_vertex {
    vertex_t *vertex;
    int edge_weight;
    adj_vertex_t *next;
};
Consider the following simple graph:

Included in main.c is sample code, listed below, that manually creates an
adjacency list for the above graph (note that the code does not free the
adjacency list — this is something you'll have to do!):
vertex_t *v1, *v2, *v3, *vlist_head;
adj_vertex_t *adj_v;

vlist_head = v1 = malloc(sizeof(vertex_t));
v1->name = "A";
v2 = malloc(sizeof(vertex_t));
v2->name = "B";
v3 = malloc(sizeof(vertex_t));
v3->name = "C";

v1->next = v2;
v2->next = v3;
v3->next = NULL;

adj_v = v1->adj_list = malloc(sizeof(adj_vertex_t));
adj_v->vertex = v2;
adj_v->edge_weight = 10;
adj_v->next = NULL;

adj_v = v2->adj_list = malloc(sizeof(adj_vertex_t));
adj_v->vertex = v1;
adj_v->edge_weight = 10;
adj_v = adj_v->next = malloc(sizeof(adj_vertex_t));
adj_v->vertex = v3;
adj_v->edge_weight = 5;
adj_v->next = NULL;

adj_v = v3->adj_list = malloc(sizeof(adj_vertex_t));
adj_v->vertex = v2;
adj_v->edge_weight = 5;
adj_v->next = NULL;
This next figure depicts the structures allocated in the code and their interrelationships:

Make sure you understand how the various structure types and pointers are used to create the adjacency list.
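One way to check that understanding is to write the cleanup that the sample code leaves out. The sketch below walks the vertex list and, for each vertex, its adjacency list, releasing every node. The function name `free_graph` and its habit of returning a count of freed blocks are assumptions made for illustration. It also assumes vertex names were not allocated with malloc (as in the sample code, where they are string literals); if your implementation copies names onto the heap, those need freeing too.

```c
#include <stdlib.h>

typedef struct vertex vertex_t;
typedef struct adj_vertex adj_vertex_t;

struct vertex {
    char *name;
    adj_vertex_t *adj_list;
    vertex_t *next;
};

struct adj_vertex {
    vertex_t *vertex;
    int edge_weight;
    adj_vertex_t *next;
};

/* Frees every adj_vertex_t and vertex_t reachable from vlist_head and
 * returns the number of blocks freed. Names are NOT freed here, on the
 * assumption that they point at string literals or argv entries.
 * `free_graph` is a hypothetical name, not a function from graph.h. */
int free_graph(vertex_t *vlist_head)
{
    int freed = 0;

    while (vlist_head != NULL) {
        /* Save the next pointers before freeing the nodes that hold them. */
        vertex_t *next_vertex = vlist_head->next;
        adj_vertex_t *adj = vlist_head->adj_list;

        while (adj != NULL) {
            adj_vertex_t *next_adj = adj->next;
            free(adj);
            freed++;
            adj = next_adj;
        }

        free(vlist_head);
        freed++;
        vlist_head = next_vertex;
    }
    return freed;
}
```

For the three-vertex sample graph, this would release three vertex_t blocks and four adj_vertex_t blocks, matching the seven malloc calls in the sample code.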
The last bit of code you're given prints out the adjacency list for the graph:
vertex_t *vp;

printf("Adjacency list:\n");
for (vp = vlist_head; vp != NULL; vp = vp->next) {
    printf(" %s: ", vp->name);
    for (adj_v = vp->adj_list; adj_v != NULL; adj_v = adj_v->next) {
        printf("%s(%d) ", adj_v->vertex->name, adj_v->edge_weight);
    }
    printf("\n");
}
The output produced is:
Adjacency list:
A: B(10)
B: A(10) C(5)
C: B(5)
You should delete all the given code (except for the traversal, which you can reuse if you wish) and get your program to create an adjacency list using the command line parameters.
The graph.h file contains the following prototype, which you should provide an
implementation for in graph.c.
/* This is the one function you really should implement as part of your
 * graph data structure's public API.
 *
 * `add_edge` adds the specified edge to the graph passed in via the
 * first argument. If either of the edge's vertices are not already
 * in the graph, they are added before their adjacency lists are
 * updated. If the graph is currently empty (i.e., *vtxhead == NULL),
 * a new graph is created, and the caller's vtxhead pointer is
 * modified.
 *
 * `vtxhead`: the pointer to the graph (more specifically, the head
 *            of the list of vertex_t structures)
 * `v1_name`: the name of the first vertex of the edge to add
 * `v2_name`: the name of the second vertex of the edge to add
 * `weight` : the weight of the edge to add
 */
void add_edge (vertex_t **vtxhead, char *v1_name, char *v2_name, int weight);
A correct implementation of add_edge should allow you to create a graph using a
sequence of calls like this:
vertex_t *vlist_head = NULL;
add_edge(&vlist_head, "Chicago", "Plainfield", 30);
add_edge(&vlist_head, "Chicago", "OakPark", 8);
add_edge(&vlist_head, "OakPark", "Evanston", 14);
add_edge(&vlist_head, "Evanston", "Schaumburg", 24);
add_edge(&vlist_head, "Schaumburg", "Chicago", 30);
add_edge(&vlist_head, "Evanston", "Chicago", 12);
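One piece of add_edge that tends to trip people up is the "look up a vertex by name, and add it if it's missing" step, since it may need to modify the caller's head pointer. The following is a sketch of just that step, not a full add_edge. The helper name `find_or_add_vertex`, the decision to append new vertices at the end of the list, and the decision to copy the name onto the heap are all assumptions made for illustration; your own design may reasonably differ.

```c
#include <stdlib.h>
#include <string.h>

typedef struct vertex vertex_t;
typedef struct adj_vertex adj_vertex_t;

struct vertex {
    char *name;
    adj_vertex_t *adj_list;
    vertex_t *next;
};

struct adj_vertex {
    vertex_t *vertex;
    int edge_weight;
    adj_vertex_t *next;
};

/* Returns the vertex named `name`, appending a new vertex to the list if
 * none exists yet. Returns NULL if memory allocation fails.
 * `find_or_add_vertex` is a hypothetical helper, not part of graph.h. */
vertex_t *find_or_add_vertex(vertex_t **vtxhead, const char *name)
{
    /* `link` always points at the pointer we would need to update in
     * order to attach a new vertex: first *vtxhead, then each `next`. */
    vertex_t **link = vtxhead;

    while (*link != NULL) {
        if (strcmp((*link)->name, name) == 0)
            return *link;               /* already in the graph */
        link = &(*link)->next;
    }

    vertex_t *v = malloc(sizeof(vertex_t));
    if (v == NULL)
        return NULL;

    /* Copy the name so the vertex owns its own string (a design choice
     * of this sketch; such a copy must later be freed). */
    v->name = malloc(strlen(name) + 1);
    if (v->name == NULL) {
        free(v);
        return NULL;
    }
    strcpy(v->name, name);

    v->adj_list = NULL;
    v->next = NULL;
    *link = v;      /* when the list was empty, this updates *vtxhead */
    return v;
}
```

Because `link` starts out equal to `vtxhead`, the empty-graph case needs no special handling: the first vertex added simply becomes the new head. Since the lab's vertex names come from argv, which remains valid for the life of the program, storing the pointer without copying would also work.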
When you have this working, you're finally ready to start traversing your graph
and searching for a tour. To do this, you should define appropriate functions in
graph.h and provide their implementations in graph.c. You should find that
the problem lends itself fairly naturally to a recursive implementation. Have
fun!
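To illustrate the shape of the recursion without giving away the adjacency-list version, here is a sketch of a backtracking tour search over a small weight matrix indexed by vertex number. Everything about it is an assumption made for illustration: the fixed size `N`, the names `extend_tour` and `find_tour`, and the convention that a weight of 0 means "no edge". Adapting the idea to vertex_t and adj_vertex_t lists is left to you.

```c
#include <stdbool.h>

#define N 5   /* number of vertices in this sketch (hypothetical) */

/* Tries to extend the partial path path[0..depth-1] into a full tour.
 * weights[i][j] == 0 means there is no edge between i and j.
 * On success, returns true with path[] and *length filled in. */
static bool extend_tour(int weights[N][N], bool visited[N],
                        int path[N], int depth, int *length)
{
    if (depth == N)
        return true;                    /* every vertex is on the path */

    int last = path[depth - 1];
    for (int next = 0; next < N; next++) {
        if (visited[next] || weights[last][next] == 0)
            continue;

        /* Tentatively step to `next` and recurse. */
        visited[next] = true;
        path[depth] = next;
        *length += weights[last][next];

        if (extend_tour(weights, visited, path, depth + 1, length))
            return true;

        /* Dead end: undo the step and try another neighbor. */
        *length -= weights[last][next];
        visited[next] = false;
    }
    return false;
}

/* Tries each vertex in turn as the starting point of a tour. */
bool find_tour(int weights[N][N], int path[N], int *length)
{
    for (int start = 0; start < N; start++) {
        bool visited[N] = { false };
        visited[start] = true;
        path[0] = start;
        *length = 0;
        if (extend_tour(weights, visited, path, 1, length))
            return true;
    }
    return false;
}
```

With the Chicago graph numbered Chicago = 0, Plainfield = 1, OakPark = 2, Schaumburg = 3, Evanston = 4, this search should settle on Plainfield, Chicago, OakPark, Evanston, Schaumburg with length 76, which is a different tour from the sample output but an equally valid one. Note that the search may explore a very large number of orderings in the worst case, so it is only practical for small graphs.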
Building
A simple makefile is provided for you that compiles and links 'main.c' and 'graph.c', and builds the executable 'graph'. You can start the build process with the command:
make
Sometimes, when you're running into weird problems with a build, it helps to delete all the intermediate build files and recompile from scratch. You can do this with the command:
make clean ; make
If your program builds successfully, you can run it with the command:
./graph
Of course, you'll be testing it with command line parameters, so you'll more likely do something like:
./graph A B 10 B C 5
Grading
This lab is worth a total of 40 points. Below is the rubric I will use to grade your work:
20 points: Graph construction
- An adjacency list must be used as the internal graph representation
- Graphs should be constructed dynamically using the provided structures
- add_edge (or a similar function) should be the primary mechanism used for populating the graph structure
- Output adjacency lists should be correct as per the edges entered
10 points: Tour search
- Tours should be identified whenever they exist
- When identified, a valid tour (and its total length) should be printed out
- When a tour doesn't exist an appropriate message should be printed and the program should terminate gracefully
10 points: Memory allocation & Code modularity
- There should be no blatant memory leaks. Take care to properly free all graph data structures before exiting. Keep in mind that every call to malloc should have an associated call to free!
- Code reuse should be evident. In particular, you should break any large functions that contain repeated code into separate functions. A good rule of thumb for this lab (and others you'll work on in this class) is that a function should be no more than an average screenful (40-60 lines) in length.
- Graph-related functions should be declared and defined in the graph.h/graph.c files; the API should be coherent.