Data structures and algorithms  helping Santa Claus find his road to San Francisco
It's almost a Christmas time and very soon Santa Claus will pull out his sleigh along with the herd of reindeers. But how his gonna find his way to all the places he is supposed to visit? Well, asuming his not going to buy any GPS device we will try to help him out writing the shortest path algorithm from scratch with the underlying data structures. The goal of this article is to understand the nittygritty details about the algorithms and underlying data structures and not how to write code in functional style. That's why I'm not going to use the functional data structures and style (most of the time) but bear with me, this is just to set everyone one on the same page and give the basic understanding of the topic. I want also to give some intuition about the choices that has been made in terms of running time and space complexity.
I mentioned Santa Claus but what about San Francisco? To check the correctness of my algorithm I wanted to run it against the real datasets with millions of nodes and edges. I googled for some public datasets and found the "9th DIMACS Implementation Challenge  Shortest Paths" California and Nevada large road networks datasets. Perfect for the task at hand. So let's say Santa Claus has to find his way to Golden Gate Bridge in San Francisco from any point in California and Nevada. Of course following the roads on his sleigh (yeah, this kills the whole magic).
The shortest path algorithm (part I)
The road network can be seen as the big graph of nodes connected by the edges. As many of you know, the most popular algorithm to find the shortest path between two nodes in a graph is Dijkstra algorithm. The algorithm exists in many variants; Dijkstra's original variant found the shortest path between two nodes, but a more common variant fixes a single node as the "source" node and finds shortest paths from the source to all other nodes in the graph. This is the approach I'm going to use in this article.
Let's look at the example from Dijkstra algorithm wikipedia page:
From the animated image below we see that the "source" node a (1) is fixed and distances are calculated to each node in the graph. The shortest path to the destination node b (5) is calculated and is equlat to 20. It passes through the nodes 3 and 6 which is the shortest path indeed.
The pseudo code
The pseudo code of Dijkstra algorithm looks something like this:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 

When inserting the graph into the structure Q
the starting node's shortes distance is set to 0.0
while the other nodes' shortest distances are set to infinity
. Remove from the Q
the min element and explore all of it's edges. Compare the shortest distances with all adjacent nodes and if any distance is less than the shortest distance on the current node, update adjacent node shortest distance inside the Q
. Continue until Q
is not empty. Nodes which got no edges will finish with the shortest distance of infinity because it is not possible "get to them" from the starting node. However, they will be still removed from the Q
.
The running time
The original variant of the algorithm uses linked list or an array to store all the nodes of the graph. The extract_minimum operation is simply a linear search through all nodes (vertices) in Q
and in that case the running time is \(O(V^2)\) where \(V\) is the number of nodes (vertices).
Depending on how Q
is implemented the running time can be \(O(E \cdot T_{dk} + V \cdot T_{em})\) where \(T_{dk}\) and \(T_{em}\) are the complexities of the decrease_key and extract_minimum operations in Q
respectively.
Now that we have a basic understanding on how the Dijkstra shortest path aglorithm works we need to implement our data structure Q
. One of the approaches of speeding up the algorithm is to use a Priority Queue" and that is what I'm going to implement next.
Priority Queue
What is a Queue? A queue is an abstract data type supporting two operation Enqueue
to the rear of the collection of elements and Dequeue
from the front of the collection of elements. This makes it a FIFO data structure, the first element added to the queue will be the first one to be removed.
What is a Priority Queue? A priority queue is a generalization of a queue where each element is assigned a priority and elements come out in order by priority.
Typical uses cases are:
 Scheduling jobs
 Dijkstra’s algorithm: finding a shortest path in a graph
 Prim's algorithm: constructing a minimum spanning tree of a graph
 Huffman's algorithm: constructing an optimum prefixfree encoding of a string
 Heap sort: sorting a given sequence
Supported operations:
Insert(p)
adds a new element with a priorityp
.Extract()
extracts an element with the minimum or maximum priority (depending if it's a Min or Max priority queue)
For simplicity I'm using in my code Enqueue
and Dequeue
respectively.
Additionaly others operations like TryDequeue()
, Delete(element)
and ChangePriority(p)
can be implemented.
Naive implementation considerations
If the priority queue is implemented with unsorted array/list, what are the costs of different operations?
Enqueue(e)
addse
to the end. Running time \(O(1)\)Dequeue()
scans the array/list for the min/max element. Running time \(O(n)\)
If the priority queue is implemented with sorted array, what are the costs of different operations?

Enqueue(e)
finds a position fore
\(O(\log n)\) by using binary search, shift all elements to the right of it by 1 \(O(n)\), inserte
\(O(1)\). Running time \(O(n)\) Dequeue()
Extracts the last element. Running time \(O(1)\)
If the priority queue is implemented with sorted list, what are the costs of different operations?
Enqueue(e)
finds a position fore
\(O(n)\) cannot use binary search, inserte
\(O(1)\). Running time \(O(n)\)Dequeue()
Extracts the last element. Running time \(O(1)\)
Is there a better data structure? Yes, binary heap
Enqueue(e)
Running time \(O(\log n)\)Dequeue()
Running time \(O(\log n)\)
Priority Queue with Binary Heap
Binary maxheap is a binary tree (each node has zero, one, or two children) where the value of each node is at least the values of its children. In other words, for each edge of the tree, the value of the parent is at least the value of the child. Binary minheap is the other way round.
My implementation should support the following operations
TryDequeue()
Retruns the root element without removing the node. Running time \(O(1)\)Enqueue e
Attach a new node to any leaf. This however my violate the heap property. To fix this we let the new element tosiftUp
.private siftUp i
it swaps the problematic node (indexedi
) with its parent until the property of the heap is satisfied. Invariant: heap property is violated on at most one edge. This edge gets closer to the root while siftingup. Running time \(O(H)\) where \(H\) is the height of the tree.Dequeue()
replace the root with any leaf but again this may viloate the heap property. To fix it, we let the problematic nodesiftDown
.private siftDown i
it swaps the problematic node with larger child until the heap property is satisfied. We swap with the larger child which automatically fixes one of the two bad edges. Running time \(O(H)\) where \(H\) is the height of the tree.
As most of the operations works in time \(O(H)\) where \(H\) is the height of the tree, we definitely want a tree to be shallow.
How to keep a tree shallow? Making it a complete binary tree. A binary tree is complete if all its levels are filled except possibly the last one which is filled from left to right. The first advantage is that a complete binary tree with \(n\) nodes has height at most \(O(\log n)\). The second advantage is that it can be easily stored as array. How to find the parent, leftChild and rightChild of the node stored in array? This is the simple formulas for 0
based indexed array:
1: 2: 3: 

This is the mutable implementation of priority queue using System.Collections.Generic.List<'T>
(which is baked by an array):
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 36: 37: 38: 39: 40: 41: 42: 43: 44: 45: 46: 47: 48: 49: 50: 51: 52: 53: 54: 55: 56: 57: 58: 59: 60: 61: 62: 63: 64: 65: 66: 67: 68: 69: 70: 

All the operations are described above. As you can see it is very simple and strightforward. The resulting implementation is:
 Fast: all operations work in time \(O(\log n)\)
 Space efficient: elements are stored in the list. Parentchild connections are not stored but are computed on the fly.
 Easy to implement: just few lines of code.
 Not functional friendly: mutable structures can be sometimes useful, but this implementation doesn't help with functional style and immutability.
The interesting bits are the build()
function which allows to turn an array into the heap. We repair the heap property going from bottom to top. Initially, the heap property is satisfied in all the leaves (i.e., subtrees of depth 0) then start repairing the heap property in all subtrees of depth 1. When we reach the root, the heap property is satisfied in the whole tree. Running time is \(O(n \log n)\) since we call siftDown
for \(O(n)\) nodes. If node is already close to the leaves than sifting it down is fast.
Benchmarking the running time
I used BenchmarkDotNet to mesure the performance of my implementation and I compared it to the FSharpx.Collections.PriorityQueue
which is a full functional data structure.
This is the test I used:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 26: 27: 28: 29: 30: 31: 32: 33: 34: 35: 

And here are the results:
1: 2: 3: 4: 5: 

Method 
Length 
Mean 
StdDev 
Allocated 

MaxPriorityQueue 
2000 
568.6416 us 
10.9266 us 
107.1 kB 
MaxPriorityQueueInsertingFromStart 
2000 
687.4757 us 
1.7271 us 
123.5 kB 
FSharpxPriorityQueueInsertingFromStart 
2000 
423.1923 us 
10.4438 us 
167.92 kB 
MaxPriorityQueue 
20000 
5,646.1750 us 
13.7735 us 
1.16 MB 
MaxPriorityQueueInsertingFromStart 
20000 
7,006.6801 us 
21.8178 us 
1.35 MB 
FSharpxPriorityQueueInsertingFromStart 
20000 
4,854.3166 us 
19.2608 us 
1.68 MB 
MaxPriorityQueue 
200000 
57,873.1854 us 
302.7054 us 
11.14 MB 
MaxPriorityQueueInsertingFromStart 
200000 
72,939.0291 us 
573.1936 us 
13.06 MB 
FSharpxPriorityQueueInsertingFromStart 
200000 
104,380.8573 us 
1,036.8847 us 
16.8 MB 
MaxPriorityQueue 
2000000 
596,396.9013 us 
3,905.1761 us 
107.11 MB 
MaxPriorityQueueInsertingFromStart 
2000000 
747,160.8772 us 
26,935.2538 us 
126.29 MB 
FSharpxPriorityQueueInsertingFromStart 
2000000 
1,537,904.5511 us 
62,809.3363 us 
168 MB 
My implementation seems to be fastest starting from 200000 elements and with 2 millions is twice as fast as FSharpx.Collections.PriorityQueue
for enqueueing elements.
Now that I have my working implementation of priority queue, I can finally implement the shortest path Dijkstra algorithm
The shortest path algorithm (part II)
I need some basic data structures to represent my graph of nodes. I need an Edge
, a node that I called Vertex
and the Graph
.
Note: in my implementation I use the name 'vertex' instead of 'node' but this is the same
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 

The Edge
points to the destination Vertex
and has also a Distance
. The Vertex
has a list of edges and the ShortestDistance
label which will be calculated and updated according to the "source" Vertex
. Note that the implementation of IComparable
interface is made on that label which is logical because when dequeuing elements from the min priority queue we want to explore nodes with the shortest distance first. The Graph
is just a list of nodes.
Let's start with the implementation following the pseudo code that I showed before. This looks something like that in F# (I'm sorry for not following the functional style but I care more about the steps of the algorithm) :
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 

What is interesting that at line 15 I need to update the queue pq.Update indx newDestination
with the new calculated ShortestDistance for the node. This is an operation \(O(1)\) because I have an index retrieved at line 10 with pq.TryFind
. To find the index of the node in the queue I need to traverse the whole tree which is \(O(V)\) operation. Watch out, but this may cause the serious performance problems.
Test drive on wikipedia example
Before trying my algorithm on the real world data, let's check if it's correct on wikipedia example (see the image above in "The shortest path algorithm (part I)"). We need to find the path between the node a (1) and b (5). The value should be 20.
Running the algorithm produces the following output
1: 2: 3: 4: 5: 6: 

The node 5 has a value of 20 which is correct. We can move to more serious step.
Real world example: Santa Claus way to San Francisco (FAIL)
The next step is to try it on the real world data. After some googling I found interesting data used by 9th DIMACS Implementation Challenge  Shortest Paths. I used California and Nevada datasets with the following properties:
 # of nodes 1 890 815
 # of edges 4 657 742
This will be more challendging that 6 node example from wikipedia. Before running the sample I made a small change to my algorithm to ease the data conversion from the data set files. I'm not using my Graph
type but a simple Dictionary<int, Vertex>
.
The dataset file has the following format
1: 2: 3: 4: 5: 

We can drop the first column (letter a). The second column is the starting nodeId, next column is the destination nodeId and the last column is the distance.
Here is the script loading the data from the file and converting it to the Dictionary<int, Vertex>
.
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 

Note that all of the nodes has the ShortestPath set to positiveInfinity. Then we need to set the starting node by setting it's shortest distance to 0.
1: 2: 

The node number 1215934 is the following address 512 Partridge Ave, Bakersfield, CA 93309, USA.
The destination is the node number 1598609 at the address US101, San Francisco, CA 94129, ÉtatsUnis Golden Gate Bridge
Let's run the algorithm.
Well, after maybe 30 minutes I killed the FSI because nothing happened, it was stuck somewhere and maybe the algorithm would never finish.
Let's try to find out why. If you remember the running time of the algorithm is \(O(E \cdot T_{dk} + V \cdot T_{em})\) where \(T_{dk}\) and \(T_{em}\) are the complexities of the decrease_key and extract_minimum operations in Q
respectively. The Q
is my priority queue pq
, the decrease_key is the pq.TryFind
and pq.Update indx newDestination
all together. Extract_minimum is pq.Dequeue()
but I know that this operation works in \(O(1)\) so it's not a problem. The pq.TryFind
uses the preorder traversal of the tree which is \(O(V)\). Too bad, because the running time of the algorithm is something like \(O(E \cdot V + V)\). With almost 5 millions of edges and 2 millions of nodes, I'm not surprised that the algorithm has never finished.
Can we do better? Yes. It's time for the optimization
Real world example: Santa Claus way to San Francisco (SUCCESS)
What can be improved? After some thinking here are some ideas:
 The graph is connected so after the node has been explored we can mark it as visited and when other nodes will point to it, we simple can ignore it. So the idea is to not visit already visited nodes.
 There may be nodes which we don't want to explore because they won't be reachable at all. Their
ShortesDistance
will be positiveInfinity when dequeuing so we can simply ignore it.  Another improvment would be to get rid of the costly
pq.TryFind
function. Why not to store nodes in a dictionary and priority queue at the same time? The dictionary would allow to fast lookup of the node and the priority queue would ensure to get the minimu node with theShortestDistance
. What we need is just topq.Enqueue newDestination
node and update it in the dictionary at the same time for further retrieval. This would be a double space complexity but that's a tradeoff  Another nice feature will be to track all visited nodes so we could draw a path for the Santa Claus.
Here are some improvments:
1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: 14: 15: 16: 17: 18: 19: 20: 21: 22: 23: 24: 25: 

As you can see the graph is now represented with Dictionary<int, Vertex>
. This is our structure allowing finding nodes in \(O(1)\) time. At line 9 and 15 we check that the node is not already visited, this should save some time. Also on line 9 we ignore all the nodes with positiveInfinity. Then on the line 20, if the shortest distance was calculated for the node, we enqueue it in the priority queue so it will be explored before nodes with positiveInfinity and we update on the line 21 the dictionary of nodes, so it will be easily retrieved the next time. On the line 18 when creating the newDestination
record I'm adding to the Path
containing all the visited nodes, the current destination.Id of the node. This is our tracking system of the shortest path.
Let's run the algorithm again we those improvments. We will hopefully calculate the path from Bakersfield CA to San Francisco Gloden Gate Bridge!
The algorith finished!
1:


It's a very good improvment, from never finishing run to 18 seconds. I know that 18 seconds is maybe much for a GPS device but it's a very good start to improve. The running time is \(O(E + V)\) which is a way better than before.
How to know if the algorithm is correct? Well, we have tracked all the intermediate nodes (1415 nodes) on the shortest path. I have retrived their lon lat coordinates and draw it on the google map. Here is what I have:
It's maybe not the fastest way to get from Bakersfield to San Francisco Golden Gate bridge but it seems that this can be the shortest path indeed.
Let's look more closely:
And the destination point:
Let's try a longer way. The destination point will be always the Golden Gate Bridge in San Francisco but the starting point will be 1385 US93, Jackpot, NV 89825, USA
The algorithm has finished!
1:


This is similar running time as before. Cool. It produces the shortest path of 2210 nodes. Let's check on the map:
It seems like tha algorithm is working. Let's look at the starting point:
and the arrival:
This may be not following exactely the roads but it seems the coordinates from th datasets.
Conclusion
The goal of this article was to show how with few lines of code and F# you can write a simple mutable data structure and algorithm that operates on millions of node and edgest to calculate the shortes path from one point on the map to another. The code doesn't follow the functional style and mutability is everywhere but the goal was to understand the underlying implementation. The next time I focus on the same problem but this time writing a fully functional datastructure and algorithm. This is a toy example but in the real world algorithms have to take into account the weights on the edges, traffic data and so on, and should run much more quicker than this one. The best implementation from 9th DIMACS Implementation Challenge that I used the datasets from, have query times expressed in microseconds. Note that these are not simply Dijkstra's algorithms, of course, as the whole point was to get results faster.
The script used for this blog is here. The priority queue implementation with performance tests is here.
References
Here are some references used in this article:
 Dijkstra's algorithm
 9th DIMACS Implementation Challenge  Shortest Paths datasets
 FSharpx.Collections
Full name: Microsoft.FSharp.Core.Operators.not
val seq : sequence:seq<'T> > seq<'T>
Full name: Microsoft.FSharp.Core.Operators.seq

type seq<'T> = System.Collections.Generic.IEnumerable<'T>
Full name: Microsoft.FSharp.Collections.seq<_>
Full name: Microsoft.FSharp.Core.bool
type List<'T> =
new : unit > List<'T> + 2 overloads
member Add : item:'T > unit
member AddRange : collection:IEnumerable<'T> > unit
member AsReadOnly : unit > ReadOnlyCollection<'T>
member BinarySearch : item:'T > int + 2 overloads
member Capacity : int with get, set
member Clear : unit > unit
member Contains : item:'T > bool
member ConvertAll<'TOutput> : converter:Converter<'T, 'TOutput> > List<'TOutput>
member CopyTo : array:'T[] > unit + 2 overloads
...
nested type Enumerator
Full name: System.Collections.Generic.List<_>

System.Collections.Generic.List() : unit
System.Collections.Generic.List(capacity: int) : unit
System.Collections.Generic.List(collection: System.Collections.Generic.IEnumerable<'T>) : unit
Full name: Microsoft.FSharp.Core.Operators.raise
Full name: Microsoft.FSharp.Core.Operators.max
Full name: Microsoft.FSharp.Core.Operators.pown
Full name: Microsoft.FSharp.Collections.list<_>
val int : value:'T > int (requires member op_Explicit)
Full name: Microsoft.FSharp.Core.Operators.int

type int = int32
Full name: Microsoft.FSharp.Core.int

type int<'Measure> = int
Full name: Microsoft.FSharp.Core.int<_>
Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.set
module List
from Microsoft.FSharp.Collections

type List<'T> =
 ( [] )
 ( :: ) of Head: 'T * Tail: 'T list
interface IEnumerable
interface IEnumerable<'T>
member GetSlice : startIndex:int option * endIndex:int option > 'T list
member Head : 'T
member IsEmpty : bool
member Item : index:int > 'T with get
member Length : int
member Tail : 'T list
static member Cons : head:'T * tail:'T list > 'T list
static member Empty : 'T list
Full name: Microsoft.FSharp.Collections.List<_>
Full name: Microsoft.FSharp.Collections.List.iter
type EntryPointAttribute =
inherit Attribute
new : unit > EntryPointAttribute
Full name: Microsoft.FSharp.Core.EntryPointAttribute

new : unit > EntryPointAttribute
Full name: Microsoft.FSharp.Core.Operators.typeof
val string : value:'T > string
Full name: Microsoft.FSharp.Core.Operators.string

type string = System.String
Full name: Microsoft.FSharp.Core.string
Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.printfn
val double : value:'T > double (requires member op_Explicit)
Full name: Microsoft.FSharp.Core.ExtraTopLevelOperators.double

type double = System.Double
Full name: Microsoft.FSharp.Core.double
type CustomComparisonAttribute =
inherit Attribute
new : unit > CustomComparisonAttribute
Full name: Microsoft.FSharp.Core.CustomComparisonAttribute

new : unit > CustomComparisonAttribute
type StructuralEqualityAttribute =
inherit Attribute
new : unit > StructuralEqualityAttribute
Full name: Microsoft.FSharp.Core.StructuralEqualityAttribute

new : unit > StructuralEqualityAttribute
Full name: Microsoft.FSharp.Core.Operators.compare
Full name: Microsoft.FSharp.Core.obj
Full name: Microsoft.FSharp.Core.Operators.unbox
Full name: Microsoft.FSharp.Core.Operators.invalidArg
static member AppendAllLines : path:string * contents:IEnumerable<string> > unit + 1 overload
static member AppendAllText : path:string * contents:string > unit + 1 overload
static member AppendText : path:string > StreamWriter
static member Copy : sourceFileName:string * destFileName:string > unit + 1 overload
static member Create : path:string > FileStream + 3 overloads
static member CreateText : path:string > StreamWriter
static member Decrypt : path:string > unit
static member Delete : path:string > unit
static member Encrypt : path:string > unit
static member Exists : path:string > bool
...
Full name: System.IO.File
System.IO.File.ReadLines(path: string, encoding: System.Text.Encoding) : System.Collections.Generic.IEnumerable<string>
from Microsoft.FSharp.Collections
Full name: Microsoft.FSharp.Collections.Seq.map
val float : value:'T > float (requires member op_Explicit)
Full name: Microsoft.FSharp.Core.Operators.float

type float = System.Double
Full name: Microsoft.FSharp.Core.float

type float<'Measure> = float
Full name: Microsoft.FSharp.Core.float<_>
Full name: Microsoft.FSharp.Collections.Seq.groupBy
Full name: Microsoft.FSharp.Collections.Seq.toList
module Map
from Microsoft.FSharp.Collections

type Map<'Key,'Value (requires comparison)> =
interface IEnumerable
interface IComparable
interface IEnumerable<KeyValuePair<'Key,'Value>>
interface ICollection<KeyValuePair<'Key,'Value>>
interface IDictionary<'Key,'Value>
new : elements:seq<'Key * 'Value> > Map<'Key,'Value>
member Add : key:'Key * value:'Value > Map<'Key,'Value>
member ContainsKey : key:'Key > bool
override Equals : obj > bool
member Remove : key:'Key > Map<'Key,'Value>
...
Full name: Microsoft.FSharp.Collections.Map<_,_>

new : elements:seq<'Key * 'Value> > Map<'Key,'Value>
Full name: Microsoft.FSharp.Collections.Map.ofSeq
Full name: Microsoft.FSharp.Collections.Map.map
Full name: Microsoft.FSharp.Collections.Map.fold
from Microsoft.FSharp.Core