Skip to main content

Prim's Minimum Spanning Tree

Taking a break from Sasa blog posts, I ran into a problem that would be nicely solved by extracting the minimum spanning tree. Turns out this is pretty simple to solve naively in C# using only IEnumerable<T>:

/// <summary>
/// Computes the minimum spanning tree.
/// </summary>
/// <param name="edges">The list of edges.</param>
/// <remarks>
/// Given a list of edges whose vertices are consecutive numbers starting
/// from 0, this extension method returns an enumerable sequence of the
/// edges describing the minimum spanning tree.
/// </remarks>
public static IEnumerable<Edge> MinSpanTree(this IEnumerable<Edge> edges)
{
    var span = edges.ToList();
    var count = 1 + span.Max(x => Math.Max(x.From, x.To));
    edges.Sort();
    var vnew = new BitArray(count);
        vnew[0] = true;
    for (var i = 0; i < count; ++i)
    {
        foreach (var x in edges)
        {
            if (vnew[x.From] ^ vnew[x.To])
            {
                yield return x;
                vnew[x.From] = vnew[x.To] = true;
                break;
            }
        }
    }
}

An Edge is a simple class that encapsulates two vertex numbers and a weight represented by a double. It's also IComparable, so we can sort on it using the weight.

Prim's algorithm selects the minimum weighted edge between a vertex we have not yet accepted, and a vertex that we have accepted. We use a bit array to track the vertex numbers we've accepted, and simply search weight-sorted edge list for the first entry where one vertex is accepted, and the other is not. This is an O(V2), where V is the number of vertices. We could also contract the list after each accepted edge, but that doesn't change the overall time complexity.

A simple adjacency matrix representation using a two-dimensional array of doubles is also easily definable:

/// <summary>
/// Computes the minimum spanning tree.
/// </summary>
/// <param name="edges"></param>
/// <remarks>
/// Given an adjacency matrix describing the weighted edges, where
/// vertices are the indices of the matrix, this extension method
/// returns an enumerable sequence of the edges describing the
/// minimum spanning tree.
/// 
/// Vertices that have no edges between them should have
/// <see cref="double.NaN"/> or <see
/// cref="double.PositiveInfinity"/> in the corresponding
/// matrix entry.
/// </remarks>
public static IEnumerable<Edge> MinSpanTree(this double[,] edges)
{
    if (edges.RowCount() != edges.ColumnCount())
        throw new ArgumentException("Matrix row count must equal column count.", "edges");
    // convert the adjacency matrix into a list of edges
    var count = edges.RowCount();
    var span = new List<Edge>(count);
    for (int i = count - 1; i >= 0; --i)
    {
        for (int j = 0; j < i; ++j)
        {
            var w = edges[i, j];
            if (!double.IsNaN(w) && w < double.PositiveInfinity)
                span.Add(new Edge(i, j, w));
        }
    }
    return span.MinimumSpanningTree();
}

I've included these implementations in Sasa.Numerics.dll for the v0.9.4 release.

Comments

Popular posts from this blog

async.h - asynchronous, stackless subroutines in C

The async/await idiom is becoming increasingly popular. The first widely used language to include it was C#, and it has now spread into JavaScript and Rust. Now C/C++ programmers don't have to feel left out, because async.h is a header-only library that brings async/await to C! Features: It's 100% portable C. It requires very little state (2 bytes). It's not dependent on an OS. It's a bit simpler to understand than protothreads because the async state is caller-saved rather than callee-saved. #include "async.h" struct async pt; struct timer timer; async example(struct async *pt) { async_begin(pt); while(1) { if(initiate_io()) { timer_start(&timer); await(io_completed() || timer_expired(&timer)); read_data(); } } async_end; } This library is basically a modified version of the idioms found in the Protothreads library by Adam Dunkels, so it's not truly ground bre

Building a Query DSL in C#

I recently built a REST API prototype where one of the endpoints accepted a string representing a filter to apply to a set of results. For instance, for entities with named properties "Foo" and "Bar", a string like "(Foo = 'some string') or (Bar > 99)" would filter out the results where either Bar is less than or equal to 99, or Foo is not "some string". This would translate pretty straightforwardly into a SQL query, but as a masochist I was set on using Google Datastore as the backend, which unfortunately has a limited filtering API : It does not support disjunctions, ie. "OR" clauses. It does not support filtering using inequalities on more than one property. It does not support a not-equal operation. So in this post, I will describe the design which achieves the following goals: A backend-agnostic querying API supporting arbitrary clauses, conjunctions ("AND"), and disjunctions ("OR"). Implemen

Easy Automatic Differentiation in C#

I've recently been researching optimization and automatic differentiation (AD) , and decided to take a crack at distilling its essence in C#. Note that automatic differentiation (AD) is different than numerical differentiation . Math.NET already provides excellent support for numerical differentiation . C# doesn't seem to have many options for automatic differentiation, consisting mainly of an F# library with an interop layer, or paid libraries . Neither of these are suitable for learning how AD works. So here's a simple C# implementation of AD that relies on only two things: C#'s operator overloading, and arrays to represent the derivatives, which I think makes it pretty easy to understand. It's not particularly efficient, but it's simple! See the "Optimizations" section at the end if you want a very efficient specialization of this technique. What is Automatic Differentiation? Simply put, automatic differentiation is a technique for calcu