Thursday, 12 December 2013

Graph algorithms: Kruskal's minimum spanning tree

The F# Journal just published an article about graph algorithms:
"Kruskal's algorithm for computing the minimum spanning tree of a weighted undirected graph uses the union-find data structure to combine fragments of tree connected by the lightest edges. This article describes both pure and impure implementations of the algorithm using the persistent union-find data structure by Conchon and FilliĆ¢tre to create a high performance purely functional solution..."
To read this article and more, subscribe to The F# Journal today!

Tuesday, 10 December 2013

Graph algorithms: Prim's minimum spanning tree

The F# Journal just published an article about graph algorithms:
"Prim's algorithm for computing the minimum spanning tree associates a predecessor and distance with each vertex and uses a searchable priority queue to add the lightest-weight vertices to the tree until the minimum spanning tree has been found. This article examines both pure and impure implementations of Prim's algorithm..."
To read this article and more, subscribe to The F# Journal today!

Saturday, 12 October 2013

A thread-safe object cache that doesn't leak

Consider the problem of lazily initializing objects from handles. Requesting the object corresponding to any given handle more than once should return the same physical object that the first such call returned. The simplest solution is to use a dictionary to memoize a create function:

let get create =
  let cache = System.Collections.Generic.Dictionary()
  fun id ->
    let mutable obj = Unchecked.defaultof<_>
    if cache.TryGetValue(id, &obj) then obj else
      obj <- create id
      cache.[id] <- obj

But this is not thread safe. No problem, thanks to .NET the code is actually simpler using a concurrent dictionary:

let get create =
  let cache = System.Collections.Concurrent.ConcurrentDictionary<int, 'a>()
  fun id -> cache.GetOrAdd(id, System.Func<_, _>(create))

The problem with this approach is that objects never have their entries removed from the dictionary and, therefore, this solution leaks memory. For example, the following test:

  let get = Leaks.get (fun n -> [n])
  for i in 1..1000000000 do
    ignore(get i)
    if i &&& (i-1) = 0 then
      printfn "%d created, %dMB in use"
        i (System.GC.GetTotalMemory false / 1048576L)

Produces this output, showing that the program has leaked away over a gigabyte of memory in just a few seconds:

1 created, 0MB in use
2 created, 0MB in use
4 created, 0MB in use
4194304 created, 331MB in use
8388608 created, 649MB in use
16777216 created, 1297MB in use

Fortunately, this mistake is easy to fix on .NET by using a concurrent weak dictionary that has strong keys and weak values. One such implementation is available here. This challenge may then be solved correctly in just 10 lines of F# code as follows:

let get create =
  let cache = TvdP.Collections.ConcurrentWeakDictionaryStrongKeys()
  fun id ->
    lock cache (fun () ->
      let mutable value = Unchecked.defaultof<_>
      if cache.TryGetValue(id, &value) then
        cache.[id] <- create id

With this solution we find that the memory usage never rises beyond 20MB:

1 created, 0MB in use
2 created, 0MB in use
4 created, 0MB in use
4194304 created, 17MB in use
8388608 created, 13MB in use
16777216 created, 16MB in use

To learn more, subscribe to the F# Journal today!

Friday, 11 October 2013

Scribble app in 12 lines of F#

Here’s a little Scribble app for F# Interactive in just 12 lines of code:

#r "PresentationCore"
#r "PresentationFramework"

open System.Windows

let canvas = Controls.Canvas()
let window = Window(Content=canvas)
let mutable lines = Shapes.Polyline()
window.MouseDown.Add(fun _ ->
  lines <- Shapes.Polyline(Stroke=Media.Brushes.Black, StrokeThickness=1.0)
  ignore(canvas.Children.Add lines))
window.MouseMove.Add(fun e -> lines.Points.Add(e.GetPosition window))
window.MouseUp.Add(fun _ -> lines <- Shapes.Polyline())
Application().Run window

This program begins by referencing the required WPF assemblies and opening a namespace. The canvas and a window containing it are created followed by a mutable global polyline. The mouse down event causes the polyline to be replaced and the new one added to the canvas. The mouse move event causes the current mouse position to be appended to the end of the points on the current polyline. The mouse up event replaces the polyline but does not add the new one to the canvas.

The result looks like this:

If you want to see more F# programs, subscribe to the F# Journal today!

Tuesday, 8 October 2013

Removing image borders

Shrinking images by removing a constant-color border is a common problem. The process turns this:

into this:

In fact, we faced this problem with all of the images from our previous blog post. One solution is to fire up a general-purpose image manipulation program like Paint.NET and invoke a series of mystical commands on each image in turn. This blog post looks at an F# program with a minimal WPF UI that allows image files to be dragged from explorer and dropped onto our UI. All such images are automatically shrunk by removing outside edges containing pixels that are all the same color.

We begin by referencing the usual WPF assemblies (PresentationCore, PresentationFramework, System.Xaml and WindowsBase) as well as System.Drawing because GDI+ exposes a more elegant interface for manipulating images. Then we open the following namespace:

open System.Windows

The following function loads an image as a 2D array of colors:

let loadImage (file: string) =
  use image = new System.Drawing.Bitmap(file)
  Array2D.init image.Width image.Height (fun x y ->
    image.GetPixel(x, y))

Note how the use binding allows the image to be automatically disposed when it falls out of scope.

Similarly, the following function saves as image of the given dimensions with pixel colors given by the getPixel function to the given file:

let saveImage (width: int) (height: int) getPixel file =
  use shrunk = new System.Drawing.Bitmap(width, height)
  for x in 0..width-1 do
    for y in 0..height-1 do
      shrunk.SetPixel(x, y, getPixel x y)
  shrunk.Save file

Next we write a function that returns the first index into a sequence of sequences where the subsequence contains more than one value or the length of the sequence if no such subsequence exists:

let find xs =
  let f xs = Seq.length(Seq.distinct xs) <> 1
  match Seq.tryFindIndex f xs with
  | None -> Seq.length xs
  | Some idx -> idx

The core of our program is the following shrink function that loads an image, identifies the rectangle within the border and saves that part of the image back to the same file:

let shrink file =
  let original = loadImage file
  let width, height = original.GetLength 0, original.GetLength 1
  let xs = [ 0..width-1 ]
  let ys = [ 0..height-1 ]
  let row y = seq { for x in xs -> original.[x, y] }
  let column x = seq { for y in ys -> original.[x, y] }
  let x0 = column xs |> find
  let x1 = width - find( column (List.rev xs))
  let y0 = row ys |> find
  let y1 = height - find( row (List.rev ys))
  let width, height = x1 - x0, y1 - y0
  if width > 0 && height > 0 then
    saveImage width height (fun x y -> original.[x+x0, y+y0]) file

The main program then creates a window, enabled drag-and-drop and adds a callback :

  let window = Window(Title="Shrink images")
  window.AllowDrop <- true
  window.Drop.Add(fun e ->
    if e.Data.GetDataPresent DataFormats.FileDrop then
      (e.Data.GetData DataFormats.FileDrop :?> string [])
      |> Seq.iter shrink)
  Application().Run window
  |> ignore

This program can now be used to remove borders from many images in batch.

For more articles on F#, subscribe to the F# Journal!

Tuesday, 1 October 2013

Downloading stock prices reloaded

We recently published a blog post describing a 14-line solution for downloading stock prices using F# 2.0. Type providers are a major new feature in F# 3.0 (Visual Studio 2012 and later) that make this task even easier!

We begin by installing the FSharp.Data package from NuGet:

Install-Package FSharp.Data

This provides us with type providers for the JSON, CSV and XML formats as well as Freebase and the World Bank.

In order to write this as a script we must begin by referencing the FSharp.Data DLL which is currently located in the following relative location:

#I @"../packages/FSharp.Data.1.1.10/lib/net40"
#r "FSharp.Data.dll"

Note that this has hardcoded the NuGet package’s version number which is likely to change in the future.

Next, we create a CSV type provider that uses sample CSV data to infer the types in the columns:

type Stocks = FSharp.Data.CsvProvider<"">

Note that we are able to give a URL as the location of the sample CSV data. This was not obvious from any of the available tutorials at the time of writing, all of which assume the data has already been downloaded to disk. The ability to refer to sample data on the web is clearly useful in practice.

Now we are ready to define the type we wish to use to represent the relevant data:

type Prices = { Open: decimal; High: decimal; Low: decimal; Close: decimal }

We shall use the same base URL as before:

let url = ""

Our getStockPrices function may now be written more simply, without requiring code to parse the CSV, as follows:

let getStockPrices stock =
  let msft = Stocks.Load(url + stock)
  [ for row in msft.Data ->
      { Open=row.Open; High=row.High; Low=row.Low; Close=row.Close } ]

Finally, the following example application downloads historical stock price information for Microsoft:

getStockPrices "MSFT"

Using type providers we have reduced the size of even the simplest example from 14 lines of code to just 9.

Writing a contour plotter

The F# Journal just published an article about computer graphics:
"Visualizations come in many different forms. One of the most common ways to visualize a function of two variables is a contour plot. The function is sampled to identify perimeters where it crosses thresholds and these are rendered as vector paths. This article describes the design and implementation of a simple program that generates contour plots using the meandering triangles algorithm and, when parallelized, runs over 11x faster than Mathematica..."
To read this article and more, subscribe to The F# Journal today!

Human-readable string joining in F#

Eric Lippert (formerly of Microsoft) posted an interesting blog post challenging readers to write a program that converts a sequence of strings into a single string by comma separating all but the last string, which is separated by the word “and” instead. Let’s consider this challenge in F#...

Perhaps the simplest and most maintainable solution is to convert the sequence into an array and act upon that instead:

let concat (xs: seq<string>) =
  let xs = Array.ofSeq xs
  match xs.Length with
  | 0 -> "{}"
  | 1 -> sprintf "{%s}" xs.[0]
  | n -> sprintf "{%s and %s}" (String.concat ", " xs.[..n-2]) xs.[n-1]

Pattern matching over the length of the array allows us to handle the three separate cases elegantly. Furthermore, in the final case it allows us to bind the length of the array to the variable n. In the last two cases we use the C-like sprintf function to compose several strings in one go. The final case also uses an array slice to obtain all elements except the last.

This function is easily tested in F# Interactive as follows:

> concat [[]; ["A"]; ["A"; "B"]; ["A"; "B"; "C"]];;
val it : string list = ["{}"; "{A}"; "{A and B}"; "{A, B and C}"]

Let’s look at some alternative solutions. First up, an elegant (almost) purely functional solution based upon a state machine with three states:

type State =
  | Empty
  | Singleton of string
  | And of System.Text.StringBuilder * string

The following function transitions the state machine from one state to another as new characters are encountered:

let transition a x =
  match a with
  | Empty -> Singleton x
  | Singleton y -> And(System.Text.StringBuilder("{" + y), x)
  | And(sb, y) -> And(sb.Append(", " + y), x)

The main concat function then folds the transition function over the sequence of strings and acts upon the final accumulator:

let concat xs =
  match Seq.fold transition Empty xs with
  | Empty -> "{}"
  | Singleton x -> "{" + x + "}"
  | And(sb, y) -> sb.Append(" and " + y + "}").ToString()

The three states are equivalent to the three pattern matches in our simple solution. Technically, the .NET string builder is not a purely functional data structure but that actually has no effect on our code here.

Another, more traditional, solution is to encode the state machine as an imperative program using conditions and loops. This may be written as follows in F#:

let concat (xs: seq<string>) =
  use e = xs.GetEnumerator()
  if e.MoveNext() then
    let s = e.Current
    if e.MoveNext() then
      let sb = System.Text.StringBuilder("{" + s)
      let mutable s = e.Current
      while e.MoveNext() do
        sb.Append(", " + s) |> ignore
        s <- e.Current
      sb.Append(" and " + s + "}").ToString()
    else "{" + s + "}"
  else "{}"

This produces the same result as the two previous solutions but is the most difficult to understand because the state is implicitly represented by the “program counter”.

Finally, a rather contrived solution that converts the input sequence into an array but creates the output string using the F# String.init function:

let concat (xs: seq<string>) =
  let xs = Array.ofSeq xs
  let n = xs.Length
  let len = if n<2 then 2+n else 1 + 2*n
  String.init len (fun i ->
    match i, len-1-i with
    | 0, _ -> "{"
    | _, 0 -> "}"
    | i, _ when i%2=1 -> xs.[i/2]
    | _, 2 -> " and "
    | _ -> ", ")

The last point of interest is the performance of these different functions when applied to a very long input string. As none of these functions uses repeated concatenation they are all linear time. When given ten million strings on this machine the simplest solution takes 0.79s, the functional state machine takes 0.81s, the imperative state machine takes 54s and the solution using String.init takes just 0.38s. Given such a small performance difference, the winning solution in the vast majority of cases is the simplest solution.

To see more F# samples and learn more tips and tricks, subscribe to the F# Journal today!

Friday, 27 September 2013

F# for Visualization 0.7 released

A new version of F# for Visualization has been released that boasts the following features:
  1. Easier to use than ever before!
  2. Contour plots.
  3. Interactive charts: pan and zoom with the mouse.
  4. New API.
  5. Programmatically close chart windows.

To use F# for Visualization in a script, simply put the DLL in the same directory as the .fsx file and reference it in the script like this:

  #r "FSharpForVisualization"

The following namespace contains lots of helpful functions that make F# for Visualization extremely easy to use:

  open FlyingFrog.FSharpForVisualization

For example, you can plot a sine wave like this:

  plot sin

Note that the chart can be dragged and zoomed using the mouse.

Heuristics are used to guess ranges when none are supplied:

  plot sqrt

Ranges over x are easily supplied like this:

  plot(atan, (0, 10))

Ranges over y can be supplied in a similar way:

  plot(atan, (-10, 10), (-1.5, 1.5))

Note that ranges need not be given as floating point numbers but can, in fact, be any numeric type.

We can also label the axes like this:

  plot(atan, (-10, 10), (-1.5, 1.5), ("x", "atan(x)"))

Run-time type information is used to determine the kind of chart used to visualize a function. For example, a function of two variables is automatically visualized as a contour plot:

  plot atan2

Functions of the type Complexfloat are also visualized as contour plots, this time with default axis labels. For example, the function Im(√z):

  open System.Numerics
  plot(fun z -> Complex.Sqrt(z).Imaginary)

The plot function can also be used to plot series. For example, sequences of numbers are taken to be y coordinates:

  plot [for i in 0..10 -> i*i]

Sequences of pairs of numbers are x,y coordinates:

  plot [for i in 0..10 -> i*i, i]

Sequences of such sequences are assumed to be multiple series:

  plot[[for i in 0..10 -> i*i, i]; [for i in 0..10 -> i*i*i, i]]

Predator-prey population evolution may then be visualized like this:

  let g, k, t = 0.02, 5e2, 1500

  let evolve (r, f) _ =
    let dtrf = 1e-4 * r * f
    r + (1.0 - r/k)*r*g - dtrf, dtrf + (1.0 - g)*f

  plot(Seq.scan evolve (50.0, 10.0) [1..t], ("Rabbits", "Foxes"))

The plot function returns an object with SaveAsPng and Close members that allow the visualization to be saved programmatically and closed:

  let window = plot floor

  window.SaveAsPng "TheFloorFunction.png"


Charts support typeset mathematics for both labels and annotations. For example, plot the sinc function and label the axes accordingly:

  open FlyingFrog.Controls
  open FlyingFrog.Graphics.Typeset.Math
  open FlyingFrog.Interactive

  let sinc x = sin x / x

  let g = Plot([Function sinc], (-10.0, 10.0))
  g.Labels <- font="" math="">"x", Fraction(Row[Text "sin"; math "(x)"], math "x"))

Annotate a point on the graph:

  #r "PresentationCore.dll"
  #r "PresentationFramework.dll"
  #r "WindowsBase.dll"

  open System.Windows
  open FlyingFrog.Graphics.Scene
  open FlyingFrog.Graphics.Typeset

  let pi = System.Math.PI
  g.Data <- data="" o:p="" pi="" sinc="" unction="">
  g.Epilog <- font="" label="" math="" ow="" scene="">"("; Fraction(Greek.pi, math "2"); math ",";
                              Fraction(math "2", Greek.pi); math ")"]),
                    0.003, Point(pi/2.0, sinc(pi/2.0)), Vector(-1.2, -1.2))

Although F# for Visualization does not depend upon F# for Numerics or the F# PowerPack it can consume types from those libraries. For example, the following computes the 5x5 Hilbert matrix using arbitrary-precision rational elements:

  #r "FSharp.PowerPack.dll"

  open Microsoft.FSharp.Math

  let A =
      [ for i in 0N..4N ->
          [ for j in 0N..4N ->
              1N / (i + j + 1N) ] ]

Just as our plot function attempts to chart any value of any type so the view function attempts to visualize any value of any type, such as our Hilbert matrix:

  view A

or even a typeset mathematical matrix equation:

  view(Row[math A; Symbols.Times; math A; math "="; math(A*A)])

The ability to visualize values of F# types as typeset mathematics so easily is invaluable for interactive technical computing.

F# for Visualization also provides an easy-to-use API for general 2D vector graphics. Vector graphics are styled by a sequence of strokes or fills colored with WPF brushes:

  open System.Windows.Media
  open FlyingFrog.Graphics.Style

  let styles =
    [ Stroke(Brushes.Red, {Width = 0.001}) ]

The geometry of a shape is a sequence of contours that may be closed or open (only closed contours will be filled):

  open FlyingFrog.Graphics.Contour

  let geometry =
    let n = 50
    [ for i in 0 .. n do
        for j in 0 .. 1 do
          let x, y = float i / float n, float j
            Contour.Open(Point(x, y), [lineTo(y, 1.0 - x)]) ]

A vector graphics scene may contain shapes, groups of scenes, transformed scenes and clipped scenes:

  open FlyingFrog.Graphics.Scene

  let scene = Shape(styles, geometry)

The view function visualizes the vector graphics scene:

  view scene
The scene may also be visualized on a pair of axes by making it the epilog of an empty plot:

  Plot([], (0., 1.), (0., 1.), Epilog=scene)

F# for Visualization also does 3D graphics.

  open FlyingFrog.Graphics3D
  open FlyingFrog.Graphics3D.Scene3D

A Mesh describes a geometry and a Shape3D combines a mesh with a WPF Brush in order to represent a colored 3D object as a piece of scene graph. For example, the following creates a white icosahedron:

  let scene3d = Shape3D(Polyhedra.icosahedron, Brushes.White)

The same "view" function can be used to visualize 3D objects:

  view scene3d

F# for Visualization is currently available for just £49 under our beta release scheme. This latest release brings F# for Visualization much closer to it's first full release whereupon the price will increase. So why wait? Order your copy of F# for Visualization today!