Friday, January 27, 2012

A Solution For Project Euler Problem 67 in F#

I'm a fan of the projecteuler site. This site lists some 300+ problems everyone is invited to solve. Some of them are relatively easy, some very tricky. As Wikipedia states:
«... Project Euler (named after Leonhard Euler) is a website dedicated to a series of computational problems intended to be solved with computer programs. The project attracts adults and students interested in mathematics and computer programming. As of 24 January 2012, it includes 368[1] problems of varying difficulty, each solvable in less than a minute using an efficient algorithm on a modestly powered computer.  ...»
From time to time I stop by and pick one or more of the problems and try to implement them in different languages. This is always a very interesting experience because it teaches me a lot about the differences between programming languages, their constructs and - of course - the pros and cons of different programming styles. This by the way can also be a very good and fun way to get your hands dirty while learning new languages.

The Problem

Recently I stumbled upon an interesting problem - problem number 67. Given a triangle of numbers, by starting at the top and moving to adjacent child nodes one level below, you are to find the maximum total from top to bottom by adding the respective values. It follows the example from the projecteuler.net site:
3
7 4
2 4 6
8 5 9 3

The resulting total in this example is 3 + 7 + 4 + 9 = 23.
In fact, problem 67 is the same as problem 18 with the little, but, as you will see in a moment, very important difference of a bigger problem size.

On the way to solve that bastard

In fact, problem 67 is the same as problem 18 with the little, but, as you will see in a moment, very important difference of a bigger problem size.

A naive approach

An approach one could possibly take is to traverse every possible route and return the maximum total found. While this is definitely a possible solution to solve problem 18, it doesn't work for problem 67. As mentioned before, the bigger problem size makes it unfeasible. The triangle given in problem 18 has 15 rows (root included), that gives you 14 levels with a left/right decision each, resulting in a total of 214 = 16384 routes to test. Every additional level doubles the number of possible routes. The triangle given in problem 67 has 99 «decision levels» that leads to 299 possible routes - that means
633 825 300 000 000 000 000 000 000 000 
routes! Well, my Mac IS a work horse, but...

So... what do we do then?

Ok, we need another approach. The basic idea comes from a mehtod known in computer science as Dynamic Programming (DP). What you try to do in DP is to break the problem down into smaller and hopefully easier subproblems, solve these subproblems and then combine them in order to construct an overall solution. Often the results of the subproblems are stored for later reuse, e.g. to prevent the algorithm to compute the same subproblem over and over again. This makes for more efficient algorithms. Also, there exist top-down as well as bottom-up strategies in DP.
Back to our Problem: We are solving it using a bottom-up strategy, so let me turn our triangle upside down.
8 5 9 3
2 4 6
7 4
3

In order to define our subproblem let's assume we know the maximum total path from the start until the red number 4. From there we have two directions from which to choose, the 5 or the 9, both in blue. For the purpose of maximizing our route, which would you choose? Yes, sure the 9. Let me formalize this a bit. In matrix form, the triangle being matrix A with elements a(i,j), we can state the updating rule:
a(i,j) <- a(i,j) + max( a(i-1,j), a(i-1,j+1) )

Applying this formula on each element in each row starting at the first and working you down to the last while keeping the most recent results of each subproblem leads you to the overall solution. Let me show you in tabular form how this works out with our small example from the website:
                   row                 intermediate results
==============================================================
init:                                     [ 0, 0, 0, 0 ]
--------------------------------------------------------------
1st row:      [ 8, 5, 9, 3 ]              [ 8, 5, 9, 3 ]
2nd:          [ 2, 4, 6    ]              [10,13,15, 3 ]
3rd:          [ 7, 4       ]              [20,19,15, 3 ]
4rd:          [ 3          ]              [23,19,15, 3 ]

I hope this makes it clear how it works. Ok, so let's start coding!

A possible implementation in F#

The first thing we have to do is to read in the triangle.txt file and put the text in an appropriate format. This is a particularly easy job in F#:
open System
open System.IO

let triangle = 
    File.ReadAllLines(@"/path/to/triangle.txt")
    |> Array.map (fun line -> 
        line.Split([|' '|]) 
        |> Array.map(fun str -> Int32.Parse(str)))
    |> Array.rev

After that the value triangle is a jagged array of Int32 values. I also turned the triangle upside down as explained above.

And here follows the code for finding the maximum total:
let max =   
    triangle 
    |> Array.fold (fun acc line -> 
        acc 
        |> Array.mapi (fun i elem -> 
            match (i < line.Length) with
            | true  -> line.[i] + Math.Max(acc.[i], acc.[i+1])
            | _     -> acc.[i])) 
       (Array.zeroCreate (triangle.[0].Length + 1))
    |> Seq.nth 0

That's really all! When you run these two snippets for example in F# Interactive it givs you following output:
val max : int = 7273
The accumulator argument (acc) passed to the fold function serves as the state of the intermediate results and is initialized with Array.zeroCreate. In the map part I update these intermediate results according to the updating rule explained above.
I mean, basically it's only a map within a fold. Really, I love F#.

A few explanations about the code for F#-newcomers

The main constructs I have used here are the forward pipe, the map and the fold higher order functions and the fun keyword.
  • Pipelining let's you take the result of a computation and use it as input of the following computation. For example in the first snippet I pipe the result of File.ReadAllLines (which is a string []) to the map function.
  • A map function allows you to transform a collection of some type to a collection of some other type. Again, in the first snippet, the first map gets a string [] from the forward pipe and transfroms it to an int [] []. The second map in the same snippet gets a string [] and transorms it to a int [].
  • A fold function is similar to the map function: it too applies a function to each element of a collection, but carrying an accumulator argument with for holding state. With a fold you can for example reduce your collection of values to a single piece of data. I sort of did that in the second snippet, when the two-dimensional array (int [] []) is reduced to a one-dimensional array (int []).
  • The fun keyword let's you define anonymous functions (lambda expressions!) and come in very handy when using for exammple map or fold. As you can see, there are 4 different anonymous functions defined in the snippets above.

Conclusion

All I can say is: Programming F# is fun. And the explorative way of trying things out with the help of the F# Interactive window in Visual Studio is simply great.
I'm often impressed by how short and succinct a solution looks. Don't get me wrong: I'm absolutely not saying that the presented solution is great or anything, I know there are many possibilities to make that code even more elegant, more readable and/or shorter, but still, I'm exited when I compare the code of my, say, Java version with this one.

I hope you enjoyed it. I did. Definitely. If you have any comment, please don't hesitate to drop a few words. Almost everything is appreciated!

No comments:

Post a Comment