3. Pattern Matching – Be Explicit
Pattern matching is a way of writing conditional code by mainly decomposing data into subcomponents. It is a high level construct allowing programmers to think naturally about their algorithms rather than in programming terms. Under the hood pattern matching expressions just are empowered if/switch expressions, so don’t worry to use them, they are efficient.
In F# you can do pattern matching on the terms before the (=) operator in a let expression, using the function syntax or the match with syntax. You can decompose most data structures which have a literal representation in the code such as lists, tuples (kind of array containing values of different types), numbers, strings, booleans etc.…
So you can do:
Let see an examples for each of these constructs:
Here the pattern is firstName, lastName, age, which matches a tuple of three values, indeed the comma operator (,) helps to build tuples. Tuples are defined by their size, and type, which is an union of other types. On the right side of the (=) operator, an other tuple “Dorian”, “Corompt”, 23 of size 3 and of type string * string * int , the (*) operator indicating an union of types. Thanks to pattern matching we decompose this tuples, giving the value “Dorian” to firstName, “Corompt” to lastName and 23 to age.
You could also have written:
You can interpret even simple value binding as pattern matching, making it a more general concept. In the example above person is bound to a tuple, you can think person as a named pattern to match everything, and in this precise case it is a tuple.
In the previous example we named the subcomponents in the pattern to get their values, but in some case you don’t need all subcomponents, in these case you can use the joker/wildcard pattern ( _) to mach the values that should be ignored.
Let’s say we want to represent a 3D point with a tuple of type float * float * float we need functions to access each coordinates of this structure:
In functional languages functions are standard values (see my article on 1st Class Functions), so you can define them with let since let helps to bind names to values.The syntax is let functionName argument1 argument2 … = body. In our example each functions take one argument (a tuple of three values), we used the (_) joker pattern to represent coordinates we don’t need.
In the interactive console we can try the functions we just defined:
It works as expected.
You already saw a function using pattern matching on numbers:
The function construct returns as value a function taking a single argument on which pattern matching is applied, even if you don’t see it, this argument exists. The pipe (|) allows to define possible branches for the match, the branches patterns are tested sequentially from top to bottom. In the fact definition the first branch matches the pattern 0 which is the same constant value and attributes an expression value to it, 1. The second branch matches everything (with some constraints on the type which is inferred by the compiler) and name the value matched n so it can be used in the expressions associated to this match which is n * fact(n – 1).
The match with construct allows to apply pattern matching on a specific expression. Actually the previous fact definition is equivalent to the following:
The function construct is therefore just a syntax sugar to allow concise definitions of particular pattern matching functions.
I told you that a named pattern of the form n (aka only a name) matches everything, so you may have questions:
Does it mean that if I call fact “test” this will call factorial on a string, matching it with the n pattern?
No, in F# everything is typed, in the first branch you specify 0 –> 1, the compiler infers that your function returns an int and that it takes an int so that x should be an int. The second branch has to confirm this hypothesis. So the compiler looks at n –> n * fact(n - 1), it looks first at the result n * fact(n - 1) , n is of unknown type, fact is of type: int –> int and the (*) operator is of type int –> int so the whole expression should have type int ->int meaning that n and x are of type int.
What if I give a negative integer to this fact definition?
This function will make your program crash. Indeed, take fact –1 one for example, -1 it will be matched against n and lead to n * fact(n - 1), it continues recursion with fact –2; fact –3,… x being negative will never be matched against 0, so this function will loop indefinitely and because this definition is not recursive it will push its context on the stack on each loop iteration, leading to a stack overflow. Of course there is a solution to this problem, we can use condition (or guards) on pattern to put constraints on them. Pattern conditions are specified through the use of the when keyword:
After adding this pattern condition you will get a warning which tells you that your pattern matching is incomplete. What does it mean? It simply means that your matching expression doesn’t take into account all input it could be given. Indeed if you call now fact –1, where does it branch to? If you don’t define an additional pattern it will leads to a match failure runtime exception.
Let’s just add a third branch to correct this and encompass all possible values:
The third branch uses the (_) joker pattern to match everything else (negative integers) and throw a failure exception with the given message thanks to the failwith function.
What if while decomposing list of the form h :: t, I want to keep a reference of the entire list not only the head and the tail?
You can use the as keyword to name pattern components for example (h :: t) as ls –> ls matches a non empty list (it has a head and a tail) and returns its value.
Real life functions
When I speak about functional programming languages with imperative programmer colleagues, they tend to consider them as unusable.
I always hear “They are toys good for maths, experiments, but not real life applications…”
When I hear these words, it makes me smile. The basis of OOP concept, as well as important programming technics such as garbage collection, exception handling, generics, software threads, closures, predicates, sequences, most list manipulation functions or technologies (LINQ for example) they are proud to use in their imperative languages come from functional programming languages…
I recently noticed that after closures, and LINQ the next “big” feature coming to C# is asynchronous operations handling through the use of the await keyword, a new concept heavily used in Windows 8 Metro Style UI programming. As you will see this feature is just a particular application of F# workflows (Haskell Monads) that I will introduce in a next article.
They try to interpret and criticize functional programming from their imperative programming knowledge. As you cannot judge a culture you don’t know or understand you cannot judge a paradigm you never tried. Be open.
I am going to show you how are implemented some “real life” functions that are the pillars of most functional programming languages. And explain you how these functions relates to the .NET LINQ technology.
Let’s say you have a list employees aliases, you want to get their list of emails, where each email is of the form firstname.lastname@example.org. First, what you need is a function taking an alias and returning an email, then a way of mapping this function to each alias in the list.
C# (without LINQ):
Thinking imperatively you would try to decompose the second step in something iterative by thinking at the computer level: “I need to iterate through the list, call the function to create the email from the current iteration alias and then adding to a second list”. With a functional programming you think at a high level, you just do what you said!
The map function represents the pattern of applying a function in a context (a list in our case) and returning a new context containing the result. Its non tail recursive implementation for lists is trivial using pattern matching:
Now that you have this email list, let say you want to sort it alphabetically. You would like to sort it efficiently, looking for a solution you found the quicksort algorithm on Wikipedia (you could use the List.sort function, the purpose is to teach you how to think functionally):
Quicksort is a divide and conquer algorithm. Quicksort first divides a large list into two smaller sub-lists: the low elements and the high elements. Quicksort can then recursively sort the sub-lists.
The steps are:
- Pick an element, called a pivot, from the list.
- Reorder the list so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation.
- Recursively sort the sub-list of lesser elements and the sub-list of greater elements.
The base case of the recursion are lists of size zero or one, which never need to be sorted.
So we need to choose an element as a pivot in the list, let’s take the first element of the list. Then the result is a recursion on the elements of the list smaller than the pivot, the pivot, and a recursion on the element greater than the pivot.
Here a simple and concise implementation in F#:
I did exactly what is stated above. I took the first element of the list p as a pivot thanks to pattern matching, I applied qsort recursively on a list composed of the elements smaller than p thanks to the List.filter function, and I concatenated the result thanks to the list concatenation operator (@) with a list containing the pivot p and a list composed of the elements greater than p on which qsort has been recursively applied.
The fun … –> … construct is a way of creating anonymous functions, learn more about closures with my article on 1st Class Functions.
The filter function represents the pattern of filtering elements in a sequence satisfying a condition, and returning a new sequence of retained elements. Its non tail recursive implementation for lists is trivial using pattern matching:
Functions like f are called predicates, they represent conditions and return a boolean value. We use f in the guard to know if we need to keep the element in the resulting list, or switch to the next branch and keep filtering the remaining elements discarding the first element.
map and filter are amongst the most important functions in functional programming languages.
Do you know LINQ?
In my first example I showed you how creating a list of emails from a list of aliases in F# was concise and clear compared to C#, but you could quite rightly argue that I didn’t use LINQ.
For example if you want to create a list of emails with aliases that contains the ‘o’ letter and then sort it alphabetically in C# with LINQ you could do it easily:
I wrote earlier, that LINQ was coming from functional programming, do you see the link with the functions map, filter and qsort, I have shown you?
The Select method is just a map, the Where method a filter, and OrderBy method a sorting function!
The (|>) operator is called the pipe operator (same role as the Powershell or UNIX shell pipe), it allows you to chain expressions by calling the function on its right with its right most parameter equals to the result of the expression on the left. If you know Haskell, it is equivalent to a right currying of the function. You will learn more about it when talking about partial function application and composition in my article on 1st Class Function.
It is not a coincidence if LINQ, inferred typing (var) and closures (F# functions are closures) appeared together in C# 3.0. They all are functional concepts.
So if some of your C# developer colleagues tell you how great is C# with LINQ compared to your toys, don’t tell them you could recode LINQ with your toys in 1 line of code, it could be vexing.