Pattern matching

In this post I want to write about Pattern Matching, because if you have never used a functional programming language before Erlang, you will have to deal with it as soon as you can. It is a focal point if you want to understand examples written in Erlang, or if you want to start writing code the right way.

You will use Pattern Matching in three (fundamental) cases:

  • Variable Assignment
  • Controlling the flow
  • Data filtering from complex structures

Let’s go in order with my bullet point list.

Variable Assignment

This topic can be hard to understand if you come from a classic programming language. Why? Simply because you have to understand from the beginning that variables can’t be assigned; they can only be pattern matched. The = operator is not an assignment operator but the Pattern Matching operator: it evaluates the right side and matches the result against the pattern on the left side, and the match can succeed or fail.

For example, if the value of A is 1 and I write A=1 in the Erlang Shell, the Pattern Matching succeeds because the assertion is true. To make an assignment, you have to Pattern Match an unbound variable with the value you want to give it. In fact, if a variable is unbound, the only way for the Pattern Matching to succeed is to bind it to the value on the right side of the = operator. It is also important to understand that a Pattern can be composed not only of variables but of literals too (integers, floats, atoms…).
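The binding and matching rules described so far can be sketched in a small module. The module name match_demo and the function bind_and_match are hypothetical names chosen for this illustration only:

```erlang
-module(match_demo).
-export([bind_and_match/0]).

%% Hypothetical helper: walks through the cases described in the text.
bind_and_match() ->
    A = 1,                  % A is unbound, so the match binds it to 1
    A = 1,                  % A is bound to 1 and the right side is 1: succeeds
    1 = A,                  % a literal can appear on the left side too
    {ok, B} = {ok, A + 1},  % a pattern mixing a literal atom and an unbound variable
    {A, B}.
```

Calling match_demo:bind_and_match() should return {1,2}, since every match in the body succeeds.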

OK, now you are thinking, “Oh, so strange, why can I assign a value to a variable only once? So they are not variables. I’m used to doing whatever I want with a variable”.

You are right, they are variables that don’t vary, because in this way we can forget about side effects caused by wrong assignments in some parts of the program.

We can also detect an error faster, because there is only one place to look in order to find where the wrong value was bound.

As a little summary of this point: when you pattern match a value against a variable, if the variable is unbound the value will be bound to it and the Pattern Matching succeeds. If the variable is already bound, the value on the right side must be the same as the variable’s value for the match to succeed; otherwise you will get a badmatch error at runtime.
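To see the failing case without crashing the caller, the match can be wrapped in a try expression. This is only a sketch: the module badmatch_demo and the function rebind are hypothetical names, and catching the error is done here purely to make the result visible.

```erlang
-module(badmatch_demo).
-export([rebind/1]).

%% Hypothetical helper: A is bound to 1 first, then matched against New.
rebind(New) ->
    A = 1,
    try
        A = New,            % succeeds only when New is 1
        {ok, A}
    catch
        %% A failed match raises an error whose reason is {badmatch, Value},
        %% where Value is the right-hand side that did not match.
        error:{badmatch, Value} -> {error, {badmatch, Value}}
    end.
```

With these definitions, badmatch_demo:rebind(1) returns {ok,1}, while badmatch_demo:rebind(2) returns {error,{badmatch,2}}.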

Controlling the flow

With this construct you can manage the flow control in a pragmatic way: it is intuitive, and it will improve the way you write code.

Instead of using tons of “if clauses” you can rely on Pattern Matching in the case and receive expressions (the first is a conditional evaluation and the second is about processes) or in function definitions. In this post I will cover only functions, because the other two require more background knowledge, which is beyond the scope of this post.

A function in Erlang is defined as a collection of clauses, and when it is called, Erlang chooses the right one by Pattern Matching the arguments against each clause head, from the first clause to the last.

Here is a very simple example:

greetings({woman, _Name, _Surname}) ->
     "Hello Madame";
greetings({man, _Name, _Surname}) ->
     "Hello Sir";
greetings(_) ->
     "Hello".

When we call the greetings function we can pass a tagged tuple, and based on the tag element Erlang will choose the right clause (using the Pattern Matching principles). It is a good habit to add a last clause that matches everything, so that we can return a default or an error value.
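To try the function, it has to live in a module and be exported. A minimal sketch follows; the module name greeter is an assumption made for this example, and the names passed in the calls below are placeholders:

```erlang
-module(greeter).
-export([greetings/1]).

%% Each clause head is a pattern; the first one that matches is chosen.
greetings({woman, _Name, _Surname}) ->
    "Hello Madame";
greetings({man, _Name, _Surname}) ->
    "Hello Sir";
%% Catch-all clause: anything that is not a tagged tuple handled above.
greetings(_) ->
    "Hello".
```

After compiling it, greeter:greetings({woman, "Ada", "Lovelace"}) returns "Hello Madame", greeter:greetings({man, "Joe", "Armstrong"}) returns "Hello Sir", and any other argument, for example greeter:greetings(42), falls through to the last clause and returns "Hello".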

Data filtering from complex structures

In Erlang, you will surely use lists or tuples. Once you have one of them, you will surely need to extract or check the values in the structure.

Let’s start with lists. You can, for example, extract or check the value of the Head or the Tail of a list.

Example 1: The H and the T variables are unbound:

List=[1,2,3,4,5].
[H|T] = List.

In this case, after the evaluation of the second instruction, H will be 1 and T will be [2,3,4,5], because the Pattern Matching binds each unbound variable on the left to the corresponding value on the right. Don’t worry about the |, it is about Lists; I will probably cover it in a future post.

Example 2: Suppose that H is already bound to 1 and T to [2,3]:

List=[1,2,3,4,5].
[H|T] = List.

In this case we get an error at runtime: although H matches the head of the List, T does not match its tail, because the tail of the list has 4 elements and T has only 2.
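Both list examples can be reproduced with two small functions. This is a sketch under assumptions: the module list_demo and the functions split and check are hypothetical names, and the try expression is there only so that the failed match in Example 2 shows up as a return value instead of a crash.

```erlang
-module(list_demo).
-export([split/1, check/3]).

%% Example 1: H and T are unbound, so the match extracts head and tail.
split(List) ->
    [H | T] = List,
    {H, T}.

%% Example 2: H and T arrive already bound as arguments,
%% so the match only checks them against the list.
check(H, T, List) ->
    try
        [H | T] = List,
        same
    catch
        error:{badmatch, _} -> different
    end.
```

Here list_demo:split([1,2,3,4,5]) returns {1,[2,3,4,5]}, list_demo:check(1, [2,3], [1,2,3,4,5]) returns different, and list_demo:check(1, [2,3,4,5], [1,2,3,4,5]) returns same.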

With tuples it is much the same, but we don’t have to think about the cons construct (|). Let’s go through a quick example before writing a bit about the “don’t care” variables:

Tuple = {driver, "Lewis", "Hamilton"}.
{driver, Name, Surname} = Tuple.

In this case we have an F1 driver (my favourite) in the Tuple variable. In the second instruction, we check that the tuple contains a driver by pattern matching the first element with the atom driver; then, if Name and Surname are unbound, we extract “Lewis” and “Hamilton” in a single instruction. If they are already bound, the match instead checks whether we are dealing with the same driver, and if we are not, we get a runtime error.

Something must be said about “don’t care” variables in the Pattern Matching. The Erlang compiler is a kind entity: if it sees that you bind a variable and never use it, it emits a warning as a reminder. But if you really don’t need the value, you can make the variable name start with _, so the compiler and other programmers will understand that the variable doesn’t matter in the execution.

A pattern can also be just an _, but in this case it is only a placeholder that matches any value and is never bound.

As Ulf Wiger reminded me in the first comment to this post, the only real “don’t care” is the _, while a variable that starts with _ (for example _Var) is a real variable; the only difference from the others is that the linter has been instructed not to warn about it. So if you use, for example, _Var twice in a function, it is already bound the second time, and if you don’t pay enough attention you could be surprised by a “badmatch” error, because the Pattern Matching can fail just as it does for variables that don’t start with _.
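The difference between _ and a variable such as _Id can be sketched like this. The module underscore_demo and its functions are hypothetical names for illustration, and the error is caught only to make the surprise visible as a return value:

```erlang
-module(underscore_demo).
-export([anonymous/2, named/2]).

%% With the anonymous _, each occurrence matches anything independently.
anonymous(First, Second) ->
    {_, ok} = First,
    {_, ok} = Second,
    matched.

%% _Id is a real variable: the first match binds it, so the second
%% match succeeds only if Second carries the same value.
named(First, Second) ->
    {_Id, ok} = First,
    try
        {_Id, ok} = Second,
        matched
    catch
        error:{badmatch, Value} -> {badmatch, Value}
    end.
```

With two different identifiers, underscore_demo:anonymous({1, ok}, {2, ok}) returns matched, while underscore_demo:named({1, ok}, {2, ok}) returns {badmatch,{2,ok}}. When both tuples carry the same value, as in underscore_demo:named({1, ok}, {1, ok}), the result is matched.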

Here is an example:

% Surname is not used in the function, so we put an _ in front of it
get_driver_name({driver, Name, _Surname}) ->
     Name.
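The tuple ideas from this section, extracting a field and checking against values that are already bound, can be put side by side in one module. This is a sketch: the module name driver_demo and the function is_driver are hypothetical and not part of the original example.

```erlang
-module(driver_demo).
-export([get_driver_name/1, is_driver/3]).

%% Extraction: Name is bound by the match, _Surname is deliberately ignored.
get_driver_name({driver, Name, _Surname}) ->
    Name.

%% Checking: Name and Surname are bound by the first two arguments, so the
%% third argument matches only a driver tuple holding the same values.
is_driver(Name, Surname, {driver, Name, Surname}) ->
    true;
is_driver(_, _, _) ->
    false.
```

For the tuple used earlier, driver_demo:get_driver_name({driver, "Lewis", "Hamilton"}) returns "Lewis", and driver_demo:is_driver("Lewis", "Hamilton", {driver, "Lewis", "Hamilton"}) returns true. Passing a tuple with a different name or surname makes the first clause fail to match, so the second clause returns false.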

I hope this post helps someone get in touch with Pattern Matching. It must be understood from the beginning because it is a cornerstone of the language and a lot of things are based on it, so don’t underestimate this concept.

posted in Beginner by Mirko


3 Comments to "Pattern matching"

  1. Ulf Wiger wrote:

    Nice.

    One thing that can bite you is that _Var is actually still a variable; it's just that the linter has been instructed not to warn about them. So if you use e.g. _Surname twice in the same function, it will be bound the second time, and you may end up with a very surprising 'badmatch' error. The single _ is a true anonymous "don't care" pattern.

  2. ErlangThoughts wrote:

    Hi Ulf, thank you so much for the clarification. I have just updated the post, because it is a thing that might be underestimated but it may lead to surprising and not too nice errors. Great comment!

  3. Mirko Bonadei » Blog Archive » Erlang Pattern Matching – Mirko Bonadei’s Blog wrote:

    [...] Ulf Wiger reminded me in the first comment to the original post in English, the only real “don’t care” is the _, while a variable that starts with _ [...]
