First
Prev
Next
Last
Showing entries
1
to
4
of
100.
Scala partial functions (without a PhD)
Posted
Saturday October 15, 2011 12:54 MST
under category
Last updated
Tuesday January 17, 2012 14:34 PST
If you have done some Scala for a while, you know about pattern matching and match/ case. Things like: value match { case Some(value) ⇒ … case None ⇒ … } But there is another use of the case keyword, without match, as in: map foreach { case (k, v) ⇒ println(k + " → " + v) } The first time I saw this kind of things I was a bit puzzled: in which situations could case be used without match? Well, it turns out [1] that a block with a bunch of case inside is one way of defining an anonymous function.There is nothing new with anonymous functions of course, and Scala has a very compact notation for those that doesn't involve case. But this particular way of defining anonymous functions gives you a lot for free, namely all the good things of pattern
matching like casting-done-right, guards, and destructuring. The example above, with foreach, shows how case can be used for destructuring the tuples of the map into key and value components.But there is more. Consider:scala> List(41, "cat") map { case i: Int ⇒ i + 1 } scala.MatchError: cat (of class java.lang.String) As expected this crashes, because the pattern match doesn't know what to do when the string "cat" is passed to it.On the other hand, this example doesn't crash:scala> List(41, "cat") collect { case i: Int ⇒ i + 1 } res1: List[Int] = List(42) So what's the difference? Does collect just catch the MatchError and proceed? That would be clumsy and inefficient. In fact, the apparent magic lies in the fact that case blocks define special functions called partial functions. [2]Now you might wonder, coming from a "normal" programming language background, what that means, for a function to be "partial".
Well, it comes from mathematics, where it's opposed to "total" functions. But even though it comes from math it's actually simple. Take for example this function: def inc(i: Int) = i + 1 It is defined for any Int input value. That means for that any Int argument, it produces a resulting Int result. [3]A partial function on the other hand is defined only for a subset of the possible values of its arguments: def fraction(d: Int) = 42 / d is not defined for d == 0 and fraction(0) will throw an exception. Think also of the square root function, which is not defined for negative real numbers. Examples abound. And it's true also for the collect example above, where the anonymous function is only defined for an Int argument but not for a String (or any other) argument.So you get the idea about some values not "making sense" as the argument of a function because they can't yield a significant
result.Now if you think about it you will notice lots of situations like this in your programs, where functions are expected to work
properly only for some input values. If the function is called with a disallowed value, it will typically crash, yield a special
return value, or throw an exception (and this should better be documented). In short, partial function are very common in real-life programs
even if you don't know about it.So here fraction is defined as a regular function, but conceptually it is a partial function. The good thing is that Scala has built-in support
for partial functions thanks to the PartialFunction trait. And here is one way of defining such a partial function:val fraction = new PartialFunction[Int, Int] { def apply(d: Int) = 42 / d def isDefinedAt(d: Int) = d != 0 } A PartialFunction must provides a method isDefinedAt, which allows the caller of the partial function to know, beforehand, whether the function can return a result for a given input value:scala> fraction.isDefinedAt(42) res2: Boolean = true scala> fraction.isDefinedAt(0) res3: Boolean = false And if you call the function: scala> fraction(42) res4: Int = 1 scala> fraction(0) java.lang.ArithmeticException: / by zero This takes us back to the use of case to define partial functions. The exact same function can be written: def fraction: PartialFunction[Int, Int] = { case d: Int if d != 0 ⇒ 42 / d }
(Notice that you must specify that the PartialFunction[Int, Int] type. It would be great if Scala had a syntax to make this even more compact but it doesn't as of Scala 2.9.) And if you call the function: scala> fraction(42) res5: Int = 1 scala> fraction(0) scala.MatchError: 0 (of class java.lang.Integer) (Note that there is one visible difference from the outside when you use the case way: you get a MatchError as you usually do with pattern matching.) The idea doesn't apply only to numbers. In our collect example above, the partial function implicitly defined looks like this: def incAny: PartialFunction[Any, Int] = { case i: Int ⇒ i + 1 } The function takes an Any as parameter because List(41, "cat") is a List[Any]. But it is only defined for inputs that are of type Int:scala> incAny(41) res6: Int = 42 scala> incAny("cat") scala.MatchError: 41 (of class java.lang.String) Passing a String didn't go too well, as expected. But now you can check this before calling the function with: scala> incAny.isDefinedAt(41) res7: Boolean = true scala> incAny.isDefinedAt("cat") res8: Boolean = false
So we now have the explanation for the difference in behavior between collect and map, which is that collect expects a partial function. It asks incAny whether it is defined for 41 and then "cat", and so automatically filters out "cat". Another cool thing here is that the Scala compiler can even infer a clean resulting collection type: List[Int]!
scala> List(41, "cat") collect incAny res9: List[Int] = List(42)
Also, as you notice, if you define the partial function inline, the compiler knows that it's a partial function and you avoid
the explicit PartialFunction trait.
Notice that partial functions can lie:
scala> def liar: PartialFunction[Any, Int] = { case i: Int ⇒ i; case s: String ⇒ s.toInt } liar: PartialFunction[Any,Int] scala> liar.isDefinedAt(42) res10: Boolean = true scala> liar.isDefinedAt("cat") res11: Boolean = true scala> liar("cat") java.lang.NumberFormatException: For input string: "cat"
Here liar says incorrectly that it's defined for "cat". It would probably be better to write:
scala> def honest: PartialFunction[Any, Int] = { case i: Int ⇒ i; case s: String if isParsableAsInt(s) ⇒ s.toInt } honest: PartialFunction[Any,Int] scala> honest.isDefinedAt("cat") res12: Boolean = false
So now you see how partial functions defined with case can be used for things like collect with a super compact notation. You will see them in other places, including catch expressions.
There is another situation in Scala where partial functions are "just there" and you might not know it. Take the following
List:
val pets = List("cat", "dog", "frog")
In Scala, any instance of Seq, Set or Map is also a function. So you can write
scala> pets(0) res13: java.lang.String = cat
But:
scala> pets(3) java.lang.IndexOutOfBoundsException: 3
Wouldn't that mean that the pets function is, hum, only defined for values 0, 1, and 2? Sounds familiar? Wouldn't it be cool to look at pets as a partial function then? Well you can because in Scala any instance of Seq, Set or Map is actually a partial function. So you can write:scala> pets.isDefinedAt(0) res14: Boolean = true scala> pets.isDefinedAt(3) res15: Boolean = false
And if you had a list of indexes and wanted to safely collect values for these indexes in a new list, you could write: scala> Seq(1, 2, 42) collect pets res16: Seq[java.lang.String] = List(dog, frog) Here it works well because collect handles everything for us. But it can be a pain to check isDefinedAt all over the place. If anything, it feels a bit like a null check, and we hate those in Scala. The good news is that in Scala
the PartialFunction trait supports the lift method, which converts the partial function to a normal function that doesn't crash:
scala> pets.lift(0) res17: Option[java.lang.String] = Some(cat) scala> pets.lift(42) res18: Option[java.lang.String] = None
As you see the lift returns a function that returns an Option of the value. This allows you to safely process values without null checks and without calling isDefinedAt yourself:
scala> pets.lift(0) map ("I love my " + _) getOrElse "" res19: java.lang.String = I love my cat scala> pets.lift(42) map ("I love my " + _) getOrElse "" res20: java.lang.String = ""
I hope this helps make some sense of partial functions in Scala.
[1] From The Scala Language Specification: "An anonymous function can be defined by a sequence of cases […] which appear as an expression without a prior match." [2] This is not to be confused with partially applied functions, which are a completely different topic. [3] In Scala, it is defined even for Int.MaxValue, as Int.MaxValue + 1 == Int.MinValue. The result is just plain wrong but it's defined!
Continuations in Scala (without a PhD)
Posted
Saturday September 17, 2011 16:15 MST
under category
Last updated
Sunday September 18, 2011 11:02 MST
With @avernet we have been thinking lately about continuations, for a few reasons:
- Continuations pop up on the web as a concept that could help with event-based programming
- Scala has a continuations plugin, and we're wondering what the deal is with that.
- It just seems like fun to try to understand this (alongside things like monads).
The main idea of continuations is the ability to interrupt a program, save its control state, and resume it at a later point
in time. One thing to realize is that there are many ways to implement this idea and variations around it. Google a bit and you will
find a lot of material on continuations, some of which goes deep into computer science. Here we don't care about the big picture: we
just want to get at least some insight into Scala continuations. The main source of information on Scala continuations is the EPFL paper describing how continuations were designed in the Scala compiler. But if you google "scala continuations" and hope to find
right away a clear explanation, you might be disappointed. You will find the following example (I am not kidding):
reset { shift { k: (Int ⇒ Int) ⇒ k(k(k(7))) } + 1 } * 2 This proudly produces the flamboyant result: 20. As a commenter says, "these are convoluted ways of adding numbers and I have no idea what is being gained or accomplished". @djspiewak echoes this when he says that continuations in Scala are "powerful ...but useless". It's a bit like explaining how a combustion engine works, but
not that it could be used to, say, move your car from home to work.
So let's try to look at something concrete. Imagine a read() function which returns a byte from the network: def read: Byte This is typically the signature of a synchronous (blocking) function. After all, it has a return value and in normal programming
languages, that means waiting for that value to be available. A program that reads two bytes in a row and prints them looks
like this: val byte1 = read println("byte1 = " + byte1) val byte2 = read println("byte2 = " + byte2) The issue is that in a web browser or node.js or any other single-threaded, event-driven environment, this is not acceptable:
you simply cannot block for a long time, otherwise nothing else can happen in the system. So instead, the read() function is made to take a callback, something like: def read(callback: Byte => Unit): Unit You must now write your program like this: read { byte1 ⇒ println("byte1 = " + byte1) read { byte2 ⇒ println("byte2 = " + byte2) } } The issue here is that you must write in a funny style, even with Scala's lightweight syntax for closures. Note also how each
callback typically causes a new level of indentation. Some programmers manage to get used to this style, but it does not represent
the control flow in a very natural way, and the issue grows with the size of the program. Enter Scala continuations: import scala.util.continuations._ reset { val byte1 = shift(read) println("byte1 = " + byte1) val byte2 = shift(read) println("byte2 = " + byte2) } And voilà: you can write the program again in imperative style without callbacks and closures. You notice the reset and shift constructs. These terms don't make any sense to a newcomer, but they were introduced a long time ago in an academic paper
so are reused in Scala. Basically, reset delimits the continuation. With full continuations, the entire rest of the program would be under control of the continuation, but
here, whatever is before and after the reset block has nothing to do with continuations at all. (Also, reset can return a value, although here we don't care about it.) shift is the construct that does the real magic. Mainly, it smartly hacks around to pass the continuation, that is a closure containing whatever-code-follow-shift-until-the-end-of-the-reset-block, to its body. If you run that closure,
you actually run that code after the shift. If you store that closure somewhere, you gain the ability to decide when to run
that code at a later point. This is the general idea of continuations: interrupt, then resume a program. Here it's all done
with functions and closures behind the scene. To see how our example really works, let's look a the control flow. First, how would you go about implementing the non-blocking read() function? Obviously it would have to work hand in hand with an asynchronous framework of some sort. Let's say it's roughly
equivalent to something like this: var myCallback: Byte ⇒ Unit = null def read(callback: Byte ⇒ Unit): Unit = myCallback = callback The key here is that read() is passed a callback function. read() just stores the callback in a variable and then returns immediately. There is just no waiting. This simulates what a real async framework would do. In our example, shift calls read() with the continuation as a callback, and as we have seen read() returns right away. But then what does shift do? Does it just hang around? No: shift returns right away as well, and then control continues right after the reset block, and control should then return to the async framework. So it's as if the user program had paused just in the middle
of calling shift(read). Now say that 5 minutes later, a byte (say 42) is available from the network. The async framework figures this out, notices myCallback is registered, and so calls it with the value 42. The result of calling the callback is to run the continuation, that is
the code that follows the first shift runs, with byte1 set to the value 42. Did you see what happened there? It's as if the user program had resumed. And in effect
it has. What happens next? There is a another shift, so the scenario repeats: a new continuation is stored into myCallback. This time, it contains the code after the second shift. read() returns, shift returns, and control returns to the async framework, this time via the call to the initial callback. When the framework receives
another byte from the network, the user program runs up to the end of the reset block and has in effect terminated. We are happy because:
- We never blocked our single thread.
- We wrote the program in a clear, understandable style.
- We actually did something (read and processed bytes from the network)
Obviously to make this real you want a framework and a function library with a set of useful asynchronous functions besides read(). Also, note that you can hide the use of shift from the programmer, and expose the read function like: def aRead = shift(read) And the program becomes: reset { val byte1 = aRead println("byte1 = " + byte1) val byte2 = aRead println("byte2 = " + byte2) } By the way, it also works within while loops. With this specific use of continuations where shift never calls the continuation directly, control unwinds the stack back to the top, and there is no stack explosion. This is
good news: reset { var value = -1 while (value != 42) { value = aRead println(value) } println("done") }
So I would say that this at least appears to be a very useful (if not mainstream at this point) use of continuations in Scala.
They will become even more useful when used as part of a reactive programming DSL.
48 hours of Nexus One
Posted
Sunday January 31, 2010 22:11 PST
under category
Last updated
Tuesday February 2, 2010 15:19 PST
2009: Products I Can’t Live Without
Posted
Wednesday January 7, 2009 00:00 PST
under category
Last updated
Wednesday January 7, 2009 00:01 PST
Posted
under category
Last updated
Refresh...
|
IMG_6146
Taken
Sunday February 6, 2011 12:58 UTC
framechannel
IMG_6060
Taken
Monday January 17, 2011 20:13 UTC
framechannel
IMG_6030
Taken
Saturday January 15, 2011 17:32 UTC
framechannel
Result of the first "solid food" experiment
Taken
Sunday January 9, 2011 17:55 UTC
framechannel
Refresh...
|