Closures, from Scheme to JavaScript to PHP


Closures first appeared in PHP 5.3 but, as I’ve said before on my blog, they were done properly only in 5.4.

Wikipedia tells us:

In computer science, a closure (also lexical closure or function closure) is a function or reference to a function together with a referencing environment—a table storing a reference to each of the non-local variables (also called free variables) of that function.

In PHP the concept is neither very popular nor well known, and it is often mistaken for anonymous functions. In functional programming languages, however, it is very popular, because they really need it!

Scheme

When Brendan Eich designed JavaScript, he relied on the Scheme language and ended up implementing that language with a C syntax. The C syntax was and is a lot more popular, and back then (1995) the Java programming language was very “fashionable”.

The Scheme syntax is similar to Lisp, in the sense that it uses parentheses around expressions in order to execute them. Operators are defined as functions and, just like functions, they must be placed first inside the parentheses, to the left of their arguments.

Let’s take a Scheme closure example:

(define (make-counter)
  ; begin evaluates to its last expression, so count starts at 0
  (let ((count (begin
                 (display "run parent function and return 0")
                 0)))
    (lambda ()
      (set! count (+ count 1))
      (begin
        (display "inside child function ")
        count))))

The function sets a “count” variable to 0 and displays “run parent function and return 0”, then returns a lambda function. Each time it is called, that lambda increments the variable defined in the parent function, displays “inside child function” and returns the new value.

I’m storing the resulting function in a variable in order to later run it multiple times:

> (define counter (make-counter))
run parent function and return 0
> (counter)
inside child function 1
> (counter)
inside child function 2

In other words, each time I call (make-counter), it returns a new function that has access to the environment in which it was created. If it looks strange because of the syntax, I promise that it will feel a lot more natural in JavaScript.

This concept is very interesting for encapsulation. The environment that existed when the parent function was executed can be encapsulated and used later, without worrying that it was changed by external causes.

For functional programming languages this is a very interesting concept. In object-oriented languages, however, the concept seems almost useless, because objects also serve the purpose of encapsulation.

JavaScript

From the beginning, JavaScript was a hybrid: a functional programming language that is also object-oriented, with prototype-based inheritance. And as if this wasn’t enough, the syntax was taken from Java (C).

JavaScript didn’t inherit a lot from Scheme, but it did inherit the concept of closures.

One reason closures were needed in Scheme is that, if a function does not find a variable in its own environment, it searches for it in the enclosing environment. Let’s take an example:

(define x 1)
(define (add-in-env y) (+ x y))

If we call add-in-env with 2:

(add-in-env 2) -> 3

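It looks just as ambiguous as in JavaScript, but it is not exactly the same. In Scheme, mutation is not as easy, simple and transparent, so a subsequent operation such as:

(define x 2)

can result in an error, depending on the implementation and on where it is evaluated (inside a Racket module, for example, a duplicate definition is rejected, while many REPLs accept it).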

JavaScript ended up as a hybrid. Mutation is permitted, but the idea of searching for a variable in the enclosing environment remained:

var x = 1;
var add_in_env = function (y) {
   return x + y;
};

add_in_env(2); // returns 3

Up to this point everything is fine, but then:

x = 2;
add_in_env(2); // returns 4

In a case like this, things can get out of hand very easily.

To solve the issue, we can define the variable inside an environment that will finish executing (that will close):

var make_counter = function () {
   console.log("run parent function and set counter to 0");
   var count = 0;

   return function () {
       count = count + 1;
       console.log("inside child function");
       return count;
   };
};

var counter = make_counter();
console.log(counter());
console.log(counter());

var counter2 = make_counter();
console.log(counter2());
console.log(counter());
console.log(counter2());

The output will be:

run parent function and set counter to 0
inside child function
1
inside child function
2
run parent function and set counter to 0
inside child function
1
inside child function
3
inside child function
2

Even though the parent function has finished executing, the environment inside it is kept as a closure for the function that was returned. The memory allocated for the closure can be released only when there are no more references to the returned function.
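To make the idea of a kept environment more concrete, here is a minimal sketch in which the parent returns two functions instead of one. The name make_counter_pair and its increment/current members are hypothetical, introduced only for illustration. Both functions close over the same count, so a change made through one is visible through the other, while a second call to the parent gets a fresh environment.

```javascript
// Hypothetical helper: returns two functions that share one closed-over variable.
var make_counter_pair = function () {
    var count = 0; // stays alive after make_counter_pair returns

    return {
        increment: function () {
            count = count + 1;
            return count;
        },
        current: function () {
            return count;
        }
    };
};

var first = make_counter_pair();
var second = make_counter_pair();

first.increment();
first.increment();
second.increment();

console.log(first.current());  // 2
console.log(second.current()); // 1
```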

Even though JavaScript has objects, it doesn’t have private methods. One approach is to add a “_” (underscore) in front of the function name and consider it private. From my point of view, this is like asking the developers who will later use the code to treat the function as private. Of course, this is not something you can rely on.

Let’s take an example:

var obj = {
   _secretFunction : function (key) { console.log('do secret ' + key); },
   doStuff : function (key) { this._secretFunction(key); }
};

obj.doStuff('stuff'); // do secret stuff

It seems that there is a public method “doStuff” and a private one “_secretFunction”. Nevertheless, you cannot prevent a user from calling “_secretFunction” or, even worse, from modifying it:

obj._secretFunction = function (key) { console.log('new secret ' + key); };

obj.doStuff('stuff'); // new secret stuff

If we want to hide the function and make this obvious to everybody, we can again use closures:

var obj = (function () {
   var secretFunction = function (key) { console.log('do secret ' + key); };

   return {
      doStuff : function (key) {
         secretFunction(key);
      }
   };
})();

obj.doStuff('stuff'); // do secret stuff

Because the parent function was not stored but executed immediately, the scope in which secretFunction was defined has already finished executing, which encapsulates the logic. The returned object can still call the function because it was defined in the same environment as the object.
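A quick way to check this encapsulation is to try to reach or replace the hidden function from outside. The sketch below is a variant of the example above under a couple of assumptions: the name safeObj is hypothetical, and doStuff returns the string instead of logging it so that the result can be inspected.

```javascript
var safeObj = (function () {
    // Visible only inside this immediately executed function.
    var secretFunction = function (key) {
        return 'do secret ' + key;
    };

    return {
        doStuff: function (key) {
            return secretFunction(key);
        }
    };
})();

// The hidden function is not a property of the returned object.
console.log(typeof safeObj.secretFunction); // undefined

// Adding a property with the same name does not touch the closed-over variable.
safeObj.secretFunction = function (key) {
    return 'new secret ' + key;
};

console.log(safeObj.doStuff('stuff')); // do secret stuff
```

Unlike the underscore convention, nothing a caller does to the returned object can change what doStuff calls.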

It looks complicated at first, but it is really very easy once you understand the concept.

And then it was… PHP

PHP includes a lot of different options. It is often described as having started out as a Perl framework, with the engine later rewritten in C.

PHP is a dynamic language that includes a lot of concepts, from objects, interfaces and anonymous functions to goto labels. The development direction of the language is not very clear; rather, it offers the possibility of different approaches.

In the weird history of PHP, a form of anonymous functions was added somewhere in version 4 (create_function), but a more “normal” syntax appeared only in PHP 5.3.

The first version of closures was also introduced in 5.3:

$scalar = 5;

$closure = function () use ($scalar) {
     return 'Scalar: ' . $scalar . PHP_EOL;
};

echo $closure(); // Scalar: 5

$scalar = 7;

echo $closure(); // Scalar: 5

This version mostly worked, but you had to specify explicitly, through use, what you wanted to pass to the closure, and the value was copied at the moment the closure was created.
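For comparison, a JavaScript closure behaves differently here: it captures the variable itself rather than a copy of its value, so a later assignment is visible inside the function. The sketch below mirrors the PHP example in JavaScript; make_snapshot is a hypothetical helper that imitates the by-value capture by passing the value as an argument.

```javascript
var scalar = 5;

// Captures the variable `scalar`, not its current value.
var live = function () {
    return 'Scalar: ' + scalar;
};

// Hypothetical helper: the parameter `value` is a fresh variable per call,
// so the returned function keeps the value it was created with.
var make_snapshot = function (value) {
    return function () {
        return 'Scalar: ' + value;
    };
};

var frozen = make_snapshot(scalar);

scalar = 7;

console.log(live());   // Scalar: 7
console.log(frozen()); // Scalar: 5
```

PHP can get the first behavior as well by capturing by reference, with use (&$scalar).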

And there were other inconveniences:

<?php
class Foo {
   private function privateMethod() {
      return 'Inside private method';
   }

   public function bar() {
      $obj = $this;
      return function () use ($obj) {
         return $obj->privateMethod();
      };
   }
}

$obj = new Foo();
$closure = $obj->bar();
echo $closure();

Fatal error:  Call to private method Foo::privateMethod() from context '' in [...][...] on line 10

This doesn’t work because you cannot pass $this to a closure, and even with the trick above you still can’t access private methods. Remember, this was happening in PHP 5.3.

Introducing a closure of this kind seems strange to me. But this is not the first time something “strange” has been introduced in PHP, as I was saying earlier about anonymous functions. Sometimes it looks like a work in progress.

I think everybody was expecting closures more like the ones in JavaScript. I think JavaScript had a big influence in making this concept so popular.

In PHP 5.4 things changed, and we finally have closures that behave as we would expect:

class Foo {
   private function privateMethod() {
      return 'Inside private method';
   }

   public function bar() {
      return function () {
         return $this->privateMethod();
      };
   }
}

$obj = new Foo();
$closure = $obj->bar();
echo $closure(); // Inside private method

And it works!

You can even do:

unset($obj);
echo $closure();

and it will still work, because the object in which the closure was defined remains in memory until either the script finishes executing or a call like this is made:

unset($closure);
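The same lifetime rule can be observed in JavaScript. In this sketch (makeGreeter, target and owner are hypothetical names), dropping the only outside reference to the object does not break the closure, because the closure itself still references it.

```javascript
// Hypothetical factory: the returned function closes over `target`.
var makeGreeter = function (target) {
    return function () {
        return 'Hello from ' + target.name;
    };
};

var owner = { name: 'Foo' };
var greet = makeGreeter(owner);

owner = null; // drop the outer reference, similar in spirit to unset($obj)

console.log(greet()); // Hello from Foo
```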

For more details on how closures work in PHP 5.4, check out this post.