Chapter 4
Fun with Higher-Order Functions
Computer science is the science of abstraction creating the right model for a problem and devising the appropriate mechanizable technique to solve it. A. Aho, J. Ullman
In this chapter, we cover a very powerful programming paradigm: Higher-order functions. Higher-order functions are one of the most important mechanisms in the development of modular, well-structured, and reusable programs. They allow us to write very short and compact programs, by abstracting over common functionality. This principle of abstraction is in fact a very important software engineering princi- ple. Each significant piece of functionality in a program should be implemented in just one place in the source code. Where similar functions are carried out by distinct pieces of code, it is generally beneficial to combine them into one by abstracting out the varying part. The use of higher-order functions allows you to easily abstract out varying parts.
Since abstracting over varying parts to allow code reuse is such an important prin- ciple in programming, many languages support it in various disguises. Recently, the concept of higher-order functions has received particular attention since it features prominently in Googles MapReduce framework for distributed computing and in the Hadoop project, an open-source implementation to support large-scale data process- ing which is used widely. These frameworks are used to process 20 to 30 petabytes of data a day and simplify data processing on large clusters. Although mostly written in C++, the philosophy behind MapReduce and Hadoop is functional:
Our abstraction is inspired by the map and reduce primitives present in Lisp and many other functional languages.
-Jeffrey Dean and Sanjay Ghemawat
41 c B. Pientka Fall 2020
1 2 3
1 2 3 4
1 2 3 4
All these functions look very similar, and the code is almost the same. But obvi- ously, the sum depends on what function we are summing over! It is natural to ask, if we can write a generic sum function where we only give a lower bound a, and upper bound b, and a function f which describes what needs to be done in each iteration. The answer is, YES, we can! Here is how it works:
Chapter 4: Higher-Order Functions 4.1 Passing Functions
Higher-order functions also play an important role in code generation, i.e. pro- grams which generate other programs. Later on in the course, we will discover that higher-order functions also are key to understanding lazy computation and handling infinite data. Finally, we will see that they allow us to model closures and objects as you know them in object-oriented programming.
4.1 Passing Functions as Arguments
Functions form the powerful building blocks that allow developers to break code down into simple, more easily managed steps, as well as let programmers break programs into reusable parts. As we have seen early on in the course, functions are values, just as numbers and booleans are values.
But if they are values, can we write functions which take other functions as ar-
guments? In other words, can we write a function f:int * (int -> int) -> int? Would
this be useful? The answer is YES! It is in fact extremely useful!
Xk=b k=a
Consider the following implementation of the function which computes
1 let rec sumInts (a,b) = if (a > b) then 0 else a + sumInts(a+1,b)
k.
k=b k=b k=b Similarly, we can compute the function X k2 or X k3 or X 2k
k=a k=a k=a
let rec sumSquare(a,b) = if (a > b) then 0 else square(a) + sumSquare(a+1,b) let rec sumCubes (a,b) = if (a > b) then 0 else cubes(a) + sumCubes(a+1,b) let rec sumExp (a,b) = if (a > b) then 0 else exp(2, a) + sumExp(a+1, b)
(* sum: (int -> int) -> int * int -> int *)
let rec sum f (a,b) =
if (a > b) then 0
else (f a) + sum(f,a+1,b)
We can then easily obtain the previous functions as follows:
42 c B. Pientka Fall 2020
let rec sumInts (a,b) = sum (fun x -> x)
let rec sumCubes (a,b) = sum cube
let rec sumSquare(a,b) = sum square
let rec sumExp (a,b) = sum (fun x -> exp(2,x)) (a, b)
(a, b) (a, b) (a, b)
1 2 3 4 5 6 7 8 9
10 11 12
1 2 3
1 2 3 4 5 6 7
This seems to suggest we can generalize the previous sum function and abstract over the increment function.
Chapter 4: Higher-Order Functions 4.1 Passing Functions
Note that we can create our own functions using fun x -> e, where e is the body of the function and x is the input argument, and pass them as arguments to the function sum. Or we can pass the name of a previously defined function like square to the function sum. In general, this means we can pass functions as arguments to other functions.
What about if we want to sum up all the odd numbers between a and b?
(* sumOdd: int -> int -> int *)
let rec sumOdd (a, b) = let rec sum (a,b) =
if a > b then 0
else a + sum(a+2, b) in
if (a mod 2) = 1 then (* a was odd *) sum(a,b)
else
(* a was even *)
sum(a+1, b)
let rec sum f (a, b) inc =
if (a > b) then 0
else (f a) + sum f (inc(a),b) inc
Isnt writing products instead of sums similar? The answer is also YES. Consider computing the product of a function f(k) for k between a and b.
(* product : (int -> int) -> int * int -> (int -> int) -> int *)
let rec product f (lo,hi) inc =
if (lo > hi) then 1
else (f lo) * product f (inc(lo),hi) inc
(* Using product to define factorial. *)
let rec fact n = product (fun x -> x) (1, n) (fun x -> x + 1)
The main difference is two-folded: First, we need to multiply the results, and second we need to return 1 as a result in the base case, instead of 0.
So how could we abstract over addition and multiplication and generalize the function further? We could define such a function series? Our goal is to write this function tail-recursively and we use an accumulator {acc:int} in addition to a function f:int -> int, a lower bound a:int, an upper bound b:int, increment function inc:int -> int. We also need parameterize the function series with a function comb:int
-> int -> int to combine the results. By instantiating comb with addition (i.e. (fun x y
-> x + y)) we obtain the function sum and by instantiating it with multiplication (i.e. (fun x y -> x * y)) we obtain the function prod.
43 c B. Pientka Fall 2020
Chapter 4: Higher-Order Functions
4.1 Passing Functions
(* series:
->
*)
let rec series comb f (a,b) inc acc = let rec series (a, acc) =
if (a > b) then acc
else
series (inc(a), comb acc (f a))
in
letrecsumSeries f(a,b)inc=series(funxy->x+y)f(a,b)inc0 let rec prodSeries f (a,b) inc = series (fun x y -> x * y) f (a, b) inc 1
series(lo, r)
(combiner ( f ((a,b)
( inc
( acc
( result
: int -> int -> int) -> :int->int)-> :int*int)->
: int -> int) ->
: int) : int)
1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17
Ok, we will stop here. Abstraction and higher-order functions are very powerful mechanisms in writing reusable programs.
4.1.1 Example 1: Integral
Let us consider here one familiar problem, namely approximating an integral (see Fig. 4.1). The integral of a function f(x) is the area between the curve y = f(x) and the x-axis in the interval [a, b]. We can use the rectangle method to approximate the integral of f(x) in the interval[a, b], made by summing up a series of small rectangles using midpoint approximation.
dx
f(a+(dx/2)
a
b
a+(dx/2)
Figure 4.1: Integral between a and b
44 c B. Pientka Fall 2020
Chapter 4: Higher-Order Functions 4.1 Passing Functions
To approximate the area between a and b, we compute the sum of rectangles, where the width of the rectangle is dx. The height is determined by the value of the function f. For example, the area of the first rectangle is dx f(a + (dx/2)).
Then we can approximate the area between a and b by the following series, where l = a + (dx)/2 is our starting point.
Rabf(x)dx f(l)dx+f(l+dx)dx+f(l+dx+dx)dx+
= dx (f(l) + f(l + dx) + f(l + 2 dx) + f(l + 3 dx) . . .)
Assuming we have an iterative sum function iter_sum for reals, we can now com- pute the approximation of an integral as follows:
1 2
1 2
1 2 3
let rec integral f (a,b) dx =
dx *. iter_sum f (a +. (dx /. 2.0) , b) (fun x -> x +. dx)
This is very short and elegant, and directly matches our mathematical theory. Alternatively, we could directly sum up over the area without factoring out dx. In other words, we could directly implement the definition
Rabf(x)dx f(l)dx+f(l+dx)dx+f(l+dx+dx)dx+ as follows:
let rec integral f (a,b) dx =
iter_sum (fun x -> f x *. dx) (a +. dx /. 2.0, b) (fun x -> x +. dx)
4.1.2 Example 2: Half interval method
The half-interval method is a simple but powerful technique for finding roots of an equation f(x) = 0, where f is a continuous function. The idea is that, if we are given points a and b such that f(a) < 0 < f(b), then f must have at least one zero between a and b. To locate a zero, let x be the average of a and b and compute f(x). If f(x)>0,thenfmusthaveazerobetweenaandx. Iff(x)<0,thenfmusthavea zero between x and b. Continuing in this way, we can identify smaller and smaller intervals on which f must have a zero. When we reach a point where the interval is small enough, the process stops. Since the interval of uncertainty is reduced by half at each step of the process, the number of steps required grows as (log(l/t)), where l is the length of the original interval and t is the error tolerance (that is, the size of the interval we will consider small enough). Here is a procedure that implements this strategy: let rec halfint (f:float -> float) (a, b) t = letmid=(a+.b)/.2.0 in
if abs_float (f mid) < t then mid45 c B. Pientka Fall 2020Chapter 4: Higher-Order Functions 4.1 Passing Functions4 5 6 71 2 3 41 add(1, add(2, add(3, add(4, ? ) ) ) )To stop the successive application of the function we also need a base. In our case where we add all the elements, the initial element would be 0 (i.e. we replace ? with 0).We can observe that cumulatively applying a given function to all elements in a list is useful for many situations: we might want to create a string 1234 from the given list (where the initial element would be the empty string), or we might want to multiply all elements (where the initial element would be 1), etc.We also observe that we sometimes want to gather all the results in the reverse order. While the former function is often referred to as fold-right the latter one is often described as fold-left.1 add4(add3(add2(add10)))Implementing a function fold_right which folds elements from right to left is elseif (f mid) < 0.0then halfint f (mid, b) t else halfint f (a, mid) t4.1.3 Combining Higher-order Functions with Recursive Data-TypesWe can also combine higher-order functions with recursive data-types. One of the most common uses of higher-order functions is the following: We want to apply a function f to all elements of a list. (* map: (a -> b) -> a list -> b list *)
let rec map f l = match l with |[] ->[]
| h::t -> (f h)::(map f t)
This is the first part in the MapReduce framework. What is second part Reduce referring to? It is referring to another popular higher-order function, often called fold in functional programming. Essentially fold applies a function f over a list of el- ements and produces a single result by combining all elements using f. For example, lets say we have a list of integers [1;2;3;4] and we want to add all of them. Then we essentially want to use a function add which cumulatively is applied to all elements in the list and adds up all the elements resulting in
straightforward.
1 2 3 4
(*fold_right (a->b->b)->alist->b->b*) (* fold_right f [x1; x2; ; xn] init
returns
46 c B. Pientka Fall 2020
5 6 7 8 9
10 11
1 2 3 4 5 6
Finally, we revisit another popular higher-order function, called filter. This func- tion can be used to filter out all the elements in a list which fulfill a certain predicate p:a -> bool.
Chapter 4: Higher-Order Functions
4.2 Returning Functions
f x1 (f x2 (f xn init))
or init if the list is empty. *)
let rec fold_right f l b = match l with |[] ->b
| h::t -> f h (fold_right f t b)
(* filter: (a -> bool) -> a list -> a list *)
let rec filter (p : a -> bool) l = match l with |[] ->[]
| x::l ->
if p x then x::filter p l else filter p l
You will find many similar functions already defined in the OCaml basis library.
4.2 Returning Functions as Results
In this section, we focus on returning functions as results. We will first show some simple examples demonstrating how we can transform functions into other functions. This means that higher-order functions may act as function generators, because they allow functions to be returned as the result from other functions.
Functions are very powerful. In fact, using a calculus of functions, the lambda- calculus, we can cleanly define what a computable function is. The lambda calculus is universal in the sense that any computable function can be expressed and evaluated using this formalism. It is thus equivalent to the Turing machine formalism and was originally conceived by Alonzo Church (1903-1995).
The question of whether two lambda calculus expressions are equivalent cannot be solved by a general algorithm, and this was the first question, even before the halting problem, for which undecidability could be proved. The lambda calculus is also the foundation for many programming languages in particular functional pro- gramming.
4.2.1 Example 1: Currying and uncurrying
Currying has its origin in the mathematical study of functions. It was observed by Frege in 1893 that it suffices to restrict attention to functions of a single argument.
47 c B. Pientka Fall 2020
Chapter 4: Higher-Order Functions 4.2 Returning Functions
For example, for any two parameter function f(x, y), there is a one parameter func- tion f such that f0(x) is a function that can be applied to y to give (f0(x))(y) = f(x, y). This idea can be easily implemented using higher-order functions. We will show how we can translate a function f : a * b -> c which expects a tuple x:a * b into a function f : a -> b -> c to which we can pass first the argument of type a and then the second argument of type b. The technique was named by Christopher Strachey (1916 1975) after logician Haskell Curry (1900-1982), and in fact any function can
be curried or uncurried (the reverse process).
In general, we call currying the transformation of a function which takes multiple
arguments in form of a tuple in such a way that it can be called as a chain of functions each with a single argument
1 2 3 4 5
(* curry : ((a * b) -> c) -> (a -> b -> c) *)
let curry f = (fun x y -> f (x,y))
(* uncurry: (a -> b -> c) -> ((a * b) -> c) *)
let uncurry f = (fun (y,x) -> f y x)
We say that f: a -> b -> c is the curried form of f: a * b -> c. To create func- tions we use the nameless function definition fun x -> e where x is the input and e denotes the body of the function.
A word about parenthesis and associativity of function types Given a func- tion type a -> b -> c, this is equivalent to a -> (b -> c); function types are right- associative. This in fact corresponds to how we use a function f of type a -> b -> c. Writing f arg1 arg2 is equivalent to writing (f arg1) arg2; function application is left- associative.
Lets look at why this duality between the function type and function application makes sense; f has type a -> b -> c and arg1 has type a. Therefore f arg1 must have type b -> c. Moreover, applying arg2 to the function f arg1, will return a result of type c.
(farg1)arg2:0 c | {z }
b -> c
Some more examples of higher-order functions
function definitions are equivalent:
let id = fun x -> x (* OR *)
let id x = x
1 2 3 4 5
Recall that the following two