λv Example in PLT Redex

This page works through an extended version of λ_v. It goes at a faster pace than the tutorial, highlighting a few key parts of Redex. The extended language starts with the language in the Why PLT Redex? page, but adds numbers, addition, a conditional expression, multi-arity procedures, and treats + as a value. The complete version, with added test cases, is here: lam-v.rkt.

To follow along, start up DrRacket and enter this code fragment:

#lang racket (require redex)

Here is the grammar for the language, written as code for PLT Redex:

(define-language λv (e (e e ...) (if0 e e e) x v) (v (λ (x ...) e) number +) (E (v ... E e ...) (if0 E e e) hole) (x (variable-except λ + if0)))

The traditional mathematical notation translates directly into PLT Redex: each line in a BNF description of λ_v's grammar becomes one line in language. The first part of the line specifies the non-terminal, and the remaining expressions on the line are productions for that non-terminal. Each production is written as a pattern, where non-terminals written in the pattern refer to the non-terminals in this grammar, and other symbols in the grammar are treated as literals (eg +, if0, and λ). The pattern

(variable-except λ
+ if0)

matches any symbol except those listed (in this case, λ, +, and if0).

The reduction rules also translate literally into a use of the reduction-relation form.

(define red (reduction-relation λv (--> (in-hole E (+ number_1 number_2)) (in-hole E ,(+ (term number_1) (term number_2))) "+") (--> (in-hole E (if0 0 e_1 e_2)) (in-hole E e_1) "if0t") (--> (in-hole E (if0 number_1 e_1 e_2)) (in-hole E e_2) "if0f" (side-condition (not (= 0 (term number_1))))) (--> (in-hole E ((λ (x ..._1) e) v ..._1)) (in-hole E (subst-n (x v) ... e)) "βv")))

The first reduction rule defines the semantics of addition. The pattern in the first argument to the first --> matches expressions where a syntactic term of the form (+ number number) is the next step to be performed. It also binds the pattern metavariables E_1, number_1, and number_2 to the context and +'s operands, respectively. In general, pattern variables with underscores must match the non-terminal before the underscore and bind variables to be used in the consequence of the rule.

The second argument to --> is a new S-expression. The in-hole pattern that appears on the left-hand side of the rule is dual to the in-hole term constructor used on the right-hand side. The former decomposes an expression into a context and a hole, and the latter composes an expression from a context and its hole's new content. The addition operation is replaced by the sum of the operands; the comma is used to escape to Racket and use its + operator (numeric constants in S-expressions are identical to the numbers they represent). The term form is PLT Redex's general-purpose tool for building S-expressions. Here we use it only to dereference pattern variables. Finally the "+" gives a name to this reduction rule; the names are used in the GUI.

The second and third rules specify how if0 behaves. If its first argument is 0, the if-expression reduces to its first alternative. The third rule uses a side-condition to ensure that it only applies when the test position of the if0 expression is non-zero. This side-condition is required, since the cases in a reduction relation are applied in an un-ordered fashion. If multiple rules match, the system just produces multiple outcomes.

The final rule specifies that function application rewrites via substitution. In general, ... means to duplicate the expression before it as many times (including zero) as is necessary. In this case, the subscript on the ellipses mean that the number of duplications must match. In other words, the number of formal parameters must match the number of actual parameters.

Although substitution functions are generally omitted when writing a paper, PLT Redex requires you to write it out. The substitution function subst-n accepts an arbitrary number of variables and recursively calls subst, which performs the substitution.

(define-metafunction λv subst-n : (x any) ... any -> any [(subst-n (x_1 any_1) (x_2 any_2) ... any_3) (subst x_1 any_1 (subst-n (x_2 any_2) ... any_3))] [(subst-n any_3) any_3])

The subst function is where the real work of substitution happens. Here is its definition.

(define-metafunction λv subst : x any any -> any ;; 1. x_1 bound, so don't continue in λ body [(subst x_1 any_1 (λ (x_2 ... x_1 x_3 ...) any_2)) (λ (x_2 ... x_1 x_3 ...) any_2) (side-condition (not (member (term x_1) (term (x_2 ...)))))] ;; 2. general purpose capture avoiding case [(subst x_1 any_1 (λ (x_2 ...) any_2)) (λ (x_new ...) (subst x_1 any_1 (subst-vars (x_2 x_new) ... any_2))) (where (x_new ...) ,(variables-not-in (term (x_1 any_1 any_2)) (term (x_2 ...))))] ;; 3. replace x_1 with e_1 [(subst x_1 any_1 x_1) any_1] ;; 4. x_1 and x_2 are different, so don't replace [(subst x_1 any_1 x_2) x_2] ;; the last cases cover all other expressions [(subst x_1 any_1 (any_2 ...)) ((subst x_1 any_1 any_2) ...)] [(subst x_1 any_1 any_2) any_2])

It accepts a variable, an expression to be substituted, and an expression to substitute. It is defined in cases that are based on the structure of the expression being substituted into (the last argument), but rather assuming that the input is an e, it makes no assumptions about the structure of its input, using Redex's any pattern (a built-in pattern that matches any term).

The first case avoids substituting x_1 into the body of a λ-expression when it binds x_1. The second case is a general purpose capture-avoiding step. It avoids capture by renaming the variable x_2 to x_new in body of the λ-expression being substituted into. The where clause chooses the particular name bound to the meta-variable x_new. The unquote beginning the clause's right-hand side escapes back to Scheme to use the variables-not-in function, which accepts an expression and a list of variables. It returns a list of variables that do not occur in its first argument, preferring to return the corresponding variable from its second argument, if possible. If not, it returns a symbol whose printed form begins with the printed form of the corresponding symbol in the second argument.

When subst recurs in the body, it uses the subst-vars helper function, rather than using subst.

(define-metafunction λv subst-vars : (x any) ... any -> any [(subst-vars (x_1 any_1) x_1) any_1] [(subst-vars (x_1 any_1) (any_2 ...)) ((subst-vars (x_1 any_1) any_2) ...)] [(subst-vars (x_1 any_1) any_2) any_2] [(subst-vars (x_1 any_1) (x_2 any_2) ... any_3) (subst-vars (x_1 any_1) (subst-vars (x_2 any_2) ... any_3))] [(subst-vars any) any])

The subst-vars does a capturing substitution. It blindly replace all occurrences of the variables in its first argument that appear anywhere in its third argument. It is safe in this case because we know that the new variables are freshly generated and thus cannot capture or be captured.

If the subst function did not use subst-var to recur in the second case, then it would be correct but would take exponential time, in the worst case. In particular, if the input is a series of nested λ-expressions that bind variables that are all in the expression that is to be substituted, each λ-expression will generate two recursive calls, each of which will also generate two recursive calls, etc, leading to an exponential number of recursive calls.

The third and fourth cases of subst handle variables, first when the variable being substituted matches the variable being substituted for and second when it does not.

The remaining rules recur into the structure of the term, but treating it as an arbitrary S-expression. Accordingly, if you were to add new, non-binding forms to e, this substitution function would not need to change.

With traces, we can visualize reduction sequences in λ_v. Evaluating the expression:

(traces red (term ((λ (n) (if0 n 1 ((λ (x) (x x)) (λ (x) (x x))))) (+ 2 2))))

produces a window containing a graph of the reduction sequence starting at the term passed to traces. The nodes are free form, and have been rearranged vertically; in addition the mouse was over the last term, causing the edges into and out of that term to be a darker blue.

Once we've explored the reduction system a little bit, we might formulate a conjecture about it. For example, one might believe that each non-value expression reduces. To test that conjecture using redex-check, we can define two functions, one that tests valueness and one that tests if an expression reduces:

;; value? : expression -> boolean ;; returns true if the input matches ;; the 'v' non-terminal (define value? (redex-match λv v)) ;; single-step? : expression -> boolean ;; returns true if 'e' steps to exactly one term (define (single-step? e) (= (length (apply-reduction-relation red e)) 1))

and we can put them together with redex-check

(redex-check λv e (or (value? (term e)) (single-step? (term e))))

Running this immediately falsify's the claim. Although this process is random, it is extremely likely to produce output very similar to the following

counterexample found after 1 attempt: i

indicating that free variables do not reduce. Fixing that is a simple matter of adding a rule like this one:

(--> (in-hole E x) (error "free var"))

but this still is not enough. See how many more rules are missing by downloading lam-v.rkt and trying it out!