[SOLVED] C Scheme Haskell Java scala compiler Macquarie University, Department of Computing

$25

File Name: C_Scheme_Haskell_Java_scala_compiler__Macquarie_University,_Department_of_Computing.zip
File Size: 781.86 KB

5/5 - (1 vote)

Macquarie University, Department of Computing

COMP332 Programming Languages 2019

Lintilla, a simple functional programming language.

Introduction

Lintillahttp:hitchhikers.wikia.comwikiLintilla is a language that contains elements from mainstream functional languages such as MLhttps:en.wikipedia.orgwikiStandardML, Haskellhttps:www.haskell.org, Scalahttps:www.scalalang.org and Schemehttps:en.wikipedia.orgwikiSchemeprogramminglanguage. It uses a syntax that has been borrowed from Doms favourite systems programming language Rusthttps:www.rustlang.org.

The description here provides a brief overview of the Lintilla language and its syntax. Further detail about aspects such as checking the validity of names or types and translating a program into an executable form will be added when we work on the Lintilla code generator in assignment 3.

The Lintilla language

The basic unit of a Lintilla program is the expression. There are no statements, their place is taken by expressions whose type is the special type unit. You can think of unit as a type which has only one value, called , and it is this value that is returned by those constructs which otherwise might be regarded as returning nothing like a procedure or a while loop.

A Lintilla program is comprised of a sequence of expressions of type unit separated by semicolons. For example, here is a simple program that binds the result of a simple computation to a variable and then prints the contents of that variable:

let x234;
print x

When this program is run it will print the value of the expression: the number
14.

Block expressions are used to build programs out of smaller expressions. A block expression is a pair of curly braces containing zero or more expressions separated by semicolons ;. The idea is that when a block is executed each expression in that block is executed in turn, and the value computed by the last expression becomes the value of that block. For example, here is a program consisting of a block expression that declares two variables and uses their values in the expression whose values if returned from the block:

let x
let a5;
let ba1;
ab
;
print x

This program will print the result of multiplying a by b, so 30 will be printed. Here the name a can be used in the definition of b since b is defined later, but that is a name analysis issue, so we dont need to worry about it for the moment. Notice, in particular, that the semicolon is used as an expression separator, not a line terminator; so the last line in this block isnt followed by a semicolon. An empty blockhas type unit.

Variables are immutable, in the sense that they are bound to a value when they are declared and they maintain the same value from there to the point where they go out of scope. An variable is declared and bound to a value in a let expression:

let domsvar10;

The scope of a variable extends from the point where it is declared to the end of the smallest enclosing block; or to the end of the program if the variable is bound at the toplevel.

Functions may also be declared, both at top level or within a block. For example here is the familiar factorial function given as a recursive function in Lintilla:

fn factorialn: intint
if n01elsenfactorialn1

This declaration defines a new function which takes a single parameter of type int and returns an int as its result. It introduces a new immutable variable called factorial which, as a first order approximation, we can think of as pointing to the code generated by compiling the function body. The result returned by a function is the value returned when its body, a block, is executed.

Functions may take more than one parameter:

fn modn: int, m: intint
nnmm
;

printmod23,10 prints 3, the remainder when 23 is divided by 10.

A procedure is a function which returns no value, or more precisely it returns the unique valueof type unit. Procedures are declared by omitting the return type specification in a fn declaration:

let x20;

fn printnplusxn : int
print xn
;

printnplusx23 prints the value 43

The same effect can be gained by using the the explicit return type unit:

fn printnplusxn : intunit
print xn

We do not distinguish declarations from other kinds of expressions for the purposes of parsing. That doesnt mean that it will always make sense to use a declaration wherever any other kind of expression is appropriate, but just that this isnt a decision that the parser will make. The places where declarations can legally occur will be determined, and enforced, by the Lintilla type checker.

Control expressions we have already seen the principle control expression provided by Lintilla, the if expression. We might note that the language specifies that if expressions must have both a then and an else clause and that these must both be blocks enclosed in curly braces.

Expression forms are interchangeable as long as they have the correct type. E.g., anywhere we can put a number can also take a block or some other kind of expression that evaluates to a number. For example, here is an artificial program that uses blocks nested inside an arithmetic operation:

let a3;
a1

let b10;
b1

Or more concisely:

let a3; a1let b10; b1

Weve seen a few different forms of expression: numbers, addition expressions, multiplication expressions and function call expressions. There are also other arithmetic operations, boolean values, boolean literals, relational and logical operators, and conditional expressions. The complete syntax of Lintilla is given below.

Finally Lintilla allows programmers to insert comments into their code. These begin with two slashesand extend from there to the end of the line. Comments are ignored by the compiler, which treats them just like white space.

Extensions

Programmer demand in recent moths has lead to Lintilla being extended to incorporate the following new constructs:

Logical operators and , orand notwhose semantics follows the usual conventions regarding short circuit evaluation of logical operators.

Mutable arrays which can contain values of any Lintilla type, including function and array types hence multidimensional arrays, which are homogeneously typed all entries in an array have the same type. Arrays are created using an array operator, dereferenced using the pling operator ! as in BCPL, extended using the append operationas in Scala, and updated to contain new values using the assignment operator :.

For loops these are genuine for loops in the style of Pascal and its derivates rather than the kind of logically controlled for loop found in C and Java. Lintilla also provides:

a loop construct which provides for the early termination of the current iteration of the smallest enclosing for loop and starts the next iteration of that loop, and
a break construct which immediately terminates the execution of the smallest enclosing for loop entirely, and returns control to the command that immediately follows that loop.

Lintilla syntax

Here is a complete contextfree grammar for the Lintilla language:

program : exp ; exp.

exp : assignexp
appendexp
logexp
ifexp
printexp
forexp
loopexp
breakexp
letdecl
fndecl.

logexp : logexplogexp
logexplogexp
logexplogexp
logexplogexp
logexplogexp
logexplogexp
logexplogexp
logexplogexp
logexp ! logexp
logexp
logexp
false
true
integer
array tipe
lengthexp
app
block
exp .

app : appexp , exp?
idnuse.

block :exp ; exp? .

assignexp : logexp : logexp

appendexp : logexplogexp

forexp : for idndeflogexp to logexp step logexp? do block

loopexp : loop

breakexp : break

ifexp : if exp block else block.

printexp : print exp.

letdecl : let idndefexp.

fndecl : fn idndefparamdecl , paramdecl? tipe? block.

paramdecl : idndef : tipe.

Finally, the syntax of types:

tipe : unit
bool
int
fntipe , tipe? tipe
array tipe
tipe .

We use the word tipe instead of type since the latter is a Scala keyword which prevents us from using it as the name of a parser in our code. A function type is specified using the syntax

fntype1, type2, , typenrestype

which denotes the type of a function that takes n parameters of types type1,, typen and returns a result of type restype. We might note that a procedure has a type of the form

fntype1, type2, , typenunit

and that Lintilla does not allow variables or parameters to have type unit.

Precedence and associativity

The grammar above is not immediately suitable for encoding as a parser. The logexp nonterminal is ambiguous since it makes no allowance for precedence and associativity of the operators. You should rewrite that grammar production to implement the following precedence and associativity rules:

The following expression constructs have precedence as shown from lowest to highest with constructs on the same line having the same precedence:

1. logical andand or
2. equaland lessthan
3. additionand subtraction
4. multiplicationand division
5. array dereferencing !
6. all other kinds of expression

The constructs in the highest precedence category include unary negation, logical not, blocks in curly braces and expressions in parentheses.

All binary expression operators are left associative, except for the relational operators equality and less than which are not associative.

Semantic rules

The following sections explain the semantic rules of Lintilla, including its type rules. Many languages allow name and type analysis to be handled separately. Lintilla is in this category because we can first identify which entity a name represents before we check whether typed entities are used legally.

The framework bundle for assignment 3 contains a semantic analyser module which implements the complete suite of these rules.

Name analysis

Like most languages, Lintilla name occurrences come in two varieties: defining occurrences represented by IdnDef nodes in the tree, and applied occurrences represented by IdnUse nodes. Defining occurrences appear as the defined name in let definitions and function definitions, and as the names of function parameters. All other identifier occurrences are applied and occur as plain names in expressions.

For example, in the following program the occurrences of f and g at the beginnings of the fn definitions are defining occurrences, as are the appearances of a and b in their parameter sections.

fn fa : intint
a1
;

fn gb : fnintintint
b5
;

print gf

The remaining identifiers are applied occurrences: the a in the second line, the b in the sixth line, and both g and f in the last line.

The scoping rules of Lintilla are very simple. An identifier defined in a block by a let or fn declaration can only be used later in that block including in nested blocks. Its scope extends from the point at which it is defined to the end of the smallest enclosing block; or to the end of the program if the identifier is defined at the toplevel. Thus, in the following program, the p on the last line is illegal because it is outside the scope of the definition of p within the inner block. The p on the third line is legal because it is inside the scope of the value definition of p.

let b
let p1;
p1
;
print p2

A variable cannot be redefined within the same block, so the second line in the following code would cause an error

let x1;
let x2

it can, however, be redefined in a inner block so the following code is legal:

let x1;

let x2;
print x
;
print x

In the code from the let on the fourth line and theon the sixth line the original binding of x to 1 is shadowed by a new binding of x to 2; we might say that this is a hole in the scope of the binding made at line 1. So when executed this code will print the number 2 followed by the number 1.

In a let binding the scope of the variable being does not include the initialisation expression of that let. So in the following example

let x10;

let xxx;
print x
;
print x

the x referred to in the initialisation expression xx is that bound in the let on line 1. So when executed this code will print the number 100 followed by the number 10. This rule disallows the definition of let bound identifiers whose definitions depend on the value being bound, something that would be ill defined why?.

On the other hand the function name identifier bound in a fn declaration is in scope in the body of the function being defined. This allows us to define directly recursive functions such as our old friend the factorial:

fn factn: intint
if n01elsenfactn1

In a fn declaration each parameter declares the binding of a new identifier whose scope extends throughout the body of the declared function. That scope also encompasses the parameter declarations to the right of a given one. The upshot of this rule is that it is illegal to give two parameters the same identifier or to attempt to rebind that identifier directly within the body of the function although it can still be rebound inside an inner block within the body.

It should be noted that the actual scope of the function identifier bound in a fn declaration is that of the block within which that declaration is situated. Its body is a smaller scope within that enclosing one. This means that it is legal to rebind a function identifier within its own body. So for example, the following is semantically correct:

fn l
let l10;
print l

The downside of rebinding the name of a function within its own body is that it is now not possible to make a recursive call.

Finally a for loop also starts a new scope, which includes its control variable, a defining occurrence, and extends to the end of the body of that loop. Its from, to, and step expressions are not included in that new scope, however, and they are evaluated in the enclosing scope. The control variable of a for loop may not be rebound in the body of that loop.

Type analysis

Lintilla has four forms of type: integers, Booleans, a trivial type and functions. The first three are indicated by the int, bool and unit type names respectively. A function type is indicated by the fn type constructor:

fntype1, , typentype

This denotes the type a a function which takes n parameters of types type1, type2, , typen and returns a result of type type. For example, the type

fnint, boolint

is the type of a function which takes two parameters, of types int and bool, and returns afunction of type int. It is legal for a function to take no parameters, in which case it would have a type of the form:

fntype

The unit type is used to denote the return type of functions, and other builtin constructs, that perform an action but dont actually return a result. This is often the case for functions whose purpose is to perform a side effect, like printing to the terminal. For example, the purpose of a print expression is to print a value computed by its operand, but it doesnt itself return a value. So an expression like

print 10x

has type unit. Builtin operations of type unit play the role that statements play in other languages. Functions with return type unit are sometimes refereed to as procedures.

The type of a variable introduced in a let declaration is the type of the
expression to which it is bound in that definition. For example, in the code

let x1;
print x

we know that x is has type int since it is bound to an expression whose
type is integer.

In a fn declaration all types of defined identifiers, both parameter identifiers and the identifier of the function itself, are given explicitly. The type of each parameter is given as a type expression, separated from the declared identifier by a colon. The function type itself is constructed using the list of specified parameter types and the return type given in the function signature. So in the following function declaration

fn fibres1 : int, res2 : int, count : intint
if 0count
fibres2, res1res2, count1
else
res1

the parameter identifiers res1, res2 and count are all declared to have the type int and the type of the function identifier fib is fnint, int, intint. If a function declaration lacks a return type specification thepart, that is if it is a procedure declaration, then it is assumed to have return type unit. So the procedure

fn print4timesn : int
print n;
print n;
print n;
print n

has type fnintunit. It is not legal for a function to take a parameter of unit type, consequently it is not legal for a function type to have the type unit in a parameter position.

Functions may have parameters and return values of which are of a function type, for example:

fn iteratef : fnintint, count : intfnintint
fn itern : int, count : intint
if count1
n
else
iterfn, count1

;

fn resn : intint
itern, count
;

res

Every legal expression in Lintilla has a type. The following points describe how the type of an expression is determined and give any typing conditions that are imposed on the types of values that occur as expression operands:

The Boolean expressions true and false are of type bool.
Any constant integer expression is of type int.
Any constant array expression array t where t denotes any Lintilla type has type array t.
The type of an applied identifier occurrence is the type given in its corresponding defining occurrence.
All of the expressions at the top level of a program must have type unit.
All of the expressions in a block, except for the last one, must have type unit. The type of a block expression is the type of the final expression in its body or unit if the block is empty.
fn and let declaration expressions have type unit.
print expressions have type unit.
for, loop and break expressions have type unit.
assignment : and array appendexpressions have type unit.
The initialisation expression in a let, to the right of the , can have any legal type except for unit.
In a function declaration parameters can be given any legal type except for unit and its return type can be any legal type. The declared return type must equal the type of the block given as the function body.
A function call fe1,,en must satisfy the following rules:
the expression f must have a function type fntype1,type2,,typemtype,
the number of actual parameters n must match the number of formal parameters m in the function type of f,
each actual parameter er r1,,n must have the corresponding parameter type typer given in the function type of f, and
the expression fe1,,en itself has type given by the return type type given in the function type of f.
The type of the condition expression in an if expression must be bool. The types of the then and else branches of the if expression can be of any type, but they must be of the same type as each other. The type of the if expression itself is the common type of its then and else blocks.
the left and right hand operands of an assignment : must have the same type.
the right hand operand of an appendcan have any type t, and its left hand must have the corresponding type array t.
the from, to and optional step expressions of a for loop must have type int.
the optional step expression of a for must be a constant expression.
the control variable of an for expression has type int.
the block that comprises the body of a for must have type unit.
The operands of addition, subtraction, multiplication and division expressions must be be of type int and their result is also of type int.
The operands of and, or and not expressions must be of type bool and their result is also of type bool
The operands of a lessthan expression must be of type int and its result is of type bool.
The operands of an equals expression must be of the same type either int or bool and its result is of type bool.
In all other cases not listed above, an expression can be of any type.

Dominic Verityhttp:orcid.org0000000241376982
Last modified: 22 October 2019
Copyright c 2019 by Dominic Verity and Anthony Sloane. Macquarie University. All rights reserved.http:mozilla.orgMPL2.0

Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[SOLVED] C Scheme Haskell Java scala compiler Macquarie University, Department of Computing
$25