Wiki
Clone wikiinf225 / 2016 / Exercise 4
Exercise 4 (Compulsory)
You objective in this exercise is to work with the code we've built in the lectures, and turn it into a (reasonably) finished implementation of a simple language.
When you're done, you should be comfortable with:
- Passing around environments for context information
- Building basic evaluators and typecheckers
- Program execution with and without a store
- The distinction between abstract and concrete syntax
You may also realise the reasoning behind some design choices in languages you know (such as why variables from the surrounding scope must be final
in Java lambdas and anonymous classes).
Tasks
This is what you have to do. Read below for helpful information about how to do it. You are free to make your own decisions (including doing parts in another language if you wish), and even simplify the tasks if you need to. Once you're done, you show and explain your solution to Anya or Anna who will let you know whether it's approved or if you need further improvements. Deadline: you should get one us to review your solution before we leave for a conference at the end of the month, i.e., October 27th at the latest.
We may/will expand the hints section as we go along, based on your questions. Please ask at the lectures, by email or via Facebook message. Feel free to discuss with other students as well.
- Retrieve the code simple language implementation from the lectures, and check that you can run it.
- Write an evaluator for abstract syntax trees of our language (as defined in SimpleAST.rsc). It should follow the basic style of the evaluator from Simple.rsc (i.e., a set of recursive functions with an environment passed around), except it works on the AST datatype instead of concrete syntax. If you want a bigger challenge, write the evaluator for the imperative language (ImperativeAST.rsc) instead, where you also have to deal with a store / memory.
- Finish the type checker in ImperativeTypeChecker.rsc. You should finish these features:
- The current typechecker allocates store locations sequentially. This means that recursive invocations of a function will overwrite local variables and parameters. Instead, use the environment to keep track of how many variables you've made so far within the current function, and then make storage locations relative to the stack frame.
- Our language only supports single-argument functions. Make it work also with arbitrary-length argument lists.
- Overloaded functions are nice. Make it so that functions are selected not just based on name, but also based on the types of the arguments. In the AST, you can distinguish the functions based on the store location.
- Anonymous functions (lambdas) are all the rage these days – even Java and C++ have them now. Add this to the language, and update the evaluator and typechecker.
- You should probably expand the language a little bit. In particular,
if
is missing, so it's difficult to make interesting programs. You should also consider changing the syntax of Program so that a program can have a list of function definitions before the expression to be evaluated (instead of having to make lots of nestedlet
/let fun
s).
Detailed hints
1.
You find the code at this Git URL: https://bitbucket.org/sle-uib/inf225.git
To import in Eclipse, choose File -> Import -> Git -> Projects from Git, choose Clone URI and enter the URL. Click next a few times. It'll ask you about which project to import, and the project name is inf225.h16
(is probably preselected). Hit finish, and the code should show up in your project explorer.
For example, these things should work in the Console (provided you start the console in the inf225.h16 project):
rascal>import SimpleTypeChecker; ok rascal>import Simple; ok rascal>typecheck((Expr)`let x = 5 in x end`, ()) ... rascal>eval((Expr)`let x = 5 in x end`, ()) ...
2. Evaluator
Getting Started
- Make a new Rascal module (File -> New), name it SimpleEvaluator or something.
- You probably need the same imports as in Simple.rsc, plus you need import
SimpleAST
itself. - Your evaluator functions should look something like this:
public Value eval(Plus(e1, e2), Env env) { ... }
- Copy relevant definitions for Name, Value and Env from
Simple.rsc
- For testing, you can write programs directly in abstract syntax. For example,
Let("x", Int(5), Plus(Var("x"),Var("x")))
.
Hints
Note that you're now working in abstract syntax, rather than the concrete syntax in the evaluator we had in the lectures. This means we no longer care about program text; there's no parsing going on, and we don't need a grammar. (We assume that another component in our system has taken care of parsing and converting to an abstract syntax tree.)
- Values and patterns look like this:
Int(5)
,Int(a)
,Times(a, Int(b))
and so on, rather than(Expr)`5`
,(Expr)`<NUM a>`
and(Expr)`<Expr a>*<NUM b>`
. - You don't need to worry about converting numbers from strings to ints, since they are already ints in the AST.
- You can also make
eval
functions with a normal argument list, rather than using pattern matching. E.g.,int eval(ExprTree t, Env env)
, and then use aswitch
statement to deal with the various cases of AST nodes. - To support other types than
int
, you must update the return value of youreval
functions. You best option is to make a new data typedata Value = IntVal(int i) | StrVal(str s) | ...;
, and use that as your return values. This makes it easy to distinguish values of different types. For example, if you have two valuesv1
andv2
, you can check if they are both ints:if(<Int(i1), Int(i2)> := <v1, v2>) { return Int(i1+i2); }
You can write any program directly in the abstract syntax; it's just a bit more cumbersome than when you have a nice concrete syntax. You may find it easier to obtain abstract syntax trees if you repurpose the old concrete syntax-based evaluator and make it return ExprTree nodes – i.e., convert from concrete syntax to abstract syntax.
- If you want to try and make an imperative (store-oriented, rather than value-oriented) language, there are two approaches:
- For very simple updatable variables, you can get by with each evaluation step returning an updated environment. This is what Rascal does internally, and allows variables to change (i.e., be rebound to different values) without values/objects to change. See this short hint on semantics of updatable variables
- In the more advanced case, you need to have a storage location allocated to each variable. Your store can be a list of values, and whenever you declare a new variable you must find a new storage location for it. If you work with the
ImperativeAST
output from the typechecker (which you'll need to finish), the work of finding storage locations has already been done.
3.
Getting Started
- Read the stack frame chapter hand-out.
- The point is that each variable should have a store location on the current function's stack frame. The stack frame has space for all local variables in a function; when a function is called (which would happen in e.g. an evaluator), space on the stack is made available for the stack frame, and local variables are accessed relative to a frame pointer (which points to the stack frame in memory).
- Regarding store locations, there are two approaches: for a language with only local variables, you can keep track of the currently allocated store locations in the environment (e.g., just make a fake variable with a special name such as "__store_top", for instance). With global variables (for instance, storing a list of top-level functions), you can use the same scheme as the typechecker already uses (a global counter).
- For multi-argument functions, you also need to update the syntax. Remember that you can use
{Expr ","}*
to indicate a comma-separated list of zero or more arguments. You use the same syntax in patterns for your type-checking functions, e.g.,typecheck(`<ID f>(<{Expr ","}* args>`), env)
. The arguments will then be a list, which you can process with afor
-loop or list comprehension ([f(x) | x <- args]
). - For overloading, the environment will contain a list or set of functions rather than a single one. You must first determine the argument types, and then go through the list of candidates and pick the one where the argument types match.
- If you use abstract syntax instead of concrete syntax, assume that the type of variables are not yet determined. E.g., that variables are
Var("x", Unknown())
and you're supposed to resolve them toVar("x", Int())
.
Hints
- If you have functions nested within function (which the language syntax allows), and the inner function can access the outer function's variables, this exercise becomes a lot harder. You don't have to solve this issue, but at least try to think about it – could this be one of the reasons why Java requires that local variables that are used in an inner class are marked as
final
? - What happens if you have allocated space for an object on the stack, and the function returns a reference to it?
- A function bundled with its environment and variable storage (if any) is known as a closure. Languages such as Scheme allow functions to update non-local variables (e.g., variables from an outer scope). This makes the simple stack scheme we've used here unusable, since the life time of a variable will be dynamic rather than determined simply by the nesting of function calls. This is a useful feature – you can use it, e.g, to implement object orientation, but it can be tricky to implement. This is an example that makes use of closures with updatable variables:
let makeCounter() = // makes a function that gives a sequentially increasing number let counter = 0 in // a local counter variable let f() = { counter = counter + 1; return counter; } in f // return the inner function end end end
4.
Getting Started
Note that anonymous functions are essentially equivalent to making a regular function and then returning it.
Hints
Updated