CS 440 HW4
Updated 3/13 to give more specifics about substitution in problem 3 and fix a Markdown conversion error.
In this homework you will be writing a typechecker for the MiniML language,
for which you wrote an interpreter on HW3.
See the language spec and README.md
in your starter code repo for more information on MiniML and the codebase.
Note that you will be using the same starter code that you used for HW3.
Your finished toplevel will also use your interpreter code from HW3. You
will not be graded on the interpreter part, so don't worry if your interpreter
code has bugs (but we do still encourage you to fix any bugs we or you have
found in your code, so you learn from them!)
Your code for this homework will go in unify.ml and typecheck.ml .
Collaboration
This assignment is to be completed individually, subject to the academic
integrity policy on the course website.
Make sure you read and understand these.
At the end of this assignment, we will ask you to list any students you
discussed the homework with, as well as any websites or other resources
(please list specific URLs when applicable) you referred to.
Important notes about grading
Your submissions will be graded automatically for correctness, as well as
checked by hand. Please keep our autograder (and Xincheng, who has to run it)
happy:
Do not change or remove any lines that start with (*>* .
Do not change the names or types of any functions defined in the starter code. This includes adding rec to functions that don't have it.
Unification
The file unify.ml contains code to unify two types, which is used in type checking.
Recall that unification takes two types τ1 and τ2, potentially with unification variables like ?1 .
These are "holes" that can be unified with anything. Each instance of a particular variable must be filled with the same thing though (for example, if ?1 appears in both types, you must replace ?1 with the same type in both).
Recall that unification returns a substitution, which is a list of pairs (?i,τi) meaning "replace ?i with τi". For a substitution σ, we'll write [[σ]]τ to mean "τ with all the replacements in σ". So, for example,
[(?0, int); (?1, string)](?0 -> ?1) = int -> string
and
(?0, list(?1)); (?1, int)]?0 = int list
The OCaml definition for substitutions is given in unify.ml :
type substitution = (int * typ) list
In that definition, ?0 is represented as just the integer 0. We also include a function sub_all: substitution -> typ -> typ , where sub_all s t computes [s]t, and a function sub_ctx: substitution -> ctx -> ctx where sub_ctx s c substitutes s into all of the types in the context c .
Recall the (recursive) unification algorithm from class:
Algorithm: Unify(τ1,τ2)
Returns a substitution or an error.
- If τ1 and τ2 are both the same base type (int, string, bool, or unit), return the empty list [].
- If τ1 = τ2 = ?i (i.e., they are the same unification variable), return [].
- If τ1 = ?i, then:
a. If ?i occurs in τ2, then error.
b. Otherwise, return [(?i, τ2)].
- If τ2 = ?i, then:
a. If ?i occurs in τ1, then error.
b. Otherwise, return [(?i, τ1)].
- If τ1 = list(τ1') and τ2 = list(τ2'), then return Unify(τ1', τ2').
- If τ1 = τ1' -> τ1'' and τ2 = τ2' -> τ2'', then let σ = Unify(τ1', τ2') and return σ ++ Unify([σ]τ1'', [σ]τ2''), where ++ is the concatenation of substitutions.
- Similar to above for if τ1 = τ1' * τ1'' and τ2 = τ2' * τ2''.
- Otherwise, error.
The file unify.ml also contains one other function you might want:
new_type : unit -> typ generates a new type consisting of just
a unification variable ?n where ?n hasn't been used in the program
before.
This is useful for when we need to "guess" types to unify later.
Exercise
- Implement the function
unify : typ -> typ -> typ in
unify.ml , following the algorithm above. Remember that we implemented
two cases in class on Mar. 9, and the code is posted next to that
lecture on the schedule on the website, and you are welcome to use this
code.
Type Checking
The file typecheck.ml includes code to typecheck MiniCaml
programs.
It uses the unification function you wrote above.
We did most of the work, you just need to implement a few cases.
Implement the function type_of_const : const -> typ in
typecheck.ml .
The function should return the type of the given constant.
You shouldn't need to do any unification.
In the case of Nil , which represents the empty list [] ,
the type is of course t list for some t . What's t ?
We have no way of knowing at this point, so you'll just have to take a
guess...
Fill in the remaining cases of typecheck_exp in
typecheck.ml . This function takes an expression and returns a pair of
its type and a substitution. You can take a look at the cases we've
implemented so far, many of which we want over in class.
You need to implement the cases for EApp , EUnop and ECons (these
are listed at the end of the function and currently have
raise Unimplemented as a placeholder).
Here's an outline of how to implement these cases:
EApp (e1, e2) :
- Infer the type of
e1 ; call this t1
and the resulting substitution s1 .
- Infer the type of
e2 (using s1 on the context); call this type t2 and the resulting substitution s2 .
sub_all s2 t1 must be the same as TArrow (t2, tr) (that is, a function from
t2 to tr ), where tr is some type, but we don't know what type
yet.
- The type of the application expression is
tr . Before you return it, apply the substitution you got from unify to it.
EUnop (o, e) :
- Infer the type of
e ; call this t1 .
- Use
type_of_unop to get the expected argument and return types for
the unary operation o (like we did for the Binop case in class).
t1 must be the same type as the argument of o .
- The type of the Unop expression is the return type of
o .
ECons (e1, e2) :
- Infer the type of
e1 ; call this t1 and the resulting substitution s1 .
- Infer the type of
e2 (using s1 on the
context); call this type t2 and the resulting substitution s2 .
- Let
tl be TList t1' (where t1' is t1 substituted
with s2 ).
t2 must be the same as tl .
- The type of the Cons expression is
tl . Before you return it, apply the substitution you got from unify to it.
Testing
Compile and run your code using:
make hw4
./hw4
This will open something similar to the OCaml toplevel, which will let you
type declarations and will print out the bindings and types introduced:
Welcome to MiniCaml. Type #quit;; to exit.
# let x = 1;;
x : int = 1
# let f y = x + y;;
f : int -> int = <fun>
# let x = 2;;
x : int = 2
# f 2;;
- : int = 3
You can also use the binary hw4 in "batch mode" by giving it a .ml file
on the command line, e.g.
./hw4 ../src/examples/rec.ml
sum: int list -> int
length: 'a list -> int
map: ('d -> 'c * 'd list) -> 'c list
int_to_string: int -> string
onetwothree: string list
sum = <fun>
length = <fun>
map = <fun>
int_to_string = <fun>
onetwothree = (""one")::((""two")::((""three")::([])))
This will interpret the entire file and print out the type and
value of every top-level
binding. We've provided some sample files in the examples folder, and
you're welcome to write your own.
It's also important to make sure that, in addition to allowing well-typed
programs, your type checker appropriately rejects programs with type errors.
We've provided a few of these in examples/error .
Submission
When you are done with your work, make sure you have compiled your code
using make (and ideally tested it using the instructions above).
Submissions that do not compile will receive no credit.
When you're finished, use the following commands to submit:
git add unify.ml
git add typecheck.ml
git commit -m "HW4 submit"
git push
Please do actually use the commit message "HW4 submit" so we know when you
intended to submit your homework and can also differentiate this from when
you submitted HW3.
There are no additional files you need to add to the repo, just unify.ml
and typecheck.ml .
There are also no automatic tests, so you will not see a green check mark
(or red X) on GitHub.
As usual, double check GitHub and make sure you can see your latest code.
|