Matching on preexisting syntax properties tends to produce poor error messages
Using the define-typed-syntax
clause that matches against a preexisting syntax property on the input syntax (that is, the form that corresponds to “checking” rules rather than “synthesis” rules) tends to produce extremely poor type errors. For example, take this extremely simple language with a canonical definition of λ
:
; language.rkt
#lang turnstile
(provide #%module-begin #%top ann λ
(typed-out [unit Unit]) (type-out -> Unit))
(define-base-type Unit)
(define unit 'unit)
(define-type-constructor -> #:arity = 2)
(define-typed-syntax (ann e:expr τ:type) ≫
[⊢ e ≫ e- ⇐ τ.norm]
-------------------------
[⊢ e- ⇒ τ.norm])
(define-typed-syntax λ
[(_ x:id e:expr) ⇐ (~-> τ_in τ_out) ≫
[[x ≫ x- : τ_in] ⊢ e ≫ e- ⇐ τ_out]
-------------------------
[⊢ (λ- (x-) e-)]]
[(_ [x:id {~datum :} τ_in:type] e:expr) ≫
[[x ≫ x- : τ_in.norm] ⊢ e ≫ e- ⇒ τ_out]
-------------------------
[⊢ (λ- (x-) e-) ⇒ (-> τ_in τ_out)]])
Now, consider this simple sample program:
#lang s-exp "language.rkt"
(ann (λ x unit) Unit)
This produces the following, quite poor, error message:
λ: bad syntax
The reason for this is obvious enough, but it is still unhelpful. The (~-> τ_in τ_out)
pattern fails to match, so syntax-parse
backtracks, but the second pattern also fails to match. Since syntax-parse
can’t determine which pattern made more progress, it produces a simple “bad syntax” error, which is quite misleading for the user.
One way to fix this is to add a ~!
to the first pattern somewhere to force syntax-parse
to commit to the pattern, like this:
(define-typed-syntax λ
[(_ x:id e:expr) ⇐ {~and ~! (~-> τ_in τ_out)} ≫
[[x ≫ x- : τ_in] ⊢ e ≫ e- ⇐ τ_out]
-------------------------
[⊢ (λ- (x-) e-)]]
[(_ [x:id {~datum :} τ_in:type] e:expr) ≫
[[x ≫ x- : τ_in.norm] ⊢ e ≫ e- ⇒ τ_out]
-------------------------
[⊢ (λ- (x-) e-) ⇒ (-> τ_in τ_out)]])
This produces a better error message, though still not perfect:
λ: Expected -> type, got: Unit
This error is still pretty confusing, though, since really, the “expected” type is Unit
, and the “actual” type is a function. Again, it makes sense when considering how syntax-parse
synthesizes an error message, but it’s still poor.
It’s possible to manually produce a type error, of course, which allows the creation of a much nicer message:
(define-typed-syntax λ
[(_ x:id e:expr) ⇐ τ_λ ≫
#:fail-unless (->? #'τ_λ) (~a "expected " (type->str #'τ_λ) ", given a function")
#:with (~-> τ_in τ_out) #'τ_λ
[[x ≫ x- : τ_in] ⊢ e ≫ e- ⇐ τ_out]
-------------------------
[⊢ (λ- (x-) e-)]]
[(_ [x:id {~datum :} τ_in:type] e:expr) ≫
[[x ≫ x- : τ_in.norm] ⊢ e ≫ e- ⇒ τ_out]
-------------------------
[⊢ (λ- (x-) e-) ⇒ (-> τ_in τ_out)]])
…but this feels rather against the spirit of the Turnstile DSL. It would be nice if Turnstile could help to synthesize a better error message itself.
Comments (12)
-
-
reporter Yes, that error is equivalent to my example using a cut, so it’s certainly far better than “bad syntax” without any additional explanation. It’s still considerably worse than the handwritten error message, though, for the same reason I explained above. On the other hand, that error isn’t any worse than this one, for the same reason:
(define-typed-syntax #%app [(_ f:expr e:expr) ≫ [⊢ f ≫ f- ⇒ (~-> τ_in τ_out)] [⊢ e ≫ e- ⇐ τ_in] ------------------------- [⊢ (#%app- f- e-) ⇒ τ_out]]) (unit unit)
#%app: Expected -> type, got: Unit in: (#%app Unit2)
This error is slightly better in some respects (the expected/actual types are in the expected order), but it also manages to be considerably worse, since it doesn’t actually highlight the right source location!
I think, for both of these scenarios, it’s reasonable to ask Turnstile to provide more intelligent type errors. I think all the necessary information is available, but it likely requires helping
syntax-parse
a little more than Turnstile already does (and possibly relying a little bit less on the built-in error message synthesis, cute as it currently is). -
repo owner Thanks for the report. The
bad syntax
is definitely unacceptable. I think it makes sense to add a~post
to the expected type (even though it looks like we already have a few) since the first clause has made more progress than the second one due to the matched expression.I also agree about the err msg. It does look like we already try to do the right thing:
so I will look into why that msg is not firing.
-
repo owner I'm about to push a fix. One small problem is that Turnstile does not know what the type should be, other than a pattern, which is an implementation detail. Do you think it's ok to expose this pattern to the programmer?
Here's what the message would say:
; language-prog.rkt:3:5: TYPE_ERR: expression cannot have the type expected by its context ; expr: (λ x unit) ; context expects type: Unit ; expr must have type matching pattern: (~-> τ_in τ_out)
-
Should we let the pattern specify what it expects? Currently the
~->
pattern saysExpected -> type
, but other pattern expanders (for example a more complicated~>*
pattern that handles keywords, optional arguments, and rest arguments) should be free to specify failure messages in different ways.Just adding
~post
allows that, but at the cost of not knowing that it's a problem with the expected type as opposed to the expression. Would a~post
+ a~describe
pattern give the best of both?(~post (~parse (~describe "expected-type" τ-pat) #'τ))
.../bad-expected-type.rkt:26.5: λ: Expected -> type, got: Unit at: (#%app Unit2) in: (λ x unit) parsing context: while parsing expected-type term: (#%app Unit2) location: .../bad-expected-type.rkt:26.16
Another thing to consider, there could be multiple
\Leftarrow
rules with different expected types, for example one for~->
, one for~->*
, and one for~case->
. In those cases what should the error message be? -
reporter Do you think it's ok to expose this pattern to the programmer?
I considered that. Personally, I think exposing the pattern to the user isn’t good enough. That pattern could be arbitrarily complex, and I don’t think the two things should be necessarily coupled. The pattern really ought to be an implementation detail.
Consider
syntax/parse
itself for a moment. It is an industrial-strength library for parsing macros, and it provides error reporting. It works for everything from the simplest macros to the most complex, so its error reporting must similarly scale. Since it’s clear that error messages cannot always be synthesized without any help,syntax/parse
provides lots of ways to help and/or tweak the error generation when the defaults aren’t smart enough. It allows macro authors to annotate patterns and use classes. Importantly, though, these tools for improving error messages can often be added just once, in a syntax class, and it doesn’t really require users to break away from the nice, declarative DSL in order to get good errors.Similarly, I think Turnstile should ideally be an industrial-strength library for creating type systems. Obviously, it will have constraints, but it would be nice if it could scale similarly well. It’s fine to request that users add additional annotations in order to get the best error messages. For example, consider syntax classes: in order for
syntax/parse
to generate good messages, users usually have to provide a#:description
clause. Otherwise,syntax/parse
just generates best-effort messages.Applying that line of thinking to Turnstile, there are a couple of principles that make sense:
- Turnstile should synthesize best-effort error messages, but these error messages don’t have to be perfect.
- Error messages should be improvable by adding user-provided annotations, where desired.
- Annotations should ideally only needed to be added in one place, and they should not compromise the declarative nature of the DSL.
The obvious place to put this annotation is
define-type-constructor
, sincedefine-type-constructor
has some parallels withdefine-syntax-class
. Consider being able to do this:(define-type-constructor -> #:arity = 2 #:description "function type")
Without the
#:description
annotation, it should probably be inferred, so something like-> type
, which is similar to the existing error message but does not reveal the underlying pattern. -
Exactly, type patterns should be able to specify their failure messages for Turnstile or syntax-parse to use.
-
repo owner #:description
annotationI like it. I'll work on adding it.
-
repo owner Actually, adding
#:description
todefine-type-constructor
does not solve the problem since the test message will still say that that type is the "expected type", which makes sense from the pattern-expander's perspective.Concretely, this example would still error with "expected function type, got Unit". Is this ok?
In order to produce the better message, the error must be external to the type, ie from the rule. The rule would need a
#:expected-type-description
or something, which feels less nice.Will continue to think about this.
-
That's where the the extra
~describe
pattern on the other end helps a little bit. When Turnstile matches the expected type against the pattern, it currently does this:(~parse τ-pat #'τ)
If we change it to this:
(~post (~parse (~describe "expected-type" τ-pat) #'τ))
Then the error message will include
while parsing expected-type
. Is that good enough? -
repo owner Yes, that's where I currently have the new err msg (in my unpushed commit). But the rule still needs to specify it somewhere that looks nice.
-
reporter Just reusing
syntax/parse
patterns is cute and elegant, but perhaps it’s not necessarily worth tethering yourselves to them? If it isn’t possible to get good enough type errors just from syntax classes andsyntax/parse
’s default error synthesis, it would be possible to create separate type patterns that attach the necessary static information to generate good errors. - Log in to comment
You don't have to go all the way to
#:fail-unless
to get a more specific error message, even this does that:It might look like these should be equivalent, but they aren't in errors because of progress. In various places in Turnstile we've put
~post
patterns in to prioritize error messages. If there are two different classes errors that have the same priority, you getbad syntax
. The#:with
patterns are set up to have more progress than the initial pattern, so a#:with
will give it more priority.Should I add a
~post
to the expected type pattern here to give its errors higher priority? With that change this becomesWhile highlighting the
Unit
in(ann (λ x unit) Unit)
. Would that be good enough? The other question is whether this would mess up any other error messages.