Source

xilocaml / doc / Xilocaml.tex

\documentclass[a4paper,11pt]{article}

\usepackage[a4paper, margin=1.5cm]{geometry}

\author{Jan Wr\'oblewski\\
student no. 277632 of faculty of\\
Mathematics, Informatics and Mechanics of University of Warsaw}

\title{The Language Xilocaml}

\setlength{\parindent}{0mm}
\setlength{\parskip}{1mm}

\usepackage{url}
\usepackage{hyperref}

\newcommand{\emptyP}{\mbox{$\epsilon$}}
\newcommand{\terminal}[1]{\mbox{{\texttt {#1}}}}
\newcommand{\nonterminal}[1]{\mbox{$\langle \mbox{{\sl #1 }} \! \rangle$}}
\newcommand{\arrow}{\mbox{::=}}
\newcommand{\delimit}{\mbox{$|$}}
\newcommand{\reserved}[1]{\mbox{{\texttt {#1}}}}
\newcommand{\literal}[1]{\mbox{{\texttt {#1}}}}
\newcommand{\symb}[1]{\mbox{{\texttt {#1}}}}
\newcommand{\code}[1]{\mbox{{\texttt {#1}}}}

\begin{document}

\maketitle
\tableofcontents

\section{Introduction}

Xilocaml is a subset of OCaml 3.12. It has following features:
\begin{itemize}
	\item purely functional,
	\item polymorphic lists,
	\item strong static type system,
	\item first-class function closures,
	\item partial function applications,
	\item type inference (automatic deduction of types) with optional type annotations,
	\item lambda functions,
	\item parametric polymorphism.
\end{itemize}


%%%% BNFC %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\section{The lexical structure of Xilocaml}

\subsection{Literals}
OInteger literals are recognized by the regular expression
\({\nonterminal{digit}}+\)

OBool literals are recognized by the regular expression
\(\mbox{`t'} \mbox{`r'} \mbox{`u'} \mbox{`e'} \mid \mbox{`f'} \mbox{`a'} \mbox{`l'} \mbox{`s'} \mbox{`e'}\)

OFloat literals are recognized by the regular expression
\(({\nonterminal{digit}}+ \mbox{`.'} {\nonterminal{digit}}* \mid \mbox{`.'} {\nonterminal{digit}}+) ((\mbox{`e'} \mid \mbox{`E'}) (\mbox{`{$+$}'} \mid \mbox{`{$-$}'})? {\nonterminal{digit}}+)? \mid {\nonterminal{digit}}+ (\mbox{`e'} \mid \mbox{`E'}) \mbox{`{$-$}'}? {\nonterminal{digit}}+\)

OIdent literals are recognized by the regular expression
\({\nonterminal{lower}} ({\nonterminal{letter}} \mid {\nonterminal{digit}} \mid \mbox{`\_'} \mid \mbox{`''})*\)

OPolyIdent literals are recognized by the regular expression
\(\mbox{`''} ({\nonterminal{letter}} \mid {\nonterminal{digit}} \mid \mbox{`\_'} \mid \mbox{`''})*\)


\subsection{Reserved words and symbols}
The set of reserved words is the set of terminals appearing in the grammar. Those reserved words that consist of non-letter characters are called symbols, and they are treated in a different way from those that are similar to identifiers. The lexer follows rules familiar from languages like Haskell, C, and Java, including longest match and spacing conventions.

The reserved words used in Xilocaml are the following: \\

\begin{tabular}{lll}
{\reserved{List}} &{\reserved{and}} &{\reserved{bool}} \\
{\reserved{else}} &{\reserved{float}} &{\reserved{fun}} \\
{\reserved{hd}} &{\reserved{if}} &{\reserved{in}} \\
{\reserved{int}} &{\reserved{let}} &{\reserved{list}} \\
{\reserved{mod}} &{\reserved{not}} &{\reserved{open}} \\
{\reserved{rec}} &{\reserved{then}} &{\reserved{tl}} \\
\end{tabular}\\

The symbols used in Xilocaml are the following: \\

\begin{tabular}{lll}
{\symb{;;}} &{\symb{{$-$}{$>$}}} &{\symb{{$|$}{$|$}}} \\
{\symb{\&\&}} &{\symb{::}} &{\symb{[}} \\
{\symb{]}} &{\symb{(}} &{\symb{:}} \\
{\symb{)}} &{\symb{{$=$}}} &{\symb{;}} \\
{\symb{{$<$}{$>$}}} &{\symb{{$<$}}} &{\symb{{$<$}{$=$}}} \\
{\symb{{$>$}}} &{\symb{{$>$}{$=$}}} &{\symb{{$+$}}} \\
{\symb{{$-$}}} &{\symb{{$+$}.}} &{\symb{{$-$}.}} \\
{\symb{*}} &{\symb{/}} &{\symb{*.}} \\
{\symb{/.}} & & \\
\end{tabular}\\


\subsection{Comments}
There are no single-line comments in the grammar. \\Multiple-line comments are  enclosed with {\symb{(*}} and {\symb{*)}}. Unlike in OCaml, nested comments are not allowed.


\section{The syntactic structure of Xilocaml}
Non-terminals are enclosed between $\langle$ and $\rangle$. 
The symbols  {\arrow}  (production),  {\delimit}  (union) 
and {\emptyP} (empty rule) belong to the BNF notation. 
All other symbols are terminals.\\

\begin{tabular}{lll}
{\nonterminal{Program}} & {\arrow}  &{\nonterminal{ComBeg}} {\nonterminal{Exp}} {\nonterminal{ComEnd}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{ComBeg}} & {\arrow}  &{\terminal{open}} {\terminal{List}} {\terminal{;;}}  \\
 & {\delimit}  &{\emptyP} \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{ComEnd}} & {\arrow}  &{\terminal{;;}}  \\
 & {\delimit}  &{\emptyP} \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp}} & {\arrow}  &{\terminal{let}} {\nonterminal{ListLocDecl}} {\terminal{in}} {\nonterminal{Exp}}  \\
 & {\delimit}  &{\terminal{let}} {\terminal{rec}} {\nonterminal{ListLocDecl}} {\terminal{in}} {\nonterminal{Exp}}  \\
 & {\delimit}  &{\terminal{fun}} {\nonterminal{ListArgument}} {\terminal{{$-$}{$>$}}} {\nonterminal{Exp}}  \\
 & {\delimit}  &{\terminal{if}} {\nonterminal{Exp}} {\terminal{then}} {\nonterminal{Exp}} {\terminal{else}} {\nonterminal{Exp}}  \\
 & {\delimit}  &{\nonterminal{Exp1}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp1}} & {\arrow}  &{\nonterminal{Exp1}} {\terminal{{$|$}{$|$}}} {\nonterminal{Exp2}}  \\
 & {\delimit}  &{\nonterminal{Exp2}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp2}} & {\arrow}  &{\nonterminal{Exp2}} {\terminal{\&\&}} {\nonterminal{Exp3}}  \\
 & {\delimit}  &{\nonterminal{Exp3}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp3}} & {\arrow}  &{\nonterminal{Exp3}} {\nonterminal{OCmp}} {\nonterminal{Exp4}}  \\
 & {\delimit}  &{\nonterminal{Exp4}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp4}} & {\arrow}  &{\nonterminal{Exp5}} {\terminal{::}} {\nonterminal{Exp4}}  \\
 & {\delimit}  &{\nonterminal{Exp5}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp5}} & {\arrow}  &{\nonterminal{Exp5}} {\nonterminal{OIfxW}} {\nonterminal{Exp6}}  \\
 & {\delimit}  &{\nonterminal{Exp6}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp6}} & {\arrow}  &{\nonterminal{Exp6}} {\nonterminal{OIfxS}} {\nonterminal{Exp7}}  \\
 & {\delimit}  &{\nonterminal{Exp7}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp7}} & {\arrow}  &{\nonterminal{OUnr}} {\nonterminal{Exp7}}  \\
 & {\delimit}  &{\nonterminal{Exp8}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp8}} & {\arrow}  &{\terminal{not}} {\nonterminal{Exp10}}  \\
 & {\delimit}  &{\terminal{hd}} {\nonterminal{Exp10}}  \\
 & {\delimit}  &{\terminal{tl}} {\nonterminal{Exp10}}  \\
 & {\delimit}  &{\nonterminal{Exp9}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp9}} & {\arrow}  &{\nonterminal{Exp9}} {\nonterminal{Exp10}}  \\
 & {\delimit}  &{\nonterminal{Exp10}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp10}} & {\arrow}  &{\nonterminal{OInteger}}  \\
 & {\delimit}  &{\nonterminal{OFloat}}  \\
 & {\delimit}  &{\nonterminal{OBool}}  \\
 & {\delimit}  &{\terminal{[}} {\nonterminal{ListListElem}} {\terminal{]}}  \\
 & {\delimit}  &{\nonterminal{Exp11}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp11}} & {\arrow}  &{\nonterminal{OIdent}}  \\
 & {\delimit}  &{\nonterminal{Exp12}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Exp12}} & {\arrow}  &{\terminal{(}} {\nonterminal{Exp}} {\terminal{:}} {\nonterminal{Type}} {\terminal{)}}  \\
 & {\delimit}  &{\terminal{(}} {\nonterminal{Exp}} {\terminal{)}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{ListArgument}} & {\arrow}  &{\nonterminal{Argument}}  \\
 & {\delimit}  &{\nonterminal{Argument}} {\nonterminal{ListArgument}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Argument}} & {\arrow}  &{\nonterminal{OIdent}}  \\
 & {\delimit}  &{\terminal{(}} {\nonterminal{OIdent}} {\terminal{:}} {\nonterminal{Type}} {\terminal{)}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{ListLocDecl}} & {\arrow}  &{\nonterminal{LocDecl}}  \\
 & {\delimit}  &{\nonterminal{LocDecl}} {\terminal{and}} {\nonterminal{ListLocDecl}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{LocDecl}} & {\arrow}  &{\nonterminal{OIdent}} {\terminal{{$=$}}} {\nonterminal{Exp}}  \\
 & {\delimit}  &{\nonterminal{OIdent}} {\terminal{:}} {\nonterminal{Type}} {\terminal{{$=$}}} {\nonterminal{Exp}}  \\
 & {\delimit}  &{\nonterminal{OIdent}} {\nonterminal{ListArgument}} {\terminal{{$=$}}} {\nonterminal{Exp}}  \\
 & {\delimit}  &{\nonterminal{OIdent}} {\nonterminal{ListArgument}} {\terminal{:}} {\nonterminal{Type}} {\terminal{{$=$}}} {\nonterminal{Exp}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{ListListElem}} & {\arrow}  &{\emptyP} \\
 & {\delimit}  &{\nonterminal{ListElem}}  \\
 & {\delimit}  &{\nonterminal{ListElem}} {\terminal{;}} {\nonterminal{ListListElem}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{ListElem}} & {\arrow}  &{\nonterminal{Exp}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{OCmp}} & {\arrow}  &{\terminal{{$=$}}}  \\
 & {\delimit}  &{\terminal{{$<$}{$>$}}}  \\
 & {\delimit}  &{\terminal{{$<$}}}  \\
 & {\delimit}  &{\terminal{{$<$}{$=$}}}  \\
 & {\delimit}  &{\terminal{{$>$}}}  \\
 & {\delimit}  &{\terminal{{$>$}{$=$}}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{OIfxW}} & {\arrow}  &{\terminal{{$+$}}}  \\
 & {\delimit}  &{\terminal{{$-$}}}  \\
 & {\delimit}  &{\terminal{{$+$}.}}  \\
 & {\delimit}  &{\terminal{{$-$}.}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{OIfxS}} & {\arrow}  &{\terminal{*}}  \\
 & {\delimit}  &{\terminal{/}}  \\
 & {\delimit}  &{\terminal{*.}}  \\
 & {\delimit}  &{\terminal{/.}}  \\
 & {\delimit}  &{\terminal{mod}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{OUnr}} & {\arrow}  &{\terminal{{$-$}}}  \\
 & {\delimit}  &{\terminal{{$+$}}}  \\
 & {\delimit}  &{\terminal{{$-$}.}}  \\
 & {\delimit}  &{\terminal{{$+$}.}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Type}} & {\arrow}  &{\nonterminal{Type1}} {\terminal{{$-$}{$>$}}} {\nonterminal{Type}}  \\
 & {\delimit}  &{\nonterminal{Type1}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Type1}} & {\arrow}  &{\nonterminal{Type1}} {\terminal{list}}  \\
 & {\delimit}  &{\nonterminal{Type2}}  \\
\end{tabular}\\

\begin{tabular}{lll}
{\nonterminal{Type2}} & {\arrow}  &{\terminal{int}}  \\
 & {\delimit}  &{\terminal{float}}  \\
 & {\delimit}  &{\terminal{bool}}  \\
 & {\delimit}  &{\nonterminal{OPolyIdent}}  \\
 & {\delimit}  &{\terminal{(}} {\nonterminal{Type}} {\terminal{)}}  \\
\end{tabular}\\


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\section{Semantics of Xilocaml}

Xilocaml is a subset of OCaml 3.12. with very little changes. All syntactically valid constructions not described below have the same meaning as in language OCaml 3.12. For description of OCaml semantics, see \cite{ocaml}.

\subsection{Arithmetics}
There are two types of numerical values in Xilocaml on which arithmetic operations may be performed:
\begin{itemize}
	\item {\reserved{int}} - integers of arbitrary precision (which differs from OCaml);
	\item {\reserved{float}} - floating-point numbers of double precision, following standard IEEE-754 (same as in OCaml).
\end{itemize}

Floating-point operations have the same meaning as in OCaml and integer operations are performed on arbitrary precision integers. For compatibility to OCaml 3.12 implementation, sequence of unary prefix operators {\symb{+}}, {\symb{-}}, {\symb{+.}}, {\symb{-.}} applied to a floating-point constant is allowed and has the standard meaning.

\subsection{Explicit types}
It is possible to explicitly define types in Xilocaml similarly to OCaml. The only difference is that in Xilocaml, explicitly defined polymorphic types are local for each declaration. That is, a polymorphic type (like {\code{'a}}) actual type is independent of it's usage outside of {\code{pattern}} in declaration:
\begin{verbatim}
pattern = expr
\end{verbatim}

\subsection{Type inference}
Type inference in Xilocaml behave the same way as in OCaml (without enabled recursive types). If Xilocaml will be unable to detect the type of a value, a semantics error will occur. Xilocaml should be able to deduct types of all its constructions that implementation of OCaml 3.12. is able to.

\subsection{\reserved{let rec}}
Behaviour of {\reserved{let rec}} in Xilocaml differs from OCaml. In the following construction:
\begin{verbatim}
let rec pattern_1 = expr_1 and ... and pattern_n = expr_n in expr
\end{verbatim}
binding of names is performed like described below:
\begin{itemize}
	\item If {\code{pattern\_i}} is a function definition ({\code{fun\_name arg1\_name arg2\_name ...}}) with at least one argument name, binding of names in {\code{expr\_i}} is the binding of names as in nonrecursive {\reserved{let}} local definition updated by names of variables and functions defined in {\code{pattern\_1 = expr\_1}}, ..., {\code{pattern\_n = expr\_n}};
	\item Otherwise, biding of names in {\code{expr\_i}} is the same as in nonrecursive {\reserved{let}} local definition.
\end{itemize}

Type inference in {\reserved{let rec}} local definition behaves the same way as in OCaml.


\section{Xilocaml interpreter}

\subsection{Usage}
Xilocaml interpreter takes as an input a Xilocaml program and writes on standard output computed value and its type. If the computed value is a function, {\code{<fun>}} is written instead with its type. A Xilocaml program consists of a single expression and might be prepended by {\code{open List;;}} or/and appended by {\code{;;}}, for compatibility to OCaml interpreter. That way, input of Xilocaml interpreter may be written in a way to be valid both in Xilocaml and OCaml interpreter and even give the same output (that is, printing out the same value and its type).

\subsection{Errors}
If an error occurs, it is written on standard error output with its description and nothing is written on standard output. There are different types of errors:
\begin{itemize}
	\item syntactic,
	\item semantic,
	\item runtime.
\end{itemize}

Syntactic error will occur if input isn't conforming to Xilocaml grammar. Xilocaml's internals processing the syntactic part of input are based on Haskell libraries Alex and Happy. In some cases, an internal error of those libraries was observed and is written on standard error output the same way as other errors (for example it was observed for an empty or {\code{1+}} input).

Semantic error will occur if input's semantics are incorrect. In most cases that means a type error, when unification of given and expected type failed.

Runtime error might occur in the following cases:
\begin{itemize}
	\item division of integer by zero,
	\item taking head or tail of an empty list,
	\item comparing functional values (it might not be detected as a semantic error when functional values are passed to a polymorphic function).
\end{itemize}

If any bound name or constant is used in expression that generated semantic or runtime error, its approximate line and column will be printed out together with code and error description. In case of syntactic error, printed out data is library-dependent, though usually position of error along with additional information is printed out too.

\subsection{Tail recursion}
In Xilocaml interpreter, tail recursion isn't implemented, so programs depending on it may quickly run out of memory.


\begin{thebibliography}{99}
\addcontentsline{toc}{section}{Bibliography}

\bibitem[OCaml]{ocaml} The OCaml Language. \url{http://caml.inria.fr/pub/docs/manual-ocaml/language.html}

\end{thebibliography}

\end{document}