M2C Modula-2 Compiler & Translator Rework Project
Welcome to the Wiki of the M2C Modula-2 Compiler & Translator Rework Project
The objective of this project is to completely rewrite and replace the codebase of the M2C compiler, originally by V. Makarov, with a highly portable, reliable, readable and maintainable new M2C compiler.
The purpose of the new M2C compiler is to provide a means:
- to compile and run program examples from early Modula-2 books (PIM3/PIM4), in particular works by Wirth.
- to develop, build and bootstrap a compiler for Modula-2 Revision 2010 written in a subset of M2 R10.
The new M2C compiler shall support the classic Modula-2 language described in the third and fourth editions of "Programming in Modula-2" (Wirth, 1983-1985) and selected features of the revised Modula-2 language described in "Modula-2 Revision 2010" (Kowarsch and Sutcliffe, 2010-2015). For details, see section M2C Language Extensions further below. The ISO Modula-2 dialect will not be supported.
The grammar of the compiler is in the project repository.
The new M2C compiler is licensed under both version 2.1 and version 3 of the GNU Lesser General Public License (LGPL).
- Replacement Lexer
- Replacement Parser
- Replacement AST (abstract syntax tree)
- Replacement Semantic Analyser
- Replacement Code Generator
- Replacement Driver Program
- Vim integration: Syntax colouring and compiler invocation
- Replacement Lexer operational in Nov 2015
- Replacement Parser operational on Dec 11, 2015
- Replacement Driver operational with partial functionality
- AST library and API
- AST exporter for S-expression and GraphViz DOT output
- OS specific file system and IO libraries for AmigaOS, POSIX, VMS and Windows.
- OS specific pathname parsing/splitting libraries for AmigaOS, POSIX, VMS and Windows.
- Modula-2 to C identifier conversion library (part of code generator)
- Symbol table library and API
Current Work Items
- Vim syntax colouring
- Testing symbol table library
- Testing AST library and AST exporter
- Testing OS specific implementations of fileutils
- Testing OS specific pathname validation, splitting and composition
- C99 code generator
Upcoming Work Items
- Dependency graph generator
- Symbol table to SYM file exporter
We estimate the full-time effort required to complete all remaining deliverables to be two to three man months. Community end-to-end testing may then add another two calendar months to iron out bugs on all supported platforms.
At the present time we are unable to commit any full-time resources, which makes it very difficult to predict a calendar date for completion. However, we are currently in talks with a potential sponsor. With a bit of luck, sponsorship may allow one of us to work full-time on the remaining deliverables until completion. An update will be posted here in due course.
The M2C compiler is written in C and therefore a C compiler is required to build it. There are no dependencies on any libraries other than the C standard library. It has been built and tested with LLVM and GCC with C99/C11 standard settings on various OSes, DEC/HPE C on OpenVMS and MSVC 2015 on Windows.
At this time, M2C has been built and tested on AmigaOS, FreeBSD, Cygwin, Linux, MacOS X, OpenVMS/Alpha and Microsoft Windows. It should also build and work on other BSD systems and other recent Unix and Unix-like systems such as AIX, HP-UX, QNX and Solaris, as well as on MS-DOS and OS/2. The compiler is designed to be portable with very little effort.
M2C is being integrated into Vim, including GUI versions of Vim for Linux, Mac OS X and Windows. At this time our working copy of Vim (v.7.4) supports syntax colouring with dialect differentiation. Filetype detection scans .def and .mod files to automatically determine the dialect with support for special comments that act as dialect tags. In addition, the Syntax menu provides a choice of PIM, ISO and M2 R10 to set the dialect manually. Outstanding are code folding and invoking the compiler from Vim.
M2C is designed for utmost portability. Operating systems supported thus far are AmigaOS, BSD, Linux, MacOS X, OpenVMS and Windows. The Windows version should also cover MS-DOS and OS/2. Work on Plan9 and RISC OS is about to commence and several other systems are on our to-do list.
How to Build from Source
Step 1: Obtain a copy of the source code
$ hg clone https://firstname.lastname@example.org/trijezdci/m2c-rework
Step 2: Invoke the make utility within the m2c-rework directory
This will build both the lexer test program testlex and the driver program m2c.
A list of compiler options can be obtained by invoking the driver program with the help option.
$ ./m2c --help
To run m2c on a Modula-2 source file, invoke it with the name of the source file, followed by any compiler options.
Let's assume we use the following example source taken from Niklaus Wirth's book "Programming in Modula-2", 4th Edition, page 96 ...
DEFINITION MODULE EBNFScanner; (* N.Wirth, PIM4, page 96 *)

TYPE Symbol = (ident, literal, lpar, lbk, lbr, bar, eql, period, rpar, rbk, rbr, other);

CONST IdLength = 24;

VAR
  sym: Symbol; (* next symbol *)
  id: ARRAY [0 .. IdLength] OF CHAR;
  lno: INTEGER;

PROCEDURE GetSym;
PROCEDURE MarkError(n: INTEGER);
PROCEDURE SkipLine;

END EBNFScanner.
As this source file follows PIM syntax, the compiler needs to be run in PIM3 or PIM4 mode, using compiler option --pim3 or --pim4.
$ ./m2c EBNFScanner.def --pim4
It should print 0 parse errors for this input.
m2c Modula-2 Compiler & Translator, version 1.00
parse error count: 0
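The invocation convention described above (source file first, options after it) can be sketched in C roughly as follows. This is a minimal illustration and not the actual M2C driver code: the type and function names are hypothetical, only option names mentioned on this page are recognised, and the special case of invoking the driver with --help alone is deliberately left out. Rejecting an option in the filename position is likewise an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option record; names are NOT taken from the M2C sources. */
typedef enum { DIALECT_UNSET, DIALECT_PIM3, DIALECT_PIM4 } dialect_t;

typedef struct {
  const char *source_file; /* first argument after the program name */
  dialect_t dialect;       /* set by --pim3 or --pim4 */
  bool verbose;            /* set by --verbose */
  bool parser_debug;       /* set by --parser-debug */
} options_t;

/* Parses argv in the order "source file first, then options".
 * Returns true on success, false if the filename is missing
 * or an unrecognised option is encountered. */
bool parse_args(int argc, char *argv[], options_t *opts) {
  assert(argv != NULL && opts != NULL);

  opts->source_file = NULL;
  opts->dialect = DIALECT_UNSET;
  opts->verbose = false;
  opts->parser_debug = false;

  /* argv[1] must be a filename, not an option (assumption of this sketch) */
  if (argc < 2 || strncmp(argv[1], "--", 2) == 0) {
    return false;
  }
  opts->source_file = argv[1];

  /* everything after the filename is treated as an option */
  for (int i = 2; i < argc; i++) {
    if (strcmp(argv[i], "--pim3") == 0) {
      opts->dialect = DIALECT_PIM3;
    } else if (strcmp(argv[i], "--pim4") == 0) {
      opts->dialect = DIALECT_PIM4;
    } else if (strcmp(argv[i], "--verbose") == 0) {
      opts->verbose = true;
    } else if (strcmp(argv[i], "--parser-debug") == 0) {
      opts->parser_debug = true;
    } else {
      return false; /* unknown option */
    }
  }
  return true;
}
```

Called from main with the program's own argc and argv, the invocation shown above would yield a source file of EBNFScanner.def and the PIM4 dialect.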
To get more details on any reported warnings and errors, compiler switch --verbose may be used.
$ ./m2c ProcType.def --pim4 --verbose
It should then print the source line for each warning or error and mark the offending lexeme with a caret '^'.
line 9, column 23, error: unexpected reserved word CONST found
  expected ARRAY, VAR or identifier.

TYPE P4 = PROCEDURE ( CONST CHAR );
                      ^
...
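The caret marking in verbose mode can be illustrated with a short C sketch. It assumes a 1-based column, as in the diagnostic above, and copies tab characters into the marker line so the caret stays aligned. The function name and signature are hypothetical and do not come from the M2C sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes `line`, a newline, and a marker line with '^' under the 1-based
 * `column` into `buf`. Returns the number of characters written (excluding
 * the terminating NUL), or -1 if the column is out of range or the buffer
 * is too small. Hypothetical helper, not part of M2C. */
int format_caret_marker(const char *line, size_t column,
                        char *buf, size_t size) {
  assert(line != NULL && buf != NULL);

  size_t len = strlen(line);
  if (column < 1 || column > len) {
    return -1;
  }

  /* source line + '\n' + (column - 1) padding chars + '^' + '\n' */
  size_t needed = len + 1 + column + 1;
  if (size < needed + 1) {
    return -1;
  }

  size_t pos = 0;
  memcpy(buf, line, len);
  pos = len;
  buf[pos++] = '\n';

  /* pad up to the offending column, preserving tabs for alignment */
  for (size_t i = 0; i + 1 < column; i++) {
    buf[pos++] = (line[i] == '\t') ? '\t' : ' ';
  }
  buf[pos++] = '^';
  buf[pos++] = '\n';
  buf[pos] = '\0';

  return (int)pos;
}
```

For the diagnostic above, calling the function with the offending source line and column 23 and writing the buffer to the console places the caret under the C of CONST.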
Parser Debug Mode
To get feedback on what the parser is doing, compiler option --parser-debug may be used.
$ ./m2c EBNFScanner.def --pim4 --parser-debug
It should then print the current syntax rule, line, column and lexeme.
@ line: 1, column: 1, lookahead: DEFINITION
*** definition ***
@ line: 3, column: 1, lookahead: TYPE
*** typeDefinition ***
@ line: 3, column: 6, lookahead: P1
...
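A trace of this shape could be produced by a small formatting helper called at the start of each parsing function. The following C sketch is only an illustration under assumptions: the record type, the function name, and the pairing of a position line with the rule line that follows it are guesses based on the sample output, not the actual M2C implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record describing the current lookahead symbol. */
typedef struct {
  unsigned line;
  unsigned column;
  const char *lexeme;
} lookahead_t;

/* Formats one trace entry: the lookahead position followed by the name of
 * the syntax rule being entered. Returns the snprintf result, that is, the
 * number of characters the full entry requires (excluding the NUL); a value
 * >= size means the output was truncated. */
int format_trace(char *buf, size_t size,
                 const char *rule, const lookahead_t *la) {
  assert(buf != NULL && rule != NULL && la != NULL);
  assert(la->lexeme != NULL);

  return snprintf(buf, size,
                  "@ line: %u, column: %u, lookahead: %s\n*** %s ***\n",
                  la->line, la->column, la->lexeme, rule);
}
```

In a recursive descent parser, each rule function would call such a helper only when the debug option is set, so that normal compilation produces no trace output.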
If you would like to contribute to the project, please get in touch via the M2C project's home page at Savannah,
or via the M2C project's IRC channel #m2c at OFTC,
or via email to the developer or the maintainer:
trijezdci (gmail) or sinuhe (gnu.org)
Of particular interest at this stage are contributions of OS specific versions of the fileutils and pathnames modules for as yet unsupported host platforms. On the wish list are FreeMint, Haiku, NSK, Plan9 and zOS. Furthermore, volunteer testers are needed for host platforms that should be covered by the Windows version, in particular FreeDOS, MS-DOS and OS/2.