camlspotter avatar camlspotter committed f23bfb7

doc

Files changed (3)

 ^.*\.(cm.*|o|sp.t|annot|opt|run|out|omc)$
 .*~$
 tests_out/.*\.ml
+docs/_build
 
-
+- If no change is required, the editor should be notified, so that the buffer is not modified unnecessarily.
+- Cursor move after reindent.

docs/ocaml-indent.rst

+====================================================
+OCaml-indent: OCaml source code indenter in OCaml
+====================================================
+
+Code available at https://bitbucket.org/camlspotter/ocaml-indent/wiki/Home .
+
+Motivation
+====================================
+
+Several tools are already available for reindenting OCaml source code in editors:
+
+- ocaml-mode for Emacs
+- tuareg-mode for Emacs 
+- ocaml.vim for Vim
+- omlet.vim for Vim
+
+They are all great, but none of them is written in OCaml, even though they are for OCaml programming.
+
+The drawback is clear: they are hard to customize (for OCaml programmers).
+Every OCaml user has their own taste in OCaml source code indentation.
+If an existing indenter provides configurable options that match this taste, that is fine.
+Otherwise, the user has to hack the indenter, which is not written in OCaml, or else obey someone else's taste. :-(
+
+For example, I have used tuareg for years and have modified it for my style,
+but Elisp hacking is not so easy for me.
+
+So, I wanted to have something more OCaml friendly: an OCaml source code indenter written in OCaml itself.
+It should be much easier for me to fix it for my style, and probably easier for you to hack, too.
+
+Design
+=======================================
+
+As an external helper
+---------------------------
+
+``ocaml-indent`` is a simple command line tool, which takes OCaml source code text
+(and some command line options, of course) and prints out the reindented code.
+
+Each editor (Emacs, Vim, ...) must communicate with ``ocaml-indent``
+for interactive reindentation, so
+someone has to write an extension for each editor,
+but the amount of code needed should be minimal.
+For example, ``ocaml-indent.el`` for Emacs is only around 60 lines of code.
+
+Lexer based
+--------------------
+
+``ocaml-indent`` is lexer based, and uses OCaml's ``lexer.mll``.
+
+OCaml is still an evolving language.
+Each version enriches its syntax. CamlP4 modules also extend OCaml syntax.
+Therefore, ``ocaml-indent`` must be robust against these syntax (parser) changes,
+and cannot rely on a specific ``parser.mly``.
+(``parser.mly`` also drops all the parentheses during parsing, which is another drawback.)
+
+On the other hand, OCaml's lexer (``lexer.mll``) is pretty stable.
+``lexer.mll`` ignores comments, but it is very easy to modify it to preserve them.
+
+The indent analysis of ``ocaml-indent`` is a small state machine, which
+observes a stream of lexer tokens and updates its state, including the indentation level of each source line.
+The analysis does not know the complete syntax of the OCaml language, only a small and approximate part of it:
+for example, ``with`` must be paired with ``match``, ``try``, ``{``,  ``type`` or ``exception``.
+(The pairing with ``type`` and ``exception`` is for type-conv P4 macros.)
+So far, the state update rules look simple enough and easy to fix/update/customize.
+
+No backward parsing
+---------------------
+
+``ocaml-indent`` does not perform backward parsing. It parses (lexes) from the head of the source file.
+
+Source code indenters are usually implemented as backward parsers:
+they parse the source code backward from the position of the reindentation target
+until enough information for the reindentation is obtained.
+Thus they minimize the amount of parsing.
+
+In my adventures with parser combinators in OCaml
+( http://camlspotter.blogspot.com/2011/05/planck-small-parser-combinator-library.html ),
+I found that OCaml's lexer (``ocamllex`` and ``lexer.mll``) is extremely fast.
+For example, the OCaml lexer can lex all the ``*.ml`` and ``*.mli`` files in the OCaml source tree
+in less than 1 second. That is more than 400000 lines/sec.
+The largest FP source file I have ever seen in production is around 4000 lines
+(Don't ask me where I saw it :-) Of course, I immediately split it into several files.),
+and even that can be lexed in 0.01 sec.
+
+Clearly, ``ocaml-indent`` has no need for backward parsing (lexing).
+It can just use the good old ``lexer.mll``.
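
The token-driven state machine described under "Lexer based" can be illustrated with a minimal sketch. It is not the actual ``ocaml-indent`` code: the ``token`` type, the function names and the indent width of 2 are all hypothetical, the real tool works on the tokens produced by ``lexer.mll``, and the pairing of ``with`` with ``type`` and ``exception`` is left out for brevity.

```ocaml
(* Hypothetical, simplified tokens; the real tool uses the tokens of lexer.mll. *)
type token = MATCH | TRY | WITH | LBRACE | RBRACE | BEGIN | END | OTHER

(* Assumed indent width, for illustration only. *)
let indent_width = 2

(* [with] pairs with the nearest open match, try or {: drop anything above it. *)
let rec unwind = function
  | (MATCH | TRY | LBRACE) :: _ as stack -> stack
  | _ :: rest -> unwind rest
  | [] -> []

(* A closing token pops the stack up to and including its opener. *)
let rec close opener = function
  | top :: rest -> if top = opener then rest else close opener rest
  | [] -> []

(* State update rule: the state is the stack of currently open constructs. *)
let step stack tok =
  match tok with
  | MATCH | TRY | LBRACE | BEGIN -> tok :: stack
  | WITH -> unwind stack
  | RBRACE -> close LBRACE stack
  | END -> close BEGIN stack
  | OTHER -> stack

(* Indentation of a line: depth of the stack when the line starts.
   A line that starts with a closing token is indented like its opener. *)
let line_indent stack line =
  let stack_at_line_start =
    match line with
    | (RBRACE | END as closer) :: _ -> step stack closer
    | _ -> stack
  in
  indent_width * List.length stack_at_line_start

(* Lex forward from the head of the file, one line of tokens at a time. *)
let indent_lines lines =
  let _, rev_indents =
    List.fold_left
      (fun (stack, acc) line ->
        let acc = line_indent stack line :: acc in
        (List.fold_left step stack line, acc))
      ([], []) lines
  in
  List.rev rev_indents

(* begin / match x with / | A -> 1 / end *)
let example =
  indent_lines [ [BEGIN]; [MATCH; OTHER; WITH]; [OTHER]; [END] ]
(* example = [0; 2; 4; 0] *)
```

Customizing the indentation style in a design like this amounts to editing the few cases of ``step`` and ``line_indent``, which is the kind of change the state update rules are meant to make easy.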