* ORakuda

    Available at http://forge.ocamlcore.org/projects/orakuda/ 
    It is not a "release", but just a proof of concept of what I am
    thinking of. Sources are updated and APIs are changed without any
    announcement.

    ORakuda is a small library, a set of CamlP4 extensions and a tiny patch
    to CamlP4, which together provide a handy way to write OCaml scripts a
    la Perl (or other scripting languages). Its main features are:

- PCRE expressions and matching with Perl-like syntax: 

    Ex. 
        $/regexp/
        str |! $/regexp/ -> ... | _ -> ...
        $s/pattern/template/g

- Variable and expression references in string: 

    Ex. 
        $"You are ${name} and %${age}02d years old."

- Sub-shell call by back-quotes: 

    Ex. 
        let status = $`wc` ~f:handle_output in ...

- Easy hashtbl access: 

    Ex. 
        tbl${key} <- value    (for Hashtbl.replace tbl key value)

    Some are reimplementations of existing CamlP4 extensions. ORakuda's 
    main contribution is to provide a more natural lexical interface for 
    users of Perl and other scripting languages.

** Name

    ORakuda has two meanings in Japanese:
    
        * 大(O)駱駝(Rakuda): 大(big) 駱駝(dromedary/bactrian camel)
        * おお(Oh)楽だ(Rakuda): "Oh, it's easy!"
    
    A good name for Perlish OCaml, isn't it?

** How to build

    No binary package, no source.tar.gz, no step-by-step installation guide.
    
    Requirements: OCaml 3.12.0, findlib, pcre-ocaml, omake, spotlib
    
    - check out the svn: 
    
        svn checkout svn://svn.forge.ocamlcore.org/svnroot/orakuda
    
    - apply patch/camlp4-lexer-plugin-0.5.patch to OCaml 3.12.0 source,
      then compile+install the compiler
    
    - build the library and pa_* extensions: omake + omake install
    
    - omake top_test launches a toplevel with the extension

** How to use

    For easier access to functions, users are recommended to declare

        open Rakuda.Std
    
    at the beginning of their source file.

    $ Prefix
    
    ORakuda extends OCaml's lexer and introduces new lexer rules
    prefixed by the character '$'. '$' means Perl (and of course money, you
    know). The character '$' can still be used to define operators, as
    long as it does not conflict with ORakuda extensions.
    
    Perl like PCRE $//
    
    SYNTAX
    
        $/regular expression/flags

        flags ::= [imsxU8]*    8 is for `UTF8
    
    A $// expression creates a PCRE expression. You can write regexps more
    naturally than with Pcre.create "...", where you have to escape '\'
    characters: Pcre.create "function\\(arg\\)" can be written more simply
    as $/function\(arg\)/.

*** PCRE creation $//

    The type of $// expression is not Pcre.regexp but 'a Regexp.t, where
    the type parameter 'a encodes accessor information of the regexp's
    groups. See GROUP OBJECT METHODS for details:
    
        # $/(hello)world(?P<name>[a-z]+)/;;
        - : < _0 : string; _1 : string; _2 : string; _group : int -> string;
              _groups : string array; _left : string;
              _named_group : string -> string;
              _named_groups : (string * string) list; _right : string;
              _unsafe_group : int -> string; name : string >
            Rakuda.Std.Regexp.t
    
    
    In a non-toplevel environment, a regular expression written with $//
    is defined just ONCE at the top of the source file, no matter where it
    is declared. Therefore uses of $// expressions inside frequently called
    functions have NO runtime penalty.
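
    The effect of this hoisting can be sketched in plain OCaml (no ORakuda
    syntax). This is only an illustration of the idea, not the code the
    extension generates: [compile], [compile_count], [hoisted_re] and
    [uses_regexp] are hypothetical names, and [compile] is a stand-in for
    real regexp compilation.

```ocaml
(* Hypothetical stand-in for regexp compilation; it counts its calls. *)
let compile_count = ref 0

let compile (pattern : string) : string =
  incr compile_count;
  pattern

(* What a hoisted regexp amounts to: one binding at the top of the file,
   evaluated once when the module is initialised. *)
let hoisted_re = compile "[0-9]+"

(* The function body only refers to the already built value. *)
let uses_regexp (s : string) : bool =
  ignore hoisted_re;
  s <> ""

(* Calling the function many times does not compile again. *)
let () =
  for _i = 1 to 1000 do
    ignore (uses_regexp "123")
  done
```

    After the loop, [compile] has still been called only once.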

    SIMPLE MATCH
    
        Regexp.exec_exn, (=~) and Regexp.exec are for simple regexp
        matching and return group objects if matches are successful. The
        matched groups can be retrieved through them:
        
          # let res = "123 + variable12;;" =~ $/([a-z_][_A-Za-z0-9]*)/;;
          val res :
          < _0 : string; _1 : string; _group : int -> string; _groups : string array;
            _left : string; _named_group : string -> string;
            _named_groups : (string * string) list; _right : string;
            _unsafe_group : int -> string > =
          <obj>
    
    GROUP OBJECT METHODS
    
        _0 .. _9        Correspond to $0 .. $9 in Perl regexp match
      
        _left, _right   Correspond to $` and $'
      
        name            Accessor for named groups, defined by Python
                        extension of named groups: (?P<name>regexp)
      
        _named_group    More primitive group accessor methods
        _named_groups
        _unsafe_group

*** Perl like PCRE case match $// -> ... 

    SYNTAX
    
       $/regular expression/flags as var -> e
       | $/regular expression/flags as var -> e
       | ...
       | _ -> e
    
    Multiple regular expression pattern match cases can be written using
    PCRE case match expression [$// -> ...].
    
    The case match expression is a '|'-separated list of cases 
    [$/regular expression/flags as var -> e], optionally ended by a
    default case [_ -> e]. The entire case match expression is a
    function which takes a string and matches it against each regexp from
    top to bottom. If the regexp of a case [$/regexp/flags as var -> e]
    matches the string, the expression [e] is evaluated with the group
    object bound to [var]. If [e] does not use [var], the binding can be
    omitted: [$/regexp/flags -> e]. If none of the regexps matches the
    string and the default case [_ -> e] exists, its [e] is evaluated.
    If there is no default case, the function raises [Not_found].
    
    Using the PCRE case match expression with the pipe operator (|!) or
    (|>) is recommended:
    
       # "variable123" 
         |! $/^[0-9]+$/ as v -> `Int (int_of_string v#_0)
         |  $/^[a-z][A-Za-z0-9_]*$/ as v -> `Variable v#_0
         |  _ -> failwith "parse error";;
    
       - : [> `Int of int | `Variable of string ] = `Variable "variable123"
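
    The evaluation order described above (top to bottom, optional default
    case, [Not_found] otherwise) can be modelled in plain OCaml without the
    extension. This is a rough sketch under assumptions, not ORakuda's
    implementation: [case_match], [int_re], [var_re] and [parse] are
    hypothetical names, the matchers are hand-written stand-ins for the
    two regexps, and (|!) is assumed to be plain reverse application.

```ocaml
(* Assumed definition of the pipe operator: feed a value to a function. *)
let ( |! ) x f = f x

(* A case pairs a matcher with the body to run on success. The matcher
   returns [Some group] when the string matches, [None] otherwise. *)
type ('g, 'r) case = (string -> 'g option) * ('g -> 'r)

(* Try the cases from top to bottom; use [default] if given,
   otherwise raise [Not_found]. *)
let case_match ?default (cases : ('g, 'r) case list) (s : string) : 'r =
  let rec go = function
    | [] ->
        (match default with
         | Some d -> d ()
         | None -> raise Not_found)
    | (matcher, body) :: rest ->
        (match matcher s with
         | Some g -> body g
         | None -> go rest)
  in
  go cases

let is_digit c = '0' <= c && c <= '9'

let is_ident_char c =
  is_digit c || c = '_' || ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')

(* True when [s] is non-empty and every character satisfies [p]. *)
let all p s =
  let rec loop i = i >= String.length s || (p s.[i] && loop (i + 1)) in
  s <> "" && loop 0

(* Stand-ins for $/^[0-9]+$/ and $/^[a-z][A-Za-z0-9_]*$/ . Each returns
   the whole matched string, playing the role of v#_0. *)
let int_re s = if all is_digit s then Some s else None

let var_re s =
  if s <> "" && 'a' <= s.[0] && s.[0] <= 'z' && all is_ident_char s
  then Some s
  else None

(* Plain-OCaml counterpart of the case match example. *)
let parse s =
  s |! case_match
         ~default:(fun () -> failwith "parse error")
         [ (int_re, fun v -> `Int (int_of_string v));
           (var_re, fun v -> `Variable v) ]
```

    With these definitions, [parse "variable123"] falls through the first
    case and is handled by the second one, as in the toplevel session
    above.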

*** Perl like PCRE substitution $s///

    SYNTAX
    
        $s/regular expression/template/flags

** Perl like sprintf $""

    SYNTAX
    
        $"..."
    
    Shorthand for Printf.sprintf "..." with variables and expressions
    embedded inline using $-notation.
    
    EXAMPLE
    
        $"... $foo123 ..."     
            Equivalent to Printf.sprintf "... %s ..." foo123
      
        $"... ${Hashtbl.find tbl k} ..."
            Equivalent to Printf.sprintf "... %s ..." (Hashtbl.find tbl k)
      
        $"...%${var}02d..."
            Equivalent to Printf.sprintf "...%02d..." var
      
        $"...%s...%${var}02d..."
            Equivalent to 
                fun s -> Printf.sprintf "...%s...%02d..." s var
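
    The right-hand sides of these equivalences are ordinary OCaml and can
    be tried without the extension. The snippet below only exercises the
    Printf.sprintf forms listed above; the values bound to [foo123], [var]
    and [tbl] are made-up sample data.

```ocaml
(* Sample data, invented for this illustration. *)
let foo123 = "camel"
let var = 7
let tbl : (string, string) Hashtbl.t = Hashtbl.create 16
let () = Hashtbl.replace tbl "k" "value"

(* Counterpart of $"... $foo123 ..." *)
let s1 = Printf.sprintf "... %s ..." foo123

(* Counterpart of $"... ${Hashtbl.find tbl k} ..." *)
let s2 = Printf.sprintf "... %s ..." (Hashtbl.find tbl "k")

(* Counterpart of $"...%${var}02d..." *)
let s3 = Printf.sprintf "...%02d..." var

(* Counterpart of $"...%s...%${var}02d..." : the bare %s has no inlined
   value, so it stays a parameter of the resulting function. *)
let f = fun s -> Printf.sprintf "...%s...%02d..." s var
let s4 = f "x"
```

    Here [s3] is "...07..." and [s4] is "...x...07...".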

** Perl like sub-shell call $``

    SYNTAX

        $`command line` ~f:func

    Runs [command line] in a sub-shell and passes its stdout/stderr output
    to the function [func].

** Perl like hashtbl access tbl${key}

    Macros provide Perl like syntax for Hashtbl.xxx functions.
    
    SYNTAX
      
        tbl${key}             Hashtbl.find tbl key
        tbl${key} <- data     Hashtbl.replace tbl key data
     
        tbl$+{key}            Hashtbl.find_all tbl key
        tbl$+{key} <- data    Hashtbl.add tbl key data
     
        tbl$?{key}            Hashtbl.mem tbl key
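
    For reference, the Hashtbl calls on the right-hand side behave as
    follows in plain OCaml. The macro forms appear only in comments; the
    table contents are made-up sample data.

```ocaml
let tbl : (string, int) Hashtbl.t = Hashtbl.create 16

(* tbl${"a"} <- 1, then tbl${"a"} <- 2 : replace keeps a single binding. *)
let () = Hashtbl.replace tbl "a" 1
let () = Hashtbl.replace tbl "a" 2

(* tbl${"a"} *)
let a = Hashtbl.find tbl "a"

(* tbl$+{"b"} <- 10, then tbl$+{"b"} <- 20 : add stacks bindings. *)
let () = Hashtbl.add tbl "b" 10
let () = Hashtbl.add tbl "b" 20

(* tbl$+{"b"} : all bindings, most recently added first. *)
let bs = Hashtbl.find_all tbl "b"

(* tbl$?{"c"} *)
let has_c = Hashtbl.mem tbl "c"
```

    Here [a] is 2, [bs] is [20; 10] and [has_c] is false.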