Commits

cacol89 committed 71e27c3

added some documentation, removed query flattening

Comments (0)

Files changed (4)

 	* added support for equalities in select clauses and added some tests 
 	* added yadi_utils file and placed eval data structures there
 	* fixed test and added compatibility with flattening 
-	* 
+
+2013-01-27: First official YADI release. Added next functionalities to the first release:
+
+        * Automated tests (make tests)
+        * Automated setup of testdb (make testdb)
+        * Perform arity-checks
+        * Include single-line comments
+        * Enable constants inside goals and rule-heads
+        * Support for inequalities in the bodies
+        * Support for multiple definitions of predicates (rule union)
+        * Improved error reporting
+        * Included parameters by flags when calling the executable
+        * Enabled yadi to read files through the flag -f
+        * Enabled yadi to calculate recursive programs
+        * Enabled yadi to solve negation in predicates
+        * Made yadi solve programs by stratification
+        * Included support for aggregate functions
 
 1. Uncompress the source archive and go to the root directory 
 2. Run 'make'
-
-[Remark: autoconf and automake greatly help in the installation process. I recommend you use them for your own development.]
+3. The final executable file can be found in the 'bin' directory and its called 'yadi'.
+4. For starting to use the program, just execute the 'yadi' file, for information about the
+command line, pass the "--help" flag.
 
 Uninstalling
 ============
 
-TODO
+1. Just remove the directory with the yadi source.
 
 The purpose of the project is to develop a Command Line Interpreter for Datalog queries. It aims at converting Datalog queries into SQL statements. Then, a backend Relational Database is in charge of evaluating the query. 
 
+So far, YADI is able to solve recursive Datalog programs with negation. Nevertheless, recursive programs must satisfy the following five conditions for being accepted and solved by YADI:
 
-Pre-requisites
--------------------------
+1.- No two predicates are mutually recursive; the only allowed form of recursion is self-recursion.
+2.- A predicate cannot make a negative recursive call to itself.
+3.- A predicate cannot contain more than one recursive goal (call to itself) in its body.
+4.- If the predicate is defined by several rules (union of rules), then only one of the definitions can have a recursive call.
+5.- Every recursive predicate must have a non-recursive definition (base case).
+
+Because of this rules, the language accepted by YADI is more powerful than semipositive-datalog but less powerful than stratified-datalog.
+
+YADI also accepts aggregate functions, they must be defined in the head of a rule, for example:
 
-It is madatory to connect a PostgreSQL server, with a role, a database and at least one populated table to formulate queries against that table.
-	
-We already have a database schema for testing purposes. You can execute the sql file below
+Q(x,y,sum(z)) :- P(x,y) and R(y,z).
 
-	yadi/test/sql/employee.sql
+Here the resulting set will consist of 'x,y' tuples and the sum of 'z' variables of these tuples. Aggregation comparison is also accepted by YADI, for example:
 
-with the following statement :
-	psql -d test_db -a -f yadi/test/sql/employee.sql
+Q(x,y,count(z)) :- P(x,y) and R(y,z) and count(z) >= 10.
 
-be careful with the file owner permissions, you might need to give "postgres" user read/write permissions 
-If your database schema is ready, now we can try to make YADI work.		
+Please refer to the documentation for more information about YADI's expressive power.
+
+Pre-requisites
+-------------------------
 
+For running YADI it is necessary to have access to a PostgreSQL database. As it can be seen below, when starting the program it will be necessary to indicate a username, database name, host, port and password. Obviously the provided user will need to have write/read permission on the provided database.
 
 
 Usage
 -------------------------
 
-./yadi <connexion info>
 
-Example:
-./yadi "host=localhost port=5432 user=test password=test dbname=test_db"
+./yadi [OPTIONS]
 
-You must see something like this :
-dbname    = test_db
-user      = postgres
-password  = *password*
-host      = localhost
-port      = 5432
-tty       = 
-options   = 
-pid       = 16007
---------------
-DB Schema:
-Departments(budget,name,code) :- .
-Employees(department,lastname,name,empid) :- .
-Phonenumber(employee,number) :- .
+OPTIONS:
+  -h host     : database server host (default: "yadi")
+  -p port     : database server port (default: "5432")
+  -U user     : database user (default: "yadi")
+  -w password : database user password (default: empty)
+  -d dbname   : database name to connect to (default: "yadi")
+  -db         : print debugging information
+  -f file     : read program from file
+  -help  Display this list of options
+  --help  Display this list of options
 
+Observe that the flag "-db" will make YADI print debugging information, such as the stratification of programs, generated SQL, database schema, etc.
 
-Sample Queries
+Sample queries
 -------------------------
-Alldept(x,y,z) :- Departments(x,y,z).
-?- Alldept(x,y,z).
-/
 
-Allemp(x,y,z,w) :- Employees(x,y,z,w).
-?- Allemp(x,y,z,w).
-/
+The directory 'test' contains several test cases. Specifically, the directory 'test/sql' contains several sample databases. A script for setting automatically the testdb has been included, it can be invoked by running "make testdb". However the file 'test/database.conf' must be defined with the connection information, this file must have the following format:
 
-All_pho(x,y) :- Phonenumber(x,y).
-?- All_pho(x,y).
-/
+host=localhost
+port=5432
+user=myuser
+password=secret
+dbname=mydb
+ 
+Furthermore, the directory 'test/integration' contains several test cases. A script for automatically running the tests was also included, running 'make tests' will invoke it.
 
+In summary, for setting the test DB and testing:
 
+1.- Create the file 'test/database.conf' and fill in the required information.
+2.- Run 'make testdb', this will set the test database.
+3.- Run 'make tests', the tests will be run.
 
 ---------------------------------------------------------------------------
 
 Contact Information and Contributing
 ------------------------------------
 
-In the case of bugs, feature requests, contributions and similar, please
+In the case of bugs, feature requests, contributions and similar, please contact
 
 Carlos Colmenares   <carlos.a.colmenares.r@gmail.com>
-Mark Abspoel        <mail@markabspoel.nl>
-Jose Robles Noriega <atease@vista.aero>
-Hasan Saygin Arkan  <sayginarkan@gmail.com>
 
-based on the core code of maintainer :
+Based on the core code of maintainer:
   * Guillaume Raschia <guillaume.raschia@univ-nantes.fr>
+Past colaborators of the project:
+  * Mark Abspoel        <mail@markabspoel.nl>
+  * Jose Robles Noriega <atease@vista.aero>
+  * Hasan Saygin Arkan  <sayginarkan@gmail.com>
 
 Up-to-date information should be available at:
 <https://bitbucket.org/cacol89/yadi/>

src/query_flattening.ml

-(**
- * This file contains operations for flattening a query, given the DB's edb,
- * the main function is 'edb_query'.
- *)
-
-open Expr ;;
-
-(** get the query expression, 
- *  check if there is one query, or more than one 
- *  @param get_query takes input assumed to be query 
- *  @return true if there is query, otherwise error statements *)
-let get_query e = match e with
-    | Prog sttl    -> ( let lq = ( List.filter (fun r -> match r with
-                                                    | Query _ -> true 
-                                                    | _ -> false) sttl ) in
-                    match lq with 
-                        | []     -> failwith "error: there is no query"
-                        | h::[]    -> h
-                        | h::_ -> failwith "error: there are too many queries"
-                    )
-;;
-
-(** substitute a variable 'var' with a numbered var, using and extending
- * if necessary the symbol table hash *)
-let sub_var var hash n = match var with
-    | NamedVar x -> if Hashtbl.mem hash x
-        then NumberedVar (Hashtbl.find hash x)
-        else (n := !n + 1; Hashtbl.add hash x !n; NumberedVar !n)
-    | AnonVar    -> n := !n + 1; NumberedVar !n
-    | NumberedVar x -> NumberedVar x
-;;
-
-(** substitute all variables in the rterm 'rterm' *)
-let subvars_rterm rterm hash n = match rterm with
-    | Pred (x, vl) -> (Pred (x, List.map (fun x -> sub_var x hash n) vl))
-;;
-
-(** substitute all variables in the term 'term' *)
-let rec subvars_term term hash n = match term with
-    | Rel r        -> Rel (subvars_rterm r hash n)
-    | Equal (s, i) -> Equal (sub_var s hash n, i)
-    | Not t        -> subvars_term t hash n
-;;
-
-
-(** substitute all variables in the rule 'rule' *)
-let subvars_rule rule hash n = match rule with
-    | Rule (r, t) ->
-            let r2 = subvars_rterm r hash n in
-            let t2 = List.map (fun x -> subvars_term x hash n) t in
-            Rule (r2, t2)
-    | Query _ -> invalid_arg "subvars_rule"
-;;
-
-let hash_size = 100 ;;
-
-exception IncompleteProgramError ;;
-
-(** sort the list of rules 'e' such that the predicate in the head of each rule
- * only depends on the rules that preceed it.  
- * @param edb is a string list of predicate names in the edb 
- * @return If the procedure cannot derive each predicate
- * from the edb, it throws an IncompleteProgramError*)
-let sorted_rules edb e = match e with
-    | Prog sttl -> let lq = ( List.filter (fun r -> match r with
-                                                    | Query _ -> false 
-                                                    | Rule _ -> true) sttl ) in
-        (* to be applied iteratively: preds is a Hashtbl of hashes that can be
-         * derived from the edb. n is the number of items in preds, idb is a
-         * list of rules that haven't been derived yet *)
-        let build preds n idb =
-            List.filter (fun rule -> match rule with
-            | Rule (h, t) -> 
-                if List.for_all (fun x -> Hashtbl.mem preds (get_predname x)) t then
-                    (n := !n + 1; Hashtbl.add preds (get_rterm_predname h) !n; false)
-                else
-                    true
-            | Query _ -> invalid_arg "build"
-            ) idb in
-        let n = ref (List.length edb) in
-        let preds = Hashtbl.create hash_size in
-        let _ = List.iter (fun x -> n := !n + 1; Hashtbl.add preds x !n) edb in
-        let idb = ref lq in
-        let k = ref !n in
-        let break = ref false in
-        while not !break do
-            idb := build preds n !idb;
-            if !n <= !k then break := true else k := !n
-        done;
-        if List.length !idb > 0 then
-            raise IncompleteProgramError;
-        let cmp preds a b =
-            if (get_rule_predname a) = (get_rule_predname b) then 0 else
-            if Hashtbl.find preds (get_rule_predname a) < Hashtbl.find preds (get_rule_predname b) then -1 else 1 in
-        List.sort (cmp preds) lq
-;;
-
-(** make a Hashtbl out of a list *)
-let list_hash l =
-    let h = Hashtbl.create hash_size in
-    List.iter (fun x -> Hashtbl.add h x true) l; h
-;;
-
-(** statement's body (term list) *)
-let stt_body st = match st with
-    | Rule (h, t) -> t
-    | Query q     -> [Rel q]
-;;
-
-(** get query's rterm *)
-let get_query_rterm r = match r with
-    | Query q -> q
-    | _ -> invalid_arg "get_query_term"
-;;
-
-(** check if any numbered variable used *)
-let int_of_var var = match var with
-    | NumberedVar i -> i
-    | _ -> invalid_arg "int_of_var"
-;;
-
-(** try substitute the term 'term' using rule 'rule'. Returns list term * bool. *)
-let subterm term rule sym n = match term with
-    | Rel r -> if (get_rterm_predname r = get_rule_predname rule) then
-            let rterm = subvars_rterm r sym n in
-            let sub_sym = Hashtbl.create hash_size in
-            let k = ref 0 in
-            (List.iter2 (fun x y -> Hashtbl.add sub_sym x y; incr k)
-                (List.map string_of_var (get_rterm_varlist (rule_head rule)))
-                (List.map int_of_var (get_rterm_varlist rterm));
-            (List.map (fun x -> subvars_term x sub_sym k) (stt_body rule), true))
-        else
-            ([term], false)
-    (* TODO: support NOT *)
-    | _ -> invalid_arg "subterm"
-;;
-        
-
-exception VariableResolutionError;;
-
-(* TODO: check for consistent arity for edb/idb predicates *)
-(** repeatedly process rules: given a program, find a rule using just edb
- * terms that is equal to the given query *)
-let edb_query edb e =
-    let sym = Hashtbl.create hash_size in
-    let n = ref 0 in
-    let q = subvars_rterm (get_query_rterm (get_query e)) sym n in
-    let arity = match (get_query e) with
-        | Query q -> get_arity q
-        | _ -> 0
-    in
-
-    (* check for double variable in query head *)
-    if arity > !n then raise VariableResolutionError ; 
-
-    let rules = sorted_rules edb e in
-    let edb_hash = list_hash edb in
-    let rec sub q =
-        try
-            let rule = List.find
-                (fun y -> List.exists (fun x -> ((get_predname x) = (get_rule_predname y))) q)
-                rules in
-            
-            let found = ref false in
-            let q2 = List.map (fun x -> if !found then [x] else 
-                let subbed, f = subterm x rule sym n in
-                if f then
-                    (found := true;
-                    subbed)
-                else
-                    [x]
-            ) q in
-            sub (List.flatten q2)
-        with exn ->
-            (* TODO: check whether this only contains terms from the edb *)
-            q
-        in
-    Rule (q, sub [Rel q])
-            
-;;