+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
+ <title>File Selection Language Reference</title>
+ <link rel="stylesheet" href="fsl.css" type="text/css">
+ <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
+ <meta name="author" content="Kristian Ovaska">
+<h1>File Selection Language Reference</h1>
+Version 0.5 (2005-10-21)<br>
+Kristian Ovaska (kristian.ovaska [at] helsinki.fi)
+<li><a href="#overview">1. Overview</a>
+<li><a href="#general-syntax">2. General syntax</a>
+<li><a href="#glob-patterns">3. Glob patterns</a>
+<li><a href="#rules">4. Rules</a>
+ <li><a href="#glob-list">4.1 Glob list rule</a>
+ <li><a href="#for-each">4.2 For-each rule</a>
+ <li><a href="#in-block">4.3 IN-block</a>
+ <li><a href="#if-block">4.4 IF-block</a>
+<li><a href="#expressions">5. Expressions</a>
+ <li><a href="#expressions-general">5.1 General</a>
+ <li><a href="#functions">5.2 Built-in functions</a>
+<li><a href="#eval-order">6. Rule evaluation order</a>
+<li><a href="#examples">7. Examples</a>
+<h2 id="overview">1. Overview</h2>
+File Selection Language (FSL) is a descriptive language for
+file selection. An FSL program, also called a rule set, is
+made out of rules. Each rule tells whether a file should
+or should not be included in the file set.
+FSL rules utilize glob patterns. The pattern <tt>*</tt> matches
+all files, <tt>dir1/*</tt> matches all files under dir1,
+<tt>dir1/somefile</tt> matches only the file dir1/somefile, and
+However, FSL rules are not limited to bare globs. See
+<a href="#rules">below</a> for a full specification of the rules.
+There are two basic kinds of rules: inclusive and exclusive.
+Inclusive rules tell that certain files should be included
+in the file set. When evaluating an inclusive rule, the
+file system is scanned for all files matching the rule.
+An exclusive rule (which starts with "NOT") is an exception to an
+inclusive rule. It says that even if a file matched an
+inclusive rule earlier, it must be excluded from the file set.
+Notice that exclusive rules don't cause any file system
+scanning by themselves: all scanning comes from inclusive rules.
+If you exclude a directory with an exclusive rule, you exclude
+all files and subdirectories of it as well. This is a good
+way to speed up file scanning.
+<h2 id="general-syntax">2. General syntax</h2>
+The program consists of a list of rules, which are separated by newlines.
+Simple rules (usually) fit on one line, while complex ones span several lines.
+The character # marks the beginning of a comment. Everything after it
+on the same line is ignored.
+Inside so-called block rules (see <a href="#rules">below</a>), the child
+rules must be indented with spaces or tabs. You can choose the indentation
+level, but you must always use the same amount within the rule file.
+Mixing spaces and tabs is unwise. Nested blocks are indented in the
+Simple (non-block) rules generally fit on one line. Due to the block
+indentation system, simple rules normally cannot span several lines.
+However, expressions (see <a href="#expressions">below</a>)
+that have an open parenthesis can span several lines freely.
+Python programmers will recognize the FSL indentation system.
+Both keywords and glob patterns are case-insensitive.
+<h2 id="glob-patterns">3. Glob patterns</h2>
+When you write glob patterns, you can use two forms: bare strings
+and quoted strings. Bare strings are written as-is, while quoted
+strings have quotation marks around them.
+Bare string: <tt>dir1/*</tt><br>
+Quoted string: <tt>"dir1/*"</tt>
+There are limitations to bare strings. Bare strings:
+<li>may not contain whitespace
+<li>may not contain the characters <tt>, ( ) " < > = ( ) # | !</tt>
+<li>may not be a reserved word: AND, EACH, IF, IN, NONREC, NOT, OR
+<li>can't be used in expressions (see <a href="#expressions">below</a>)
+For example, the pattern "aaa bbb" must be a quoted string.
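+The bare-string limitations above can be sketched as a small check. This is
+only an illustration in Python, not the FSL interpreter's own code: the name
+needs_quotes is hypothetical, the treatment of an empty pattern is an
+assumption, and the rule about expressions is not modelled because it depends
+on where the pattern appears.

```python
# Characters the reference lists as forbidden in bare strings.
FORBIDDEN_CHARS = set(',()"<>=#|!')
# Reserved words the reference lists; keywords are case-insensitive.
RESERVED_WORDS = {"AND", "EACH", "IF", "IN", "NONREC", "NOT", "OR"}


def needs_quotes(pattern: str) -> bool:
    """Return True if the pattern cannot be written as a bare string."""
    if not pattern:
        return True  # assumption: an empty pattern has no bare form
    if any(ch.isspace() for ch in pattern):
        return True
    if any(ch in FORBIDDEN_CHARS for ch in pattern):
        return True
    # Compare in upper case because keywords are case-insensitive.
    return pattern.upper() in RESERVED_WORDS


if __name__ == "__main__":
    for candidate in ("dir1/*", "aaa bbb", "not", "a,b"):
        print(candidate, needs_quotes(candidate))
```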
+Glob patterns may contain both forward slashes (<tt>/</tt>) and
+backward slashes (<tt>\</tt>). Forward slashes work on Windows, too,
+and backward slashes work on Unix. Glob patterns may contain full
+Windows drive specifiers (e.g. <tt>c:\somedir\*</tt>); they don't obviously
+By default, glob patterns are recursive, i.e. <tt>*</tt> matches all files,
+including those in subdirectories. You get nonrecursive behaviour by
+appending "NONREC" to the glob pattern. For example, <tt>* NONREC</tt>
+matches only the files in the current root directory, but not in
+There are two flavours of glob patterns: absolute and relative.
+Absolute patterns start with a forward or backward slash or a Windows
+drive specifier. A pattern that is not absolute is, logically, a
+Relative glob patterns are evaluated in the context of a root
+directory. By default, the root directory is the current working
+directory, but may be set to any directory.
+For example, the rule <tt>*</tt> will produce all files in the
+file system if the root directory is the file system root, but only
+the files under <tt>/usr/local</tt> if the root directory
+The root directory is given to the FSL interpreter as parameter. Also,
+so-called IN-blocks (see <a href="#in-block">below</a>) change the
+effective root directory temporarily. Absolute patterns are not
+allowed inside IN-blocks.
+Usually, it is better to use relative globs, since they are more
+flexible than absolute globs.
+Absolute globs are always evaluated in the context of the same
+root directory, the file system root.
+Let's say you have created a relative rule-set for your Unix machine
+that you normally evaluate with <tt>/</tt> as the root directory.
+Some day, you mirror the file system onto another Unix machine
+(or a Windows machine using Samba), into the directory <tt>/usr/somedir</tt>.
+Now, you can simply use your existing relative rule-set. This wouldn't
+be possible if you had hard-coded all the paths into the rules.
+<h2 id="rules">4. Rules</h2>
+There are two basic rule types: <a href="#glob-list">glob list</a> rule
+and <a href="#for-each">for-each</a> rule.
+Both may be prefixed with "NOT", which makes them exclusive rules.
+There are also two compound rule types: <a href="#in-block">IN-block</a>
+and <a href="#if-block">IF-block</a>.
+<rule> := (NOT)? <glob-list>
+ | (NOT)? <for-each>
+ | IN <directory> <start-block> <rule>+ <end-block>
+ | IF <expression> <start-block> <rule>+ <end-block>
+<h3 id="glob-list">4.1 Glob list rule</h3>
+This is the most basic rule. A glob list rule is, as the
+name implies, a list of glob patterns separated by commas.
+A file matches a glob list rule if it matches any of the globs.
+In glob patterns, bare strings and quoted strings may be mixed
+freely, as can recursive and nonrecursive (NONREC) glob patterns.
+<glob-pattern> (, <glob-pattern>)* (IF <expression>)?
+The IF-expression is optional. If present, the
+glob list rule is applied only if the <a href="#expressions">expression</a>
+is true. The expression is evaluated only once, not for every
+file. When using expressions, you usually want to evaluate
+the expression for every file in turn. In this case,
+you have to use the <a href="#for-each">for-each</a> rule.
+somefile, "some file with spaces"
+ (excludes both *.ps and *.eps)
+*.html IF exists("index.html")
+ (include *.html files only if index.html is present)
+<h3 id="for-each">4.2 For-each rule</h3>
+EACH <variable name> (IN <glob list>)? IF <expression>
+A for-each rule is an enhanced glob list. Each file matched by
+the glob list is included/excluded only if the <a href="#expressions">expression</a>
+The expression is evaluated for every file in turn.
+The IN-section is optional. If omitted, the glob <tt>*</tt> is used.
+EACH f IN * IF size(f) > 1024 (include files larger than 1 KB)
+EACH f IF size(f) > 1024 (the same)
+NOT EACH f IN *.ps IF date(f) < "2000"
+ (excludes *.ps files from the previous millennium)
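+The per-file evaluation described for the for-each rule can be approximated
+in Python as follows. This is a sketch under assumptions, not the FSL
+implementation: the function name each and its parameters are hypothetical,
+and fnmatch is used as a stand-in for FSL globs (its "*" also matches "/",
+which roughly mimics recursive patterns).

```python
import fnmatch
import os
from typing import Callable, Iterator


def each(root: str, pattern: str,
         predicate: Callable[[str], bool]) -> Iterator[str]:
    """Yield root-relative paths that match the glob and satisfy the predicate."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            # Lower-casing both sides mimics case-insensitive matching.
            if not fnmatch.fnmatchcase(rel.lower(), pattern.lower()):
                continue
            # The predicate is evaluated once per matching file.
            if predicate(full):
                yield rel


if __name__ == "__main__":
    # Roughly: EACH f IN * IF size(f) > 1024
    for path in each(".", "*", lambda f: os.path.getsize(f) > 1024):
        print(path)
```

+In contrast, the IF-expression of a plain glob list rule would be evaluated
+once, before any file is visited.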
+<h3 id="in-block">4.3 IN-block</h3>
+An IN-block contains a list of rules that are executed in a different
+root directory. The effective root directory is calculated by
+concatenating the previous root directory and the directory given
+The rules under the IN-block can be any rules: glob lists,
+for-each rules, or other IN-blocks.
+All glob patterns must be relative. The directory specifier may
+be absolute if the IN-block is a top-level IN-block. In a nested
+IN-block, the directory specifier must also be relative.
+<p>This includes all files under dir1 and is exactly the same as the rule
+Example of nested IN-blocks:
+This matches the files <tt>dir1/dir2/dir3/*</tt>.
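+The way nested IN-blocks build up the effective root directory can be
+sketched in Python. The helper effective_root is hypothetical, and two
+details are assumptions rather than documented behaviour: that an absolute
+top-level directory simply replaces the previous root, and that a drive
+specifier is recognised as a letter followed by a colon.

```python
import posixpath


def effective_root(previous_root: str, directory: str, nested: bool) -> str:
    """Combine the previous root with the directory given to an IN-block."""
    normalized = directory.replace("\\", "/")
    has_drive = (len(normalized) >= 2 and normalized[0].isalpha()
                 and normalized[1] == ":")
    if normalized.startswith("/") or has_drive:
        if nested:
            # Nested IN-blocks must use a relative directory specifier.
            raise ValueError("nested IN-block directory must be relative")
        return normalized  # assumption: absolute directory replaces the root
    return posixpath.join(previous_root, normalized)


if __name__ == "__main__":
    root = effective_root("", "dir1", nested=False)
    root = effective_root(root, "dir2", nested=True)
    root = effective_root(root, "dir3", nested=True)
    print(root)  # dir1/dir2/dir3
```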
+<h3 id="if-block">4.4 IF-block</h3>
+An IF-block is a bit like a glob list rule with an IF-expression,
+but an IF-block may contain several rules. The rules are applied
+only if the expression evaluates to true. The expression is
+<h2 id="expressions">5. Expressions</h2>
+<h3 id="expressions-general">5.1 General</h3>
+Expressions are used in for-each rules, glob list rules and IF-block rules
+to determine whether a rule should be applied. Each expression evaluates
+Expressions are made of:
+<li>integer, floating point, string and timestamp literals, e.g. <tt>50</tt>,
+ <tt>5.23</tt>, <tt>"abc"</tt>, <tt>"2005-08-05 21:30"</tt>
+<li>variable references if inside a for-each rule, e.g. <tt>f</tt>
+<li>function calls, e.g. <tt>size(f)</tt>, <tt>now()</tt>
+<li>logical operators NOT, AND, OR, e.g. "<tt>NOT expr</tt>",
+ "<tt>expr1 AND expr2</tt>", "<tt>expr1 OR expr2</tt>"
+<li>comparison operators <tt>< <= > >= = !=</tt>
+ (these work for numbers, strings and timestamps)
+Timestamp literals are written inside quotation marks just like strings.
+However, they are converted to a "real" timestamp representation internally.
+An invalid timestamp literal results in a parse error.
+Accepted timestamp formats are:
+<li><tt>yyyy</tt> (month=1, day=1, hour=0, minute=0, second=0)
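+As an illustration of how such literals might be converted, here is a Python
+sketch. It covers only the two forms visible in this reference, <tt>yyyy</tt>
+and the <tt>"2005-08-05 21:30"</tt> style used in the literal example; any
+other accepted formats are not modelled. The function name parse_timestamp
+is hypothetical, and ValueError stands in for the parse error.

```python
from datetime import datetime

# Only the formats visible in this reference; others are not modelled.
FORMATS = ("%Y", "%Y-%m-%d %H:%M")


def parse_timestamp(literal: str) -> datetime:
    """Convert a timestamp literal into a datetime, or raise ValueError."""
    for fmt in FORMATS:
        try:
            # Fields missing from the format default to month=1, day=1,
            # hour=0, minute=0, second=0.
            return datetime.strptime(literal, fmt)
        except ValueError:
            continue
    raise ValueError(f"invalid timestamp literal: {literal!r}")


if __name__ == "__main__":
    print(parse_timestamp("2000"))
    print(parse_timestamp("2005-08-05 21:30"))
```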
+Notice that logical NOT (inside an expression) is
+conceptually different from the exclusion NOT before a rule.
+Logical NOT merely reverses the truth value of an expression.
+<expression> := <simple-expression> ((AND | OR) <expression>)?
+<simple-expression> := (NOT)? <atom> (<compare-op> <atom>)?
+ | (NOT)? "(" <expression> ")"
+<atom> := <string>
+ | <variable-name>
+ | <function-name> "(" <atom> ")"
+<compare-op> := "<" | "<=" | ">" | ">=" | "=" | "!="
+Expressions with an open parenthesis can span several lines, unlike
+normal simple FSL rules. Inside the expression, indentation doesn't
+EACH f IN *.txt IF (size(f) < 1000
+The following example would be a syntax error because there
+is no open parenthesis:
+EACH f IN *.txt IF size(f) < 1000
+<h3 id="functions">5.2 Built-in functions</h3>
+Expressions can use built-in functions, which can be divided into two
+categories: predicate and value functions. Predicate functions return
+a truth value and value functions return a value (like a number).
+Value function calls can't be used as complete expressions by themselves;
+they must be combined with comparison operators.
+ <th align="left">Function
+ <th align="left">Description
+ <td><tt>filename -> float</tt>
+ <td>Return age of file in days as floating point number (based on modification date)
+ <td><tt>filename -> filename</tt>
+ <td>Return file name without (outermost) extension,
+ e.g. for filename <tt>"dir/aaa.ext"</tt>, return <tt>"dir/aaa"</tt>
+ <td><tt>filename -> datetime</tt>
+ <td>Return modification date of file
+ <td><tt>filename -> boolean</tt>
+ <td>Return true if given file exists
+ <td><em>extract(time, part)</em>
+ <td><tt>datetime, string -> int</tt>
+ <td>Extract part from timestamp. Part is one of "year", "month", "day",
+ "hour", "minute", "second", "week", "weekday".
+ <td><tt>-> datetime</tt>
+ <td>Return current time
+ <td><tt>filename -> int</tt>
+ <td>Return size of file in bytes
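+Two of the descriptions in the table can be approximated with the Python
+standard library. The names strip_extension and age_in_days are hypothetical
+and are not claimed to be the FSL function names; the sketch only mirrors
+the described behaviour.

```python
import os
import time


def strip_extension(filename: str) -> str:
    """Return the file name without its outermost extension."""
    return os.path.splitext(filename)[0]


def age_in_days(filename: str) -> float:
    """Return the age of a file in days, based on its modification time."""
    seconds = time.time() - os.path.getmtime(filename)
    return seconds / 86400.0  # 86400 seconds in a day


if __name__ == "__main__":
    print(strip_extension("dir/aaa.ext"))  # dir/aaa
    print(age_in_days(__file__))
```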
+<h2 id="eval-order">6. Rule evaluation order</h2>
+Rules are evaluated from the first to the last and the last
+matching rule is applied.
+For example, consider the following rules:
+The first rule matches all files, but the second rule excludes
+all *.jpg files. As a result, all files except *.jpg files are
+included in the file set.
+This matches all files, including *.jpg files, because
+the last rule (*) says to include all files. Also, no exclusive
+rule at the beginning of a rule set ever has any effect.
+Indeed, the FSL interpreter warns you in this case:
+ Warning: exclusive rule at beginning - has no effect
+Usually, you should have inclusion rules at the beginning of the
+rule set and exclusion rules at the end.
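+The "last matching rule is applied" principle can be sketched in a few lines
+of Python. This is a simplified model, not the interpreter: a rule is
+reduced to a hypothetical (inclusive, pattern) pair, fnmatch stands in for
+FSL globs, and file system scanning is left out.

```python
import fnmatch
from typing import List, Tuple

# (inclusive, pattern): inclusive is False for rules prefixed with NOT.
Rule = Tuple[bool, str]


def is_included(path: str, rules: List[Rule]) -> bool:
    """Apply rules first to last; the last matching rule decides."""
    included = False  # a file matched by no rule is not in the file set
    for inclusive, pattern in rules:
        if fnmatch.fnmatchcase(path.lower(), pattern.lower()):
            included = inclusive
    return included


if __name__ == "__main__":
    include_then_exclude = [(True, "*"), (False, "*.jpg")]
    exclude_then_include = [(False, "*.jpg"), (True, "*")]
    print(is_included("photo.jpg", include_then_exclude))  # False
    print(is_included("photo.jpg", exclude_then_include))  # True
```

+In this model an exclusive rule placed first can never change the result,
+which matches the warning shown above.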
+<h2 id="examples">7. Examples</h2>
+Includes the config.ini file under the root directory, if present.
+Includes config.ini files under all directories, except in my/prog
+Up: <a href="index.html">FSL index</a><br>