Commits

andrew cooke committed b6b5f22

choco + regex

  • Participants
  • Parent commits b467491

Comments (0)

Files changed (1)

File _posts/2012-04-11-choco-regular.md

+---
+layout: post
+title: 'Choco and Regular Expressions'
+tags: andrew choco clojure
+comments: true
+---
+
+{{ page.title }}
+================
+
+A quick update to the series on [Choco][choco] and pipes last week.
+
+An alternative way to specify the neighbour constraints is to use regular
+expressions.  That may seem a little odd, but much of the more advanced
+constraints in Choco are based around DFAs (Discrete Finite Automata).
+
+DFAs are the "science" behind regular expressions - they are what a regular
+expression is compiled to (at least in theory; in practice many regular
+expressions are not strictly regular, and [many regex implementations are
+poor][cox-regex]).
+
+How does this work?
+
+Well, a DFA is just a set of rules (transitions) associated with particular
+states.  You start at the start state and look at the first character; the
+rules for the start state tell you which to state to move to before you look
+at the next character.  And so it repeats, until you have either matched all
+characters, or there is no suitable rule.
+
+That's a pretty powerful and general approach, so building it into Choco makes
+for a flexible constraint engine.  At least in theory.  In practice, I found
+it a little confusing at first, but I'm starting to understand and have just
+got a simple test case working.
+
+The basic idea is the values associated with variables are the "characters" in
+the regular expression (for integer values you really can use a regex -
+something like `"123"` means values 1, 2 and 3 for three variables, while
+`"<10><11>"` is the syntax for 10 and 11 - but you can also construct the DFA
+directly, and there are a bunch of simpler API methods, like the one I'll use
+below).  Obviously, this means that along with the regular expression you need
+to specify which variables it applies to (and their order, which is implicit
+in the array used in the API).
+
+Back to my simple test case.  If you read the [earlier article][article-5] you
+may recall that we needed to constrain neighbouring tiles, so that pipes
+connect correctly.  I did that by defining binary relations for neighbouring
+variables.
+
+An alternative way to provide this constraint is to say that two neighbouring
+cells must have values that satisfy a regular expression, where the regular
+expression is simply the list of acceptable pairs.  And Choco provides a
+simple API for this: `Choco.regular(IntegerVariable[], List<int[]>)`.
+
+Here's the new code, which replaces the earlier `constrain-pairs`:
+
+{% highlight clojure %}
+(defn constrain-pairs-regular [model n lo hi masks relation? variables swap]
+  (let [tuples
+        (java.util.ArrayList.
+          (map int-array
+            (for [m1 masks, m2 masks :when (relation? m1 m2)] [m1 m2])))]
+    (doseq [[v1 v2]
+            (map #(map variables %)
+              (for [i (range (dec n)), j (range n)]
+                (map swap [[i j] [(inc i) j]])))]
+      (.addConstraint model
+        (Choco/regular (into-array IntegerVariable [v1 v2]) tuples)))))
+{% endhighlight %}
+
+It's almost identical to the original (a little simpler), but appears to
+trigger quite different behaviour in the engine.  Unfortunately the new
+behaviour is not an improvement - the benchmarks are 2-3 times slower
+(presumably because the DFA machinery has more overhead than the simple binary
+relations).
+
+Still, it's an extra weapon to add to the armoury for more complex programs in
+the future...
+
+As ever, latest code is in [the repo][pipes_repo].
+
+Andrew
+
+[cox-regex]: http://swtch.com/~rsc/regexp/regexp1.html
+[article-5]: http://isti.bitbucket.org/2012/04/07/pipes-clojure-choco-5.html
+[pipes_repo]: https://bitbucket.org/isti/clojure-pipes
+[choco]: http://www.emn.fr/z-info/choco-solver/