Commits

certainty  committed 4d58744

added draft for 'i did missbehave'

  • Parent commits add897e


Files changed (2)

File _posts/2012-10-25-dynamic_attributes.md

 {% include setup %}
 
 
-This short article describes dynamic attributes for ruby. Lately I've had to do with code that 
+This short article describes dynamic attributes for ruby. Lately I've had to do with code that
 showed a common pattern which I named <strong>dynamic attributes</strong>.
 
 
 ## The Problem
 
-Did you ever come across a situation where you had an instance-variable and wanted 
-a way to temporarily change the value of that variable during the execution of a block and have then the original 
+Did you ever come across a situation where you had an instance-variable and wanted
+a way to temporarily change the value of that variable during the execution of a block and have then the original
 value restored?
 
 That is what dynamic attributes are for.
 ### fluid-let
 
 {% highlight scheme %}
+
 (define counter 0)
 (define (bump-counter!) (set! counter (+ 1 counter)) counter)
 
 (fluid-let ((counter 10))
-  (display (bump-counter!)) 
+  (display (bump-counter!))
   (newline)
-  (display (bump-counter!)) 
+  (display (bump-counter!))
   (newline))
 
 (newline)
 (display counter)
 (newline)
+
 {% endhighlight %}
 
 <pre>
 Boot finished
 </pre>
 
-Parameters have other cool properties and features. For example they're interacting nicely with threads, support guard-procedures and such. 
+Parameters have other cool properties and features. For example they're interacting nicely with threads, support guard-procedures and such.
 
 But let's get back to ruby now. The following piece of code illustrates what a ruby-version of this could look like:
 
 {% highlight ruby %}
 class MyLogger
   dynamic_attr :context
-  
+
   def log(msg)
     puts "#{context} #{msg}"
   end
 end
 
 logger = MyLogger.new
 
 logger.with_context("Boot") do
   logger.log("starting up")
-  
+
   logger.with_context("Network") do
     logger.log("starting up")
   end
-  
+
   logger.log("finished")
 end
 {% endhighlight %}
-
-

File _posts/2013-03-27-i_did_missbehave.md

+---
+layout: post
+title: I did mis(s)behave
+tagline: Lessons learned from a failed project
+code:
+  url: http://bitbucket.org/certainty/missbehave
+  caption: missbehave
+tags: [scheme,missbehave,chicken,tdd,bdd,testing]
+---
+
+<h3 class="statement">It is in our failures that we learn the most important lessons</h3>
+
+I guess it is really hard to admit that a project has failed. That's probably why so many projects are carried
+on and on even though they failed a long time ago.
+
+One of my failed projects is a little scheme library called [missbehave](http://wiki.call-cc.org/eggref/4/missbehave).
+I intended to provide a testing framework that could be used for [TDD](http://en.wikipedia.org/wiki/Test-driven_development) and especially for [BDD](http://en.wikipedia.org/wiki/Behavior-driven_development). It was inspired by, and largely modeled after, the really neat [rspec library](http://rspec.info). If you're a ruby programmer
+and you don't know it yet, go ahead and give it a try.
+
+
+### How did it fail?
+
+Well, the most obvious sign was that even I, the developer of the library, didn't use it much.
+I used it to some extent, but whenever I wanted to make sure things worked and had to get things done, I switched to the [de facto
+testing library for chicken scheme](http://wiki.call-cc.org/eggref/4/test).
+And so did others.
+
+There weren't many who tried the library, and when they did they immediately ran into problems.
+Fixing those problems became harder and harder, which is another indicator of a failed project.
+
+While it did provide some new and useful features, it was just another testing library, and there
+were already very mature ones.
+
+
+### Missbehave: the bad parts
+
+Let me walk you through the parts of the library that are responsible for its failure. There are also things that I really like about
+the library, which I will outline in [the good parts](#missbehave_the_good_parts).
+
+#### Behavior verification
+
+That's one of the things the library aimed at. I intended to enable BDD in scheme.
+The problem is that most of the testing techniques that currently exist
+to support BDD are alien, or at least unnatural, for scheme. Let me briefly describe what that
+normally looks like and you'll realize that this is not what scheme is about most of the time.
+
+BDD by definition is an outside-in approach: top-down, interface-first.
+This means that the programmer starts with the definition
+of the interface and works his way from the outermost layer inwards.
+
+The interface in that context can indeed be the GUI or, more commonly, protocols
+describing the services a particular **object** provides. It is common to test behavior against objects that have not been implemented at that
+point. This is done by using test doubles or mocks, which stand in for the actual thing that will be implemented later.
+Often these mocks represent [depended-on components (DOCs)](http://xunitpatterns.com/DOC.html) that the [system under test (SUT)](http://xunitpatterns.com/SUT.html) interacts with. So if we want to make sure that the SUT behaves as expected, we cannot do that by just looking
+at its direct output; instead we have to verify it through the indirect output performed on the DOC. A method call does not just return
+a value (if it does), but also invokes methods on DOCs, which have often been injected. See also [dependency injection (DI)](http://xunitpatterns.com/Dependency%20Injection.html).
+
+This type of testing is called [behavior verification](http://xunitpatterns.com/Behavior%20Verification.html).
+
+In functional, or mostly functional, programs we rarely have these
+kinds of functions. We are in the fortunate position of being able to determine the correctness of a function just by looking
+at its return value. Indirect outputs would normally be side-effects in this context.
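+
+For example, a pure function can be verified entirely through its return value; sketched here with the test egg's `(test description expected expression)` form (the `discount` procedure is made up for this illustration):
+
+{% highlight scheme %}
+(use test)
+
+;; a pure function: no collaborators, no side-effects
+(define (discount price percent)
+  (- price (* price (/ percent 100))))
+
+;; its correctness is fully captured by the return value
+(test "10% off of 100" 90 (discount 100 10))
+{% endhighlight %}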
+
+That doesn't mean that scheme programs don't have side-effects, but they are rare and generally discouraged.
+That in turn means that I had provided a library that eases the testing of only a small fraction of the code you typically produce in scheme.
+That's not very useful, is it? So the bad idea here was:
+
+<h3 class="statement">I worked against the language and its characteristics.</h3>
+
+Functional programs aren't about behavior, but rather about values and computation. That doesn't mean that functional systems don't have behavior, but that behavior doesn't interest us much when we apply tests to the system.
+
+#### Trust
+
+Even I didn't have much trust in it. This may be partly
+due to the messy implementation and partly because there really were things that just didn't work.
+This lowered the overall trust in the library, and trust is an essential property of a tool that you use to make sure
+that your code works. That means that the testing tool needs to work correctly and work well.
+
+
+#### Procedure expectations
+
+Since scheme programs are usually not built in an OO fashion with compound objects and all that stuff, I provided a way
+to verify that a certain function has been called. Additionally you could verify that it has been called a certain number
+of times and with given arguments. The following is an example of that.
+
+{% highlight scheme %}
+(use missbehave missbehave-matchers srfi-1)
+
+(define (shuffle ls)
+  (let ((len (length ls)))
+    (if (< len 2)
+        ls
+        (let ((item (list-ref ls (random len))))
+          (cons item (shuffle (remove (lambda (i) (equal? i item)) ls)))))))
+
+(define (yodize str)
+   (string-intersperse
+    (shuffle
+      (string-split str " "))))
+
+(context "procedure expectations"
+  (it "calls shuffle once"
+     (expect (yodize "may the force be with you") (to (call shuffle once)))))
+
+{% endhighlight %}
+
+As I pointed out earlier, this can be useful in some situations, but those are rare.
+A point that is invisible to the user of the library, but still worth mentioning, is that the implementation of procedure expectations is somewhat hacky and brittle.
+It is implemented using a lot of mutation.
+
+
+#### Stubs and mocks
+
+As I explained under behavior verification, it is common to introduce test doubles, so the library added the possibility to mock procedures.
+Though the general idea might be pleasing, the particular implementation didn't feel right.
+I essentially redefined the procedures to have the desired behavior. Again I made heavy use of the [advice egg](http://wiki.call-cc.org/eggref/4/advice) to do this. See the following example that stubs the result of `car`.
+
+{% highlight scheme %}
+(use missbehave missbehave-matchers missbehave-stubs srfi-1)
+(stub! car (returns '()))
+(car (list 1 2 3))
+(car '())
+{% endhighlight %}
+
+Procedure stubs aren't that useful since in functional languages we are more concerned with the outcome, rather than with whether a procedure
+was invoked. Most likely we will have an interface that accepts a procedure or uses a parameter. For both cases we can
+provide implementations that fit our tests, without resorting to replacing a function's implementation. That's a natural
+property of higher-order functions. Actually, it's their defining property.
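+
+A minimal sketch of that alternative in plain scheme (the names `first-or-default` and `getter` are made up for this illustration): instead of stubbing `car` globally, the collaborator is passed in as an argument, so a test can simply provide its own.
+
+{% highlight scheme %}
+;; production code: the collaborator is a parameter, not a global
+(define (first-or-default ls getter default)
+  (if (null? ls) default (getter ls)))
+
+;; the real call site passes the real procedure
+(first-or-default '(1 2 3) car 0)                  ; => 1
+
+;; a test passes a replacement; no stub! machinery needed
+(first-or-default '(1 2 3) (lambda (ls) 'fake) 0)  ; => fake
+{% endhighlight %}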
+
+#### Hooks
+
+Contexts are a key part of the library; a context is a snapshot of the world in a given state. Contexts supported hooks that could
+be used to set up a certain state of the world at a given point in time, or rather at a given point inside the test cycle.
+In traditional test frameworks this is where your setup and teardown code resides. The following example illustrates this:
+
+{% highlight scheme %}
+(use missbehave missbehave-matchers missbehave-stubs srfi-1)
+
+(context "context with hooks"
+  (before :each (set! ($ 'answer) 42))
+  (it "should have the answer"
+    (expect ($ 'answer) (to (be 42)))))
+
+{% endhighlight %}
+
+As it turns out, this feature is really bad since it embraces mutable state and, what's even worse, it hides when the mutation happens.
+It's way clearer to just use let-bindings to share values across examples, and to use an explicit set! if you must.
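+
+A minimal sketch of that alternative, reusing the context/it/expect forms shown above (whether missbehave's macros actually let examples close over an outer let like this is an assumption here):
+
+{% highlight scheme %}
+(use missbehave missbehave-matchers)
+
+;; the shared value is bound explicitly; no hook mutates hidden state
+(let ((answer 42))
+  (context "context without hooks"
+    (it "should have the answer"
+      (expect answer (to (be 42))))))
+{% endhighlight %}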
+
+#### The runner
+
+This is something that turned out to complicate things. The library comes with a binary that is used to run missbehave tests. This means that
+you cannot just run the test file itself using csi or something. It also means that you can't compile your test file. This is really unfortunate,
+as the chicken CI expects the tests to work in a certain way, and without going through some hoops it was not possible to run missbehave in the
+context of [salmonella](http://tests.call-cc.org/). I added a way to do that later, as the following example shows:
+
+
+{% highlight scheme %}
+(use missbehave missbehave-matchers missbehave-stubs srfi-1)
+
+(run-specification
+  (call-with-specification
+    (make-empty-specification
+      (lambda ()
+        (it "should work")))))
+
+{% endhighlight %}
+
+Not exactly short, but it did work to some degree. The more problematic part was, again, an implementation detail. I had to jump through some hoops
+to make the runner work. It used some hacks in conjunction with eval that I'm not very proud of. You can check the [source code](https://bitbucket.org/certainty/missbehave/src/578b051764092dab0c5bd9c7d66640f44d281c25/behave.scm?at=default#cl-231) if you want to see it.
+
+The last problem is that, the way it was designed, it didn't work well (read: didn't work at all) in the REPL, and thus you could
+not use it to throw in some quick verifications to prove that you're on the right track.
+That is really bad for a lisp.
+
+
+
+### Missbehave: the good parts
+
+Now that I've shown you the bad parts, it's time to look at the things that I didn't mess up totally. There are some things that are valuable and
+nice to have. Indeed some of these things will make it into a new library that intends to honor the language more. It's a work in progress, but
+if you're curious you can take a peek at [veritas](https://bitbucket.org/certainty/veritas).
+
+
+#### The matcher abstraction
+
+missbehave introduced a thing called a matcher that was used to verify expectations. A matcher is
+a higher-order function that knows how to verify the behavior of the subject that is passed to it.
+It also knows how to generate messages for the failure and success cases.
+Matchers serve two goals.
+
+1. They are a means to extend the test library. That's a very lispy approach, as lisp itself is intended to be extended
+  by custom functions that look as if they belong to the lisp/scheme core itself.
+
+2. They improve the expressiveness of the tests. By creating clever matchers, the source code
+  is able to express what happens more clearly, possibly using vocabulary from the problem domain.
+
+The following code snippet shows these matchers and compares them to the equivalent tests using the [test egg](http://wiki.call-cc.org/eggref/4/test).
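+
+A rough sketch of what such a comparison might look like, assuming the matcher forms shown earlier in this post and the test egg's `(test description expected expression)` shape:
+
+{% highlight scheme %}
+;; missbehave: a matcher-based expectation that reads almost like prose
+(it "computes the answer"
+  (expect (* 6 7) (to (be 42))))
+
+;; test egg: the same check as a plain equality assertion
+(test "computes the answer" 42 (* 6 7))
+{% endhighlight %}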
+
+#### Meta information and filters
+
+The library provided a way to attach metadata to examples and contexts. The user could then use filters to run only examples that
+have corresponding metadata. This is a valuable feature as it gives you fine-grained control over which tests are run.
+For example you might have platform-dependent tests that you only want to run on the matching platform. You could tag your tests
+with the OS they support and run them filtered. Another example would be fast and slow tests, where you generally want to run the slow tests
+during CI but not so much during development. I think this is really useful, but it should be opt-in. And it should be orthogonal to the
+other features. In missbehave the syntax for examples and contexts supported a variation that was used to declare metadata.
+In that regard this feature was bound to the syntax of these things. What I want instead is to make this composable and usable "à la carte".
+That means you want to be able to mix and match contexts and metadata, and examples and metadata, without requiring them to know about each other.
+
+In missbehave it looks something like this:
+
+
+What I currently have in mind for veritas is:
+
+{% highlight scheme %}
+(use veritas)
+
+(meta (os: 'linux pace: 'slow)
+  (verify #t))
+
+(meta (os: 'linux)
+  (context "this is some context"
+    (verify #t)))
+
+{% endhighlight %}
+
+So that's completely orthogonal to the notion and syntax of contexts and examples. Also I want metadata to compose in such a way that
+nested metadata "adds up", so that the innermost expression holds the union of all metadata surrounding it.
+
+#### Pending tests
+
+Pending tests are extremely valuable, and I don't quite understand why they are not supported by the test egg, at least not directly.
+As the name suggests, you can temporarily disable the execution of tests by marking them as pending. The point is that these tests aren't run,
+but they are reported as pending, so that you know they are actually there. This means that you can't accidentally forget them.
+In missbehave you can define a pending test in two ways. The first is to mark it explicitly as pending, as the following example shows:
+
+{% highlight scheme %}
+(use missbehave missbehave-matchers missbehave-stubs)
+
+(describe "Pending"
+ (it "is explicitly pending"
+   (pending)
+   (expect '() (be a number))))
+{% endhighlight %}
+
+As you can see, you could add a call to pending at any point in the expectation, which would make the expectation exit early and skip the
+verification machinery. The second way is to make an example implicitly pending by omitting the body.
+
+{% highlight scheme %}
+(use missbehave missbehave-matchers missbehave-stubs)
+
+(describe "Pending"
+  (it "is implicitly pending"))
+
+{% endhighlight %}
+
+This is especially nice if you start by outlining the things you intend to test and then fill in the actual code.
+This way it's hard to forget any of the tests.
+
+So this is really something that is valuable and will be added to veritas as well, but in a slightly different way.
+Again, I want it to be usable "à la carte" and to compose well. This is what it will probably look like in veritas:
+
+
+{% highlight scheme %}
+(use veritas)
+
+(pending
+  (context "this is some context"
+    (verify #t)))
+
+(pending "some reason"
+ (verify #f))
+
+{% endhighlight %}
+
+### What now?
+
+As I wrote before, I have learned from my failures and am working on a testing library that incorporates the good parts and throws away the bad parts.
+This library will be called veritas and is a work in progress. It will furthermore encourage the use of quickcheck-like
+automated value generators, as well as using the REPL as a host to run tests interactively. I'll post about it once it's ready.
+
+
+### Wrapup
+
+I hope you enjoyed this little journey through all my failures. It has certainly been a pleasure for me, and a healthy way to look at the "monster" I've made.
+I'm sure there is still a lot for me to learn, and I'm open to it. I want to thank all the helpful people who provided valuable feedback for this post
+and for missbehave. I, for one, will continue to improve, which means I will continue to fail. Promised! ;)