Alternative clojure.test Integration With test.check
I’ve enjoyed using test.check lately, but its integration with clojure.test doesn’t fit with how I want to use it very well. In this post we’re going to explore a different approach, which I’m going to try to get into test.chuck.
The Problem
To demonstrate my difficulty, consider these basic tests in clojure.test
:
(deftest integer-facts
(testing "positive"
(is (> 1 0))
(is (> 24 0)))
(testing "negative"
(is (< -1 0))
(is (< -24 0))))
Instead of checking hard-coded integers, I would prefer test.check
to do its magic. To do this using defspec
(the existing integration option), I would have to do something like this:
(defspec positive-integer-facts 100
(prop/for-all [i gen/s-pos-int]
(> i 0)))
(defspec negative-integer-facts 100
(prop/for-all [i gen/s-neg-int]
(< i 0)))
There’s a lot of boilerplate here, and it’s not communicating that these two properties are related. Alternatively, I could join everything into one big condition:
(defspec integer-facts 100
(prop/for-all [p gen/s-pos-int
n gen/s-neg-int]
(and (> p 0)
(< n 0))))
This is also unsatisfying, as I wouldn’t be able to tell easily what failed.
Setting aside that issue for now, let’s look at the output:
lein test checking.core-test
{:test-var "positive-integer-facts", :result true, :num-tests 100, :seed 1422280152456}
{:test-var "negative-integer-facts", :result true, :num-tests 100, :seed 1422280152465}
Ran 2 tests containing 2 assertions.
0 failures, 0 errors.
Unlike deftest
, defspec
prints a report even when it passes. If I start using properties liberally my test output will quickly get too noisy.
The transition path is also difficult. Look at the original deftest
code and notice that moving to defspec
feels like a complete rewrite, as opposed to an upgrade from hard-coded to generated values.
Also, consider the case where we’re performing multiple assertions stepping through a stateful piece of code.
(deftest counter
(testing "increasing"
(let [c (atom 2)]
(swap! c inc)
(is (= @c (inc 2)))
(swap! c inc)
(is (> @c 0)))))
Moving this to defspec
can be tricky:
(defspec increasing-counter 100
(prop/for-all [i gen/s-pos-int]
(let [c (atom i)]
(swap! c inc)
(let [single-inc @c]
(swap! c inc)
(and (= single-inc (inc i))
(> @c 0))))))
The Alternative
What I’d really like to say is exactly the same thing as the original deftest
code, with just a little bit of variation for the generated values:
(deftest integer-facts
(checking "positive" [i gen/s-pos-int]
(is (> i 0)))
(checking "negative" [i gen/s-neg-int]
(is (< i 0))))
(deftest counter
(checking "increasing" [i gen/s-pos-int]
(let [c (atom i)]
(swap! c inc)
(is (= @c (inc i)))
(swap! c inc)
(is (> @c 0)))))
Notice how simple it is to move from testing
to checking
. With defspec
the code screams generative testing and mentions the assertions as an aside. With checking
generative testing becomes the aside and the assertions becomes the focus.
Naive Implementation
A first pass on checking
looks something like this:
(defmacro checking [name bindings & body]
`(testing ~name
(tc/quick-check 100
(prop/for-all ~bindings ~@body))))
This works because the is
macro completely bypasses the tc/quick-check
construct. While this gets us limping along, there are a couple problems.
First, let’s force a failure:
(deftest failure
(checking "incorrect" [i gen/pos-int]
(is (< i 50))))
Now look at the output for this failure:
FAIL in (failure) (core_test.clj:64)
incorrect
expected: (< i 50)
actual: (not (< 55 50))
lein test :only checking.core-test/failure
FAIL in (failure) (core_test.clj:64)
incorrect
expected: (< i 50)
actual: (not (< 52 50))
lein test :only checking.core-test/failure
...
We’re seeing a failure for every attempt when tc/quick-check
tries to narrow down to the smallest failure. All we really want to know about is the result of this search.
The other problem is that tc/quick-check
only traces when the final sexp is false:
(deftest unsearched-failure
(checking "incorrect" [i gen/pos-int]
(is (< i 50))
(is (= i i))))
Which becomes obvious in the test output:
FAIL in (unsearched-failure) (core_test.clj:68)
incorrect
expected: (< i 50)
actual: (not (< 57 50))
lein test :only checking.core-test/unsearched-failure
FAIL in (unsearched-failure) (core_test.clj:68)
incorrect
expected: (< i 50)
actual: (not (< 61 50))
...
lein test :only checking.core-test/unsearched-failure
FAIL in (unsearched-failure) (core_test.clj:68)
incorrect
expected: (< i 50)
actual: (not (< 90 50))
Intercepting Reporting
We’re seeing the failure, but getting a lot of noise as well. What we really need here is an alternative test-reporting framework that can be nested within the checking macro. Fortunately, clojure.test
is designed to replace that framework by overriding the report multimethod. Let’s look at what one of these reports looks like:
user> (let [result (atom nil)] (with-redefs [clojure.test/report #(reset! result %)]
(is (= 1 2))) @result)
{:message nil, :actual (not (= 1 2)), :expected (= 1 2), :type :fail,
:file "form-init7960148245055492106.clj", :line 1}
user> (let [result (atom nil)] (with-redefs [clojure.test/report #(reset! result %)]
(is (= 1 1))) @result)
{:type :pass, :expected (= 1 1), :actual (#<core$_EQ_ clojure.core$_EQ_@39877a52> 1 1),
:message nil}
user> (let [result (atom nil)] (with-redefs [clojure.test/report #(reset! result %)]
(is (/ 1 0))) @result)
{:message nil, :actual #<ArithmeticException java.lang.ArithmeticException: Divide by zero>,
:expected (/ 1 0), :type :error, :file "Numbers.java", :line 156}
So the condition quick-check
is looking for is actually if :type is not :pass. If we could catch that investigations would work correctly:
(defmacro checking [name bindings & body]
`(testing ~name
(is (:result (tc/quick-check 100
(prop/for-all ~bindings
(let [pass# (atom true)]
(with-redefs [report #(swap! pass#
(fn [r#] (and r# (= (:type %) :pass))))]
~@body)
@pass#)))))))
Output:
FAIL in (searched-failure) (core_test.clj:63)
incorrect
expected: (:result (clojure.test.check/quick-check 100 (clojure.test.check.properties/for-all [i gen/pos-int] (clojure.core/let [pass__616__auto__ (clojure.core/atom true)] (clojure.core/with-redefs [clojure.test/report (fn* [p1__615__617__auto__] (clojure.core/swap! pass__616__auto__ (clojure.core/fn [r__618__auto__] (clojure.core/and r__618__auto__ (clojure.core/= (:type p1__615__617__auto__) :pass)))))] (is (< i 50))) (clojure.core/deref pass__616__auto__)))))
actual: false
Tracking Reports
This guarantees that the investigation happens, but we’ve lost the individual assertions in the process. What we need is a way to get at only those reports which were generated in the final failing execution. There’s not a mechanism for passing information out of a failure, so we’ll have to use a closure with some state to simulate the effect:
(defmacro checking [name bindings & body]
`(testing ~name
(let [final-reports# (atom [])]
(tc/quick-check 100
(prop/for-all ~bindings
(let [reports# (atom [])]
(with-redefs [report #(swap! reports# conj %)]
~@body)
(let [pass# (every? #(= (:type %) :pass) @reports#)]
(when (or (not pass#) (empty? @final-reports#))
(reset! final-reports# @reports#))
pass#))))
(doseq [r# @final-reports#]
(report r#)))))
And the output:
lein test :only checking.core-test/unsearched-failure
FAIL in (unsearched-failure) (core_test.clj:68)
incorrect
expected: (< i 50)
actual: (not (< 50 50))
Including Results
So things are functioning correctly, and the failures are easy to read. In more complex scenarios the value may have been transformed heavily before the assertion is made, in which case we want to present the tc/quick-check
return value.
(defmacro checking [name bindings & body]
`(testing ~name
(let [final-reports# (atom [])]
(let [result# (tc/quick-check 100
(prop/for-all ~bindings
(let [reports# (atom [])]
(with-redefs [report #(swap! reports# conj %)]
~@body)
(let [pass# (every? #(= (:type %) :pass) @reports#)]
(when (or (not pass#) (empty? @final-reports#))
(reset! final-reports# @reports#))
pass#))))]
(is (:result result#) result#))
(doseq [r# @final-reports#]
(report r#)))))
Output:
lein test :only checking.core-test/unsearched-failure
FAIL in (unsearched-failure) (core_test.clj:67)
incorrect
{:result false, :seed 1422311111994, :failing-size 58, :num-tests 59, :fail [55],
:shrunk {:total-nodes-visited 23, :depth 3, :result false, :smallest [50]}}
expected: (:result result__618__auto__)
actual: false
lein test :only checking.core-test/unsearched-failure
FAIL in (unsearched-failure) (core_test.clj:68)
incorrect
expected: (< i 50)
actual: (not (< 50 50))
Cleaning Up
I’m happy with how things are reported now, but the code is a pretty big mess. Decomposing the logic should help with that:
(defn report-when-failing [result]
(is (:result result) result))
(defmacro capture-reports [body]
`(let [reports# (atom [])]
(with-redefs [report #(swap! reports# conj %)]
~@body)
@reports#))
(defn pass? [reports]
(every? #(= (:type %) :pass) reports))
(defn report-needed? [reports final-reports]
(or (not (pass? reports)) (empty? final-reports)))
(defn save-to-final-reports [reports final-reports]
(when (report-needed? reports @final-reports)
(reset! final-reports reports)))
(defmacro checking [name bindings & body]
`(testing ~name
(let [final-reports# (atom [])]
(report-when-failing (tc/quick-check 100
(prop/for-all ~bindings
(let [reports# (capture-reports ~body)]
(save-to-final-reports reports# final-reports#)
(pass? reports#)))))
(doseq [r# @final-reports#]
(report r#)))))
Number of Tests
One final touch is that the number of tests really shouldn’t be hard-coded. This adds to the checking
footprint slightly, but that seems worth it to simplify our ability to control the number of tests being run.
(defn report-when-failing [result]
(is (:result result) result))
(defmacro capture-reports [body]
`(let [reports# (atom [])]
(with-redefs [report #(swap! reports# conj %)]
~@body)
@reports#))
(defn pass? [reports]
(every? #(= (:type %) :pass) reports))
(defn report-needed? [reports final-reports]
(or (not (pass? reports)) (empty? final-reports)))
(defn save-to-final-reports [reports final-reports]
(when (report-needed? reports @final-reports)
(reset! final-reports reports)))
(defmacro checking [name tests bindings & body]
`(testing ~name
(let [final-reports# (atom [])]
(report-when-failing (tc/quick-check ~tests
(prop/for-all ~bindings
(let [reports# (capture-reports ~body)]
(save-to-final-reports reports# final-reports#)
(pass? reports#)))))
(doseq [r# @final-reports#]
(report r#)))))
Update
The macro has been accepted to test.chuck in release 0.1.12. Gary Fredericks helped me work out some concurrency problems with the above implementation.
First, with-redefs
rebinds things globally, which means that we can end up saving to the wrong atom. It’s much better to use binding
:
(defmacro capture-reports [body]
`(let [reports# (atom [])]
(binding [report #(swap! reports# conj %)]
~@body)
@reports#))
Second, using reset!
in save-to-final-reports
causes a race condition between checking the condition and assigning to the atom. To get around this we can rearrange the arguments of save-to-final-reports
and call swap!
instead:
(defn save-to-final-reports [final-reports reports]
(if (report-needed? reports final-reports)
reports
final-reports))
(defmacro checking
[name tests bindings & body]
`(testing ~name
(let [final-reports# (atom [])]
(report-when-failing (tc/quick-check ~tests
(prop/for-all ~bindings
(let [reports# (capture-reports ~body)]
(swap! final-reports# save-to-final-reports reports#)
(pass? reports#)))))
(doseq [r# @final-reports#]
(report r#)))))