Posts Tagged ‘ruby’

mock abuse – asserting the implementation

Tuesday, February 3rd, 2009

I use mocks (and stubs and other test doubles) for checking expectations: I’m a “mockist“.  Likewise, a guideline for specs or unit tests is to focus on testing your code (separately from integration tests which might exercise the full stack), and assume that the framework has tested its own code.  However, I think that this guideline is sometimes taken too far, leading to mock abuse.

Here are some guidelines that help me follow another important principle: test the interface, not the implementation.

  • If the test looks like a copy and paste or trivial transformation of the production code, something is probably wrong.
  • “Complex” code should be avoided in specs, especially if it is a reproduction of the production code (e.g. SQL, ActiveRecord::Base#find conditions, and non-trivial regular expressions).
  • I prefer business language examples over technical language modeling (e.g. “belongs_to” or “verifies_format_of“).

Some (bad) examples

As an example of what I’m talking about: Testing Named Scope with shoulda’s should_have_named_scope (also should_belong_to, should_have_one, and should_have_many from shoulda’s ActiveRecord Macros).  The rails documentation for named_scope also suggests that you can test it by looking at the proxy_options method.

What’s wrong with the following?

# from rails rdoc:
expected_options = { :conditions => { :colored => 'red' } }
assert_equal expected_options, Shirt.colored('red').proxy_options
# from shoulda
should_have_named_scope :eighteen,  :conditions => { :age => 18 }
# example of the sort of thing I've seen in codebases I work with
Model.should_receive(:find)
     .with(:all,
           :conditions => ["foo => ? and (bar like ? or blatz in ?)",
                           foo, bar, blatz]).and_return([...])
# I've also seen bastardizations of the above for SQL building
# code which gets piped into a mocked Model.find_by_sql

Should your specs care if those conditions are created as hashes or arrays, or even at all?  From the outside, it is completely unimportant whether it uses named_scope or a conditions hash behind the scenes.   What you should care about is whether or not they return the correct objects, don’t return the wrong objects, and (maybe) stack appropriately with other named scopes or finder methods.

More practically, those conditions hashes qualify as production code that represents the very core of the business logic you are supposed to be testing.  If you expect that {:colored => 'red'}, but the column name is “color” or the value is actually stored in hex RGB, you won’t have a hint that the code is broken.  If you expect that {:age => 18}, your specs won’t remind you that you were actually supposed to be matching 18 and older.  If you expect the wrong SQL, then your specs have lost their power to help you find the right SQL.  You are wide open for letting sloppy bugs through the very specs that should have caught them.

So, these mocks verify expectations that don’t matter much, but they copy and paste the bits that do matter, the parts you really should be testing.   You are gaining “code coverage”, and thus false confidence.  It’s like a weird form of regulatory capture.  However, a few examples of what you should and should not match would clarify many mistakes immediately.  When working with examples, the boundaries that you need to poke often become obvious.

Good examples: Named scope or finder methods

# the following lines would generally be done using factories
good_examples = [FooBar.create!(...),
                 FooBar.create!(...), ...]
bad_examples  = [FooBar.create!(...)
                 FooBar.create!(...), ...]
# now exercise the code under test
results = FooBar.my_spectacular_finder_or_named_scope(...)
# expectations
good_examples.each {|e| results.should     include(e) }
bad_examples. each {|e| results.should_not include(e) }
# or perhaps
results.should == good_examples

This wins in almost every regard over mocking SQL or checking the conditions hash or proxy_options.  It is clear what is being tested and how. We are oriented towards results, not implementation. The examples should make it clear if we are missing anything.

However, it is not as concise, but it doesn’t matter how concise bad specs are.  We could make macros to make the above code more concise, while still retaining the clarity.  It would also be nice to follow the “one assertion per example” rule, so that developers get fine grained failure notifications.

And it may be slower if it hits the DB, but it doesn’t matter how fast bad specs run.  This is a trade-off  of the ActiveRecord ORM style: to test that you wrote the right queries, you need run the queries and verify the results.

Alternatively, this style is well suited to cucumber or fit tables, enabling the subject matter experts to easily contribute their own examples (and detect mistakes that the developer wouldn’t know about).

Good examples: Validations

Sorry to pick on shoulda above… but it was the easiest example to google for.  To balance the scales there, I do like two of the macros shoulda uses for validations: should_allow_values_for and should_not_allow_values_for.  I also wrote up a short little rspec matcher that does about the same thing.

# shoulda's macros
should_not_allow_values_for :phone_number, "abcd", "1234"
should_allow_values_for     :phone_number, "(123) 456-7890"
# my rspec validation matchers
it { should.reject(:phone_number).of "abcd" }
it { should.reject(:phone_number).of "1234" }
it { should.accept(:phone_number).of "(123) 456-7890" }
it { should.accept(:phone_number).of "(123)456-7890" }
it { should.accept(:phone_number).of "123-456-7890" }

Specs that verify that the “appropriate” regular expression was sent to validates_format_of scare me.  Regular expressions aren’t just opaque blobs, they are a DSL, a programming language in their own right.  Regexes are code, and we test code.  You shouldn’t care that a particular regular expression is sent to a particular rails validation macro; you care that your model validations work appropriately, i.e. accept some values and reject others.

When I’ll ignore my own advice

Here are some contraindications:

I just don’t know what to do:  I might be exploring new realms, and feel a bit lost.  Writing some specs, any specs, is a good way to learn, a good way to keep focused while working, and a good way to just get the job done (TATFT).  Even if the specs are no good and tightly coupled to implementation, at least they can act as “change notifiers”.  The next time someone edits the code, they’ll trigger a build failure that will force them to re-evaluate the specs.  Maybe they’ll come at it with more knowledge and improve the situation.

Pair programming significantly reduces the likelihood of this occurring.

too much yak shaving, time to punt:  I know what to do, but it’s too much work.  For example, if you have some models that are so truly heinous that you cannot simply or clearly create them right there in your spec, my first instinct would be to invest some time in my factories.  It’ll pay off.  But that is yak-shaving, and it may be too much for now.  “Emergency mocking” is excusable.  Document your compromise in the spec, and hopefully that technical debt will be repaid in the future.

But excessive mocking can feel like yak-shaving too.  Weigh your options before going the “quick and dirty” route; it might actually be slow and dirty.

debugging and sanity checks:  If I’m trying to figure out why the results aren’t what I expect them to be, I’ll probably throw in some extra assertions about the implementation, as a sanity check.  I might delete them when I’m done or keep them in but document them as special.

There may be other contraindications, but these are what come to my mind.

What do you think?

ruby defined? # => “expression”

Thursday, December 11th, 2008

Ever spend a long time debugging something, only to find out an hour or two later that you were staring at the problem all along, and you should have realized it from the beginning? And then, of course, the solution only requires a tiny tweak to one line, perhaps merely the addition of a few parenthesis.

>> defined? RAILS_ENV
=> "constant"
>> defined? RAILS_ENV && RAILS_ENV == "development"
=> "expression"
>> (defined? RAILS_ENV) && (RAILS_ENV == "development")
=> false

D’oh! (let’s ignore for now the issue of whether or not the “fixed” code is reasonable to run anyway)

I was curious to see if the expression was evaluated, and it appears that it is… sort of:

>> def foo
>>   $stderr.puts ":foo"
>> end
=> nil
>> def bar
>>   $stderr.puts ":bar"
>>   raise "hell"
>> end
=> nil
>> defined? foo
=> "method"
>> defined? bar
=> "method"
>> defined? foo || 1
:foo
=> "expression"
:bar
>> defined? bar || 1
=> nil

Notice: the methods were executed, but the exception from bar was caught and squirreled away somewhere, and defined? the returned nil, even though the method and expression were certainly “defined”.

Times and time zones considered dangerous (especially in tests)

Friday, December 5th, 2008

Just wanted to put out a note of caution to other developers.  Any code (test/spec or production) that depends on global environment settings or non-deterministic functions needs to be treated with fear and trembling.  As a concrete example of this maxim, code that deals with Time, Date, and DateTime has the possibility to trip you up with both global environment and non-determinism.  I recently discovered a potential bug related to this… but it took me 45 minutes to track down… so I wrote up the following afterwards:


nota bene: some of the methods described herein are not defined by Ruby’s core library’s but by Rails’ ActiveSupport.

Any non-determinism in tests is dangerous, and Time.now and its close cousins are certainly on that list.  Whether the code which depends on the current date is in the production code or in the spec code, we’ll need to be very careful of it, lest it trip us up.

I’ve known this for a long time… but I just spent about 45 minutes perplexed with a particularly stupid bug that simply required an attentive eye to unravel… I’ll gladly share my debugging session (condensed for time and minus the aggravation and going in circles), and hopefully save someone else the hassle:

>> DateTime.yesterday < DateTime.now
=> true
>> DateTime.tomorrow < DateTime.now
=> true

Well that shouldn’t ever happen.  Confused yet?  Or do you already know where this is going?  I spent a great deal of time wondering if it was some weird interaction between the Date, Time, and DateTime classes (and trust me, those weird interactions do exist, although maybe they weren’t the cause of this particular bug).

>> DateTime.tomorrow
=> Sun, 24 Aug 2008
>> DateTime.yesterday
=> Fri, 22 Aug 2008
>> DateTime.now
=> Sat, 23 Aug 2008 23:24:35 -0400

Ummm… nothing that would explain the comparison results above… but that’s curious… DateTime.now gives a different inspect output than the others.  I wonder why?

>> DateTime.yesterday.class
=> Date
>> DateTime.tomorrow.class
=> Date
>> DateTime.now.class
=> DateTime

Intriguing.  Certainly not what I expected, given that all of these values were obtained from class methods on DateTime.  Perhaps this means that the comparison requires some coercion to work…  I didn’t investigate the source to see if it uses “to_datetime” as opposed to “to_time”, but I’m assuming that it does, because of the following:

>> DateTime.tomorrow.to_time < DateTime.now.to_time
=> false
>> DateTime.tomorrow.to_datetime < DateTime.now.to_datetime
=> true

So what does Date#to_datetime do to screw us over?

>> DateTime.yesterday.to_datetime
=> Fri, 22 Aug 2008 00:00:00 0000
>> DateTime.tomorrow.to_datetime
=> Sun, 24 Aug 2008 00:00:00 0000
>> DateTime.now.to_datetime
=> Sat, 23 Aug 2008 23:28:11 -0400

Do you see it?  The answer is right there, but it’s easy to miss.  I missed it over and over again for about 40 minutes.  ”-0400” — the timezone.  The timezone had been screwing me up all along.  The DateTime object has my timezone from the get-go, but the Date objects were converted as UTC. Lovely… just lovely.

Now, I wasn’t the original author of the code.  Just an archaeologist.  My guess: the code worked when it was built (during business hours, not in the wee hours of the night when the timezones would trip them up), and it looked like the simplest thing needed.  Generally, you really can’t fault a developer for going that route. I certainly don’t.  But times and dates and especially anything related to the current time and date… well, these are special animals, and need to be dealt with special care.

The long and short of it?  Timezones are global environment state, and you need to be aware of it with any code (especially tests) that deal with time.  Avoid DateTime.yesterday or DateTime.tomorrow like the plague in your test/spec code (unless you enjoy tests that spuriously fail near the witching hour).  You should write your tests (and your production code) in such a way that you can simulate different timezones and different times of day in order to tease out bugs.  And if you need to compare or convert dates and times in your production code, be very aware of what time and timezone your business rules will require, especially if you have any Date objects that might be automatically coerced into DateTime objects.