Posts Tagged ‘rails’

Times and time zones considered dangerous (especially in tests)

Friday, December 5th, 2008

Just wanted to put out a note of caution to other developers.  Any code (test/spec or production) that depends on global environment settings or non-deterministic functions needs to be treated with fear and trembling.  As a concrete example of this maxim, code that deals with Time, Date, and DateTime has the possibility to trip you up with both global environment and non-determinism.  I recently discovered a potential bug related to this… but it took me 45 minutes to track down… so I wrote up the following afterwards:


nota bene: some of the methods described herein are not defined by Ruby’s core library’s but by Rails’ ActiveSupport.

Any non-determinism in tests is dangerous, and Time.now and its close cousins are certainly on that list.  Whether the code which depends on the current date is in the production code or in the spec code, we’ll need to be very careful of it, lest it trip us up.

I’ve known this for a long time… but I just spent about 45 minutes perplexed with a particularly stupid bug that simply required an attentive eye to unravel… I’ll gladly share my debugging session (condensed for time and minus the aggravation and going in circles), and hopefully save someone else the hassle:

>> DateTime.yesterday < DateTime.now
=> true
>> DateTime.tomorrow < DateTime.now
=> true

Well that shouldn’t ever happen.  Confused yet?  Or do you already know where this is going?  I spent a great deal of time wondering if it was some weird interaction between the Date, Time, and DateTime classes (and trust me, those weird interactions do exist, although maybe they weren’t the cause of this particular bug).

>> DateTime.tomorrow
=> Sun, 24 Aug 2008
>> DateTime.yesterday
=> Fri, 22 Aug 2008
>> DateTime.now
=> Sat, 23 Aug 2008 23:24:35 -0400

Ummm… nothing that would explain the comparison results above… but that’s curious… DateTime.now gives a different inspect output than the others.  I wonder why?

>> DateTime.yesterday.class
=> Date
>> DateTime.tomorrow.class
=> Date
>> DateTime.now.class
=> DateTime

Intriguing.  Certainly not what I expected, given that all of these values were obtained from class methods on DateTime.  Perhaps this means that the comparison requires some coercion to work…  I didn’t investigate the source to see if it uses “to_datetime” as opposed to “to_time”, but I’m assuming that it does, because of the following:

>> DateTime.tomorrow.to_time < DateTime.now.to_time
=> false
>> DateTime.tomorrow.to_datetime < DateTime.now.to_datetime
=> true

So what does Date#to_datetime do to screw us over?

>> DateTime.yesterday.to_datetime
=> Fri, 22 Aug 2008 00:00:00 0000
>> DateTime.tomorrow.to_datetime
=> Sun, 24 Aug 2008 00:00:00 0000
>> DateTime.now.to_datetime
=> Sat, 23 Aug 2008 23:28:11 -0400

Do you see it?  The answer is right there, but it’s easy to miss.  I missed it over and over again for about 40 minutes.  ”-0400” — the timezone.  The timezone had been screwing me up all along.  The DateTime object has my timezone from the get-go, but the Date objects were converted as UTC. Lovely… just lovely.

Now, I wasn’t the original author of the code.  Just an archaeologist.  My guess: the code worked when it was built (during business hours, not in the wee hours of the night when the timezones would trip them up), and it looked like the simplest thing needed.  Generally, you really can’t fault a developer for going that route. I certainly don’t.  But times and dates and especially anything related to the current time and date… well, these are special animals, and need to be dealt with special care.

The long and short of it?  Timezones are global environment state, and you need to be aware of it with any code (especially tests) that deal with time.  Avoid DateTime.yesterday or DateTime.tomorrow like the plague in your test/spec code (unless you enjoy tests that spuriously fail near the witching hour).  You should write your tests (and your production code) in such a way that you can simulate different timezones and different times of day in order to tease out bugs.  And if you need to compare or convert dates and times in your production code, be very aware of what time and timezone your business rules will require, especially if you have any Date objects that might be automatically coerced into DateTime objects.

features of rails migrations you should probably use

Saturday, November 29th, 2008

I recently paired with another developer to fix a bug in a rails DB migration.  As we cleaned up the code in order to analyze the bug, we noticed two simple features that were not being used, and the other developer recommended that I write up an email to point these features out to everyone else.  And now I’m cleaning up that email to post here.  Hopefully this helps someone else out.  Both of these (and more) are documented at http://api.rubyonrails.org/classes/ActiveRecord/Migration.html

Cool feature: say_with_time

If you find yourself putting comments around your code to explain to developers what’s going on, please consider instead using “say_with_time“.  Then you can document what is happening both in the code and on the console when the migration is actually running… and you’ll get other nice info printed out (like the elapsed time) as well.

Important feature: ActiveRecord::IrreversableMigration

If the migration cannot safely or easily be migrated downwards, then we need to communicate that clearly to other developers.  But “puts” isn’t good enough.  Instead, “raise ActiveRecord::IrreversableMigration“.

For complicated migrations, even if it is possible to safely reverse the migration, I strongly prefer simply raising the exception.  It’s too easy to make a mistake, and then you’ll have a DB that claims to be at one version, but has corrupted data for that version, which will most likely lead to more pain and suffering down the line.

If the migration is just cleaning up bad data, then there’s probably no real need to reverse it.  But in that case maybe you should at least print out a message to the screen letting the developer know that nothing is happening, and why that is okay.

Since I rarely ever use down migrations, my threshold is probably lower than most; if :Rinvert from rails.vim can’t automatically generate the down migration, then I will probably simply raise the exception.  I’ve personally witnessed too many needless bugs due to corrupted data and too many broken down migrations to invest any significant time into them.  At any rate, developers should use discretion with down migrations.

Oh, and please don’t ever run down migrations in production.  That’s what database backups are for, and you are backing up before you upgrade your production database, aren’t you?

Use the progressbar gem for long running data migrations

And one other thing that is not included with rails that you should probably be using anyway: the progressbar gem.  If you any long running data migrations, this is a must.  And just because it isn’t long running for you with your developer DB doesn’t mean it won’t be long running during deployment to production.  It’s trivially easy to use, and your deployer’s won’t be stuck wondering if their connection has been dropped or the migration has locked up.  And the ETA will let them know if they have time to get a cup of coffee.  The other developers and deployers will thank you.

Simple Example

(albeit, also a poorly contrived example)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
require 'progressbar'
 
class CreateWidgetAuxiliaryFrobs < ActiveRecord::Migration
 
  def self.up
    create_table :widget_auxiliary_frobs do |t|
      t.integer "widget_id"
      t.string  "frob_type"
      t.integer "frobitude"
      # etc...
    end
 
    say_with_time("migrating froms from widgets") do
      widgets = Widget.find(:all)
      pbar = ProgressBar.new("Generating Widget Frobs", widgets.size)
      widgets.each do |w|
       # this code changes the data irreversibly
       # this code can't be (easily) rewritten with a SQL UPDATE or INSERT
       # etc  etc  etc
       pbar.inc
      end
      pbar.finish
    end
 
    say_with_time("delete obsolete widget/wadgit data") do
      Wadget.delete_all("value = 'kerfluffle'")
      remove_column :widget, :foo
      remove_column :widget, :bar_id
      # etc...
    end
  end
 
  def self.down
    raise ActiveRecord::IrreversibleMigration
  end
end

If the dataset is very large

If the dataset is especially large, you’ll want to iterate through it in a less naive manner than I did above: “Widget.find(:all).each“.  At the very least, you’ll want to iterate in such a way that already handled objects can be garbage collected prior to the end of the loop.  This might be necessary to avoiding the dreaded NoMemoryError (or decreased speed due to massive swapping).  This can be handled simply by iterating through the dataset using pagination, but you could also employ a more sophisticated strategy.