
features of rails migrations you should probably use

Saturday, November 29th, 2008

I recently paired with another developer to fix a bug in a rails DB migration.  As we cleaned up the code in order to analyze the bug, we noticed two simple features that were not being used, and the other developer recommended that I write up an email to point these features out to everyone else.  And now I’m cleaning up that email to post here.  Hopefully this helps someone else out.  Both of these (and more) are documented at http://api.rubyonrails.org/classes/ActiveRecord/Migration.html

Cool feature: say_with_time

If you find yourself putting comments around your code to explain to developers what’s going on, please consider instead using “say_with_time”.  Then you can document what is happening both in the code and on the console when the migration is actually running… and you’ll get other nice info printed out (like the elapsed time) as well.

Important feature: ActiveRecord::IrreversibleMigration

If the migration cannot safely or easily be migrated downwards, then we need to communicate that clearly to other developers.  But “puts” isn’t good enough.  Instead, “raise ActiveRecord::IrreversibleMigration”.

For complicated migrations, even if it is possible to safely reverse the migration, I strongly prefer simply raising the exception.  It’s too easy to make a mistake, and then you’ll have a DB that claims to be at one version, but has corrupted data for that version, which will most likely lead to more pain and suffering down the line.

If the migration is just cleaning up bad data, then there’s probably no real need to reverse it.  But in that case maybe you should at least print out a message to the screen letting the developer know that nothing is happening, and why that is okay.

Since I rarely ever use down migrations, my threshold is probably lower than most; if :Rinvert from rails.vim can’t automatically generate the down migration, then I will probably simply raise the exception.  I’ve personally witnessed too many needless bugs due to corrupted data and too many broken down migrations to invest any significant time into them.  At any rate, developers should use discretion with down migrations.

Oh, and please don’t ever run down migrations in production.  That’s what database backups are for, and you are backing up before you upgrade your production database, aren’t you?

Use the progressbar gem for long running data migrations

And one other thing that is not included with rails that you should probably be using anyway: the progressbar gem.  If you have any long running data migrations, this is a must.  And just because it isn’t long running for you with your developer DB doesn’t mean it won’t be long running during deployment to production.  It’s trivially easy to use, and your deployers won’t be stuck wondering if their connection has been dropped or the migration has locked up.  And the ETA will let them know if they have time to get a cup of coffee.  The other developers and deployers will thank you.

Simple Example

(albeit, also a poorly contrived example)

require 'progressbar'
 
class CreateWidgetAuxiliaryFrobs < ActiveRecord::Migration
 
  def self.up
    create_table :widget_auxiliary_frobs do |t|
      t.integer "widget_id"
      t.string  "frob_type"
      t.integer "frobitude"
      # etc...
    end
 
    say_with_time("migrating frobs from widgets") do
      widgets = Widget.find(:all)
      pbar = ProgressBar.new("Generating Widget Frobs", widgets.size)
      widgets.each do |w|
        # this code changes the data irreversibly
        # this code can't be (easily) rewritten with a SQL UPDATE or INSERT
        # etc  etc  etc
        pbar.inc
      end
      pbar.finish
    end
 
    say_with_time("delete obsolete widget/wadget data") do
      Wadget.delete_all("value = 'kerfluffle'")
      # remove_column takes the table name, which is plural by Rails convention
      remove_column :widgets, :foo
      remove_column :widgets, :bar_id
      # etc...
    end
  end
 
  def self.down
    raise ActiveRecord::IrreversibleMigration
  end
end

If the dataset is very large

If the dataset is especially large, you’ll want to iterate through it in a less naive manner than I did above: “Widget.find(:all).each”.  At the very least, you’ll want to iterate in such a way that already handled objects can be garbage collected prior to the end of the loop.  This might be necessary to avoid the dreaded NoMemoryError (or decreased speed due to massive swapping).  This can be handled simply by iterating through the dataset using pagination, but you could also employ a more sophisticated strategy.
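
Here is a rough sketch of the pagination idea, just to make it concrete.  The helper name each_in_pages and the fetch lambda are made up for illustration (they are not part of rails), and an in-memory array stands in for the database so the snippet runs on its own.  In a real migration the lambda would presumably wrap something like the find call shown in the comment.

```ruby
# Hypothetical helper (not part of Rails): walks a dataset one page at a
# time, so each page can be garbage collected before the next one is loaded.
#
# fetch_page is any callable taking (limit, offset) and returning an Array.
def each_in_pages(page_size, fetch_page)
  offset = 0
  loop do
    page = fetch_page.call(page_size, offset)
    break if page.empty?
    page.each { |record| yield record }
    offset += page.size
    break if page.size < page_size # a short page means we reached the end
  end
end

# Stand-in data source so the sketch runs without a database.
# In a migration, the lambda might instead be (assumption, Rails 2.x style):
#   lambda { |limit, offset|
#     Widget.find(:all, :order => "id", :limit => limit, :offset => offset)
#   }
rows  = (1..10).to_a
fetch = lambda { |limit, offset| rows[offset, limit] || [] }

seen = []
each_in_pages(4, fetch) { |row| seen << row }
```

One caveat: offset-based paging assumes the loop doesn’t delete rows or change their ordering while it runs, otherwise records can be skipped.  Also, rails 2.3 and later include find_each and find_in_batches, which do this kind of batching for you, so check whether your version has them before rolling your own.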