Migrations are a convenient way for you to alter your database in a structured and organised manner. You could edit fragments of SQL by hand but you would then be responsible for telling other developers that they need to go and run it. You'd also have to keep track of which changes need to be run against the production machines next time you deploy.

Active Record tracks which migrations have already been run so all you have to do is update your source and run +rake db:migrate+. Active Record will work out which migrations should be run. It will also update your +db/schema.rb+ file to match the structure of your database.

Migrations also allow you to describe these transformations using Ruby. The great thing about this is that (like most of Active Record's functionality) it is database independent: you don't need to worry about the precise syntax of +CREATE TABLE+ any more than you worry about variations on +SELECT *+ (you can drop down to raw SQL for database specific features). For example you could use SQLite3 in development, but MySQL in production.

You'll learn all about migrations including:

* The generators you can use to create them
* The methods Active Record provides to manipulate your database
* The Rake tasks that manipulate them
* How they relate to +schema.rb+

endprologue.

h3. Anatomy of a Migration

Before I dive into the details of a migration, here are a few examples of the sorts of things you can do:

<ruby>
class CreateProducts < ActiveRecord::Migration
  def self.up
    create_table :products do |t|
      t.string :name
      t.text :description

      t.timestamps
    end
  end

  def self.down
    drop_table :products
  end
end
</ruby>

This migration adds a table called +products+ with a string column called +name+ and a text column called +description+. A primary key column called +id+ will also be added, however since this is the default we do not need to ask for this. The timestamp columns +created_at+ and +updated_at+ which Active Record populates automatically will also be added. Reversing this migration is as simple as dropping the table.

Migrations are not limited to changing the schema. You can also use them to fix bad data in the database or populate new fields:

<ruby>
class AddReceiveNewsletterToUsers < ActiveRecord::Migration
  def self.up
    change_table :users do |t|
      t.boolean :receive_newsletter, :default => false
    end
    User.update_all ["receive_newsletter = ?", true]
  end

  def self.down
    remove_column :users, :receive_newsletter
  end
end
</ruby>

This migration adds a +receive_newsletter+ column to the +users+ table. We want it to default to +false+ for new users, but existing users are considered to have already opted in, so we use the User model to set the flag to +true+ for existing users.

NOTE: Some "caveats":#using-models-in-your-migrations apply to using models in your migrations.

h4. Migrations are Classes

A migration is a subclass of <tt>ActiveRecord::Migration</tt> that implements two class methods: +up+ (perform the required transformations) and +down+ (revert them).

Active Record provides methods that perform common data definition tasks in a database independent way (you'll read about them in detail later):

* +create_table+
* +change_table+
* +drop_table+
* +add_column+
* +change_column+
* +rename_column+
* +remove_column+
* +add_index+
* +remove_index+

If you need to perform tasks specific to your database (for example create a "foreign key":#active-record-and-referential-integrity constraint) then the +execute+ function allows you to execute arbitrary SQL. A migration is just a regular Ruby class so you're not limited to these functions. For example after adding a column you could write code to set the value of that column for existing records (if necessary using your models).

On databases that support transactions with statements that change the schema (such as PostgreSQL), migrations are wrapped in a transaction. If the database does not support this (for example MySQL and SQLite) then when a migration fails the parts of it that succeeded will not be rolled back. You will have to unpick the changes that were made by hand.

h4. What's in a Name

Migrations are stored in files in +db/migrate+, one for each migration class. The name of the file is of the form +YYYYMMDDHHMMSS_create_products.rb+, that is to say a UTC timestamp identifying the migration followed by an underscore followed by the name of the migration. The migration class' name must match (the camelcased version of) the latter part of the file name. For example +20080906120000_create_products.rb+ should define +CreateProducts+ and +20080906120001_add_details_to_products.rb+ should define +AddDetailsToProducts+. If you do feel the need to change the file name then you <em>have to</em> update the name of the class inside or Rails will complain about a missing class.
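The naming rule above can be illustrated with a small plain-Ruby sketch. It is not how Rails itself does the work (Rails relies on Active Support's inflections); +parse_migration_file+ and +MIGRATION_FILE+ are hypothetical names used only for this example.

```ruby
# Illustrative only: splits a migration file name into its version number
# and the class name Rails would expect to find inside the file.
MIGRATION_FILE = /\A(\d+)_([a-z0-9_]+)\.rb\z/

def parse_migration_file(file_name)
  match = MIGRATION_FILE.match(file_name)
  raise ArgumentError, "not a migration file name: #{file_name}" unless match

  version    = match[1].to_i
  # Camel-case the part after the timestamp: create_products -> CreateProducts
  class_name = match[2].split("_").map { |word| word.capitalize }.join
  [version, class_name]
end

p parse_migration_file("20080906120000_create_products.rb")
# => [20080906120000, "CreateProducts"]
```

If the class defined in the file does not match the second element, Rails will complain about a missing class, as described above.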

Internally Rails only uses the migration's number (the timestamp) to identify it. Prior to Rails 2.1 the migration number started at 1 and was incremented each time a migration was generated. With multiple developers it was easy for these numbers to clash, requiring you to roll back migrations and renumber them. With Rails 2.1 this is largely avoided by using the creation time of the migration to identify it. You can revert to the old numbering scheme by setting +config.active_record.timestamped_migrations+ to +false+ in +config/environment.rb+.
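To make the timestamp format concrete, here is a minimal sketch of how such a file name could be built from a UTC time. +timestamped_file_name+ is a hypothetical helper, not part of Rails; the generators do this for you.

```ruby
# Hypothetical helper: builds a YYYYMMDDHHMMSS_name.rb file name.
def timestamped_file_name(migration_name, now = Time.now.utc)
  version = now.strftime("%Y%m%d%H%M%S")
  # CreateProducts -> create_products
  underscored = migration_name.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase
  "#{version}_#{underscored}.rb"
end

puts timestamped_file_name("CreateProducts", Time.utc(2008, 9, 6, 12, 0, 0))
# prints 20080906120000_create_products.rb
```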

The combination of timestamps and recording which migrations have been run allows Rails to handle common situations that occur with multiple developers.

For example Alice adds migrations +20080906120000+ and +20080906123000+ and Bob adds +20080906124500+ and runs it. Alice finishes her changes and checks in her migrations and Bob pulls down the latest changes. Rails knows that it has not run Alice's two migrations so +rake db:migrate+ would run them (even though Bob's migration with a later timestamp has been run), and similarly migrating down would not run their +down+ methods.
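The Alice and Bob scenario boils down to a set difference between the migrations present in +db/migrate+ and those recorded as already run. The following is a toy model of that idea in plain Ruby, not Rails' actual implementation; +pending_versions+ is a name made up for this sketch.

```ruby
# Toy model: migrations are represented by their version numbers only.
all_versions     = [20080906120000, 20080906123000, 20080906124500]
applied_versions = [20080906124500] # Bob has only run his own migration

# Everything on disk that has not been recorded as run, oldest first.
def pending_versions(all_versions, applied_versions)
  (all_versions - applied_versions).sort
end

p pending_versions(all_versions, applied_versions)
# => [20080906120000, 20080906123000]
```

Alice's two migrations are pending on Bob's machine even though a later version has already been run there.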

Of course this is no substitute for communication within the team. For example, if Alice's migration removed a table that Bob's migration assumed to exist, then trouble would certainly strike.

h4. Changing Migrations

Occasionally you will make a mistake when writing a migration. If you have already run the migration then you cannot just edit the migration and run the migration again: Rails thinks it has already run the migration and so will do nothing when you run +rake db:migrate+. You must roll back the migration (for example with +rake db:rollback+), edit your migration and then run +rake db:migrate+ to run the corrected version.

In general editing existing migrations is not a good idea: you will be creating extra work for yourself and your co-workers and cause major headaches if the existing version of the migration has already been run on production machines. Instead you should write a new migration that performs the changes you require. Editing a freshly generated migration that has not yet been committed to source control (or more generally which has not been propagated beyond your development machine) is relatively harmless. Just use some common sense.

h3. Creating a Migration

h4. Creating a Model

The model and scaffold generators will create migrations appropriate for adding a new model. This migration will already contain instructions for creating the relevant table. If you tell Rails what columns you want then statements for adding those will also be created. For example, running

<shell>
ruby script/generate model Product name:string description:text
</shell>

will create a migration that looks like this

<ruby>
class CreateProducts < ActiveRecord::Migration
  def self.up
    create_table :products do |t|
      t.string :name
      t.text :description

      t.timestamps
    end
  end

  def self.down
    drop_table :products
  end
end
</ruby>

You can append as many column name/type pairs as you want. By default +t.timestamps+ (which creates the +updated_at+ and +created_at+ columns that are automatically populated by Active Record) will be added for you.

h4. Creating a Standalone Migration

If you are creating migrations for other purposes (for example to add a column to an existing table) then you can use the migration generator:

<shell>
ruby script/generate migration AddPartNumberToProducts
</shell>

This will create an empty but appropriately named migration:

<ruby>
class AddPartNumberToProducts < ActiveRecord::Migration
  def self.up
  end

  def self.down
  end
end
</ruby>

If the migration name is of the form "AddXXXToYYY" or "RemoveXXXFromYYY" and is followed by a list of column names and types then a migration containing the appropriate +add_column+ and +remove_column+ statements will be created.

<shell>
ruby script/generate migration AddPartNumberToProducts part_number:string
</shell>

will generate

<ruby>
class AddPartNumberToProducts < ActiveRecord::Migration
  def self.up
    add_column :products, :part_number, :string
  end

  def self.down
    remove_column :products, :part_number
  end
end
</ruby>

Similarly,

<shell>
ruby script/generate migration RemovePartNumberFromProducts part_number:string
</shell>

generates

<ruby>
class RemovePartNumberFromProducts < ActiveRecord::Migration
  def self.up
    remove_column :products, :part_number
  end

  def self.down
    add_column :products, :part_number, :string
  end
end
</ruby>

You are not limited to one magically generated column, for example

<shell>
ruby script/generate migration AddDetailsToProducts part_number:string price:decimal
</shell>

generates

<ruby>
class AddDetailsToProducts < ActiveRecord::Migration
  def self.up
    add_column :products, :part_number, :string
    add_column :products, :price, :decimal
  end

  def self.down
    remove_column :products, :price
    remove_column :products, :part_number
  end
end
</ruby>

As always, what has been generated for you is just a starting point. You can add or remove from it as you see fit.
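The "AddXXXToYYY" / "RemoveXXXFromYYY" convention can be pictured as a pattern match on the migration name. The sketch below is a plain-Ruby approximation written for this guide, not the generator's real code; +underscore+ and +generated_statements+ are hypothetical helpers and the statements are produced as strings purely for illustration.

```ruby
# CamelCase -> snake_case, e.g. AddDetailsToProducts -> add_details_to_products
def underscore(name)
  name.gsub(/([a-z0-9])([A-Z])/, '\1_\2').downcase
end

# Returns the statements the up method would contain, or [] when the
# migration name does not follow the Add...To... / Remove...From... form.
def generated_statements(migration_name, *columns)
  match = /\A(add|remove)_.*_(?:to|from)_(.*)\z/.match(underscore(migration_name))
  return [] unless match

  action, table = match[1], match[2]
  columns.map do |column|
    name, type = column.split(":")
    if action == "add"
      "add_column :#{table}, :#{name}, :#{type}"
    else
      "remove_column :#{table}, :#{name}"
    end
  end
end

puts generated_statements("AddDetailsToProducts", "part_number:string", "price:decimal")
# prints:
# add_column :products, :part_number, :string
# add_column :products, :price, :decimal
```

A name that does not match, such as +CreateProducts+, yields no statements, which mirrors the empty migration shown earlier.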

h3. Writing a Migration

Once you have created your migration using one of the generators it's time to get to work!

h4. Creating a Table

The +create_table+ migration method will be one of your workhorses. A typical use would be

<ruby>
create_table :products do |t|
  t.string :name
end
</ruby>

which creates a +products+ table with a column called +name+ (and as discussed below, an implicit +id+ column).

The object yielded to the block allows you to create columns on the table. There are two ways of doing this. The first (traditional) form looks like

<ruby>
create_table :products do |t|
  t.column :name, :string, :null => false
end
</ruby>

the second form, the so-called "sexy" migration, drops the somewhat redundant +column+ method. Instead, the +string+, +integer+, etc. methods create a column of that type. Subsequent parameters are the same.

<ruby>
create_table :products do |t|
  t.string :name, :null => false
end
</ruby>

By default +create_table+ will create a primary key called +id+. You can change the name of the primary key with the +:primary_key+ option (don't forget to update the corresponding model) or if you don't want a primary key at all (for example for a HABTM join table) you can pass +:id => false+. If you need to pass database specific options you can place an SQL fragment in the +:options+ option. For example

<ruby>
create_table :products, :options => "ENGINE=BLACKHOLE" do |t|
  t.string :name, :null => false
end
</ruby>

will append +ENGINE=BLACKHOLE+ to the SQL statement used to create the table (when using MySQL the default is +ENGINE=InnoDB+).

The types supported by Active Record are +:primary_key+, +:string+, +:text+, +:integer+, +:float+, +:decimal+, +:datetime+, +:timestamp+, +:time+, +:date+, +:binary+, +:boolean+.

These will be mapped onto an appropriate underlying database type, for example with MySQL +:string+ is mapped to +VARCHAR(255)+. You can create columns of types not supported by Active Record when using the non-sexy syntax, for example

<ruby>
create_table :products do |t|
  t.column :name, 'polygon', :null => false
end
</ruby>

This may however hinder portability to other databases.

h4. Changing Tables

A close cousin of +create_table+ is +change_table+, used for changing existing tables. It is used in a similar fashion to +create_table+ but the object yielded to the block knows more tricks. For example

<ruby>
change_table :products do |t|
  t.remove :description, :name
  t.string :part_number
  t.index :part_number
  t.rename :upccode, :upc_code
end
</ruby>

removes the +description+ and +name+ columns, creates a +part_number+ column and adds an index on it. Finally it renames the +upccode+ column. This is the same as doing

<ruby>
remove_column :products, :description
remove_column :products, :name
add_column :products, :part_number, :string
add_index :products, :part_number
rename_column :products, :upccode, :upc_code
</ruby>

You don't have to keep repeating the table name and it groups all the statements related to modifying one particular table. The individual transformation names are also shorter, for example +remove_column+ becomes just +remove+ and +add_index+ becomes just +index+.

h4. Special Helpers

Active Record provides some shortcuts for common functionality. It is for example very common to add both the +created_at+ and +updated_at+ columns and so there is a method that does exactly that:

<ruby>
create_table :products do |t|
  t.timestamps
end
</ruby>

will create a new products table with those two columns (plus the +id+ column) whereas

<ruby>
change_table :products do |t|
  t.timestamps
end
</ruby>

adds those columns to an existing table.

The other helper is called +references+ (also available as +belongs_to+). In its simplest form it just adds some readability

<ruby>
create_table :products do |t|
  t.references :category
end
</ruby>

will create a +category_id+ column of the appropriate type. Note that you pass the model name, not the column name. Active Record adds the +_id+ for you. If you have polymorphic +belongs_to+ associations then +references+ will add both of the columns required:

<ruby>
create_table :products do |t|
  t.references :attachment, :polymorphic => {:default => 'Photo'}
end
</ruby>

will add an +attachment_id+ column and a string +attachment_type+ column with a default value of 'Photo'.

NOTE: The +references+ helper does not actually create foreign key constraints for you. You will need to use +execute+ for that or a plugin that adds "foreign key support":#active-record-and-referential-integrity.

If the helpers provided by Active Record aren't enough you can use the +execute+ function to execute arbitrary SQL.

For more details and examples of individual methods check the API documentation, in particular the documentation for "<tt>ActiveRecord::ConnectionAdapters::SchemaStatements</tt>":http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html (which provides the methods available in the +up+ and +down+ methods), "<tt>ActiveRecord::ConnectionAdapters::TableDefinition</tt>":http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/TableDefinition.html (which provides the methods available on the object yielded by +create_table+) and "<tt>ActiveRecord::ConnectionAdapters::Table</tt>":http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/Table.html (which provides the methods available on the object yielded by +change_table+).

h4. Writing Your +down+ Method

The +down+ method of your migration should revert the transformations done by the +up+ method. In other words the database schema should be unchanged if you do an +up+ followed by a +down+. For example if you create a table in the +up+ method you should drop it in the +down+ method. It is wise to undo the steps in precisely the reverse order to that used in the +up+ method. For example

<ruby>
class ExampleMigration < ActiveRecord::Migration

  def self.up
    create_table :products do |t|
      t.references :category
    end
    # add a foreign key
    execute <<-SQL
      ALTER TABLE products
        ADD CONSTRAINT fk_products_categories
        FOREIGN KEY (category_id)
        REFERENCES categories(id)
    SQL

    add_column :users, :home_page_url, :string

    rename_column :users, :email, :email_address
  end

  def self.down
    rename_column :users, :email_address, :email
    remove_column :users, :home_page_url
    execute "ALTER TABLE products DROP FOREIGN KEY fk_products_categories"
    drop_table :products
  end
end
</ruby>

Sometimes your migration will do something which is just plain irreversible, for example it might destroy some data. In cases like those when you can't reverse the migration you can raise +IrreversibleMigration+ from your +down+ method. If someone tries to revert your migration an error message will be displayed saying that it can't be done.
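The "reverse order" rule and the idea of an irreversible step can be modelled in a few lines of plain Ruby. This is only a thought experiment: the steps are plain arrays rather than real schema statements, and +Irreversible+, +invert+ and +down_steps+ are hypothetical names (+Irreversible+ stands in for +ActiveRecord::IrreversibleMigration+).

```ruby
# Hypothetical stand-in for ActiveRecord::IrreversibleMigration.
class Irreversible < StandardError; end

# Each step is [operation, *arguments], kept as plain data.
def invert(step)
  operation, *args = step
  case operation
  when :create_table  then [:drop_table, args[0]]
  when :add_column    then [:remove_column, args[0], args[1]]
  when :rename_column then [:rename_column, args[0], args[2], args[1]]
  else raise Irreversible, "cannot undo #{operation}"
  end
end

# Undo the last step first, then work backwards.
def down_steps(up_steps)
  up_steps.reverse.map { |step| invert(step) }
end

up = [
  [:create_table, :products],
  [:add_column, :users, :home_page_url, :string],
  [:rename_column, :users, :email, :email_address]
]
down_steps(up).each { |step| p step }
# prints:
# [:rename_column, :users, :email_address, :email]
# [:remove_column, :users, :home_page_url]
# [:drop_table, :products]
```

A step with no known inverse, such as removing a column whose data is lost, raises the error instead of pretending the change can be undone.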

h3. Running Migrations

Rails provides a set of rake tasks to work with migrations which boil down to running certain sets of migrations. The very first migration related rake task you use will probably be +db:migrate+. In its most basic form it just runs the +up+ method for all the migrations that have not yet been run. If there are no such migrations it exits.

Note that running +db:migrate+ also invokes the +db:schema:dump+ task, which will update your +db/schema.rb+ file to match the structure of your database.

If you specify a target version, Active Record will run the required migrations (up or down) until it has reached the specified version. The version is the numerical prefix on the migration's filename. For example to migrate to version 20080906120000 run

<shell>
rake db:migrate VERSION=20080906120000
</shell>

If this is greater than the current version (i.e. it is migrating upwards) this will run the +up+ method on all migrations up to and including 20080906120000. If migrating downwards this will run the +down+ method on all the migrations down to, but not including, 20080906120000.
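The inclusive/exclusive boundaries are easy to get wrong, so here is a toy model of them in plain Ruby. It only mimics the rule stated above using version numbers; it is not Rails' migrator, and +plan+ is a name invented for this sketch.

```ruby
# Toy model of "rake db:migrate VERSION=target".
# Returns the direction and the versions whose up/down method would run.
def plan(all_versions, applied_versions, target)
  current = applied_versions.max || 0
  if target >= current
    # Upwards: everything not yet run, up to and including the target.
    up = all_versions.select { |v| v <= target && !applied_versions.include?(v) }
    [:up, up.sort]
  else
    # Downwards: everything above the target, newest first; the target stays.
    down = applied_versions.select { |v| v > target }
    [:down, down.sort.reverse]
  end
end

all = [20080906120000, 20080906123000, 20080906124500]

p plan(all, [20080906120000], 20080906123000)
# => [:up, [20080906123000]]
p plan(all, all, 20080906120000)
# => [:down, [20080906124500, 20080906123000]]
```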

h4. Rolling Back

A common task is to roll back the last migration, for example if you made a mistake in it and wish to correct it. Rather than tracking down the version number associated with the previous migration you can run

<shell>
rake db:rollback
</shell>

This will run the +down+ method from the latest migration. If you need to undo several migrations you can provide a +STEP+ parameter:

<shell>
rake db:rollback STEP=3
</shell>

will run the +down+ method from the last 3 migrations.
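As a rough mental model (not the actual Rake task), +STEP+ simply selects the most recently applied versions, newest first. +rollback_versions+ is a hypothetical helper for this illustration.

```ruby
# Versions whose down method would run for "rake db:rollback STEP=step".
def rollback_versions(applied_versions, step = 1)
  applied_versions.sort.last(step).reverse
end

applied = [20080906120000, 20080906123000, 20080906124500]

p rollback_versions(applied)
# => [20080906124500]
p rollback_versions(applied, 2)
# => [20080906124500, 20080906123000]
```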

The +db:migrate:redo+ task is a shortcut for doing a rollback and then migrating back up again. As with the +db:rollback+ task you can use the +STEP+ parameter if you need to go more than one version back, for example

<shell>
rake db:migrate:redo STEP=3
</shell>

Neither of these Rake tasks does anything you could not do with +db:migrate+. They are simply more convenient since you do not need to explicitly specify the version to migrate to.

Lastly, the +db:reset+ task will drop the database, recreate it and load the current schema into it.

NOTE: This is not the same as running all the migrations - see the section on "schema.rb":#schema-dumping-and-you.

h4. Being Specific

If you need to run a specific migration up or down the +db:migrate:up+ and +db:migrate:down+ tasks will do that. Just specify the appropriate version and the corresponding migration will have its +up+ or +down+ method invoked, for example

<shell>
rake db:migrate:up VERSION=20080906120000
</shell>

will run the +up+ method from the 20080906120000 migration. These tasks check whether the migration has already run, so for example +db:migrate:up VERSION=20080906120000+ will do nothing if Active Record believes that 20080906120000 has already been run.

h4. Being Talkative

By default migrations tell you exactly what they're doing and how long it took. A migration creating a table and adding an index might produce output like this

<shell>
20080906170109 CreateProducts: migrating
-- create_table(:products)
   -> 0.0021s
-- add_index(:products, :name)
   -> 0.0026s
20080906170109 CreateProducts: migrated (0.0059s)
</shell>

Several methods are provided that allow you to control all this:

* +suppress_messages+ suppresses any output generated by its block
* +say+ outputs text (the second argument controls whether it is indented or not)
* +say_with_time+ outputs text along with how long it took to run its block. If the block returns an integer it assumes it is the number of rows affected.

For example, this migration

<ruby>
class CreateProducts < ActiveRecord::Migration
  def self.up
    suppress_messages do
      create_table :products do |t|
        t.string :name
        t.text :description
        t.timestamps
      end
    end
    say "Created a table"
    suppress_messages {add_index :products, :name}
    say "and an index!", true
    say_with_time 'Waiting for a while' do
      sleep 10
      250
    end
  end

  def self.down
    drop_table :products
  end
end
</ruby>

generates the following output

<shell>
20080906170109 CreateProducts: migrating
Created a table
   -> and an index!
Waiting for a while
   -> 10.0001s
   -> 250 rows
20080906170109 CreateProducts: migrated (10.0097s)
</shell>

If you just want Active Record to shut up then running +rake db:migrate VERBOSE=false+ will suppress any output.

h3. Using Models in Your Migrations

When creating or updating data in a migration it is often tempting to use one of your models. After all they exist to provide easy access to the underlying data. This can be done but some caution should be observed.

Consider for example a migration that uses the +Product+ model to update a row in the corresponding table. Alice later updates the +Product+ model, adding a new column and a validation on it. Bob comes back from holiday, updates the source and runs outstanding migrations with +rake db:migrate+, including the one that used the +Product+ model. When the migration runs the source is up to date and so the +Product+ model has the validation added by Alice. The database however is still old and so does not have that column and an error ensues because that validation is on a column that does not yet exist.

Frequently I just want to update rows in the database without writing out the SQL by hand: I'm not using anything specific to the model. One pattern for this is to define a copy of the model inside the migration itself, for example:

<ruby>
class AddPartNumberToProducts < ActiveRecord::Migration
  class Product < ActiveRecord::Base
  end

  def self.up
    ...
  end

  def self.down
    ...
  end
end
</ruby>

The migration has its own minimal copy of the +Product+ model and no longer cares about the +Product+ model defined in the application.

h4. Dealing with Changing Models

For performance reasons information about the columns a model has is cached. For example if you add a column to a table and then try and use the corresponding model to insert a new row it may try and use the old column information. You can force Active Record to re-read the column information with the +reset_column_information+ method, for example

<ruby>
class AddPartNumberToProducts < ActiveRecord::Migration
  class Product < ActiveRecord::Base
  end

  def self.up
    add_column :products, :part_number, :string
    Product.reset_column_information
    ...
  end

  def self.down
    ...
  end
end
</ruby>

h3. Schema Dumping and You

h4. What are Schema Files for?

Migrations, mighty as they may be, are not the authoritative source for your database schema. That role falls to either +db/schema.rb+ or an SQL file which Active Record generates by examining the database. They are not designed to be edited, they just represent the current state of the database.

There is no need (and it is error prone) to deploy a new instance of an app by replaying the entire migration history. It is much simpler and faster to just load into the database a description of the current schema.

For example, this is how the test database is created: the current development database is dumped (either to +db/schema.rb+ or +db/development_structure.sql+) and then loaded into the test database.

Schema files are also useful if you want a quick look at what attributes an Active Record object has. This information is not in the model's code and is frequently spread across several migrations but is all summed up in the schema file. The "annotate_models":http://agilewebdevelopment.com/plugins/annotate_models plugin, which automatically adds (and updates) comments at the top of each model summarising the schema, may also be of interest.

h4. Types of Schema Dumps

There are two ways to dump the schema. This is set in +config/environment.rb+ by the +config.active_record.schema_format+ setting, which may be either +:sql+ or +:ruby+.
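As a configuration fragment, the setting might look like the sketch below. It assumes the standard +Rails::Initializer.run+ block that a Rails 2.x application already has in +config/environment.rb+; only the +schema_format+ line is the point of the example.

```ruby
# config/environment.rb (fragment, assuming the usual Rails 2.x initializer block)
Rails::Initializer.run do |config|
  # :ruby (the default) dumps to db/schema.rb;
  # :sql dumps the structure using database specific SQL instead.
  config.active_record.schema_format = :sql
end
```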

If +:ruby+ is selected then the schema is stored in +db/schema.rb+. If you look at this file you'll find that it looks an awful lot like one very big migration:

<ruby>
ActiveRecord::Schema.define(:version => 20080906171750) do
  create_table "authors", :force => true do |t|
    t.string "name"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  create_table "products", :force => true do |t|
    t.string "name"
    t.text "description"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.string "part_number"
  end
end
</ruby>

In many ways this is exactly what it is. This file is created by inspecting the database and expressing its structure using +create_table+, +add_index+, and so on. Because this is database independent it could be loaded into any database that Active Record supports. This could be very useful if you were to distribute an application that is able to run against multiple databases.

There is however a trade-off: +db/schema.rb+ cannot express database specific items such as foreign key constraints, triggers or stored procedures. While in a migration you can execute custom SQL statements, the schema dumper cannot reconstitute those statements from the database. If you are using features like this then you should set the schema format to +:sql+.

Instead of using Active Record's schema dumper the database's structure will be dumped using a tool specific to that database (via the +db:structure:dump+ Rake task) into +db/#{RAILS_ENV}_structure.sql+. For example for PostgreSQL the +pg_dump+ utility is used and for MySQL this file will contain the output of +SHOW CREATE TABLE+ for the various tables. Loading this schema is simply a question of executing the SQL statements contained inside.

By definition this will be a perfect copy of the database's structure but this will usually prevent loading the schema into a database other than the one used to create it.

h4. Schema Dumps and Source Control

Because schema dumps are the authoritative source for your database schema, it is strongly recommended that you check them into source control.

h3. Active Record and Referential Integrity

The Active Record way claims that intelligence belongs in your models, not in the database. As such, features such as triggers or foreign key constraints, which push some of that intelligence back into the database, are not heavily used.

Validations such as +validates_uniqueness_of+ are one way in which models can enforce data integrity. The +:dependent+ option on associations allows models to automatically destroy child objects when the parent is destroyed. Like anything which operates at the application level these cannot guarantee referential integrity and so some people augment them with foreign key constraints.

Although Active Record does not provide any tools for working directly with such features, the +execute+ method can be used to execute arbitrary SQL. There are also a number of plugins such as "redhillonrails":http://agilewebdevelopment.com/plugins/search?search=redhillonrails which add foreign key support to Active Record (including support for dumping foreign keys in +db/schema.rb+).

h3. Changelog

"Lighthouse ticket":http://rails.lighthouseapp.com/projects/16213-rails-guides/tickets/6

* September 14, 2008: initial version by "Frederick Cheung":credits.html#fcheung
