Drupal 5: Migrating a production database to a QA server

| development, drupal

Building on the configuration management strategy I described last time, I wrote some scripts to make it easier for other developers to migrate the production database to the QA server or to get a copy of the production database for their local system. I needed to consider the following factors:

  • Domain name changes: Because we use Domain Access to serve multiple subdomains from a single Drupal installation with shared sign-on, we needed to make sure that all instances of the domain root were replaced and the correct domain_root was set. For example, the site URLs might be:
    | Production      | QA                 | Local                 |
    |-----------------|--------------------|-----------------------|
    | example.com     | qa.example.com     | local.example.com     |
    | foo.example.com | foo.qa.example.com | foo.local.example.com |
    | bar.example.com | bar.qa.example.com | bar.local.example.com |
  • Privacy concerns: The QA database and local copies of the database should not contain sensitive user information. All users aside from the administrator should be removed from the system, and all content on the site should be limited to editorial content.
  • Efficiency: We don’t need data like access logs or watchdog logs in QA or for local testing. Leaving those tables empty saves quite a lot of time and space during the data migration process.

Here’s how I did it:

  1. I identified tables I needed to copy over and tables that I could just leave empty. I did this by going through the output of “SHOW TABLES” in a MySQL console.
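    For example, using the same variables as in the Makefile below:
    mysql -u ${DB_USER} --password=${DB_PASSWORD} ${DB} -e 'SHOW TABLES'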
  2. In my Makefile, I declared a DB_DATA variable that contained a list of tables I wanted to copy.
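    The declaration looked something like this (the exact table list is illustrative; yours depends on which modules you use, and the credentials are placeholders):
    DB_USER = drupal
    DB_PASSWORD = secret
    DB = production_db
    DB_DATA = node node_revisions comments files term_data term_hierarchy term_node url_alias variable vocabulary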
  3. I wrote a backup_partial target in my Makefile that issued a series of commands:
    mysqldump -u ${DB_USER} --password=${DB_PASSWORD} ${DB} --no-data > partial.dump  # Export the schema
    mysqldump -u ${DB_USER} --password=${DB_PASSWORD} ${DB} --opt --complete-insert ${DB_DATA} >> partial.dump  # Dump some of the data
    mysqldump -u ${DB_USER} --password=${DB_PASSWORD} ${DB} --opt --complete-insert --where='uid<=1' users users_roles >> partial.dump  # Dump only the admin and anonymous users
    echo "UPDATE node SET uid=1;" >> partial.dump  # Set all the node authors to admin
    echo "REPLACE INTO search_index_queue (nid, module, timestamp) SELECT nid, type, UNIX_TIMESTAMP(NOW()) FROM node;" >> partial.dump  # Queue all nodes for reindexing
    
  4. I wrote a shell script on an internal server that accepted an argument (the domain to translate to) and performed the following steps:
    1. ssh into the production server and run make with my backup_partial target, compressing the resulting partial.dump
    2. scp the partial.dump.gz from the production server onto the internal server
    3. unpack partial.dump.gz
    4. figure out what $DOMAIN is supposed to be based on the arguments
    5. run perl -pi -e "s/example\.com/$DOMAIN/g" partial.dump (the /g makes it replace every occurrence, not just the first one on each line)
    6. load partial.dump into my database
    7. hit cron.php on the new site, if it can, to trigger search reindexing
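    Putting those steps together, the script looked roughly like this (a minimal sketch: the script name migrate-db.sh, host names, paths, and credentials are illustrative placeholders, not the real values):
    #!/bin/bash
    # migrate-db.sh: sketch of the migration script; all names are placeholders.
    # Assumes DB_USER, DB_PASSWORD, and DB are set in the environment.
    set -e
    DOMAIN="$1"   # e.g. qa.example.com or local.example.com
    if [ -z "$DOMAIN" ]; then echo "Usage: $0 target-domain" >&2; exit 1; fi
    # 1. Build and compress the partial dump on the production server
    ssh deploy@example.com 'cd /var/www/drupal && make backup_partial && gzip -f partial.dump'
    # 2-3. Copy it to this server and unpack it
    scp deploy@example.com:/var/www/drupal/partial.dump.gz .
    gunzip -f partial.dump.gz
    # 4-5. Rewrite every occurrence of the production domain
    perl -pi -e "s/example\.com/$DOMAIN/g" partial.dump
    # 6. Load the translated dump into the target database
    mysql -u ${DB_USER} --password=${DB_PASSWORD} ${DB} < partial.dump
    # 7. Hit cron.php so the search index queue gets processed
    wget -q -O /dev/null "http://$DOMAIN/cron.php" || true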
  5. I added two buttons to my web-based deploy script: one button to migrate the production database to the QA server, and one button to make a copy of the production database for the domain “local.example.com”. Both buttons call the shell script from step 4 with the appropriate target domain.
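    In other words, the buttons run something like this (using the hypothetical script name from the sketch in step 4):
    ./migrate-db.sh qa.example.com       # the QA button
    ./migrate-db.sh local.example.com    # the local-copy button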
  6. I created a multisite settings.php for each environment in my Drupal sites directory (e.g. sites/local.example.com/settings.php and sites/qa.example.com/settings.php). The production settings live in sites/default/settings.php, and each multisite settings.php overrides them like this:
    $conf = array(
      'domain_root' => 'local.example.com',
    );
    $cookie_domain = '.' . $conf['domain_root'];
    

    $conf allows you to override the Drupal variables returned by variable_get(), without changing the values stored in the database.
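    For example, once the local site has bootstrapped, code that asks for domain_root sees the override:
    $root = variable_get('domain_root', 'example.com'); // returns 'local.example.com'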

So now, I can click on a button to migrate a sanitized copy of the production database to the QA server or to my local system. Sweet!
