Drupal in the trenches: MySQL and DELETE joins; multiple tests

Today was the first time I used a DELETE statement that joined multiple tables. I didn’t even know you could do that. =)

I needed to look into this because my update function was just taking so much time and memory. We have a little over 3,000 nodes, most of which are usernodes automatically created by the usernode module. I needed to rebuild the node access entries for a few hundred content nodes to take into account the new partner access module I wrote, but node_access_rebuild took quite a while.

The first approach I thought of taking was to delete usernode because we didn’t seem to need it any more. In order to do that, though, I would still have to delete all the user nodes, though, and invoking all the hooks for each of the 2000+ user nodes took a while as well.

The second approach was to delete just the node access records related to the nodes I needed to update. I found a way to do it even faster – instead of querying the node IDs and then deleting individual rows, I could do something like this instead:

db_query("DELETE FROM na USING {node} n
          INNER JOIN {node_access} na ON (n.nid=na.nid) WHERE n.type='%s'", $type);
$rs = db_query("SELECT nid FROM {node} WHERE type='%s'", $type);
while ($row = db_fetch_array($rs)) {
  $node = node_load($row['nid'], NULL, TRUE);
  if (!empty($node)) {
    node_access_acquire_grants($node);
  }
}

(UPDATE: Fixed MySQL statement)

That was interesting. =)

I was also running into memory problems trying to run all the unit tests in one go. I broke it into individual tests with:

drush test list | grep KEYWORD | xargs -n 1 drush test run | tee out

where KEYWORD is the keyword I use to filter the list for just the tests I wanted to run, and out is the file that stores a copy of the results. It’s not neatly summarized, but oh well. =) It works.

  • http://synapsefire.blogspot.com/ Stefan Kudwien

    Beware though, this is a MySQL-specific extension to DELETE. FWIW, an SQL standards compliant approach would be using a subquery:

    db_query("DELETE FROM {node_access} WHERE nid IN (SELECT nid FROM {node} WHERE type = '%s')", $type);

    However, this will most likely be slower.

  • http://robshouse.net Robert Douglass

    I ran into a nearly identical problem yesterday with an upgrade script that needed to delete 6000 nodes. It seems to me that, in Drupal 7, now that we track foreign keys, we should be able to get rid of a lot of the deleting that happens through hooks and just depend on cascading deletes at the database level. Hopefully this will speed the mass deleting of nodes even further. I believe that Drupal 7 also allows for sending in an array of nids for deletion, but haven’t tested to see if it helps the case of mass deletion significantly.

    Here’s the commit that puts foreign keys into D7.
    http://is.gd/M4Dy