May 21, 2009

Bulk view

Drupal from the trenches: Database glitches

The problem: Our Drupal site stopped displaying nodes on our fuzzysearch-based pages.

I started breaking down the SQL query to find out which joined table didn’t have the data that was supposed to be there. At first, I thought it might be that the publication dates or expiration dates were incorrect, but that wasn’t it. All of the nodes were set to ‘published’ (status = 1), and all remained correctly associated with the domain through the Domain module.

I checked the search_fuzzy_index table. Hmm. The table was smaller than it should be. When I queried it for a node that should’ve been there, I got zero rows.

I tried forcing the system to reindex the node. Still nothing.

I printed out the SQL statements and ran them myself. ERROR 1062 (23000): Duplicate entry '2147483647' for key 1.

Wha?

That looked like a suspiciously large number. I popped it into Google and found a lot of people with the same problem. Turns out that the autoincrement sequence fuzzysearch was using had jumped all the way to the end, for some reason or another (we couldn’t have been reindexing _that_ much!). I changed the column to a BIGINT, reindexed all the nodes using Drush commands like:

drush eval "\$rs=db_query(\"select nid from node where type='transitions_job'\");while(\$nid=db_fetch_array(\$rs)){fuzzysearch_reindex(\$nid['nid'], 'fuzzysearch');} fuzzysearch_cron();"

(using my custom eval command),

That worked. I reported the results to my team members in our group chat.

The IT architect said we were still having disk space issues. We’d removed all the unnecessary files and stale backups already, and I didn’t think there was much more that we could trim. I used du --max-depth=2 /var | sort -n to look at the disk usage in the /var tree, which was where we were having those problems. The three biggest directories were /var/www, /var/lib/mysql, and /var/spool/mail, and we’d already scrubbed as much as we could.

I used ls -lSr to look at the contents of /var/lib/mysql, and noticed that our watchdog table was 2.3 GBs. Gah! Turned out that Drupal had logged each of the PHP errors raised when we were trying to fix the previous problem. We didn’t want to delete all the logs, so we just deleted the logs of type ‘php’. After the IT architect ran DELETE FROM watchdog WHERE type='php'" in the MySQL client, though, we still didn't have free space.

I guessed that the lack of free space could be solved by compacting the MySQL database, which I did with OPTIMIZE TABLE watchdog. That solved it!

That was an interesting day. =)