Drupal, HTML Purifier, and embedding IFRAMEs from YouTube

Tags: drupal, geek, work

I know, I know. I shouldn’t allow IFRAMEs at all. But the client’s prospective users were really excited about images and video, and Drupal’s Media module wasn’t going to be quite enough. So I’ve been fighting with CKEditor, IMCE, and HTML Purifier to figure out how to make embedding images and video easier. I’m hoping that this will be like practically all my other Drupal posts and someone will comment with a much better way to do things right after I describe what I’ve done. =)

First: images. There doesn’t seem to be a cleaner way than the “Browse server” – “Upload” combination using CKEditor and IMCE. I tried using WYSIWYG, TinyMCE and IMCE. I tried ImageBrowser, but I couldn’t get it to work. I tried FCKEditor, which looked promising, but I got tangled in figuring out how to control other parts of it. I’m just going to leave it as CKEditor and IMCE at the moment, and we can come back to that if it turns out to be higher priority than all the other things I’m working on. This is almost certainly my limitation rather than the packages’ limitations, but I don’t have the time to exhaustively tweak this until it’s right. Someday I may finally learn how to make a CKEditor plugin, but it will not be in the final week of this Drupal project.

Next: HTML Purifier and YouTube. You see, YouTube switched to using IFRAMEs instead of Flash embeds. Allowing IFRAMEs is like allowing people to put arbitrary content on your webpage, because it is. The HTML Purifier folks seem firmly against it because it’s a bad idea, which it also is. But you’ve got to work around what you’ve got to work around. Based on the “Allow iframes” thread in the HTML Purifier forum, this is what I came up with:

Step 1. Create a custom filter in htmlpurifier/library/myiframe.php.

<?php
// Iframe filter that does some primitive whitelisting in a
// somewhat recognizable and tweakable way
class HTMLPurifier_Filter_MyIframe extends HTMLPurifier_Filter
{
  public $name = 'MyIframe';
  public function preFilter($html, $config, $context) {
    $html = preg_replace('/<iframe/i', '<img class="MyIframe"', $html);
    $html = preg_replace('#</iframe>#i', '', $html);
    return $html;
  }
  public function postFilter($html, $config, $context) {
    $post_regex = '#<img class="MyIframe"([^>]+?)>#';
    return preg_replace_callback($post_regex, array($this, 'postFilterCallback'), $html);
  }
  protected function postFilterCallback($matches) {
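    // Whitelist the domains we like. The dots are escaped so they only
    // match a literal ".", and https is accepted as well as http.
    $ok = preg_match('#src="https?://www\.youtube\.com/#i', $matches[1]);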
    if ($ok) {
      return '<iframe ' . $matches[1] . '></iframe>';
    } else {
      return '';
    }
  }
}
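
If the regex whitelist feels too primitive, one possible tightening is to pull out the src value, parse it, and compare the host exactly against a list. This is only a sketch under assumptions: my_iframe_src_is_allowed and its $allowed_hosts parameter are made-up names, not part of HTML Purifier or Drupal, and it assumes the attribute string is double-quoted the way the placeholder tag comes back from the purifier.

```php
<?php
// Hypothetical helper (not an HTML Purifier or Drupal API).
// $attributes is the attribute string captured from the placeholder tag,
// e.g. ' width="560" src="http://www.youtube.com/embed/VIDEO_ID"'.
function my_iframe_src_is_allowed($attributes, array $allowed_hosts = array('www.youtube.com')) {
  // Find a double-quoted src attribute; the leading \s keeps
  // attributes such as data-src from matching.
  if (!preg_match('#\ssrc="([^"]*)"#i', $attributes, $m)) {
    return false;
  }
  // Attribute values may contain escaped entities such as &amp;.
  $url = html_entity_decode($m[1], ENT_QUOTES);
  $scheme = parse_url($url, PHP_URL_SCHEME);
  $host = parse_url($url, PHP_URL_HOST);
  // parse_url() gives null or false when a part is missing or the URL is malformed.
  if (!is_string($scheme) || !is_string($host)) {
    return false;
  }
  if (!in_array(strtolower($scheme), array('http', 'https'), true)) {
    return false;
  }
  // Exact host comparison, so look-alike hosts are rejected.
  return in_array(strtolower($host), $allowed_hosts, true);
}
```

In the filter, postFilterCallback could then call my_iframe_src_is_allowed($matches[1]) in place of the preg_match check. Adding another video site would mean adding its embed host to the array.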

Step 2. Include the filter in HTMLPurifier_DefinitionCache_Drupal.php. I don’t know if this is the right place, but I saw it briefly mentioned somewhere.

// ... rest of file
require_once 'myiframe.php';

Step 3. Create the HTML Purifier config file. In this case, I was changing the config for “Filtered HTML”, which had the input format ID of 1. I copied config/sample.php to config/1.php and set the following:

function htmlpurifier_config_1($config) {
  $config->set('HTML.SafeObject', true);
  $config->set('Output.FlashCompat', true);
  $config->set('URI.DisableExternalResources', false);
  $config->set('Filter.Custom', array(new HTMLPurifier_Filter_MyIframe()));
}

Now I can switch to the source view in CKEditor, paste in my IFRAME code from YouTube, and view the results. Mostly. I still need to track down why I sometimes need to refresh the page in order to see it, but this is promising.
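
For a quick sanity check outside Drupal, the two regex steps can be run back to back from the command line. This is a rough sketch, not the real pipeline: it skips HTML Purifier entirely (so nothing is actually purified between the two steps), the function names are made up, VIDEO_ID is a placeholder, and the whitelist pattern escapes the dots in the host name.

```php
<?php
// Stand-alone copies of the filter's two regex steps (hypothetical names).
function my_iframe_to_placeholder($html) {
  // Same idea as preFilter: disguise the iframe as an img placeholder.
  $html = preg_replace('/<iframe/i', '<img class="MyIframe"', $html);
  return preg_replace('#</iframe>#i', '', $html);
}

function my_placeholder_to_iframe($html) {
  // Same idea as postFilter: restore whitelisted placeholders, drop the rest.
  return preg_replace_callback(
    '#<img class="MyIframe"([^>]+?)>#',
    function ($matches) {
      if (preg_match('#src="http://www\.youtube\.com/#i', $matches[1])) {
        return '<iframe ' . $matches[1] . '></iframe>';
      }
      return '';
    },
    $html
  );
}

$good = '<iframe width="560" height="349" src="http://www.youtube.com/embed/VIDEO_ID"></iframe>';
$bad = '<iframe src="http://example.com/"></iframe>';

// The YouTube iframe should come back; the other one should vanish.
echo my_placeholder_to_iframe(my_iframe_to_placeholder($good)), "\n";
echo my_placeholder_to_iframe(my_iframe_to_placeholder($bad)), "\n";
```

The restored tag ends up with two spaces after "<iframe", because the captured attributes already start with a space. Browsers don't care, but it is worth knowing if you compare strings.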

2011-08-05 Fri 16:34
