EmacsConf backstage: making lots of intro videos with subed-record

| emacsconf, subed, emacs

Summary (735 words): Emacs is a handy audio/video editor. subed-record can combine multiple audio files and images to create multiple output videos.

Watch on YouTube

It's nice to feel like you're saying someone's name correctly. We ask EmacsConf speakers to introduce themselves in the first few seconds of their video, but people often forget to do that, so that's okay. We started recording introductions for EmacsConf 2022 so that stream hosts don't have to worry about figuring out pronunciation while they're live. Here's how I used subed-record to turn my recordings into lots of little videos.

First, I generated the title images by using Emacs Lisp to replace text in a template SVG and then using Inkscape to convert the SVG into a PNG. Each image showed information for the previous talk as well as the upcoming talk. (emacsconf-stream-generate-in-between-pages)

emacsconf.svg.png
Figure 1: Sample title image

Then I generated the text for each talk based on the title, the speaker names, pronunciation notes, pronouns, and type of Q&A. Each introduction generally followed the pattern, "Next we have title by speakers. Details about Q&A." (emacsconf-pad-expand-intro and emacsconf-subed-intro-subtitles below)

00:00:00.000 --> 00:00:00.999
#+OUTPUT: sat-open.webm
[[file:/home/sacha/proj/emacsconf/2023/assets/in-between/sat-open.svg.png]]
Next, we have "Saturday opening remarks".

00:00:05.000 --> 00:00:04.999
#+OUTPUT: adventure.webm
[[file:/home/sacha/proj/emacsconf/2023/assets/in-between/adventure.svg.png]]
Next, we have "An Org-Mode based text adventure game for learning the basics of Emacs, inside Emacs, written in Emacs Lisp", by Chung-hong Chan. He will answer questions via Etherpad.

I copied the text into an Org note in my inbox, which Syncthing copied over to the Orgzly Revived app on my Android phone. I used Google Recorder to record the audio. I exported the m4a audio file and a rough transcript, copied them back via Syncthing, and used subed-record to edit the audio into a clean audio file without oopses.

Each intro had a set of captions that started with a NOTE comment. The NOTE comment specified the following:

  • #+AUDIO:: the audio source to use for the timestamped captions that follow
  • [[file:...]]: the title image I generated for each talk. When subed-record-compile-video sees a comment with a link to an image, video, or animated GIF, it takes that visual and uses it for the span of time until the next visual.
  • #+OUTPUT: the file to create.
NOTE #+OUTPUT: hyperdrive.webm
[[file:/home/sacha/proj/emacsconf/2023/assets/in-between/hyperdrive.svg.png]]
#+AUDIO: intros-2023-11-21-cleaned.opus

00:00:15.680 --> 00:00:17.599
Next, we have "hyperdrive.el:

00:00:17.600 --> 00:00:21.879
Peer-to-peer filesystem in Emacs", by Joseph Turner

00:00:21.880 --> 00:00:25.279
and Protesilaos Stavrou (also known as Prot).

00:00:25.280 --> 00:00:27.979
Joseph will answer questions via BigBlueButton,

00:00:27.980 --> 00:00:31.080
and Prot might be able to join depending on the weather.

00:00:31.081 --> 00:00:33.439
You can join using the URL from the talk page

00:00:33.440 --> 00:00:36.320
or ask questions through Etherpad or IRC.

NOTE
#+OUTPUT: steno.webm
[[file:/home/sacha/proj/emacsconf/2023/assets/in-between/steno.svg.png]]
#+AUDIO: intros-2023-11-19-cleaned.opus

00:03:23.260 --> 00:03:25.480
Next, we have "Programming with steno",

00:03:25.481 --> 00:03:27.700
by Daniel Alejandro Tapia.

NOTE
#+AUDIO: intro-2023-11-29-cleaned.opus

00:00:13.620 --> 00:00:16.580
You can ask your questions via Etherpad and IRC.

00:00:16.581 --> 00:00:18.079
We'll send them to the speaker

00:00:18.080 --> 00:00:19.919
and post the answers in the talk page

00:00:19.920 --> 00:00:21.320
after the conference.

I could then call subed-record-compile-video to create the videos for all the intros, or mark a region with C-SPC and then subed-record-compile-video only the intros inside that region.

Sample intro

Using Emacs to edit the audio and compile videos worked out really well because it made it easy to change things.

  • Changing pronunciation or titles: For EmacsConf 2023, I got the recordings sorted out in time for the speakers to correct my pronunciation if they wanted to. Some speakers also changed their talk titles midway. If I wanted to redo an intro, I just had to rerecord that part, run it through my subed-record audio cleaning process, add an #+AUDIO: comment specifying which file I want to take the audio from, paste it into my main intros.vtt, and recompile the video.
  • Cancelling talks: One of the talks got cancelled, so I needed to update the images for the talk before it and the talk after it. I regenerated the title images and recompiled the videos. I didn't even need to figure out which talk needed to be updated - it was easy enough to just recompile all of them.
  • Changing type of Q&A: For example, some speakers needed to switch from answering questions live to answering them after the conference. I could just delete the old instructions, paste in the instructions from elsewhere in my intros.vtt (making sure to set #+AUDIO to the file if it came from a different take), and recompile the video.

And of course, all the videos were captioned. Bonus!

So that's how using Emacs to edit and compile simple videos saved me a lot of time. I don't know how I'd handle this otherwise. 47 video projects that might all need to be updated if, say, I changed the template? Yikes. Much better to work with text. Here are the technical details.

Generating the title images

I used Inkscape to add IDs to our template SVG so that I could edit them with Emacs Lisp. From emacsconf-stream.el:

emacsconf-stream-generate-in-between-pages: Generate the title images.
(defun emacsconf-stream-generate-in-between-pages (&optional info)
  "Generate the title images."
  (interactive)
  (setq info (or emacsconf-schedule-draft (emacsconf-publish-prepare-for-display (emacsconf-filter-talks (or info (emacsconf-get-talk-info))))))
  (let* ((by-track (seq-group-by (lambda (o) (plist-get o :track)) info))
         (dir (expand-file-name "in-between" emacsconf-stream-asset-dir))
         (template (expand-file-name "template.svg" dir)))
    (unless (file-directory-p dir)
      (make-directory dir t))
    (mapc (lambda (track)
            (let (prev)
              (mapc (lambda (talk)
                      (let ((dom (xml-parse-file template)))
                        (mapc (lambda (entry)
                                (let ((prefix (car entry)))
                                  (emacsconf-stream-svg-set-text dom (concat prefix "title")
                                                 (plist-get (cdr entry) :title))
                                  (emacsconf-stream-svg-set-text dom (concat prefix "speakers")
                                                 (plist-get (cdr entry) :speakers))
                                  (emacsconf-stream-svg-set-text dom (concat prefix "url")
                                                 (and (cdr entry) (concat emacsconf-base-url (plist-get (cdr entry) :url))))
                                  (emacsconf-stream-svg-set-text
                                   dom
                                   (concat prefix "qa")
                                   (pcase (plist-get (cdr entry) :q-and-a)
                                     ((rx "live") "Live Q&A after talk")
                                     ((rx "pad") "Etherpad")
                                     ((rx "IRC") "IRC Q&A after talk")
                                     (_ "")))))
                              (list (cons "previous-" prev)
                                    (cons "current-" talk)))
                        (with-temp-file (expand-file-name (concat (plist-get talk :slug) ".svg") dir)
                          (dom-print dom))
                        (shell-command
                         (concat "inkscape --export-type=png -w 1280 -h 720 --export-background-opacity=0 "
                                 (shell-quote-argument (expand-file-name (concat (plist-get talk :slug) ".svg")
                                                                         dir)))))
                      (setq prev talk))
                    (emacsconf-filter-talks (cdr track)))))
          by-track)))

emacsconf-stream-svg-set-text: Update DOM to set the tspan in the element with ID to TEXT.
(defun emacsconf-stream-svg-set-text (dom id text)
  "Update DOM to set the tspan in the element with ID to TEXT.
If the element doesn't have a tspan child, use the element itself."
  (if (or (null text) (string= text ""))
      (let ((node (dom-by-id dom id)))
        (when node
          (dom-set-attribute node 'style "visibility: hidden")
          (dom-set-attribute (dom-child-by-tag node 'tspan) 'style "fill: none; stroke: none")))
    (setq text (svg--encode-text text))
    (let ((node (or (dom-child-by-tag
                     (car (dom-by-id dom id))
                     'tspan)
                    (dom-by-id dom id))))
      (cond
       ((null node)
        (error "Could not find node %s" id))                      ; skip
       ((= (length node) 2)
        (nconc node (list text)))
       (t (setf (elt node 2) text))))))

Generating the script

From emacsconf-pad.el:

emacsconf-pad-expand-intro: Make an intro for TALK.
(defun emacsconf-pad-expand-intro (talk)
  "Make an intro for TALK."
  (cond
   ((null (plist-get talk :speakers))
    (format "Next, we have \"%s\"." (plist-get talk :title)))
   ((plist-get talk :intro-note)
    (plist-get talk :intro-note))
   (t
    (let ((pronoun (pcase (plist-get talk :pronouns)
                     ((rx "she") "She")
                     ((rx "\"ou\"" "Ou"))
                     ((or 'nil "nil" (rx string-start "he") (rx "him")) "He")
                     ((rx "they") "They")
                     (_ (or (plist-get talk :pronouns) "")))))
      (format "Next, we have \"%s\", by %s%s.%s"
              (plist-get talk :title)
              (replace-regexp-in-string ", \\([^,]+\\)$"
                                        ", and \\1"
                                        (plist-get talk :speakers))
              (emacsconf-surround " (" (plist-get talk :pronunciation) ")" "")
              (pcase (plist-get talk :q-and-a)
                ((or 'nil "") "")
                ((rx "after") " You can ask questions via Etherpad and IRC. We'll send them to the speaker, and we'll post the answers on the talk page afterwards.")
                ((rx "live")
                 (format " %s will answer questions via BigBlueButton. You can join using the URL from the talk page or ask questions through Etherpad or IRC."
                         pronoun
                         ))
                ((rx "pad")
                 (format " %s will answer questions via Etherpad."
                         pronoun
                         ))
                ((rx "IRC")
                 (format " %s will answer questions via IRC in the #%s channel."
                         pronoun
                         (plist-get talk :channel)))))))))

And from emacsconf-subed.el:

emacsconf-subed-intro-subtitles: Create the introduction as subtitles.
(defun emacsconf-subed-intro-subtitles ()
  "Create the introduction as subtitles."
  (interactive)
  (subed-auto-insert)
  (let ((emacsconf-publishing-phase 'conference))
    (mapc
     (lambda (sub) (apply #'subed-append-subtitle nil (cdr sub)))
     (seq-map-indexed
      (lambda (talk i)
        (list
         nil
         (* i 5000)
         (1- (* i 5000))
         (format "#+OUTPUT: %s.webm\n[[file:%s]]\n%s"
                 (plist-get talk :slug)
                 (expand-file-name
                  (concat (plist-get talk :slug) ".svg.png")
                  (expand-file-name "in-between" emacsconf-stream-asset-dir))
                 (emacsconf-pad-expand-intro talk))))
      (emacsconf-publish-prepare-for-display (emacsconf-get-talk-info))))))

View org source for this post
You can comment with Disqus or you can e-mail me at sacha@sachachua.com.