EmacsConf backstage: making lots of intro videos with subed-record
| emacsconf, subed, emacsSummary (735 words): Emacs is a handy audio/video editor. subed-record can combine multiple audio files and images to create multiple output videos.
It's nice to feel like you're saying someone's name correctly. We ask EmacsConf speakers to introduce themselves in the first few seconds of their video, but people often forget to do that, so that's okay. We started recording introductions for EmacsConf 2022 so that stream hosts don't have to worry about figuring out pronunciation while they're live. Here's how I used subed-record to turn my recordings into lots of little videos.
First, I generated the title images by using Emacs Lisp to replace
text in a template SVG and then using Inkscape to convert the SVG into
a PNG. Each image showed information for the previous talk as well as
the upcoming talk. (emacsconf-stream-generate-in-between-pages
)
Then I generated the text for each talk based on the title, the
speaker names, pronunciation notes, pronouns, and type of Q&A. Each
introduction generally followed the pattern, "Next we have title by
speakers. Details about Q&A." (emacsconf-pad-expand-intro
and
emacsconf-subed-intro-subtitles
below)
00:00:00.000 --> 00:00:00.999 #+OUTPUT: sat-open.webm [[file:/home/sacha/proj/emacsconf/2023/assets/in-between/sat-open.svg.png]] Next, we have "Saturday opening remarks". 00:00:05.000 --> 00:00:04.999 #+OUTPUT: adventure.webm [[file:/home/sacha/proj/emacsconf/2023/assets/in-between/adventure.svg.png]] Next, we have "An Org-Mode based text adventure game for learning the basics of Emacs, inside Emacs, written in Emacs Lisp", by Chung-hong Chan. He will answer questions via Etherpad.
I copied the text into an Org note in my inbox, which Syncthing copied over to the Orgzly Revived app on my Android phone. I used Google Recorder to record the audio. I exported the m4a audio file and a rough transcript, copied them back via Syncthing, and used subed-record to edit the audio into a clean audio file without oopses.
Each intro had a set of captions that started with a NOTE
comment.
The NOTE
comment specified the following:
#+AUDIO:
: the audio source to use for the timestamped captions that follow[[file:...]]
: the title image I generated for each talk. Whensubed-record-compile-video
sees a comment with a link to an image, video, or animated GIF, it takes that visual and uses it for the span of time until the next visual.#+OUTPUT:
the file to create.
NOTE #+OUTPUT: hyperdrive.webm [[file:/home/sacha/proj/emacsconf/2023/assets/in-between/hyperdrive.svg.png]] #+AUDIO: intros-2023-11-21-cleaned.opus 00:00:15.680 --> 00:00:17.599 Next, we have "hyperdrive.el: 00:00:17.600 --> 00:00:21.879 Peer-to-peer filesystem in Emacs", by Joseph Turner 00:00:21.880 --> 00:00:25.279 and Protesilaos Stavrou (also known as Prot). 00:00:25.280 --> 00:00:27.979 Joseph will answer questions via BigBlueButton, 00:00:27.980 --> 00:00:31.080 and Prot might be able to join depending on the weather. 00:00:31.081 --> 00:00:33.439 You can join using the URL from the talk page 00:00:33.440 --> 00:00:36.320 or ask questions through Etherpad or IRC. NOTE #+OUTPUT: steno.webm [[file:/home/sacha/proj/emacsconf/2023/assets/in-between/steno.svg.png]] #+AUDIO: intros-2023-11-19-cleaned.opus 00:03:23.260 --> 00:03:25.480 Next, we have "Programming with steno", 00:03:25.481 --> 00:03:27.700 by Daniel Alejandro Tapia. NOTE #+AUDIO: intro-2023-11-29-cleaned.opus 00:00:13.620 --> 00:00:16.580 You can ask your questions via Etherpad and IRC. 00:00:16.581 --> 00:00:18.079 We'll send them to the speaker 00:00:18.080 --> 00:00:19.919 and post the answers in the talk page 00:00:19.920 --> 00:00:21.320 after the conference.
I could then call subed-record-compile-video
to create the videos
for all the intros, or mark a region with C-SPC
and then
subed-record-compile-video
only the intros inside that region.
Using Emacs to edit the audio and compile videos worked out really well because it made it easy to change things.
- Changing pronunciation or titles: For EmacsConf 2023, I got the
recordings sorted out in time for the speakers to correct my
pronunciation if they wanted to. Some speakers also changed their
talk titles midway. If I wanted to redo an intro, I just had to
rerecord that part, run it through my subed-record audio cleaning
process, add an
#+AUDIO:
comment specifying which file I want to take the audio from, paste it into my mainintros.vtt
, and recompile the video. - Cancelling talks: One of the talks got cancelled, so I needed to update the images for the talk before it and the talk after it. I regenerated the title images and recompiled the videos. I didn't even need to figure out which talk needed to be updated - it was easy enough to just recompile all of them.
- Changing type of Q&A: For example, some speakers needed to switch
from answering questions live to answering them after the
conference. I could just delete the old instructions, paste in the
instructions from elsewhere in my
intros.vtt
(making sure to set#+AUDIO
to the file if it came from a different take), and recompile the video.
And of course, all the videos were captioned. Bonus!
So that's how using Emacs to edit and compile simple videos saved me a lot of time. I don't know how I'd handle this otherwise. 47 video projects that might all need to be updated if, say, I changed the template? Yikes. Much better to work with text. Here are the technical details.
Generating the title images
I used Inkscape to add IDs to our template SVG so that I could edit them with Emacs Lisp. From emacsconf-stream.el:
emacsconf-stream-generate-in-between-pages: Generate the title images.
(defun emacsconf-stream-generate-in-between-pages (&optional info) "Generate the title images." (interactive) (setq info (or emacsconf-schedule-draft (emacsconf-publish-prepare-for-display (emacsconf-filter-talks (or info (emacsconf-get-talk-info)))))) (let* ((by-track (seq-group-by (lambda (o) (plist-get o :track)) info)) (dir (expand-file-name "in-between" emacsconf-stream-asset-dir)) (template (expand-file-name "template.svg" dir))) (unless (file-directory-p dir) (make-directory dir t)) (mapc (lambda (track) (let (prev) (mapc (lambda (talk) (let ((dom (xml-parse-file template))) (mapc (lambda (entry) (let ((prefix (car entry))) (emacsconf-stream-svg-set-text dom (concat prefix "title") (plist-get (cdr entry) :title)) (emacsconf-stream-svg-set-text dom (concat prefix "speakers") (plist-get (cdr entry) :speakers)) (emacsconf-stream-svg-set-text dom (concat prefix "url") (and (cdr entry) (concat emacsconf-base-url (plist-get (cdr entry) :url)))) (emacsconf-stream-svg-set-text dom (concat prefix "qa") (pcase (plist-get (cdr entry) :q-and-a) ((rx "live") "Live Q&A after talk") ((rx "pad") "Etherpad") ((rx "IRC") "IRC Q&A after talk") (_ ""))))) (list (cons "previous-" prev) (cons "current-" talk))) (with-temp-file (expand-file-name (concat (plist-get talk :slug) ".svg") dir) (dom-print dom)) (shell-command (concat "inkscape --export-type=png -w 1280 -h 720 --export-background-opacity=0 " (shell-quote-argument (expand-file-name (concat (plist-get talk :slug) ".svg") dir))))) (setq prev talk)) (emacsconf-filter-talks (cdr track))))) by-track)))
emacsconf-stream-svg-set-text: Update DOM to set the tspan in the element with ID to TEXT.
(defun emacsconf-stream-svg-set-text (dom id text) "Update DOM to set the tspan in the element with ID to TEXT. If the element doesn't have a tspan child, use the element itself." (if (or (null text) (string= text "")) (let ((node (dom-by-id dom id))) (when node (dom-set-attribute node 'style "visibility: hidden") (dom-set-attribute (dom-child-by-tag node 'tspan) 'style "fill: none; stroke: none"))) (setq text (svg--encode-text text)) (let ((node (or (dom-child-by-tag (car (dom-by-id dom id)) 'tspan) (dom-by-id dom id)))) (cond ((null node) (error "Could not find node %s" id)) ; skip ((= (length node) 2) (nconc node (list text))) (t (setf (elt node 2) text))))))
Generating the script
From emacsconf-pad.el:
emacsconf-pad-expand-intro: Make an intro for TALK.
(defun emacsconf-pad-expand-intro (talk) "Make an intro for TALK." (cond ((null (plist-get talk :speakers)) (format "Next, we have \"%s\"." (plist-get talk :title))) ((plist-get talk :intro-note) (plist-get talk :intro-note)) (t (let ((pronoun (pcase (plist-get talk :pronouns) ((rx "she") "She") ((rx "\"ou\"" "Ou")) ((or 'nil "nil" (rx string-start "he") (rx "him")) "He") ((rx "they") "They") (_ (or (plist-get talk :pronouns) ""))))) (format "Next, we have \"%s\", by %s%s.%s" (plist-get talk :title) (replace-regexp-in-string ", \\([^,]+\\)$" ", and \\1" (plist-get talk :speakers)) (emacsconf-surround " (" (plist-get talk :pronunciation) ")" "") (pcase (plist-get talk :q-and-a) ((or 'nil "") "") ((rx "after") " You can ask questions via Etherpad and IRC. We'll send them to the speaker, and we'll post the answers on the talk page afterwards.") ((rx "live") (format " %s will answer questions via BigBlueButton. You can join using the URL from the talk page or ask questions through Etherpad or IRC." pronoun )) ((rx "pad") (format " %s will answer questions via Etherpad." pronoun )) ((rx "IRC") (format " %s will answer questions via IRC in the #%s channel." pronoun (plist-get talk :channel)))))))))
And from emacsconf-subed.el:
emacsconf-subed-intro-subtitles: Create the introduction as subtitles.
(defun emacsconf-subed-intro-subtitles () "Create the introduction as subtitles." (interactive) (subed-auto-insert) (let ((emacsconf-publishing-phase 'conference)) (mapc (lambda (sub) (apply #'subed-append-subtitle nil (cdr sub))) (seq-map-indexed (lambda (talk i) (list nil (* i 5000) (1- (* i 5000)) (format "#+OUTPUT: %s.webm\n[[file:%s]]\n%s" (plist-get talk :slug) (expand-file-name (concat (plist-get talk :slug) ".svg.png") (expand-file-name "in-between" emacsconf-stream-asset-dir)) (emacsconf-pad-expand-intro talk)))) (emacsconf-publish-prepare-for-display (emacsconf-get-talk-info))))))