Categories: geek » emacs » subed

RSS - Atom - Subscribe via email

EmacsConf backstage: making lots of intro videos with subed-record

| emacsconf, subed, emacs

Summary (735 words): Emacs is a handy audio/video editor. subed-record can combine multiple audio files and images to create multiple output videos.

Watch on YouTube

It's nice to feel like you're saying someone's name correctly. We ask EmacsConf speakers to introduce themselves in the first few seconds of their video, but people often forget to do that, so that's okay. We started recording introductions for EmacsConf 2022 so that stream hosts don't have to worry about figuring out pronunciation while they're live. Here's how I used subed-record to turn my recordings into lots of little videos.

First, I generated the title images by using Emacs Lisp to replace text in a template SVG and then using Inkscape to convert the SVG into a PNG. Each image showed information for the previous talk as well as the upcoming talk. (emacsconf-stream-generate-in-between-pages)

Figure 1: Sample title image

Then I generated the text for each talk based on the title, the speaker names, pronunciation notes, pronouns, and type of Q&A. Each introduction generally followed the pattern, "Next we have title by speakers. Details about Q&A." (emacsconf-pad-expand-intro and emacsconf-subed-intro-subtitles below)

00:00:00.000 --> 00:00:00.999
#+OUTPUT: sat-open.webm
Next, we have "Saturday opening remarks".

00:00:05.000 --> 00:00:04.999
#+OUTPUT: adventure.webm
Next, we have "An Org-Mode based text adventure game for learning the basics of Emacs, inside Emacs, written in Emacs Lisp", by Chung-hong Chan. He will answer questions via Etherpad.

I copied the text into an Org note in my inbox, which Syncthing copied over to the Orgzly Revived app on my Android phone. I used Google Recorder to record the audio. I exported the m4a audio file and a rough transcript, copied them back via Syncthing, and used subed-record to edit the audio into a clean audio file without oopses.

Each intro had a set of captions that started with a NOTE comment. The NOTE comment specified the following:

  • #+AUDIO:: the audio source to use for the timestamped captions that follow
  • [[file:...]]: the title image I generated for each talk. When subed-record-compile-video sees a comment with a link to an image, video, or animated GIF, it takes that visual and uses it for the span of time until the next visual.
  • #+OUTPUT: the file to create.
NOTE #+OUTPUT: hyperdrive.webm
#+AUDIO: intros-2023-11-21-cleaned.opus

00:00:15.680 --> 00:00:17.599
Next, we have "hyperdrive.el:

00:00:17.600 --> 00:00:21.879
Peer-to-peer filesystem in Emacs", by Joseph Turner

00:00:21.880 --> 00:00:25.279
and Protesilaos Stavrou (also known as Prot).

00:00:25.280 --> 00:00:27.979
Joseph will answer questions via BigBlueButton,

00:00:27.980 --> 00:00:31.080
and Prot might be able to join depending on the weather.

00:00:31.081 --> 00:00:33.439
You can join using the URL from the talk page

00:00:33.440 --> 00:00:36.320
or ask questions through Etherpad or IRC.

#+OUTPUT: steno.webm
#+AUDIO: intros-2023-11-19-cleaned.opus

00:03:23.260 --> 00:03:25.480
Next, we have "Programming with steno",

00:03:25.481 --> 00:03:27.700
by Daniel Alejandro Tapia.

#+AUDIO: intro-2023-11-29-cleaned.opus

00:00:13.620 --> 00:00:16.580
You can ask your questions via Etherpad and IRC.

00:00:16.581 --> 00:00:18.079
We'll send them to the speaker

00:00:18.080 --> 00:00:19.919
and post the answers in the talk page

00:00:19.920 --> 00:00:21.320
after the conference.

I could then call subed-record-compile-video to create the videos for all the intros, or mark a region with C-SPC and then subed-record-compile-video only the intros inside that region.

Sample intro

Using Emacs to edit the audio and compile videos worked out really well because it made it easy to change things.

  • Changing pronunciation or titles: For EmacsConf 2023, I got the recordings sorted out in time for the speakers to correct my pronunciation if they wanted to. Some speakers also changed their talk titles midway. If I wanted to redo an intro, I just had to rerecord that part, run it through my subed-record audio cleaning process, add an #+AUDIO: comment specifying which file I want to take the audio from, paste it into my main intros.vtt, and recompile the video.
  • Cancelling talks: One of the talks got cancelled, so I needed to update the images for the talk before it and the talk after it. I regenerated the title images and recompiled the videos. I didn't even need to figure out which talk needed to be updated - it was easy enough to just recompile all of them.
  • Changing type of Q&A: For example, some speakers needed to switch from answering questions live to answering them after the conference. I could just delete the old instructions, paste in the instructions from elsewhere in my intros.vtt (making sure to set #+AUDIO to the file if it came from a different take), and recompile the video.

And of course, all the videos were captioned. Bonus!

So that's how using Emacs to edit and compile simple videos saved me a lot of time. I don't know how I'd handle this otherwise. 47 video projects that might all need to be updated if, say, I changed the template? Yikes. Much better to work with text. Here are the technical details.

Generating the title images

I used Inkscape to add IDs to our template SVG so that I could edit them with Emacs Lisp. From emacsconf-stream.el:

emacsconf-stream-generate-in-between-pages: Generate the title images.
(defun emacsconf-stream-generate-in-between-pages (&optional info)
  "Generate the title images."
  (setq info (or emacsconf-schedule-draft (emacsconf-publish-prepare-for-display (emacsconf-filter-talks (or info (emacsconf-get-talk-info))))))
  (let* ((by-track (seq-group-by (lambda (o) (plist-get o :track)) info))
         (dir (expand-file-name "in-between" emacsconf-stream-asset-dir))
         (template (expand-file-name "template.svg" dir)))
    (unless (file-directory-p dir)
      (make-directory dir t))
    (mapc (lambda (track)
            (let (prev)
              (mapc (lambda (talk)
                      (let ((dom (xml-parse-file template)))
                        (mapc (lambda (entry)
                                (let ((prefix (car entry)))
                                  (emacsconf-stream-svg-set-text dom (concat prefix "title")
                                                 (plist-get (cdr entry) :title))
                                  (emacsconf-stream-svg-set-text dom (concat prefix "speakers")
                                                 (plist-get (cdr entry) :speakers))
                                  (emacsconf-stream-svg-set-text dom (concat prefix "url")
                                                 (and (cdr entry) (concat emacsconf-base-url (plist-get (cdr entry) :url))))
                                   (concat prefix "qa")
                                   (pcase (plist-get (cdr entry) :q-and-a)
                                     ((rx "live") "Live Q&A after talk")
                                     ((rx "pad") "Etherpad")
                                     ((rx "IRC") "IRC Q&A after talk")
                                     (_ "")))))
                              (list (cons "previous-" prev)
                                    (cons "current-" talk)))
                        (with-temp-file (expand-file-name (concat (plist-get talk :slug) ".svg") dir)
                          (dom-print dom))
                         (concat "inkscape --export-type=png -w 1280 -h 720 --export-background-opacity=0 "
                                 (shell-quote-argument (expand-file-name (concat (plist-get talk :slug) ".svg")
                      (setq prev talk))
                    (emacsconf-filter-talks (cdr track)))))

emacsconf-stream-svg-set-text: Update DOM to set the tspan in the element with ID to TEXT.
(defun emacsconf-stream-svg-set-text (dom id text)
  "Update DOM to set the tspan in the element with ID to TEXT.
If the element doesn't have a tspan child, use the element itself."
  (if (or (null text) (string= text ""))
      (let ((node (dom-by-id dom id)))
        (when node
          (dom-set-attribute node 'style "visibility: hidden")
          (dom-set-attribute (dom-child-by-tag node 'tspan) 'style "fill: none; stroke: none")))
    (setq text (svg--encode-text text))
    (let ((node (or (dom-child-by-tag
                     (car (dom-by-id dom id))
                    (dom-by-id dom id))))
       ((null node)
        (error "Could not find node %s" id))                      ; skip
       ((= (length node) 2)
        (nconc node (list text)))
       (t (setf (elt node 2) text))))))

Generating the script

From emacsconf-pad.el:

emacsconf-pad-expand-intro: Make an intro for TALK.
(defun emacsconf-pad-expand-intro (talk)
  "Make an intro for TALK."
   ((null (plist-get talk :speakers))
    (format "Next, we have \"%s\"." (plist-get talk :title)))
   ((plist-get talk :intro-note)
    (plist-get talk :intro-note))
    (let ((pronoun (pcase (plist-get talk :pronouns)
                     ((rx "she") "She")
                     ((rx "\"ou\"" "Ou"))
                     ((or 'nil "nil" (rx string-start "he") (rx "him")) "He")
                     ((rx "they") "They")
                     (_ (or (plist-get talk :pronouns) "")))))
      (format "Next, we have \"%s\", by %s%s.%s"
              (plist-get talk :title)
              (replace-regexp-in-string ", \\([^,]+\\)$"
                                        ", and \\1"
                                        (plist-get talk :speakers))
              (emacsconf-surround " (" (plist-get talk :pronunciation) ")" "")
              (pcase (plist-get talk :q-and-a)
                ((or 'nil "") "")
                ((rx "after") " You can ask questions via Etherpad and IRC. We'll send them to the speaker, and we'll post the answers on the talk page afterwards.")
                ((rx "live")
                 (format " %s will answer questions via BigBlueButton. You can join using the URL from the talk page or ask questions through Etherpad or IRC."
                ((rx "pad")
                 (format " %s will answer questions via Etherpad."
                ((rx "IRC")
                 (format " %s will answer questions via IRC in the #%s channel."
                         (plist-get talk :channel)))))))))

And from emacsconf-subed.el:

emacsconf-subed-intro-subtitles: Create the introduction as subtitles.
(defun emacsconf-subed-intro-subtitles ()
  "Create the introduction as subtitles."
  (let ((emacsconf-publishing-phase 'conference))
     (lambda (sub) (apply #'subed-append-subtitle nil (cdr sub)))
      (lambda (talk i)
         (* i 5000)
         (1- (* i 5000))
         (format "#+OUTPUT: %s.webm\n[[file:%s]]\n%s"
                 (plist-get talk :slug)
                  (concat (plist-get talk :slug) ".svg.png")
                  (expand-file-name "in-between" emacsconf-stream-asset-dir))
                 (emacsconf-pad-expand-intro talk))))
      (emacsconf-publish-prepare-for-display (emacsconf-get-talk-info))))))

View org source for this post

Using subed-record in Emacs to edit audio and clean up oopses

| emacs, subed

Finding enough quiet focused time to record audio is a challenge. I often have to re-record segments in order to correct brain hiccups or to restart after interruptions. It's also hard for me to sit still and listen to my recordings looking for mistakes to edit out. I'm not familiar enough with Audacity to zip around with keyboard shortcuts, and I don't like listening to myself again and again in order to find my way around an audio file.

Sure, I could take the transcript, align it with subed-align and Aeneas to get the timestamps, and then use subed-convert to get a CSV (actually a TSV since it uses tabs) that I can import into Audacity as labels, but it still feels a little awkward to navigate. I have to zoom in a lot for the text to be readable.

Figure 1: Audacity labels

So here's a workflow I've been experimenting with for cleaning up my recorded audio.

Just like with my audio braindumps, I use Google Recorder on my phone because I can get the audio file and a rough transcript, and because the microphone on it is better than on my laptop. For narration recordings, I hide in the closet because the clothes muffle echoes. I don't feel as self-conscious there as I might be if I recorded in the kitchen, where my computer usually is. I used to record in Emacs using subed-record by pressing left to redo a segment and right to move on to the next one, but using my phone means I don't have to deal with the computer's noises or get the good mic from downstairs.

I start the recorder on my phone and then switch to my Org file in Orgzly Revived, where I've added my script. I read it as far as I can go. If I want to redo a segment, I say "Oops" and then just redo the last phrase or so.

Screenshot of Google Recorder on my phone

I export the transcript and the M4A audio file using Syncthing, which copies them to my computer. I have a function that copies the latest recording and even sets things up for removing oops segments (my-subed-copy-latest-phone-recording, which calls my-split-oops). If I want to process several files, I can copy them over with my-subed-copy-recording.

my-subed-copy-latest-phone-recording: Copy the latest recording transcript and audio to DESTINATION.
(defun my-subed-copy-latest-phone-recording (destination)
  "Copy the latest recording transcript and audio to DESTINATION."
     (read-file-name (format "Move %s to: "
                             (file-name-base (my-latest-file my-phone-recording-dir ".txt")))
                     nil nil nil nil #'file-directory-p))))
  (let ((base (file-name-base (my-latest-file my-phone-recording-dir ".txt"))))
    (rename-file (expand-file-name (concat base ".txt") my-phone-recording-dir)
    (rename-file (expand-file-name (concat base ".m4a") my-phone-recording-dir)
    (find-file (expand-file-name (concat base ".txt") destination))
    (save-excursion (my-split-oops))
    (goto-char (point-min))
    (flush-lines "^$")
    (goto-char (point-min))
     (concat "#+OUTPUT: "
             (file-name-base (buffer-file-name))

(defun my-subed-copy-recording (filename destination)
     (read-file-name (format "Copy %s to: "
                             (file-name-base (buffer-file-name)))
                     nil nil nil nil #'file-directory-p))))
  (dolist (ext '("m4a" "txt" "json" "vtt"))
    (when (file-exists-p (concat (file-name-sans-extension filename) "." ext))
      (copy-file (concat (file-name-sans-extension filename) "." ext)
                 destination t)))
  (when (get-file-buffer filename)
    (kill-buffer (get-file-buffer filename))
    (dired destination)))

I'll use Aeneas to get the timestamps for each line of text, so a little bit of text processing will let me identify the segments that I want to remove. The way my-split-oops works is that it looks for "oops" in the transcript. Whenever it finds "oops", it adds a newline afterwards. Then it takes the next five words and sees if it can search backward for them within 300 characters. If it finds the words, then that's the start of my repeated segment, and we can add a newline before that. If it doesn't find the words, we try again with four words, then three, then two, then one. I can also manually review the file and see if the oopses are well lined up. When they're detected properly, I should see partially duplicated lines.

I used to record using sub-record by using by. Oops,
I used to record. Oops,
I used to record an emacs using subhead record, by pressing left to reduce segment, and write to move on to the next one.
But using my phone means, I don't have to deal with them. Oops.
But using my phone means, I don't have to deal with the computer's noises or get the good mic from downstairs. I started recorder on my phone

my-split-oops: Look for oops and make it easier to split.
(defun my-split-oops ()
  "Look for oops and make it easier to split."
  (let ((scan-window 300))
    (while (re-search-forward "oops[,\.]?[ \n]+" nil t)
      (let ((start (min (line-beginning-position) (- (point) scan-window)))
        (if (bolp)
              (setq start (min (line-beginning-position) (- (point) scan-window))))
          (insert "\n"))
          (setq start-search (point))
          ;; look for 1..3 words back
             for n downfrom 4 downto 1
               (dotimes (_ n) (forward-word))
               (setq search-for (downcase (string-trim (buffer-substring start-search (point)))))
               (goto-char start-search)
               (when (re-search-backward (regexp-quote search-for) start t)
                 (goto-char (match-beginning 0))
                 (cl-return (point)))))
            (and (call-interactively 'isearch-backward) (point))))
          (insert "\n"))))))

Once the lines are split up, I use subed-align and get a VTT file. The oops segments will be in their own subtitles.

Figure 2: Subtitles and waveforms

The timestamps still need a bit of tweaking sometimes, so I use subed-waveform-show-current or subed-waveform-show-all. I can use the following bindings:

  • middle-click to play a sample
  • M-left-click to set the start and copy to the previous subtitle
  • left-click to set the start without changing the previous one
  • M-right-click to set the end and copy to the next subtitle
  • right-click to set the end without changing the next one
  • M-j to jump to the current subtitle and play it again in MPV
  • M-J to jump to close to the end of the current subtitle and play it in MPV

I use my-subed-delete-oops to delete the oops segments. I can also just mark them for skipping by calling C-u M-x my-subed-delete-oops instead.

Then I add a #+OUTPUT: filename-cleaned.opus comment under a NOTE near the beginning of the file. This tells subed-record~compile-audio where to put the output.



00:00:00.000 --> 00:00:10.319
Finding enough. Oops.

#+OUTPUT: 2023-12-subed-record-cleaned.opus

00:00:10.320 --> 00:00:36.319
Finding enough quiet Focused. Time to record. Audio is a challenge. I often have to re-record segments in order to correct brain hiccups, or to restart after interruptions.

I can test short segments by marking the region with C-SPC and using subed-record-compile-try-flow. This lets me check if the transitions between segments make sense.

When I'm happy with everything, I can use subed-record-compile-audio to extract the segments specified by the start and end times of each subtitle and concatenate them one after the other in the audio file specified by the output. The result should be a clean audio file.

If I need to compile an audio file from several takes, I process each take separately. Once I've adjusted the timestamps and deleted or skipped the oops segments, I add #+AUDIO: input-filename.opus to a NOTE at the beginning of the file. subed-record-insert-audio-source-note makes this easier. Then I copy the file's subtitles into my main file. subed-record-compile-audio will take the audio from whichever file was specified by the #+AUDIO: comment, so I can use audio from different files.

Example VTT segment with multiple audio files
#+AUDIO: 2023-11-11-emacsconf.m4a

00:10:55.617 --> 00:10:58.136
Sometimes we send emails one at a time.

#+AUDIO: 2023-11-15-emacsconf.m4a

00:10:55.625 --> 00:11:03.539
Like when you let a speaker know that we've received a proposal That's mostly a matter of plugging the talks properties into the right places in the template.

Now I have a clean audio file that corresponds to my script. I can use subed-align on my script to get the timestamps for each line using the cleaned audio. Once I have a subtitle file, I can use emacsconf-subed-split (in emacsconf-subed.el - which I probably should add to subed-mode sometime) to quickly split the captions up to fit the line lengths. Then I redo the timestamps with subed-align and adjust timestamps with subed-waveform-show-current.

So that's how I go from rough recordings with stutters and oopses to a clean audio file with captions based on my script. People can probably edit faster with Audacity wizardry or the AI audio editors that are in vogue these days, but this little workflow gets around my impatience with audio by turning it into (mostly) text, so that's cool. Let's see if I can make more presentations now that I've gotten the audio side figured out!


#EmacsConf backstage: autopilot with crontab

| emacs, emacsconf, subed

[2023-10-26 Thu]: updated handle-session and added talk

I figured out multi-track streaming so close to EmacsConf 2022 that there wasn't enough time to get other volunteers used to working with the setup, especially since I was still scrambling to figure out more infrastructure as the conference approached. We decided I'd run both streams myself, which meant I needed to make things as automatic as possible so that I wouldn't go crazy. I wanted a lot of things to happen automatically: playing recorded intros and videos, browsing to the right URLs depending on the type of Q&A, publishing updates to the wiki, and so on.

I used timers and TODO state changes to execute commands via TRAMP, which was pretty cool for the most part. But it turned out TRAMP doesn't like being called when it's already running, like when it's being called from two timers going off at the same time. It gives a "Forbidden reentrant call of TRAMP". We found a couple of quick workarounds: I could reschedule the talks to be a minute apart, or I could cancel the conflicting timer and just start them with the shell scripts.

Last year, we had a shell script that played the intro and the main talk, and other scripts to handle the Q&A by opening BigBlueButton, Etherpad, or the IRC channel. Much of the logic was in Emacs Lisp because it was easy to write it that way. For this year, I wanted to write a script that handled the intro, video, and Q&A portions. This is now in roles/obs/templates/handle-session.

# Handle the intro/talk/Q&A for a session
# Usage: handle-session $SLUG


# Kill background music if playing
if screen -list | grep -q background; then
    screen -S background -X quit

# Update the status
sudo -u  talk $SLUG PLAYING &

# Update the overlay
overlay $SLUG

# Play the intro if it exists. If it doesn't exist, switch to the intro slide and stop processing.

if [[ -f $BASE_DIR/assets/intros/$SLUG.webm ]]; then
  killall -s TERM $FIREFOX_NAME
  mpv $BASE_DIR/assets/intros/$SLUG.webm
  firefox --kiosk $BASE_DIR/assets/in-between/$SLUG.png
  exit 0

# Play the video if it exists. If it doesn't exist, switch to the BBB room and stop processing.
if [ "x$TEST_MODE" = "x" ]; then
if [ ! -f "$FILE" ]; then
    # Is there an original file?

if [[ -f $FILE ]]; then
  killall -s TERM $FIREFOX_NAME
  mpv $FILE
  /usr/local/bin/bbb $SLUG
  exit 0

sudo -u  talk $SLUG CLOSED_Q &

# Open the appropriate Q&A URL
QA=$(jq -r '.talks[] | select(.slug=="'$SLUG'")["qa-backstage-url"]' < $BASE_DIR/talks.json)
QA_TYPE=$(jq -r '.talks[] | select(.slug=="'$SLUG'")["qa-type"]' < $BASE_DIR/talks.json)
if [ "$QA_TYPE" = "live" ]; then
  /usr/local/bin/bbb $SLUG
elif [ "$QA" != "null" ]; then
  /usr/local/bin/music &
  /usr/bin/firefox $QA
  # i3-msg 'layout splith'

It builds on roles/obs/templates/bbb, roles/obs/templates/overlay, and roles/obs/templates/music. I also have a roles/prerec/templates/talk script that uses emacsclient to update the status of the talk.

I wrote some Tampermonkey scripts to automate joining the web conference and the IRC channel.

Now that we have a script that handles all the different things related to a session, it's easier to schedule the execution of that script. Instead of using Emacs timers and running into that problem with tramp, I want to try using cron. Cron is a standard UNIX and Linux tool for scheduling things to run at certain times. You make a plain text file in a particular format: minute, hour, day of month, month, day of week, and then the command, and then you tell cron to use that file with something like crontab your-file. Since it's plain text, we can generate it with Emacs Lisp and format-time-string, save with TRAMP, and install with ssh. Each track has its own user account for streaming, so each track can have its own file.

emacsconf-stream-format-crontab: Return crontab entries for TALKS.
(defun emacsconf-stream-format-crontab (track talks &optional test-mode)
  "Return crontab entries for TALKS.
Use the display specified in TRACK.
If TEST-MODE is non-nil, load the videos from the test directory."
" (plist-get track :uid))
    (lambda (talk)
      (format "%s /usr/bin/screen -dmS play-%s bash -c \"DISPLAY=%s TEST_MODE=%s /usr/local/bin/handle-session %s | tee -a ~/track.log\"\n"
              ;; cron times are UTC
              (format-time-string "%-M %-H %-d %m *" (plist-get talk :start-time))
              (plist-get talk :slug)
              (plist-get track :vnc-display)
              (if test-mode "1" "")
              (plist-get talk :slug)))
    (emacsconf-filter-talks talks))))

emacsconf-stream-crontabs: Write the streaming users’ crontab files.
(defun emacsconf-stream-crontabs (&optional test-mode info)
  "Write the streaming users' crontab files.
If TEST-MODE is non-nil, use the videos in the test directory.
If INFO is non-nil, use that as the schedule instead."
  (let ((emacsconf-publishing-phase 'conference))
    (setq info (or info (emacsconf-publish-prepare-for-display (emacsconf-get-talk-info))))
    (dolist (track emacsconf-tracks)
      (let ((talks (seq-filter (lambda (talk)
                                 (string= (plist-get talk :track)
                                          (plist-get track :name)))
            (crontab (expand-file-name (concat (plist-get track :id) ".crontab")
                                       (concat (plist-get track :tramp) "~"))))
        (with-temp-file crontab
          (when (plist-get track :autopilot)
            (insert (emacsconf-stream-format-crontab track talks test-mode))))
        (emacsconf-stream-track-ssh track (concat "crontab ~/" (plist-get track :id) ".crontab"))))))

I want to test the whole setup before the conference, of course. First, I needed test videos. This generates test videos and subtitles following our naming convention.

(defun emacsconf-stream-generate-test-videos (&optional info)
  "Generate 1-minute test videos for INFO."
  (setq info (or info (emacsconf-publish-prepare-for-display (emacsconf-get-talk-info))))
  (let* ((dir (expand-file-name "test" emacsconf-stream-asset-dir))
         (default-directory dir)
         (subed-default-subtitle-length 1000)
         (test-length 60))
    (unless (file-directory-p dir)
      (make-directory dir t))
     (format "ffmpeg -y -f lavfi -i testsrc=duration=%d:size=1280x720:rate=10 -i background-music.opus -shortest %s "
             test-length (expand-file-name "template.webm" dir)))
    (dolist (talk info)
      (with-temp-file (expand-file-name (concat (plist-get talk :file-prefix) "--main.vtt") dir)
        (dotimes (i test-length)
           (* i 1000)
           (1- (* i 1000))
           (format "%s %02d %s"
                   (plist-get talk :slug)
                   (substring "123456789 123456789 123456789 123456789 123456789 123456789 "
                              (1+ (length (format "%s %02d" (plist-get talk :slug) i))))))))
       (expand-file-name "template.webm" dir)
       (expand-file-name (concat (plist-get talk :file-prefix) "--main.webm") dir)

Then I needed to write a crontab based on a different schedule. This code sets up a series of test videos to start about a minute after I run the code, with the dev stream set up to start a minute after the gen stream.

(let* ((offset-seconds 60)
       (start-time (time-add (current-time) offset-seconds))
       (emacsconf-schedule-validation-functions nil)
       (emacsconf-schedule-default-buffer-minutes 1)
       (emacsconf-schedule-default-buffer-minutes-for-live-q-and-a 1)
       (emacsconf-schedule-strategies '(emacsconf-schedule-allocate-buffer-time
       (schedule (emacsconf-schedule-prepare
                      :start ,(format-time-string "%Y-%m-%d %H:%M" start-time)
                      :set-track "General")
                     (sat-open :time 1)
                     (uni :time 1) ; live Q&A
                     (adventure :time 1) ; pad Q&A
                      ,(format-time-string "%Y-%m-%d %H:%M" (time-add start-time 60))
                      :set-track "Development")
                     (repl :time 1) ; IRC
                     (matplotllm :time 1) ; pad
                     (voice :time 1) ; live
  (emacsconf-stream-crontabs t schedule))

That generates gen.crontab and dev.crontab. This is what gen.crontab looks like for testing:

35 11 26 10 * /usr/bin/screen -dmS play-sat-open bash -c "DISPLAY=:5 TEST_MODE=1 /usr/local/bin/handle-session sat-open | tee -a ~/track.log"
36 11 26 10 * /usr/bin/screen -dmS play-uni bash -c "DISPLAY=:5 TEST_MODE=1 /usr/local/bin/handle-session uni | tee -a ~/track.log"
38 11 26 10 * /usr/bin/screen -dmS play-adventure bash -c "DISPLAY=:5 TEST_MODE=1 /usr/local/bin/handle-session adventure | tee -a ~/track.log"

The result: for both tracks, the intro videos play, the test videos play, and web browsers go to the right places for the Q&A.

In case I need to resume manual control:

emacsconf-stream-cancel-crontab: Remove crontab for TRACK.
(defun emacsconf-stream-cancel-crontab (track)
  "Remove crontab for TRACK."
  (interactive (list (emacsconf-complete-track)))
  (plist-put track :autopilot nil)
  (emacsconf-stream-track-ssh track "crontab -r"))

emacsconf-stream-cancel-all-crontabs: Remove crontabs.
(defun emacsconf-stream-cancel-all-crontabs ()
  "Remove crontabs."
  (dolist (track emacsconf-tracks)
    (plist-put track :autopilot nil)
    (emacsconf-stream-track-ssh track "crontab -r")))

Here are some things I learned along the way:

  • I needed to use timedatectl set-timezone America/Toronto to change the server's timezone to America/Toronto so that the crontab would run at the right time.

    In Ansible terms, that's:

    	- name: Set system timezone
    		tags: tz
    			name: ""
    	- name: Restart cron
    		tags: tz
    			name: cron
    			state: restarted
  • I also needed to specify the PATH so that I didn't need to add the absolute paths in all the other shell scripts, XDG_RUNTIME_DIR to get audio working, and DISPLAY so that windows showed up in the right place.

I think this will let me run both tracks for EmacsConf with more ease and less frantic juggling. We'll see!

Using Emacs and Python to record an animation and synchronize it with audio

| emacs, emacsconf, python, subed, video

[2023-01-14 Sat]: Removed my fork since upstream now has the :eval function.

The Q&A session for Things I'd like to see in Emacs (Richard Stallman) from EmacsConf 2022 was done over Mumble. Amin pasted the questions into the Mumble chat buffer and I copied them into a larger buffer as the speaker answered them, but I didn't do it consistently. I figured it might be worth making another video with easier-to-read visuals. At first, I thought about using LaTeX to create Beamer slides with the question text, which I could then turn into a video using ffmpeg. Then I decided to figure out how to animate the text in Emacs, because why not? I figured a straightforward typing animation would probably be less distracting than animate-string, and emacs-director seems to handle that nicely. I forked it to add a few things I wanted, like variables to make the typing speed slower (so that it could more reliably type things on my old laptop, since sometimes the timers seemed to have hiccups) and an :eval step for running things without needing to log them. (2023-01-14: Upstream has the :eval feature now.)

To make it easy to synchronize the resulting animation with the chapter markers I derived from the transcript of the audio file, I decided to beep between scenes. First step: make a beep file.

ffmpeg -y -f lavfi -i 'sine=frequency=1000:duration=0.1' beep.wav

Next, I animated the text, with a beep between scenes. I used subed-parse-file to read the question text directly from the chapter markers, and I used simplescreenrecorder to set up the recording settings (including audio).

(defun my-beep ()
    (shell-command "aplay ~/recordings/beep.wav &" nil nil)))

(require 'director)
(defvar emacsconf-recording-process nil)
(shell-command "xdotool getwindowfocus windowsize 1282 720")
  (switch-to-buffer (get-buffer-create "*Questions*"))
  (face-remap-add-relative 'default :height 300)
  (setq-local mode-line-format "   Q&A for EmacsConf 2022: What I'd like to see in Emacs (Richard M. Stallman) -")
  (sit-for 3)
  (hl-line-mode -1)
  (when (process-live-p emacsconf-recording-process) (kill-process emacsconf-recording-process))
  (setq emacsconf-recording-process (start-process "ssr" (get-buffer-create "*ssr*")
  (sit-for 3)
   :version 1
   :log-target '(file . "/tmp/director.log")
   (lambda ()
     (switch-to-buffer (get-buffer-create "*Questions*"))
   (let ((subtitles (subed-parse-file "~/proj/emacsconf/rms/emacsconf-2022-rms--what-id-like-to-see-in-emacs--answers--chapters.vtt")))
     (apply #'append
             (list :eval '(my-beep))
             (list :type "* Q&A for Richard Stallman's EmacsConf 2022 talk: What I'd like to see in Emacs\n\n\n"))
             (lambda (sub)
                (list :log (elt sub 3))
                (list :eval '(progn (org-end-of-subtree)
                                    (unless (bolp) (insert "\n"))))
                (list :type (concat "** " (elt sub 3) "\n\n"))
                (list :eval '(org-back-to-heading))
                (list :wait 5)
                (list :eval '(my-beep))))
   :typing-style 'human
   :delay-between-steps 0
   :after-end (lambda ()
                (process-send-string emacsconf-recording-process "record-save\nwindow-show\nquit\n"))
   :on-failure (lambda ()
                 (process-send-string emacsconf-recording-process "record-save\nwindow-show\nquit\n"))
   :on-error (lambda ()
               (process-send-string emacsconf-recording-process "record-save\nwindow-show\nquit\n"))))

I used the following code to copy the latest recording to animation.webm and extract the audio to animation.wav. my-latest-file and my-recordings-dir are in my Emacs config.

(let ((name "animation.webm"))
  (copy-file (my-latest-file my-recordings-dir) name t)
   (format "ffmpeg -y -i %s -ar 8000 -ac 1 %s.wav"
           (shell-quote-argument name)
           (shell-quote-argument (file-name-sans-extension name)))))

Then I needed to get the timestamps of the beeps in the recording. I subtracted a little bit (0.82 seconds) based on comparing the waveform with the results.

filename = "animation.wav"
from import wavfile
from scipy import signal
import numpy as np
import re
rate, source =
peaks = signal.find_peaks(source, height=1000, distance=1000)
base_times = (peaks[0] / rate) - 0.82

I noticed that the first question didn't seem to get beeped properly, so I tweaked the times. Then I wrote some code to generate a very long ffmpeg command that used trim and tpad to select the segments and extend them to the right durations. There was some drift when I did it without the audio track, but the timestamps seemed to work right when I included the Q&A audio track as well.

import webvtt
import subprocess
chapters_filename =  "emacsconf-2022-rms--what-id-like-to-see-in-emacs--answers--chapters.vtt"
answers_filename = "answers.wav"
animation_filename = "animation.webm"
def get_length(filename):
    result =["ffprobe", "-v", "error", "-show_entries",
                             "format=duration", "-of",
                             "default=noprint_wrappers=1:nokey=1", filename],
    return float(result.stdout)

def get_frames(filename):
    result =["ffprobe", "-v", "error", "-select_streams", "v:0", "-count_packets",
                             "-show_entries", "stream=nb_read_packets", "-of",
                             "csv=p=0", filename],
    return float(result.stdout)

answers_length = get_length(answers_filename)
# override base_times
times = np.asarray([  1.515875,  13.50, 52.32125 ,  81.368625, 116.66625 , 146.023125,
       161.904875, 182.820875, 209.92125 , 226.51525 , 247.93875 ,
       260.971   , 270.87375 , 278.23325 , 303.166875, 327.44925 ,
       351.616375, 372.39525 , 394.246625, 409.36325 , 420.527875,
       431.854   , 440.608625, 473.86825 , 488.539   , 518.751875,
       544.1515  , 555.006   , 576.89225 , 598.157375, 627.795125,
       647.187125, 661.10875 , 695.87175 , 709.750125, 717.359875])
fps = 30.0
times = np.append(times, get_length(animation_filename))
anim_spans = list(zip(times[:-1], times[1:]))
chapters =
if chapters[0].start_in_seconds == 0:
    vtt_times = [[c.start_in_seconds, c.text] for c in chapters]
    vtt_times = [[0, "Introduction"]] + [[c.start_in_seconds, c.text] for c in chapters] 
vtt_times = vtt_times + [[answers_length, "End"]]
# Add ending timestamps
vtt_times = [[x[0][0], x[1][0], x[0][1]] for x in zip(vtt_times[:-1], vtt_times[1:])]
test_rate = 1.0

i = 0
concat_list = ""
groups = list(zip(anim_spans, vtt_times))
import ffmpeg
animation = ffmpeg.input('animation.webm').video
audio = ffmpeg.input('rms.opus')

for_overlay = ffmpeg.input('color=color=black:size=1280x720:d=%f' % answers_length, f='lavfi')
params = {"b:v": "1k", "vcodec": "libvpx", "r": "30", "crf": "63"}
test_limit = 1
params = {"vcodec": "libvpx", "r": "30", "copyts": None, "b:v": "1M", "crf": 24}
test_limit = 0
anim_rate = 1
import math
cursor = 0
if test_limit > 0:
    groups = groups[0:test_limit]
clips = []

# cursor is the current time
for anim, vtt in groups:
    padding = vtt[1] - cursor - (anim[1] - anim[0]) / anim_rate
    if (padding < 0):
        print("Squeezing", math.floor((anim[1] - anim[0]) / (anim_rate * 1.0)), 'into', vtt[1] - cursor, padding)
        clips.append(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS')) 
    elif padding == 0:
        clips.append(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS'))
        print("%f to %f: Padding %f into %f - pad: %f" % (cursor, vtt[1], (anim[1] - anim[0]) / (anim_rate * 1.0), vtt[1] - cursor, padding))
        cursor = cursor + padding + (anim[1] - anim[0]) / anim_rate
        clips.append(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS').filter('tpad', stop_mode="clone", stop_duration=padding))
    for_overlay = for_overlay.overlay(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS+%f' % vtt[0]))
    clips.append(audio.filter('atrim', start=vtt[0], end=vtt[1]).filter('asetpts', 'PTS-STARTPTS'))
args = ffmpeg.concat(*clips, v=1, a=1).output('output.webm', **params).overwrite_output().compile()
print(' '.join(f'"{item}"' for item in args))

Anyway, it's here for future reference. =)

View org source for this post

subed.el: Word-level timing improvements, TSV support

| emacs, subed

I figured out how to align the subtitles to get word-level timestamps and generate SRV2 files, so now I'm working on improving the support in subed.el so that it can work with those timestamps.

The subed-word-data-load-from-file function in subed-word-data.el should load the word data from the SRV2 file and attempt to match it up with the text, colouring words if they were successfully matched.


Figure 1: After subed-word-data-load-from-file

I also updated and committed code for working with TSV files like the label export from the Audacity audio editor. The concise format might make editing and reviewing easier. The files look like this:


Figure 2: Tab-separated values

To convert an existing file, use subed-convert (from subed-common.el). You can also manually turn on subed-tsv-mode from subed-tsv.el when you're visitng a TSV subtitle/label file. Tab-separated values can be in any sort of text file and tsv is a common file extension, so I don't automatically add it to auto-mode-alist.

The changes should be in 1.0.16 or the latest version from the Git repository at .

Coverage reporting in Emacs with Buttercup, Undercover, Coverage, and a Makefile

| emacs, elisp, subed

One of the things that I always wanted to get back to was the practice of having good test coverage. That way, I can have all these tests catch me in case I break something in my sleep-deprived late-night hacking sessions, and I can see where I may have missed a spot.

Fortunately, subed-mode included lots of tests using the Buttercup testing framework. They look like this:

(describe "SRT"
  (describe "Getting"
    (describe "the subtitle ID"
      (it "returns the subtitle ID if it can be found."
         (insert mock-srt-data)
         (subed-jump-to-subtitle-text 2)
         (expect (subed-subtitle-id) :to-equal 2)))
      (it "returns nil if no subtitle ID can be found."
         (expect (subed-subtitle-id) :to-equal nil))))

and I can run them with make test, which the Makefile defines as emacs -batch -f package-initialize -L . -f buttercup-run-discover.

I don't have Cask set up for subed. I should probably learn how to use Cask. In the meantime, I needed to figure out how to get my Makefile to get the buttercup tests to capture the coverage data and report it in a nice way.

It turns out that the undercover coverage recording library works well with buttercup. It took me a little fiddling (and some reference to undercover.el-buttercup-integration-example) to figure out exactly how to invoke it so that undercover instrumented libraries that I was loading, since the subed files were in one subdirectory and the tests were in another. This is what I eventually came up with for tests/undercover-init.el:

(add-to-list 'load-path "./subed")
(when (require 'undercover nil t)
  (undercover "./subed/*.el" (:report-format 'simplecov) (:send-report nil)))

Then the tests files could start with:

(load-file "./tests/undercover-init.el")
(require 'subed-srt)

and my Makefile target for running tests with coverage reporting could be:

	mkdir -p coverage
	UNDERCOVER_FORCE=true emacs -batch -L . -f package-initialize -f buttercup-run-discover

Displaying the coverage information in code buffers was easy with the coverage package. It looks in the git root directory for the coverage results, so I didn't need to tell it where the results were. This is what it looks like:


There are a few other options for displaying coverage info. cov uses the fringe and coverlay focuses on highlighting missed lines.

So now I can actually see how things are going, and I can start writing tests for some of those gaps. At some point I may even do the badge thing mentioned in my blog post from 2015 on continuous integration and code coverage for Emacs packages. There are a lot of things I'm slowly remembering how to do… =)

Defining generic and mode-specific Emacs Lisp functions with cl-defmethod

| emacs, elisp, subed

2022-01-27: Added example function description.
2022-01-02: Changed quote to function in the defalias.

I recently took over the maintenance of subed, an Emacs mode for editing subtitles. One of the things on my TODO list was to figure out how to handle generic and format-specific functions instead of relying on defalias. For example, there are SubRip files (.srt), WebVTT files (.vtt), and Advanced SubStation Alpha (.ass). I also want to add support for Audacity labels and other formats.

There are some functions that will work across all of them once you have the appropriate format-specific functions in place, and there are some functions that have to be very different depending on the format that you're working with. Now, how do you do those things in Emacs Lisp? There are several ways of making general functions and specific functions.

For example, the forward-paragraph and backward-paragraph commands use variables to figure out the paragraph separators, so buffer-local variables can change the behaviour.

However, I needed a bit more than regular expressions. An approach taken in some packages like smartparens is to have buffer-local variables have the actual functions to be called, like sp-forward-bound-fn and sp-backward-bound-fn.

(defvar-local sp-forward-bound-fn nil
  "Function to restrict the forward search")

(defun sp--get-forward-bound ()
  "Get the bound to limit the forward search for looking for pairs.
If it returns nil, the original bound passed to the search
function will be considered."
  (and sp-forward-bound-fn (funcall sp-forward-bound-fn)))

Since there were so many functions, I figured that might be a little bit unwieldy. In Org mode, custom export backends are structs that have an alist that maps the different types of things to the functions that will be called, overriding the functions that are defined in the parent export backend.

(cl-defstruct (org-export-backend (:constructor org-export-create-backend)
          (:copier nil))
  name parent transcoders options filters blocks menu)

(defun org-export-get-all-transcoders (backend)
  "Return full translation table for BACKEND.

BACKEND is an export back-end, as return by, e.g,,
`org-export-create-backend'.  Return value is an alist where
keys are element or object types, as symbols, and values are

Unlike to `org-export-backend-transcoders', this function
also returns transcoders inherited from parent back-ends,
if any."
  (when (symbolp backend) (setq backend (org-export-get-backend backend)))
  (when backend
    (let ((transcoders (org-export-backend-transcoders backend))
      (while (setq parent (org-export-backend-parent backend))
        (setq backend (org-export-get-backend parent))
        (setq transcoders
              (append transcoders (org-export-backend-transcoders backend))))

The export code looked a little bit complicated, though. I wanted to see if there was a different way of doing things, and I came across cl-defmethod. Actually, the first time I tried to implement this, I was focused on the fact that cl-defmethod could call different things depending on the class that you give it. So initially I had created a couple of classes: subed-backend class, and then subclasses such as subed-vtt-backend. This allowed me to store the backend as a buffer-local variable and differentiate based on that.

(require 'eieio)

(defclass subed-backend ()
  ((regexp-timestamp :initarg :regexp-timestamp
                     :initform ""
                     :type string
                     :custom string
                     :documentation "Regexp matching a timestamp.")
   (regexp-separator :initarg :regexp-separator
                     :initform ""
                     :type string
                     :custom string
                     :documentation "Regexp matching the separator between subtitles."))
  "A class for data and functions specific to a subtitle format.")

(defclass subed-vtt-backend (subed-backend) nil
  "A class for WebVTT subtitle files.")

(cl-defmethod subed--timestamp-to-msecs ((backend subed-vtt-backend) time-string)
  "Find HH:MM:SS,MS pattern in TIME-STRING and convert it to milliseconds.
Return nil if TIME-STRING doesn't match the pattern.
Use the format-specific function for BACKEND."
    (when (string-match (oref backend regexp-timestamp) time-string)
      (let ((hours (string-to-number (match-string 1 time-string)))
            (mins  (string-to-number (match-string 2 time-string)))
            (secs  (string-to-number (match-string 3 time-string)))
            (msecs (string-to-number (subed--right-pad (match-string 4 time-string) 3 ?0))))
        (+ (* (truncate hours) 3600000)
           (* (truncate mins) 60000)
           (* (truncate secs) 1000)
           (truncate msecs))))))

Then I found out that you can use major-mode as a context specifier for cl-defmethod, so you can call different specific functions depending on the major mode that your buffer is in. It doesn't seem to be mentioned in the elisp manual, so at some point I should figure out how to suggest mentioning it. Anyway, now I have some functions that get called if the buffer is in subed-vtt-mode and some functions that get called if the buffer is in subed-srt-mode.

The catch is that cl-defmethod can't define interactive functions. So if I'm defining a command, an interactive function that can be called with M-x, then I will need to have a regular function that calls the function defined with cl-defmethod. This resulted in a bit of duplicated code, so I have a macro that defines the method and then defines the possibly interactive command that calls that method. I didn't want to think about whether something was interactive or not, so my macro just always creates those two functions. One is a cl-defmethod that I can override for a specific major mode, and one is the function that actually calls it, which may may not be interactive. It doesn't handle &rest args, but I don't have any in subed.el at this time.

(defmacro subed-define-generic-function (name args &rest body)
  "Declare an object method and provide the old way of calling it."
  (declare (indent 2))
  (let (is-interactive
    (when (stringp (car body))
      (setq doc (pop body)))
    (setq is-interactive (eq (caar body) 'interactive))
       (cl-defgeneric ,(intern (concat "subed--" (symbol-name name)))
         ,@(if is-interactive
               (cdr body)
       ,(if is-interactive
            `(defun ,(intern (concat "subed-" (symbol-name name))) ,args
               ,(concat doc "\n\nThis function calls the generic function `"
                        (concat "subed--" (symbol-name name)) "' for the actual implementation.")
               ,(car body)
               (,(intern (concat "subed--" (symbol-name name)))
                ,@(delq nil (mapcar (lambda (a)
                                      (unless (string-match "^&" (symbol-name a))
          `(defalias (quote ,(intern (concat "subed-" (symbol-name name))))
             (function ,(intern (concat "subed--" (symbol-name name))))

For example, the function:

(subed-define-generic-function timestamp-to-msecs (time-string)
  "Find timestamp pattern in TIME-STRING and convert it to milliseconds.
Return nil if TIME-STRING doesn't match the pattern.")

expands to:

  (cl-defgeneric subed--timestamp-to-msecs
    "Find timestamp pattern in TIME-STRING and convert it to milliseconds.
Return nil if TIME-STRING doesn't match the pattern.")
  (defalias 'subed-timestamp-to-msecs 'subed--timestamp-to-msecs "Find timestamp pattern in TIME-STRING and convert it to milliseconds.
Return nil if TIME-STRING doesn't match the pattern."))

and the interactive command defined with:

(subed-define-generic-function forward-subtitle-end ()
  "Move point to end of next subtitle.
Return point or nil if there is no next subtitle."
  (when (subed-forward-subtitle-id)

expands to:

  (cl-defgeneric subed--forward-subtitle-end nil "Move point to end of next subtitle.
Return point or nil if there is no next subtitle."
  (defun subed-forward-subtitle-end nil "Move point to end of next subtitle.
Return point or nil if there is no next subtitle.

This function calls the generic function `subed--forward-subtitle-end' for the actual implementation."

Then I can define a specific one with:

(cl-defmethod subed--timestamp-to-msecs (time-string &context (major-mode subed-srt-mode))
  "Find HH:MM:SS,MS pattern in TIME-STRING and convert it to milliseconds.
Return nil if TIME-STRING doesn't match the pattern.
Use the format-specific function for MAJOR-MODE."
    (when (string-match subed--regexp-timestamp time-string)
      (let ((hours (string-to-number (match-string 1 time-string)))
            (mins  (string-to-number (match-string 2 time-string)))
            (secs  (string-to-number (match-string 3 time-string)))
            (msecs (string-to-number (subed--right-pad (match-string 4 time-string) 3 ?0))))
        (+ (* (truncate hours) 3600000)
           (* (truncate mins) 60000)
           (* (truncate secs) 1000)
           (truncate msecs))))))

The upside is that it's easy to either override or extend a function's behavior. For example, after I sort subtitles, I want to renumber them if I'm in an SRT buffer because SRT subtitles have numeric IDs. This doesn't happen in any of the other modes. So I can just define that this bit of code runs after the regular code that runs.

(cl-defmethod subed--sort :after (&context (major-mode subed-srt-mode))
  "Renumber after sorting. Format-specific for MAJOR-MODE."

The downside is that going to the function's definition and stepping through it is a little more complicated because it's hidden behind this macro and the cl-defmethod infrastructure. I think that if you describe-function the right function, the internal version with the --, then it will list the different implementations of it. I added a note to the regular function's docstring to make it a little easier.

Here's what M-x describe-function subed-forward-subtitle-end looks like:


Figure 1: Describing a generic function

I'm going to give this derived-mode branch a try for a little while by subtitling some more EmacsConf talks before I merge it into the main branch. This is my first time working with cl-defmethod, and it looks pretty interesting.