
EmacsConf backstage: Using TRAMP and timers to run two tracks semi-automatically

| emacs, emacsconf, org

In previous years, organizers streamed the video feeds for EmacsConf from their own computers to the Icecast server, which was a little challenging because of CPU load. A server shared by a volunteer had a 6-core Intel Xeon E5-2420 with 48 GB of RAM, which turned out to be enough horsepower to run OBS for both the general and development track for EmacsConf 2022. One of the advantages of this setup was that I could write some Emacs Lisp to automatically play recorded intros and talk videos at scheduled times, right from the large Org file that had all the conference details. I used SCHEDULED: properties to indicate when talks should play, and that was picked up by another function that took the Org entry properties and put them into a plist.

This function scheduled the timers:

(defun emacsconf-stream-schedule-timers (&optional info)
  "Schedule PLAYING for the rest of talks and CLOSED_Q for recorded talks."
  (setq info (emacsconf-prepare-for-display (emacsconf-filter-talks (or info (emacsconf-get-talk-info)))))
  (let ((now (current-time)))
    (mapc (lambda (talk)
            (when (and (time-less-p now (plist-get talk :start-time)))
              (emacsconf-stream-schedule-talk-status-change talk (plist-get talk :start-time) "PLAYING"
                                                            `(:title (concat "Starting " (plist-get talk :slug)))))
            (when (and
                   (plist-get talk :video-file)
                   (plist-get talk :qa-time)
                   (not (string-match "none" (or (plist-get talk :q-and-a) "none")))
                   (null (plist-get talk :stream-files)) ;; can't tell when this is
                   (time-less-p now (plist-get talk :qa-time)))
              (emacsconf-stream-schedule-talk-status-change talk (plist-get talk :qa-time) "CLOSED_Q"
                                                            `(:title (concat "Q&A for " (plist-get talk :slug) " (" (plist-get talk :q-and-a) ")")))))
          info)))

It turns out that TRAMP doesn't like being called from timers if there's a chance that two TRAMP processes might run at the same time. I got "Forbidden reentrant call of Tramp" errors when that happened. There was an easy fix, though. I adjusted the schedules of the talks so that they started at least a minute apart.
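The fix amounts to walking the talks in start order and pushing back anything scheduled less than a minute after the previous one. A minimal sketch of that idea (my own Python illustration; in practice I adjusted the SCHEDULED: times in the Org file):

```python
from datetime import datetime, timedelta

def space_out(times, gap=timedelta(minutes=1)):
    """Return TIMES shifted so that consecutive entries are at least GAP apart.
    Illustrative only: this keeps TRAMP-using timers from firing concurrently."""
    spaced = []
    for t in sorted(times):
        if spaced and t - spaced[-1] < gap:
            t = spaced[-1] + gap  # push this talk back to the minimum spacing
        spaced.append(t)
    return spaced
```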

Sometimes I wanted to cancel just one timer:

(defun emacsconf-stream-cancel-timer (id)
  "Cancel a timer by ID."
  (interactive (list
                (completing-read
                 "ID: "
                 (lambda (string pred action)
                   (if (eq action 'metadata)
                       `(metadata (display-sort-function . ,#'identity))
                     (complete-with-action
                      action
                      (sort
                       (seq-filter (lambda (o)
                                     (and (timerp (cdr o))
                                          (not (timer--triggered (cdr o)))))
                                   emacsconf-stream-timers)
                       (lambda (a b) (string< (car a) (car b))))
                      string pred))))))
  (when (timerp (assoc-default id emacsconf-stream-timers))
    (cancel-timer (assoc-default id emacsconf-stream-timers))
    (setq emacsconf-stream-timers
          (delq (assoc id emacsconf-stream-timers)
                (seq-filter (lambda (o)
                              (and (timerp (cdr o))
                                   (not (timer--triggered (cdr o)))))
                            emacsconf-stream-timers)))))

and schedule just one timer manually:

(defun emacsconf-stream-schedule-talk-status-change (talk time new-status &optional notification)
  "Schedule a one-off timer for TALK at TIME to set it to NEW-STATUS."
  (interactive (list (emacsconf-complete-talk-info)
                     (read-string "Time: ")
                     (completing-read "Status: " (mapcar 'car emacsconf-status-types))))
  (require 'diary-lib)
  (setq talk (emacsconf-resolve-talk talk))
  (let* ((converted
          (cond
           ((listp time) time)
           ((timer-duration time) (timer-relative-time nil (timer-duration time)))
           (t                           ; HH:MM
            (date-to-time (concat (format-time-string "%Y-%m-%dT" nil emacsconf-timezone)
                                  (string-pad time 5 ?0 t)
                                  (format-time-string "%z" nil emacsconf-timezone))))))
         (timer-id (concat (format-time-string "%m-%dT%H:%M" converted)
                           "-"
                           (plist-get talk :slug))))
    (emacsconf-stream-cancel-timer timer-id)
    (add-to-list 'emacsconf-stream-timers
                 (cons timer-id
                       (run-at-time converted nil #'emacsconf-stream-update-talk-status-from-timer
                                    talk new-status notification)))))

The actual playing of talks happened using functions that were called from org-after-todo-state-change-hook. I wrote a function that extracted the talk information and then called my own list of functions.

(defun emacsconf-org-after-todo-state-change ()
  "Run all the hooks in `emacsconf-todo-hooks'.
If an `emacsconf-todo-hooks' entry is a list, run it only for the
tracks with the ID in the cdr of that list."
  (let* ((talk (emacsconf-get-talk-info-for-subtree))
         (track (emacsconf-get-track (plist-get talk :track))))
    (mapc
     (lambda (hook-entry)
       (cond
        ((symbolp hook-entry) (funcall hook-entry talk))
        ((member (plist-get track :id) (cdr hook-entry))
         (funcall (car hook-entry) talk))))
     emacsconf-todo-hooks)))

For example, this function played the recorded intro and the talk:

(defun emacsconf-stream-play-talk-on-change (talk)
  "Play the talk."
  (interactive (list (emacsconf-complete-talk-info)))
  (setq talk (emacsconf-resolve-talk talk))
  (when (or (not (boundp 'org-state)) (string= org-state "PLAYING"))
    (if (plist-get talk :stream-files)
        (apply
         #'emacsconf-stream-track-ssh
         talk
         (append
          (list "nohup" "mpv")
          (split-string-and-unquote (plist-get talk :stream-files))
          (list "&")))
      (emacsconf-stream-track-ssh
       talk
       (cons
        "nohup"
        (cond
         ((and
           (plist-get talk :recorded-intro)
           (plist-get talk :video-file)) ;; recorded intro and recorded talk
          (message "should automatically play intro and recording")
          (list "play-with-intro" (plist-get talk :slug))) ;; todo deal with stream files
         ((and
           (plist-get talk :recorded-intro)
           (null (plist-get talk :video-file))) ;; recorded intro and live talk; play the intro and join BBB
          (message "should automatically play intro; join %s" (plist-get talk :bbb-backstage))
          (list "intro" (plist-get talk :slug)))
         ((and
           (null (plist-get talk :recorded-intro))
           (plist-get talk :video-file)) ;; live intro and recorded talk, show slide and use Mumble; manually play talk
          (message "should show intro slide; play %s afterwards" (plist-get talk :slug))
          (list "intro" (plist-get talk :slug)))
         ((and
           (null (plist-get talk :recorded-intro))
           (null (plist-get talk :video-file))) ;; live intro and live talk, join the BBB
          (message "join %s for live intro and talk" (plist-get talk :bbb-backstage))
          (list "bbb" (plist-get talk :slug)))))))))

and this function handled IRC announcements when the talk state changed:

(defun emacsconf-erc-announce-on-change (talk)
  "Announce talk."
  (let ((func
         (pcase org-state
           ("PLAYING" #'erc-cmd-NOWPLAYING)
           ("CLOSED_Q" #'erc-cmd-NOWCLOSEDQ)
           ("OPEN_Q" #'erc-cmd-NOWOPENQ)
           ("UNSTREAMED_Q" #'erc-cmd-NOWUNSTREAMEDQ)
           ("TO_ARCHIVE" #'erc-cmd-NOWDONE))))
    (when func
      (funcall func talk))))

The actual announcements were handled by something like this:

(defun erc-cmd-NOWCLOSEDQ (talk)
  "Announce TALK has started Q&A, but the host has not yet opened it up."
  (interactive (list (emacsconf-complete-talk-info)))
  (when (stringp talk) (setq talk (or (emacsconf-find-talk-info talk) (error "Could not find talk %s" talk))))
  (if (emacsconf-erc-recently-announced (format "-- Q&A beginning for \"%s\"" (plist-get talk :slug)))
      (message "Recently announced, skipping")
    (emacsconf-erc-with-channels (list (concat "#" (plist-get talk :channel)))
      (erc-send-message (format "-- Q&A beginning for \"%s\" (%s) Watch: %s Add notes/questions: %s"
                                (plist-get talk :title)
                                (plist-get talk :qa-info)
                                (plist-get talk :watch-url)
                                (plist-get talk :pad-url))))  
    (emacsconf-erc-with-channels (list emacsconf-erc-hallway emacsconf-erc-org)
      (erc-send-message (format "-- Q&A beginning for \"%s\" in the %s track (%s) Watch: %s Add notes/questions: %s . Chat: #%s"
                                (plist-get talk :title)
                                (plist-get talk :track)
                                (plist-get talk :qa-info)
                                (plist-get talk :watch-url)
                                (plist-get talk :pad-url)
                                (plist-get talk :channel))))))

All that code meant that during the actual conference, my role was mostly just worrying, along with occasionally starting the Q&A manually when I wasn't sure the code would do it right. The shell scripts I wrote made it easy for the other organizers to take over the second part once they saw how it worked.

Yay timers, Emacs, and TRAMP!

You can find the latest versions of these functions in the emacsconf-el repository.

Using Emacs and Python to record an animation and synchronize it with audio

| emacs, emacsconf, python

[2023-01-14 Sat]: Removed my fork since upstream now has the :eval function.

The Q&A session for Things I'd like to see in Emacs (Richard Stallman) from EmacsConf 2022 was done over Mumble. Amin pasted the questions into the Mumble chat buffer and I copied them into a larger buffer as the speaker answered them, but I didn't do it consistently. I figured it might be worth making another video with easier-to-read visuals. At first, I thought about using LaTeX to create Beamer slides with the question text, which I could then turn into a video using ffmpeg. Then I decided to figure out how to animate the text in Emacs, because why not? I figured a straightforward typing animation would probably be less distracting than animate-string, and emacs-director seems to handle that nicely. I forked it to add a few things I wanted, like variables to make the typing speed slower (so that it could more reliably type things on my old laptop, since sometimes the timers seemed to have hiccups) and an :eval step for running things without needing to log them. (2023-01-14: Upstream has the :eval feature now.)

To make it easy to synchronize the resulting animation with the chapter markers I derived from the transcript of the audio file, I decided to beep between scenes. First step: make a beep file.

ffmpeg -y -f lavfi -i 'sine=frequency=1000:duration=0.1' beep.wav

Next, I animated the text, with a beep between scenes. I used subed-parse-file to read the question text directly from the chapter markers, and I used simplescreenrecorder to set up the recording settings (including audio).

(defun my-beep ()
  (interactive)
  (save-window-excursion
    (shell-command "aplay ~/recordings/beep.wav &" nil nil)))

(require 'director)
(defvar emacsconf-recording-process nil)
(shell-command "xdotool getwindowfocus windowsize 1282 720")
(progn
  (switch-to-buffer (get-buffer-create "*Questions*"))
  (erase-buffer)
  (org-mode)
  (face-remap-add-relative 'default :height 300)
  (setq-local mode-line-format "   Q&A for EmacsConf 2022: What I'd like to see in Emacs (Richard M. Stallman) -")
  (sit-for 3)
  (delete-other-windows)
  (hl-line-mode -1)
  (when (process-live-p emacsconf-recording-process) (kill-process emacsconf-recording-process))
  (setq emacsconf-recording-process (start-process "ssr" (get-buffer-create "*ssr*")
                                                   "simplescreenrecorder"
                                                   "--start-recording"
                                                   "--start-hidden"))
  (sit-for 3)
  (director-run
   :version 1
   :log-target '(file . "/tmp/director.log")
   :before-start
   (lambda ()
     (switch-to-buffer (get-buffer-create "*Questions*"))
     (delete-other-windows))
   :steps
   (let ((subtitles (subed-parse-file "~/proj/emacsconf/rms/emacsconf-2022-rms--what-id-like-to-see-in-emacs--answers--chapters.vtt")))
     (apply #'append
            (list
             (list :eval '(my-beep))
             (list :type "* Q&A for Richard Stallman's EmacsConf 2022 talk: What I'd like to see in Emacs\n\n\n"))
            (mapcar
             (lambda (sub)
               (list
                (list :log (elt sub 3))
                (list :eval '(progn (org-end-of-subtree)
                                    (unless (bolp) (insert "\n"))))
                (list :type (concat "** " (elt sub 3) "\n\n"))
                (list :eval '(org-back-to-heading))
                (list :wait 5)
                (list :eval '(my-beep))))
             subtitles)))
   :typing-style 'human
   :delay-between-steps 0
   :after-end (lambda ()
                (process-send-string emacsconf-recording-process "record-save\nwindow-show\nquit\n"))
   :on-failure (lambda ()
                 (process-send-string emacsconf-recording-process "record-save\nwindow-show\nquit\n"))
   :on-error (lambda ()
               (process-send-string emacsconf-recording-process "record-save\nwindow-show\nquit\n"))))

I used the following code to copy the latest recording to animation.webm and extract the audio to animation.wav. my-latest-file and my-recordings-dir are in my Emacs config.

(let ((name "animation.webm"))
  (copy-file (my-latest-file my-recordings-dir) name t)
   (format "ffmpeg -y -i %s -ar 8000 -ac 1 %s.wav"
           (shell-quote-argument name)
           (shell-quote-argument (file-name-sans-extension name)))))

Then I needed to get the timestamps of the beeps in the recording. I subtracted a little bit (0.82 seconds) based on comparing the waveform with the results.

filename = "animation.wav"
from import wavfile
from scipy import signal
import numpy as np
import re
rate, source =
peaks = signal.find_peaks(source, height=1000, distance=1000)
base_times = (peaks[0] / rate) - 0.82

I noticed that the first question didn't seem to get beeped properly, so I tweaked the times. Then I wrote some code to generate a very long ffmpeg command that used trim and tpad to select the segments and extend them to the right durations. There was some drift when I did it without the audio track, but the timestamps seemed to work right when I included the Q&A audio track as well.

import webvtt
import subprocess
chapters_filename =  "emacsconf-2022-rms--what-id-like-to-see-in-emacs--answers--chapters.vtt"
answers_filename = "answers.wav"
animation_filename = "animation.webm"
def get_length(filename):
    result =["ffprobe", "-v", "error", "-show_entries",
                             "format=duration", "-of",
                             "default=noprint_wrappers=1:nokey=1", filename],
                            capture_output=True, text=True)
    return float(result.stdout)

def get_frames(filename):
    result =["ffprobe", "-v", "error", "-select_streams", "v:0", "-count_packets",
                             "-show_entries", "stream=nb_read_packets", "-of",
                             "csv=p=0", filename],
                            capture_output=True, text=True)
    return float(result.stdout)

answers_length = get_length(answers_filename)
# override base_times
times = np.asarray([  1.515875,  13.50, 52.32125 ,  81.368625, 116.66625 , 146.023125,
       161.904875, 182.820875, 209.92125 , 226.51525 , 247.93875 ,
       260.971   , 270.87375 , 278.23325 , 303.166875, 327.44925 ,
       351.616375, 372.39525 , 394.246625, 409.36325 , 420.527875,
       431.854   , 440.608625, 473.86825 , 488.539   , 518.751875,
       544.1515  , 555.006   , 576.89225 , 598.157375, 627.795125,
       647.187125, 661.10875 , 695.87175 , 709.750125, 717.359875])
fps = 30.0
times = np.append(times, get_length(animation_filename))
anim_spans = list(zip(times[:-1], times[1:]))
chapters =
if chapters[0].start_in_seconds == 0:
    vtt_times = [[c.start_in_seconds, c.text] for c in chapters]
else:
    vtt_times = [[0, "Introduction"]] + [[c.start_in_seconds, c.text] for c in chapters]
vtt_times = vtt_times + [[answers_length, "End"]]
# Add ending timestamps
vtt_times = [[x[0][0], x[1][0], x[0][1]] for x in zip(vtt_times[:-1], vtt_times[1:])]
test_rate = 1.0

i = 0
concat_list = ""
groups = list(zip(anim_spans, vtt_times))
import ffmpeg
animation = ffmpeg.input('animation.webm').video
audio = ffmpeg.input('rms.opus')

for_overlay = ffmpeg.input('color=color=black:size=1280x720:d=%f' % answers_length, f='lavfi')
params = {"b:v": "1k", "vcodec": "libvpx", "r": "30", "crf": "63"}
test_limit = 1
params = {"vcodec": "libvpx", "r": "30", "copyts": None, "b:v": "1M", "crf": 24}
test_limit = 0
anim_rate = 1
import math
cursor = 0
if test_limit > 0:
    groups = groups[0:test_limit]
clips = []

# cursor is the current time
for anim, vtt in groups:
    padding = vtt[1] - cursor - (anim[1] - anim[0]) / anim_rate
    if padding < 0:
        print("Squeezing", math.floor((anim[1] - anim[0]) / (anim_rate * 1.0)), 'into', vtt[1] - cursor, padding)
        cursor = cursor + (anim[1] - anim[0]) / anim_rate
        clips.append(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS'))
    elif padding == 0:
        cursor = cursor + (anim[1] - anim[0]) / anim_rate
        clips.append(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS'))
    else:
        print("%f to %f: Padding %f into %f - pad: %f" % (cursor, vtt[1], (anim[1] - anim[0]) / (anim_rate * 1.0), vtt[1] - cursor, padding))
        cursor = cursor + padding + (anim[1] - anim[0]) / anim_rate
        clips.append(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS').filter('tpad', stop_mode="clone", stop_duration=padding))
    for_overlay = for_overlay.overlay(animation.trim(start=anim[0], end=anim[1]).setpts('PTS-STARTPTS+%f' % vtt[0]))
    clips.append(audio.filter('atrim', start=vtt[0], end=vtt[1]).filter('asetpts', 'PTS-STARTPTS'))
args = ffmpeg.concat(*clips, v=1, a=1).output('output.webm', **params).overwrite_output().compile()
print(' '.join(f'"{item}"' for item in args))

Anyway, it's here for future reference. =)


Converting our VTT files to TTML

| emacsconf, geek, ffmpeg

I wanted to convert our VTT files to TTML files so that we might be able to use them for training lachesis for transcript segmentation. I downloaded the VTT files from EmacsConf 2021 to a directory and copied the edited captions from the EmacsConf 2022 backstage area (using head -1 ${FILE} | grep -q "captioned" to distinguish them from the automatic ones). I installed the ttconv python package. Then I used the following command to convert the files:

for FILE in *.vtt; do
    BASE=$(basename -s .vtt "$FILE")
    ffmpeg -y -i "$FILE" "$BASE.srt"
    tt convert -i "$BASE.srt" -o "$BASE.ttml"
done

I haven't gotten around to installing whatever I need in order to get lachesis to work under Python 2.7, since it hasn't been updated for Python 3. It'll probably be a low-priority project anyway, as EmacsConf is fast approaching. Anyway, I thought I'd stash this in my blog somewhere in case I need to make TTML files again!

Re-encoding the EmacsConf videos with FFmpeg and GNU Parallel

| geek, linux, emacsconf, ffmpeg

It turns out that using -crf 56 compressed the EmacsConf videos a little too aggressively, losing too much information in the video. We wanted to re-encode everything, maybe going back to the default value of -crf 32. My laptop would have taken a long time to do all of those videos. Fortunately, one of the other volunteers shared a VM on a machine with 12 cores, and I had access to a few other systems. It was a good opportunity to learn how to use GNU Parallel to send jobs to different machines and retrieve the results.

First, I updated the compression script:

ffmpeg -y -i "$FILE"  -pixel_format yuv420p -vf $VIDEO_FILTER -colorspace 1 -color_primaries 1 -color_trc 1 -c:v libvpx-vp9 -b:v 0 -crf $Q -aq-mode 2 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -cpu-used 8 -auto-alt-ref 1 -lag-in-frames 25 -g 240 -pass 1 -f webm -an -threads 8 /dev/null &&
if [[ $FILE =~ "webm" ]]; then
    ffmpeg -y -i "$FILE" $*  -pixel_format yuv420p -vf $VIDEO_FILTER -colorspace 1 -color_primaries 1 -color_trc 1 -c:v libvpx-vp9 -b:v 0 -crf $Q -tile-columns 2 -tile-rows 2 -frame-parallel 0 -cpu-used -5 -auto-alt-ref 1 -lag-in-frames 25 -pass 2 -g 240 -ac 2 -threads 8 -c:a copy "${FILE%.*}--compressed$SUFFIX.webm"
    ffmpeg -y -i "$FILE" $*  -pixel_format yuv420p -vf $VIDEO_FILTER -colorspace 1 -color_primaries 1 -color_trc 1 -c:v libvpx-vp9 -b:v 0 -crf $Q -tile-columns 2 -tile-rows 2 -frame-parallel 0 -cpu-used -5 -auto-alt-ref 1 -lag-in-frames 25 -pass 2 -g 240 -ac 2 -threads 8 -c:a libvorbis "${FILE%.*}--compressed$SUFFIX.webm"

I made an originals.txt file with all the original filenames, one per line.


I set up a ~/.parallel/emacsconf profile with something like this so that I could use three computers and my laptop, sending one job each and displaying progress:

--sshlogin computer1 --sshlogin computer2 --sshlogin computer3 --sshlogin : -j 1 --progress --verbose --joblog parallel.log

I already had SSH key-based authentication set up so that I could connect to the three remote computers.

Then I spread the jobs over four computers with the following command:

cat originals.txt | parallel -J emacsconf \
                             --transferfile {} \
                             --return '{=$_ =~ s/\..*?$/--compressed32.webm/=}' \
                             --cleanup \
                             --basefile \
                             bash 32 {}

It copied each file over to the computer it was assigned to, processed the file, and then copied the file back.

It was also helpful to occasionally do echo 'killall -9 ffmpeg' | parallel -J emacsconf -j 1 --onall if I cancelled a run.

It still took a long time, but less than it would have if any one computer had to crunch through everything on its own.

This was much better than my previous way of doing things, which involved copying the files over, running ffmpeg commands, copying the files back, and getting somewhat confused about which directory I was in and which file I assigned where and what to do about incompletely-encoded files.

I sometimes ran into problems with incompletely-encoded files because I'd cancelled the FFmpeg process. Even though ffprobe reported the full duration, the files were missing a large chunk of video at the end. I added a compile-media-verify-video-frames function to compile-media.el so that I could get the last few seconds of frames, compare them against the duration, and report an error if there was a big gap.
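The underlying check is simple: compare the container duration against the timestamp of the last decodable frame. A minimal sketch of that logic, assuming the frame timestamps have already been pulled out with ffprobe (`looks_truncated` is a hypothetical name, not the actual compile-media.el code):

```python
def looks_truncated(container_duration, frame_times, threshold=2.0):
    """Return True if the last decoded frame ends well before the container
    duration, which suggests an incompletely-encoded file.
    FRAME_TIMES would come from ffprobe's per-frame pts_time values
    for the last few seconds of the file."""
    if not frame_times:
        return True  # no decodable frames at all
    return container_duration - max(frame_times) > threshold
```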

Then I changed emacsconf-publish.el to use the new filenames, and I regenerated all the pages. For EmacsConf 2020, I used some Emacs Lisp to update the files. I'm not particularly fond of wrangling video files (lots of waiting, high chance of error), but I'm glad I got the computers to work together.

Adding little nudges to help on the EmacsConf wiki

| emacs, emacsconf

A number of people helped capture the talks for EmacsConf 2021, which was fantastic because we were able to stream all of the first day's talks with open captions and most of the second day's talks too. Right now, in fact, there are only two talks left that haven't been captioned. After the conference, a couple of other people volunteered to help out as well. Whee!

I want to figure out a good way to help people work on the things that they're interested in without necessarily burdening them with too much work, too little work, too much coordination, not enough coordination. Before the conference, one of the perks we had offered was that captioners got early access to the videos. I had a password-protected directory on a web server and an index that I made using Emacs Lisp to display the talks that still needed to be captioned. People e-mailed me to call dibs on the talk they wanted to caption, and that was how we avoided duplicating work. Now that all the videos are public, of course, people can just go to the regular wiki.

The other thing to think about is that in addition to captioning the two remaining talks (not essential, but it would be nice), there are also different levels of things that we can do. It would be nice to have chapter markers for some of the longer Q&A sessions. It would be fantastic to cross-reference those with the questions and answers so that people can jump to the section they're interested in. It'd be incredible if somebody actually wrote down the answers. And it'd be even more awesome if people actually captioned the Q&A sessions as well, which were in many cases much longer than the actual sessions. So this is a fair bit of work, but people can probably pick a level that matches their interest and time available.

I'm not entirely sure how to coordinate this especially since I've got limited computer time. So my goal is to have something where volunteers can basically just wander around looking for talks that they're interested in and see ways to help out, or see a list of things that could use some work. So for example, while they're browsing the maintainers talk, they might say, "Oh, this one needs some chapter markers. I want to help with that. How do I do that? How do I get started?" And then they go down that path. On the other hand, you might have somebody sitting down saying, "I've got an hour and I want to go help out. What can I do?"

I don't want to keep data in many different places. I wonder if I can use the wiki for a lot of this coordination. Now that the videos are public, I've started tagging the pages that need extra help, like long Q&A sessions that need chapter markers.

With a little bit more work, I think people will be able to follow the instructions from there, especially if they've done this kind of captioning before, or email us to ask for help and then we can get them started.

I also thought about using Etherpad to do that kind of coordination where people would put their name next to a thing to reserve it, but then that's one more step. I don't know. At the moment, editing the wiki is a bit of an involved process. Worst-case scenario (best-case, actually, if we have lots of people wanting to help? =) ), people can call dibs by emailing us at and one of us organizers will add a little note there in the volunteer attribute. It's probably a good start, so we'll see where we can take it.

EmacsConf backstage: picking timestamps from a waveform

| emacs, emacsconf

We wanted to trim the Q&A session recordings so that people don't have to listen to the transition from the main presentation or the long silence until we got around to stopping the recording.

The MPV video player didn't have a waveform view, so I couldn't just jump to the parts with sound. Audacity could show waveforms, but it didn't have an easy way to copy the timestamp. I didn't want to bother with heavyweight video-editing applications on my Lenovo X220. So the obvious answer is, of course, to make a text editor do the job. Yay Emacs!


Figure 1: Select timestamps using a waveform

It's very experimental and I don't know if it'll work for anyone else. If you want to use it, you will also need mpv.el, the MPV media player, and the ffmpeg command-line tool. Here's my workflow:

  • M-x waveform-show to select the file.
  • left-click on the waveform to copy the timestamp and start playing from there
  • right-click to sample from that spot
  • left and right to adjust the position, shift-left and shift-right to take smaller steps
  • SPC to copy the current MPV position
  • j to jump to a timestamp (hh:mm:ss or seconds)
  • > to speed up, < to slow down

I finally figured out how to use SVG to embed the waveform generated by FFMPEG and animate the current MPV playback position. Whee! There's lots of room for improvement, but it's a pretty fun start.
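The drawing side boils down to downsampling the audio into one (min, max) pair per pixel column, which then becomes the points of an SVG polygon. A rough sketch of that downsampling step (my own illustration, not the actual waveform code):

```python
def waveform_columns(samples, width):
    """Split SAMPLES into WIDTH chunks and return a (min, max) pair per chunk,
    one pair per pixel column of the rendered waveform. Assumes
    len(samples) >= width; illustration only."""
    n = len(samples)
    cols = []
    for i in range(width):
        chunk = samples[i * n // width:(i + 1) * n // width]
        cols.append((min(chunk), max(chunk)))
    return cols
```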

If you're curious, you can find the code at . Let me know if it actually works for you!