EmacsConf backstage: Makefile targets


[2024-11-16 Sat]: Removed highlight_words from whisperx call.

We like to use pre-recorded videos at EmacsConf to minimize technical risks. This also means we can caption them beforehand, stream them with open captions, and publish them as soon as the talk goes live.

Here's the process:

  1. Speakers upload their videos in whatever format they like. We use PsiTransfer to accept the uploaded files.
  2. We rename the files to have the talk title and speaker name in the filename, like emacsconf-2024-emacs30--emacs-30-highlights--philip-kaludercic--original.mov.
  3. We use FFmpeg to reencode them to WEBM so that everything is available in a free format, and we replace the --original.* part with --reencoded.webm. We copy this to --main.webm as a starting point.
  4. We extract the audio and save it to --reencoded.opus.
  5. We use ffmpeg-normalize to normalize the audio and save it to --normalized.opus.
  6. We use WhisperX to get a reasonable starting point for captions, which we save to --reencoded.vtt. I remove the underlines and the tsv and srt files.
  7. Someone edits the captions. We save edited captions as --main.vtt.
  8. --normalized.opus and --main.vtt get combined into --main.webm.
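The naming convention in the steps above can be sketched in a few lines of shell. This is only an illustration: stage_name is a hypothetical helper and not part of the EmacsConf scripts, although it uses the same suffix-stripping expansion as reencode-in-screen.sh.

```shell
#!/usr/bin/env bash
# Sketch: derive the per-stage filenames from an uploaded original.
# stage_name is a hypothetical helper, not part of the EmacsConf scripts.
stage_name() {
    # $1 is the original filename, $2 is the stage suffix (e.g. main.vtt).
    # Strip the trailing "--original.<ext>" and append "--<stage>".
    printf '%s--%s\n' "${1%--original.*}" "$2"
}

f=emacsconf-2024-emacs30--emacs-30-highlights--philip-kaludercic--original.mov
for stage in reencoded.webm reencoded.opus normalized.opus reencoded.vtt main.vtt main.webm; do
    stage_name "$f" "$stage"
done
```

Everything before --original stays the same from stage to stage, which is what lets the Makefile pattern rules match on a single stem.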

I've been slowly learning how to set up Makefile rules to automate more and more of this. Let's go through parts of the roles/prerec/templates/Makefile.

Make the reencoded webm from the original MP4, MOV, MKV, or WEBM

Here's the rule that makes a --reencoded.webm based on the original mp4, mov, mkv, or webm.

VIDEO_EXTS = mp4 mkv webm mov
source_patterns = $(foreach ext,$(VIDEO_EXTS),$(1)--original.$(ext))
emacsconf-%--reencoded.webm: SOURCES = $(call source_patterns, emacsconf-$*)
emacsconf-%--reencoded.webm:
  $(eval SOURCE := $(lastword $(sort $(wildcard $(SOURCES)))))
  @if [ -z "$(SOURCE)" ]; then \
    echo "No source file found for $@"; \
    echo "Tried: $(SOURCES)"; \
    exit 1; \
  fi
  @echo "Using source: $(SOURCE)"
  ./reencode-in-screen.sh "$(SOURCE)"

Reencoding can take a while, and it's easy for me to interrupt it by accident, so we run it in a detached GNU screen session. A lock file keeps a second make run from starting the same reencode twice. This is reencode-in-screen.sh:

#!/bin/bash
ORIGINAL=$1
BASE="${ORIGINAL%--original.*}"
REENCODED="${BASE}--reencoded.webm"
SLUG=$(echo "$ORIGINAL" | perl -ne '/^emacsconf-[0-9]*-(.*?)--/ && print $1')
LOCK=".lock-$SLUG"

if [ ! -f "$REENCODED" ]; then
    if [  -f "$LOCK" ]; then
        echo "$LOCK already exists, waiting for it"
    else
        touch "$LOCK"
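        # MAIN is not set anywhere in this script, so thumbnail.sh receives an empty
        # argument here unless MAIN is exported by the caller.
        screen -dmS reencode-$SLUG /bin/bash -c "reencode.sh \"$ORIGINAL\" \"$REENCODED\" && thumbnail.sh \"$MAIN\" && rm \"$LOCK\""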
        echo "Processing $REENCODED in reencode-$SLUG"
    fi
fi
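To make the slug, lock file, and session names concrete, here is a small worked example. talk_slug is a hypothetical pure-shell stand-in for the perl one-liner above. It is only an approximation: unlike the regular expression, it does not check that the filename starts with emacsconf- or that the year is numeric.

```shell
#!/usr/bin/env bash
# Sketch: the slug, lock file, and screen session name for one upload.
# talk_slug is a hypothetical stand-in for the perl one-liner, not the real script.
talk_slug() {
    name=${1#emacsconf-}          # drop the "emacsconf-" prefix
    name=${name#*-}               # drop the year and the hyphen after it
    printf '%s\n' "${name%%--*}"  # keep everything before the first "--"
}

slug=$(talk_slug emacsconf-2024-emacs30--emacs-30-highlights--philip-kaludercic--original.mov)
echo "slug:    $slug"            # emacs30
echo "lock:    .lock-$slug"      # .lock-emacs30
echo "session: reencode-$slug"   # reencode-emacs30
```

While a reencode is running, screen -ls lists the session and screen -r reencode-emacs30 attaches to it.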

which calls roles/prerec/templates/reencode.sh. Here's the templatized version from Ansible:

#!/usr/bin/env bash

set -euo pipefail

# Defaults
q={{ reencode_quality }}
cpu={{ reencode_cpu }}
time_limit=""
print_only=false
limit_resolution={{ res_y }}
limit_fps={{ fps }}

while getopts :q:c:t:s OPT; do
    case $OPT in
        q|+q)
            q="$OPTARG"
            ;;
        c|+c)
            cpu="$OPTARG"
            ;;
        t|+t)
            time_limit="-to $OPTARG"
            ;;
        s)
            print_only=true
            ;;
        *)
            echo "usage: $(basename "$0") [-q ARG] [-c ARG] [-t ARG] [-s] [--] ARGS..."
            exit 2
            ;;
    esac
done
shift $((OPTIND - 1))
OPTIND=1

input="$1"
output="${2:-$(echo "$input" | sed 's/--original.*/--reencoded.webm/')}"

command="$(cat<<EOF
ffmpeg -y -i "$input" $time_limit \
       -vf "scale='-1':'min($limit_resolution,ih)',
            fps='$limit_fps'" \
       -c:v libvpx-vp9 -b:v 0 -crf $q -an \
       -row-mt 1 -tile-columns 2 -tile-rows 2 -cpu-used $cpu -g 240 \
       -pass 1 -f webm -threads $cpu /dev/null &&
    ffmpeg -y -i "$input" $time_limit \
           -vf "scale='-1':'min($limit_resolution,ih)',
                fps='$limit_fps'" \
               -c:v libvpx-vp9 -b:v 0 -crf $q -c:a libopus \
               -row-mt 1 -tile-columns 2 -tile-rows 2 -cpu-used $cpu \
               -pass 2 -threads $cpu -- "$output"
EOF
)"

if [ "$print_only" = true ]; then
    echo "$command"
else
    eval "$command"
fi
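The scale filter in that command only ever shrinks a video: the height is capped at the configured limit, and the -1 width asks FFmpeg to keep the aspect ratio. Here is a rough sketch of the arithmetic. scaled_size is a hypothetical helper, and 720 is just an assumed value for res_y, which is really filled in by the Ansible template.

```shell
#!/usr/bin/env bash
# Sketch of what scale='-1':'min(limit,ih)' works out to.
# scaled_size is a hypothetical helper; 720 below is an assumed value for res_y.
scaled_size() {
    w=$1 h=$2 limit=$3
    if [ "$h" -gt "$limit" ]; then
        out_h=$limit   # taller than the limit: shrink to the limit
    else
        out_h=$h       # already small enough: keep the input height
    fi
    # Width follows the aspect ratio. This uses integer division;
    # FFmpeg's own rounding may differ for sizes that don't divide evenly.
    out_w=$(( w * out_h / h ))
    echo "${out_w}x${out_h}"
}

scaled_size 1920 1080 720   # 1280x720
scaled_size 640 480 720     # 640x480
```

To see the real command for a file without running it, reencode.sh accepts -s, which prints the two-pass ffmpeg invocation instead of evaluating it.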

Process the audio and captions

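Processing the audio is relatively straightforward: we copy the Opus track out of the reencoded video without reencoding it, normalize it, and run WhisperX on the extracted audio.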

emacsconf-%--reencoded.opus: emacsconf-%--reencoded.webm
  ffmpeg -i "$<" -c:a copy "$@"

emacsconf-%--normalized.opus: emacsconf-%--reencoded.opus
  ffmpeg-normalize "$<" -ofmt opus -c:a libopus -o "$@"

emacsconf-%--reencoded.vtt: emacsconf-%--reencoded.opus
  whisperx --model large-v2 --align_model WAV2VEC2_ASR_LARGE_LV60K_960H --compute_type int8 --print_progress True --max_line_width 50 --segment_resolution chunk --max_line_count 1 --language en "$<"

After this, someone edits the --reencoded.vtt by hand, and we save the edited version as --main.vtt.

Combine the video, audio, and subtitles

The next part of the Makefile creates the --main.webm from the reencoded, normalized, and edited files, or from just the --reencoded.webm if that's all that's available.

emacsconf-%--main.webm: emacsconf-%--reencoded.webm emacsconf-%--normalized.opus emacsconf-%--main.vtt
  ffmpeg -i emacsconf-$*--reencoded.webm -i emacsconf-$*--normalized.opus -i emacsconf-$*--main.vtt \
    -map 0:v -map 1:a -c:v copy -c:a copy \
    -map 2 -c:s webvtt -y \
    $@

emacsconf-%--main.webm: emacsconf-%--reencoded.webm
  cp "$<" "$@"

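This works because both rules have the same target pattern, so make tries them in the order they appear in the Makefile and uses the first one whose prerequisites exist or can be built. When there's no --main.vtt yet, the first rule doesn't apply, and make falls back to copying the --reencoded.webm.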
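The choice between the two --main.webm rules can be written out as plain shell. This is a simplified sketch, not how make works internally: main_plan is a hypothetical helper that only looks at which files exist, while make also builds missing prerequisites such as the normalized audio.

```shell
#!/usr/bin/env bash
# Sketch: the decision the two --main.webm rules encode, as plain shell.
# main_plan is a hypothetical helper; it only reports which path would be taken.
main_plan() {
    prefix=$1
    if [ -f "${prefix}--main.vtt" ]; then
        echo "mux"    # edited captions exist: combine video, normalized audio, captions
    elif [ -f "${prefix}--reencoded.webm" ]; then
        echo "copy"   # no edited captions yet: plain copy of the reencoded video
    else
        echo "wait"   # nothing to build --main.webm from yet
    fi
}
```

Because --main.vtt is a prerequisite of the first rule, saving edited captions later makes the copied --main.webm out of date, and the next make run rebuilds it with the captions included. Running make -n on a --main.webm target prints the recipe make would use without running it.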

Making all the files based on the original ones that are available

Finally, we need some top-level targets that build each kind of file for every talk. We do this with a wildcard match for all the original files, and then we strip the --original.* ending to get a list of prefixes. After that, we can just use addsuffix to add the different file endings.

PRERECS_ORIGINAL := $(wildcard emacsconf-*--original.*)
PREFIXES := $(shell for f in $(PRERECS_ORIGINAL); do echo "$${f%--original.*}"; done)
PRERECS_REENCODED := $(addsuffix --reencoded.webm, $(PREFIXES))
PRERECS_OPUS := $(addsuffix --reencoded.opus, $(PREFIXES))
PRERECS_NORMAL := $(addsuffix --normalized.opus, $(PREFIXES))
PRERECS_MAIN := $(addsuffix --main.webm, $(PREFIXES))
PRERECS_CAPTIONS := $(addsuffix --reencoded.vtt, $(PREFIXES))

.PHONY: all reencoded opus normal captions main
all: reencoded opus normal main
reencoded: $(PRERECS_REENCODED)
opus: $(PRERECS_OPUS)
normal: $(PRERECS_NORMAL)
captions: $(PRERECS_CAPTIONS)
main: $(PRERECS_MAIN)

I sometimes do the captions on my computer, so I've left them out of the all target.
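The same prefix trick is handy outside of make for checking what is still pending. The following is a minimal sketch under the naming convention above. missing_stages is a hypothetical helper, not part of the EmacsConf Makefile, and like the all target it leaves the captions out.

```shell
#!/usr/bin/env bash
# Sketch: list each talk prefix and the stage files that do not exist yet.
# missing_stages is a hypothetical helper, not part of the EmacsConf Makefile.
missing_stages() {
    dir=${1:-.}
    for f in "$dir"/emacsconf-*--original.*; do
        [ -e "$f" ] || continue              # the glob matched nothing
        prefix=${f%--original.*}             # same stripping as PREFIXES
        missing=""
        for suffix in reencoded.webm reencoded.opus normalized.opus main.webm; do
            [ -e "${prefix}--${suffix}" ] || missing="$missing $suffix"
        done
        # Print the prefix without its directory, then the missing stages.
        echo "${prefix##*/}:${missing:- done}"
    done
}
```

Calling missing_stages with no argument checks the current directory. It only checks whether files exist, so it won't notice a file that is out of date; make -n all is the way to see what make itself would rebuild.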

Seems to be doing all right so far. It's nice having the Makefile figure out what's changed and what needs to be updated.
