I came across Ruby Video on Hacker News and thought it was a good idea, particularly the topic view. I mentioned it in a toot and that seemed to strike a chord in the #emacs community there, so I exported some of the metadata for EmacsConf videos into an Org Mode file. @xenodium whipped up a quick web prototype at emacs.tv. I added a bunch of videos from Emacs News and wrote some code for playing the videos from Emacs, and then grabbed more videos from YouTube playlists and Vimeo search results. (Gotta find a good way to monitor PeerTube…) As of this writing, there are 2785 videos with a combined playtime of more than 1000 hours.
I am, in fact, listening to emacstv-background-mode as I write this. I was listening to it earlier while I played Minecraft with the kiddo. I'll probably shift some of my doomscrolling to shuffling through the emacs.tv web interface on my phone. I love hearing people's enthusiasm, and I occasionally pick up interesting tips along the way. (Gotta steal prot/window-single-toggle…)
It's easy to use little crumbs of time to add more tags to the videos.org file. Sometimes I use org-agenda with buffer restriction (<) and search (s) to mark/unmark (m, u) so that I can bulk-tag (B +). To make this even more convenient, I've added emacstv-agenda-search, emacstv-org-ql-search, and emacstv-org-ql-search-untagged so that I can do that bulk tagging from anywhere.
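Under the hood, an untagged search with org-ql can be as simple as this sketch (the videos.org path here is illustrative, and the real emacstv-org-ql-search-untagged may differ):

  (require 'org-ql)
  ;; List entries in videos.org that have no tags yet, so they can be
  ;; marked and bulk-tagged from the results buffer.
  (org-ql-search "~/code/emacstv/videos.org"
    '(not (tags))
    :title "Untagged videos")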
It would be nice to have mpv reuse the window. I wonder if I can queue up a number of videos instead of doing it one at a time, and if that would do the trick…
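For the queueing part, mpv's JSON IPC has a loadfile command with an append-play flag, so something like this sketch might work (the socket path is made up; mpv would need to be started with --input-ipc-server pointing at it):

  (require 'json)

  (defun my/mpv-enqueue (url)
    "Append URL to a running mpv's playlist over its JSON IPC socket."
    (let ((proc (make-network-process
                 :name "mpv-ipc"
                 :family 'local
                 :service "/tmp/emacstv-mpv.sock")))
      (process-send-string
       proc
       (concat (json-encode
                `(("command" . ["loadfile" ,url "append-play"])))
               "\n"))
      (delete-process proc)))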
When subtitle times are too far off from the video or audio, people start worrying if their video has frozen or jumped ahead. It's good to keep subtitles roughly in time with the audio.
For EmacsConf, we can get timing information from two places. WhisperX produces a JSON file with word data in the process of doing the speech recognition, and the aeneas forced alignment tool can use synthesized text-to-speech to figure out the timestamps for each line of text compared to a media file.
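As a sketch of what the WhisperX side looks like, the aligned JSON output has segments with word-level timestamps, so pulling them out might look something like this (the field names are assumptions based on WhisperX's aligned output; check your version's schema):

  (require 'json)
  (require 'seq)

  (defun my/whisperx-words (file)
    "Return a list of (WORD START END) from a WhisperX JSON FILE."
    (let ((data (json-read-file file)))
      (seq-mapcat
       (lambda (segment)
         (mapcar (lambda (word)
                   (list (alist-get 'word word)
                         (alist-get 'start word)
                         (alist-get 'end word)))
                 (alist-get 'words segment)))
       (alist-get 'segments data))))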
Aeneas timestamps are more helpful once we start editing, but aeneas can be confused by long silences, extraneous noises, multiple speakers, and inaccurate transcripts (words added or removed). Loading word data requires a pretty close match at the moment, but since we change only about 4% of the subtitle text when editing, those cues are still helpful. (I measured this by the Levenshtein distance between the combined cue texts of edited subtitles versus the original WhisperX transcripts, using string-distance to approximate the editing percentage.)
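The measurement itself is simple; here's a sketch of the calculation (string-distance is built into Emacs 26+ and computes the Levenshtein distance):

  (defun my/subtitle-edit-percentage (original edited)
    "Approximate percentage of ORIGINAL changed to produce EDITED.
  Both arguments are the concatenated cue texts as strings."
    (* 100.0 (/ (float (string-distance original edited))
                (length original))))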
To make it easier to correct subtitle timing, I added a few ways to adjust the timing for a region of subtitles.
WhisperX: subed-word-data-fix-subtitle-timing in subed-word-data.el tries to match the word data from WhisperX against the text of the current subtitle, using string-distance for approximate matches. I start at about two words shorter than what's in the subtitle, and then increase the number of words taken from the data while the string distance decreases. I skip the data for words before the beginning of the first subtitle in the region.
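Here's a simplified sketch of that greedy matching idea (not the actual subed-word-data code):

  (require 'seq)
  (require 'subr-x)

  (defun my/best-word-count (subtitle-text words)
    "Return how many of WORDS best match SUBTITLE-TEXT.
  Start two words short of the subtitle's own word count, then keep
  taking more words while the string distance keeps going down."
    (let* ((count (max 1 (- (length (split-string subtitle-text)) 2)))
           (dist (string-distance subtitle-text
                                  (string-join (seq-take words count) " "))))
      ;; The condition does all the work: stop as soon as taking one
      ;; more word no longer improves the match.
      (while (and (< count (length words))
                  (let ((next (string-distance
                               subtitle-text
                               (string-join (seq-take words (1+ count)) " "))))
                    (when (< next dist)
                      (setq count (1+ count)
                            dist next)))))
      count))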
Aeneas: subed-align-region uses Aeneas to realign the subtitles from the region, using the section of the media file between the start of the first subtitle and the end of the last subtitle in the region.
When I notice that the times are off, I play through the subtitles (or just skim them visually) to find the last well-timed subtitle. Then I pick a subtitle that's in the incorrectly-timed section. I use subed-mpv-jump-to-current-subtitle (M-j) to jump to that position, and I play back that subtitle. It usually belongs to some text further down, so I reset to that position with M-j, set my mark before the previous correctly-timed subtitle with C-SPC, go to the subtitle that matches that time, and use subed-copy-player-pos-to-start-time (C-c [) to set the proper timestamp. Then I can go to the previous incorrectly-timed subtitle and use M-x subed-align-region. This runs the Aeneas forced alignment tool using just the subtitle text in the region, the starting timestamp of the first subtitle, and the ending timestamp of the last subtitle, making it easy to adjust that section. subed-align-region is in subed-align.el.
Retiming by pressing SPC after each subtitle: As an experiment, I've also added a subed-retime-subtitles command that plays through the subtitles so that I can press SPC when the next subtitle starts. It begins with the current subtitle and stops when I press a key that's not in its keymap.
Manual adjustments: For fine-tuning timestamps, I usually turn on subed-waveform-show-all and shift-left-click (subed-waveform-set-start-and-copy-to-previous) or shift-right-click (subed-waveform-set-stop-and-copy-to-next) on the waveforms because it's easy to see where the words and pauses are. When I'm not sure, I can use middle-click (subed-waveform-play-sample) to play part of the file without changing the subtitle start/stop or the MPV playback position.
I'm experimenting with adding repeating keybindings. There was a subed-mpv-frame-step-map that was bound to C-c C-f, so I've renamed it to subed-mpv-control, added a whole bunch of keybindings to the subed-mpv-control-map based on MPV and Aegisub shortcuts, and made it a repeating transient map.
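The repeating part can be done with set-transient-map; here's a minimal sketch of the idea (the bindings are illustrative, not the actual subed-mpv-control-map):

  (defvar my/mpv-control-map
    (let ((map (make-sparse-keymap)))
      (define-key map (kbd "SPC") #'subed-mpv-toggle-pause)
      (define-key map (kbd ".") #'subed-mpv-frame-step)
      map)
    "Keys that keep controlling MPV until some other key is pressed.")

  (defun my/mpv-control ()
    "Enter a transient state where single keys control MPV playback."
    (interactive)
    ;; KEEP-PRED t keeps the map active as long as the pressed key is
    ;; bound in it, which is what makes it repeat.
    (set-transient-map my/mpv-control-map t))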
Ideas for next steps:
Gotta get the hang of all these new capabilities through practice! =)
To make my subed-align-region workflow even more convenient, I could use completing-read to select a future subtitle, and then Emacs could automatically fix the subtitle start time, go to the previous subtitle, and realign the region.
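A rough sketch of what that could look like (the helper is hypothetical; I'm assuming subed-align-region accepts the region bounds as arguments and that subed-forward-subtitle-text and subed-subtitle-text behave as their names suggest):

  (defun my/subed-fix-start-and-realign ()
    "Pick an upcoming subtitle, set its start from MPV, realign up to it."
    (interactive)
    (let ((beg (point))
          (candidates '()))
      ;; Collect the next few subtitle texts along with their positions.
      (save-excursion
        (dotimes (_ 10)
          (when (subed-forward-subtitle-text)
            (push (cons (subed-subtitle-text) (point)) candidates))))
      (setq candidates (nreverse candidates))
      (let ((choice (completing-read "Subtitle at current MPV position: "
                                     (mapcar #'car candidates))))
        (goto-char (cdr (assoc choice candidates)))
        (subed-copy-player-pos-to-start-time)
        (subed-align-region beg (point)))))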
Also, I think switching the waveforms from overlays to text properties could be a good idea. When I cut text, the overlays get left behind, but I want the waveforms to go away too.
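The difference is easy to see in miniature (assuming an image descriptor for the rendered waveform):

  (defun my/waveform-as-overlay (beg end image)
    "Show IMAGE over BEG..END with an overlay.
  If the text is killed, the empty overlay (and its image) can linger."
    (overlay-put (make-overlay beg end)
                 'before-string (propertize " " 'display image)))

  (defun my/waveform-as-text-property (beg end image)
    "Show IMAGE over BEG..END with a text property.
  Killing the text removes the property, so the image goes away too."
    (put-text-property beg end 'display image))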
While writing this post and fiddling with subed, I ended up adding a bunch of keybindings and a menu. I figured this was as good a time as any to stop tweaking it and finally publish. (But it's fun! Just one more idea…)
This post covers the week ending Dec 4 and the week ending Dec 11, since it was a bit of a rush leading up to EmacsConf.
EmacsConf 2024 went well, hooray! Here are some of my journal entries over the past two weeks:
I worked on the BigBlueButton server some more. I used Spookfox to automate Firefox from Emacs Lisp so that I could add moderator codes to all the BBB rooms. That way, speakers can let themselves in if needed, since we might be understaffed. (Might need to ask the mailing list if anyone wants to volunteer to host, which is mostly reading out questions and making conversation.) I also updated the Tampermonkey script so that the user in the VNC session will be able to join the web conference automatically.
I added shell scripts to copy the BBB redirect files so that I can easily do that by hand in case I don't get the automation sorted out again over the next week.
Livestreaming to Toobnix seems to be iffy at the moment, so I'll just focus on 480p and YouTube. I'll probably end up manually copying and pasting the stream keys for each event, so I've added them to the shift checklists to make that easier for myself.
I confirmed crontab and publishing still worked, and I processed some last-minute submissions. I also sent the check-in emails and fixed my email delivery verification.
#emacsconf day 1 wasn't 100% smooth, but it was 100% fun, and people rerouted around all of the tech hiccups. I think we've figured out the color issue (needed to update mpv from 0.35 to 0.38), I updated my scripts to take the video files from the cache directory instead of other directories that I forgot to update, updated the checklist to have the right URLs, enabled case-fold-search on the other Emacs, and added random package mentions to the countdown screen. I forgot to let zaeph know I edited one of the videos, so next time I should flag that somehow. I'm not 100% sure about our BBB setup; a couple of people's computers crashed. On the plus side, this year, sooo many people helped out with captions and quality checks. Improving little by little! :D The important stuff got done: people got to see things and chat with other people!
The second day of EmacsConf went pretty well! We managed to handle a couple of last-minute uploads.
I processed the EmacsConf Q&As to add chapter indices and correct a number of misrecognized words. I also copied comments from IRC and YouTube.
I had a lot of fun watching Leo Vivier, Corwin Brust, and FlowyCoder fluidly swap roles as needed during #emacsconf. It was like professional jugglers dancing, one tossing a ball up in the air, the other shifting into place to catch it, the third getting the next ones lined up so things keep moving smoothly.
I dropped by Lispy Gopher Show again to chat about Emacs, Emacs Lisp, and EmacsConf with screwtape.
@screwtape I imagine it could be useful to have a smart radio object that could tell someone how many minutes until your next show and where to listen to it (saves us from UTC conversions); do the same for other anonradio shows; search for a keyword in your archives (even just the descriptions); and maybe even allow other people to contribute a note that can be reviewed and included in the archive description for an episode.
Yay, I've copied the rest of the comments from IRC and YouTube to the #EmacsConf talk pages, so speakers will be able to review them in one go. I've also copied some sections out of the transcripts for quick answers. I might send the speakers the thanks email with the discussion and main talk video links, but without links to the Q&A videos yet.
BigBlueButton audio mixing was a bit of a challenge, as usual, with some participants quiet and others loud. BBB saves only mixed audio. It would be nice to see if I can get separate audio recordings next year by configuring BBB along the lines of https://github.com/bigbluebutton/bigbluebutton/issues/12302, but that sounds a little complicated. Instead of taking over the task of messing with the audio in the current recordings (which I tend to flub because I don't have the patience for it :) ), I can leave space for other people to do things and focus on the other tasks I've been procrastinating on. :)
Life:
A- felt that the Outschool club was worth keeping because she likes the people.
We all practised shinny at High Park. Nice! A- and I worked on our stops once it was time to move over to the leisure skate area. We also skated even though there was a light drizzle.
W- enjoyed helping out with the Bike Brigade.
A-'s CCAT scores qualified her for the next step in the TDSB gifted identification process. I've been trying to figure out what this could look like for us. There's probably no gifted program for virtual school, so it might look much like what we've already got. We've been talking about how to adapt to systems that are designed for other people. At the moment, it seems to work better for her if I sit with her during boring parts of class and help her explore things like coding with Python (or help her get her homework out of the way), so I don't have much focused time myself. It's important to us that she feels good about learning and that she learns how to work with/around systems, so spending that time is worthwhile. It just means that I have to be strategic about what I do.