I wanted to experiment with Ignacio's code for colouring the mode line of the active window ever so slightly differently to make it easier to see which window is active. I usually have global-hl-line-mode turned on, so that highlight is another indicator, but let's see how this tweak feels. I modified the code so that it uses the theme colours from the currently selected Modus theme, since I trust Prot's colour choices more than I trust mine. Thanks to Irreal for sharing Ignacio's comment!
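Something along the lines of this minimal sketch, using the modus-themes-with-colors macro from the Modus themes; the palette colours I picked here (bg-blue-subtle, fg-main) are placeholders, and the mode-line-active face needs Emacs 29 or later:

;; Tint the active window's mode line with colours from the current
;; Modus theme. Assumes Emacs 29+, Modus themes 4.x, and a Modus theme
;; already enabled; the palette names are stand-ins for whatever looks
;; good to you.
(require 'modus-themes)
(modus-themes-with-colors
  (set-face-attribute 'mode-line-active nil
                      :background bg-blue-subtle
                      :foreground fg-main))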
Finding enough quiet, focused time to record audio is a challenge. I
often have to re-record segments in order to correct brain hiccups or
to restart after interruptions. It's also hard for me to sit still and
listen to my recordings, looking for mistakes to edit out. I'm not
familiar enough with Audacity to zip around with keyboard shortcuts,
and I don't like listening to myself again and again in order to find
my way around an audio file.
Sure, I could take the transcript, align it with subed-align and
Aeneas to get the timestamps, and then use subed-convert to get a CSV (actually a TSV since it uses tabs)
that I can import into Audacity as labels, but it still feels a little
awkward to navigate. I have to zoom in a lot for the text to be
readable.
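For reference, an Audacity label track is just one label per line with tab-separated fields: start time in seconds, end time in seconds, and the label text. So the exported file looks something like this (times made up):

0.000	10.319	Finding enough. Oops.
10.320	36.319	Finding enough quiet focused time to record.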
So here's a workflow I've been experimenting with for cleaning up my
recorded audio.
Just like with my audio braindumps, I use Google Recorder on my phone
because I can get the audio file and a rough transcript, and because
the microphone on it is better than on my laptop. For narration
recordings, I hide in the closet because the clothes muffle echoes. I
don't feel as self-conscious there as I might if I recorded in the
kitchen, where my computer usually is. I used to record in Emacs using
subed-record by pressing left to redo a segment and right to
move on to the next one, but using my phone means I don't have to deal
with the computer's noises or get the good mic from downstairs.
I start the recorder on my phone and then switch to my Org file in
Orgzly Revived, where I've added my script. I read it
as far as I can go. If I want to redo a segment, I say "Oops" and then
just redo the last phrase or so.
Screenshot of Google Recorder on my phone
I export the transcript and the M4A audio file using Syncthing, which
copies them to my computer. I have a function that copies the latest
recording and even sets things up for removing oops segments
(my-subed-copy-latest-phone-recording, which calls my-split-oops).
If I want to process several files, I can copy them over with
my-subed-copy-recording.
my-subed-copy-latest-phone-recording: Copy the latest recording transcript and audio to DESTINATION.
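Here's a minimal sketch of what that command might look like (the real one probably does more). It assumes the recordings arrive as paired .m4a and .txt files in a Syncthing folder; ~/sync/recorder/ is a placeholder path.

(defvar my-phone-recording-dir "~/sync/recorder/"
  "Placeholder for the Syncthing folder that receives phone recordings.")

(defun my-subed-copy-latest-phone-recording (destination)
  "Copy the latest recording transcript and audio to DESTINATION."
  (interactive "DDestination: ")
  (let* ((audio (car (sort (directory-files my-phone-recording-dir t "\\.m4a\\'")
                           #'file-newer-than-file-p)))
         (transcript (concat (file-name-sans-extension audio) ".txt")))
    (copy-file audio destination t)
    (copy-file transcript destination t)
    ;; Open the copied transcript so the oops segments can be split.
    (find-file (expand-file-name (file-name-nondirectory transcript) destination))
    (goto-char (point-min))
    (my-split-oops)))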
I'll use Aeneas to get the timestamps for each line of text, so a
little bit of text processing will let me identify the segments that I
want to remove. The way my-split-oops works is that it looks for
"oops" in the transcript. Whenever it finds "oops", it adds a newline
afterwards. Then it takes the next five words and sees if it can
search backward for them within 300 characters. If it finds the words,
then that's the start of my repeated segment, and we can add a newline
before that. If it doesn't find the words, we try again with four
words, then three, then two, then one. I can also manually review the
file and see if the oopses are well lined up. When they're detected
properly, I should see partially duplicated lines.
I used to record using sub-record by using by. Oops,
I used to record. Oops,
I used to record an emacs using subhead record, by pressing left to reduce segment, and write to move on to the next one.
But using my phone means, I don't have to deal with them. Oops.
But using my phone means, I don't have to deal with the computer's noises or get the good mic from downstairs. I started recorder on my phone
my-split-oops: Look for oops and make it easier to split.
(defun my-split-oops ()
  "Look for oops and make it easier to split."
  (interactive)
  (let ((scan-window 300))
    (while (re-search-forward "oops[,\.]?[ \n]+" nil t)
      (let ((start (min (line-beginning-position) (- (point) scan-window)))
            start-search
            search-for)
        ;; Make sure the "oops" ends its own line.
        (if (bolp)
            (progn
              (backward-char)
              (setq start (min (line-beginning-position) (- (point) scan-window))))
          (insert "\n"))
        (save-excursion
          (setq start-search (point))
          ;; Try to match the next 5, 4, ... 1 words backward within the
          ;; scan window; fall back to a manual isearch if nothing matches.
          (goto-char
           (or
            (cl-loop
             for n downfrom 5 to 1
             do
             (save-excursion
               (dotimes (_ n) (forward-word))
               (setq search-for (downcase (string-trim (buffer-substring start-search (point)))))
               (goto-char start-search)
               (when (re-search-backward (regexp-quote search-for) start t)
                 (goto-char (match-beginning 0))
                 (cl-return (point)))))
            (and (call-interactively 'isearch-backward) (point))))
          ;; Mark the start of the repeated segment.
          (insert "\n"))))))
Once the lines are split up, I use subed-align and get a VTT file.
The oops segments will be in their own subtitles.
The timestamps still need a bit of tweaking sometimes, so I use
subed-waveform-show-current or subed-waveform-show-all. I can use
the following bindings:
middle-click to play a sample
M-left-click to set the start and copy to the previous subtitle
left-click to set the start without changing the previous one
M-right-click to set the end and copy to the next subtitle
right-click to set the end without changing the next one
M-j to jump to the current subtitle and play it again in MPV
M-J to jump to just before the end of the current subtitle and play it in MPV
I use my-subed-delete-oops to delete the oops segments. I can also
just mark them for skipping by calling C-u M-x my-subed-delete-oops
instead.
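Here's a rough sketch of the idea behind that command (not the actual code); it leans on subed's subed-forward-subtitle-id, subed-subtitle-text, subed-kill-subtitle, and subed-set-subtitle-comment:

(defun my-subed-delete-oops (&optional mark-only)
  "Delete subtitles that end in \"oops\".
With prefix arg MARK-ONLY, mark them with a NOTE #+SKIP comment instead."
  (interactive "P")
  (let (positions)
    ;; Find the oops subtitles first...
    (save-excursion
      (goto-char (point-min))
      (while (subed-forward-subtitle-id)
        (when (string-match "oops[,.!]?\\s-*\\'"
                            (downcase (or (subed-subtitle-text) "")))
          (push (point) positions))))
    ;; ...then work from the end of the buffer backwards so deletions
    ;; don't invalidate the positions of earlier matches.
    (dolist (pos positions)
      (goto-char pos)
      (if mark-only
          (subed-set-subtitle-comment "#+SKIP")
        (subed-kill-subtitle)))))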
Then I add a #+OUTPUT: filename-cleaned.opus comment under a NOTE
near the beginning of the file. This tells
subed-record-compile-audio where to put the output.
WEBVTT

NOTE #+SKIP

00:00:00.000 --> 00:00:10.319
Finding enough. Oops.

NOTE #+OUTPUT: 2023-12-subed-record-cleaned.opus

00:00:10.320 --> 00:00:36.319
Finding enough quiet Focused. Time to record. Audio is a challenge. I often have to re-record segments in order to correct brain hiccups, or to restart after interruptions.
I can test short segments by marking the region with C-SPC and using
subed-record-compile-try-flow. This lets me check if the transitions
between segments make sense.
When I'm happy with everything, I can use subed-record-compile-audio
to extract the segments specified by the start and end times of each
subtitle and concatenate them one after the other in the audio file
specified by the output. The result should be a clean audio file.
If I need to compile an audio file from several takes, I process each
take separately. Once I've adjusted the timestamps and deleted or
skipped the oops segments, I add #+AUDIO: input-filename.opus to a
NOTE at the beginning of the file.
subed-record-insert-audio-source-note makes this easier. Then I copy
the file's subtitles into my main file. subed-record-compile-audio will take
the audio from whichever file was specified by the #+AUDIO: comment,
so I can use audio from different files.
Example VTT segment with multiple audio files
NOTE #+AUDIO: 2023-11-11-emacsconf.m4a

00:10:55.617 --> 00:10:58.136
Sometimes we send emails one at a time.

NOTE #+AUDIO: 2023-11-15-emacsconf.m4a

00:10:55.625 --> 00:11:03.539
Like when you let a speaker know that we've received a proposal That's mostly a matter of plugging the talks properties into the right places in the template.
Now I have a clean audio file that corresponds to my script. I can use
subed-align on my script to get the timestamps for each line using
the cleaned audio. Once I have a subtitle file, I can use
emacsconf-subed-split (in emacsconf-subed.el - which I
probably should add to subed-mode sometime) to quickly split the
captions up to fit the line lengths. Then I redo the timestamps with
subed-align and adjust timestamps with
subed-waveform-show-current.
So that's how I go from rough recordings with stutters and oopses to a
clean audio file with captions based on my script. People can probably
edit faster with Audacity wizardry or the AI audio editors that are in
vogue these days, but this little workflow gets around my impatience
with audio by turning it into (mostly) text, so that's cool. Let's see
if I can make more presentations now that I've gotten the audio side
figured out!
I have lots of different things in my Org Mode inbox. Following the PARA method, I want to file them under projects, areas, resources, or archive so that I can find related things later. Actually, no, I don't want to refile them. I do want to be able to:
find all the pieces related to something when I'm ready to start working on a task
find useful links again, especially if I can search for them in my own words
Refiling is annoying on my phone, so I tend to wait until I'm back at
my computer. But even with org-refile-use-outline-path set to file
and the ability to specify substrings, there's still a bit of
friction.
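For reference, those settings look like this (the second one is my assumption about what lets substrings match whole outline paths):

(setq org-refile-use-outline-path 'file)
;; Complete the whole outline path in one go so substrings can match.
(setq org-outline-path-complete-in-steps nil)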
Tagging is a little easier to do on my phone. I can add a few tags
when I share a webpage or create a task.
I thought it would be nice to have something that automatically
refiles my inbox headings tagged with various tags to other subtrees
where I've set a :TAG_TARGET: property or something like that. For
example, I can set the TAG_TARGET property to emacsconf to mean
that anything tagged with :emacsconf: should get filed under there.
So when I'm ready, I can call my-org-refile-to-tag-targets and have
lots of things disappear from my inbox.
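Sketching this out: a target subtree might look like

* EmacsConf
:PROPERTIES:
:TAG_TARGET: emacsconf
:END:

and a rough version of the refiling command could work like this. It's a sketch under the assumption that the inbox entries and the targets live in the same file; the real version may differ.

(defun my-org-refile-to-tag-targets ()
  "Refile tagged entries to the subtrees that claim their tags."
  (interactive)
  (let (targets)
    ;; Collect (TAG HEADING MARKER) for each :TAG_TARGET: property.
    (org-map-entries
     (lambda ()
       (push (list (org-entry-get (point) "TAG_TARGET")
                   (org-get-heading t t t t)
                   (point-marker))
             targets))
     "TAG_TARGET={.}")
    (dolist (target targets)
      (pcase-let ((`(,tag ,heading ,marker) target))
        (org-map-entries
         (lambda ()
           (unless (or (org-entry-get (point) "TAG_TARGET")
                       ;; Skip entries already inside the target subtree.
                       (and (> (point) (marker-position marker))
                            (< (point) (org-with-point-at marker
                                         (org-end-of-subtree t)))))
             (let ((pos (point)))
               (org-refile nil nil (list heading (buffer-file-name) nil
                                         (marker-position marker)))
               ;; The subtree was cut from POS, so rescan from there.
               (setq org-map-continue-from pos))))
         (concat "+" tag))))))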
Next step might be to write a function that will refile just the
current subtree (either going straight to the tag target or prompting
me for a destination if there isn't a matching one), so I can look at
stuff, decide if it needs to be scheduled first or something like
that, and then send it somewhere. There must be something I can pass a
property match to and it'll tell me if it matches the current
subtree - probably something along the lines of
org-make-tags-matcher…
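For what it's worth, org-make-tags-matcher returns a cons of the match string and a matcher function that takes the entry's TODO keyword, its list of tags, and its reduced level, so the predicate might be a thin wrapper like this sketch:

(defun my-org-entry-matches-p (match)
  "Return non-nil if the entry at point matches MATCH, e.g. \"+emacsconf\"."
  (let ((matcher (cdr (org-make-tags-matcher match))))
    (save-excursion
      (org-back-to-heading t)
      (funcall matcher
               (org-get-todo-state)
               (org-get-tags)
               (org-reduced-level (org-outline-level))))))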
I wanted to get the Q&A sessions up quickly after the conference, so I
uploaded them to YouTube and added them to the EmacsConf 2023
playlist. I used YouTube's video editor to roughly guess where to
trim them based on the waveforms. I needed to actually trim the source
videos, though, so that our copies would be up to date and I could use
those for the Toobnix uploads.
My first task was to figure out which videos needed to be trimmed to
match the YouTube edits. First, I retrieved the video details using
the API and the code that I added to emacsconf-extract.el.
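The call involved is something like this sketch (not the actual emacsconf-extract.el code), which asks the YouTube Data API v3 for one piece of a video's details; authentication is reduced to a plain API key here:

(require 'url)
(require 'json)

(defun my-youtube-video-duration (video-id api-key)
  "Return VIDEO-ID's ISO 8601 duration, such as \"PT5M33S\"."
  (with-current-buffer
      (url-retrieve-synchronously
       (format "https://www.googleapis.com/youtube/v3/videos?part=contentDetails&id=%s&key=%s"
               (url-hexify-string video-id)
               (url-hexify-string api-key)))
    ;; Skip the HTTP headers, then pull the duration out of the JSON.
    (goto-char url-http-end-of-headers)
    (let-alist (json-parse-buffer :object-type 'alist :array-type 'list)
      (let-alist (car .items)
        .contentDetails.duration))))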
After quickly checking the results, I copied them over to the original videos, updated the video data in my conf.org, and republished the info pages in the wiki.
The time I spent on figuring out how to talk to the YouTube API feels like it's paying off.
I want to see if we can caption EmacsConf live presentations and Q&A
sessions, even if the automated captions need help with misrecognized
words. Now that I can get live speech into Emacs using the Deepgram
streaming API, I can process that information and send it to other
places. Here's a quick demonstration of appending live speech captions
to Etherpad:
I added an emacsconf-pad-append-text function to
emacsconf-pad.el that uses Etherpad's appendText API function.
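Etherpad exposes appendText through its HTTP API, so the call boils down to one GET request. Here's a sketch with placeholder server and key (the real function is in emacsconf-pad.el):

(require 'url)

(defvar my-pad-base-url "https://pad.example.com"
  "Placeholder Etherpad server.")
(defvar my-pad-api-key "REPLACE-ME"
  "Placeholder API key from the server's APIKEY.txt.")

(defun my-pad-append-text (pad-id text)
  "Append TEXT to the Etherpad pad PAD-ID via the appendText endpoint."
  (url-retrieve-synchronously
   (format "%s/api/1.2.13/appendText?apikey=%s&padID=%s&text=%s"
           my-pad-base-url my-pad-api-key
           (url-hexify-string pad-id)
           (url-hexify-string text))))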