2024-01-28 Yay Emacs: closed captions and synchronized highlights for audio clips; exploring Emacs Lisp functions

Posted: - Modified: | yay-emacs, emacs

<2024-01-28 Sun 07:30-08:15>

  • [2024-02-02 Fri]: Now with our own copy of the video! Synchronized timestamps too.

You can also watch it on YouTube.

00:00:08.519: All right, hang on a second while I get everything back on. Okay. All right. It is 7:30 and let's see. I think we are live. Thank you again for joining me for another Yay Emacs! session. It is January 28th, 2024. This week has been a weird week, so let me just talk about whatever we've got to talk about. I didn't actually get to run to the things that I said I'd work on last week, like soundex, approximate sound matching, or all that other stuff. But I did figure out how to get audio clips from these Emacs livestreams and get them all synchronized with closed captions and sketches and other things in my blog. The other thing that I wanted to cover today was how to explore Emacs Lisp functions using things like eldoc and helpful and elisp-demos.

Closed captions and synchronized highlights for audio clips

Transcript for this section
  • 1:13: Synchronizing audio clips with closed captions: So, last session, I talked about my evil plans for this thing. I wanted to extract the clip of that section and turn it into a narrated blog post. I didn't want to always post videos because sometimes, as in the case of that one, it's just a sketch and the audio. It's a lot more efficient to just post the SVG as well as the audio recording rather than a video where the visual doesn't change as much. The other nice thing about trying to figure out how to do this with audio is that I can then make it more interactive, in the sense that I can highlight the transcript, or I can let people click on the SVG and get it to jump to the right places instead of just watching. I could use that to get a video to do the right thing too. At some point in my life, I will figure that out, but for now I figured it was good to start simple with some audio. So this is what I ended up doing. I've got this blog post somewhere here. Hang on, I have it in one of my windows. Was it this way? Ah, here it is.
  • 2:36: The output: So I'll show you the result first and then I'll talk a little bit about how I made this happen behind the scenes. This is an audio clip about 5 minutes from the previous thing. If we play this… "If I do this out loud, other people can help out with questions and comments like the way that you're all doing now, which is great, fantastic. And of course, those are all very selfish reasons. So I'm hoping other people are getting…" So while that's playing, I'm displaying closed captions over here so that there's something to look at. Sometimes it's easier for people to understand when there's text. I tend to watch all my movies with subtitles these days. I like that part. This also gets highlighted depending on what the current section is. The way that it works is it listens to the cue change. There's a track that has the VTT file, the captions file, which I slightly edited from the automatic transcripts that you can get out of OpenAI Whisper. So that's a VTT file, which I'll show you in a bit. It listens to the time based on those cue changes, and it highlights the relevant section. For example, if I jump to the middle here, it says, "you learned something from that process." This is the section that it's in, and this is the highlight in the SVG. I think I've made it so that you can click on the times and then it'll jump to the appropriate section. I think you can actually… I'd previously gotten it to work so that you could actually also click on the SVG. Oh, actually, that worked! See, look at that. I can click on the SVG, and it will jump to the right spot as well.
  • 4:29: The Org source code: So how does that all work? Well, with Emacs of course. I have my Org Mode post for this here. Yes. So this is my… This is my blog post for my evil plan for Yay Emacs. And I have a bunch of divs with classes, just to start trying to target different things. In Org, you can basically just say #+begin_ and then anything, and it'll turn that into a class, sorry, into a div with that class wrapping around your begin and end. I have a custom audio link type here. What it does is: it exports the audio player. If you specify a captions file, which, also, I've added to this– this isn't built in–then it will read the captions file and output the track information. You can see that here if you look at audio export. So in audio export… I'm still learning how to work with this variable pitch thing. I would like my code to be fixed pitch, so I will play around with that someday. Okay, so if you've got a captions parameter in the thing here, it will now output this <track kind=subtitles label=captions and so on and so forth. Let me turn my highlight mode back on so you can see that. Then, if you have captions, it also includes this little bit of JavaScript. You can't really get around using JavaScript for this one, but it should degrade well, so that if you're viewing it on Planet Emacs Life or in elfeed or some other viewer that doesn't have JavaScript, then it should just not show anything. You can play the audio, you can read the text transcript that's included below it, but you won't get any of this confusing behavior. But if you do have JavaScript enabled, then it will display the active cue or the active caption in the captions div for that audio. That's how the closed captions appear. And now I can just add that very easily by including that. Actually, come to think of it, I should make sure that I automatically determine the vtt from the file name, because I usually put them in the same directory and name them the same way. 
Okay, so that's how the track element gets added.
  • 7:17: Adding a transcript at the bottom: But what about the transcript at the bottom? I have a little bit of code here that takes the subtitle list from the subtitles. So I've got here subed-parse-file, which just reads the VTT using subed and returns a list of subtitles, each with start and end times. It's pretty long, but you know, it's all there. This is the ID. This is the starting timestamp. This is the ending timestamp. This is the text of it. And if there are any comments, then the fourth element, well, element number 4 will have it, actually number 5, but people are strange and count from 0 in this world. So it gets me that, and then I have a little function here that groups them according to the section they're in. Let me show you what sections those are. I'm using find file at point. This is that one. In order to mark out the chapters, I just add NOTE comments here. That way, I can group the different subtitles by those sections. That makes it easier to read the transcript, when things are divided. Then printing it out is a matter of going over that with mapconcat and including this audio time with the data start and a data stop. This gives me this kind of list. Now, I've put that as an Org Babel result, so that if I wanted to change how this is set up, maybe organize it differently, maybe indent some things or promote things to headings instead of bold list items, then it would be easy for me to remove it from the results drawer and edit it. So then this gives me this span class, audio time, data start and data stop, which my JavaScript code can take a look at and highlight whenever the time reaches that. That's handled by this bit of HTML export. Notice that I'm using… So there's that div that I mentioned earlier that has a special class on it. That allows me to target that specific section with some CSS. I can say, all right, for the SVGs in this one, if it's highlighted, then use this color as fill. 
Position the sticky top thing so that as you scroll down, the audio is always visible for this blog post, which allows you to have the narration read to you while you're scrolling through. I don't want to auto scroll because that feels a little bit too much. I want to give people more control over that. But it's a nice compromise, because then at least, you can always pause or keep track of the captions while you are looking at the sketch. It's starting to turn into a tiny little presentation. The way that it works is very much like the closed captions JavaScript. For every audio time span there is, we're going to turn that into a button so that we can jump to that specified time. Whenever the closed captions change, look for the span of audio time that we want to highlight.
  • 11:00: Highlighting the SVG: We also use pretty much the same code to highlight the SVG. What I've done with the SVG is I went through the process that I outlined previously when I talked about animating things. Yeah, so I have a blog post here, "Animating SVG topic maps with Inkscape, Emacs, FFmpeg, and reveal.js," where I talk about how to take an SVG file and start to simplify it so that it makes it easier to animate. I've refactored the code a little bit so that it's easier for me to animate a sketch with just the highlights changing. That's what I used here to identify the different paths and then start setting them up. The image started off as a PDF. I now have a little bit of code that allows me to give a color scheme. I like to highlight most of my sketches with yellow highlighter–pretend yellow highlighter. The actual device only outputs gray, so black, white, and two shades of gray, but it can be fun to distinguish the sketches about technology versus the sketches about parenting versus the sketches about everything else. So that when I'm looking in my sketchbook, sketches.sachachua.com – when I'm looking in my sketchbook, then I can see this is the balance of things and so forth. I can now select that color scheme. I usually select it based on the file name, but I wanted to be able to select the color scheme from Emacs Lisp as well when I'm recoloring sketches. So that's all cool. Now I can also just match on a specific sort of thing. So here I'm matching only on the gray things, and I want to turn those into, instead of one big path, I want to break them up into smaller paths. Afterwards, I identified the paths to assign IDs to them. I got the IDs back, and then I made a table with start and end times. I currently just manually copy and paste these times from the VTT. So what I did was I looked at the VTT file and I copied the start time into the section over here. For example, this one is we bounce ideas around. 
The ID that I assigned it is h-bounce. So we copied it over. Then I use this little bit of code to edit the SVG file itself so that I can add the audio time classes to the SVG elements and the data-start and data-stop attributes directly. Alejandro has a question: if this is configured specifically for this org file, or is it for all org files? I'm still experimenting with this code, so I don't know what I want it to be yet. All of these things are just for this particular file. If I find that I'm using this technique a lot, then it'll be something that I can pull out into a function that I can easily call, or maybe even into my standard JavaScript file. But because you can export little bits of HTML inside your Org subtree, it's easy to make something that just targets this specific post by adding a special class around it. Then I can run it through my whole Org 11ty export process. I'm going to drink some water first. So the end result is I have this SVG. Actually, let me see if I can open it up here. You can take a look at the markup itself. This SVG has path elements that represent the highlights. They start off with a regular sort of blue. My JavaScript looks for the audio time classes and looks at the start and stop. If the current time of the audio being played back is within those times, then it will add a class to it that highlights it in a different way. The nice thing about all of this is that because I can work with Org tables in order to specify this, I don't have to think about the markup or think about the times as I'm trying to edit the SVG element, hoping that I don't mess up the syntax, because if I have a typo, then the whole image won't display. So hooray for Emacs Lisp! That's how the whole thing came together. As I mentioned, it's an interesting technique to have narration be synchronized with closed captions as well as highlights in both this text transcript and a sketch. 
So I think I might use that for future posts, in which case I might make something that allows me to just give it the sound file, which I've extracted from the video, using the transcript as the basis to get that sound file, then it will do all of this stuff for me. If I don't need to do anything fancy with the transcript highlighting and I just want that kind of list, then it could be super straightforward to make something like this for the cases where I'm talking about a sketch. You can also extend the idea to talk about a screenshot. One of the things that I want to make it easier to do is just look at something in more detail while having the narration come to you, so that you're not trying to remember the image as well as scroll down. I'm trying to figure out how this works in terms of the screen layout since I want the images to be nice and large, but I also want the text to be easy to get to while you're looking at the image or whatever. Audio narration might be an interesting compromise where you can just be looking at the screenshot or the video–well, if it's a video, it will have that automatically, but it could be an animated GIF or a series of images–and have that audio narration talk you through it so that you don't have to scroll. And so that you can jump around as you like. So that's that. Then of course, since I wanted it to break down to– not so much break down, but I wanted the RSS to still make sense. I wanted to make sure that it looked okay and behaved okay on things like Planet Emacs Life or Elfeed. So as an aside, I ended up digging into how to get that working.
  • 18:27: Planet Emacs Life and mix-blend-mode: It turns out that Planet Emacs Life… So Planet Emacs Life uses this very old RSS aggregator called Planet Venus, which hasn't been updated in more than a decade and didn't actually handle SVG mix-blend-mode which I use so that the highlights… The highlights here are sometimes specified on top of the text. If I just leave it as a regular SVG element, they will cover it. And I want to just use mix-blend-mode: darken so that the highlight and the text can both be visible at the same time, and I don't have to worry about the order of elements in my SVG. But they didn't display well in Planet Emacs Life. And since I run planet.emacslife.com, I can fiddle with it until I get it to work. This is how it originally looked in Planet Emacs Life. And then I had to dig around in it, which is more of a Python thing than an Emacs thing, but I did start to learn how to use PDB to interactively debug Python code, which was kind of fun. Eventually, I figured out how to update the sanitizer so that it would allow me to use the SVG attributes that I wanted to use. I mean, the SVG CSS that I wanted to use. Then it worked there.
  • 19:52: Elfeed and SVGs: Then I also wanted to get it to work for elfeed because I know a lot of people use elfeed to read RSS feeds in Emacs. I didn't need to have all the fancy audio closed captioning work in elfeed because that's too much, but I did want the SVGs to display. In elfeed, the SVGs sort of work in that mix-blend-mode works, but it was too big. So I said, okay, how do we fiddle with this? It turns out that the shr library that elfeed uses to render HTML ignores the cases. It lowercases all the HTML tags because it uses libxml-parse-html-region, which doesn't pay attention to the case sensitivity of tags. Normally, you don't want case sensitivity for HTML tags, but SVGs have case-sensitive tags. I had to rummage around in there in order to find out what to fix. It turns out that most other parsers just special case the SVG tags to get them back to the correct cases for it. So I have this defconst which defines all those attributes, and this recursive function, which goes through this DOM tree and corrects the case of those by using the setcar function. Then shr-tag-svg can use that to fix SVG tags so that then they will properly listen to width and height and viewBox. Well, width and height always work, but it would ignore the viewBox, which is kind of annoying. So now the view box will work, which I like. I've submitted that patch to emacs-devel, so they will go and talk about it and figure out whether it makes sense to them or not. That's the rabbit hole that I end up going through just because of this "Oh, let's use SVGs more." So, Emacs. At least we can tinker around with these things. libxml-parse-html-region just calls a C library that does the actual parsing, but there is a layer on top where you can fiddle with the Emacs Lisp functions that call that in order to get the right output. That's a detour that I went on in order to get this stuff to work. 
But I'm very tickled that I now have closed captions for audio and synchronized sketches, and things sort of look okay on Planet Emacs Life, even if you don't have JavaScript enabled. So, where, oh yes, yes, yes, okay. Where are we now?
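
The grouping and printing steps described above can be sketched roughly as follows. This is a minimal sketch, not the code from the post: the sample subtitles, the my-demo- function names, and the exact spelling of the audio-time class are assumptions made for illustration. Real data would come from subed-parse-file, which is described above as returning entries shaped like (ID START STOP TEXT COMMENT); the sketch assumes times in milliseconds and does no HTML escaping.

```elisp
;; Hypothetical sample data in the shape described in the transcript:
;; (ID START-MS STOP-MS TEXT COMMENT).  A non-nil COMMENT (from a NOTE
;; line in the VTT) marks the start of a new chapter.
(defvar my-demo-subtitles
  '((1 0     4000  "So, last session, I talked about my evil plans." "Evil plans")
    (2 4000  9000  "I wanted to extract the clip of that section."   nil)
    (3 9000  15000 "So I'll show you the result first."              "The output")
    (4 15000 20000 "This is an audio clip from the previous stream." nil)))

(defun my-demo-group-by-chapter (subtitles)
  "Group SUBTITLES into chapters, starting a new chapter at each comment.
Return a list of (HEADING START-MS STOP-MS TEXT) entries."
  (let (chapters current)
    (dolist (sub subtitles)
      (if (or (elt sub 4) (null current))
          (progn
            ;; Close the previous chapter, if any, and start a new one.
            (when current (push current chapters))
            (setq current (list (or (elt sub 4) "")
                                (elt sub 1) (elt sub 2) (elt sub 3))))
        ;; Same chapter: extend the stop time and append the text.
        (setf (nth 2 current) (elt sub 2))
        (setf (nth 3 current) (concat (nth 3 current) " " (elt sub 3)))))
    (when current (push current chapters))
    (nreverse chapters)))

(defun my-demo-format-ms (ms)
  "Format MS milliseconds as M:SS."
  (let ((secs (/ ms 1000)))
    (format "%d:%02d" (/ secs 60) (% secs 60))))

(defun my-demo-transcript-html (subtitles)
  "Return HTML list items for SUBTITLES, one per chapter.
Each item carries its start and stop time in seconds as data attributes."
  (mapconcat
   (lambda (chapter)
     (pcase-let ((`(,heading ,start ,stop ,text) chapter))
       (format "<li><span class=\"audio-time\" data-start=\"%.3f\" data-stop=\"%.3f\">%s</span>: <strong>%s</strong>: %s</li>"
               (/ start 1000.0) (/ stop 1000.0)
               (my-demo-format-ms start) heading text)))
   (my-demo-group-by-chapter subtitles)
   "\n"))
```

Calling (my-demo-transcript-html my-demo-subtitles) returns one list item per chapter, which is the kind of markup a script can later scan for data-start and data-stop.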

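The case-correcting walk over the parsed DOM mentioned in the Elfeed part can be sketched like this. It is only an illustration of the technique (a lookup table plus a recursive function that uses setcar), not the patch that was submitted: the my-demo- names are hypothetical and the attribute table is a tiny subset chosen for the example.

```elisp
(require 'dom)

;; A small illustrative subset; the full list of mixed-case SVG
;; attributes is much longer.
(defconst my-demo-svg-attribute-case
  '((viewbox . viewBox)
    (preserveaspectratio . preserveAspectRatio)
    (gradientunits . gradientUnits))
  "Alist mapping lowercased SVG attribute names to their proper case.")

(defun my-demo-correct-svg-case (dom)
  "Restore the case of known SVG attributes in DOM, modifying it in place.
Return DOM."
  ;; Each attribute is a (NAME . VALUE) cons, so `setcar' renames it.
  (dolist (attr (dom-attributes dom))
    (let ((fixed (alist-get (car attr) my-demo-svg-attribute-case)))
      (when fixed
        (setcar attr fixed))))
  ;; Children are either nested nodes or plain strings; only recurse
  ;; into nodes.
  (dolist (child (dom-children dom))
    (unless (stringp child)
      (my-demo-correct-svg-case child)))
  dom)

;; Build the sample tree with `list' and `cons' rather than a quoted
;; literal, because the function mutates it.
(defvar my-demo-svg
  (list 'svg (list (cons 'viewbox "0 0 100 50") (cons 'width "200"))
        (list 'path (list (cons 'd "M0 0h10")))))

(my-demo-correct-svg-case my-demo-svg)
(dom-attr my-demo-svg 'viewBox) ; "0 0 100 50"
```

Because Emacs Lisp symbols are case-sensitive, viewbox and viewBox are different keys, which is why the lowercased attribute goes unnoticed until it is renamed.
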
Exploring Emacs Lisp functions


Figure 1: M-x apropos

00:24:41.380: So how do you find functions in Emacs? Well, there's an apropos command, which is spelled A-P-R-O-P-O-S because the S is silent. Apropos lets you search for something. For example, if you're looking for, let's say, anything related to finding a function, describing a function, function help, maybe. Then it will show you a list of those functions where the documentation matches those keywords or the name of the function matches those keywords. This is one of the things that I often do when I'm trying to find something that… There's probably a function out there to do what I want to do, so I start looking for it.
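
The same kind of search can be done from Lisp. As a small sketch, apropos-internal is the built-in primitive that returns symbols whose names match a regexp; unlike M-x apropos it only looks at names, not documentation. The helper name my-demo-find-commands is made up for this example.

```elisp
(defun my-demo-find-commands (regexp)
  "Return the names of interactive commands whose names match REGEXP."
  ;; `commandp' as the predicate keeps only things you can run with M-x.
  (mapcar #'symbol-name (apropos-internal regexp #'commandp)))

;; Commands that start with "describe-" and mention "function".
(my-demo-find-commands "\\`describe-.*function")
```

The list you get back depends on which packages are loaded, so it will differ from one Emacs session to another.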


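(use-package orderless
  :ensure t
  :custom
  ;; Try orderless matching first, then fall back to basic completion.
  (completion-styles '(orderless basic))
  ;; File names keep basic and partial-completion matching.
  (completion-category-overrides '((file (styles basic partial-completion)))))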
Figure 2: Orderless: match completions in any order

00:26:39.280: Orderless is a package that works very well with different kinds of completion if you use this little snippet over here, which sets your completion styles to use orderless. So I'm using eval here to just evaluate that bit. Then I can say M-x and function here will now find things where the match isn't at the beginning. It can be in the middle. It can be somewhere else. You can even specify matches out of order. So, for example, if I know something is related to functions, I might start typing func, function or whatever, and then I might start saying, oh yeah, it's a describe thing, right? Even if you're specifying things out of order, you can get the completion for it. Often when I'm looking around for something, I will use this describe function with Orderless and Vertico to just start typing the words that I think it might be called. Let me show you that now. For example, the describe-function here… What describe-function does is it gets a name of a function, and it will display the documentation for it. Let's say, for example… Let's look for describe package. So it will describe this information. It will show you where it's found. It will show you what its arguments are and what the documentation is. So that's orderless. It helps you find things.


(use-package vertico
  :ensure t
  :init
  (vertico-mode))
Figure 3: Vertico completion

00:25:29.380: Another way to do it is just to type in the names and hope that you can find the function that way. Now, when you're starting off with a plain Emacs, it's a little bit sparse. You actually have to remember the things. If you start typing describe, you can use tab for completion. You have to hit it twice to get more of a completion list. So this is one of the things that I tend to encourage people to change right away, because completions make it much easier for you to see what you actually want to do. One of the things that I like to use for this is a package called Vertico. And if I evaluate that, if I use the package and turn on vertico mode, then you'll see that it displays the functions right there. Even if I don't remember how to spell apropos, I can start typing apr and it will display it right away. If I'm looking for something like function, you'll notice that it can't find anything, but that's because we're still missing the next thing that I want to share, which is orderless.


(use-package marginalia
  :ensure t
  :bind (:map minibuffer-local-map
              ("M-A" . marginalia-cycle))
  :init
  (marginalia-mode))
Figure 4: Marginalia: margin notes

00:28:10.920: The other nice thing that you can add on top of this completion is a package called marginalia. Marginalia, anyway, marginalia, sure. Now, when you're describing functions, sometimes the first line is all you need in order to figure out whether this is something you want or not. Marginalia will let you, let's say for example if you're looking for a function, it will let you see the arguments as well as a brief description of it in the right side here. It adds these margin notes to a lot of the searches, a lot of the completions that you have available. Let's say for example, if I'm looking for… I want a function that works with sequences and it makes things flatter. No, it makes things, okay, let's make, let's pick something that involves splitting things into different groups. And seq-partition here, you can see here that it takes a sequence and it returns a list of elements of sequence grouped into subsequences of length n. Having that function description right there can be really helpful when you're trying to find something. All right. So that's Marginalia.
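
To make that description of seq-partition concrete, here is a tiny example. The input list is made up, and the variable name is only there so the result can be inspected.

```elisp
(require 'seq)

;; Split a list into groups of at most 3 elements; the last group
;; holds whatever is left over.
(defvar my-demo-partitioned
  (seq-partition '(1 2 3 4 5 6 7) 3))

my-demo-partitioned ; ((1 2 3) (4 5 6) (7))
```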


M-x shortdoc:

Figure 5: shortdoc sequence

00:29:56.020: Another way to find something if you aren't quite sure what you're looking for is shortdoc. With shortdoc, you can give it a group of functions, and the wonderful Emacs developers have already created a whole bunch of cheat sheets for these different functions. For example, I do a lot of work with sequences, which is basically any kind of list or vector or… I think, sets, maybe? I don't know, it's… Anyway, sequences. So there is a shortdoc for sequence, and what it does is it shows you very short documentation (hence short doc) showing you the different important functions in this group, with examples. I really like the fact that I can just flip through this and then in the different sections, try to find the function that I'm looking for. This is especially helpful if you don't know what something is called or even what keywords you might start to look for, because one of the big challenges with Emacs Lisp is just figuring out what things are called. So, shortdoc, you can just call it with M-x shortdoc, and then you can browse through it. As you can see, there are a whole lot of different things that are included there. You can probably define your own too if you want to. For example, string-related functions show me all the things that… Like one time I was trying to figure out whether it was called repeat or whatever. It's actually called make-string. You can see all of the examples there.
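
Here is a short sketch of both halves of that: calling make-string, and defining a small cheat sheet of your own. It assumes Emacs 28 or later, where shortdoc.el is built in; the group name my-demo-strings and the command name are invented for this example.

```elisp
(require 'shortdoc)

;; make-string repeats a single character.
(defvar my-demo-rule (make-string 5 ?-))   ; "-----"

;; A personal cheat sheet: section titles are strings, and each
;; function entry can carry an :eval form that is shown with its result.
(define-short-documentation-group my-demo-strings
  "Building strings"
  (make-string
   :eval (make-string 5 ?x))
  (string-join
   :eval (string-join '("yay" "emacs") " ")))

(defun my-demo-show-cheat-sheet ()
  "Display the `my-demo-strings' cheat sheet defined above."
  (interactive)
  (shortdoc-display-group 'my-demo-strings))
```

After evaluating this, the new group should also show up in the completion list for M-x shortdoc.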


Install from MELPA:

(require 'package)
(add-to-list 'package-archives '("melpa" . "https://melpa.org/packages/") t)
(use-package helpful
  :ensure t
  ;; Rebind the standard help keys to their Helpful equivalents.
  :bind
  (("C-h f" . helpful-callable)
   ("C-h v" . helpful-variable)
   ("C-h k" . helpful-key)
   ("C-h x" . helpful-command)))

00:31:44.120: Now, this is great for finding functions, but sometimes you want to have even more information about something when you're looking at it. So if I press C-h f, which is the describe-function keyboard shortcut, it'll default to the symbol at point, which is helpful. It'll show me the documentation for it. But you can actually get even more documentation than this. That's where the helpful package comes in, which is another thing that I've been finding really useful. Now, Helpful is not built in, but you can install it from MELPA, which is the Milkypostman Elpa, Emacs Lisp package repository. Oh, sorry, archive, A, not R, archive. You can find the instructions for doing that at melpa.org. So let's get this into our configuration. I paste it in, I evaluate the lines, and then it has a bunch of commands that replace the functions that we have. Well, not replace, but you know, they encourage you to bind it over the keyboard shortcuts for find-function, sorry, describe-function or describe-variable or describe-key and so forth. For example, if I press C-h f, it will now call helpful-callable. If I then use something like seq-find, maybe, then it will show me not only the signature, the arguments for the function, and the documentation for it, which is fairly normal, but it'll give me references as well. It'll let me enable edebugging right from there, and even include the source code. I find that the source… Looking at the source code for functions is very helpful. Before, I used to use describe-function or find-function. Well, describe-function will give you a link to its implementation. And find-function will try to find the… Is it actually called find-function? Yeah. So, there is a find-function that finds the definition of a function, but with helpful, I can get the source code right there. So then I can say, show me all sorts of things, and I don't have to do that extra jump. 
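
Some of what the help buffers show (the arguments, the documentation, and where something is defined) can also be pulled out from Lisp with built-in functions. This is just a sketch of that idea, not how Helpful itself works; my-demo-function-summary is a made-up helper.

```elisp
(defun my-demo-function-summary (fn)
  "Return a plist describing FN: arguments, first docstring line, and file."
  (list
   ;; The argument list, keeping the original argument names.
   :args (help-function-arglist fn t)
   ;; The raw docstring (no key substitution), first line only.
   :doc  (car (split-string (or (documentation fn t) "") "\n"))
   ;; The file that defined FN, or nil if it was evaluated by hand.
   :file (symbol-file fn 'defun)))

(require 'seq)
(my-demo-function-summary 'seq-find)
```

To jump to the definition itself, M-x find-function is still the command to use.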
A whole bunch of other things, in case you're curious: It'll show you the symbol properties or whether something is advised or not. You get some of that as well with a regular describe-function, but this is a little bit extra. So that's the helpful package, which I do find helpful, as it says. As Alejandro points out, it even shows source code in C, if it's available. So I think I was talking about libxml-parse-html-region. libxml-parse-html-region is actually the thing that I mentioned. You'll see here that it's defined in xml.c, but the source code is here as well. All it does is it calls the parse region from libxml2, which is arcane wizardry, and I don't know how to fiddle with the C stuff yet, which is why we ended up hacking it at the Emacs Lisp level instead. So, helpful is actually very helpful.


(use-package elisp-demos
  :ensure t
  :config
  ;; Show demos in the built-in *Help* buffers...
  (advice-add 'describe-function-1 :after #'elisp-demos-advice-describe-function-1)
  ;; ...and in Helpful buffers.
  (advice-add 'helpful-update :after #'elisp-demos-advice-helpful-update))

Example: set-transient-map

Figure 7: Demos for set-transient-map

00:35:13.260: One thing that makes it even more helpful than that is another package called elisp-demos. elisp-demos adds some extra examples to this. Let's try that out here. Now that I've included that, it has added some advice around helpful-update to load the demos. If I look up something like seq-partition maybe, which is that function we mentioned earlier, you'll notice that there's a new section here called demos, and an example here of how someone uses it to… You pass a list, you pass a number, and it gives you some results. I put in this patch recently that the maintainer has accepted, hooray, and it allows you to add your own notes [with this button]. So you can add a note there. The kiddo is awake, so I'd better talk even faster. Okay, so the last thing I wanted to show is you can use this not only to include– hello kiddo, you're awake, you're awake– not only to include your own examples, but also, you can use it to take other notes. Here, for example, is set-transient-map. There are examples there. If I look at tabulated-list-mode, I think there aren't any demos for this one yet. But, very quickly, I will show you before I feed the kiddo some breakfast– (although if you want, you can start getting your own breakfast)– is you can also use that to add your own notes here. This is a blog post that had a whole bunch of information on how to set up tabulated list mode, which will open eventually in… Ah! Some browser. Okay, I will just open it here because I just remembered my default browser is Firefox, though… We will show that some other time. Okay, so this is the blog post that explains a lot about how to use tabulated-list-mode, or at least it gives a very short example. And then… Actually, let's just copy this over here. Sure, we can add… Ha, it's not added. Let's make that lisp… I don't have my usual structs defined here. OK, so now if I C-h f tabulated-list-mode you can see here the… Oh, that's not an end_export, is it? That should be an end_src. 
So you can see here, it has the example that we've just added and you can also include whatever notes you have all about the thing. That is probably all I can do today because I gotta go feed a kiddo. All right folks, thank you for joining me. Next stream is probably Sunday again, probably around this time, but maybe with faster talking. Thanks everyone and have lots of fun. Thanks for dropping by!
