Here are some differences between the current
implementation and the previous ones:
The website shows the last two weeks of posts,
since I can filter by date. It should also
ignore future-dated posts.
The list of feeds on the right side is now
sorted by last post date, so it's easier to see
active blogs.
I can now filter a general feed by a regular
expression.
I've removed a number of unreachable blogs.
The feed list is loaded from a JSON file instead
of an INI file.
The Atom feed and the OPML file should validate,
but let me know at sacha@sachachua.com if there
are any hiccups (or if you have an Atom/RSS feed
we can add to the aggregator =) ).
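To make the date and regular-expression filters above more concrete, here is a minimal sketch in Emacs Lisp. It is not the aggregator's actual code. The function name my-planet-filter-posts and the :time and :title keys are made up for illustration, and posts are assumed to be plists whose :time is in seconds since the epoch.

```emacs-lisp
(require 'seq)

(defun my-planet-filter-posts (posts &optional regexp now days)
  "Return the POSTS dated within the last DAYS days (default 14) before NOW.
Future-dated posts are dropped.  If REGEXP is non-nil, also keep only
posts whose :title matches it.  Each post is a hypothetical plist with
:time (seconds since the epoch) and :title."
  (let* ((now (or now (float-time)))
         (cutoff (- now (* (or days 14) 24 60 60))))
    (seq-filter
     (lambda (post)
       (let ((time (plist-get post :time)))
         (and (>= time cutoff)          ; not older than the window
              (<= time now)             ; ignore future-dated posts
              (or (null regexp)
                  (string-match-p regexp (plist-get post :title))))))
     posts)))
```

Passing NOW explicitly makes it easy to try the filter against a fixed date instead of the current time.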
[2025-01-12 Sun]: u/dr-timeous posted treemap_org.py (on GitHub), which makes a coloured treemap that displays the body on hover (via Reddit). Also, I think librsvg doesn't support wrapped text, so I might need to wrap text manually if I want to get the kind of text density that webtreemap has.
One of the challenges with digital notes is that
it's hard to get a sense of volume, of mass, of
accumulation. Especially with Org Mode, everything
gets folded away so neatly and I can jump around
so readily with C-c j (org-goto) or C-u C-c
C-w (org-refile) that I often don't stumble
across the sorts of things I might encounter in a
physical notebook.
Treemaps are a quick way to visualize hierarchical
data using nested rectangles or squares, giving a
sense of relative sizes. I was curious about what
my main organizer.org file would look like as a
treemap, so I wrote some code to transform it into
the kind of data that
https://github.com/danvk/webtreemap wants as
input. webtreemap creates an HTML file that uses
JavaScript to let me click on nodes to navigate
within them.
For this treemap prototype, I used
org-map-entries to go over all the headings and
make a report with the outline path and the size
of the heading. To keep the tree visualization
manageable, I excluded done/cancelled tasks and
archived headings. I also wanted to exclude some
headings from the visualization, like the way my
Parenting subheading has lots of personal
information underneath it. I added a :notree:
tag to indicate that a tree should not be
included.
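Here's a minimal sketch of that kind of report. It is an illustration rather than the exact code behind the treemaps: the name my-org-treemap-report is made up, "size" is assumed to mean the number of characters in the entry itself (heading plus body, not children), and done/cancelled is assumed to mean any keyword in org-done-keywords.

```emacs-lisp
(require 'org)

(defun my-org-treemap-report ()
  "Return a list of (OUTLINE-PATH . SIZE) for headings in the current buffer.
Skip archived trees, entries in a done state, and entries that have or
inherit the :notree: tag.  SIZE counts the characters in the entry
itself, not in its children."
  (delq nil
        (org-map-entries
         (lambda ()
           ;; Point is at the beginning of each heading here.
           (unless (or (org-entry-is-done-p)
                       (member "notree" (org-get-tags)))
             (cons (mapconcat #'identity (org-get-outline-path t) "/")
                   (- (org-entry-end-position) (point)))))
         ;; No tag/property match, current buffer, skip archived trees.
         nil nil 'archive)))
```

Because org-get-tags includes inherited tags by default, tagging a parent with :notree: also leaves out everything underneath it. The pairs can then be written out, one line per heading, in whatever format the treemap tool expects.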
Reflections
The video and the screenshot above show the
treemap for my main Org Mode file,
organizer.org. I feel like the treemap makes it
easier to see projects and clusters where I'd
accumulated notes, both in terms of length and
quantity. (I've omitted some trees like
"Parenting" which take up a fairly large chunk of
space.)
To no one's surprise, Emacs takes up a large part
of my notes and ideas. =)
When I look at this treemap, I notice a bunch of
nodes I need to mark as DONE or CANCELLED
because I forgot to update my organizer.org. That
usually happens when I come up with an idea, don't
remember that I'd come up with it before, put it
in my inbox.org file, and do it from there or from
the organizer.org location I've refiled it to
without bumping into the first idea. Once in a
blue moon, I go through my whole organizer.org
file and clean out the cruft. Maybe a treemap like
this will make it easier to quickly scan things.
Interestingly, "Explore AI" takes up a
disproportionately large chunk of my "Inactive
Projects" visualization, even though I spend more
time and attention on other things. Large language
models make it easy to generate a lot of text, but
I haven't really done the work to process those.
I've also collected a lot of links that I haven't
done much with.
It might be neat to filter the headings by
timestamp so that I can see things I've touched in
the last 6 months.
Hmm, looking at this treemap reminds me that I've
got "organizer.org/Areas/Ideas for things to do
with focused time/Writing/", which probably should
get moved to the posts.org file that I tend to
use for drafts. Let's take a look at the treemap for
that file. (Updated: cleared it out!)
Unlike my organizer.org file, my posts.org
file tends to be fairly flat in terms of
hierarchy. It's just a staging ground for ideas
before I put them on my blog. I usually try to
keep posts short, but a few of my posts have
sub-headings. Since the treemap makes it easy to
see nodes that are larger or more complex, that
could be a good nudge to focus on getting those
out the door. Looking at this treemap reminds me
that I've got a bunch of EmacsConf posts that I
want to finish so that I can document more of our
processes and tools.
My inbox.org is pretty flat too, since it's
really just captured top-level notes that I'll
either mark as done or move somewhere else
(usually organizer.org). Because the treemap
visualization tool uses / as a path separator,
the treemap groups headings that are plain URLs
by domain and path.
My Emacs configuration is organized as a
hierarchy. I usually embed the explanatory blog
posts in it, which explains the larger nodes. I
like how the treemap makes it easy to see the
major components of my configuration and where I
might have a lot of notes/custom code. For
example, my config has a surprising amount to do
with multimedia considering Emacs is a text
editor, and that's mostly because I like to tinker
with my workflow for sketchnotes and subtitles.
This treemap would be interesting to colour based
on whether something has been described in a blog
post, and it would be great to link the nodes in a
published SVG to the blog post URLs. That way, I
can more easily spot things that might be fun to
write about.
There's another treemap visualization tool that
can produce squarified treemaps as coloured SVGs,
so that style might be interesting to explore too.
Next steps
I think there's some value in being able to look
at and think about my outline headings with a
sense of scale. I can imagine a command that shows
the treemap for the current subtree and allows
people to click on a node to jump to it (or maybe
shift-click to mark something for bulk action), or
one that shows subtrees summing up :EFFORT:
estimates or maybe clock times from the logbook,
or one limited by a timestamp range, or one that
highlights matching entries as you type in a
query, or one that visualizes s-exps or JSON or
project files or test coverage.
It would probably be more helpful if the treemap
were in Emacs itself, so I could quickly jump to
the Org nodes and read more or mark something as
done when I notice it. boxy-headings uses text to
show the spatial relationships of nested headings,
which is neat but probably not up to handling this
kind of information density. Emacs can also
display SVG images in a buffer, animate them, and
handle mouse-clicks, so it could be interesting to
implement a general treemap visualization which
could then be used for all sorts of things like
disk space usage, files in project modules, etc.
SVGs would probably be a better fit for this
because that allows increased text density and
more layout flexibility.
It would be useful to browse the treemap within
Emacs, export it as an SVG so that I can include
it in a webpage or blog post, and add some
JavaScript for web-based navigation.
The Emacs community being what it is (which is
awesome!), I wouldn't be surprised if someone's
already figured it out. Since a quick search
for treemap in the package archives and various
places doesn't seem to turn anything up, I thought
I'd share these quick experiments in case they
resonate with other people. I guess I (or someone)
could figure out the squarified treemapping
algorithm or the ordered treemap algorithm in
Emacs Lisp, and then we can see what we can do
with it.
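As a starting point, here is a rough sketch of the much simpler slice-and-dice layout, not the squarified or ordered algorithms mentioned above. It handles only one level of the hierarchy, and the function name and the (LABEL . SIZE) input format are assumptions for illustration.

```emacs-lisp
(defun my-treemap-slice (items x y width height &optional vertical)
  "Divide the rectangle X Y WIDTH HEIGHT among ITEMS by relative size.
ITEMS is a list of (LABEL . SIZE).  Return a list of (LABEL X Y W H).
Slices run left to right, or top to bottom if VERTICAL is non-nil."
  (let ((total (float (apply #'+ (mapcar #'cdr items))))
        (offset 0.0)
        result)
    (dolist (item items (nreverse result))
      ;; Each item gets a share of the rectangle proportional to its size.
      (let ((share (/ (cdr item) total)))
        (push (if vertical
                  (list (car item) x (+ y offset) width (* share height))
                (list (car item) (+ x offset) y (* share width) height))
              result)
        (setq offset (+ offset (* share (if vertical height width))))))))
```

Calling this again inside each resulting rectangle, alternating VERTICAL at every level, would give a nested treemap, although the rectangles tend to get long and thin, which is the problem squarified treemaps are meant to address.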
I've also thought about other visualizations that
can help me see my Org files a different way.
Network graphs are pretty popular among the
org-roam crew because org-roam-ui makes them.
Aside from a few process checklists that link to
headings that go into step-by-step detail and
things that are meant to graph connections between
concepts, most of my Org Mode notes don't
intentionally link to other Org Mode notes. (There
are also a bunch of random org-capture context
annotations I haven't bothered removing.) I tend
to link to my public blog posts, sketches, and
source code rather than to other headings, so
that's a layer of indirection that I'd have to
custom-code. Treemaps might be a good start,
though, as they take advantage of the built-in
hierarchy. Hmm…
I usually write my scripts with phrases that could be turned into the subtitles. I figured I might as well combine that information with the WhisperX transcripts which I use to cut out my false starts and oopses. To do that, I use the string-distance function, which calculates how similar strings are, based on the Levenshtein [distance] algorithm. If I take each line of the script and compare it with the list of words in the transcription, I can add one transcribed word at a time, until I find the number with the minimum distance from my current script phrase. This lets me approximately match strings despite misrecognized words. I use oopses to signal mistakes. When I detect those, I look for the previous script line that is closest to the words I restart with. I can then skip the previous lines automatically. When the script and the transcript are close, I can automatically correct the words. If not, I can use comments to easily compare them at that point. Even though I haven't optimized anything, it runs well enough for my short videos. With these subtitles as a base, I can get timestamps with subed-align and then there's just the matter of tweaking the times and adding the visuals.
Text from sketch
Matching a script with a transcript 2025-01-09-01
script
record on my phone
WhisperX transcript (with false starts and recognition errors)
My current implementation is totally unoptimized (n²) but it's fine for short videos.
Process:
While there are transcript words to process
Find the script line that has the minimum distance to the words left in the transcript. restart after oopses
Script
Transcript: min. distance between script phrase & transcript
Restarting after oops: find script phrase with minimum distance
Ex. script phrase: The Emacs text editor
Transcript: The Emax text editor is a…
Bar graph of distance decreasing, and then increasing again
Minimum distance
Oops?
N: Use transcript words, or diff > threshold?
Y: Add script words as comment
N: Correct minor errors
Y: Mark caption for skipping and look for the previous script line with minimum distance.
Result:
Untimed captions with comments
Aeneas
Timed captions for editing
This means I can edit a nicely-split, mostly-corrected file.
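The core matching step can be sketched like this. It is a simplified illustration, not the code in subed-word-data.el: the function name is made up, it uses the raw Levenshtein distance from string-distance rather than a normalized one, and it checks every prefix of the remaining words instead of stopping once the distance starts to increase.

```emacs-lisp
(defun my-closest-word-count (phrase words)
  "Return (COUNT . DISTANCE) for the prefix of WORDS closest to PHRASE.
WORDS is a list of transcribed words.  Words are added one at a time
and compared with PHRASE using `string-distance', ignoring case.
The first prefix with the minimum distance wins."
  (let ((count 0)
        (best-count 0)
        best-distance
        candidate)
    (dolist (word words)
      (setq count (1+ count)
            candidate (if candidate (concat candidate " " word) word))
      (let ((distance (string-distance (downcase phrase)
                                       (downcase candidate))))
        (when (or (null best-distance) (< distance best-distance))
          (setq best-count count
                best-distance distance))))
    (cons best-count best-distance)))
```

For the example in the sketch, (my-closest-word-count "The Emacs text editor" '("The" "Emax" "text" "editor" "is" "a")) picks the first four words, since "Emax" is only two edits away from "Emacs" and adding "is" makes the distance go up again.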
I've included the links to various files below so you can get a sense of how it works. Let's focus on an excerpt from the middle of my script file.
it runs well enough for my short videos.
With these subtitles as a base,
I can get timestamps with subed-align
When I call WhisperX with large-v2 as the model and --max_line_width 50 --segment_resolution chunk --max_line_count 1 as the options, it produces these captions corresponding to that part of the script.
01:25.087 --> 01:29.069
runs well enough for my short videos. With these subtitles
01:29.649 --> 01:32.431
as a base, I can get... Oops. With these subtitles as a base, I
01:33.939 --> 01:41.205
can get timestamps with subedeline, and then there's just
Running subed-word-data-use-script-file results in a VTT file containing this excerpt:
00:00:00.000 --> 00:00:00.000
it runs well enough for my short videos.

NOTE #+SKIP

00:00:00.000 --> 00:00:00.000
With these subtitles as a base,

NOTE #+SKIP

00:00:00.000 --> 00:00:00.000
I can get... Oops.

00:00:00.000 --> 00:00:00.000
With these subtitles as a base,

NOTE
#+TRANSCRIPT: I can get timestamps with subedeline,
#+DISTANCE: 0.14

00:00:00.000 --> 00:00:00.000
I can get timestamps with subed-align
There are no timestamps yet, but subed-align can add them. Because subed-align uses the Aeneas forced alignment tool to figure out timestamps by lining up waveforms for speech-synthesized text with the recorded audio, it's important to keep the false starts in the subtitle file. Once subed-align has filled in the timestamps and I've tweaked the timestamps by using the waveforms, I can use subed-record to create an audio file that omits the subtitles that have #+SKIP comments.
The code is available as subed-word-data-use-script-file in subed-word-data.el. I haven't released a new version of subed.el yet, but you can get it from the repository.
In addition to making my editing workflow a little more convenient, I think it might also come in handy for applying the segmentation from tools like sub-seg or lachesis to captions that might already have been edited by volunteers. (I got sub-seg working on my system, but I haven't figured out lachesis.) If I call subed-word-data-use-script-file with the universal prefix arg C-u, it should set keep-transcript-words to true and keep any corrections we've already made to the caption text while still approximately matching and using the other file's segments. Neatly-segmented captions might be more pleasant to read and may require less cognitive load.
There's probably some kind of fancy Python project
that already does this kind of false start
identification and script reconciliation. I just
did it in Emacs Lisp because that was handy and
because that way, I can make it part of subed. If
you know of a more robust or full-featured
approach, please let me know!
I put together a pull request to modify
ob-mermaid-cli-path so that it doesn't get quoted
and can therefore have the aa-exec command needed
to work around that. With that modified
org-babel-execute:mermaid, I can then configure
ob-mermaid like this:
(use-package ob-mermaid
  :load-path "~/vendor/ob-mermaid")
;; I need to override this so that the executable isn't quoted
(setq ob-mermaid-cli-path "aa-exec --profile chrome mmdc -c ~/.config/mermaid/config.json")
I also ran into a problem where the library that
Emacs uses to display SVGs could not handle the
foreignObject elements used for the labels (see
"mermaid missing text in svg", Issue #112 in
mermaid-js/mermaid-cli). Using the following
~/.config/mermaid/config.json fixed it, and I
put the option in the ob-mermaid-cli-path above
so that it always gets loaded.
New in this video: subed-record-sum-time, #+PAD_LEFT and #+PAD_RIGHT
I like the constraints of a one-minute video, so I added a subed-record-sum-time command. That way, when I edit the video using Emacs, I can check how long the result will be. First, I split the subtitles, align it with the audio to fix the timestamps, and double check the times. Then I can skip my oopses. Sometimes WhisperX doesn't catch them, so I also look at waveforms and characters per second. I already talk quickly, so I'm not going to speed that up but I can trim the pauses in between phrases which is easy to do with waveforms. Sometimes, after reviewing a draft, I realize I need a little more time. If the original audio has some silence, I can just copy and paste it. If not, I can pad left or pad right to add some silence. I can try the flow of some sections and compile the video when I'm ready. Emacs can do almost anything. Yay Emacs!
I like the constraints of a one-minute video, so I added a subed-record-sum-time command. That way, when I edit the video using Emacs, I can check how long the result will be.
subed-record uses subtitles and directives in
comments in a VTT subtitle file to edit audio
and video. subed-record-sum-time calculates
the resulting duration and displays it in the
minibuffer.
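The idea behind the calculation can be sketched as follows. This is not the actual subed-record-sum-time: it works on a made-up list of plists with :start and :stop in milliseconds and an optional :comment, and it only accounts for #+SKIP, not padding or other directives.

```emacs-lisp
(defun my-sum-cue-time (cues)
  "Return the total duration in milliseconds of CUES that are not skipped.
Each cue is a hypothetical plist with :start and :stop in milliseconds
and an optional :comment.  Cues whose comment contains #+SKIP are
left out of the total."
  (let ((total 0))
    (dolist (cue cues total)
      (let ((comment (or (plist-get cue :comment) "")))
        (unless (string-match-p "#\\+SKIP" comment)
          (setq total (+ total (- (plist-get cue :stop)
                                  (plist-get cue :start)))))))))
```

Dividing the result by 1000.0 gives seconds, which is handy for checking whether a draft still fits in a minute.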
First, I split the subtitles, align it with the audio to fix the timestamps, and double check the times.
I'm experimenting with an algorithmic way to
combine the breaks from my script with the
text from the transcript. subed-align calls
the aeneas forced alignment tool to match up
the text with the timestamps. I use
subed-waveform-show-all to show all the
waveforms.
Then I can skip my oopses.
Adding a NOTE #+SKIP comment before a
subtitle makes subed-record-compile-video
and subed-record-compile-flow skip that part
of the audio.
Sometimes WhisperX doesn't catch them,
WhisperX sometimes doesn't transcribe my false starts if I repeat things quickly.
so I also look at waveforms
subed-waveform-show-all adds waveforms for
all the subtitles. If I notice there's a pause
or a repeated shape in the waveform, or if I
listen and notice the repetition, I can
confirm by middle-clicking on the waveform to
sample part of it.
and characters per second.
Low characters per second is sometimes a sign
that the timestamps are incorrect or there's a
repetition that wasn't transcribed.
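For reference, characters per second is just the length of the caption text divided by its duration. Here's a small sketch of that arithmetic, using a made-up function name and assuming times in milliseconds:

```emacs-lisp
(defun my-cue-cps (text start-ms stop-ms)
  "Return the characters per second for TEXT shown from START-MS to STOP-MS.
Return nil if the cue has no duration."
  (let ((seconds (/ (- stop-ms start-ms) 1000.0)))
    (when (> seconds 0)
      (/ (length text) seconds))))
```

A cue whose value is much lower than its neighbours is worth a second listen.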
I already talk quickly, so I'm not going to speed that up
Also, I already sound like a chipmunk;
mechanically speeding up my recording to fit
in a certain time will make that worse =)
but I can trim the pauses in between phrases which is easy to do with waveforms.
Left-click to set the start, right-click to
set the stop. If I want to adjust the
previous/next one at the same time, I would
use shift-left-click or shift-right-click, but
here I want to skip the gaps between phrases,
so I adjust the current subtitle without
making the previous/next one longer.
Sometimes, after reviewing a draft, I realize I need a little more time.
I can specify visuals like a video, animated
GIF, or an image by adding a [[file:...]]
link in the comment for a subtitle. That
visual will be used until the next visual is
specified in a comment on a different
subtitle. subed-record-compile-video can
automatically speed up video clips to fit in
the time for the current audio segment, which
is the set of subtitles before the next visual
is defined. After I compile and review the
video, sometimes I notice that something goes by too quickly.
If the original audio has some silence, I can just copy and paste it.
This can sometimes feel more natural than adding in complete silence.
If not, I can pad left or pad right to add some silence.
I added a new feature so that I could specify
something like #+PAD_RIGHT: 1.5 in a comment
to add 1.5 seconds of silence after the audio
specified by that subtitle.
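Reading a directive like that out of a comment could look something like this. It is only a sketch of the parsing step, with a made-up function name, and it is not how subed-record itself is implemented.

```emacs-lisp
(defun my-comment-padding (comment direction)
  "Return the seconds of padding requested in COMMENT for DIRECTION.
DIRECTION is the string \"LEFT\" or \"RIGHT\".  For example, a comment
containing \"#+PAD_RIGHT: 1.5\" gives 1.5 for \"RIGHT\".  Return 0 if
the directive is not present."
  (if (string-match (format "#\\+PAD_%s: *\\([0-9.]+\\)" direction)
                    comment)
      (string-to-number (match-string 1 comment))
    0))
```

The padding could then be added to the cue's duration when working out how long the compiled result will be.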
I can try the flow of some sections
I can select a region and then use M-x
subed-record-compile-try-flow to play the
audio or C-u M-x
subed-record-compile-try-flow to play the
audio+video for that region.
and compile the video when I'm ready.
subed-record-compile-video compiles the
video to the file specified in #+OUTPUT:
filename. ffmpeg is very arcane, so I'm glad
I can simplify my use of it with Emacs Lisp.
Emacs can do almost anything. Yay Emacs!
Non-linear audio and video editing is actually
pretty fun in a text editor, especially when I
can just use M-x vundo to navigate my undo
history.