Comparing pronunciation recordings across time
french, emacs, org, subed
Update: I added a column for Feb 20, the first session with the sentences. I also added keyboard shortcuts (1..n) for playing the audio of the row that the mouse is on.
My French tutor gave me a list of sentences to help me practise pronunciation.
Sentences
- Maman peint un grand lapin blanc.
- Un enfant intelligent mange lentement.
- Le roi croit voir trois noix.
- Le témoin voit le chemin loin.
- Moins de foin au loin ce matin.
- La laine beige sèche près du collège.
- La croquette sèche dans l'assiette.
- Elle mène son frère à l'hôtel.
- Le verre vert est très clair.
- Elle aimait manger et rêver.
- Le jeu bleu me plaît peu.
- Ce neveu veut un jeu.
- Le feu bleu est dangereux.
- Le beurre fond dans le cœur chaud.
- Les fleurs de ma sœur sentent bon.
- Le hibou sait où il va.
- L'homme fort mord la pomme.
- Le sombre col tombe.
- L'auto saute au trottoir chaud.
- Le château d'en haut est beau.
- Le cœur seul pleure doucement.
- Tu es sûr du futur ?
- Trois très grands trains traversent trois trop grandes rues.
- Je veux deux feux bleus, mais la reine préfère la laine beige.
- Vincent prend un bain en chantant lentement.
- La mule sûre court plus vite que le loup fou.
- Luc a bu du jus sous le pont où coule la boue.
- Le frère de Robert prépare un rare rôti rouge.
- La mule court autour du mur où hurle le loup.
I can fuzzy-match these with the word timing JSON from WhisperX, like this.
Extract all approximately matching phrases
(subed-record-extract-all-approximately-matching-phrases
 sentences
 "/home/sacha/sync/recordings/2026-02-20-raphael.json"
 "/home/sacha/proj/french/analysis/virelangues/2026-02-20-raphael-script.vtt")
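To give a rough idea of what approximate matching against word timings can look like, here is a minimal sketch. It is not how subed-record does it. The `my-sketch-*` names are made up, and I'm assuming the WhisperX JSON has already been parsed into a list of (WORD START END) entries. It slides a window the size of the sentence over the words and keeps the window with the smallest Levenshtein distance, using the built-in `string-distance` from Emacs 27 and later.

```emacs-lisp
;; Sketch only: the my-sketch-* names and the (WORD START END) input shape
;; are assumptions for illustration, not part of subed-record or WhisperX.
(require 'cl-lib)
(require 'subr-x)

(defun my-sketch-normalize (text)
  "Downcase TEXT and drop common punctuation so only the wording is compared."
  (string-trim (replace-regexp-in-string "[.,;:!?'’\"]" "" (downcase text))))

(defun my-sketch-best-window (sentence words)
  "Return (DISTANCE START END) for the run of WORDS closest to SENTENCE.
WORDS is a list of (WORD START END) entries in spoken order.
DISTANCE is the Levenshtein distance between the normalized texts.
Return nil if SENTENCE is empty or there are fewer WORDS than needed."
  (let* ((target (my-sketch-normalize sentence))
         (size (length (split-string target)))
         (vec (vconcat words))
         best)
    (when (> size 0)
      ;; Try every window of SIZE consecutive words.
      (dotimes (i (max 0 (1+ (- (length vec) size))))
        (let* ((window (cl-subseq vec i (+ i size)))
               (text (my-sketch-normalize (mapconcat #'car window " ")))
               (distance (string-distance target text)))
          ;; Keep the first window with the smallest distance so far.
          (when (or (null best) (< distance (car best)))
            (setq best (list distance
                             (nth 1 (aref window 0))
                             (nth 2 (aref window (1- size)))))))))
    best))
```

For example, if the transcript heard "pain" instead of "peint", the sentence still matches with a distance of 2:

```emacs-lisp
(my-sketch-best-window
 "Maman peint un grand lapin blanc."
 '(("Euh" 0.0 0.3) ("maman" 0.5 0.9) ("pain" 0.9 1.2) ("un" 1.2 1.3)
   ("grand" 1.3 1.7) ("lapin" 1.7 2.1) ("blanc." 2.1 2.6) ("Oui" 3.0 3.2)))
;; → (2 0.5 2.6)
```

The caller would still need to pick a threshold for how large a distance counts as a match, which probably depends on sentence length.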
Then I can use subed-record to manually tweak them, add notes, and so on. I end up with VTT files like 2026-03-06-raphael-script.vtt. I can assemble the snippets for a session into a single audio file.
I wanted to compare my attempts over time, so I wrote some code to use Org Mode and subed-record to build a table with little audio players that I can use both within Emacs and in the exported HTML.
This collects just the last attempts for each sentence during a number of my sessions (both with the tutor and on my own). The score is from the Microsoft Azure pronunciation assessment service. I'm not entirely sure about its validity yet, but I thought I'd add it for fun. * indicates where I've added some notes from my tutor, which should be available as a title attribute on hover. (Someday I'll figure out a mobile-friendly way to do that.)
Calling it with my sentences and files
(my-lang-summarize-segments
 sentences
 '(("/home/sacha/proj/french/analysis/virelangues/2026-02-20-raphael-script.vtt" . "Feb 20")
   ;("~/sync/recordings/processed/2026-02-20-raphael-tongue-twisters.vtt" . "Feb 20")
   ("~/sync/recordings/processed/2026-02-22-virelangues-single.vtt" . "Feb 22")
   ("~/proj/french/recordings/2026-02-26-virelangues-script.vtt" . "Feb 26")
   ("~/proj/french/recordings/2026-02-27-virelangues-script.vtt" . "Feb 27")
   ("~/proj/french/recordings/2026-03-03-virelangues.vtt" . "Mar 3")
   ("/home/sacha/sync/recordings/processed/2026-03-03-raphael-reference-script.vtt" . "Mar 3")
   ("~/proj/french/analysis/virelangues/2026-03-06-raphael-script.vtt" . "Mar 6")
   ("~/proj/french/analysis/virelangues/2026-03-12-virelangues-script.vtt" . "Mar 12"))
 "clip"
 #'my-lang-subed-record-get-last-attempt
 #'my-lang-subed-record-cell-info
 t)
| Feb 20 | Feb 22 | Feb 26 | Feb 27 | Mar 3 | Mar 3 | Mar 6 | Mar 12 | Text |
| ▶️ 63* | ▶️ 96 | ▶️ 95 | ▶️ 94 | ▶️ 83 | ▶️ 83* | ▶️ 81* | ▶️ 88 | Maman peint un grand lapin blanc. |
| ▶️ 88* | ▶️ 95 | ▶️ 99 | ▶️ 99 | ▶️ 96 | ▶️ 89* | ▶️ 92* | ▶️ 83 | Un enfant intelligent mange lentement. |
| ▶️ 84* | ▶️ 97 | ▶️ 97 | ▶️ 96 | ▶️ 94 | ▶️ 95* | ▶️ 98* | ▶️ 99 | Le roi croit voir trois noix. |
| ▶️ 80* | ▶️ 85 | ▶️ 77 | ▶️ 94 | ▶️ 97 | ▶️ 92* | ▶️ 88 | Le témoin voit le chemin loin. | |
| ▶️ 72* | ▶️ 97 | ▶️ 95 | ▶️ 77 | ▶️ 92 | ▶️ 89* | ▶️ 86 | Moins de foin au loin ce matin. | |
| ▶️ 79* | ▶️ 95 | ▶️ 76 | ▶️ 95 | ▶️ 76 | ▶️ 90* | ▶️ 90* | ▶️ 79 | La laine beige sèche près du collège. |
| ▶️ 67* | ▶️ 99 | ▶️ 85 | ▶️ 81 | ▶️ 85 | ▶️ 99* | ▶️ 97* | ▶️ 97 | La croquette sèche dans l'assiette. |
| ▶️ 88* | ▶️ 99 | ▶️ 100 | ▶️ 100 | ▶️ 98 | ▶️ 100* | ▶️ 99* | ▶️ 100 | Elle mène son frère à l'hôtel. |
| ▶️ 77* | ▶️ 87 | ▶️ 99 | ▶️ 93 | ▶️ 87 | ▶️ 87* | ▶️ 99 | Le verre vert est très clair. | |
| ▶️ 100* | ▶️ 94 | ▶️ 100 | ▶️ 99 | ▶️ 99 | ▶️ 99* | ▶️ 100* | ▶️ 100 | Elle aimait manger et rêver. |
| ▶️ 78* | ▶️ 98 | ▶️ 99 | ▶️ 98 | ▶️ 98 | ▶️ 92* | ▶️ 88 | Le jeu bleu me plaît peu. | |
| ▶️ 78* | ▶️ 97 | ▶️ 85 | ▶️ 95 | ▶️ 85 | ▶️ 85 | Ce neveu veut un jeu. | ||
| ▶️ 73* | ▶️ 95 | ▶️ 95 | ▶️ 96 | ▶️ 97 | ▶️ 100 | Le feu bleu est dangereux. | ||
| ▶️ 87* | ▶️ 76 | ▶️ 65 | ▶️ 97 | ▶️ 85 | ▶️ 74* | ▶️ 85* | ▶️ 96 | Le beurre fond dans le cœur chaud. |
| ▶️ 84* | ▶️ 43 | ▶️ 85 | ▶️ 79 | ▶️ 75 | ▶️ 98 | Les fleurs de ma sœur sentent bon. | ||
| ▶️ 70* | ▶️ 86 | ▶️ 79 | ▶️ 76 | ▶️ 87 | ▶️ 84 | ▶️ 98 | Le hibou sait où il va. | |
| ▶️ 92* | ▶️ 95 | ▶️ 86 | ▶️ 92 | ▶️ 98 | ▶️ 99* | ▶️ 94 | L'homme fort mord la pomme. | |
| ▶️ 83* | ▶️ 73 | ▶️ 69 | ▶️ 81 | ▶️ 60 | ▶️ 96* | ▶️ 81 | Le sombre col tombe. | |
| ▶️ 39* | ▶️ 49 | ▶️ 69 | ▶️ 56 | ▶️ 69 | ▶️ 96* | ▶️ 94 | L'auto saute au trottoir chaud. | |
| ▶️ 82 | ▶️ 84 | ▶️ 85 | ▶️ 98 | ▶️ 94 | ▶️ 96* | ▶️ 99 | Le château d'en haut est beau. | |
| ▶️ 89 | ▶️ 85 | ▶️ 75 | ▶️ 91 | ▶️ 52 | ▶️ 75* | ▶️ 70* | ▶️ 98 | Le cœur seul pleure doucement. |
| ▶️ 98* | ▶️ 99 | ▶️ 99 | ▶️ 95 | ▶️ 93* | ▶️ 97* | ▶️ 99 | Tu es sûr du futur ? | |
| ▶️ 97 | ▶️ 93 | ▶️ 92 | ▶️ 85* | ▶️ 90 | Trois très grands trains traversent trois trop grandes rues. | |||
| ▶️ 94 | ▶️ 85 | ▶️ 97 | ▶️ 82* | ▶️ 92 | Je veux deux feux bleus, mais la reine préfère la laine beige. | |||
| ▶️ 91 | ▶️ 79 | ▶️ 87 | ▶️ 82* | ▶️ 94 | Vincent prend un bain en chantant lentement. | |||
| ▶️ 89 | ▶️ 91 | ▶️ 91 | ▶️ 84* | ▶️ 92 | La mule sûre court plus vite que le loup fou. | |||
| ▶️ 91 | ▶️ 93 | ▶️ 93 | ▶️ 92* | ▶️ 96 | Luc a bu du jus sous le pont où coule la boue. | |||
| ▶️ 88 | ▶️ 71 | ▶️ 94 | ▶️ 86* | ▶️ 92 | Le frère de Robert prépare un rare rôti rouge. | |||
| ▶️ 81 | ▶️ 84 | ▶️ 88 | ▶️ 67* | ▶️ 94 | La mule court autour du mur où hurle le loup. |
Pronunciation still feels a bit hit or miss. Sometimes I say a sentence and my tutor says "Oui," and then I say it again and he says "Non, non…" The /ʁ/ and /y/ sounds are hard.
I like seeing these compact links in an Org Mode table and being able to play them, thanks to my custom audio link type. It should be pretty easy to write a function that lets me use a keyboard shortcut to play the audio (maybe using the keys 1-9?) so that I can bounce between them for comparison.
If I screen-share from Google Chrome, I can share the tab with audio, so my tutor can listen to things at the same time. Could be fun to compare attempts so that I can try to hear the differences better. Hmm, actually, let's try adding keyboard shortcuts that let me use 1-8 to play the current table row. Mwahahaha! It works!
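On the Emacs side, a keyboard shortcut for playing a cell in the current table row could look something like this minimal sketch. The `my-sketch-*` names are made up, and it assumes each player cell contains a single Org bracket link and that opening an `audio:` link plays the clip. The digit keys are bound in a small minor mode so that they only stop self-inserting while I'm comparing clips.

```emacs-lisp
;; Sketch only: the my-sketch-* names are hypothetical. It assumes every
;; player cell holds one Org bracket link, as in the table above.
(require 'org)
(require 'org-table)

(defun my-sketch-row-link (n)
  "Return the bracket link in column N of the current Org table row, or nil."
  (let ((field (org-table-get nil n)))
    (when (and field (string-match org-link-bracket-re field))
      (match-string 0 field))))

(defun my-sketch-play-column (n)
  "Open the link in column N of the current table row.
With a custom audio link type, opening the link should play the clip."
  (interactive "p")
  (let ((link (my-sketch-row-link n)))
    (if link
        (org-link-open-from-string link)
      (message "No link in column %d" n))))

(defvar my-sketch-compare-mode-map
  (let ((map (make-sparse-keymap)))
    (dotimes (i 9)
      (let ((n (1+ i)))
        ;; Backquote bakes N into each command, so this works with or
        ;; without lexical binding.
        (define-key map (kbd (number-to-string n))
          `(lambda () (interactive) (my-sketch-play-column ,n)))))
    map)
  "Keymap that binds 1-9 to playing that column of the current row.")

(define-minor-mode my-sketch-compare-mode
  "Temporarily use the digit keys to play clips in the current table row."
  :lighter " Cmp")
```

With point on a row of the table, `M-x my-sketch-compare-mode` turns the shortcuts on, and pressing 2 would play the second column's clip. Turning the mode off gives the digit keys back.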
Code for summarizing the segments
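(defun my-lang-subed-record-cell-info (item file-index file sub)
  "Return an Org audio link for SUB, the chosen attempt at ITEM in FILE.
FILE is a (FILENAME . LABEL) pair and FILE-INDEX is its position in the list."
  ;; `prefix' and `always-create' are not arguments here. They appear to be
  ;; picked up from `my-lang-summarize-segments' through dynamic binding.
  (let* ((sound-file (expand-file-name (format "%s-%s-%d.opus"
                                               prefix
                                               (my-transform-html-slugify item)
                                               (1+ file-index))))
         ;; Keep only the part of the #+SCORE directive before the first ";".
         (score (car (split-string
                      (or
                       (subed-record-get-directive "#+SCORE" (elt sub 4)) "")
                      ";")))
         ;; Strip a leading "LABEL: " from the note so it isn't repeated in the title.
         (note (replace-regexp-in-string
                (concat "^" (regexp-quote (cdr file))
                        "\\(: \\)?")
                ""
                (or (subed-record-get-directive "#+NOTE" (elt sub 4)) ""))))
    (when (or always-create (not (file-exists-p sound-file)))
      (subed-record-extract-audio-for-current-subtitle-to-file sound-file sub))
    (org-link-make-string
     (concat "audio:" sound-file "?icon=t"
             (format "&source=%s&source-start=%s" (car file) (elt sub 1))
             (format "&title=%s"
                     (url-hexify-string
                      (if (string= note "")
                          (cdr file)
                        (concat (cdr file) ": " note)))))
     (concat
      "▶️"
      ;; SCORE is "" rather than nil when there is no directive, so test for that.
      (if (string= score "") "" (format " %s" score))
      (if (string= note "") "" "*")))))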
(defun my-lang-subed-record-get-last-attempt (item file)
  "Return the last subtitle matching ITEM in FILE."
  (car
   (last
    (seq-remove
     (lambda (o) (string-match "#\\+SKIP" (or (elt o 4) "")))
     (learn-lang-subed-record-collect-matching-subtitles
      item
      (list file)
      nil
      nil
      'my-subed-simplify)))))
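(defun my-lang-summarize-segments (items files prefix attempt-fn cell-fn &optional always-create)
  "Return a list of rows summarizing the attempts at ITEMS across FILES.
FILES is a list of (FILENAME . LABEL) pairs; the labels become the header row.
ATTEMPT-FN is called with an item and a file and returns a subtitle or nil.
CELL-FN is called with the item, file index, file, and subtitle and returns the cell.
PREFIX and ALWAYS-CREATE are not used here directly;
`my-lang-subed-record-cell-info' reads them."
  (cons
   ;; Header row: one label per file, then the sentence text.
   (append
    (seq-map 'cdr files)
    (list "Text"))
   ;; One row per item, with an empty cell where a file has no attempt.
   (seq-map
    (lambda (item)
      (append
       (seq-map-indexed
        (lambda (file file-index)
          (let* ((sub (funcall attempt-fn item file)))
            (if sub
                (funcall cell-fn item file-index file sub)
              "")))
        files)
       (list item)))
    items)))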
Some code for doing this stuff is in sachac/learn-lang on Codeberg.
