Remove filler words at the start and upcase the next word
| audio, speechtotext, emacs

Like many people, I tend to use "So", "And", "You know", and "Uh" to bridge between sentences when thinking. WhisperX does a reasonable job of detecting sentences and splitting them up anyway, but it leaves those filler words in at the start of the sentence. I usually like to remove these from transcripts so that they read more smoothly.
Here's a short Emacs Lisp function that removes those filler words when they start a sentence, capitalizing the next word. When called interactively, it shows each proposed change in an overlay and prompts for confirmation. When called from Emacs Lisp or with a prefix argument, it makes the replacements without asking.
(defvar my-filler-words-regexp
  "\\. \\(?:so,?\\|and\\|you know,\\|uh,?\\) \\(.\\)"
  "Regexp matching a filler word at the start of a sentence.
Group 1 is the first character of the word that follows.")

(defun my-remove-filler-words-at-start ()
  "Remove sentence-initial filler words after point, upcasing the next word.
Interactively, confirm each change unless called with a prefix argument."
  (interactive)
  (save-excursion
    (while (re-search-forward my-filler-words-regexp nil t)
      (if (and (called-interactively-p 'any) (not current-prefix-arg))
          (let ((overlay (make-overlay (match-beginning 0) (match-end 0))))
            (overlay-put overlay 'common-edit t)
            ;; Preview the proposed change in place of the matched text.
            (overlay-put overlay 'display
                         (propertize (concat (match-string 0) " -> . "
                                             (upcase (match-string 1)))
                                     'face 'modus-themes-mark-sel))
            (unwind-protect
                (pcase (read-char-choice "Replace (y/n/!/q)? " '(?y ?n ?! ?q))
                  ;; !: replace this match and all remaining ones without asking.
                  (?! (replace-match (concat ". " (upcase (match-string 1))) t)
                      (while (re-search-forward my-filler-words-regexp nil t)
                        (replace-match (concat ". " (upcase (match-string 1))) t)))
                  (?y (replace-match (concat ". " (upcase (match-string 1))) t))
                  (?n nil)
                  (?q (goto-char (point-max))))
              (delete-overlay overlay)))
        (replace-match (concat ". " (upcase (match-string 1))) t)))))
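To see what the replacement does without touching a real transcript, here is a minimal sketch of just the non-interactive path, run against a string in a temporary buffer. The helper name `my-demo-remove-filler-words` is hypothetical and not part of the function above. The sketch also binds `case-fold-search` explicitly, which is an assumption on my part (the function above relies on whatever the current buffer's setting is).

```emacs-lisp
;; Hypothetical helper for illustration only: applies the same regexp and
;; replacement as the non-interactive branch, but to a string.
(defun my-demo-remove-filler-words (text)
  "Return TEXT with sentence-initial filler words removed."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; Assumption: match "So"/"so" alike regardless of the buffer's setting.
    (let ((case-fold-search t))
      (while (re-search-forward
              "\\. \\(?:so,?\\|and\\|you know,\\|uh,?\\) \\(.\\)" nil t)
        ;; FIXEDCASE and LITERAL are both t so the text is inserted as-is.
        (replace-match (concat ". " (upcase (match-string 1))) t t)))
    (buffer-string)))

(my-demo-remove-filler-words
 "I recorded a note. So, it needs cleanup. And then I publish it.")
;; => "I recorded a note. It needs cleanup. Then I publish it."
```

Because the regexp starts with a period and a space, a filler word is only removed when it follows the end of a previous sentence. The search also continues from the end of each replacement, so two fillers in a row (such as "So, uh, ...") would need a second pass.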