Weekly review: Week ending March 26, 2021

| review, weekly
  • Emacs:
    • I added the ability to expand all, collapse all, and toggle visibility of headings in my exported HTML.
    • I learned how to use org-special-blocks’ defblock for Org special blocks.
    • I added a preview to my consult function for reading a sketch filename.
  • Other tech:
    • I removed the search form and sidebar from my blog. I added night mode.
    • It turns out my laptop has 8 GB of RAM. That might have something to do with the CPU load with OBS – maybe it’s swapping? I looked at the prices for 16 GB RAM kits, but I might still want to upgrade my laptop in order to have more screen choices and CPU power.
    • I started looking into the GEDCOM export from Geni, and I noticed that it was incomplete. I may have to manually re-enter the ones that other people put in.
  • Gardening:
    • W- and I moved the garden cage to the deck. I planted peas, calendula, and radishes in the planter outside. I started a small pot of pumpkin seeds.
    • I made sliding doors based on the LEGO Technic idea book we borrowed from the library.
  • Drawing:
    • I tried sketching in both Concepts and Procreate. They’re both nice.
    • I practised sketching plants and insects following “Illustration School: Let’s Draw Plants and Small Creatures.”
    • I modified my sketch viewer to handle SVGs. I figured out how to sketch with a dark background and change the colours for posting.
  • I sewed two long dresses and a bonnet for A-. I also sewed a pair of pajama pants for myself.

Time

| Category | The other week % | Last week % | Diff % | h/wk | Diff h/wk |
|---|---|---|---|---|---|
| A- | 41.7 | 48.3 | 6.6 | 81.6 | 11.1 |
| Discretionary – Play | 0.0 | 1.2 | 1.2 | 2.0 | 2.0 |
| Unpaid work | 2.4 | 3.2 | 0.8 | 5.4 | 1.3 |
| Discretionary – Family | 0.0 | 0.2 | 0.2 | 0.3 | 0.3 |
| Discretionary – Productive | 11.6 | 10.9 | -0.7 | 18.4 | -1.2 |
| Personal | 5.6 | 4.2 | -1.3 | 7.1 | -2.2 |
| Sleep | 38.8 | 32.0 | -6.7 | 54.1 | -11.3 |

Add a note to the bottom of blog posts exported from my config file

| emacs, org

Update: 2021-04-18: Tweaked the code so that I could add it to the main org-export-filter-body-functions list now that I'm using Eleventy and ox-11ty.el instead of Wordpress and org2blog.

I occasionally post snippets from my Emacs configuration file, drafting the notes directly in my literate config and posting them via org2blog. I figured it might be a good idea to include a link to my config at the end of the posts, but I didn't want to scatter redundant links in my config file itself. Wouldn't it be cool if the link could be automatically added whenever I use org2blog to post a subtree from my config file? I think the code below accomplishes that.

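(defun my/org-export-filter-body-add-emacs-configuration-link (string backend info)
  "Append a link to my Emacs configuration to STRING for posts from my config.
BACKEND is unused. INFO is the export plist; :input-file is the source file."
  (let ((input-file (plist-get info :input-file)))
    ;; :input-file is nil when the buffer being exported is not visiting a file.
    (when (and input-file
               (string-match "\\.emacs\\.d/Sacha\\.org" input-file))
      (concat string
              (let ((id (org-entry-get-with-inheritance "CUSTOM_ID")))
                (format
                 "\n<div class=\"note\">This is part of my <a href=\"https://sachachua.com/dotemacs%s\">Emacs configuration.</a></div>"
                 (if id (concat "#" id) "")))))))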

(use-package org
  :config
  (add-to-list 'org-export-filter-body-functions #'my/org-export-filter-body-add-emacs-configuration-link))
This is part of my Emacs configuration.

2021-03-22 Emacs news

| emacs, emacs-news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Hacker News, planet.emacslife.com, YouTube, the Emacs NEWS file and emacs-devel.


Weekly review: Week ending March 19, 2021

| review, weekly
  • I pulled the other meetups’ iCal feeds in automatically.
  • I added announcement timers for upcoming Emacs events.
  • I listened to the EmacsSF meetup on retro Emacs.
  • I experimented with fast ffmpeg cuts in Emacs, compensating for the distance between keyframes.
  • I submitted some tests for subed.el.
  • I tried streaming again, but I think OBS was taking up too much CPU and it made my computer a little too unresponsive. I’ll try streaming from OBS to Twitch instead of using ffmpeg to multicast next time.
  • I wrote about my word-level timing code.
  • I edited a few more subtitles.

Time

| Category | The other week % | Last week % | Diff % | h/wk | Diff h/wk |
|---|---|---|---|---|---|
| Sleep | 33.7 | 38.8 | 5.1 | 64.7 | 8.5 |
| A- | 41.2 | 41.7 | 0.4 | 69.6 | 0.7 |
| Personal | 5.3 | 5.6 | 0.2 | 9.3 | 0.4 |
| Business | 1.4 | 0.0 | -1.4 | 0.0 | -2.3 |
| Discretionary – Productive | 13.7 | 11.6 | -2.1 | 19.4 | -3.5 |
| Unpaid work | 4.7 | 2.4 | -2.3 | 4.0 | -3.9 |

Using word-level timing information when editing subtitles or captions in Emacs

| emacs

I like to split captions at logical points, such as at the end of a phrase or sentence. At first, I used subed.el to play the video for the caption, pausing it at the appropriate point and then calling subed-split-subtitle to split at the playback position. Then I modified subed-split-subtitle to split at the video position that’s proportional to the text position, so that it’s roughly in the right spot even if I’m not currently listening. That got me most of the way to being able to quickly edit subtitles.
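
The proportional approach can be sketched roughly as follows. This is only an illustration of the arithmetic, not the actual change to subed-split-subtitle; the function name my/example-proportional-split-ms and its arguments are made up for this example.

```elisp
(defun my/example-proportional-split-ms (start-ms stop-ms text-length split-pos)
  "Estimate the timestamp for splitting a caption at SPLIT-POS.
START-MS and STOP-MS are the caption's times in milliseconds.
TEXT-LENGTH is the number of characters in the caption text, and
SPLIT-POS is the number of characters before the split point.
This is a hypothetical helper, not part of subed."
  (if (<= text-length 0)
      ;; No text to measure against, so fall back to the start time.
      start-ms
    (let ((fraction (/ (float split-pos) text-length)))
      (+ start-ms (round (* fraction (- stop-ms start-ms)))))))

;; A caption runs from 1000 ms to 5000 ms and has 40 characters.
;; Splitting after the 10th character lands a quarter of the way in.
(my/example-proportional-split-ms 1000 5000 40 10) ; => 2000
```

The estimate assumes speech is spread evenly across the caption text, which is why it is only roughly right.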

It turns out that word-level timing is actually available from YouTube if I download the autogenerated SRV2 file using youtube-dl, which I can do with the following function:

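(defun my/caption-download-srv2 (id)
  "Download the autogenerated SRV2 captions for YouTube video ID and load them.
ID can be a video ID or a URL containing v=ID."
  (interactive "MID: ")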
  (when (string-match "v=\\([^&]+\\)" id) (setq id (match-string 1 id)))
  (call-process "youtube-dl" nil nil nil "--write-auto-sub" "--sub-lang" "en" "--skip-download" "--sub-format" "srv2"
                (concat "https://youtu.be/" id))
  (my/caption-load-word-data (my/latest-file "." "\\.srv2\\'")))

I started by parsing JSON files, but SRV2 seemed to be more reliably available, so here are the parsing functions for both. I also correct common recognition errors along the way, using the my/subed-common-edits variable defined in my config for subtitles. To fix those errors in the VTT file I’m editing, I use my/subed-fix-common-errors, which is also defined elsewhere.

(defvar-local my/caption-cache nil
  "Word-level timing as a list of alists of the form ((start . ms) (end . ms) (text . string)).")
(defun my/caption-json-time-to-ms (json)
  (+ (* 1000 (string-to-number (alist-get 'seconds json)))
     (/ (alist-get 'nanos json) 1000000)))

(defun my/caption-extract-words-from-json3 ()
  (let* ((data (progn (goto-char (point-min)) (json-read)))
         (json3-p (alist-get 'events data))
         (reversed (reverse
                    (or (alist-get 'events data)
                        (cl-loop for seg in (car (alist-get 'results data))
                                 nconc (alist-get 'words (car (alist-get 'alternatives seg)))))))
         (last-event (seq-first reversed))
         (last-ms (if json3-p
                      (+ (alist-get 'tStartMs last-event)
                         (alist-get 'dDurationMs last-event)))))
    (reverse
     (cl-loop for e across reversed append
              (if json3-p
                  (mapcar
                   (lambda (seg)
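                     ;; Use dotted pairs so that alist-get returns the value
                     ;; itself instead of a one-element list.
                     (let ((rec
                            `((start . ,(+ (alist-get 'tStartMs e)
                                           (or (alist-get 'tOffsetMs seg) 0)))
                              (end . ,(min last-ms
                                           (+ (alist-get 'tStartMs e)
                                              (or (alist-get 'dDurationMs e) 0))))
                              (text . ,(alist-get 'utf8 seg)))))
                       (setq last-ms (alist-get 'start rec))
                       rec))
                   (reverse (alist-get 'segs e)))
                ;; Outside the lambda, the current element is E, not SEG.
                ;; Wrap the record in a list so that `append' collects it whole.
                (list
                 `((start . ,(my/caption-json-time-to-ms (alist-get 'startTime e)))
                   (end . ,(my/caption-json-time-to-ms (alist-get 'endTime e)))
                   (text . ,(alist-get 'word e)))))))))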

(defun my/caption-extract-words-from-srv2 ()
  (let* ((data (xml-parse-region))
         (text-elements (reverse (dom-by-tag data 'text)))
         (last-start (+ (string-to-number
                         (alist-get 't (xml-node-attributes (car text-elements))))
                        (string-to-number (alist-get 'd (xml-node-attributes (car text-elements)))))))
    (reverse
     (mapcar #'(lambda (element)
                 (let ((rec (list (cons 'start (string-to-number (alist-get 't (xml-node-attributes element))))
                                  (cons 'end last-start)
                                  (cons 'text (car (xml-node-children element))))))
                   (setq last-start (alist-get 'start rec))
                   rec))
             text-elements))))

(defun my/caption-fix-common-errors (data)
  (mapc (lambda (o)
          (mapc (lambda (e)
                  (when (string-match (concat "\\<" (car e) "\\>") (alist-get 'text o))
                    (map-put! o 'text (replace-match (cadr e) t t (alist-get 'text o)))))
                my/subed-common-edits))
        data))

(defun my/caption-load-word-data (file)
  "Load word-level timing from FILE."
  (interactive "fFile: ")
  (let (data)
    (with-current-buffer (find-file-noselect file)
      (cond
       ((string-match "\\.json" file)
        (setq data (my/caption-extract-words-from-json3)))
       ((string-match "\\.srv2\\'" file)
        (setq data (my/caption-extract-words-from-srv2)))
       (t (error "Unknown format."))))
    (setq-local my/caption-cache
                (mapcar (lambda (entry)
                          (setf (alist-get 'text entry)
                                (replace-regexp-in-string "&#39;" "'" (alist-get 'text entry)))
                          entry)
                        (my/caption-fix-common-errors data)))))
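
As a rough illustration of the data these functions produce, the sketch below uses a made-up sample in the same shape as my/caption-cache. The words, times, and the my/example-... names are all hypothetical. It shows the apostrophe cleanup and a time-window filter similar to the one used when looking up words for a caption.

```elisp
(require 'cl-lib)

(defvar my/example-word-data
  '(((start . 0)   (end . 480)  (text . "I&#39;m"))
    ((start . 480) (end . 900)  (text . "editing"))
    ((start . 900) (end . 1500) (text . "subtitles")))
  "Made-up word-level timing data, one alist per word.")

(defun my/example-decode-apostrophes (data)
  "Return a copy of DATA with &#39; in each text entry replaced by an apostrophe."
  (mapcar (lambda (entry)
            ;; Copy first so that the quoted sample data is not modified.
            (let ((copy (copy-alist entry)))
              (setf (alist-get 'text copy)
                    (replace-regexp-in-string "&#39;" "'" (alist-get 'text copy)))
              copy))
          data))

(defun my/example-words-between (data start-ms stop-ms)
  "Return the text of entries in DATA that lie within START-MS and STOP-MS."
  (mapcar (lambda (entry) (alist-get 'text entry))
          (cl-remove-if-not
           (lambda (entry)
             (and (>= (alist-get 'start entry) start-ms)
                  (<= (alist-get 'end entry) stop-ms)))
           data)))

(my/example-words-between
 (my/example-decode-apostrophes my/example-word-data) 0 900)
;; => ("I'm" "editing")
```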

If I start editing from the beginning of the file, the part of the captions file after point is mostly unedited. That means I can match the remainder of the current caption against the word-level timing data to figure out the time to use when splitting the subtitle, falling back to the proportional method if the data is not available.

(defun my/caption-look-up-word ()
  (save-excursion
    (let* ((end (subed-subtitle-msecs-stop))
           (start (subed-subtitle-msecs-start))
           (remaining-words (split-string (buffer-substring (point) (subed-jump-to-subtitle-end))))
           (words (reverse (seq-filter (lambda (o)
                                         (and (<= (alist-get 'end o) end)
                                              (>= (alist-get 'start o) start)
                                              (not (string-match "^\n*$" (alist-get 'text o)))))
                                       my/caption-cache)))
           (offset 0)
           candidate done)
      (while (not done)
        (setq candidate (elt words (+ (1- (length remaining-words)) offset)))
        (cond
         ((and candidate (string-match (concat "\\<" (car remaining-words) "\\>") (alist-get 'text candidate)))
          (setq done t))
         ((> offset (length words)) (setq done t))
         ((> offset 0) (setq offset (- offset)))
         (t (setq offset (1+ (- offset))))))
      candidate)))

(defun my/caption-unwrap ()
  (interactive)
  (subed-jump-to-subtitle-text)
  (let ((limit (save-excursion (or (subed-jump-to-subtitle-end) (point)))))
    (while (re-search-forward "\n" limit t)
      (replace-match " "))))
(defun my/caption-split ()
  "Split the current subtitle based on word-level timing if available."
  (interactive)
  (save-excursion
    (let ((data (my/caption-look-up-word)))
      (prin1 data)
      (subed-split-subtitle (and data (- (alist-get 'start data) (subed-subtitle-msecs-start)))))))
(defun my/caption-split-and-merge-with-next ()
  (interactive)
  (my/caption-split)
  (my/caption-unwrap)
  (subed-forward-subtitle-id)
  (subed-merge-with-next)
  (my/caption-unwrap))
(defun my/caption-split-and-merge-with-previous ()
  (interactive)
  (my/caption-split)
  (subed-merge-with-previous)
  (my/caption-unwrap))
(use-package subed
  :if my/laptop-p
  :load-path "~/vendor/subed/subed"
  :bind
  (:map subed-mode-map
        ("M-'" . my/caption-split)
        ("M-," . my/caption-split-and-merge-with-previous)
        ("M-q" . my/caption-unwrap)
        ("M-." . my/caption-split-and-merge-with-next)))

That way, I can use the word-level timing information for most of the reformatting, but I can easily replay segments of the video if I’m unsure about a word that needs to be changed.

If I want to generate a VTT based on the caption data, breaking it at certain words, these functions help:

(defvar my/caption-breaks
  '("the" "this" "we" "we're" "I" "finally" "but" "and" "when")
  "List of words to try to break at.")
(defun my/caption-make-groups (list)
  (let (result
        current-item
        done
        (current-length 0)
        (limit 70)
        (lower-limit 30)
        (break-regexp (concat "\\<" (regexp-opt my/caption-breaks) "\\>")))
    (while list
      (cond
       ((null (car list)))
       ((string-match "^\n*$" (alist-get 'text (car list)))
        (push (cons '(text . " ") (car list)) current-item)
        (setq current-length (1+ current-length)))
       ((< (+ current-length (length (alist-get 'text (car list)))) limit)
        (setq current-item (cons (car list) current-item)
              current-length (+ current-length (length (alist-get 'text (car list))) 1)))
       (t (setq done nil)
          (while (not done)
            (cond
             ((< current-length lower-limit)
              (setq done t))
             ((and (string-match break-regexp (alist-get 'text (car current-item)))
                   (not (string-match break-regexp (alist-get 'text (cadr current-item)))))
              (setq current-length (- current-length (length (alist-get 'text (car current-item)))))
              (push (pop current-item) list)
              (setq done t))
             (t
              (setq current-length (- current-length (length (alist-get 'text (car current-item)))))
              (push (pop current-item) list))))
          (push nil list)
          (setq result (cons (reverse current-item) result) current-item nil current-length 0)))
      (setq list (cdr list)))
    (reverse result)))

(defun my/caption-format-as-subtitle (list &optional word-timing)
  "Turn a LIST of the form (((start . ms) (end . ms) (text . s)) ...) into VTT.
If WORD-TIMING is non-nil, include word-level timestamps."
  (format "%s --> %s\n%s\n\n"
          (subed-vtt--msecs-to-timestamp (alist-get 'start (car list)))
          (subed-vtt--msecs-to-timestamp (alist-get 'end (car (last list))))
          (s-trim (mapconcat (lambda (entry)
                               (if word-timing
                                   (format " <%s>%s"
                                           (subed-vtt--msecs-to-timestamp (alist-get 'start entry))
                                           (string-trim (alist-get 'text entry)))
                                 (alist-get 'text entry)))
                             list ""))))

(defun my/caption-to-vtt (&optional data)
  (interactive)
  (with-temp-file "captions.vtt"
    (insert "WEBVTT\n\n"
            (mapconcat
             (lambda (entry) (my/caption-format-as-subtitle entry))
             (my/caption-make-groups
              (or data (my/caption-fix-common-errors my/caption-cache)))
             ""))))
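
To show the shape of a single cue without needing subed loaded, here is a minimal self-contained sketch. The helper my/example-msecs-to-timestamp is a hypothetical stand-in for subed-vtt--msecs-to-timestamp and assumes integer milliseconds. my/example-format-cue joins words with spaces, whereas the code above relies on the whitespace entries added while grouping.

```elisp
(defun my/example-msecs-to-timestamp (msecs)
  "Format integer MSECS as an HH:MM:SS.mmm WebVTT timestamp.
Hypothetical stand-in for the subed function used above."
  (let* ((ms (% msecs 1000))
         ;; Integer division truncates, so nothing is rounded up.
         (total-seconds (/ msecs 1000))
         (seconds (% total-seconds 60))
         (minutes (% (/ total-seconds 60) 60))
         (hours (/ total-seconds 3600)))
    (format "%02d:%02d:%02d.%03d" hours minutes seconds ms)))

(defun my/example-format-cue (words)
  "Turn WORDS into one VTT cue.
WORDS is a list of alists of the form ((start . ms) (end . ms) (text . string))."
  (format "%s --> %s\n%s\n\n"
          (my/example-msecs-to-timestamp (alist-get 'start (car words)))
          (my/example-msecs-to-timestamp (alist-get 'end (car (last words))))
          (mapconcat (lambda (word) (alist-get 'text word)) words " ")))

(my/example-msecs-to-timestamp 3723456) ; => "01:02:03.456"

(my/example-format-cue
 '(((start . 0) (end . 480) (text . "I'm"))
   ((start . 480) (end . 900) (text . "editing"))))
;; => "00:00:00.000 --> 00:00:00.900\nI'm editing\n\n"
```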

2021-03-15 Emacs news

| emacs, emacs-news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Hacker News, planet.emacslife.com, YouTube, the Emacs NEWS file and emacs-devel.


Weekly review: Week ending March 12, 2021

| review, weekly
  • Emacs streaming:
    • I experimented with redirecting my automatic caption output into Emacs, using it to dictate a few sentences. It works well when I clearly enunciate.
    • I had problems streaming. OBS kept using >100% CPU. It might be the browser source.
  • Video editing and subtitles:
    • I wrote some code to use word-level timing from Google’s video transcripts when splitting subtitles.
    • I wrote some code to mute the video based on subtitles and to include a title image.
    • I figured out what was wrong with my subtitle splitting code. format-seconds has a bug. It rounds up when given a decimal.
    • I split the videos based on the XML provided by BigBlueButton.
    • I edited the subtitles for Mike’s demo of transient.
    • I worked on generating title clips. At first, I tried editing Inkscape SVGs, but that didn’t work so well with long titles. I switched to using LaTeX and learned how to use TikZ.
  • Other:
    • I sewed a simple watermelon nightgown/dress for A-. She liked it.
    • I ordered the Georgi chording keyboard. I want to see if I can get the hang of stenography for captioning, writing, and coding.
    • The arborists removed one of the trees in the backyard.

Time

| Category | The other week % | Last week % | Diff % | h/wk | Diff h/wk |
|---|---|---|---|---|---|
| Discretionary – Productive | 6.8 | 13.7 | 6.9 | 23.1 | 11.7 |
| Unpaid work | 3.7 | 4.7 | 1.0 | 7.9 | 1.7 |
| Personal | 4.7 | 5.3 | 0.6 | 8.9 | 1.0 |
| Sleep | 33.2 | 33.7 | 0.5 | 56.6 | 0.8 |
| Business | 1.7 | 1.4 | -0.4 | 2.3 | -0.6 |
| Discretionary – Play | 2.5 | 0.0 | -2.5 | 0.0 | -4.3 |
| A- | 47.3 | 41.2 | -6.1 | 69.3 | -10.2 |

I’ve been staying up to work on video processing.
