Categories: geek

RSS - Atom - Subscribe via email

2021-04-05 Emacs news

| emacs, emacs-news

Links from, r/orgmode, r/spacemacs, r/planetemacs, Hacker News,, YouTube, the Emacs NEWS file and emacs-devel.

View or add comments

Org Mode: Insert YouTube video with separate captions

| emacs

I’m playing around with some ideas for making it easier to post a video with its captions on a webpage or in an Org file so that it’s easier to skim or search.

This requires the youtube-dl command. I’m also learning how to use dash.el‘s threading macro, so you’ll need to install that as well if you want to run it.

(require 'dash)

(defun my/msecs-to-timestamp (msecs)
  "Convert MSECS to string in the format HH:MM:SS.MS."
  (concat (format-seconds "%02h:%02m:%02s" (/ msecs 1000))
          "." (format "%03d" (mod msecs 1000))))

(defun my/org-insert-youtube-video-with-transcript (url)
  (interactive "MURL: ")
  (let* ((id (if (string-match "v=\\([^&]+\\)" url) (match-string 1 url) url))
         (temp-file (make-temp-name "org-youtube-"))
         (temp-file-name (concat temp-file ".en.srv1"))
    (when (and (call-process "youtube-dl" nil nil nil
                             "--write-sub" "--write-auto-sub"  "--no-warnings" "--sub-lang" "en" "--skip-download" "--sub-format" "srv1"
                             "-o" temp-file
                             (format "" id))
               (file-exists-p temp-file-name))
       (format "#+begin_export html
<iframe width=\"560\" height=\"315\" src=\"\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture\" allowfullscreen></iframe>\n#+end_export\n" id)
       (mapconcat (lambda (o)
                    (format "| [[][%s]] | %s |\n"
                            (dom-attr o 'start)
                            (my/msecs-to-timestamp (* 1000 (string-to-number (dom-attr o 'start))))
                            (->> (dom-text o)
                                 (replace-regexp-in-string "[ \n]+" " ")
                                 (replace-regexp-in-string "&#39;" "'")
                                 (replace-regexp-in-string "&quot;" "\""))))
                  (dom-by-tag (xml-parse-file temp-file-name) 'text)
      (delete-file temp-file-name))))

It makes an embedded Youtube video and a table with captions below it. The Org file doesn’t look too bad, either.


I decided to stick to standard Org syntax so that I can read it in Emacs too. With the current implementation, clicking on the timestamps jumps to that position in the video, but on the Youtube website. I haven’t coded anything fancy like keeping the embedded video at a fixed position, controlling it from the clicks, or highlighting the current position. It’s a start, though!

Here’s the output of running it with my talk from the last EmacsConf.

00:00:00.000 I’m Sacha Chua, and welcome to EmacsConf 2020.
00:00:04.000 To kick things off, here are ten cool things
00:00:07.000 that people have been working on
00:00:08.000 since the conference last year.
00:00:10.000 If you want to follow the links
00:00:11.000 or if you’d like to add something I’ve missed,
00:00:14.000 add them to the collaborative pad
00:00:16.000 if you’re watching this live
00:00:17.000 or check out the EmacsConf wiki page for this talk.

… (omitted for brevity)

This is part of my Emacs configuration.
View or add comments

2021-03-29 Emacs news

| emacs, emacs-news

Links from, r/orgmode, r/spacemacs, r/planetemacs, Hacker News,, YouTube, the Emacs NEWS file and emacs-devel.

View or add comments

Add a note to the bottom of blog posts exported from my config file

Posted: - Modified: | emacs, org

Update: 2021-04-18: Tweaked the code so that I could add it to the main org-export-filter-body-functions list now that I'm using Eleventy and ox-11ty.el instead of Wordpress and org2blog.

I occasionally post snippets from my Emacs configuration file, drafting the notes directly in my literate config and posting them via org2blog. I figured it might be a good idea to include a link to my config at the end of the posts, but I didn't want to scatter redundant links in my config file itself. Wouldn't it be cool if the link could be automatically added whenever I use org2blog to post a subtree from my config file? I think the code below accomplishes that.

(defun my/org-export-filter-body-add-emacs-configuration-link (string backend info)
  (when (and (plist-get info :input-file) (string-match "\\.emacs\\.d/Sacha\\.org" (plist-get info :input-file)))
    (concat string
            (let ((id (org-entry-get-with-inheritance "CUSTOM_ID")))
               "\n<div class=\"note\">This is part of my <a href=\"\">Emacs configuration.</a></div>"
               (if id (concat "#" id) ""))))))

(use-package org
  (add-to-list 'org-export-filter-body-functions #'my/org-export-filter-body-add-emacs-configuration-link))
This is part of my Emacs configuration.
View or add comments

2021-03-22 Emacs news

| emacs, emacs-news

Links from, r/orgmode, r/spacemacs, r/planetemacs, Hacker News,, YouTube, the Emacs NEWS file and emacs-devel.

View or add comments

Using word-level timing information when editing subtitles or captions in Emacs

Posted: - Modified: | emacs

I like to split captions at logical points, such as at the end of a phrase or sentence. At first, I used subed.el to play the video for the caption, pausing it at the appropriate point and then calling subed-split-subtitle to split at the playback position. Then I modified subed-split-subtitle to split at the video position that’s proportional to the text position, so that it’s roughly in the right spot even if I’m not currently listening. That got me most of the way to being able to quickly edit subtitles.

It turns out that word-level timing is actually available from YouTube if I download the autogenerated SRV2 file using youtube-dl, which I can do with the following function:

(defun my/caption-download-srv2 (id)
  (interactive "MID: ")
  (when (string-match "v=\\([^&]+\\)" id) (setq id (match-string 1 id)))
  (call-process "youtube-dl" nil nil nil "--write-auto-sub" "--sub-lang" "en" "--skip-download" "--sub-format" "srv2"
                (concat "" id))
  (my/caption-load-word-data (my/latest-file "." "\\.srv2\\'")))

I started parsing JSON files, but SRV2 seemed to be more reliably avaliable, so here are the parsing functions for both. I also change common recognition errors along the way, using the my/subed-common-edits variable defined in my config for subtitles. To change those ones in the VTT file I’m editing, I use my/subed-fix-common-errors, also defined elsewhere.

(defvar-local my/caption-cache nil "Word-level timing in the form ((start . ms) (end . ms) (text . ms))")
(defun my/caption-json-time-to-ms (json)
  (+ (* 1000 (string-to-number (alist-get 'seconds json)))
     (/ (alist-get 'nanos json) 1000000)))

(defun my/caption-extract-words-from-json3 ()
  (let* ((data (progn (goto-char (point-min)) (json-read)))
         (json3-p (alist-get 'events data))
         (reversed (reverse
                    (or (alist-get 'events data)
                        (cl-loop for seg in (car (alist-get 'results data))
                                 nconc (alist-get 'words (car (alist-get 'alternatives seg)))))))
         (last-event (seq-first reversed))
         (last-ms (if json3-p
                      (+ (alist-get 'tStartMs last-event)
                         (alist-get 'dDurationMs last-event)))))
     (cl-loop for e across reversed append
              (if json3-p
                   (lambda (seg)
                     (let ((rec
                            `((start ,(+ (alist-get 'tStartMs e)
                                         (or (alist-get 'tOffsetMs seg) 0)))
                              (end ,(min last-ms
                                         (+ (alist-get 'tStartMs e)
                                            (or (alist-get 'dDurationMs e) 0))))
                              (text ,(alist-get 'utf8 seg)))))
                       (setq last-ms (alist-get 'start rec))
                   (reverse (alist-get 'segs e)))
                `((start ,(my/caption-json-time-to-ms (alist-get 'startTime seg)))
                  (end ,(my/caption-json-time-to-ms (alist-get 'endTime seg)))
                  (text ,(alist-get 'word seg))))))))

(defun my/caption-extract-words-from-srv2 ()
  (let* ((data (xml-parse-region))
         (text-elements (reverse (dom-by-tag data 'text)))
         (last-start (+ (string-to-number
                         (alist-get 't (xml-node-attributes (car text-elements))))
                        (string-to-number (alist-get 'd (xml-node-attributes (car text-elements)))))))
     (mapcar #'(lambda (element)
                 (let ((rec (list (cons 'start (string-to-number (alist-get 't (xml-node-attributes element))))
                                  (cons 'end last-start)
                                  (cons 'text (car (xml-node-children element))))))
                   (setq last-start (alist-get 'start rec))

(defun my/caption-fix-common-errors (data)
  (mapc (lambda (o)
          (mapc (lambda (e)
                  (when (string-match (concat "\\<" (car e) "\\>") (alist-get 'text o))
                    (map-put! o 'text (replace-match (cadr e) t t (alist-get 'text o)))))

(defun my/caption-load-word-data (file)
  "Load word-level timing from FILE."
  (interactive "fFile: ")
  (let (data)
    (with-current-buffer (find-file-noselect file)
       ((string-match "\\.json" file)
        (setq data (my/caption-extract-words-from-json3)))
       ((string-match "\\.srv2\\'" file)
        (setq data (my/caption-extract-words-from-srv2)))
       (t (error "Unknown format."))))
    (setq-local my/caption-cache
                (mapcar (lambda (entry)
                          (setf (alist-get 'text entry)
                                (replace-regexp-in-string "&#39;" "'" (alist-get 'text entry)))
                        (my/caption-fix-common-errors data)))))

Assuming I start editing from the beginning of the file, then the part of the captions file after point is mostly unedited. That means I can match the remainder of the current caption with the word-level timing to try to figure out the time to use when splitting the subtitle, falling back to the proportional method if the data is not available.

(defun my/caption-look-up-word ()
    (let* ((end (subed-subtitle-msecs-stop))
           (start (subed-subtitle-msecs-start))
           (remaining-words (split-string (buffer-substring (point) (subed-jump-to-subtitle-end))))
           (words (reverse (seq-filter (lambda (o)
                                         (and (<= (alist-get 'end o) end)
                                              (>= (alist-get 'start o) start)
                                              (not (string-match "^\n*$" (alist-get 'text o)))))
           (offset 0)
           candidate done)
      (while (not done)
        (setq candidate (elt words (+ (1- (length remaining-words)) offset)))
         ((and candidate (string-match (concat "\\<" (car remaining-words) "\\>") (alist-get 'text candidate)))
          (setq done t))
         ((> offset (length words)) (setq done t))
         ((> offset 0) (setq offset (- offset)))
         (t (setq offset (1+ (- offset))))))

(defun my/caption-unwrap ()
  (let ((limit (save-excursion (or (subed-jump-to-subtitle-end) (point)))))
         (while (re-search-forward "\n" limit t)
           (replace-match " "))))
(defun my/caption-split ()
  "Split the current subtitle based on word-level timing if available."
    (let ((data (my/caption-look-up-word)))
      (prin1 data)
      (subed-split-subtitle (and data (- (alist-get 'start data) (subed-subtitle-msecs-start)))))))
(defun my/caption-split-and-merge-with-next ()
(defun my/caption-split-and-merge-with-previous ()
(use-package subed
  :if my/laptop-p
  :load-path "~/vendor/subed/subed"
  (:map subed-mode-map
        ("M-'" . my/caption-split)
        ("M-," . my/caption-split-and-merge-with-previous)
        ("M-q" . my/caption-unwrap)
        ("M-." . my/caption-split-and-merge-with-next)))

That way, I can use the word-level timing information for most of the reformatting, but I can easily replay segments of the video if I’m unsure about a word that needs to be changed.

If I want to generate a VTT based on the caption data, breaking it at certain words, these functions help:

(defvar my/caption-breaks
  '("the" "this" "we" "we're" "I" "finally" "but" "and" "when")
  "List of words to try to break at.")
(defun my/caption-make-groups (list)
  (let (result
        (current-length 0)
        (limit 70)
        (lower-limit 30)
        (break-regexp (concat "\\<" (regexp-opt my/caption-breaks) "\\>")))
    (while list
       ((null (car list)))
       ((string-match "^\n*$" (alist-get 'text (car list)))
        (push (cons '(text . " ") (car list)) current-item)
        (setq current-length (1+ current-length)))
       ((< (+ current-length (length (alist-get 'text (car list)))) limit)
        (setq current-item (cons (car list) current-item)
              current-length (+ current-length (length (alist-get 'text (car list))) 1)))
       (t (setq done nil)
          (while (not done)
           ((< current-length lower-limit)
            (setq done t))
           ((and (string-match break-regexp (alist-get 'text (car current-item)))
                 (not (string-match break-regexp (alist-get 'text (cadr current-item)))))
            (setq current-length (- current-length (length (alist-get 'text (car current-item)))))
            (push (pop current-item) list)
            (setq done t))
            (setq current-length (- current-length (length (alist-get 'text (car current-item)))))
            (push (pop current-item) list))))
          (push nil list)
          (setq result (cons (reverse current-item) result) current-item nil current-length 0)))
      (setq list (cdr list)))
    (reverse result)))

(defun my/caption-format-as-subtitle (list &optional word-timing)
  "Turn a LIST of the form (((start . ms) (end . ms) (text . s)) ...) into VTT.
If WORD-TIMING is non-nil, include word-level timestamps."
  (format "%s --> %s\n%s\n\n"
          (subed-vtt--msecs-to-timestamp (alist-get 'start (car list)))
          (subed-vtt--msecs-to-timestamp (alist-get 'end (car (last list))))
          (s-trim (mapconcat (lambda (entry)
                               (if word-timing
                                   (format " <%s>%s"
                                           (subed-vtt--msecs-to-timestamp (alist-get 'start entry))
                                           (string-trim (alist-get 'text entry)))
                                 (alist-get 'text entry)))
                             list ""))))

(defun my/caption-to-vtt (&optional data)
  (with-temp-file "captions.vtt"
    (insert "WEBVTT\n\n"
             (lambda (entry) (my/caption-format-as-subtitle entry))
              (or data (my/caption-fix-common-errors my/caption-cache)))
View or add comments

2021-03-15 Emacs news

| emacs, emacs-news

Links from, r/orgmode, r/spacemacs, r/planetemacs, Hacker News,, YouTube, the Emacs NEWS file and emacs-devel.

View or add comments