Categories: geek » emacs » org

View topic page - RSS - Atom - Subscribe via email

Create a Google Calendar event from an Org Mode timestamp

| org, emacs

Time zones are hard, so I let calendaring systems take care of the conversion and confirmation. I've been using Google Calendar because it synchronizes with my phone and people know what to do with the event invite. Org Mode has iCalendar export, but I sometimes have a hard time getting .ics files into Google Calendar on my laptop, so I might as well just create the calendar entry in Google Calendar directly. Well. Emacs is a lot more fun than Google Calendar, so I'd rather create the calendar entry from Emacs and put it into Google Calendar.

This function lets me start from a timestamp like [2026-04-24 Fri 10:30] (inserted with C-u C-c C-!, or org-timestamp-inactive) and create an event based on a template.

(defvar sacha-time-zone "America/Toronto" "Full name of time zone.")

;;;###autoload
(defun sacha-emacs-chat-schedule (&optional time)
  "Create a Google Calendar invite based on TIME or the Org timestamp at point."
  (interactive (list (sacha-org-time-at-point)))
  (browse-url
   (format
    "https://calendar.google.com/calendar/render?action=TEMPLATE&text=%s&details=%s&dates=%s&ctz=%s"
    (url-hexify-string sacha-emacs-chat-title)
    (url-hexify-string sacha-emacs-chat-description)
    (format-time-string
     "%Y%m%dT%H%M%S" time)
    sacha-time-zone)))

(defvar sacha-emacs-chat-title "Emacs Chat" "Title of calendar entry.")
(defvar sacha-emacs-chat-description
  "All right, let's try this! =) See the calendar invite for the Google Meet link.

Objective: Share cool stuff about Emacs workflows that's not obvious from reading configs, and have fun chatting about Emacs

Some ideas for things to talk about:
- Which keyboard shortcuts or combinations of functions work really well for you?
- What's something you love about your setup?
- What are you looking forward to tweaking next?

Let me know if you want to do it on stream (more people can ask questions) or off stream (we can clean up the video in case there are hiccups). Also, please feel free to send me links to things you'd like me to read ahead of time, like your config!"
  "Description.")

It uses this function to convert the timestamp at point:

sacha-org-time-at-point: Return Emacs time object for timestamp at point.
(defun sacha-org-time-at-point ()
  "Return Emacs time object for timestamp at point."
  (org-timestamp-to-time (org-timestamp-from-string (org-element-property :raw-value (org-element-context)))))

This is part of my Emacs configuration.
View Org source for this post

Make chapter markers and video time hyperlinks easier to note while I livestream

| org, emacs

I want to make it easier to add chapter markers to my YouTube video descriptions and hyperlinks to specific times in videos in my blog posts.

This is part of my Emacs configuration.
View Org source for this post

Org Mode: JS for translating times to people's local timezones

| org, emacs, js

I want to get back into the swing of doing Emacs Chats again, which means scheduling, which means timezones. Let's see first if anyone happens to match up with the Thursday timeslots (10:30 or 12:45) that I'd like to use for Emacs-y video things, but I might be able to shuffle things around if needed.

I want something that can translate times into people's local timezones. I use Org Mode timestamps a lot because they're so easy to insert with C-u C-c ! (org-timestamp-inactive), which inserts a timestamp like this:

By default, the Org HTML export for it does not include the timezone offset. That's easily fixed by adding %z to the time specifier, like this:

(setq org-html-datetime-formats '("%F" . "%FT%T%z"))

Now a little bit of Javascript code makes it clickable and lets us toggle a translated time. I put the time afterwards so that people can verify it visually. I never quite trust myself when it comes to timezone translations.

function translateTime(event) {
  if (event.target.getAttribute('datetime')?.match(/[0-9][0-9][0-9][0-9]$/)) {
    if (event.target.querySelector('.translated')) {
      event.target.querySelectorAll('.translated').forEach((o) => o.remove());
    } else {
      const span = document.createElement('span');
      span.classList.add('translated');
      span.textContent = ' → ' + (new Date(event.target.getAttribute('datetime'))).toLocaleString(undefined, {
        month: 'short',  
        day: 'numeric',  
        hour: 'numeric', 
        minute: '2-digit',
        timeZoneName: 'short'
      });
      event.target.appendChild(span);
    }
  }
}
function clickForLocalTime() {
  document.querySelectorAll('time').forEach((o) => {
    if (o.getAttribute('datetime')?.match(/[0-9][0-9][0-9][0-9]$/)) {
      o.addEventListener('click', translateTime);
      o.classList.add('clickable');
    }
  });
}

And some CSS to make it more obvious that it's now clickable:

.clickable {
    cursor: pointer;
    text-decoration: underline dotted;
}

Let's see if this is useful.

Someday, it would probably be handy to have a button that translates all the timestamps in a table, but this is a good starting point.

View Org source for this post

Org Mode: Tangle Emacs config snippets to different files and add boilerplate

| emacs, org

I want to organize the functions in my Emacs configuration so that they are easier for me to test and so that other people can load them from my repository. Instead of copying multiple code blogs from my blog posts or my exported Emacs configuration, it would be great if people could just include a file from the repository. I don't think people copy that much from my config, but it might still be worth making it easier for people to borrow interesting functions. It would be great to have libraries of functions that people can evaluate without worrying about side effects, and then they can copy or write a shorter piece of code to use those functions.

In Prot's configuration (The custom libraries of my configuration), he includes each library as in full, in a single code block, with the boilerplate description, keywords, and (provide '...) that make them more like other libraries in Emacs.

I'm not quite sure my little functions are at that point yet. For now, I like the way that the functions are embedded in the blog posts and notes that explain them, and the org-babel :comments argument can insert links back to the sections of my configuration that I can open with org-open-at-point-global or org-babel-tangle-jump-to-org.

Thinking through the options...

Org tangles blocks in order, so if I want boilerplate or if I want to add require statements, I need to have a section near the beginning of my config that sets those up for each file. Noweb references might help me with common text like the license. Likewise, if I want a (provide ...) line at the end of each file, I need a section near the end of the file.

If I want to specify things out of sequence, I could use Noweb. By setting :noweb-ref some-id :tangle no on the blocks I want to collect later, I can then tangle them in the middle of the boilerplate. Here's a brief demo:

#+begin_src emacs-lisp :noweb yes :tangle lisp/sacha-eshell.el :comments no
;; -*- lexical-binding: t; -*-
<<sacha-eshell>>
(provide 'sacha-eshell)
#+end_src

However, I'll lose the comment links that let me jump back to the part of the Org file with the original source block. This means that if I use find-function to jump to the definition of a function and then I want to find the outline section related to it, I have to use a function that checks if this might be my custom code and then looks in my config for "defun …". It's a little less generic.

I wonder if I can combine multiple targets with some code that knows what it's being tangled to, so it can write slightly different text. org-babel-tangle-single-block currently calculates the result once and then adds it to the list for each filename, so that doesn't seem likely.

Alternatively, maybe I can use noweb or my own tangling function and add the link comments from org-babel-tangle-comments.

Aha, I can fiddle with org-babel-post-tangle-hook to insert the boilerplate after the blocks have been written. Then I can add the lexical-binding: t cookie and the structure that makes it look more like the other libraries people define and use. It's always nice when I can get away with a small change that uses an existing hook. For good measure, let's even include a list of links to the sections of my config that affect that file.

(defvar sacha-dotemacs-url "https://sachachua.com/dotemacs/")

;;;###autoload
(defun sacha-dotemacs-link-for-section-at-point (&optional combined)
  "Return the link for the current section."
  (let* ((custom-id (org-entry-get-with-inheritance "CUSTOM_ID"))
         (title (org-entry-get (point) "ITEM"))
         (url (if custom-id
                  (concat "dotemacs:" custom-id)
                (concat sacha-dotemacs-url ":-:text=" (url-hexify-string title)))))
    (if combined
        (org-link-make-string
         url
         title)
      (cons url title))))

(eval-and-compile
  (require 'org-core nil t)
  (require 'org-macs nil t)
  (require 'org-src nil t))
(declare-function 'org-babel-tangle--compute-targets "ob-tangle")
(defun sacha-org-collect-links-for-tangled-files ()
  "Return a list of ((filename (link link link link)) ...)."
  (let* ((file (buffer-file-name))
         results)
    (org-babel-map-src-blocks (buffer-file-name)
      (let* ((info (org-babel-get-src-block-info))
             (link (sacha-dotemacs-link-for-section-at-point)))
        (mapc
         (lambda (target)
           (let ((list (assoc target results #'string=)))
             (if list
                 (cl-pushnew link (cdr list) :test 'equal)
               (push (list target link) results))))
         (org-babel-tangle--compute-targets file info))))
    ;; Put it back in source order
    (nreverse
     (mapcar (lambda (o)
               (cons (car o)
                     (nreverse (cdr o))))
             results))))
(defvar sacha-emacs-config-module-links nil "Cache for links from tangled files.")

;;;###autoload
(defun sacha-emacs-config-update-module-info ()
  "Update the list of links."
  (interactive)
  (setq sacha-emacs-config-module-links
        (seq-filter
         (lambda (o)
           (string-match "sacha-" (car o)))
         (sacha-org-collect-links-for-tangled-files)))
  (setq sacha-emacs-config-modules-info
        (mapcar (lambda (group)
                  `(,(file-name-base (car group))
                    (commentary
                     .
                     ,(replace-regexp-in-string
                       "^"
                       ";; "
                       (concat
                        "Related Emacs config sections:\n\n"
                        (org-export-string-as
                         (mapconcat
                          (lambda (link)
                            (concat "- " (cdr link) "\\\\\n  " (org-link-make-string (car link)) "\n"))
                          (cdr group)
                          "\n")
                         'ascii
                         t))))))
                sacha-emacs-config-module-links)))

;;;###autoload
(defun sacha-emacs-config-prepare-to-tangle ()
  "Update module info if tangling my config."
  (when (string-match "Sacha.org" (buffer-file-name))
    (sacha-emacs-config-update-module-info)))

Let's set up the functions for tangling the boilerplate.

(defvar sacha-emacs-config-modules-dir "~/sync/emacs/lisp/")
(defvar sacha-emacs-config-modules-info nil "Alist of module info.")
(defvar sacha-emacs-config-url "https://sachachua.com/dotemacs")

;;;###autoload
(defun sacha-org-babel-post-tangle-insert-boilerplate-for-sacha-lisp ()
  (when (file-in-directory-p (buffer-file-name) sacha-emacs-config-modules-dir)
    (goto-char (point-min))
    (let ((base (file-name-base (buffer-file-name))))
      (insert (format ";;; %s.el --- %s -*- lexical-binding: t -*-

;; Author: %s <%s>
;; URL: %s

;;; License:
;;
;; This file is not part of GNU Emacs.
;;
;; This is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; either version 3, or (at your option)
;; any later version.
;;
;; This is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs; see the file COPYING.  If not, write to the
;; Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
;; Boston, MA 02110-1301, USA.

;;; Commentary:
;;
%s
;;; Code:

\n\n"
                      base
                      (or
                       (assoc-default 'description
                                      (assoc-default base sacha-emacs-config-modules-info #'string=))
                       "")
                      user-full-name
                      user-mail-address
                      sacha-emacs-config-url
                      (or
                       (assoc-default 'commentary
                                      (assoc-default base sacha-emacs-config-modules-info #'string=))
                       "")))
      (goto-char (point-max))
      (insert (format "\n(provide '%s)\n;;; %s.el ends here\n"
                      base
                      base))
      (save-buffer))))
(setq sacha-emacs-config-url "https://sachachua.com/dotemacs")
(with-eval-after-load 'org
  (add-hook 'org-babel-pre-tangle-hook #'sacha-emacs-config-prepare-to-tangle)
  (add-hook 'org-babel-post-tangle-hook #'sacha-org-babel-post-tangle-insert-boilerplate-for-sacha-lisp))

You can see the results at .emacs.d/lisp. For example, the function definitions in this post are at lisp/sacha-emacs.el.

This is part of my Emacs configuration.
View Org source for this post

YE11: Fix find-function for Emacs Lisp from org-babel or scratch

| org, emacs, elisp, stream, yay-emacs

Watch on Internet Archive, watch/comment on YouTube, download captions, or email me

Where can you define an Emacs Lisp function so that you can use find-function to jump to it again later?

  • A: In an indirect buffer from Org Mode source block with your favorite eval function like eval-defun
    • C-c ' (org-edit-special) inside the block; execute the defun with C-M-x (eval-defun), C-x C-e (eval-last-sexp), or eval-buffer.

          (defun my-test-1 () (message "Hello"))
      
  • B: In an Org Mode file by executing the block with C-c C-c

      (defun my-test-2 () (message "Hello"))
    
  • C: In a .el file

    file:///tmp/test-search-function.el : execute the defun with C-M-x (eval-defun), C-x C-e (eval-last-sexp), or eval-buffer

  • D: In a scratch buffer, other temporary buffer, or really any buffer thanks to eval-last-sexp

    (defun my-test-4 () (message "Hello"))

Only option C works - it's gotta be in an .el file for find-function to find it. But I love jumping to function definitions using find-function or lispy-goto-symbol (which is bound to M-. if you use lispy and set up lispy-mode) so that I can look at or change how something works. It can be a little frustrating when I try to jump to a definition and it says, "Don't know where blahblahblah is defined." I just defined it five minutes ago! It's there in one of my other buffers, don't make me look for it myself. Probably this will get fixed in Emacs core someday, but no worries, we can work around it today with a little bit of advice.

I did some digging around in the source code. Turns out that symbol-file can't find the function definition in the load-history variable if you're not in a .el file, so find-function-search-for-symbol gets called with nil for the library, which causes the error. (emacs:subr.el)

I wrote some advice that searches in any open emacs-lisp-mode buffers or in a list of other files, like my Emacs configuration. This is how I activate it:

(setq sacha-elisp-find-function-search-extra '("~/sync/emacs/Sacha.org"))
(advice-add 'find-function-search-for-symbol :around #'sacha-elisp-find-function-search-for-symbol)

Now I should be able to jump to all those functions wherever they're defined.

(my-test-1)
(my-test-2)
(my-test-3)
(my-test-4)

Note that by default, M-. in emacs-lisp-mode uses xref-find-definitions, which seems to really want files. I haven't figured out a good workaround for that yet, but lispy-mode makes M-. work and gives me a bunch of other great shortcuts, so I'd recommend checking that out.

Here's the source code for the find function thing:

(defvar sacha-elisp-find-function-search-extra
  nil
  "List of filenames to search for functions.")

;;;###autoload
(defun sacha-elisp-find-function-search-for-symbol (fn symbol type library &rest _)
  "Find SYMBOL with TYPE in Emacs Lisp buffers or `sacha-find-function-search-extra'.
Prioritize buffers that do not have associated files, such as Org Src
buffers or *scratch*. Note that the fallback search uses \"^([^ )]+\" so that
it isn't confused by preceding forms.

If LIBRARY is specified, fall back to FN.

Activate this with:

(advice-add 'find-function-search-for-symbol
 :around #'sacha-org-babel-find-function-search-for-symbol-in-dotemacs)"
  (if (null library)
      ;; Could not find library; search my-dotemacs-file just in case
      (progn
        (while (and (symbolp symbol) (get symbol 'definition-name))
          (setq symbol (get symbol 'definition-name)))
        (catch 'found
          (mapc
           (lambda (buffer-or-file)
             (with-current-buffer (if (bufferp buffer-or-file)
                                      buffer-or-file
                                    (find-file-noselect buffer-or-file))
               (let* ((regexp-symbol
                       (or (and (symbolp symbol)
                                (alist-get type (get symbol 'find-function-type-alist)))
                           (alist-get type find-function-regexp-alist)))
                      (form-matcher-factory
                       (and (functionp (cdr-safe regexp-symbol))
                            (cdr regexp-symbol)))
                      (regexp-symbol (if form-matcher-factory
                                         (car regexp-symbol)
                                       regexp-symbol))

                      (case-fold-search)
                      (regexp (if (functionp regexp-symbol) regexp-symbol
                                (format (symbol-value regexp-symbol)
                                        ;; Entry for ` (backquote) macro in loaddefs.el,
                                        ;; (defalias (quote \`)..., has a \ but
                                        ;; (symbol-name symbol) doesn't.  Add an
                                        ;; optional \ to catch this.
                                        (concat "\\\\?"
                                                (regexp-quote (symbol-name symbol)))))))
                 (save-restriction
                   (widen)
                   (with-syntax-table emacs-lisp-mode-syntax-table
                     (goto-char (point-min))
                     (if (if (functionp regexp)
                             (funcall regexp symbol)
                           (or (re-search-forward regexp nil t)
                               ;; `regexp' matches definitions using known forms like
                               ;; `defun', or `defvar'.  But some functions/variables
                               ;; are defined using special macros (or functions), so
                               ;; if `regexp' can't find the definition, we look for
                               ;; something of the form "(SOMETHING <symbol> ...)".
                               ;; This fails to distinguish function definitions from
                               ;; variable declarations (or even uses thereof), but is
                               ;; a good pragmatic fallback.
                               (re-search-forward
                                (concat "^([^ )]+" find-function-space-re "['(]?"
                                        (regexp-quote (symbol-name symbol))
                                        "\\_>")
                                nil t)))
                         (progn
                           (beginning-of-line)
                           (throw 'found
                                   (cons (current-buffer) (point))))
                       (when-let* ((find-expanded
                                    (when (trusted-content-p)
                                      (find-function--search-by-expanding-macros
                                       (current-buffer) symbol type
                                       form-matcher-factory))))
                         (throw 'found
                                 (cons (current-buffer)
                                       find-expanded)))))))))
           (delq nil
                 (append
                  (sort
                   (match-buffers '(derived-mode . emacs-lisp-mode))
                   :key (lambda (o) (or (buffer-file-name o) "")))
                  sacha-elisp-find-function-search-extra)))))
    (funcall fn symbol type library)))

I even figured out how to write tests for it:

(ert-deftest sacha-elisp--find-function-search-for-symbol--in-buffer ()
  (let ((sym (make-temp-name "--test-fn"))
        buffer)
    (unwind-protect
        (with-temp-buffer
          (emacs-lisp-mode)
          (insert (format ";; Comment\n(defun %s () (message \"Hello\"))" sym))
          (eval-last-sexp nil)
          (setq buffer (current-buffer))
          (with-temp-buffer
            (let ((pos (sacha-elisp-find-function-search-for-symbol nil (intern sym) nil nil)))
              (should (equal (car pos) buffer))
              (should (equal (cdr pos) 12)))))
      (fmakunbound (intern sym)))))

(ert-deftest sacha-elisp--find-function-search-for-symbol--in-file ()
  (let* ((sym (make-temp-name "--test-fn"))
         (temp-file (make-temp-file
                     "test-" nil ".org"
                     (format
                      "#+begin_src emacs-lisp\n;; Comment\n(defun %s () (message \"Hello\"))\n#+end_src"
                      sym)))
         (sacha-elisp-find-function-search-extra (list temp-file))
         buffer)
    (unwind-protect
        (with-temp-buffer
          (let ((pos (sacha-elisp-find-function-search-for-symbol nil (intern sym) nil nil)))
            (should (equal (buffer-file-name (car pos)) temp-file))
            (should (equal (cdr pos) 35))))
      (delete-file temp-file))))
This is part of my Emacs configuration.
View Org source for this post

Extract PDF highlights into an Org file with Python

Posted: - Modified: | org

: Updated screenshot. I finished reading all the pages and ended up with 202 highlights, so I'm going to have fun updating my config with those notes!

I've been trying to find a good workflow for highlighting interesting parts of PDFs, and then getting that into my notes as images and text in Emacs. I think I've finally figured out something that works well for me that feels natural (marking things.

I wanted to read through Prot's Emacs configuration while the kiddo played with her friends at the playground. I saved the web page as a PDF and exported it to Noteful. The PDF has 481 pages. Lots to explore! It was a bit chilly, so I had my gloves on. I used a capacitative stylus in my left hand to scroll the document and an Apple Pencil in my right hand to highlight the parts I wanted to add to my config or explore further.

Back at my computer, I used pip install pymupdf to install the PyMuPDF library. I poked around the PDF in the Python shell to see what it had, and I noticed that the highlights were drawings with fill 0.5. So I wrote this Python script to extract the images and text near that rectangle:

import fitz
import pathlib
import sys
import os

BUFFER = 5

def extract_highlights(filename, output_dir):
    doc = fitz.open(filename)
    s = "* Excerpts\n"
    for page_num, page in enumerate(doc):
        page_width = page.rect.width
        page_text = ""
        for draw_num, d in enumerate(page.get_drawings()):
            if d['fill_opacity'] == 0.5:
               rect = d['rect']
               clip_rect = fitz.Rect(0, rect.y0 - BUFFER, page_width, rect.y1 + BUFFER)
               img = page.get_pixmap(clip=clip_rect)
               img_filename = "page-%03d-%d.png" % (page_num + 1, draw_num + 1)
               img.save(os.path.join(output_dir, img_filename))
               text = page.get_text(clip=clip_rect)
               page_text = (page_text
                            + "[[file:%s]]\n#+begin_quote\n[[pdf:%s::%d][p%d]]: %s\n#+end_quote\n\n"
                            % (img_filename,
                               os.path.join("..", filename),
                               page_num + 1,
                               page_num + 1, text))
        if page_text != "":
            s += "** Page %d\n%s" % (page_num + 1, page_text)
    pathlib.Path(os.path.join(output_dir, "index.org")).write_bytes(s.encode())

if __name__ == '__main__':
    if len(sys.argv) < 3:
        print("Usage: list-highlights.py pdf-filename output-dir")
    else:
        extract_highlights(sys.argv[1], sys.argv[2])

After I opened the resulting index.org file, I used C-u C-u C-c C-x C-v (org-link-preview) to make the images appear inline throughout the whole buffer. There's a little extra text from the PDF extraction, but it's a great starting point for cleaning up or copying. The org-pdftools package lets me link to specific pages in PDFs, neat!

2026-04-06-22-55-35.png
Figure 1: Screenshot of Org Mode file with link previews

To set up org-pdftools, I used:

(use-package org-pdftools
  :hook (org-mode . org-pdftools-setup-link))

Here's my quick livestream about the script with a slightly older version that had an off-by-one bug in the page numbers and didn't have the fancy PDF links. =)

View Org source for this post

Categorizing Emacs News items by voice in Org Mode

| speech, speech-recognition, emacs, org

I'm having fun exploring which things might actually be easier to do by voice than by typing. For example, after I wrote some code to expand yasnippets by voice, I realized that it was easier to:

  1. press my shortcut,
  2. say "okay, define interactive function",
  3. and then press my shortcut again,

than to:

  1. mentally say it,
  2. get the first initials,
  3. type in "dfi",
  4. and press Tab to expand.

Another area where I do this kind of mental translation for keyboard shortcuts is when I categorize dozens of Emacs-related links each week for Emacs News. I used to do this by hand. Then I wrote a function to try to guess the category based on regular expressions (my-emacs-news-guess-category in emacs-news/index.org, which is large). Then I set up a menu that lets me press numbers corresponding to the most frequent categories and use tab completion for the rest. 1 is Emacs Lisp, 2 is Emacs development, 3 is Emacs configuration, 4 is appearance, 5 is navigation, and so on. It's not very efficient, but some of it has at least gotten into muscle memory, which is also part of why it's hard to change the mapping. I don't come across that many links for Emacs development or Spacemacs, and I could probably change them to something else, but… Anyway.

2026-03-23_20-38-33.png
Figure 1: Screenshot of my menu for categorizing links

I wanted to see if I could categorize links by voice instead. I might not always be able to count on being able to type a lot, and it's always fun to experiment with other modes of input. Here's a demonstration showing how Emacs can automatically open the URLs, wait for voice input, and categorize the links using a reasonably close match. The *Messages* buffer displays the recognized output to help with debugging.

Screencast with audio: categorizing links by voice

This is how it works:

  1. It starts an ffmpeg recording process.
  2. It starts Silero voice activity detection.
  3. When it detects that speech has ended, it use curl to send the WAV to an OpenAI-compatible server (in my case, Speaches with the Systran/faster-whisper-base.en model) for transcription, along with a prompt to try to influence the recognition.
  4. It compares the result with the candidates using string-distance for an approximate match. It calls the code to move the current item to the right category, creating the category if needed.

Since this doesn't always result in the right match, I added an Undo command. I also have a Delete command for removing the current item, Scroll Up and Scroll Down, and a way to quit.

Initial thoughts

I used it to categorize lots of links in this week's Emacs News, and I think it's promising. I loved the way my hands didn't have to hover over the number keys or move between those and the characters. Using voice activity detection meant that I could just keep dictating categories instead of pressing keyboard shortcuts or using the foot pedal I recently dusted off. There's a slight delay, of course, but I think it's worth it. If this settles down and becomes a solid part of my workflow, I might even be able to knit or hand-sew while doing this step, or simply do some stretching exercises.

What about using streaming speech recognition? I've written some code to use streaming speech recognition, but the performance wasn't good enough when I tried it on my laptop (Lenovo P52 released in 2018, no configured GPU under Linux). The streaming server dropped audio segments in order to try to catch up. I'd rather have everything transcribed at the level of the model I want, even if I have to wait a little while. I also tried using the Web Speech API in Google Chrome for real-time speech transcription, but it's a little finicky. I'm happy with the performance I get from either manually queueing speech segments or using VAD and then using batch speech recognition with a model that's kept in memory (which is why I use a local server instead of a command-line tool). Come to think of it, I should try this with a higher-quality model like medium or large, just in case the latency turns out to be not that much more for this use case.

What about external voice control systems like Talon Voice or Cursorless? They seem like neat ideas and lots of people use them. I think hacking something into Emacs with full access to its internals could be lots of fun too.

A lot of people have experimented with voice input for Emacs over the years. It could be fun to pick up ideas for commands and grammars. Some examples:

What about automating myself out of this loop? I've considered training a classifier or sending the list to a large language model to categorize links in order to set more reasonable defaults, but I think I'd still want manual control, since the fun is in getting a sense of all the cool things that people are tinkering around with in the Emacs community. I found that with voice control, it was easier for me to say the category than to look for the category it suggested and then say "Okay" to accept the default. If I display the suggested category in a buffer with very large text (and possibly category-specific background colours), then I can quickly glance at it or use my peripheral vision. But yeah, it's probably easier to look at a page and say "Org Mode" than to look at the page, look at the default text, see if it matches Org Mode, and then say okay if it is.

Ideas for next steps

I wonder how to line up several categories. I could probably rattle off a few without waiting for the next one to load, and just pause when I'm not sure. Maybe while there's a reasonably good match within the first 1-3 words, I'll take candidates from the front of the queue. Or I could delimit it with another easily-recognized word, like "next".

I want to make a more synchronous version of this idea so that I can have a speech-enabled drop-in replacement that I can use as my y-or-n-p while still being able to type y or n. This probably involves using sit-for and polling to see if it's done. And then I can use that to play Twenty Questions, but also to do more serious stuff. It would also be nice to have replacements for read-string and completing-read, since those block Emacs until the user enters something.

I might take a side-trip into a conversational interface for M-x doctor and M-x dunnet, because why not. Naturally, it also makes sense to voice-enable agent-shell and gptel interactions.

I'd like to figure out a number- or word-based completion mechanism so that I can control Reddit link replacement as well, since I want to select from a list of links from the page. Maybe something similar to the way voicemacs adds numbers to helm and company or how flexi-choose.el works.

I'm also thinking about how I can shift seamlessly between typing and speaking, like when I want to edit a link title. Maybe I can check if I'm in the minibuffer and what kind of minibuffer I'm in, perhaps like the way Embark does.

It would be really cool to define speech commands by reusing the keymap structure that menus also use. This is how to define a menu in Emacs Lisp:

(easy-menu-define words-menu global-map
  "Menu for word navigation commands."
  '("Words"
     ["Forward word" forward-word]
     ["Backward word" backward-word]))

and this is how to set just one binding:

(keymap-set-after my-menu "<drink>"
  '("Drink" . drink-command) 'eat)

That makes sense to reuse for speech commands. I'd also like to be able to specify aliases while hiding them or collapsing them for a "What can I say" help view… Also, if keymaps work, then maybe minor modes or transient maps could work? This sort of feels like it should be the voice equivalent of a transient map.

The code so far

(defun my-emacs-news-categorize-with-voice (&optional skip-browse)
  (interactive (list current-prefix-arg))
  (unless skip-browse
    (my-spookfox-browse))
  (speech-input-cancel-recording)
  (let ((default (if (fboundp 'my-emacs-news-guess-category) (my-emacs-news-guess-category))))
    (speech-input-from-list
     (if default
         (format "Category (%s): " default)
       "Category: ")
     '(("Org Mode" "Org" "Org Mode")
       "Other"
       "Emacs Lisp"
       "Coding"
       ("Emacs configuration" "Config" "Configuration")
       ("Appearance" "Appearance")
       ("Default" "Okay" "Default")
       "Community"
       "AI"
       "Writing"
       ("Reddit" "Read it" "Reddit")
       "Shells"
       "Navigation"
       "Fun"
       ("Dired" "Directory" "Dir ed")
       ("Mail, news, and chat" "News" "Mail" "Chat")
       "Multimedia"
       "Scroll down"
       "Scroll up"
       "Web"
       "Delete"
       "Skip"
       "Undo"
       ("Quit" "Quit" "Cancel" "All done"))
     (lambda (result text)
       (message "Recognized %s original %s" result text)
       (pcase result
         ("Undo"
          (undo)
          (my-emacs-news-categorize-with-voice t))
         ("Skip"
          (forward-line)
          (my-emacs-news-categorize-with-voice))
         ("Quit"
          (message "All done.")
          (speech-input-cancel-recording))
         ("Reddit"
          (my-emacs-news-replace-reddit-link)
          (my-emacs-news-categorize-with-voice t))
         ("Scroll down"
          (my-spookfox-scroll-down)
          (my-emacs-news-categorize-with-voice t))
         ("Scroll up"
          (my-spookfox-scroll-up)
          (my-emacs-news-categorize-with-voice t))
         ("Delete"
          (delete-line)
          (undo-boundary)
          (my-emacs-news-categorize-with-voice))
         ("Default"
          (my-org-move-current-item-to-category
           (concat default ":"))
          (undo-boundary)
          (my-emacs-news-categorize-with-voice))
         (_
          (my-org-move-current-item-to-category
           (concat result ":"))
          (undo-boundary)
          (my-emacs-news-categorize-with-voice))))
     t)))

It uses Spookfox to control Firefox from Emacs:

(defun my-spookfox-scroll-down ()
  (interactive)
  (spookfox-js-injection-eval-in-active-tab "window.scrollBy(0, document.documentElement.clientHeight);" t))

(defun my-spookfox-scroll-up ()
  (interactive)
  (spookfox-js-injection-eval-in-active-tab "window.scrollBy(0, -document.documentElement.clientHeight);"))

(defun my-spookfox-background-tab (url &rest args)
  "Open URL as a background tab."
  (if spookfox--connected-clients
      (spookfox-tabs--request (cl-first spookfox--connected-clients) "OPEN_TAB" `(:url ,url))
    (browse-url url)))

It also uses these functions for categorizing Org Mode items:

(defun my-org-move-current-item-to-category (category)
    "Move current list item under CATEGORY earlier in the list.
  CATEGORY can be a string or a list of the form (text indent regexp).
  Point should be on the next line to process, even if a new category
  has been inserted."
    (interactive (list (completing-read "Category: " (my-org-get-list-categories))))
    (when category
      (let* ((col (current-column))
             (item (point-at-bol))
             (struct (org-list-struct))
             (category-text (if (stringp category) category (elt category 0)))
             (category-indent (if (stringp category) 2 (+ 2 (elt category 1))))
             (category-regexp (if (stringp category) category (elt category 2)))
             (end (elt (car (last struct)) 6))
             (pos (point))
             s)
        (setq s (org-remove-indentation (buffer-substring-no-properties item (org-list-get-item-end item struct))))
        (save-excursion
          (if (string= category-text "x")
              (org-list-send-item item 'delete struct)
            (goto-char (caar struct))
            (if (re-search-forward (concat "^ *- +" category-regexp) end t)
                (progn
                  ;; needs a patch to ol.el to check if stringp
                  (org-list-send-item item (point-at-bol) struct)
                  (org-move-item-down)
                  (org-indent-item))
              (goto-char end)
              (org-list-insert-item
               (point-at-bol)
               struct (org-list-prevs-alist struct))
              (let ((old-struct (copy-tree struct)))
                (org-list-set-ind (point-at-bol) struct 0)
                (org-list-struct-fix-bul struct (org-list-prevs-alist struct))
                (org-list-struct-apply-struct struct old-struct))
              (goto-char (point-at-eol))
              (insert category-text)
              (org-list-send-item item 'end struct)
              (org-indent-item)
              (org-indent-item))
            (recenter))))))

(defun my-org-guess-list-category (&optional categories)
  (interactive)
  (require 'cl-lib)
  (unless categories
    (setq categories
          (my-helm-org-list-categories-init-candidates)))
  (let* ((beg (line-beginning-position))
         (end (line-end-position))
         (string (buffer-substring-no-properties beg end))
         (found
          (cl-member string
                     categories
                     :test
                     (lambda (string cat-entry)
                       (unless (string= (car cat-entry) "x")
                         (string-match (regexp-quote (downcase (car cat-entry)))
                                       string))))))
    (when (car found)
      (my-org-move-current-item-to-category
       (cdr (car found)))
      t)))

For the speech-input functions, experimental code is at https://codeberg.org/sachac/speech-input .

View Org source for this post