consult + org-db-v3: Approximate search of my sketches using text, and a multi-source consult command for approximately searching sketches and blog posts

| emacs

I like to draw sketchnotes when I want to untangle a thought or build up a thought over several posts. Following up on "Playing around with org-db-v3 and consult: vector search of my blog post Org files, with previews", I want to search my sketches using approximate text matches. I also want an approximate search interface that includes both sketches and blog posts.

Here's what I've gotten working so far:

Screencast of my-sketch-similar and my-consult-similar

  • 0:00: Text preview: All right, so here's a preview of how I can flip through the images that are similar to my current text. Let's say, for example, I've got my-sketch-similar. I'm passing in my whole blog post. This one is a text preview. It just very quickly goes through the OCR text.
  • 0:27: Image preview: I can use the actual images since Emacs has image support. It's a bit slower and sometimes it doesn't work. So far it's working, which is nice, but you can tell there's a bit of a delay.
  • 0:51: External viewer like Geeqie: The third option is to use an external program like Geeqie. Geeqie? Anyway, that one seems a lot faster.
  • 1:12: my-consult-similar: Then I can use that with this new multiple source consult command that I've just defined. So, for example, if I say my-consult-similar, I can then flip through the related blog posts as well as the related sketches in one go. I was calling it with a C-u universal prefix argument, but I can also call it and type in, for example, parenting and anxiety. Then I can see my recent blog posts and sketches that are related to that topic. I think I should be able to just… Yes, if I press enter, it will insert the link. So that's what I hacked up today.

The data

As part of my sketchnote process, I convert my sketches to text files. I usually use Google Cloud Vision to automatically convert the images to text. By keeping .txt files beside image files, I can easily search for images and include them in blog posts. I usually edit the file afterwards to clean up the layout and fix misrecognized words, but even the raw text output can make files more searchable.

I indexed the sketches' text files with:

(defun my-org-db-v3-index-recent-sketches (after)
  (interactive (list
                (when current-prefix-arg
                  (org-read-date nil nil nil "After: " nil "-2w"))))
  (setq after (or after (org-read-date nil nil "-2w")))
  (mapcar #'org-db-v3-index-file-async
          (seq-remove
           (lambda (o) (string> after (file-name-base o)))
           (directory-files "~/sync/sketches" t "\\.txt$"))))
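
The date filter above only works if the base name of each text file starts with a yyyy-mm-dd date, so that a plain string comparison sorts files chronologically. Here's a minimal sketch of that comparison on made-up filenames. my-example-recent-files is a hypothetical helper for illustration, not something from my config.

```elisp
(require 'seq)

(defun my-example-recent-files (after files)
  "Return the FILES whose base names sort at or after the date string AFTER.
Hypothetical helper: assumes base names start with a yyyy-mm-dd date."
  (seq-remove
   ;; Drop a file when AFTER sorts strictly later than its base name.
   (lambda (file) (string> after (file-name-base file)))
   files))

;; Made-up filenames; only the second one is kept.
(my-example-recent-files
 "2025-10-15"
 '("~/sync/sketches/2025-10-01-01 Old idea.txt"
   "~/sync/sketches/2025-10-29-01 New idea.txt"))
```

A file dated exactly on the cutoff is kept, because "2025-10-15" is a prefix of its base name and a prefix never sorts later than the longer string.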

Completion code

Writing the Consult completion code for the sketches was pretty straightforward because I could base it on my blog posts.

(defun my-org-db-v3-sketch--collection (input)
  "Perform the RAG search and format the results for Consult.
Returns a list of cons cells (DISPLAY-STRING . PLIST)."
  (mapcar
   (lambda (o)
     (cons (file-name-base (alist-get 'source_path o)) o))
   (seq-uniq
    (my-org-db-v3-to-emacs-rag-search input 100 "%sync/sketches%")
    (lambda (a b) (string= (alist-get 'source_path a)
                           (alist-get 'source_path b))))))

(defun my-sketch-similar (&optional query hide-initial)
  "Vector-search sketches using org-db-v3 and present results via Consult.
If called with \\[universal-argument\], use the current post's text.
If a region is selected, use that as the default QUERY.
HIDE-INITIAL means hide the initial query, which is handy if the query is very long."
  (interactive (my-11ty-interactive-context current-prefix-arg))
  (consult--read
   (if hide-initial
       (my-org-db-v3-sketch--collection query)
     (consult--dynamic-collection
         #'my-org-db-v3-sketch--collection
       :min-input 3 :debounce 1))
   :lookup #'consult--lookup-cdr
   :prompt "Search sketches (approx): "
   :category 'sketch
   :sort nil
   :require-match t
   :state (my-image--state)
   :initial (unless hide-initial query)))

(defun my-sketch-similar-insert (link)
  "Vector-search sketches and insert a link.
If called with \\[universal-argument\], use the current post's text.
If a region is selected, use that as the default QUERY.
HIDE-INITIAL means hide the initial query, which is handy if the query is very long."
  (interactive (list
                (if embark--command
                    (read-string "Sketch: ")
                  (apply #'my-sketch-similar
                         (my-11ty-interactive-context current-prefix-arg)))))
  (my-insert-sketch-and-text link))

(defun my-sketch-similar-link (link)
  "Vector-search sketches and insert a link.
If called with \\[universal-argument\], use the current post's text.
If a region is selected, use that as the default QUERY.
HIDE-INITIAL means hide the initial query, which is handy if the query is very long."
  (interactive (list
                (if embark--command
                    (read-string "Sketch: ")
                  (apply #'my-sketch-similar
                         (my-11ty-interactive-context current-prefix-arg)))))
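  ;; A candidate selected through Consult is an alist, so take the
  ;; filename from its source_path before looking up the image.
  (when (and (listp link) (alist-get 'source_path link))
    (setq link (my-image-filename (file-name-base (alist-get 'source_path link)))))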
  (insert (org-link-make-string (concat "sketchLink:" link) (file-name-base link))))

(From Handle sketches too in my config)
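
The seq-uniq call in the collection function is there because a search can return several matches from the same file, and I only want each sketch listed once. Here's a small sketch of that step on made-up results. The data and the my-example-unique-by-source name are assumptions for illustration.

```elisp
(require 'seq)

(defun my-example-unique-by-source (results)
  "Return RESULTS with only the first entry for each `source_path'.
RESULTS is a list of alists shaped like the search output above."
  (seq-uniq results
            (lambda (a b)
              (string= (alist-get 'source_path a)
                       (alist-get 'source_path b)))))

;; Made-up results: two matches from one file, one match from another.
;; Each remaining result becomes a (DISPLAY-STRING . RESULT) candidate.
(mapcar
 (lambda (o) (cons (file-name-base (alist-get 'source_path o)) o))
 (my-example-unique-by-source
  '(((source_path . "/sketches/2025-10-29-01 Search.txt") (similarity_score . 0.9))
    ((source_path . "/sketches/2025-10-29-01 Search.txt") (similarity_score . 0.7))
    ((source_path . "/sketches/2025-10-20-01 Notes.txt") (similarity_score . 0.6)))))
```

Since seq-uniq keeps the first occurrence, sorting by similarity score before deduplicating means the best match for each file is the one that survives.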

Previewing images

Displaying images in Emacs can be a little bit slow, so I wanted to have different options for preview. The fastest way might be to preview just the text to see whether this is a relevant image.

(setq my-sketch-preview 'text)
2025-10-29_14-46-06.png

Another way to preview is to load the actual image, if I have a bit more patience.

(setq my-sketch-preview t)
2025-10-29_14-47-45.png
Figure 2: Screenshot of image preview

Sometimes Consult says "No partial preview of a binary file", though, so I can probably look into how to get around that.

Using an external program is another option. Here I have some code to use Geeqie to display the images.

(setq my-sketch-preview 'geeqie)
2025-10-29_14-48-57.png
Figure 3: Using Geeqie to flip through images

Using Geeqie feels faster and more reliable than using Emacs to preview images.

The preview is implemented by the following function in the Completing sketches part of my config.

(declare-function my-geeqie-view "Sacha.el")

(defvar my-sketch-preview 'text
  "*Preview sketches.
'text means show the associated text.
'geeqie means open image in Geeqie.
t means open image in Emacs.")

(defun my-image--state ()
  "Manage preview window and cleanup."
  ;; These functions are closures captured when the state is initialized by consult--read
  (let ((preview (consult--buffer-preview))
        (open (consult--temporary-files)))
    ;; The returned lambda is the actual preview function called by Consult
    (lambda (action cand)
      (unless cand
        (funcall open))
      (when my-sketch-preview
        (let ((filename (cond
                         ((and (eq my-sketch-preview 'text)
                               (listp cand)
                               (alist-get 'source_path cand))
                          (alist-get 'source_path cand))
                         ((and (listp cand)
                               (alist-get 'source_path cand))
                          (my-image-filename (file-name-base (alist-get 'source_path cand))))
                         (t cand))))
          (when filename
            (pcase my-sketch-preview
              ('geeqie (my-geeqie-view (list filename)))
              (_ (funcall preview action
                          (and cand
                               (eq action 'preview)
                               (funcall open filename)))))))))))
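
The cond in the middle of that function decides which file to hand to the previewer. Pulled out on its own, the decision looks roughly like this. It's a sketch under assumptions: my-example-image-for-text is a hypothetical stand-in for my-image-filename that just swaps the extension, which is probably not how the real lookup works.

```elisp
(defun my-example-image-for-text (text-file)
  "Hypothetical stand-in for `my-image-filename': swap .txt for .png."
  (concat (file-name-sans-extension text-file) ".png"))

(defun my-example-preview-target (mode cand)
  "Return the file to preview for CAND under preview MODE.
MODE is `text', `geeqie', or t, like `my-sketch-preview'.
CAND is either a search-result alist or a plain filename."
  (let ((source (and (listp cand) (alist-get 'source_path cand))))
    (cond
     ;; Text preview: show the OCR text file itself.
     ((and source (eq mode 'text)) source)
     ;; Image preview (Emacs or Geeqie): show the matching image.
     (source (my-example-image-for-text source))
     ;; Anything else is assumed to already be a filename.
     (t cand))))

(my-example-preview-target
 'geeqie '((source_path . "/sketches/2025-10-29-01 Search.txt")))
```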

The following function calls geeqie. It's in the Manage photos with geeqie part of my config.

(defun my-geeqie-view (filenames)
  (interactive "f")
  (start-process-shell-command
   "geeqie" nil
   (concat
    "geeqie --remote "
    (mapconcat
     (lambda (f)
       (concat "file:" (shell-quote-argument f)))
     (cond
      ((listp filenames) filenames)
      ((file-directory-p filenames)
       (list (car (seq-filter #'file-regular-p (directory-files filenames t)))))
      (t (list filenames)))
     " "))))

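One possible variation is to skip the shell and pass the arguments to Geeqie directly, which avoids having to quote filenames with spaces. This is only a sketch: the my-example- names are hypothetical, and it assumes geeqie is on the PATH and accepts the same --remote file: arguments as the function above.

```elisp
(defun my-example-geeqie-args (filenames)
  "Return the argument list for a geeqie --remote call on FILENAMES.
FILENAMES is a single filename or a list of filenames."
  (cons "--remote"
        (mapcar (lambda (f) (concat "file:" (expand-file-name f)))
                (if (listp filenames) filenames (list filenames)))))

(defun my-example-geeqie-view (filenames)
  "Ask Geeqie to show FILENAMES without going through a shell."
  (apply #'start-process "geeqie" nil "geeqie"
         (my-example-geeqie-args filenames)))

;; Building the arguments has no side effects, so it is easy to inspect:
(my-example-geeqie-args '("/tmp/first sketch.png" "/tmp/second.png"))
```

expand-file-name is there because ~ is not expanded when the program is started without a shell.
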
Multiple sources

Now I can make a Consult source that combines both blog posts and sketches using semantic search. I wanted to have the same behaviour as my other functions. If I call it interactively, I want to type in text. If I call it with a region, I want to search for that region. If I call it with the universal prefix argument C-u, I want to use the current post text as a starting point. Since this behaviour shows up in several functions, I finally got around to writing a function that encapsulates it.

Then I can use that for the interactive arguments of my new my-consult-similar function.

(defvar my-consult-source-similar-blog-posts
  (list :name "Blog posts"
        :narrow ?b
        :category 'my-blog
        :state #'my-blog-post--state
        :async (consult--dynamic-collection
                   (lambda (input)
                     (seq-take
                      (my-org-db-v3-blog-post--collection input)
                      5)))
        :action #'my-embark-blog-insert-link))

(defvar my-consult-source-similar-sketches
  (list :name "Sketches"
        :narrow ?s
        :category 'sketch
        :async (consult--dynamic-collection
                   (lambda (input)
                     (seq-take (my-org-db-v3-sketch--collection input) 5)))
        :state #'my-image--state
        :action #'my-insert-sketch-and-text))

(defun my-consult-similar (query hide-initial)
  (interactive (my-11ty-interactive-context current-prefix-arg))
  (if hide-initial
      (let ((new-sources
             (list
              (append
               (copy-sequence my-consult-source-similar-blog-posts)
               (list :items (seq-take (my-org-db-v3-blog-post--collection query) 5)))
              (append
               (copy-sequence my-consult-source-similar-sketches)
               (list :items (seq-take (my-org-db-v3-sketch--collection query) 5))))))
        (dolist (source new-sources)
          (cl-remf source :async))
        (consult--multi new-sources))
    (consult--multi '(my-consult-source-similar-blog-posts
                      my-consult-source-similar-sketches)
                    :initial query)))

(defun my-org-db-v3-index-recent-public (after)
  (interactive (list
                (when current-prefix-arg
                  (org-read-date nil nil nil "After: " nil "-2w"))))
  (setq after (or after (org-read-date nil nil "-2w")))
  (mapc #'org-db-v3-index-file-async
        (my-blog-org-files-except-reviews after))
  (my-org-db-v3-index-recent-sketches after))

This is what it looks like given this whole post:

2025-10-29_14-36-24.png
Figure 4: Screenshot of semantic search for both blog posts and sketches
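
The hide-initial branch of my-consult-similar turns each asynchronous source into a static one: copy the property list, add the precomputed :items, and drop :async. Here's that plist juggling in isolation, as a sketch. my-example-static-source and the sample source are made up for illustration.

```elisp
(require 'cl-lib)

(defun my-example-static-source (source items)
  "Return a copy of the plist SOURCE with :async removed and :items set to ITEMS.
SOURCE itself is left unchanged."
  (let ((copy (append (copy-sequence source) (list :items items))))
    ;; `cl-remf' modifies the place, so work on the local copy.
    (cl-remf copy :async)
    copy))

(let ((source (list :name "Sketches" :narrow ?s :async #'ignore)))
  (my-example-static-source source '("2025-10-29-01 Search")))
```

Working on a copy matters because the defvar'd sources are shared. Removing :async from the original would break the next call that wants to type in a query.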

Thoughts and next steps

The vector search results from my sketches don't feel as relevant as the blog posts, possibly because there's a lot less text in my sketches. Handwriting is tiring, and I can only fit so much on a page. Still, now that I'm sorting results by similarity score, maybe we'll see what we get and how we can tweak things.

It might be nifty to use embark-become to switch between exact title match, full-text search, and vector search.

plz supports asynchronous requests and org-db-v3.el has examples of doing this, so maybe I can replicate some of Remembrance Agent's functionality by having an idle timer asynchronously update a dedicated buffer with resources that are similar to the current paragraph, or maybe the last X words near point.

I wonder if it makes sense to mix results from different sources in the same list instead of splitting it up into different categories.

Playing around with org-db-v3 and consult: vector search of my blog post Org files, with previews

| emacs, org

Update: Sorted the results of my-org-db-v3-to-emacs-rag-search by similarity score.

I tend to use different words even when I'm writing about the same ideas. When I use traditional search tools like grep, it can be hard to look up old blog posts or sketches if I can't think of the exact words I used. When I write a blog post, I want to automatically remind myself of possibly relevant notes without requiring words to exactly match what I'm looking for.

Demo

Here's a super quick demo of what I've been hacking together so far, doing vector search on some of my blog posts using the .org files I indexed with org-db-v3:

Screencast of my-blog-similar-link

Play by play:

  • 0:00:00 Use M-x my-blog-similar-link to look for "forgetting things", flip through results, and use RET to select one.
  • 0:00:25 Select "convert the text into a link" and use M-x my-blog-similar-link to change it into a link.
  • 0:00:44 I can call it with C-u M-x my-blog-similar-link and it will do the vector search using all of the post's text. This is pretty long, so I don't show it in the prompt.
  • 0:00:56 I can use Embark to select and insert multiple links. C-SPC selects them from the completion buffer, and C-. A acts on all of them.
  • 0:01:17 I can also use Embark's C-. S (embark-collect) to keep a snapshot that I can act on, and I can use RET in that buffer to insert the links.

Background

A few weeks ago, John Kitchin demonstrated a vector search server in his video Emacs RAG with LibSQL - Enabling semantic search of org-mode headings with Claude Code - YouTube. I checked out jkitchin/emacs-rag-libsql and got the server running. My system's a little slow (no GPU), so (setq emacs-rag-http-timeout nil) was helpful. It feels like a lighter-weight version of Khoj (which also supports Org Mode files) and maybe more focused on Org than jwiegley/rag-client. At the moment, I'm more interested in embeddings and vector/hybrid search than generating summaries or using a conversational interface, so something simple is fine. I just want a list of possibly-related items that I can re-read myself.

Of course, while these notes were languishing in my draft file, John Kitchin had already moved on to something else. He posted Fulltext, semantic text and image search in Emacs - YouTube, linking to a new vibe-coded project called org-db-v3 that promises to offer semantic, full-text, image, and headline search. The interface is ever so slightly different: POST instead of GET, a different data structure for results. Fortunately, it was easy enough to adapt my code. I just needed a small adapter function to make the output of org-db-v3 look like the output from emacs-rag-search.

(use-package org-db-v3
  :load-path "~/vendor/org-db-v3/elisp"
  :init
  (setq org-db-v3-auto-enable nil))

(defun my-org-db-v3-to-emacs-rag-search (query &optional limit filename-pattern)
  "Search org-db-v3 and transform the data to look like emacs-rag-search's output."
  (org-db-v3-ensure-server)
  (setq limit (or limit 100))
  (mapcar (lambda (o)
            `((source_path . ,(assoc-default 'filename o))
              (line_number . ,(assoc-default 'begin_line o))
              ,@o))
          (sort
           (assoc-default 'results
                          (plz 'post (concat (org-db-v3-server-url) "/api/search/semantic")
                            :headers '(("Content-Type" . "application/json"))
                            :body (json-encode `((query . ,query)
                                                 (limit . ,limit)
                                                 (filename_pattern . ,filename-pattern)))
                            :as #'json-read))
           :key (lambda (o) (alist-get 'similarity_score o))
           :reverse t)))
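
To see what the adapter does without a running server, here's the same transformation applied to made-up results. It's a sketch with assumptions: the sample data is invented, my-example-adapt-results is a hypothetical name, and it uses seq-sort-by instead of the keyword form of sort so that it also runs on Emacs versions before 30.

```elisp
(require 'seq)

(defun my-example-adapt-results (results)
  "Sort org-db-v3-style RESULTS by score and add emacs-rag-style keys.
Each result is an alist with `filename', `begin_line', and
`similarity_score' keys, as in the adapter above."
  (mapcar (lambda (o)
            ;; Keep the original keys and add the names the
            ;; emacs-rag-search-based code expects.
            `((source_path . ,(alist-get 'filename o))
              (line_number . ,(alist-get 'begin_line o))
              ,@o))
          ;; Highest similarity first.
          (seq-sort-by (lambda (o) (alist-get 'similarity_score o))
                       #'>
                       results)))

(my-example-adapt-results
 '(((filename . "/blog/2024/03/a/index.org") (begin_line . 10) (similarity_score . 0.2))
   ((filename . "/blog/2025/10/b/index.org") (begin_line . 42) (similarity_score . 0.8))))
```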

I'm assuming that org-db-v3 is what John's going to focus on instead of emacs-rag-search (for now, at least). I'll focus on that for the rest of this post, although I'll include some of the emacs-rag-search stuff just in case.

Indexing my Org files

Both emacs-rag and org-db-v3 index Org files by submitting them to a local web server. Here are the key files I want to index:

  • organizer.org: my personal projects and reference notes
  • reading.org: snippets from books and webpages
  • resources.org: bookmarks and frequently-linked sites
  • posts.org: draft posts

(dolist (file '("~/sync/orgzly/organizer.org"
                "~/sync/orgzly/posts.org"
                "~/sync/orgzly/reading.org"
                "~/sync/orgzly/resources.org"))
  (org-db-v3-index-file-async file))

(emacs-rag uses emacs-rag-index-file instead.)

Indexing blog posts via exported Org files

Then I figured I'd index my recent blog posts, except for the ones that are mostly lists of links, like Emacs News or my weekly/monthly/yearly reviews. I write my posts in Org Mode before exporting them with ox-11ty and converting them with the 11ty static site generator. I'd previously written some code to automatically export a copy of my Org draft in case people wanted to look at the source of a blog post, or in case I wanted to tweak the post in the future. (Handy for things like Org Babel.) This was generally exported as an index.org file in the post's directory. I can think of a few uses for a list of these files, so I'll make a function for it.

(defun my-blog-org-files-except-reviews (&optional after-date)
  "Return a list of recent .org files except for Emacs News and weekly/monthly/yearly reviews.
AFTER-DATE is in the form yyyy, yyyy-mm, or yyyy-mm-dd.  It defaults to \"2020\"."
  (setq after-date (or after-date "2020"))
  ;; Compare by month, but don't run past the end of a short date like "2020".
  (let ((after-month (substring after-date 0 (min 7 (length after-date))))
        (posts (my-blog-posts)))
    (seq-keep
     (lambda (filename)
       (when (not (string-match "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]-emacs-news" filename))
         (when (string-match "/blog/\\([0-9]+\\)/\\([0-9]+\\)/" filename)
           (let ((month (match-string 2 filename))
                 (year (match-string 1 filename)))
             (unless (string> after-month
                              (concat year "-" month))
               (let ((info (my-blog-post-info-for-url (replace-regexp-in-string "~/proj/static-blog\\|index\\.org$\\|\\.org$" "" filename) posts)))
                 (let-alist info

                   (when (and
                          info
                          (string> .date after-date)
                          (not (seq-intersection .categories
                                                 '("emacs-news" "weekly" "monthly" "yearly")
                                                 'string=)))
                     filename))))))))
     (sort
      (directory-files-recursively "~/proj/static-blog/blog" "\\.org$")
      :lessp #'string<
      :reverse t))))

This is in the Listing exported Org posts section of my config. I have a my-blog-post-info-for-url function that helps me look up the categories. I get the data out of the JSON that has all of my blog posts in it.
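
The first filter in that function is a cheap one: it compares the year and month from the post's path against the cutoff before doing the more expensive lookup in the post data. Here's that check on its own, as a sketch. my-example-post-on-or-after-month-p is a hypothetical name and the paths are made up.

```elisp
(defun my-example-post-on-or-after-month-p (filename after-date)
  "Return non-nil if FILENAME is under a /blog/YEAR/MONTH/ on or after AFTER-DATE.
AFTER-DATE is in the form yyyy, yyyy-mm, or yyyy-mm-dd."
  (let ((after-month (substring after-date 0 (min 7 (length after-date)))))
    (when (string-match "/blog/\\([0-9]+\\)/\\([0-9]+\\)/" filename)
      (not (string> after-month
                    (concat (match-string 1 filename)
                            "-"
                            (match-string 2 filename)))))))

;; A post from October 2025 passes a mid-September cutoff.
(my-example-post-on-or-after-month-p
 "~/proj/static-blog/blog/2025/10/some-post/index.org" "2025-09-15")
```

Posts from the cutoff month itself pass this check, which is why the full function also compares the post's actual date afterwards.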

Then it's easy to index those files:

(mapc #'org-db-v3-index-file-async (my-blog-org-files-except-reviews))

Searching my blog posts

Now that my files are indexed, I want to be able to turn up things that might be related to whatever I'm currently writing about. This might help me build up thoughts better, especially if a long time has passed in between posts.

org-db-v3-semantic-search-ivy didn't quite work for me out of the box, but I'd written a Consult-based interface for emacs-rag-search-vector that was easy to adapt. This is how I put it together.

First I started by looking at emacs-rag-search-vector. That shows the full chunks, which feels a little unwieldy.

2025-10-09_10-05-58.png
Figure 1: Screenshot showing the chunks returned by a search for "semantic search"

Instead, I wanted to see the years and titles of the blog posts as a quick summary, with the ability to page through them for a quick preview. consult.el lets me define a custom completion command with that behavior. Here's the code:

(defun my-blog-similar-link (link)
  "Vector-search blog posts using `emacs-rag-search' and insert a link.
If called with \\[universal-argument\], use the current post's text.
If a region is selected, use that as the default QUERY.
HIDE-INITIAL means hide the initial query, which is handy if the query is very long."
  (interactive (list
                (if embark--command
                    (read-string "Link: ")
                  (my-blog-similar
                   (cond
                    (current-prefix-arg (my-11ty-post-text))
                    ((region-active-p)
                     (buffer-substring (region-beginning)
                                       (region-end))))
                   current-prefix-arg))))
  (my-embark-blog-insert-link link))

(defun my-embark-blog--inject-target-url (&rest args)
  "Replace the completion text with the URL."
  (delete-minibuffer-contents)
  (insert (my-blog-url (get-text-property 0 'consult--candidate (plist-get args :target)))))

(with-eval-after-load 'embark
  (add-to-list 'embark-target-injection-hooks '(my-blog-similar-link my-embark-blog--inject-target-url)))

(defun my-11ty-interactive-context (use-post)
  "Returns (query hide-initial) for use in interactive arguments.
If USE-POST is non-nil, query is the current post text and hide-initial is t.
If the region is active, returns that as the query."
  (list (cond
         (embark--command (read-string "Input: "))
         (use-post (my-11ty-post-text))
         ((region-active-p)
          (buffer-substring (region-beginning)
                            (region-end))))
        use-post))

(defun my-blog-similar (&optional query hide-initial)
  "Vector-search blog posts using org-db-v3 and present results via Consult.
If called with \\[universal-argument\], use the current post's text.
If a region is selected, use that as the default QUERY.
HIDE-INITIAL means hide the initial query, which is handy if the query is very long."
  (interactive (my-11ty-interactive-context current-prefix-arg))
  (consult--read
   (if hide-initial
       (my-org-db-v3-blog-post--collection query)
     (consult--dynamic-collection
         #'my-org-db-v3-blog-post--collection
       :min-input 3 :debounce 1))
   :lookup #'consult--lookup-cdr
   :prompt "Search blog posts (approx): "
   :category 'my-blog
   :sort nil
   :require-match t
   :state (my-blog-post--state)
   :initial (unless hide-initial query)))

(defvar my-blog-semantic-search-source 'org-db-v3)
(defun my-org-db-v3-blog-post--collection (input)
  "Perform the RAG search and format the results for Consult.
Returns a list of cons cells (DISPLAY-STRING . PLIST)."
  (let ((posts (my-blog-posts)))
    (mapcar (lambda (o)
              (my-blog-format-for-completion
               (append o
                       (my-blog-post-info-for-url (alist-get 'source_path o)
                                                  posts))))
            (seq-uniq
               (my-org-db-v3-to-emacs-rag-search input 100 "%static-blog%")
               (lambda (a b) (string= (alist-get 'source_path a)
                                      (alist-get 'source_path b)))))))

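It uses some functions I defined in other parts of my config, such as my-blog-posts, my-blog-post-info-for-url, my-blog-format-for-completion, my-blog-post--state, my-embark-blog-insert-link, my-blog-url, and my-11ty-post-text.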

When I explored emacs-rag-search, I also tried hybrid search (vector + full text). At first, I got "database disk image is malformed". I fixed this by dumping the SQLite3 database. Using hybrid search, I tended to get less-relevant results based on the repetition of common words, though, so that might be something for future exploration. Anyway, my-emacs-rag-search and my-emacs-rag-search-hybrid are in the emacs-rag-search part of my config just in case.

Along the way, I contributed some notes to consult.el's README.org so that it'll be easier to figure this stuff out in the future. In particular, it took me a while to figure out how to use :lookup #'consult--lookup-cdr to get richer information after selecting a completion candidate, and also how to use consult--dynamic-collection to work with slower dynamic sources.
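
As far as I can tell, the :lookup idea boils down to this: the completion candidates are (DISPLAY-STRING . DATA) pairs, the minibuffer only ever sees the display string, and the lookup function maps the selected string back to the data. Here's a sketch of that mapping without Consult loaded. my-example-lookup-cdr is a hypothetical stand-in, so check Consult's source for what consult--lookup-cdr actually does.

```elisp
(defun my-example-lookup-cdr (selected candidates)
  "Return the data paired with the display string SELECTED in CANDIDATES.
CANDIDATES is an alist of (DISPLAY-STRING . DATA) pairs."
  (cdr (assoc selected candidates)))

;; Made-up candidates: the display string is what gets completed,
;; and the alist behind it is what the command receives.
(let ((candidates
       '(("2025 Vector search" . ((source_path . "/blog/2025/10/vector/index.org")))
         ("2024 Sketchnotes" . ((source_path . "/blog/2024/03/sketch/index.org"))))))
  (alist-get 'source_path
             (my-example-lookup-cdr "2024 Sketchnotes" candidates)))
```

That's why the commands above can show a short title in the minibuffer and still get the source path and line number back after selection.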

Quick thoughts and next steps

It is kinda nice being able to look up posts without using the exact words.

Now I can display a list of blog posts that are somewhat similar to what I'm currently working on. It should be pretty straightforward to filter the list to show only posts I haven't linked to yet.

I can probably get this to index the text versions of my sketches, too.

It might also be interesting to have a multi-source Consult command that starts off with fast sources (exact title or headline match) and then adds the slower sources (Google web search, semantic blog post search via org-db-v3) as the results become available.

I'll save that for another post, though!

2025-10-27 Emacs news

| emacs, emacs-news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, Mastodon #emacs, Bluesky #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at sacha@sachachua.com. Thank you!

2025-10-20 Emacs news

| emacs, emacs-news

Update: Fixed org-linkin link, thanks gnomon-!

2025-10-13 Emacs news

| emacs, emacs-news

Update: Fixed org-social link, thanks gnomon-!

Added multiple timezone support to casual-timezone-planner

| emacs

My eldest sister got a Nintendo Switch. Now she can join my middle sister, the kids, and me in a Minecraft Realm. We're all in different timezones, so we needed to figure out a good time to meet. I briefly contemplated firing up timeanddate.com's Meeting Planner, but I wanted an Emacs way to do things.

I remembered coming across casual-timezone-planner in one of the Emacs News posts in June. It only handled one remote timezone, but it was easy to extend casual-timezone-utils.el to support multiple timezones. I changed completing-read to completing-read-multiple, added the columns to the vtable, and updated a few more functions. kickingvegas tweaked it a little more, and now multiple timezone support is in the version of casual that's on MELPA. Yay!

2025-10-08_09-55-25.png
Figure 1: Screenshot of times in America/Toronto, Europe/Amsterdam, and America/Los_Angeles

We settled on 7 AM Los Angeles, 10 AM Toronto, 4 PM Amsterdam, and we played on Saturday and Sunday. Had lots of fun!

2025-10-06 Emacs news

| emacs, emacs-news

Why I Keep Blogging With Emacs is a short blog post that got a lot of comments on HN this week. Also, xenodium is trying out making videos, starting with batch-applying command-line utilities.
