Late-night braindumps by talking to myself

At my most recent eye exam, the optometrist recommended that I use a warm compress and massage my eyes afterwards to help with dry eyes, so I ordered a USB-powered eye mask. I like how I don't have to fuss around with figuring out the temperature and duration, especially now that things are colder. It's just there by my bed. It turns on. It has a timer, so I can set it to automatically turn off at 20 minutes. The medium heat setting is warm enough to be warm but not uncomfortable, even in winter.

Since I can't see anything while I have the compress on, I've taken to recording a quick braindump of the day. It's a nice way to wind down. I end up clearing my brain, and I usually just sleep afterwards instead of staying up. W- keeps later hours than I do, so I can talk without disturbing anyone.

Google Recorder automatically transcribes my recording on the fly with varying levels of accuracy. It's enough to mostly recognize my thoughts the next day, and I can tap on a word on my phone to jump back to that part of the audio. If I really want to, I can use aeneas to align the text to the recording so that I can do that sort of verification on my computer. For the most part, I've been able to understand things from context.

I've been experimenting with using special keywords to set off parts of the text that I might want to pay special attention to. At first, I tried phrases like "begin elephant" and "end elephant", but the speech-to-text recognizer easily confuses "end" and "and", and "elephant" gets misrecognized sometimes too. "Hello" and "goodbye" seemed to work better. "Hello computer" / "Goodbye computer" and "Hello notebook" / "Goodbye notebook" seem to work okay. I might go with "Hello notebook", since it has fewer syllables.

My goal for that time is to do a braindump of the different things I need to think about and capture, and then make it easier to follow up. It might also be a good time to review the day and plan tomorrow. Sometimes I might talk my way through the idea for a blog post or a decision I'm considering. I still tend to stutter and wander all over the place, but it's useful to have notes as a starting point.

How could I improve this workflow? I would like to record my thoughts and, ideally, have the transcript automatically show up in a directory on my computer. To reduce the friction in that, I might need to do the transcription on my computer, because Google Recorder requires me to tap a few buttons in order to share the transcript unless I root my phone, which is a bit of an involved process that might need to wait until after EmacsConf. If I use another recording app that saves the audio to a directory that is synchronized with Syncthing to my server, then I can transcribe it with OpenAI Whisper and align it with aeneas to get the timestamps. My laptop is pretty slow, but I'll be asleep anyway, so maybe the laptop can work on it overnight.
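The overnight step could look something like this rough Python sketch. It only finds recordings in the synced directory that don't have a transcript yet and builds a command line for each one. The function names, the list of audio extensions, and the choice of the "base" model are all made up for illustration. It also assumes the openai-whisper command-line tool is installed and that it writes NAME.txt into the output directory, which may depend on the version.

```python
import subprocess
from pathlib import Path

# Assumption: the recording app saves one of these formats.
AUDIO_EXTENSIONS = {".m4a", ".mp3", ".wav", ".ogg"}


def pending_recordings(sync_dir):
    """Return audio files in sync_dir that have no .txt transcript beside them."""
    sync_dir = Path(sync_dir)
    return sorted(
        path
        for path in sync_dir.iterdir()
        if path.suffix.lower() in AUDIO_EXTENSIONS
        and not path.with_suffix(".txt").exists()
    )


def whisper_command(audio_path, model="base"):
    """Build a command line for the openai-whisper CLI (assumed to be installed)."""
    audio_path = Path(audio_path)
    return [
        "whisper", str(audio_path),
        "--model", model,
        "--output_format", "txt",
        "--output_dir", str(audio_path.parent),
    ]


def transcribe_all(sync_dir):
    """Transcribe every pending recording, one at a time."""
    for audio_path in pending_recordings(sync_dir):
        # check=True stops the batch if a transcription fails.
        subprocess.run(whisper_command(audio_path), check=True)
```

Calling transcribe_all on the synced directory from a scheduled job would let the laptop chew through new recordings overnight, and files that already have a transcript get skipped on the next run. Alignment with aeneas would be a separate step after this.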

I would ideally like it to create Org Mode entries, maybe putting those entries in a separate inbox if I'm worried about messing up my main inbox. It can pull out the text between "Hello, notebook" and "Goodbye, notebook" sections along with links to the files where they came from so that I can review the context. Org Mode links can jump to text or lines in a text file, or I can add support for easily capturing and linking to subtitles in subed.el (maybe a subed-org.el). Who knows? I might even be able to start extracting audio snippets in order to create a quick presentation using something like compile-media. (I'm recording in a relatively quiet room and no one's excitedly trying to interrupt me, so I might as well take advantage of that!) Oh, maybe I should also add a command to copy the subtitles in the region as plain text. If I figure out a grammar, I might even be able to automate or partially automate common commands.
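Here's a minimal Python sketch of the extraction idea, just to see what it might involve. It assumes the transcript is plain text and that the recognizer gets the keywords right apart from case and stray punctuation. The function names and the shape of the Org entries are hypothetical, and an Emacs Lisp version would probably fit better in the long run.

```python
import re

# Tolerate case differences and optional punctuation, e.g. "Hello, notebook."
START = r"hello[,.]?\s+notebook[,.!]?"
END = r"goodbye[,.]?\s+notebook[,.!]?"
SECTION = re.compile(START + r"\s*(.*?)\s*" + END, re.IGNORECASE | re.DOTALL)


def extract_sections(transcript):
    """Return the text spoken between each start keyword and the next end keyword."""
    # Collapse runs of whitespace so each section is a single line.
    return [" ".join(m.group(1).split()) for m in SECTION.finditer(transcript)]


def to_org_entries(transcript, source_file):
    """Format each marked section as an Org heading with a link back to its source."""
    entries = []
    for text in extract_sections(transcript):
        # Use a shortened version of the text as the heading.
        title = text if len(text) <= 60 else text[:57] + "..."
        entries.append(
            f"* {title}\n{text}\n\n[[file:{source_file}][Source transcript]]\n"
        )
    return "\n".join(entries)
```

The output could be appended to a separate inbox file so that a misrecognized keyword doesn't mess up the main one. If a start keyword is never followed by an end keyword, that section is simply skipped, so the full transcript is still worth keeping around.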

I don't need to record everything 24x7 (although someone's experimented with that), but these short snippets might be a good start. It could lead to getting more stuff out of my brain, and that can be handy too.
