Avoiding automatic data type conversion in Microsoft Excel and Pandas

| coding, python

Automatic conversion of data types is often handy, but sometimes it can mess things up. For example, when you import a CSV into Microsoft Excel, it will helpfully convert and display dates/times in your preferred format, and it will use your configured format when exporting back to CSV. That is not cool when your original file had YYYY-MM-DD HH:MM:SS and someone's computer decided to turn it into MM/DD/YY HH:MM. To avoid this conversion and import the columns as strings, you can change the file extension to .txt instead of .csv and then change each column type that you care about, which can be a lot of clicking. I had to change things back with a regular expression along the lines of:

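import re
s = "12/9/21 11:23"
# Groups: month, day, two-digit year, then " HH:MM"
match = re.match('([0-9]+)/([0-9]+)/([0-9]+)( [0-9]+:[0-9]+)', s)
# Assumes every year is 20xx; single-digit hours are left unpadded
date = '20%s-%s-%s%s:00' % (match.group(3).zfill(2), match.group(1).zfill(2), match.group(2).zfill(2), match.group(4))
print(date)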
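
The same conversion can also be sketched with the standard library's datetime module instead of a regular expression. This is only a sketch, assuming every value has the MM/DD/YY HH:MM shape shown above. The function name to_iso is made up for illustration.

```python
from datetime import datetime

def to_iso(value):
    """Convert 'M/D/YY H:MM' to 'YYYY-MM-DD HH:MM:SS'."""
    # %y maps two-digit years 00-68 to 2000-2068 and 69-99 to 1969-1999
    parsed = datetime.strptime(value, "%m/%d/%y %H:%M")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")

print(to_iso("12/9/21 11:23"))
```

Unlike the regular expression, this also zero-pads single-digit hours, and it raises a ValueError for values that don't match the expected shape instead of failing on a missing match.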

The pandas library for Python also likes to convert data types and NaN values automatically. In this particular situation, I wanted it to leave the columns alone and leave the nan string in my input alone. Otherwise, to_csv would replace nan with the blank string, which could mess up a different script that used this data as input. This is the code to do it:

import pandas as pd
df = pd.read_csv('filename.csv', encoding='utf-8', dtype=str, na_filter=False)
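
To check that data really survives the round trip, a small test along these lines might help. It is only a sketch: the inline sample data is made up, and it reads from an in-memory string instead of filename.csv.

```python
import io
import pandas as pd

# Made-up sample: leading zeros, a literal "nan" string, and an empty field
raw = "id,value\n007,nan\n008,\n"

# dtype=str keeps every column as text; na_filter=False stops "nan" and
# empty fields from being turned into missing values
df = pd.read_csv(io.StringIO(raw), dtype=str, na_filter=False)

# Write the data back out and compare it with the input, row by row
out = df.to_csv(index=False)
print(out.splitlines() == raw.splitlines())
```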

I'm probably going to run into this again sometime, so I wanted to make sure I put my notes somewhere I can find them later.

Re-encoding the EmacsConf videos with FFmpeg and GNU Parallel

| geek, linux, emacsconf

It turns out that using -crf 56 compressed the EmacsConf videos a little too aggressively, losing too much information in the video. We wanted to reencode everything, maybe going back to the default value of -crf 32. My laptop would have taken a long time to do all of those videos. Fortunately, one of the other volunteers shared a VM on a machine with 12 cores, and I had access to a few other systems. It was a good opportunity to learn how to use GNU Parallel to send jobs to different machines and retrieve the results.

First, I updated the compression script, compress-video-low.sh:

#!/bin/bash
# Usage: compress-video-low.sh CRF FILE [extra ffmpeg options...]
Q=$1
WIDTH=1280
HEIGHT=720
AUDIO_RATE=48000
VIDEO_FILTER="scale=w=${WIDTH}:h=${HEIGHT}:force_original_aspect_ratio=1,pad=${WIDTH}:${HEIGHT}:(ow-iw)/2:(oh-ih)/2,fps=25,colorspace=all=bt709:iall=bt601-6-625:fast=1"
FILE=$2
SUFFIX=$Q
shift
shift
# Two-pass VP9 encode: pass 1 discards its output; pass 2 copies the audio
# if the source is already WebM and encodes it with libvorbis otherwise.
ffmpeg -y -i "$FILE"  -pixel_format yuv420p -vf "$VIDEO_FILTER" -colorspace 1 -color_primaries 1 -color_trc 1 -c:v libvpx-vp9 -b:v 0 -crf $Q -aq-mode 2 -tile-columns 0 -tile-rows 0 -frame-parallel 0 -cpu-used 8 -auto-alt-ref 1 -lag-in-frames 25 -g 240 -pass 1 -f webm -an -threads 8 /dev/null &&
if [[ $FILE =~ "webm" ]]; then
    ffmpeg -y -i "$FILE" "$@"  -pixel_format yuv420p -vf "$VIDEO_FILTER" -colorspace 1 -color_primaries 1 -color_trc 1 -c:v libvpx-vp9 -b:v 0 -crf $Q -tile-columns 2 -tile-rows 2 -frame-parallel 0 -cpu-used -5 -auto-alt-ref 1 -lag-in-frames 25 -pass 2 -g 240 -ac 2 -threads 8 -c:a copy "${FILE%.*}--compressed$SUFFIX.webm"
else
    ffmpeg -y -i "$FILE" "$@"  -pixel_format yuv420p -vf "$VIDEO_FILTER" -colorspace 1 -color_primaries 1 -color_trc 1 -c:v libvpx-vp9 -b:v 0 -crf $Q -tile-columns 2 -tile-rows 2 -frame-parallel 0 -cpu-used -5 -auto-alt-ref 1 -lag-in-frames 25 -pass 2 -g 240 -ac 2 -threads 8 -c:a libvorbis "${FILE%.*}--compressed$SUFFIX.webm"
fi

I made an originals.txt file with all the original filenames. It looked like this:

emacsconf-2020-frownies--the-true-frownies-are-the-friends-we-made-along-the-way-an-anecdote-of-emacs-s-malleability--case-duckworth.mkv
emacsconf-2021-montessori--emacs-and-montessori-philosophy--grant-shangreaux.webm
emacsconf-2021-pattern--emacs-as-design-pattern-learning--greta-goetz.mp4
...

I set up a ~/.parallel/emacsconf profile with something like this so that I could use three computers and my laptop, sending one job each and displaying progress:

--sshlogin computer1 --sshlogin computer2 --sshlogin computer3 --sshlogin : -j 1 --progress --verbose --joblog parallel.log

I already had SSH key-based authentication set up so that I could connect to the three remote computers.

Then I spread the jobs over four computers with the following command:

cat originals.txt | parallel -J emacsconf \
                             --transferfile {} \
                             --return '{=$_ =~ s/\..*?$/--compressed32.webm/=}' \
                             --cleanup \
                             --basefile compress-video-low.sh \
                             bash compress-video-low.sh 32 {}

It copied each file over to the computer it was assigned to, processed the file, and then copied the file back.
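
After a run like this, it can be handy to see which originals don't have a compressed file yet. Here's a minimal Python sketch of that check, assuming the output naming used by the script above (the original name minus its extension, plus --compressed32.webm). The function names compressed_name and missing_outputs are made up for illustration.

```python
import os
import tempfile

def compressed_name(original, crf=32):
    """Mirror the script's "${FILE%.*}--compressed$SUFFIX.webm" naming."""
    base, _ext = os.path.splitext(original)
    return f"{base}--compressed{crf}.webm"

def missing_outputs(originals, directory, crf=32):
    """Return the originals whose compressed file is not in directory yet."""
    return [name for name in originals
            if not os.path.exists(os.path.join(directory, compressed_name(name, crf)))]

if __name__ == "__main__":
    originals = [
        "emacsconf-2021-montessori--emacs-and-montessori-philosophy--grant-shangreaux.webm",
        "emacsconf-2021-pattern--emacs-as-design-pattern-learning--greta-goetz.mp4",
    ]
    with tempfile.TemporaryDirectory() as directory:
        # Pretend that only the first video has come back so far
        open(os.path.join(directory, compressed_name(originals[0])), "w").close()
        print(missing_outputs(originals, directory))
```

This only checks that the file exists, not that the encoding finished.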

If I cancelled a run, it was also helpful to stop any leftover ffmpeg processes on all the machines with echo 'killall -9 ffmpeg' | parallel -J emacsconf -j 1 --onall.

It still took a long time, but less than it would have if any one computer had to crunch through everything on its own.

This was much better than my previous way of doing things, which involved copying the files over, running ffmpeg commands, copying the files back, and getting somewhat confused about which directory I was in and which file I assigned where and what to do about incompletely-encoded files.

I sometimes ran into problems with incompletely-encoded files because I'd cancelled the FFmpeg process. Even though ffprobe said the files were long, they were missing a large chunk of video at the end. I added a compile-media-verify-video-frames function to compile-media.el so that I could get the last few seconds of frames, compare them against the duration, and report an error if there was a big gap.
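
The general idea of that check can be sketched in Python. This is not the Emacs Lisp function from compile-media.el, just an illustration under a few assumptions: ffprobe is on the PATH, the function names are made up, the five-second tolerance is arbitrary, and it reads every video packet timestamp instead of only the last few seconds.

```python
import subprocess

def ffprobe_lines(args, path):
    """Run ffprobe quietly and return its output lines."""
    result = subprocess.run(["ffprobe", "-v", "error", *args, path],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

def container_duration(path):
    """Duration in seconds, as reported by the container."""
    lines = ffprobe_lines(["-show_entries", "format=duration",
                           "-of", "default=noprint_wrappers=1:nokey=1"], path)
    return float(lines[0])

def last_video_timestamp(path):
    """Largest presentation timestamp among the video packets, in seconds."""
    lines = ffprobe_lines(["-select_streams", "v:0",
                           "-show_entries", "packet=pts_time",
                           "-of", "csv=p=0"], path)
    stamps = []
    for line in lines:
        try:
            stamps.append(float(line.split(",")[0].strip()))
        except ValueError:
            continue  # skip blank lines and "N/A" values
    return max(stamps, default=0.0)

def missing_tail(duration, last_timestamp, tolerance=5.0):
    """Return the gap in seconds if the video ends well before the file does, else 0."""
    gap = duration - last_timestamp
    return gap if gap > tolerance else 0.0
```

For a given file, missing_tail(container_duration(path), last_video_timestamp(path)) returns a non-zero gap when the last video packet falls well short of the reported duration.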

Then I changed emacsconf-publish.el to use the new filenames, and I regenerated all the pages. For EmacsConf 2020, I used some Emacs Lisp to update the files. I'm not particularly fond of wrangling video files (lots of waiting, high chance of error), but I'm glad I got the computers to work together.

Toggle screen recording with AutoKey and vokoscreenNG

| geek

I want to be able to toggle recording with a keypress, but vokoscreenNG didn't seem to have a shortcut for that. AutoKey to the rescue! At first I tried to send <ctrl><shift><f10> directly, but that didn't work; I had to fake the keypresses instead. Here's the working script:

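# Runs inside AutoKey, which provides the keyboard and store objects
keyboard.press_key("<ctrl>")
keyboard.press_key("<shift>")
if not store.has_key('voko-running'):
    # Not recording yet: remember the state and send Ctrl+Shift+F10
    store.set_value('voko-running', 1)
    keyboard.fake_keypress('<f10>')
else:
    # Already recording: clear the state and send Ctrl+Shift+F11
    store.remove_value('voko-running')
    keyboard.fake_keypress('<f11>')
keyboard.release_key("<ctrl>")
keyboard.release_key("<shift>")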

I've bound it to my Menu key. Let's see if that makes things easier.

Making the most of the moment

| planning

This post isn't super-special, but I wanted to experiment with a workflow for making videos based on my sketches and dictation, so I made this.

2021-12-20a Making the most of the moment #planning #kaizen

Building on yesterday's reflection on waste, I thought I'd think about how to make the most of the moment. There are some things that are easier now and harder later, so I should take advantage of the situation to prepare for what's next. There are some things that are harder now and easier later, so that's a good opportunity to stress-test systems and improve things when things are a little bit easier. There are things that are about the same. They've got to be done anyway, so I might as well figure out how to keep learning and growing through them.

What sorts of things are easier now and harder later? Well, the big one is spending time with the kiddo. Right now, she's really interested in spending time with us. We have lots of time together, and she actually wants to be with me. I know that this is not always going to be the case. So as a result, I should take advantage of this opportunity to be present and make memories and all those other good things, but also to personally enjoy it, to store up all those things that I'm going to fondly remember when she's having a teenage angsty meltdown. One of the ways that I can help myself remember these things is by keeping a journal, maybe taking pictures and videos if it doesn't get in the way of enjoying the time with her, and also investing the time to build the skills and patterns that will help us later on in life.

Something that's easier now and harder later: schooling. Right now, she's in senior kindergarten. That's the second year of kindergarten. Next year, she's going to be in grade one. That means that we've been able to get away with a very relaxed, play-based sort of learning. It's just essentially her learning whatever she wants to do and me writing it up nicely in an observation spreadsheet. I've been slowly learning how to guide her interests by leaving interesting things lying around and supporting her interests wherever they take us. Right now, it's puzzles, for example. Eventually, I'll probably need to learn how to give her a little bit more structure so that she practises things like writing, drawing, spelling, and so on. That's someday.

Another thing that's easy now is that partly because of where she is developmentally and partly because of COVID, we're focused on our own little world. She's quite happy playing with us. She's not that interested in online classes or hanging out with other people yet. So she's focused on us at the moment, and again, that's an opportunity to be present and make memories, and to build those patterns.

Definitely easier now, harder later: time with my husband. Parenting is a lot easier when you can take a break and know that someone else is going to be there, especially when the kiddo has decided that only Daddy will do or that Mama's the meanest person in the world. He's also really great at helping me keep perspective. For example, when the kiddo hands me something that I'm not entirely sure I should take at the moment because my hands are full or she wants to give me something random, he always reminds me that their world is so small and they want to give us whatever they can. So keeping those things in mind is helpful. Of course, the relationship is great, and I'm not going to have that forever. I want to make the most of it while it's there. Also, he has a lot of skills that I want to learn. So I can take advantage of this time to learn those skills, bring those perspectives into my head, and get through the harder parts of raising a kid. It's not always going to be like this, so it's great that he's around to help, and then eventually (probably) it'll get easier. It'll get harder first, probably, and then easier.

My mom is another example of something that's easier now, but I've got to start preparing for when it's going to be hard. She's the only one in the Philippines. None of my sisters are there either, so estate tax paperwork is going to be a big headache. If we can get some of the preparation sorted out, then that makes later much easier. There's also enjoying the time with her while we have her. So, maybe that involves recording some calls or finding other ways to talk about stories or remember things or make that connection.

And lastly, there's this big question mark around climate change and society. I think things are not super easy now, but they will probably get a lot harder later on, so if I can build skills, help us develop more resilience, and build resources, that might put us in a better position for when things get crazier.

On the flip side, there are some things that are harder now and easier later. As I mentioned, it's a good opportunity to stress-test the systems and processes. So for example, the kiddo really loves getting our attention. She wants to spend time with us, which is a little hard when it comes to focusing on my own things. But eventually she'll move towards independence. At the moment, I can just relish the time I have with her and put off whatever I can so that I can just not worry, not feel like I'm being pulled in different directions. Then I can try to just use those little moments.

That's a second challenge: fragmented time. Eventually I'll be able to sit down and focus on things. Right now, it's a little hard. But fortunately, that's kind of like a preview of later, much much later, when it will be hard to focus on things, so any note-taking habits and processes that I build now might be helpful later. So, build systems and tools.

My tech setup is not quite as awesome as it could be. Sometimes it takes too much setup time to go downstairs and plug into the external monitor, set up all the things that I want. Context-switching is friction. So I can use what I have and gradually build on that toolkit, learning different ways of using the things that I've got and then add more as I can.

COVID-19 pandemic: hard now, someday easier, maybe? It mostly means that we aren't relying on external resources. I can't take her to library story time or other things like that. There are also supply chain issues to watch out for. Less socialization, can't really take her out to see friends. We could do some outdoor playdates, but even then, it's harder to arrange. And then, of course, there's a lot of risk and uncertainty. Again: put off what I can and look for opportunities to make the most of things. For example, virtual kindergarten has been working out really well for us.

Of course, screen time is an issue, especially with young kids. I just have to find other things to do, like draw these reflections and solve Rubik's cubes and things like that.

There are some things that are about the same now as well as later. Cooking, for example. It's always got to be done, but I can keep growing by trying new recipes and techniques.

Tidying. I've got to keep working on ways to see clutter and get rid of it, maybe figure out where things are supposed to go. I lose a lot of time trying to find things if I have put them down in a moment of inattention, so I have to figure out how to smooth that out.

Gardening stuff happens every year. It's always a new opportunity to try different plants or learn more skills. This year, we learned how to transplant periwinkle and start them from cuttings.

Health, got to keep working on that. Finding things that I enjoy doing as a form of movement will make it easier later on, too.

There are lots of different things that I can do now to prepare for harder things later, and lots of different ways I can take advantage of what's tough now so that I can have ideas for things to improve later on, when things get better. There are things that are about the same. It's all about making the most of the moment.

2021-12-20 Emacs news

| emacs, emacs-news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Hacker News, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, emacs-devel, and lemmy/c/emacs.

Reflecting on wasted effort

| kaizen

One way to look for ways to improve is to think about where the waste might be. I wanted to reflect on how I'm currently doing things and where I might be wasting effort.

  • Not noticing an opportunity: I might not notice that there's an opportunity to improve, or I might not see that I can do something that takes advantage of something I'm already doing.
  • Working on the wrong thing: If I pick something less effective to work on, I waste a little opportunity. Something might be a bad fit if it bumps into my weaknesses or doesn't take advantage of my strengths. Maybe I'm picking the wrong problem to work on, or I'm taking the wrong approach, or I haven't prepared, or I'm working on something that may be high effort and low reward. It's usually not a big deal, but it helps to think a little bit about which tasks can lead to compounding benefits and which are one-offs that don't help as much.
  • Working at the wrong time: I feel a little slower working on something when I'm not in the right mindset or I'm not as interested in it as in other things I could be doing. It's also tough when I don't have enough energy to work on things. It's important to notice when I'm getting into the negative productivity zone, especially when coding. If I pick the wrong time to work on something, I might have to deal with lots of interruptions or distractions.
  • Context-switching: Context-switching is a particularly big challenge for me because I'm basically working with one to two hour chunks possibly separated by days or months. For example, if I start something on Tuesday and then I pick it up again on Friday, I need to do a fair bit of rethinking and remembering. Switching from one thing to another is hard. I'm always looking up how to do something in the specific language that I need to work with. It's related to the problem of…
  • Duplicate research: Sometimes I have to reread the resources that would help me prepare for that task.
  • Tunnel vision: On the flip side, focusing too much on one project means not thinking about other things. Everything else tends to be on the back burner because of context-switching costs, and that sometimes leads to…
  • Letting an opportunity lapse: Sometimes it's too late to get the most out of something because a person who wanted it has moved on (including me) or because I'd completely forgotten the context of my notes. This applies to real life, too. A- is not going to want to hang out with me forever, so I should make the most of it. =)
  • Forgetting the context: Quick notes are sometimes too quick.
  • Fragmented time: Since I need to work in short bursts, I have to get to a good stopping point. That can be tough.
  • Frittering away time on distractions: It can also be tough working on something that doesn't fit into five minutes here, five minutes there. There's a big temptation to fritter time away on distractions like scrolling through Reddit, or just working on small, easy stuff instead of thinking about the harder problems.
  • Repetitive steps that could be automated: Waste could also be working on things that the computer could be working on instead.
  • Not making the most of it: If I'm not paying attention, I might not get as much out of an experience or task as I could have.
  • Not harvesting notes/code: It's very tempting sometimes to try to work quickly and just solve the problem for today. But if I take a little bit of extra time to harvest my notes from it, then I might be able to solve that problem when I run into the same problem, six months later or something like that.
  • Doing more than needed: The principle of You Ain't Gonna Need It often comes into play here, especially if I need to squeeze things down to fit into the chunks of time I have.
  • Missing pieces, incomplete notes: If I write something incomplete, I might have to redo more of it when I want to reuse it or build on it.
  • Forgetting where to find something: If I can't even remember the keywords needed to find something, that's even more of a waste of good notes.
  • Mistakes: Mistakes happen, and that's another source of wasted effort. If I'm in a rush or if I'm being impatient, I am not very good at paying attention to details. Then, when I need to go back and fix things, I have to deal with the context-switching all over again.
  • Doing things that might be a better fit for other people: So if there's something that can be done by somebody who's more detail-oriented or who has more time to look at all the small things or who has those particular skills or interests, it's better for them to do it. Then I can focus on the stuff that fits me.
  • Limits of tools: If I'm coding, doing it on my laptop with maybe two side-by-side windows is not quite as effective as plugging into the external monitor and getting all the things set up so I can see things instead of switching between overlapping things.
  • Having things in a form that's hard to search or skim: Videos and sketches can be hard to search or skim, so sometimes it makes sense to go back and actually write the text for it.
  • Negative feelings: For example, if the kiddo really wants my attention and I'm trying to complete a thought, it's tough not to get frustrated by the interruptions. It helps to be able to pull myself back and actually focus on her because there's no getting around that anyway, and then to do my thing later. It's also good to not let that frustration linger, because then that gets in the way of both enjoying her company and being able to focus on my own thing afterwards.

Now to think a little more closely about my main challenges…

Dealing with the fragmentation of my time is a big challenge. The way that I might do that is by grouping tasks together, so I don't have to switch context so much. Tunnel vision hasn't been too much of a problem for me so far, although it does mean that some things don't get worked on for a long time.

Taking a little bit of extra time to write up my notes makes sense, although it means my chunks of coding time have to be even shorter. Extracting excerpts for literate programming posts is a bit tough if I need to think about how to provide enough context. Maybe I should let myself fill things in later. I'm also looking into ways to do that faster, like maybe auto-generated captions running in the background so I can think out loud, grab the transcript, and then edit it a little bit (like this post). We'll see how that goes.

It makes sense to invest some time into expanding my toolset, like learning more about my tools, automating things, or taking advantage of hardware or software.

Of course, I'm still probably going to run into mistakes along the way, but if I can figure out which things are not as good for me, then I can see if other people want to go pick them up.

Might be a reasonable plan for reducing waste. Let's see how it works out.

Why I Love Emacs - from Bob Oliver

| emacs, org

Sometimes I post updates from people who don't have their own blog. Here's one from Bob Oliver. - Sacha

This short article sets out why I, as an Emacs newbie, really, really love this software. But before I get into that I would like to explain my voyage (Note: absence of the 'journey' word) to Emacs.

Many moons ago, back in the late seventies / early eighties I was a Cobol programmer, a job I loved. As it is with life, circumstances change and I moved away from Data Processing, as we called it in olden days. This meant I had to get my programming fix using my Sinclair Spectrum, which I programmed using their version of BASIC. I learned how to build my own, very simple games, and spent many hours playing my games and programming more. Then the children came along, the Sinclair went into the loft (attic for non-UK readers) and I had little or no time for hobbies.

Years later, with family grown and flown the nest, the Raspberry Pi was released and revived my love of programming. I took to learning C and Python - though remain very much at the beginner stage. All very enjoyable. This sparked a notion that I might be able to build an app and enhance my future pension prospects. To this end I installed Xcode on my MacBook and also tried VS Code. Needless to say I have not achieved proficiency and have since removed those products from my MacBook.

I still wanted to enhance my knowledge of C, Python and Bash, and so was really pleased when the Raspberry Pi Foundation released Raspberry O/S Desktop for Mac (apologies if this name is not technically correct). This enabled me to re-purpose an old MacBook (circa 2009 and no longer supported) as a Linux machine, which got me interested in learning all things Linux. This led to me installing Emacs as my code editor. Through reading all things Emacs I discovered org-mode and now Emacs is my text editor of choice.

Like most new Emacs users, probably, I found it a bit confusing at first, but I did as recommended and stuck with it, and I am really glad I did.

What do I use Emacs for?

A very good question. Short answer is code and text editor.

  1. Writing, compiling, testing and running C programs.
  2. Writing, testing and running Bash scripts.
  3. Writing, testing and running Python programs.
  4. Compiling my, not so, daily journal.
  5. Using org-mode as my word processor of choice.

The key reason for using org-mode for my journal was portability and long-term accessibility. I had used various electronic journals before, each with their own proprietary file standards, making me concerned that my journal would not be available to my children long after I have gone. Also, as Linux, and hence org-mode, use plain text files, I can edit them with any text editor on any platform, so I can be assured that I can move the files as and when I change computers. Also, as plain text files, they are readily searchable, so I can recall memories easily.

Finding Emacs and org-mode is probably one of the best things I have done since I retired from full-time employment.

What next:

  1. Maintain my journal writing.
  2. Write up my poems in org-mode - I have several going back to my teenage years.
  3. Develop my writing skills and maybe write a novel.
  4. Learn how to send and receive mail through Emacs - I have yet to find a guide that is not too technical / complicated for me.

SO MY MESSAGE IS JOIN THE EMACS AND ORG SOCIETY - YOU WON'T REGRET IT.

Bob Oliver, Essex, England.
