Avoiding automatic data type conversion in Microsoft Excel and Pandas

| coding, python

Automatic conversion of data types is often handy, but sometimes it can mess things up. For example, when you import a CSV into Microsoft Excel, it will helpfully convert and display dates and times in your preferred format, and it will use that configured format when exporting back to CSV. That's not cool when your original file had YYYY-MM-DD HH:MM:SS and someone's computer decided to turn it into MM/DD/YY HH:MM. To avoid this conversion and import the columns as strings, you can change the file extension to .txt instead of .csv and then change each column type that you care about, which can be a lot of clicking. In my case the damage was already done, so I had to change things back (the seconds were gone, so they come back as :00) with a regular expression along the lines of:

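import re

s = "12/9/21 11:23"
# Excel wrote M/D/YY H:MM; capture each part separately so it can be zero-padded
match = re.match(r'([0-9]+)/([0-9]+)/([0-9]+) ([0-9]+):([0-9]+)', s)
month, day, year, hour, minute = (part.zfill(2) for part in match.groups())
# Assumes every two-digit year belongs to the 2000s
date = '20%s-%s-%s %s:%s:00' % (year, month, day, hour, minute)
print(date)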
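
If you need to run the same fix over a whole file instead of one string, something like the following might work. It's only a sketch using the standard csv module: the sample data and the "timestamp" column name are made up, and it assumes all two-digit years are in the 2000s. Values that don't look like M/D/YY H:MM are passed through untouched, so it should be safe to run on a file that is only partly mangled.

```python
import csv
import io
import re

# M/D/YY H:MM, the shape of the mangled values described above
EXCEL_DATE = re.compile(r'([0-9]+)/([0-9]+)/([0-9]{2}) ([0-9]+):([0-9]+)')


def excel_to_iso(value):
    """Return YYYY-MM-DD HH:MM:00 for M/D/YY H:MM input; leave anything else alone."""
    match = EXCEL_DATE.fullmatch(value)
    if match is None:
        return value
    month, day, year, hour, minute = (part.zfill(2) for part in match.groups())
    # Assumption: every two-digit year belongs to the 2000s
    return '20%s-%s-%s %s:%s:00' % (year, month, day, hour, minute)


def fix_column(rows, column):
    """Yield copies of the dict rows with one column converted."""
    for row in rows:
        fixed = dict(row)
        fixed[column] = excel_to_iso(fixed[column])
        yield fixed


# Hypothetical sample data; "timestamp" is a made-up column name
sample = io.StringIO(
    "id,timestamp\n"
    "1,12/9/21 11:23\n"
    "2,1/2/21 9:05\n"
    "3,2021-12-09 11:23:00\n"
)
for row in fix_column(csv.DictReader(sample), "timestamp"):
    print(row["id"], row["timestamp"])
```

The csv module never guesses at types, so every value stays a string on the way in and on the way out.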

The pandas library for Python also likes to do this kind of automatic conversion, both for data types and for NaN values. In this particular situation, I wanted it to leave the columns alone and keep the literal string nan from my input. Otherwise, read_csv would treat nan as a missing value and to_csv would write it back out as a blank string, which could mess up a different script that used this data as input. This is the code to do it:

import pandas as pd
df = pd.read_csv('filename.csv', encoding='utf-8', dtype=str, na_filter=False)
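
To see what those two options change, here is a small comparison that reads the same data both ways and writes it back out. It's a sketch, not something from my actual script: the sample data and column names are invented, and it reads from an in-memory string instead of a file.

```python
import io

import pandas as pd

# Hypothetical sample data; the column names and values are made up
raw = (
    "id,label,when\n"
    "001,nan,2021-12-09 11:23:00\n"
    "002,ok,2021-12-10 08:05:00\n"
)

# Default behaviour: types are inferred and "nan" counts as a missing value
converted = pd.read_csv(io.StringIO(raw))

# Every column stays a string and "nan" is left untouched
untouched = pd.read_csv(io.StringIO(raw), dtype=str, na_filter=False)

round_trip_converted = converted.to_csv(index=False)
round_trip_untouched = untouched.to_csv(index=False)

print(round_trip_converted)
print(round_trip_untouched)
```

With the defaults, the id column loses its leading zeros and the nan label comes back blank. With dtype=str and na_filter=False, the output should match the input line for line.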

I'm probably going to run into this again sometime, so I wanted to make sure I put my notes somewhere I can find them later.

Started learning how to interactively debug JavaScript in Emacs with Indium

| 11ty, emacs, coding

I noticed something strange in my static blog: my blogging category page didn't list my post on statically generating my blog with Eleventy. Now it does, of course, since I fixed it. But it didn't, and that was weird. I tried using console.log to debug it, but it was annoying to try to figure out the right thing to print out in a long list of nested objects. Besides, console.log debugging is so… last century.

Since these tips for debugging in 11ty mentioned interactively debugging things in VS Code, I decided it was a good time to learn how to use Indium, a JavaScript development environment for Emacs.

(use-package indium :hook ((js2-mode . indium-interaction-mode)))

After some trial and error, this was the .indium.json file that allowed me to use M-x indium-launch to start the Eleventy process.

{
  "configurations": [
    {
      "name": "11ty",
      "type": "node",
      "program": "node",
      "args": "./node_modules/.bin/eleventy"
    }
  ]
}

I originally had "inspect-brk": true in it as well, following the suggested configuration, but I found it easier to just set breakpoints in my files using indium-add-breakpoint (C-c b b, a keybinding set up by indium-interaction-mode in my js2-mode-hook).

Conditional breakpoints didn't seem to work, so I just put my logic in an if and set my breakpoint in there.

  categories.forEach((item) => {
    if (item.slug == 'blogging') {
      let post = data.collections._posts.find(o => o.inputPath.match(/statically-generating-my-blog-with-eleventy/));
      console.log(post);
    }
    ...
  });

When I set my breakpoint on the let post... line and ran M-x indium-launch, I got an interactive debugger at that breakpoint. I could also switch to the REPL console and type stuff. Yay!

As it turned out, the post I wanted wasn't showing up in the list of posts. It was because I had used eleventyConfig.setTemplateFormats and forgotten to include md for Markdown files. Once I figured out what was going on, it was easy to fix.

This is what the debugger looks like. It adds values to the ends of lines, and you can evaluate things.

Screenshot_20210816_002331.png

I'm looking forward to learning more about using Indium to debug scripts running in Node or Chrome. I'm slowly, slowly getting some focused time to sharpen the saw!

If you use Emacs for JavaScript development and you're curious about Indium, you can check out the documentation.