Today I ran part of the way to work. It was a cold, beautiful winter morning in Stockholm.

Me running with water and Stockholm City Hall in the background.

Sometimes, I solve programming problems by coding on paper. A few days ago, it looked like this:

A piece of paper with source code written on it with annotations.

I’ve started working on a code editor that is a mix of a text editor and a structured editor. It is all text, but parsers and pretty printers allow you to work with a tree structure and not think too much about syntax. It is a work in progress. Code is here.

Screenshot of rledit editing a JSON document with a selection.

We got some more snow. I like running in the winter. Especially when there is snow and the sun is shining.

Me running in a snow-covered landscape with the sun setting in the background.

I needed to submit some heic photos to a service that only accepted jpg. I didn’t know about the heic format, but a little searching gave me a solution:

$ heif-convert
bash: heif-convert: command not found...
Install package 'libheif' to provide command 'heif-convert'? [N/y] y
...
$ find . -iname '*.heic' -exec heif-convert -q 100 {} {}.jpg \;
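The find command above names each output by appending .jpg, so photo.heic becomes photo.heic.jpg. If cleaner names are wanted, a small Python wrapper is one option. This is only a sketch: the function names are made up for illustration, and it assumes heif-convert is on the PATH and accepts the same -q option used above.

```python
import subprocess
from pathlib import Path


def jpg_path_for(heic_path):
    """Return the output path with the .heic suffix swapped for .jpg."""
    return Path(heic_path).with_suffix(".jpg")


def build_command(heic_path, quality=100):
    """Build the heif-convert invocation (same options as the find command)."""
    return ["heif-convert", "-q", str(quality), str(heic_path), str(jpg_path_for(heic_path))]


def convert_all(root="."):
    """Convert every .heic file below root, matching the suffix case-insensitively."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() == ".heic":
            subprocess.run(build_command(path), check=True)
```

Calling convert_all() from the photo directory would then write photo.jpg next to photo.heic.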

Today was the first day of snow this season. Not much. I’m looking forward to many more runs on a white trail.

Me running on a trail with a little snow.

I was researching how to run Black (and possibly other formatters) from Vim and found Ergonomic mappings for code formatting in Vim. It was very helpful.

How would you improve this code?

def update_r_users(service):
    r_users = []
    for user in service.get_all_users():
        if "r" in user:
            r_users.append(user)
    service.set_users_in_group("users_with_r_in_name", r_users)

Find out what I did in my latest newsletter.
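One possible direction, which may or may not match what the newsletter describes, is to separate the pure filtering logic from the calls to the service so that the logic can be exercised on plain lists. The helper name select_r_users and the FakeService class below are invented for this sketch.

```python
def select_r_users(users):
    """Pure function: keep the users whose name contains an 'r'."""
    return [user for user in users if "r" in user]


def update_r_users(service):
    """Thin wrapper that does the I/O around the pure filtering logic."""
    r_users = select_r_users(service.get_all_users())
    service.set_users_in_group("users_with_r_in_name", r_users)


class FakeService:
    """Hypothetical in-memory stand-in for the real service."""

    def __init__(self, users):
        self.users = users
        self.groups = {}

    def get_all_users(self):
        return list(self.users)

    def set_users_in_group(self, group, users):
        self.groups[group] = users


service = FakeService(["rickard", "anna", "peter"])
update_r_users(service)
print(service.groups)
```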

Today I learned about the Rison data serialization format. I wrote a function to convert a Python value to Rison format. It was an elegant recursive function with partial support for the format.
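A recursive encoder along those lines could look roughly like the sketch below. It is not the function from this post, just an illustration under assumptions: it covers only None, booleans, integers, strings, lists, and dicts with string keys, and it always quotes strings even where Rison would allow a bare identifier.

```python
def to_rison(value):
    """Encode a Python value as Rison (partial support only)."""
    # Booleans are checked before integers because bool is a subclass of int.
    if value is True:
        return "!t"
    if value is False:
        return "!f"
    if value is None:
        return "!n"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Inside quotes, '!' escapes both '!' and the quote character.
        escaped = value.replace("!", "!!").replace("'", "!'")
        return f"'{escaped}'"
    if isinstance(value, list):
        return "!(" + ",".join(to_rison(item) for item in value) + ")"
    if isinstance(value, dict):
        pairs = []
        for key, item in value.items():
            if not isinstance(key, str):
                raise TypeError("This sketch only supports string keys")
            pairs.append(f"{to_rison(key)}:{to_rison(item)}")
        return "(" + ",".join(pairs) + ")"
    raise TypeError(f"Unsupported type for this sketch: {type(value).__name__}")


print(to_rison({"a": [1, True, None], "b": "it's"}))
```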

I’ve used testing without mocks quite extensively now. I’ve also used it in a work project for more than a year. My experience is that it’s the best testing strategy that I’ve ever used. I’ve never felt more confident that my code works. I refactor code without fear of it breaking. It’s so good.

It’s getting dark. It gives variation to the running.

Me running in the dark.

Various things have kept me from running for a while. Today I had enough. I just had to go for a short run. It was the first run with warmer clothes. The weather was nice. I reclaimed some energy.

Me running.

Pull requests discourage experiments because changes can only propagate after approval. The idea behind PRs is to only approve “good” changes.

First, the learning opportunities of mistakes are gone. Second, you might lose interest in experimenting because you are afraid of making mistakes.

Today I just needed to run. I had not run since I hurt my Achilles tendon almost a month ago. I wanted to see if it still hurt. I felt something, but not too much. I think I still need to take it easy with running, but man, it felt good to be moving again.

Me running.

If you want to know how to implement a Bash-like shell, with support for redirects, in only 31 lines of Python, you should check out my latest blog post Bash Redirects Explained.

Do you know the difference between the following Bash commands?

program 2>&1 >/tmp/log.txt
program >/tmp/log.txt 2>&1

If not, you might be interested in my latest blog post Bash Redirects Explained.

Bash Redirects Explained

I thought I knew how Bash redirects worked.

If I wanted to redirect the output of a command to a file, I’d type this:

program > /tmp/log.txt

If I wanted to pipe both stdout and stderr to a text editor for further processing, I’d type this:

program 2>&1 | vim -

I knew that 2>&1 meant redirecting stderr to stdout, so that error output appears on stdout as well.

I knew certain patterns for certain situations. But when I encountered situations where I had not learned a pattern, I was lost. For example, I could not explain the difference between

program 2>&1 >/tmp/log.txt

and

program >/tmp/log.txt 2>&1

And I got scared when I saw something like this:

program < input.txt > output.txt 2>&1

Have you also been there? What did you do?

I would search the Internet for a pattern that matched the use case, or just try different alternatives and notice how they behaved.

I did this until one day when I learned a mental model for how Bash redirects work. Now I no longer need to rely on patterns. I can easily parse any situation and use any combination of redirects for my purposes.

The rest of this article explains this mental model.

The Standard Streams

A process has three standard streams attached to it:

  • stdin (0)
  • stdout (1)
  • stderr (2)

Diagram of the three streams of a process.
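The numbers in parentheses are the file descriptors that the operating system uses for the streams. A quick way to check them, sketched here in Python, is to start a child process and let it report the descriptor behind each of its standard streams:

```python
import subprocess
import sys

# The child prints the file descriptor number behind each standard stream.
child_code = "import sys; print(sys.stdin.fileno(), sys.stdout.fileno(), sys.stderr.fileno())"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdin=subprocess.DEVNULL,  # keep the child from touching our own stdin
    capture_output=True,
    text=True,
)
fd_numbers = result.stdout.split()
print(fd_numbers)
```

The child still sees descriptors 0, 1, and 2 even though, in this sketch, they are connected to /dev/null and pipes rather than a terminal.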

When we start a program from the terminal, Bash sets up the standard streams as follows:

  • stdin: terminal/keyboard
  • stdout: terminal
  • stderr: terminal

What redirects do is to modify what the standard streams point to before the program starts executing.

  • < means modify stdin.
  • > means modify stdout.
  • 2> means modify stderr.

That is the mental model: redirects modify standard streams before program execution.
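Python’s subprocess module can be used for a rough illustration of the same idea: the parent decides what the child’s streams point to, and the child just reads stdin and writes stdout. The file names and the inline child program below are made up for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A child that upper-cases stdin to stdout and knows nothing about files.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "input.txt"
    target = Path(tmp) / "output.txt"
    source.write_text("hello\n")

    # Roughly what `program <input.txt >output.txt` asks Bash to do:
    # open the files first, then start the program with them as stdin/stdout.
    with source.open("rb") as stdin_file, target.open("wb") as stdout_file:
        subprocess.run(
            [sys.executable, "-c", child_code],
            stdin=stdin_file,
            stdout=stdout_file,
            check=True,
        )

    copied = target.read_text()

print(copied)
```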

Let’s evaluate a few examples using this mental model to see how it works.

Logcat Utility

To be able to show what happens in different examples, we have a utility program, logcat.py, that makes use of all three streams. It reads text from stdin, logs the arguments and the length of the text to stderr, and writes the text to stdout. It looks like this:

#!/usr/bin/env python

import sys

text = sys.stdin.read()

sys.stderr.write(f"Args: {(sys.argv[1:])}\n")
sys.stderr.write(f"Read {len(text)} characters.\n")

sys.stdout.write(text)

Example: No Redirect

Let’s start with an example without redirects to see the operation of logcat.py:

$ ./logcat.py ignored arguments

Before logcat.py starts executing, Bash sets up the standard streams as follows:

  • stdin: terminal/keyboard
  • stdout: terminal
  • stderr: terminal

When execution starts, logcat.py waits for input. If we type hello in the terminal (followed by a return and ctrl+d), the following is printed to the terminal:

Args: ['ignored', 'arguments']
Read 6 characters.
hello

We can see that it read our input from the terminal/keyboard and wrote the log messages along with our input to the terminal as well.

Example: Redirect Stdin

Now let’s modify stdin so that it points to the logcat.py source code instead of the terminal/keyboard:

$ ./logcat.py ignored arguments <logcat.py

This instructs Bash to modify stdin to point to the file logcat.py.

Before logcat.py starts executing, Bash sets up the standard streams as follows:

  • stdin: logcat.py (opened in read mode)
  • stdout: terminal
  • stderr: terminal

When execution starts, the following is printed to the terminal:

Args: ['ignored', 'arguments']
Read 182 characters.
#!/usr/bin/env python

import sys

text = sys.stdin.read()

sys.stderr.write(f"Args: {(sys.argv[1:])}\n")
sys.stderr.write(f"Read {len(text)} characters.\n")

sys.stdout.write(text)

We can see that the redirect operation is stripped from the arguments. Only Bash sees it; it is not passed along to the program. Furthermore, we can see that the logcat.py source code is printed to the terminal.

Example: Redirect Stdin and Stdout

Let’s say we’re only interested in the log messages, and want to throw away stdout:

$ ./logcat.py ignored arguments <logcat.py >/dev/null

This instructs Bash to modify stdin to point to the file logcat.py and to modify stdout to point to the file /dev/null.

Before logcat.py starts executing, Bash sets up the standard streams as follows:

  • stdin: logcat.py (opened in read mode)
  • stdout: /dev/null (opened in write mode)
  • stderr: terminal

When execution starts, the following is printed to the terminal:

Args: ['ignored', 'arguments']
Read 182 characters.

We can see that the redirect operations are all stripped from the arguments and the source code has been written to /dev/null and is thus not shown in the terminal.

Extended Mental Model

Let’s extend the mental model to clarify how Bash operates.

When Bash parses a command, it divides it into two parts: the arguments and the redirects. Before it starts executing the program with the arguments, it goes through the redirects, in order, and configures the standard streams accordingly.

Example: Redirect All Streams

Let’s see how we can interpret a more complex command using the extended mental model:

$ ./logcat.py <logcat.py is the >out.txt best 2>&1 thing

If we split this into arguments and redirects, we get this:

  • Arguments: ./logcat.py, is, the, best, thing
  • Redirects: <logcat.py, >out.txt, 2>&1
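The split above can be sketched as a small function. It is a simplification that, like the examples in this article, assumes parts are separated by single spaces and that each redirect is written without a space before its target. The function name is made up for illustration.

```python
def split_command(command):
    """Split a command line into (arguments, redirects), keeping order."""
    arguments = []
    redirects = []
    for part in command.split(" "):
        if part.startswith(("<", ">", "2>")):
            redirects.append(part)
        else:
            arguments.append(part)
    return arguments, redirects


arguments, redirects = split_command(
    "./logcat.py <logcat.py is the >out.txt best 2>&1 thing"
)
print("Arguments:", arguments)
print("Redirects:", redirects)
```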

Now, let’s evaluate the redirects in order. The state of the standard streams at start is this:

  • stdin: terminal/keyboard
  • stdout: terminal
  • stderr: terminal

Then we evaluate <logcat.py and get this:

  • stdin: logcat.py (opened in read mode)
  • stdout: terminal
  • stderr: terminal

Then we evaluate >out.txt and get this:

  • stdin: logcat.py (opened in read mode)
  • stdout: out.txt (opened in write mode)
  • stderr: terminal

Then we evaluate 2>&1, which means modify stderr (2>) to be whatever stdout points to (&1), and get this:

  • stdin: logcat.py (opened in read mode)
  • stdout: out.txt (opened in write mode)
  • stderr: out.txt (opened in write mode)

After the standard streams have been set up, execution starts with the arguments ./logcat.py, is, the, best, and thing. Nothing appears on the terminal since all output has been redirected to out.txt:

$ cat out.txt
Args: ['is', 'the', 'best', 'thing']
Read 182 characters.
#!/usr/bin/env python

import sys

text = sys.stdin.read()

sys.stderr.write(f"Args: {(sys.argv[1:])}\n")
sys.stderr.write(f"Read {len(text)} characters.\n")

sys.stdout.write(text)
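The step-by-step evaluation in this example can also be written as a small simulation that tracks what each stream points to. It is only a model of the bookkeeping (no files are opened), and the function name is invented for this sketch. It also answers the question from the introduction: the order of 2>&1 and >file matters because 2>&1 copies whatever stdout points to at that moment.

```python
def evaluate_redirects(redirects):
    """Return what stdin, stdout, and stderr point to after the redirects."""
    streams = {"stdin": "terminal/keyboard", "stdout": "terminal", "stderr": "terminal"}
    for redirect in redirects:
        if redirect == "2>&1":
            # Copy the current target of stdout, not a link to stdout itself.
            streams["stderr"] = streams["stdout"]
        elif redirect.startswith("2>"):
            streams["stderr"] = redirect[2:]
        elif redirect.startswith(">"):
            streams["stdout"] = redirect[1:]
        elif redirect.startswith("<"):
            streams["stdin"] = redirect[1:]
        else:
            raise ValueError(f"Not a redirect this sketch understands: {redirect}")
    return streams


print(evaluate_redirects(["<logcat.py", ">out.txt", "2>&1"]))
print(evaluate_redirects(["2>&1", ">/tmp/log.txt"]))
print(evaluate_redirects([">/tmp/log.txt", "2>&1"]))
```

In the second call stderr stays on the terminal because stdout still pointed there when 2>&1 was evaluated. In the third call both streams end up in /tmp/log.txt.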

Mini Shell

I created a mini version of a shell to demonstrate how straightforward it is to implement redirects with POSIX system calls. It works exactly as the extended mental model describes, and because it is running software, it fills in some more details of the model. I would guess that Bash does something similar, even though I haven’t read its source code.

First off, here is a demo that shows how the mini shell can replicate the complex example from above:

$ ./minishell.py
~~?~~> ./logcat.py <logcat.py is the >out.txt best 2>&1 thing
~~0~~> cat out.txt
Args: ['is', 'the', 'best', 'thing']
Read 182 characters.
#!/usr/bin/env python

import sys

text = sys.stdin.read()

sys.stderr.write(f"Args: {(sys.argv[1:])}\n")
sys.stderr.write(f"Read {len(text)} characters.\n")

sys.stdout.write(text)

And here is the implementation in only 31 lines of Python:

#!/usr/bin/env python

import os
import sys

STDIN  = 0
STDOUT = 1
STDERR = 2

statuscode = "?"
while True:
    sys.stdout.write(f"~~{statuscode}~~> ")
    sys.stdout.flush()
    command = input()
    pid = os.fork()
    if pid == 0:
        args = []
        for part in command.split(" "):
            if part.startswith("<"):
                os.dup2(os.open(part[1:], os.O_RDONLY), STDIN)
            elif part.startswith(">"):
                os.dup2(os.open(part[1:], os.O_WRONLY|os.O_CREAT|os.O_TRUNC, 0o644), STDOUT)
            elif part == "2>&1":
                os.dup2(STDOUT, STDERR)
            elif part.startswith("2>"):
                os.dup2(os.open(part[2:], os.O_WRONLY|os.O_CREAT|os.O_TRUNC, 0o644), STDERR)
            else:
                args.append(part)
        os.execvp(args[0], args)
    else:
        _, statuscode = os.waitpid(pid, 0)

To understand how this works, you need some knowledge of the POSIX system calls fork, waitpid, open, dup2, and execvp. But even if you don’t understand the specifics, I think this codified model can help in understanding how Bash operates. Let’s look at an example.
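Before moving on to that example, here is a stripped-down sketch of the fork, execvp, and waitpid part on its own, without any redirects, for readers new to these calls. It assumes a POSIX system and Python 3.9 or later (for os.waitstatus_to_exitcode), and the exit code 3 is arbitrary. Note that waitpid returns an encoded wait status rather than the plain exit code, so the sketch decodes it.

```python
import os
import sys

pid = os.fork()
if pid == 0:
    # Child: replace this process with a new program.
    try:
        os.execvp(sys.executable, [sys.executable, "-c", "import sys; sys.exit(3)"])
    finally:
        # Only reached if execvp failed; leave without running parent cleanup.
        os._exit(127)
else:
    # Parent: wait for the child and decode its wait status.
    _, status = os.waitpid(pid, 0)
    exit_code = os.waitstatus_to_exitcode(status)
    print(exit_code)
```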

Example: Duplicated Files

Let’s see if we can explain the difference between the following commands using the mini shell for the model:

$ ./logcat.py <logcat.py >out.txt 2>out.txt
$ ./logcat.py <logcat.py >out.txt 2>&1

At first glance, it looks like both commands redirect both stdout and stderr to the out.txt file. But if we evaluate them like the mini shell does, we see that the first example will open the file twice (two calls to os.open creating two file handles), whereas the second example will open the file only once and then duplicate the file handle for stderr.

When two file handles are created, each has its own position in the file, so writes to the two streams will attempt to write to the same location and overwrite each other. Furthermore, buffering might change the order in which writes happen, so it is not clear what will actually end up in the file. So to make sure all output is captured in the file, the second example should be used, where the file is only opened once.
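The difference can be reproduced without a shell. The sketch below writes to a temporary file through two descriptors, first obtained by opening the file twice and then by duplicating a single descriptor. The helper name and the written text are made up, and os.dup stands in for the dup2 call used by the mini shell.

```python
import os
import tempfile

FLAGS = os.O_WRONLY | os.O_CREAT | os.O_TRUNC


def write_via_two_descriptors(path, duplicate):
    """Write 'aaaa' via one descriptor and 'bb' via another; return the file content."""
    first = os.open(path, FLAGS, 0o644)
    # Either share the open file (and its offset) or open the file a second time.
    second = os.dup(first) if duplicate else os.open(path, FLAGS, 0o644)
    try:
        os.write(first, b"aaaa")
        os.write(second, b"bb")
    finally:
        os.close(first)
        os.close(second)
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.txt")
    opened_twice = write_via_two_descriptors(path, duplicate=False)
    duplicated = write_via_two_descriptors(path, duplicate=True)

print(opened_twice)
print(duplicated)
```

With two separate opens, each descriptor has its own file offset, so the second write lands on top of the first. With a duplicated descriptor the offset is shared and the writes follow each other.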

Conclusion

There is still more to Bash redirects than what I have explained here. But this mental model (along with its extended version) has helped me reason about Bash redirects. I hope it will do the same for you.