Shell alias evolution

I work with Linux. That shouldn’t come as a surprise considering where I work. My private system does not differ that much either. Since I also work with automation my private systems are naturally configured automatically as well. I honestly cannot help it.

Bash aliases are a minor part of my configuration deployment and I tend to follow multiple approaches for handling my aliases in my environment.

This post is about my approach to handling the main part of my bash aliases, even though it's integrated into a bigger Red Hat Ansible configuration approach for my hosts. This obviously is not the most complex code, but it illustrates how simple code can evolve over time and be integrated into more complex automation solutions.

It's also a great example of how to spend way too much time on simple problems!


Aliases for the shell are a part of the general host configuration in my environment and are configured by Red Hat Ansible.

Some people might not use aliases at all. I use them mostly for navigating the file system and to simplify commands:

  1. Navigation

    There are directories in the file system which I visit quite often: remotely synced directories from Dropbox, Nextcloud, etc., Git repositories, dumping grounds for downloads or documents. The aliases I use are prefixed with the abbreviation gtd (go to directory - smart, I know). This makes them easy to remember and lets me use the shell's auto-complete feature to list them: typing gtd+<TAB> gives me a list of all available aliases.

  2. Simplification

    Some commands are far too complex for me to remember or tedious to type every time, and some are replacements whose names I'm not used to yet.

    My contacts and calendar entries are synced with vdirsyncer sync. The alias vds shortens that down and makes it more convenient.

    I started using bat as a replacement for cat. But I didn’t want to remember every time that I prefer bat. So cat is being aliased to bat instead.

The code providing this functionality evolved from simple one-liners into a more complex structure over time. Though pushing the limits of aliases usually is not a high priority, it bothered me how inelegant the code was whenever I touched this section of my configuration. It felt dirty; not as good as it could be. Until today nobody besides me would have seen that code.

This is an example of five typical aliases from my configuration. These are placed in the file ~/.bashrc and executed when I start a terminal. The live code is much, much longer, so this is the abbreviated version:

# file: ~/.bashrc
alias cat='bat'                                # Alias to replace cat with the cooler bat.
alias vds='vdirsyncer sync'                    # Alias to trigger a command with parameter.
alias gtdupload='cd /data/sync/upload'         # Alias for jumping to a specific path.
alias gtdgit='cd /data/repositories/git'       # Alias for jumping to a specific path.
alias gtdbat='cd /data/repositories/git/bat'   # Alias for jumping into a single(!) Git repository.

Evolution step 1: Conditional alias

This first version provided the basic alias functionality, but it had some drawbacks:

  1. Replacing cat with bat required that bat was actually installed on the host and available. Otherwise the command would fail.
  2. Adding a jump alias (gtd*) requires the destination folder to exist in order to work. On remote file systems, this depends on whether they have been mounted or not.

Bash provides easy techniques for these problems. First: a test for the existing commands to make sure we only create useful aliases and don't end up with aliases which will fail.

# file: ~/.bashrc
test "$(command -v bat)" && alias cat='bat'
test "$(command -v vdirsyncer)" && alias vds='vdirsyncer sync'

For the jump aliases, something similar can be achieved:

# file: ~/.bashrc
test -d /data/sync/upload && alias gtdupload='cd /data/sync/upload'
test -d /data/repositories/git && alias gtdgit='cd /data/repositories/git'
test -d /data/repositories/git/bat && alias gtdbat='cd /data/repositories/git/bat'

This will only create the aliases if the tests pass and the commands can be found on the host.
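
As a minimal sketch of the same idea, the check can also rely on the exit status of command -v instead of its output. The helper name have is hypothetical and only used here for illustration; it is not part of my configuration.

```bash
# Hypothetical helper: succeeds if the given command exists on this host.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Only alias cat to bat when bat is actually installed.
have bat && alias cat='bat'

# Only create the jump alias when the target directory exists.
[ -d /data/sync/upload ] && alias gtdupload='cd /data/sync/upload'
```

On a host without bat or without the directory, both lines simply do nothing.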


Evolution step 2: Reduce redundant code

The obvious problems have been fixed now, but new issues were showing, which became more pressing as the number of aliases grew over time. The live code with about a hundred aliases showed the flaws clearly:

  1. The amount of redundant code increased. Similar tests were defined and executed multiple times; long directory paths were written at least twice in each line. This increased the (very real) risk of typos.
  2. I like to have an alias for each Git repository on my host. This means that I have to add an alias each time I clone a new repository. This is easy to forget and annoying when I do.

As long as the total number of my aliases was low, these issues were not pressing. Over time the number of Git repositories increased and the number of expected aliases as well.

For the Git repositories I wrote a bash function which takes an input directory and creates an alias for each sub-directory found. As a little extra, a regex pattern can be used to limit the aliases being created to directories matching the pattern. This also makes the function usable for other directory structures besides the Git repositories, as long as their naming follows a certain pattern. This is a feature I am using in a different place on my host and is not directly related to this article. Ping me if you want to know more.

# file: .functions

# Function to create aliases
# $1 = working directory
# $2 = regex pattern
function dir_alias {
  directory="$1"
  if [ "${2}" != '' ]; then
    pattern=${2}
  else
    pattern=''
  fi
  if [ -e "${directory}" ]; then
    subdirectories=$(find "${directory}/" -maxdepth 1 -type d )
    for subdir_line in ${subdirectories} # Unquoted on purpose: split into one path per word. A piped while loop would define the aliases in a subshell.
    do
      # Get the directory name
      directory_name=$(basename "$subdir_line")
      alias_name="gtd${directory_name}"

      if [ "${pattern}" != '' ]; then # A pattern is set
        if [[ "${directory_name}" =~ ${pattern} ]]; then

          # Remove leading dates if available
          number_pattern="^([0-9]{4,8})-.*"
          if [[ "${directory_name}" =~ $number_pattern ]]; then
            arr_name=(${directory_name/-/ })
            directory_name=${arr_name[1]}
            alias_name="gtd${directory_name}"
          fi

          # Create alias
          # shellcheck disable=2139
          alias "$alias_name=cd $subdir_line"
        fi
      else
        # Create alias
        # shellcheck disable=2139
        alias "$alias_name=cd $subdir_line"
      fi

    done
  fi
}

This function replaced all existing and future aliases for jumping to Git repositories. My code became even cleaner when I moved the function into a separate file .functions and imported the file instead.

  #file: ~/.bashrc
  . .functions

- test -d /data/repositories/git/bat && alias gtdbat='cd /data/repositories/git/bat'
+ dir_alias /data/repositories/git

Any repository in /data/repositories/git now automatically gets an alias named gtd${repo_name}. No code change necessary.
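
The date-stripping step inside dir_alias can be tried in isolation. The following is a minimal sketch, not the code from my configuration: the helper name strip_date_prefix and the sample directory names are made up for illustration, and it uses BASH_REMATCH instead of the array split used in the function above.

```bash
# Hypothetical helper: print the alias name for a directory name,
# dropping a leading 4-8 digit prefix such as "20240115-" if present.
strip_date_prefix() {
  local name="$1"
  local number_pattern='^[0-9]{4,8}-(.*)$'
  if [[ "$name" =~ $number_pattern ]]; then
    # BASH_REMATCH[1] holds everything after the first dash
    name="${BASH_REMATCH[1]}"
  fi
  printf 'gtd%s\n' "$name"
}

strip_date_prefix "20240115-taxes"   # prints gtdtaxes
strip_date_prefix "bat"              # prints gtdbat
```

Names without a numeric prefix pass through unchanged, so the same helper works for plain repository names.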


Evolution step 3: Checking and validation of aliases

The aliases for commands were growing in number as well. The issue of redundant code for tests was still not resolved and I was staring at a wall of code. Though every alias is defined in the ~/.bashrc file, it got messy in terms of where in the file to add new ones and how to manage the existing ones.

I wrote a new function which initially just required a path as a parameter and created the alias for it. This would reduce the duplicated code to basically the name of the function plus the path. Checking for any other requirement was handled in the function itself and would not clutter up the .bashrc.

But not all aliases are for jumping to paths. I needed a more granular function for this.

# file: .functions
#
# Function to create a single alias
#   $1 is an existing path and $2 is empty: create alias gtd<basename of $1>
#   $1 is a name and $2 is an existing path: create alias $1 that changes to $2
#   $1 is a name and the first word of $2 is an existing command: create alias $1 for $2
function create_alias
{
  alias_name="$1"
  alias_value="$2"

  # shellcheck disable=2086,2140,2139
  if [ -d "${alias_name}" ] && [ -z "${alias_value}" ]; then
    # if $1 is an existing path and $2 is empty: create alias with gtd$1
    alias "gtd$(basename ${alias_name,,})"="cd ${alias_name}"
    return 0
  elif [ -n "${alias_name}" ] && [ -d "${alias_value}" ]; then
    # if $1 is set and $2 is an existing path, create alias
    alias "${alias_name}"="cd ${alias_value}"
    return 0
  elif [ "$(command -v ${alias_value%% *})" ]; then
    # if $1 is set and the first word of $2 is an existing command, create alias
    alias "${alias_name}"="${alias_value}"
    return 0
  fi
}

With this function, creating aliases became even simpler. The function takes care of testing for existing paths and commands:

# file: ~/.bashrc

. .functions
create_alias cat bat
create_alias vds 'vdirsyncer sync'
create_alias /data/sync/upload
create_alias /data/repositories/git
dir_alias /data/repositories/git
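
The two parameter expansions that create_alias relies on can be tried on their own. This is a small sketch with made-up variable names and sample values; ${var,,} needs bash 4 or newer.

```bash
# "%% *" removes the longest suffix starting at the first space,
# leaving only the command name to be checked with command -v.
alias_value='vdirsyncer sync'
command_name="${alias_value%% *}"
echo "${command_name}"    # prints vdirsyncer

# A value without a space is left untouched.
single_value='bat'
echo "${single_value%% *}"    # prints bat

# ",," lowercases the path before basename picks the last component,
# so the generated alias name is always lowercase.
path='/data/Sync/Upload'
alias_suffix="$(basename "${path,,}")"
echo "gtd${alias_suffix}"    # prints gtdupload
```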

At this point I could have said that I had done enough to improve the alias handling. But since I was at it, why stop now?

However, I am used to keeping configuration data and its deployment separate from each other. This is a habit I developed with Puppet and carried over to Ansible. In Ansible you have the playbooks and roles and you keep them separate from the variables holding the actual configuration. Breaking this pattern here would cause sleepless nights for me and I did not want this to happen.

But bash does not offer the encapsulation that Ansible comes with out of the box.

As a workaround I did this: the function call is now wrapped in a loop, and the actual configuration values are defined outside of the loop:

#!/usr/bin/env bash
# file: ~/.bashrc

. .functions
declare -A tools_aliases # Declare Hash

# Add the alias and their commands here
tools_aliases=(
  ['gtdupload']='cd /data/sync/upload'
  ['gtdgit']='cd /data/workspace/git'
  ['cat']='bat'
)

# Manual alias
for cmd in "${!tools_aliases[@]}"; do
  create_alias "${cmd}" "${tools_aliases[$cmd]}" # Two separate arguments: alias name and alias value
done

# Auto alias for Git repositories
dir_alias /data/workspace/git

This code still does everything the version one step prior does, but is distilled down to the key elements of what should be configured:

  1. Unique aliases are created depending on the availability of commands on individual hosts.
  2. Common aliases are created automatically when directories matching a pattern have been found.

To integrate it even further into the Ansible host configuration I ~should~ could generate another source file with the configuration values from YAML values in Ansible and use it to replace the bash configuration file. This would allow configuring all aliases simply by setting a YAML value in Ansible.

I will keep this for a future follow up if I see that this solution does not scale as well as I hope.
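
Purely as a sketch of what the bash side of such a follow-up might look like: the function below reads alias definitions from a plain text file with one name=value pair per line. The file format, the file name ~/.aliases.generated and the function name load_aliases_from_file are all assumptions for illustration; nothing like this exists in my configuration yet, and how Ansible would render the file is left open.

```bash
# Hypothetical loader: define one alias per "name=value" line in the file.
# Blank lines and lines starting with "#" are skipped.
load_aliases_from_file() {
  local file="$1" name value
  # A missing file is not an error, the host simply has no generated aliases.
  [ -r "$file" ] || return 0
  # Redirecting the file into the loop (instead of piping) keeps the loop
  # in the current shell, so the aliases survive after the loop ends.
  while IFS='=' read -r name value || [ -n "$name" ]; do
    case "$name" in
      ''|'#'*) continue ;;
    esac
    alias "$name=$value"
  done < "$file"
}

# Assumed location of the generated file.
load_aliases_from_file "${HOME}/.aliases.generated"
```

In a real setup the plain alias call would probably be replaced by create_alias, so that the checks for existing paths and commands still apply.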


Amendment

  • As mentioned, this solution is only a tiny part of a bigger Ansible configuration process on my hosts. Git repositories are being put in place and updated automatically from their remotes. There are also other roles and playbooks adding their own application-specific aliases to the environment. I kept this out of the article in order to keep it understandable. If configuring aliases were the only task, this approach would probably be a bit over the top. But when rolled out on multiple hosts and multiple environments, providing a huge number of aliases, it becomes more meaningful.

  • One lesson I learned while writing this article was how many intricacies are hidden even in a seemingly simple feature like aliases in bash. Initially I had a much more complex function to handle the automated creation of aliases. This was due to a lack of understanding of how aliases work in combination with bash functions.

    Picking the functionality apart and testing aliases and their behavior in detail forced me to develop a better understanding of aliases and how they work. For example: The following is mentioned in the man page of bash, but I was not aware of it until now:

    Aliases are expanded when a command is read, not when it is executed. Therefore, an alias definition appearing on the same line as another command does not take effect until the next line of input is read. The commands following the alias definition on that line are not affected by the new alias.

  • Even the man page for bash states in the section BUGS:

    Aliases are confusing in some uses.

  • Using aliases carries the risk of overdosing. A number of outdated aliases were deleted from the configuration while writing this article. They were either not used any more, had been completely forgotten or had been superseded by other commands. To me this shows the importance of continuous maintenance even for something as simple as aliases. There were no dire consequences other than the unnecessary clogging of the configuration.
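
The man page excerpt about alias expansion quoted above can be reproduced with a few lines. This is a minimal sketch; the alias name gtd_demo and the variable names are made up for the demonstration.

```bash
#!/usr/bin/env bash
# Scripts do not expand aliases by default, an interactive shell does.
shopt -s expand_aliases

same_line='expanded'
# The whole line is read before the alias exists, so gtd_demo is treated
# as an ordinary (and missing) command on this line.
alias gtd_demo='echo hello'; gtd_demo 2>/dev/null || same_line='command not found'

# One line later the alias is known and expands to "echo hello".
next_line=$(gtd_demo)

echo "same line: ${same_line}"   # same line: command not found
echo "next line: ${next_line}"   # next line: hello
```

The example assumes that no real command called gtd_demo exists on the host.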

Daniel Buøy-Vehn

Senior Systems Consultant at Redpill Linpro

Daniel works with automation in the realm of Ansible, AWX, Tower, Terraform and Puppet. He mainly works with our customers in Norway, assisting them with their integration and automation projects.
