Oh, Molly!

This post originally appeared in our sysadvent series and was moved here after the sysadvent microsite was discontinued.

I’m sure we have all had “that feeling” at least once. You patch your desktop or laptop, then type reboot in a shell to restart your computer. And that crucial server you were working on starts shutting down.

But fear not - a solution exists for this and similar problems.

History

Molly-guard was (according to the Internet) originally an improvised plexiglass cover shielding the kill switch on an IBM 4341. It was named after a programmer’s daughter - Molly - who tripped this kill switch repeatedly. The name has obviously stuck around.

Your own virtual Molly guard

molly-guard is a small program that tries to prevent you from accidentally shutting down or rebooting servers. On Debian and derivatives, it can usually be installed by running

apt install molly-guard

while repackaged RPM packages can be found for RedHat and derivatives.

How it works

In a nutshell, this program works by forcing one or more checks before the commands halt, shutdown, poweroff or reboot are run. To achieve this, these commands are replaced with scripts that invoke the molly-guard functionality (i.e. the check scripts).

The checks reside in the /etc/molly-guard/run.d directory - all scripts in this directory are run, and all of them need to exit successfully, that is, return an exit value of 0. After all scripts have exited successfully, the original command is executed.
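The "run everything, and require every script to succeed" behaviour can be sketched in a few lines of shell. This is only an illustration of the idea, not molly-guard's actual code; the function name run_checks and its directory argument are made up for this example.

```shell
#!/bin/sh
# Illustrative sketch only, not molly-guard's own implementation.
# run_checks DIR: run every executable file in DIR in glob (sorted) order;
# stop and return failure as soon as one of them exits non-zero.
run_checks() {
    dir="$1"
    for check in "$dir"/*; do
        # Skip anything that is not an executable regular file
        # (this also covers the unexpanded glob of an empty directory).
        [ -f "$check" ] && [ -x "$check" ] || continue
        if ! "$check"; then
            echo "Check $check failed, aborting." >&2
            return 1
        fi
    done
    return 0
}
```

With this in place, something like `run_checks /etc/molly-guard/run.d && echo "all checks passed"` would only print the message if every script returned 0.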

Typically, these checks include having to type in the name of the host you want to halt or reboot when you are logged in via SSH. Entering the wrong name will abort your command.
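To make the hostname check concrete, here is a minimal sketch of what such a check could look like. molly-guard ships its own version of this; the function name confirm_hostname and the prompt text are assumptions made for this example.

```shell
#!/bin/sh
# Illustrative sketch only; molly-guard ships its own hostname check.
# confirm_hostname EXPECTED: prompt on stderr, read one line from stdin,
# and succeed only if it matches EXPECTED exactly.
confirm_hostname() {
    expected="$1"
    printf 'Please type in the hostname of this machine to confirm: ' >&2
    read -r answer || return 1
    if [ "$answer" = "$expected" ]; then
        return 0
    fi
    echo "Wrong hostname, aborting." >&2
    return 1
}
```

A check script could then end with `confirm_hostname "$(uname -n)" || exit 1`, so that a mistyped name makes the script fail and the shutdown is never run.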

If you are in a re-attached screen session, molly-guard will not find your SSH process and will assume you are on a local console, where the chance of a screw-up is smaller, so it won’t ask.

If you set the ALWAYS_QUERY_HOSTNAME variable in the /etc/molly-guard/rc configuration file, the hostname check is also forced when you are in screen/tmux, logged in via the console, etc.
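The setting itself is a one-liner. The value true used here is an assumption based on the commented-out example in the packaged file, so check the comments in the rc file on your own system before relying on it.

```shell
# /etc/molly-guard/rc
# Ask for the hostname even when no SSH session is detected
# (screen/tmux, local console and so on).
# The value "true" is an assumption; see the comments in your installed file.
ALWAYS_QUERY_HOSTNAME=true
```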

Other hypothetical checks can force the user to give a reason or work-order for rebooting a server in order to comply with more strict operations regimes.
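As a sketch of such a hypothetical check, the function below refuses to continue without a reason and appends the reason to a log file. Nothing here is part of molly-guard: the function name require_reason, the prompt and the log format are all invented for this example.

```shell
#!/bin/sh
# Hypothetical check, not part of molly-guard.
# require_reason LOGFILE: read a one-line reason from stdin, refuse an
# empty one, and append it with a timestamp and user name to LOGFILE.
require_reason() {
    logfile="$1"
    printf 'Reason or work-order number for this action: ' >&2
    read -r reason || return 1
    if [ -z "$reason" ]; then
        echo "No reason given, aborting." >&2
        return 1
    fi
    printf '%s %s: %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" \
        "${USER:-unknown}" "$reason" >> "$logfile"
}
```

Dropped into a check script, `require_reason /var/log/reboot-reasons.log || exit 1` (the log path is again just an example) would block the reboot until something has been typed.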

On RedHat

Since molly-guard isn’t packaged for EL systems, we use a simpler approach, aliasing the commands reboot, shutdown etc.

It is as simple as this:

clumsy_protect() {
    local cmd="$1"
    shift
    echo "Running $cmd on $HOSTNAME in 2 seconds!"
    sleep 2 || return
    command "$cmd" "$@"
}
alias reboot="clumsy_protect reboot"
alias shutdown="clumsy_protect shutdown"
alias poweroff="clumsy_protect poweroff"

Put the code in /etc/profile.d/clumsy-protect.sh. This gives you 2 seconds to realise that you typed reboot on the wrong machine, and if you hit ^C in time, you can heave a great sigh of relief.

(Aside: the fact that an interrupted sleep returns failure can be used in idioms like while sleep 1; do something; done instead of while true; do something; done where you may have to mash ^C like a madman to make it stop.)

The downside to the alias approach is that sudo does not look for aliases, so only sysadmins who like to do everything in a root shell get this protection.

More on sudo

In bash, you can work around that alias problem. If the alias value ends with a space, bash will also check the next word of the command for alias expansion!

alias sudo="sudo "

Now, the problem becomes that the function clumsy_protect is not available in root’s environment. The easy fix is to put the function in a script file in $PATH instead.
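A minimal sketch of that script version follows. It assumes you save it as an executable file named clumsy_protect in a directory that sudo searches, for example /usr/local/bin; that path, the function name clumsy_protect_main and the CLUMSY_DELAY variable are assumptions added for this example and are not part of the original function.

```shell
#!/bin/bash
# Sketch of clumsy_protect as a stand-alone script, e.g. saved as
# /usr/local/bin/clumsy_protect (the path is an assumption).
# CLUMSY_DELAY is a hypothetical knob added for this example.
clumsy_protect_main() {
    local cmd="$1"
    shift
    local delay="${CLUMSY_DELAY:-2}"
    echo "Running $cmd on ${HOSTNAME:-$(uname -n)} in $delay seconds!"
    # An interrupted sleep returns failure, so ^C stops us here.
    sleep "$delay" || return
    command "$cmd" "$@"
}

# Only act when called with a command to protect.
if [ "$#" -gt 0 ]; then
    clumsy_protect_main "$@"
fi
```

The aliases stay exactly as before. With the sudo alias in place, typing sudo reboot expands to sudo clumsy_protect reboot, and sudo can now find clumsy_protect as an ordinary program.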

Going further

So you rebooted the correct server - but that quick reboot didn’t turn out to be so quick! That terabyte file-system needs fsck, or worse, your initrd was corrupt!

Introducing the all-singing, all-dancing clumsy_protect. It includes a utility which checks your /etc/fstab for typos like:

  • does that LABEL exist?
  • did you forget to remove mounting of that logical volume you deleted?
  • did you update file-system type when you upgraded from ext3 to ext4?

clumsy_protect also checks that your initrd has the correct format. On RedHat, it even reruns prelink (if needed) before the reboot so that you don’t get that annoying alert that your server is running outdated libraries.
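To give a feel for the first fstab check in the list above ("does that LABEL exist?"), here is a minimal sketch. It is not the code of the tool described here; the function name check_fstab_labels is invented, and it assumes that udev populates /dev/disk/by-label with one entry per file-system label.

```shell
#!/bin/sh
# Illustrative sketch only, not the code of the tool described here.
# check_fstab_labels FSTAB [LABELDIR]: for every LABEL=... entry in FSTAB,
# verify that a matching entry exists in LABELDIR (udev's by-label
# directory by default). Returns non-zero if any label is missing.
check_fstab_labels() {
    fstab="$1"
    labeldir="${2:-/dev/disk/by-label}"
    status=0
    while read -r device mountpoint rest; do
        case "$device" in
            LABEL=*)
                label="${device#LABEL=}"
                if [ ! -e "$labeldir/$label" ]; then
                    echo "$mountpoint: no device with label '$label'" >&2
                    status=1
                fi
                ;;
        esac
    done < "$fstab"
    return "$status"
}
```

Running `check_fstab_labels /etc/fstab` before a reboot would then flag stale labels. Labels containing spaces or other special characters are escaped in the by-label directory, so a real check needs more care than this.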

Pulling it all together

So where can you get this wonder? Look no further - it’s on GitHub!

Kjetil Homme

Senior Systems Consultant at Redpill Linpro

Kjetil works with infrastructure at Redpill Linpro. He's been with Redpill Linpro for 8 years, and is currently working on automating our IT operations through Puppet, our storage solutions and backup rig. Kjetil has been working with Linux since the early 90s and has made several contributions to the kernel and other associated projects.
