CLI App Design

Table of Contents

Foundation

  • Common tasks should be uncomplicated
  • Restrict privileges
  • Secure configuration by default
  • Default behavior should be nondestructive
  • Handle files as streams when possible
  • Format output appropriately
  • End execution gracefully

Input

Security

CLIs, like any other piece of software, must validate, sanitize, and securely handle all input.

Sensitive input

To prevent echoing data to the terminal in Ruby, we can use IO#noecho:

require "io/console"
print "valued(€): (no echo)"
sensitive = $stdin.noecho(&:gets).chomp

To get a passphrase securely, we can simply call IO#getpass:

require "io/console"
phrase = $stdin.getpass "Passphrase: (no echo)"

Validation

We can use Ruby’s built-in OptionParser to coerce arguments into built-in objects.

require 'optparse'
require 'optparse/time'

OptionParser.new do |parser|
  time_desc = "Begin execution at given time"
  parser.on("-t", "--time [TIME]", Time, time_desc) do |time|
    p time
  end
end.parse!

which would fail when used like

$ ruby optparse-test.rb  -t nonsense
... invalid argument: -t nonsense (OptionParser::InvalidArgument)

We can also build our custom validations, for instance

require "optparse"

# assuming
User = Struct.new(:id, :name)
def find_user id
  not_found = ->{ raise "No User Found for id #{id}" }
  [ User.new(1, "Sam"),
    User.new(2, "Gandalf") ].find(not_found) do |u|
    u.id == id
  end
end

# we could do
op = OptionParser.new
op.accept(User) do |user_id|
  find_user user_id.to_i
end

op.on("--user ID", User) do |user|
  puts user
end

op.parse!

Which could result in something like

$ ruby optparse-test.rb --user 3
... `block in find_user': No User Found for id 3 (RuntimeError)
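Both failures above surface as backtraces, which are not friendly to end users. A minimal sketch of one way to report them instead, assuming a hypothetical parse_user helper and USERS list (neither is part of OptionParser): the custom validation raises OptionParser::InvalidArgument, and every parsing error is rescued in one place through OptionParser::ParseError.

```ruby
require "optparse"

User = Struct.new(:id, :name)
USERS = [User.new(1, "Sam"), User.new(2, "Gandalf")].freeze # sample data

# Hypothetical helper: returns [user, error_message] instead of raising.
def parse_user(argv)
  user = nil
  op = OptionParser.new
  op.accept(User) do |user_id|
    USERS.find { |u| u.id == user_id.to_i } or
      raise OptionParser::InvalidArgument, "no user with id #{user_id}"
  end
  op.on("--user ID", User) { |found| user = found }
  op.parse(argv) # non-destructive: argv is left untouched
  [user, nil]
rescue OptionParser::ParseError => e
  # InvalidArgument, MissingArgument, and InvalidOption all inherit from ParseError.
  [nil, e.message]
end
```

A real CLI would probably print the message to $stderr and exit with a non-zero status rather than return it.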

System calls

We should avoid passing user-controlled input to system calls. If we do so, though, we must be aware of the nuances in the different ways of doing so in Ruby. For instance, rather than passing a command as a single string, we pass it as a series of strings. That way Ruby runs the command directly instead of through a shell, so special characters reach the program untouched.

s = "it's special; indeed"
system "echo", s #=> it's special; indeed
system "echo #{s}" #>> unexpected EOF while looking for matching `''

Yet, it won’t escape a null byte, leaving it up to us how to handle it, e.g. fail gracefully, or sanitize it during input validation.

n = ['a', 'b'].pack 'HxH' #=> "\xA0\x00\xB0" # packing a null byte
system "echo", n #>> ArgumentError (string contains null byte)

require "shellwords"
e = Shellwords.escape n #=> "\\\xA0\\\x00\\\xB0"
system "echo", e #>> ArgumentError (string contains null byte)

If we want access to stdin, stdout, and stderr separately, we can use Open3.
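A minimal sketch of capturing each stream with Open3.capture3, assuming a hypothetical run helper. The command is passed as separate strings, so no shell is involved.

```ruby
require "open3"

# Hypothetical helper: runs a command without a shell and collects its streams.
def run(*command)
  stdout, stderr, status = Open3.capture3(*command)
  { out: stdout, err: stderr, ok: status.success? }
end

result = run("echo", "it's special; indeed")
puts result[:out] # the argument reaches echo untouched
```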

For full details on how to make system calls in Ruby, we need to check the documentation for #system, #exec, and #spawn.

Path traversal prevention

Whenever we need to rely on user input to find a file, or directory, we can anchor relative paths with Dir.chdir, or restrict file system access with Dir.chroot, which requires superuser privileges:

File.exist? "../../etc/passwd" #=> true

# As part of a privileged process
Dir.chroot Dir.pwd #=> 0
File.exist? "../../etc/passwd" #=> false
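When the process isn’t privileged, a different technique from the ones named above is to check the resolved path against a base directory before touching the file. A minimal sketch, assuming a hypothetical within_base? helper; note that File.expand_path doesn’t resolve symbolic links, so this is not a complete defense.

```ruby
# Hypothetical helper: true when user_path, resolved against base,
# stays inside base. Symbolic links are NOT resolved here.
def within_base?(base, user_path)
  base = File.expand_path(base)
  target = File.expand_path(user_path, base)
  target == base || target.start_with?(base + File::SEPARATOR)
end

within_base?("/srv/app", "data/report.txt")  #=> true
within_base?("/srv/app", "../../etc/passwd") #=> false
```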

Conventions

CLIs usually come as either commands, or command suites. Here we describe the parts of a command as:

$ command --switch --flag=argument [--brackets-means-optional]
$ command --[no-]switch --flag [argument]

As shown above, switches don’t take arguments, whilst flags do. A flag is connected to its argument through an equals sign or a single space.

For reference, the parts of a command suite can be described as:

$ executable --global-option command --command-option

Command suites won’t be considered for the rest of the note.

Options

Options refer to both switches and flags. There are short-form (-s), and long-form (--long-form) options.

Short-form options should remind us of the behavior they control. As they are scarce, we should use them for common nondestructive options.

Long-forms should be as clear as possible, without skimping on letters. They are great for self-documenting scripts.

We should only make dangerous options available as long-form options.

Common options

-h, --help. Used to display the usage reference.

--version. Displays the CLI’s current version. Since -v is not consistently used as version’s short form, it can be used for something else.

--[no-]action. A switch with an optional infix to negate an action. The negation is usually only used in scripts to show the intent of the default setting. Consider logging --action’s execution for sensitive data.

--[no-]force. Enables, or disables, destructive behavior such as deleting, or overwriting, files. Such behavior should only be available through switches.
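The conventions above can be sketched with OptionParser as follows. The parse_common helper and the option names are assumptions made for illustration; -h and --help are provided by OptionParser itself.

```ruby
require "optparse"

# Hypothetical helper: returns the parsed options and the remaining arguments.
def parse_common(argv)
  options = { force: false, quiet: false }
  parser = OptionParser.new do |op|
    op.banner = "Usage: command [options] FILE..."
    # A short form is spent on a common, nondestructive option.
    op.on("-q", "--[no-]quiet", "Print less information") { |v| options[:quiet] = v }
    # Dangerous behavior is only reachable through its long form.
    op.on("--[no-]force", "Overwrite existing files") { |v| options[:force] = v }
  end
  [options, parser.parse(argv)]
end

options, files = parse_common(%w[-q --force a.txt])
p options #=> {:force=>true, :quiet=>true} (inspect format varies by Ruby version)
p files   #=> ["a.txt"]
```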

File streaming

Whenever a CLI handles a file, we should consider support for handling more than one file at a time. Ruby includes ARGF for file streaming. Beware, we need to get rid of any other options in ARGV before streaming the files.

To avoid loading whole files at once, we should handle them as a lazy enumerable, e.g.

ARGF.each_line.lazy.each { |line| do_something line }

Don’t forget to validate, and sanitize their contents.
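A minimal sketch of stripping options before streaming, assuming a hypothetical count_matching_lines helper and --match flag. OptionParser#parse! removes the options it recognizes from ARGV, leaving only the file names for ARGF.

```ruby
require "optparse"

# Hypothetical helper: counts the lines containing the --match argument
# across all files named on the command line.
def count_matching_lines(argv = ARGV)
  needle = ""
  OptionParser.new do |op|
    op.on("--match TEXT", String, "Only count lines containing TEXT") { |text| needle = text }
  end.parse!(argv) # destructive: only file names remain in argv

  # Lines are read one at a time; nothing is loaded until count consumes them.
  ARGF.each_line.lazy.select { |line| line.include?(needle) }.count
end
```

It would be invoked as count_matching_lines(ARGV) from a script run like ruby script.rb --match foo a.txt b.txt.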

Output

Formatting

Format the output depending on whether it is meant to be displayed or used as input. For instance,

options[:format] = $stdout.tty? ? :tty : :yml

Some machine-friendly formats commonly shared are YAML, JSON, CSV. As much as possible, stick to the safer versions of such formats. For instance, when opting for YAML, make sure it can be loaded through YAML.safe_load.
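A minimal sketch of rendering the same records per format, assuming a hypothetical render helper and record layout. Sticking to strings, numbers, arrays, and hashes keeps the YAML loadable through YAML.safe_load.

```ruby
require "json"
require "yaml"

# Hypothetical helper: records is an array of hashes with string keys.
def render(records, format)
  case format
  when :tty  then records.map { |r| "#{r["name"]}: #{r["size"]} bytes" }.join("\n")
  when :json then JSON.generate(records)
  else YAML.dump(records)
  end
end

records = [{ "name" => "a.txt", "size" => 10 }]
format = $stdout.tty? ? :tty : :yml
puts render(records, format)
```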

Streams

Transmit info through both $stdout and $stderr. The former is meant for streaming results, and relevant info. The latter for errors, and warnings.

Whenever we need to stream info to $stdout, or $stderr, we should consider using IO#flush to make sure it isn’t being displayed with a delay due to being buffered. For instance, for progress bars.

For continuous streaming, such as logs, we might want to change the stream’s sync mode.

$stdout.sync = true
$stderr.sync = true

We should keep sync set to false whenever we connect to a remote service, though, to reduce the likelihood of errors.
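A minimal sketch of keeping results and progress on separate streams, assuming a hypothetical report helper. Sending progress to $stderr is a design choice of this sketch, so piped results stay clean.

```ruby
require "stringio"

# Hypothetical helper: results go to out, progress goes to err.
def report(results, out: $stdout, err: $stderr)
  results.each_with_index do |result, index|
    err.print "\rprocessed #{index + 1}/#{results.size}"
    err.flush # show progress right away instead of waiting for the buffer
    out.puts result
  end
  err.puts
end

# StringIO stands in for the real streams to show what each one receives.
out = StringIO.new
err = StringIO.new
report(%w[a b], out: out, err: err)
out.string #=> "a\nb\n"
```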

Security

Sanitize

As with any delivery mechanism, we may need to sanitize the info streamed through $stdout, and $stderr, so no sensitive data is leaked. This is especially the case when formatting the data for sharing, e.g. as YAML.

Consider removing, or replacing, information such as usernames, names, addresses, financial info, and so on.

Escaping characters

Most, if not all, forms of output grant special abilities to some characters. Unfortunately, these are frequently abused. Below we focus on how to stream info safely to a terminal (TTY).

ANSI escape sequences allow us to mark certain characters as commands, or metadata, for things such as formatting, changing colors, positioning the cursor, reconfiguring the keyboard, updating the window title, and so on.

Nowadays, most terminal emulators interpret, at least, some of those sequences. Escape all command line special characters, and ANSI escape sequences, so they may be displayed safely. Alternatively, remove them.

require "shellwords"

g = "\e[38;5;33mHi\e[0m" #=> "\e[38;5;33mHi\e[0m"
e = Shellwords.escape g #=> "\\\e\\[38\\;5\\;33mHi\\\e\\[0m"

puts g # (imagine blue text) >> Hi
puts e #>> \[38\;5\;33mHi\[0m

Beware, escaped strings aren’t intended for use in double, nor single, quotes.

a = "a b, c; d *"
"this is what happens #{a.shellescape}"
#=> "this is what happens a\\ b,\\ c\\;\\ d\\ \\*"

Depending on the app, and the amount of output we need to deal with, it may be better to depend on a pager such as less(1). Make sure to read its manual to see what it is actually capable of escaping.
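When removing the sequences is preferable to escaping them, a regular expression can do it. A minimal sketch, assuming a hypothetical strip_ansi helper; the pattern only covers CSI sequences (such as colors and cursor movement) and OSC sequences (such as window titles), so it is not exhaustive.

```ruby
# CSI: ESC [ parameters, intermediates, final byte.
# OSC: ESC ] text, terminated by BEL or ESC \.
ANSI_PATTERN = /\e\[[0-?]*[ -\/]*[@-~]|\e\][^\a\e]*(?:\a|\e\\)/

# Hypothetical helper: removes the sequences matched by ANSI_PATTERN.
def strip_ansi(text)
  text.gsub(ANSI_PATTERN, "")
end

strip_ansi("\e[38;5;33mHi\e[0m") #=> "Hi"
```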

Color

Considering the escaping characters section above, if we support coloring we should never rely exclusively on it to convey a message.

Furthermore, consider adding a --[no-]color switch, or checking the NO_COLOR env var for users that prefer no color.
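A minimal sketch of combining both checks, assuming a hypothetical use_color? helper where flag holds the value of --[no-]color, or nil when the switch wasn’t given. Following the NO_COLOR convention, the variable only counts when it is set to a non-empty value.

```ruby
# Hypothetical helper: decides whether output should be colored.
def use_color?(flag: nil, env: ENV, out: $stdout)
  return flag unless flag.nil?                         # an explicit switch wins
  return false unless env.fetch("NO_COLOR", "").empty? # user prefers no color
  out.tty?                                             # never color piped output
end
```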

End execution

CLI apps stop processing either because they fail, succeed, or are interrupted.

Exit status

We use exit codes to report success (0; zero) or failure (non-zero). To set the exit code we call exit with an integer, e.g. exit 1; only values from 0 to 255 are portable. Calling exit without arguments exits with zero.

In Ruby, we can check the exit status code of the last child process with

$?.exitstatus #=> 0

# or

require "English"
$CHILD_STATUS.exitstatus #=> 0

Document all exit status codes to help figure out what went wrong. A great starting point is OpenBSD’s sysexits(3).
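A minimal sketch of naming exit codes after sysexits(3) and reading one back from a child process. The ExitCode module and the child_exit_status helper are assumptions; the numeric values are the ones sysexits(3) defines.

```ruby
require "rbconfig"

# Hypothetical module naming a few of the codes found in sysexits(3).
module ExitCode
  OK       = 0
  USAGE    = 64 # EX_USAGE: the command was used incorrectly
  NO_INPUT = 66 # EX_NOINPUT: an input file did not exist or was unreadable
end

# Hypothetical helper: runs a child Ruby that exits with code, then reports it.
def child_exit_status(code)
  system(RbConfig.ruby, "-e", "exit #{Integer(code)}")
  $?.exitstatus
end

child_exit_status(ExitCode::USAGE) #=> 64
```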

Signal traps

A signal allows the kernel to communicate asynchronously with a process. Users can send signals to processes they own, for instance

description        key binding  signal
terminate process  C-c          SIGINT
suspend execution  C-z          SIGTSTP
quit & dump core   C-\          SIGQUIT
display info       C-t          SIGINFO

Usually, kill -l will list all available signals locally.

On occasion we may want to trap signals to clean up, reload config, etc.

require "fileutils"

# output_file is assumed to hold the path of the partially written file
Signal.trap("SIGINT") do
  FileUtils.rm_rf output_file
  exit 1
end

Beware of Signal.trap caveats, though.
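One such caveat is that handlers run asynchronously, and some operations aren’t allowed inside them; locking a Mutex, for instance, raises ThreadError. A common way around it is to only set a flag in the handler, and act on it in the main flow. A minimal sketch, assuming a hypothetical process_until_interrupted helper:

```ruby
# Hypothetical helper: yields each item until SIGINT arrives, then stops
# cleanly and returns what was completed.
def process_until_interrupted(items)
  stop = false
  previous = Signal.trap("INT") { stop = true } # do as little as possible here
  done = []
  items.each do |item|
    break if stop # act on the signal at a safe point
    done << yield(item)
  end
  done
ensure
  Signal.trap("INT", previous) if previous # restore the earlier handler
end
```

Cleaning up, such as removing a partial output file, can then happen after the loop, outside of the trap context.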

Fail gracefully

At the time of design, we should also consider how our app is meant to fail.

Safe file operation

When a process operating on files runs into trouble such as bugs, or running out of disk space, it should ensure no file gets corrupted.

One way of doing a safe write is to put all data in a temporary file before doing any modifications. We replace the original file with the temporary one once we are done changing the latter. For instance,

require "fileutils"
require "tempfile"

def open_safely file
  result = temp_file = nil

  Tempfile.open do |f|
    temp_file = f.path
    result = yield f
  end

  FileUtils.move temp_file, file
  result
end

This prevents partial writes, as nothing gets overwritten unless there’s a complete replacement ready, provided the temporary file lives on the same file system as the original so the move is a plain rename. Also, there is no need for locks since no other process should know of temp_file’s existence. Since moving a file doesn’t interrupt any currently occurring access to the old file, it also implies read safety for those still reading the out-of-date copy.
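The system’s temporary directory may sit on a different file system than the target, in which case the move becomes a copy. A minimal variant, assuming a hypothetical write_atomically helper, creates the temporary file next to the target so the final step is a rename:

```ruby
require "tempfile"

# Hypothetical helper: yields a temporary file created in the target's
# directory, then renames it over the target once the block succeeds.
def write_atomically(path)
  temp = Tempfile.create([".#{File.basename(path)}", ".tmp"], File.dirname(path))
  begin
    result = yield temp
    temp.flush
    temp.fsync # ask the OS to put the data on disk before the swap
    temp.close
    File.rename(temp.path, path)
    result
  ensure
    temp.close unless temp.closed?
    # Only still present when the block, or the rename, failed.
    File.unlink(temp.path) if File.exist?(temp.path)
  end
end
```

Note the replacement gets the temporary file’s permissions rather than the original’s, which may need adjusting.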

Process locking

A way to prevent running more than one instance of an app, or service, is by locking it:

  • Pick a location on disk, such as
    • /var/run/rc.d/<app>.pid for daemons
    • /var/log/<app>-<service>.log for logs
    • /tmp/<app>-<service>-<timestamp>.tmp for temporary services
  • Lock the chosen file as soon as possible.

Whenever we can’t lock the chosen file we know the app is already running, and can quit the new instance.
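The steps above can be sketched as follows, assuming a hypothetical acquire_process_lock helper and pid file location. File::LOCK_NB makes flock return false right away, instead of waiting, when another instance holds the lock.

```ruby
require "tmpdir"

# Hypothetical helper: returns the locked File, or nil when another
# instance already holds the lock. Keep the File open while running.
def acquire_process_lock(pid_path)
  file = File.open(pid_path, File::RDWR | File::CREAT, 0o644)
  unless file.flock(File::LOCK_EX | File::LOCK_NB)
    file.close
    return nil
  end
  file.truncate(0)
  file.write("#{Process.pid}\n")
  file.flush
  file
end

# Hypothetical location; a daemon would use its own pid file instead.
pid_path = File.join(Dir.tmpdir, "myapp-example-#{Process.pid}.pid")
lock = acquire_process_lock(pid_path)
warn "already running" unless lock
```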

File locking

Whenever code is meant to handle files on disk we need to consider concurrency issues. We must take explicit steps to enforce mutually exclusive behavior. There are two ways of locking:

  • Shared (non-blocking) locks
    • can be held by many processes at the same time.
    • ensure read integrity only.
  • Exclusive (blocking) locks
    • can only be held by one process at a time.
    • ensure read, and write integrity.

Which can be roughly coded in Ruby as:

def access_file path, mode: "r"
  File.open(path, mode, 0644) do |file|
    file.flock(mode == "r" ? File::LOCK_SH : File::LOCK_EX)
    yield file
  ensure
    file.flock(File::LOCK_UN)
  end
end

Beware, the snippet above only opens, and locks, the file; reading it all at once inside the block, e.g. with File#read, loads the entire file into memory. Also, depending on the threat model, validate files through size, MIME, extension, and other relevant metadata, before opening them to help protect against malicious files.

An advantage of locking files as shown is that, if we forget to release the lock before quitting the app, or if it crashes, the OS releases the lock.

For more details on Ruby’s take on file permissions, and other details, check Open modes and File constants in the documentation.

Locking Considerations

UNIX-like systems don’t enforce file locking by default. That is, no other application is actually blocked from writing on the locked file unless it explicitly flocks the file of interest.

Config files

Although it is fairly common to expose a configuration object in Ruby, we may also load preferences defined in formats such as YAML. Remember to use YAML.safe_load to load configuration files, though.

Single configuration files are usually found as:

  • /etc/<app>/config.yml
  • $HOME/.config/<app>.yml
  • $HOME/.<app>rc
  • ./.<app>

When there is more than one configuration file, they are usually split as:

  • /etc/<app>/<module>.yml
  • $HOME/.config/<app>/<module>.yml
  • $HOME/.<app>.d/<module>.yml
  • ./.config/<module>.yml
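A minimal sketch of loading single configuration files from the locations listed above with YAML.safe_load. The config_paths and load_config helpers are assumptions, and so is the precedence: files later in the list override earlier ones.

```ruby
require "yaml"

# Hypothetical helper: candidate locations, from system-wide to local.
def config_paths(app, home: Dir.home)
  ["/etc/#{app}/config.yml",
   File.join(home, ".config", "#{app}.yml"),
   File.join(home, ".#{app}rc"),
   File.join(".", ".#{app}")]
end

# Hypothetical helper: merges every existing file; later files win.
def load_config(paths)
  paths.select { |path| File.file?(path) }.reduce({}) do |config, path|
    loaded = YAML.safe_load(File.read(path)) || {} # an empty file loads as nil
    raise ArgumentError, "#{path} must contain a mapping" unless loaded.is_a?(Hash)
    config.merge(loaded)
  end
end
```

It could be called as load_config(config_paths("myapp")), where myapp is a placeholder name.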

Documentation

OptionParser allows us to document command line options for the help reference. To document the API, though, we may use RDoc. Users can generate it, and use it on the terminal:

# Generate ri documentation
$ gem rdoc <gem_name> --ri --no-rdoc --overwrite

# List all of GemName's public methods
$ ri <GemName> -l

or as a local web page:

# Generate HTML documentation
$ gem rdoc <gem_name> --rdoc --no-ri --overwrite

# Start local server to access documentation
$ gem server

Check out the RDoc markup and ri articles for more information.