More Shell Programming Secrets Nobody Talks About


This is the second part of the article on shell programming secrets. (The first part was carried in the October 2022 issue of Open Source For You.) It covers some other important behaviour of bash, particularly that of text expansions and substitutions.

The famed text-processing abilities of the Bourne shell (sh) have been vastly expanded in the Bourne-Again shell (bash). These abilities are efficient to a fault and the code can be even more cryptic. This makes the bash command interpreter and programming language very powerful but also a minefield of errors.

In the last article published in the October 2022 issue of Open Source For You titled ‘Shell Programming Secrets Nobody Talks About’, you learnt that:

  • sh and bash are not the same
  • if statements check for exit codes, not Boolean values
  • [, true and false are programs, not keywords
  • Presence/absence of space in variable comparisons and assignments makes quite a difference
  • [[ is not the fail-safe version of [
  • Arithmetic operations are not as straightforward as in other languages
  • Array operations use cryptic operators
  • By default, bash will not stop for errors

I forgot to mention that if you use the -e option of the set command, as in set -e, then you will not be able to check the exit codes of previous statements with the if [ $? -eq 0 ]; then construct. This option makes bash stop running a shell script as soon as any of its statements encounters an error, so the check is never reached. If you are doing error handling using if statements, then begin your scripts like this:

#!/bin/bash
set -u
# The rest of your shell script

The -u option of the set command, if you remember from the last article, ensures that undefined variables are not replaced with an empty string and instead cause an error.
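
As a minimal sketch of this style of error handling, the script below uses set -u without set -e, so a failing command does not end the script and its exit status can still be tested. The directory name and the variable result are made up for illustration.

```shell
#!/bin/bash
set -u

# Hypothetical path that is assumed not to exist on this system.
ls /no/such/directory 2>/dev/null

# Without set -e, the script continues and $? still holds the exit status of ls.
if [ $? -eq 0 ]; then
  result="found"
else
  result="missing"
fi

echo "Directory is $result"
```

If set -e were added at the top, the script would stop at the ls statement and never reach the if statement.
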
In this article, we will focus on how bash performs text expansions and substitutions. I will only cover what I think are the most important text-processing features. For comprehensive information, you will have to study the Bash Reference Manual. Bash and several commands such as sed and grep also use regular expressions to perform text processing. ‘Regular expressions’ is a separate topic on its own and I will not cover it either.

History expansion character (!)

This feature is available when typing commands at the shell prompt. It is used to access commands stored in the bash history file.

!n Execute nth command in bash history
!! Execute last command (Equivalent to !-1)
!leword Execute last command beginning with ‘leword’
!?leword? Execute last command containing ‘leword’
^search^replace Execute last command after replacing first occurrence of ‘search’ with ‘replace’

You can modify the history search using certain word designators, preceded by a colon (:).

!?leword?:0 Execute with 0th word (usually the command) of last command containing ‘leword’
!?leword?:2 Execute with second word of last command containing ‘leword’
!?leword?:$ Execute with last word in last command containing ‘leword’
!?leword?:2-6 Execute with second word to sixth word in last command containing ‘leword’
!?leword?:-6 Execute with all words up to 6th word in last command containing ‘leword’ (Equivalent to !?leword?:0-6)
!?leword?:* Execute with all words of last command (but not the 0th word) containing ‘leword’ (Equivalent to !?leword?:1-$)
!?leword?:2* Execute with the second word to the last word of last command containing ‘leword’ (Equivalent to !?leword?:2-$)
!?leword?:2- Execute with the second word to the last-but-one word of last command containing ‘leword’

Remember that bash will execute whatever you have retrieved from the history with whatever you have already typed at the prompt.

You can also use any number of modifiers, each preceded by a colon (:).

!?leword?:p Display (but not execute) last command containing ‘leword’
!?leword?:t Execute with only the tail of last command containing ‘leword’, i.e., the text after the last slash (usually the file name of the last argument)
!?leword?:r Execute with last command containing ‘leword’ after removing the trailing file extension (usually that of the last argument)
!?leword?:e Execute with only the trailing file extension of last command containing ‘leword’ (usually that of the last argument)
!?leword?:s/search/replace Execute last command containing ‘leword’ after replacing the first instance of ‘search’ with ‘replace’
!?leword?:as/search/replace Execute last command containing ‘leword’ after replacing all instances of ‘search’ with ‘replace’

If you omit the search text (‘leword’) and use the history expansion character with the word designators and the modifiers, bash will search the last command. Until you become proficient in using the history expansion character, use the modifier :p to display the command before you actually execute it.

Text expansions and substitutions

These features are available at the shell prompt and in shell scripts.

  • Tilde (~): In your commands, bash will replace an unquoted ~ at the beginning of a word with the value of the environment variable $HOME, that is, your home directory.
  • ? and *: These are metacharacters. In file name patterns, ? matches any one character while * matches any number of any characters. If they do not match any file names, bash will use their literal values.
  • Brace expansion: You can use comma-separated text strings within curly brackets to generate combinations of strings with their suffixes and/or prefixes. When I start a new book, I create its folders like this.
mkdir -p NewBook/{ebook/images,html/images,image-sources,isbn,pub,ref}

This command creates folders like this.

NewBook
NewBook/ref
NewBook/pub
NewBook/isbn
NewBook/image-sources
NewBook/html
NewBook/html/images
NewBook/ebook
NewBook/ebook/images
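
The ? and * metacharacters and brace expansion can be tried out safely in a throwaway directory. This is only a sketch: the file names and variable names are made up, and it assumes the default bash options (nullglob and failglob not set).

```shell
#!/bin/bash
# Work in a temporary directory so that no real files are touched.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Brace expansion generates three names from one prefix and one suffix.
touch chapter{1,2,10}.txt

# ? matches exactly one character, so chapter10.txt is left out.
one_char=$(echo chapter?.txt)

# * matches any number of characters, so all three files are matched.
any_chars=$(echo chapter*.txt)

# Nothing matches here, so bash uses the pattern literally.
no_match=$(echo *.pdf)

echo "$one_char"    # chapter1.txt chapter2.txt
echo "$any_chars"
echo "$no_match"    # *.pdf

# Go back to the previous directory and clean up.
cd - >/dev/null && rm -r "$demo_dir"
```
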
  • Parameter expansion: When bash executes a script, it creates these special variables for the script.
Shell variable Use
$0 Name of the shell script
$1, $2,… Positional parameters or arguments passed to the script
$# Total count of arguments passed to the script
$? Exit status of last command
$* All arguments; when double-quoted ("$*"), they are joined into a single word
$@ All arguments; when double-quoted ("$@"), each argument remains a separate word
$$ Process ID of current shell/script

At the terminal, $0 will usually expand to the name of the shell program (for example, bash or /bin/bash).

On a terminal, you can use the set command to specify positional parameters for the current shell.

# Displays 0 (no parameters have been set yet)
echo $#
# Displays an empty string and causes a new line
echo $*
# Sets hello and world as parameters to current shell
set -- hello world
# Displays 2 (the number of parameters)
echo $#
# Displays world
echo $2
# Remove parameters to current shell
set --
# Displays 0 (as earlier)
echo $#

The option -- represents the end of options and implies that whatever follows it must be command parameters.
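
The difference between $* and $@ only shows up when they are double-quoted and an argument contains a space. Here is a small sketch that uses set -- to supply the parameters. The helper function count_args is hypothetical and simply reports how many arguments it received.

```shell
#!/bin/bash
# Two parameters, the first of which contains a space.
set -- "hello world" again

# Hypothetical helper: prints the number of arguments passed to it.
count_args() { echo $#; }

star_count=$(count_args "$*")    # 1: all parameters joined into one word
at_count=$(count_args "$@")      # 2: each parameter stays a separate word
bare_count=$(count_args $*)      # 3: unquoted, so the space splits a word

echo "$star_count $at_count $bare_count"

# Remove the parameters again.
set --
```

In scripts that pass their arguments on to another command, "$@" is usually the form that preserves them unchanged.
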

  • Command substitution: Instead of backquotes, you can use the form $(commands) to capture the output of those commands for use in some other commands or variables. It makes quoting and escaping much easier.
  • Variable substitution: You can use these substitutions with command parameters (created by bash for a shell script) or with variables that you have created.
Substitution Effect
${var:-word} If var is null or does not exist, word is used
${var:=word} If var is null or does not exist, word is used and also assigned to var
${var:?msg} If var is null or does not exist, msg is displayed as an error
${var:+word} If var exists and is not null, word is used (var is left unchanged); otherwise, nothing is substituted
${var:offset} Everything in var after the first offset characters
${var:offset:length} The next length characters of var after the first offset characters
${!prefix*} ${!prefix@} Names of all variables beginning with prefix
${!var[@]} ${!var[*]} All indexes of array variable var
${#var} Length of value of var
${var#drop} Value of var without the shortest prefix matching the shell (glob) pattern drop
${var##drop} Value of var without the longest prefix matching the shell (glob) pattern drop
${var%drop} Value of var without the shortest suffix matching the shell (glob) pattern drop
${var%%drop} Value of var without the longest suffix matching the shell (glob) pattern drop
${var^letter} Changes the first character of var to uppercase if it matches letter (a pattern for a single character, such as a letter or ?). If letter is not specified, the first character is changed to uppercase
${var^^letter} Changes every character of var that matches letter to uppercase. If letter is not specified, all characters are changed to uppercase
${var,letter} Changes the first character of var to lowercase if it matches letter (a pattern for a single character, such as a letter or ?). If letter is not specified, the first character is changed to lowercase
${var,,letter} Changes every character of var that matches letter to lowercase. If letter is not specified, all characters are changed to lowercase
${var/find/replace} Value of var with the first match of find replaced with replace. If find begins with ‘/’ (as in ${var//find/replace}), all matches are replaced. If find begins with ‘#’, then a match is made at the beginning. A ‘%’ makes it match at the end.
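
The following sketch exercises several of these substitutions, along with the $(commands) form of command substitution, on one made-up path. All the variable names and the path are hypothetical, and the case-changing forms assume bash 4 or later.

```shell
#!/bin/bash
path="/home/user/NewBook/ebook/cover.final.png"   # hypothetical path

# Command substitution: $(...) nests without any extra escaping.
parent=$(basename "$(dirname "$path")")   # ebook

# Default values
unset editor
chosen=${editor:-nano}          # nano (editor itself stays unset)
: "${editor:=vi}"               # editor is now set to vi
status=${editor:+configured}    # configured (editor is set and not null)

# Substrings
first_five=${path:0:5}          # /home

# Prefix and suffix removal with glob patterns
file=${path##*/}                # cover.final.png (longest prefix removed)
no_root=${path#*/}              # home/user/NewBook/ebook/cover.final.png
stem=${file%.*}                 # cover.final (shortest suffix removed)
base=${file%%.*}                # cover (longest suffix removed)
name_length=${#base}            # 5

# Case modification and replacement
upper=${base^^}                 # COVER
capital=${base^}                # Cover
first_dot=${file/./_}           # cover_final.png
all_dots=${file//./_}           # cover_final_png

echo "$parent $file $stem $base $upper $capital $all_dots"
```
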

Escaping

You can escape:

  • Special characters using the backslash (\). To escape the backslash character, use double backslashes (\\).
  • Literal text strings by wrapping them in single quotation marks (' '). Bash will not perform any expansions or substitutions. The single-quoted string should not have any more single-quotation marks. Bash will not perform any backslash-escaping either.
  • Literal text strings by wrapping them in double-quotation marks (" ") but allowing for
    • $-prefixed variables, expansions and substitutions
    • backslash-escaped characters
    • backquoted (` `) command strings
    • history-expansion characters
# Displays Hello World
a=World; echo "Hello $a"
# Displays Hello $a
a=World; echo 'Hello $a'
# Displays Hello 'World'
a=World; echo "Hello '$a'"

Printer’s error

The Bash Reference Manual or even this article may use the wrong characters for the quotation marks. The apostrophe (U+0027) used in single-quoted strings may be replaced with the right single quotation mark (U+2019). The grave accent (U+0060) used in backquoted strings may be replaced with the left single quotation mark (U+2018). The quotation mark (U+0022) used in double-quoted strings may also be replaced with left and right double quotation marks (U+201C and U+201D). They look similar but will cause errors or unexpected results if used in a shell script or on the command line.
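
To see what such a substitution does in practice, here is a small sketch that compares a correctly quoted string with one whose quotation marks have been turned into curly ones. It assumes the script is saved as UTF-8. The variable names are made up.

```shell
#!/bin/bash
a=World

# Straight single quotes (U+0027) protect the text, so $a is not expanded.
straight=$(echo 'Hello $a')

# Curly single quotes (U+2018 and U+2019) are ordinary characters to bash.
# The variable is expanded and the quotation marks themselves are printed.
curly=$(echo ‘Hello ${a}’)

echo "$straight"   # Hello $a
echo "$curly"      # ‘Hello World’
```

Here bash reports no error at all, which makes the wrong output easy to miss.
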

I write my books and articles in CommonMark (Markdown) and output them as HTML, ODT and PDF documents. These documents will not have such errors. When someone edits the document (before it goes to print) in a rich-text editor such as LibreOffice or Microsoft Office, the editor’s autocorrect feature may replace ordinary quotation marks and backquotes with curly quotation marks. Just be aware that this can happen. To avoid mistakes, type the commands by hand. Do not copy-paste them.

Summary

I am sure you will also conclude that bash code can be very cryptic. A lot of production code (industrial-strength shell scripts) is hundreds of lines long. If bash were not so succinct and powerful, it would take forever to write those lines. If you are doing any kind of serious shell scripting, then it is best you know all about bash’s myriad secrets. I think I have covered enough of them to kindle your interest. You are on your own now.

You will find all this and more in my book ‘Linux Command Line Tips & Tricks’. It is free for download from most popular ebook stores.
