
Expanding the Capabilities of Shell

  • bash has gained a number of convenient new features.
  • This article introduces features added in version 4.0 and later.
  • For shell-script compatibility, caution is needed before adopting new features; even if you never use them, it is better to know about them.

Let's go through these new features and think about what problems they solve.

In September 2016, bash was updated to version 4.4, adding many new features. Still, it is hard to muster much excitement at each bash release. For example, when you try to keep shell scripts POSIX-compliant, these new features are mostly off limits (*1). And even when combining commands interactively, the features you actually reach for are mostly pipes and redirection.
The typical reaction seems to be: "New features added to bash!" "Oh..."

Of course there are reasons for adding new features; at minimum, someone judged each of them necessary enough to approve the addition.

Moreover, some features expose, through bash's own syntax, the data and flags that bash manages internally, which ordinary users never otherwise see. Even if you do not use a feature, understanding it helps you visualize the structure of the shell. The purpose of this article is therefore to look at the features added in bash 4.0 and later with questions like "Why was this feature added?" and "What is it good for?" in mind.

Preparing for bash 4.4

The bash sources are available at https://ftp.gnu.org/gnu/bash/, so download version 4.4 and use it. In most UNIX-like OS environments it can be installed manually following Instruction 1. The procedure is similar for older versions, so to compare old and new, change the tar.gz file name in Instruction 1 and install a different bash version. The environment used for verification in this article is Ubuntu 16.04 Server.

Changes between versions are recorded in the CHANGES file in the directory created by extracting the tar archive. If you are interested, take a look.

Also, check the BASH_VERSION variable frequently when switching versions to work. For example, with versions 4.4 and 3.2 installed in the PATH as in Instruction 1, you can switch between them like this:

$ bash4.4
$ echo $BASH_VERSION
4.4.0(1)-release

Version 3.2, installed the same way as in Instruction 1, works similarly:

$ bash3.2
(Warnings related to associative arrays, covered later, may appear, but there is no need to worry.)
$ echo $BASH_VERSION
3.2.57(1)-release
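Incidentally, scripts that must run on several versions can guard new features themselves. Here is a minimal sketch of my own (not from the bash manual) using the BASH_VERSINFO array, whose first element holds the major version number; globstar, used as the guarded feature here, is introduced later in this article:

# A minimal sketch: enable a 4.0-only feature just when the running
# bash is new enough. BASH_VERSINFO[0] holds the major version.
if (( BASH_VERSINFO[0] >= 4 )) ; then
    shopt -s globstar    # globstar exists only in bash 4.0 and later
else
    echo "globstar requires bash 4.0 or later" >&2
fi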

Aggregating Standard Output and Standard Error with the Pipe Operator |&

Let's start with a simple topic. Version 4.0 added |& as a pipe operator. When using bash, you sometimes want to merge standard output and standard error and then pipe the combined stream to another command, but up to version 3.2 you had to write this:

$ ls a aa 2>&1 | nl
     1  ls: cannot access 'aa': (..omitted..)
     2  a

That is, you had to write the file-descriptor manipulation 2>&1 yourself.
This one-liner merges the standard output and standard error of `ls` with 2>&1 and pipes the combined stream onward: both streams of `ls` go into the pipe, and `nl` numbers the lines. File descriptors are a side topic in this article, so the explanation is kept brief.

Since version 4.0, there is no need to worry about such details. Instead, you can write:

$ ls a aa |& nl
     1  ls: cannot access 'aa': (..omitted..)
     2  a

and the task is done. That said, saving a few characters is not new functionality in itself.
So which is more convenient? Here is a case where it matters. When a regular user runs `find` on a directory tree like the one below, readable paths go to standard output while permission errors go to standard error:

$ find /proc/
/proc/
/proc/fb
(..omitted..)
find: '/proc/tty/driver': Permission denied
find: '/proc/1/task/1/fd': Permission denied
(..omitted..)

▼ Instruction 1: install bash
$ wget https://ftp.gnu.org/gnu/bash/bash-4.4.tar.gz
$ tar zxvf bash-4.4.tar.gz
$ cd bash-4.4/
$ ./configure && make -j
$ sudo cp bash /usr/local/bin/bash4.4
$ bash4.4 --version
GNU bash, version 4.4.0(1)-release (x86_64-unknown-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Before |&, when you wanted to look at the two output streams of `find` separately in the terminal, you would write:

$ find /proc/ 2> /dev/null
↑ View only standard output
$ find /proc/ > /dev/null
↑ View only the errors

Or split them into files:

$ find /proc/ > a 2> b
↑ Store standard output in a and the errors in b

Or merge them into one file for viewing:

$ find /proc/ > a 2>&1
↑ Merge both outputs and store them in a file

With |&, this becomes more convenient, for example in combination with other commands and `less`:

$ find /proc/ |& less
↑ View both outputs
$ find /proc/ |& grep ^find: | less
↑ View only the errors
$ find /proc/ |& grep -v ^find: | less
↑ View only standard output

This way you can pick out either stream without creating throwaway files like a and b.
All of this was possible without |&, of course, but with it, perhaps these idioms will be used more often. One point to note: the order in which standard output and standard error reach the pipe is not specified.
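You can observe this with `find` itself: when standard output goes into a pipe it is block-buffered by the C library, while standard error is unbuffered, so error lines can appear earlier than output lines the command actually wrote first. A quick, non-deterministic check:

$ find /proc/ |& nl | less
↑ Compare the numbered order with what a bare terminal shows; the
  interleaving of output and error lines can differ from run to run.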

A related operator, &>>, was also added in 4.0; it merges standard output and standard error and appends them to a file:

$ ls a b &> result
↑ Merge both outputs and redirect them to a file
$ ls a b &>> result
↑ Merge both outputs and append them to the file
$ cat result
ls: cannot access 'a': (..omitted..)
b
ls: cannot access 'a': (..omitted..)
b

(*1) One could say "then write in sh", but when bash runs in sh-emulation mode, the situation becomes even more complicated than this.


** — globstar

bash has a built-in `shopt` command that enables or disables many optional features, like this:

$ shopt -s globstar

By doing this, the globstar feature becomes available. To see how it is used, consider a directory containing the following directories and files (xargs is used to save space in the magazine):

$ find | xargs
. ./c ./a ./b ./b/f ./b/e ./b/h ./b/h/i ↩
./b/h/j ./b/g

If you run the command `echo **` in this directory, it is shown as follows:

$ echo **
a b b/e b/f b/g b/h b/h/i b/h/j c

The directory structure is listed much as with the `find` command.
(Note: directories and files whose names start with "`.`" are not included in the output.)

** is the globstar pattern.

Furthermore, to extract only the directories with `find`, you have to specify an option:

$ find -type d
.
./c
./b
./b/h

With globstar, on the other hand, simply writing **/ extracts the same information:

$ echo **/
b/ b/h/ c/

Then, combined as follows, you can extract all the files with a `.conf` extension anywhere under `/etc/`:

$ echo /etc/**/*.conf | tr ' ' '\n' | head -n 3
/etc/adduser.conf
/etc/apache2/apache2.conf
/etc/apache2/conf-available/charset.conf
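Since ** works anywhere a glob does, it can also drive a loop directly. A small sketch along those lines (the `.conf` pattern is just an illustration):

$ shopt -s globstar
$ for f in /etc/**/*.conf ; do
>     wc -l "$f"    # e.g. count the lines of each .conf file
> done | head -n 3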

To use this feature routinely on a machine, write the following in `.bashrc` or `.bash_profile`:

shopt -s globstar

If you only want to use it occasionally, run the same command in the terminal:

$ shopt -s globstar

To turn off this feature, specify the option `-u` instead of `-s`.
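Incidentally, `shopt` without `-s` or `-u` reports the current state of an option, and `-q` does the same silently via the exit status, which is handy in scripts. A short sketch, assuming globstar is currently on:

$ shopt globstar
globstar        on
$ shopt -q globstar && echo "globstar is enabled"
globstar is enabled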

One caveat: unlike the `find` command, ** sorts its results, so it can be slow on directories containing many files. For the measurements below, I timed `find` and ** both in the small directory from the example above and across the entire file system, on the same Ubuntu 16.04 Server environment.

Each measurement was run several times beforehand, so the cache was warm.

When the directory has a small number of files:

$ time find ./ > /dev/null

real 0m0.007s
user 0m0.000s
sys 0m0.004s

$ time echo ** > /dev/null

real 0m0.001s
user 0m0.000s
sys 0m0.000s

When the directory has many files:

$ time find ./ > /dev/null

real 0m0.963s
user 0m0.368s
sys 0m0.588s

$ time echo /** > /dev/null

real 0m7.291s
user 0m6.092s
sys 0m1.192s

As the numbers show, when there are many files the runtimes of `find` and ** diverge considerably, with ** far slower. When there are few files, `find` is slower because it is an external command (process startup dominates), while `echo` is faster because it is a builtin.

Other Features That Can Be Set Using shopt

Here I also introduce the `failglob` and `autocd` options set via `shopt`. There are many others; to learn more, refer to the description of `shopt` in `man bash`.

`failglob` has been available since version 3.0.

$ touch *.txt

Suppose you run the above to update the timestamps of the files with a `.txt` extension. However, if no such files exist in the current directory, the unmatched pattern is passed to `touch` literally, and a file named `*.txt` is created:

$ touch *.txt
$ ls
*.txt

This is annoying, and if files with special characters in their names get created in a directory, it can pose real risks.

To prevent this, define it in the `.bashrc` as follows:

shopt -s failglob

Then it will show an error like this:

$ touch *.txt
-bash: no match: *.txt

Personally, I find this option quite convenient, but unfortunately it can interfere with tab completion in some situations.
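The behavior inside a script is also worth knowing. In the following sketch, if no `.txt` file exists, bash reports the failed expansion and the loop body is never run with a literal `*.txt`:

shopt -s failglob
for f in *.txt ; do
    echo "processing $f"    # with failglob, never sees a literal '*.txt'
done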

`autocd` is a feature that removes the need to type `cd`: just enter a directory name in the terminal and the shell changes to that directory. Here's an example:

$ shopt -s autocd
$ /etc/
cd /etc/
$ ~
cd /home/ueda → corresponds to the current user

You might find it convenient when you’re used to it, but it can be confusing when you switch to an environment that doesn’t use this feature.

Associative Arrays – Reading Data into an Array

Associative arrays can be used from version 4.0 onwards. Figure 2 shows an example of how to use them.

Figure 2: Using associative arrays
$ declare -A tel           ← create associative array tel
$ tel[police]=113          ← assign value 113 to key police
$ tel[cuu_hoa]=114         ← assign value 114 to key cuu_hoa
$ echo ${tel[police]}      ← read the value for key police
113
$ echo ${tel[time]}        ← a key that was never assigned
$                          ← no output
$ echo ${!tel[@]}          ← list the keys
police cuu_hoa

Since I rarely use this feature myself, it is hard to offer a suitable example of my own, but here is one showing how flags are managed for users inside bash.

The following was verified on Ubuntu Server 16.04 and 17.04. First, enable command tracing:

$ set -x

Then, reloading the settings and searching the trace for `declare` with `|&` gives this output:

$ source ~/.bashrc |& grep declare
+ grep --color=auto declare
+ source /home/ueda/.bashrc
+++ declare -A _xspecs

The last line shows that an associative array `_xspecs` was created. This array is populated with data used for autocompletion:

$ echo ${_xspecs[latex]}
!*.@(?(la)tex|texi|dtx|ins|ltx|dbj)

In this entry, the file name patterns associated with completing arguments of the `latex` command are registered.

Associative arrays are the right tool when you need immediate lookups of stored data by key. (Anyone interested in the autocompletion mechanism should investigate further with the keyword `bash-completion`.) Also, since the completion data is managed in bash's own syntax, bash users can customize it.
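As one small illustration of that kind of key lookup, here is a sketch of my own (not taken from bash or bash-completion) that counts how many times each word appears on standard input:

declare -A count
while read -r word ; do
    count[$word]=$(( ${count[$word]:-0} + 1 ))    # unseen keys default to 0
done
for w in "${!count[@]}" ; do
    echo "$w ${count[$w]}"
done

For example, printf 'police\ncuu_hoa\npolice\n' piped into this script prints "police 2" and "cuu_hoa 1" (the key order is unspecified).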

From a personal standpoint, however, I don't recommend using associative arrays indiscriminately. For managing flags, handling them as in the `_xspecs` example is fine, but for processing data I believe files are more effective.

For example, when introducing associative arrays to beginners, a simple example like police and cuu_hoa in Figure 2 is fine, followed by a loop such as this:

for key in ${!tel[@]} ; do
    echo $key ${tel[$key]}    # handle each key and value here
done

Even so, this is not ideal. In an ordinary programming language the equivalent would be unremarkable, but for beginners the pile of braces and symbols is confusing. I find the notation for associative arrays confusing myself, so I will close this topic here.

One more feature, which likewise should not be overused: reading data from a file into an array has been possible since version 4.0. Here's an example using the `mapfile` builtin:

$ mapfile -t passwd < /etc/passwd
$ echo ${passwd[0]}
root:x:0:0:root:/root:/bin/bash
$ echo ${passwd[-1]}
ueda:x:1001:1001:ueda,,,:/home/ueda:/bin/bash

This example copies the contents of `/etc/passwd` into the array `passwd`, and `echo` prints elements of it. The `-t` option makes `mapfile` strip the trailing newline from each line. Additionally, negative indices can be used, as with Python lists, to reference elements from the end; this was added in version 4.3.

The `mapfile` command is useful for holding small amounts of data retrieved within bash. However, unless there is truly no external command for the job, I suggest not using it to process data in a `for` loop.
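If you do hold small data this way, one convenient companion is `read` with a custom IFS to split an element into fields. A sketch, assuming the /etc/passwd layout shown above:

$ mapfile -t passwd < /etc/passwd
$ IFS=: read -r user _ uid _ <<< "${passwd[0]}"    ← split the first entry on ':'
$ echo "$user $uid"
root 0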

Co-process

Next, I introduce the co-process, available since version 4.0. This is the most complicated topic in this article, so let's start with an example.

$ coproc awk '{print $1*2;fflush()}'
[1] 10872
$ seq 1 3 >&"${COPROC[1]}"
$ read n <&"${COPROC[0]}" ; echo $n
2
$ read n <&"${COPROC[0]}" ; echo $n
4
$ read n <&"${COPROC[0]}" ; echo $n
6

In the first line of this example, an `awk` program is launched with the `coproc` command. The `awk` program reads lines whose first field is a number, multiplies that number by 2, and writes the result to standard output; the `fflush()` call flushes the output buffer after each line. On the third line, `seq 1 3` outputs 1, 2, and 3, redirected to the file descriptor `${COPROC[1]}`. At this point, the `COPROC` array contains the following numbers:

$ echo ${COPROC[@]}
63 60

Thus `>&"${COPROC[1]}"` becomes `>&60`, and the output goes to file descriptor 60. Likewise, `read n <&"${COPROC[0]}"` becomes `read n <&63`: it reads a line from descriptor 63 into `n`, and `echo` prints the value. The results 2, 4, and 6 are each input multiplied by 2. It is as if the `awk` program were kept running off to one side. Its process ID is saved in the `COPROC_PID` variable, so it can be stopped as follows:

$ kill -KILL $COPROC_PID
[1]+  Killed                  coproc COPROC awk '{print $1*2;fflush()}'

This is the simplest use of a co-process, and your first impression may well be "What is this...?". In summary:

  • `coproc` launches the specified command in the background.
  • The command's input and output are reached through file descriptors kept in the `coproc` variables.
  • You can write to and read from the launched command at any time.

You can think of it as turning a command line into a small resident server.

If that is hard to picture, here is a practical example. List 1 is a program that computes the greatest common divisor (gcd), written in a mix of bash and `awk`. Of course it could be written entirely in bash or entirely in `awk`, but here both are used on purpose. The point of the example is not the algorithm but that a helper is called over and over inside bash's `while` loop: `awk` computes the remainder of the two numbers and outputs it, and bash feeds the result back to `awk` again.

List 1: gcd.no_coproc.bash
function sub() {
    awk '$1>$2{print $1%$2,$2;fflush()}
         $1<=$2{print $2%$1,$1;fflush()}'
}

while [ "$(($1*$2))" -ne 0 ] ; do
    set -- $(echo $1 $2 | sub)
    echo ">" $1 $2 >&2
done
echo "$(($1+$2))"

Here the `awk` program is launched afresh on every pass through the loop, so the script is slow; the overhead of starting `awk` each time hurts especially when the loop runs many iterations.

Rewriting the program to use a co-process, as in List 2, removes the repeated launching of `awk`. A co-process can be given a name, so you can write calls to it much as if it were a function.

List 2: gcd.coproc.bash
coproc sub {
    awk '$1>$2{print $1%$2,$2;fflush()}
         $1<=$2{print $2%$1,$1;fflush()}'
}

a="$1" b="$2"
while [ $((a*b)) -ne 0 ] ; do
    echo $a $b >&"${sub[1]}"
    read a b <&"${sub[0]}"
    echo ">" $a $b >&2
done
echo $((a+b))

Now let's compare the runtimes. Run both scripts under the `time` command (Figure 3) and compare the `real` values (total elapsed time): the co-process version is about four times faster (0.008s versus 0.036s). This example loops only nine times; as the iteration count grows, so does the gap.

Figure 3: Runtime comparison
$ time bash ./gcd.no_coproc.bash 10710 102012
> 5622 10710
(..omitted..)
> 6 12
> 0 6
6
real 0m0.036s

$ time bash ./gcd.coproc.bash 10710 102012
6
real 0m0.008s

Thus, the co-process approach pays off in repetitive processing. Keep in mind, however, that `read` blocks (the program stops and waits) if the co-process has not produced any output yet.
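If blocking is a concern, the `-t` option of `read` sets a timeout in seconds (recent bash versions also accept fractional values). A sketch against the `sub` co-process from List 2:

if read -t 1 a b <&"${sub[0]}" ; then
    echo "got: $a $b"
else
    echo "no reply within 1 second" >&2    # read returns non-zero on timeout
fi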

Appendix – pipefail

`pipefail` has been available since version 3.0, but it is a small feature, so I introduce it here as an appendix. `pipefail` is an option that makes a pipeline count as failed when any command in it fails. sh and bash also have the `-e` option, which stops a script when a command fails, but on its own it judges a pipeline only by its last command, which can be confusing.

List 3 set_e.bash
set -e
false | true
echo "NG"

List 4 set_e_pipefail.bash
set -e
set -o pipefail
false | true
echo "NG"

List 3 sets the `-e` option on line 1, and the `false` on line 2 fails. Nevertheless, the script does not stop at line 2, and running it prints:

$ bash set_e.bash
NG

To make it stop at line 2, add the following line, as in List 4:

set -o pipefail

When running, the result will be:

$ bash set_e_pipefail.bash
(no output)
$
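Related to this, the `PIPESTATUS` array variable records the exit status of every command in the most recently executed pipeline, which tells you exactly which stage failed:

$ false | true
$ echo "${PIPESTATUS[@]}"
1 0
↑ false (the first stage) failed with 1; true (the second) succeeded with 0.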

