Some POSIX Tricks
---
Unless otherwise noted, all information below pertains to the POSIX.1-2008 standard.
Shell
The POSIX shell built-in utilities are:
- break
- colon (:)
- continue
- dot (.)
- eval
- exec
- exit
- export
- readonly
- return
- set
- shift
- times
- trap
- unset
All other POSIX utilities are external to the shell. Most modern shells include common external utilities as built-ins, such as 'pwd', 'read', 'test' ('[') and 'true'. Control flow statements, such as 'if'/'then'/'fi', 'while'/'do'/'done', and 'case'/'esac' are reserved words in the shell and are not considered utilities.
An exit status of 0 is true/success; a nonzero exit status is false/failure. Contrast with languages that use Boolean algebra (1 is true; 0 is false).
POSIX '.' behaves differently depending on whether the argument given to it contains a slash. If so, '.' uses the file at the given path; otherwise, the file must be located somewhere in $PATH. This is the same behavior the shell uses when executing a file directly.
'set -C' prevents '>' from overwriting existing files. You can override it at run time with '>|'.
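A minimal sketch of the noclobber behavior described above. The scratch file path and the variable names are illustrative assumptions.

```shell
# Hypothetical scratch file for the demonstration.
file=${TMPDIR:-/tmp}/noclobber-demo.$$
rm -f "$file"
printf 'first\n' > "$file"

set -C                                   # refuse to overwrite existing files with '>'
blocked=no
# The brace group lets 2>/dev/null hide the shell's own error message.
{ printf 'second\n' > "$file"; } 2>/dev/null || blocked=yes
printf 'third\n' >| "$file"              # '>|' overrides 'set -C' for this redirection
set +C                                   # restore the default behavior

contents=$(cat "$file")
rm -f "$file"
printf '%s %s\n' "$blocked" "$contents"
```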
To force input from a terminal (as opposed to a pipe or other file descriptor), redirect input from the '/dev/tty' special file.
Use $PWD to get the current working directory without having to call an external utility.
POSIX shell does not have arrays. A crude workaround is to put elements in positional arguments with 'set'.
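A minimal sketch of the positional-parameter workaround. Note that 'set --' replaces any arguments the script or function already holds; the element values here are made up.

```shell
# Store three elements in the positional parameters.
set -- "first item" "second item" "third item"

count=$#          # number of elements
second=$2         # index into the "array"

# Append an element, then iterate over all of them.
set -- "$@" "fourth item"
joined=
for item in "$@"; do
    joined="${joined}[${item}]"
done
printf '%s\n' "$count" "$second" "$joined"
```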
You can specify a "default" value for a variable by using parameter expansion. The 8 default-value functions are:
- ${VAR:-word}: If $VAR is unset, return 'word' and do not set $VAR. If $VAR is set but empty, return 'word' and do not change $VAR. If $VAR is set and nonempty, return $VAR.
- ${VAR-word}: If $VAR is unset, return 'word' and do not set $VAR. If $VAR is set but empty, return null. If $VAR is set and nonempty, return $VAR.
- ${VAR:=word}: If $VAR is unset, assign 'word' to $VAR and return. If $VAR is set but empty, assign 'word' to $VAR and return. If $VAR is set and nonempty, return $VAR.
- ${VAR=word}: If $VAR is unset, assign 'word' to $VAR and return. If $VAR is set but empty, return null. If $VAR is set and nonempty, return $VAR.
- ${VAR:?word}: If $VAR is unset, return an error. If $VAR is set but empty, return an error. If $VAR is set and nonempty, return $VAR.
- ${VAR?word}: If $VAR is unset, return an error. If $VAR is set but empty, return null. If $VAR is set and nonempty, return $VAR.
- ${VAR:+word}: If $VAR is unset, return null. If $VAR is set but empty, return null. If $VAR is set and nonempty, return 'word'.
- ${VAR+word}: If $VAR is unset, return null. If $VAR is set but empty, return 'word'. If $VAR is set and nonempty, return 'word'.
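A few of the forms listed above, sketched on a throwaway variable. The names 'a' through 'e' are illustrative only.

```shell
unset -v VAR
a=${VAR:-fallback}      # VAR unset: yields "fallback", VAR stays unset

VAR=
b=${VAR-fallback}       # VAR set but empty: yields the empty string
c=${VAR:-fallback}      # VAR set but empty: yields "fallback"
d=${VAR:+alternate}     # VAR empty: yields the empty string
e=${VAR+alternate}      # VAR set (even if empty): yields "alternate"

: "${VAR:=assigned}"    # VAR empty: assigns "assigned" to VAR
printf '%s\n' "$a" "$b" "$c" "$d" "$e" "$VAR"
```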
To test if a string contains a substring, use 'case'. This is the only POSIX method that uses no external utilities at all. It also enables checking for several substrings at once.
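A minimal sketch of the 'case' approach. The helper names 'contains' and 'classify' are hypothetical.

```shell
# Succeeds if $1 contains $2. Quoting "$2" makes it match literally.
contains() {
    case $1 in
        *"$2"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Several substrings can be checked in one statement.
classify() {
    case $1 in
        *error*|*fail*) echo bad ;;
        *warn*) echo caution ;;
        *) echo ok ;;
    esac
}

if contains "hello world" "lo w"; then echo "substring found"; fi
classify "disk failure"
```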
Create an empty file, or truncate an existing file, by writing ':' to it. This has the advantage of wiping a file while retaining its inode.
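A short sketch of truncation with ':'. The scratch path is an assumption, and 'ls -i' is used only to show that the inode number is unchanged.

```shell
file=${TMPDIR:-/tmp}/truncate-demo.$$
printf 'some data\n' > "$file"

inode_before=$(ls -i "$file")
: > "$file"                          # truncate in place; creates the file if missing
inode_after=$(ls -i "$file")
size=$(($(wc -c < "$file")))         # arithmetic expansion drops any padding from wc

rm -f "$file"
printf '%s\n' "$size"
```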
Get the directory of a path without calling 'dirname'. This can be done entirely with string manipulation.
If the path is empty, use '.'. Otherwise, remove all trailing slashes, set to '.' if no slashes remain, remove the last level of the path, remove all trailing slashes again, and set to '/' if nothing remains.
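The steps above can be sketched as follows. The function name 'dir_of' and the '_p' variable are hypothetical. This sketch adds one check the prose does not spell out: a path made up only of slashes yields '/'.

```shell
dir_of() {
    _p=$1
    [ -n "$_p" ] || { printf '.\n'; return; }
    # ${_p##*[!/]} is the run of trailing slashes; strip it.
    _p=${_p%"${_p##*[!/]}"}
    case $_p in
        '') printf '/\n'; return ;;      # the path was only slashes (added check)
        */*) ;;                          # at least one slash remains: keep going
        *) printf '.\n'; return ;;       # no slashes remain
    esac
    _p=${_p%/*}                          # remove the last level
    _p=${_p%"${_p##*[!/]}"}              # strip trailing slashes again
    [ -n "$_p" ] || _p=/
    printf '%s\n' "$_p"
}

dir_of /usr/lib/
```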
Get the base portion of a path without calling 'basename'. Simply remove all trailing slashes and everything before the final remaining slash from the path.
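A minimal sketch of that approach. The function name 'base_of' is hypothetical, and the special case for a path made up only of slashes is an addition that mirrors what 'basename' prints for '/'.

```shell
base_of() {
    _p=$1
    # ${_p##*[!/]} is the run of trailing slashes; strip it.
    _p=${_p%"${_p##*[!/]}"}
    if [ -z "$_p" ] && [ -n "$1" ]; then
        printf '/\n'                 # the path was only slashes (added check)
        return
    fi
    printf '%s\n' "${_p##*/}"        # drop everything up to the final slash
}

base_of /usr/lib/
```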
'until' executes as long as the command given to it returns a nonzero exit status. This is equivalent to 'while !' but is more readable.
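A small illustration with a made-up counter:

```shell
attempts=0
until [ "$attempts" -ge 3 ]; do
    attempts=$((attempts + 1))
done
# Same loop with 'while !':
#   while ! [ "$attempts" -ge 3 ]; do ...; done
printf '%s\n' "$attempts"
```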
Remove all instances of a substring from a string without using 'sed' or 'awk'. Check if the substring is present, and as long as it is, remove the first occurrence and check again.
Do not use '${string//"$sub"/}'. Some sources claim this syntax is POSIX-compliant, but it is not.
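One way to write the check-and-remove loop, as a sketch. The function name 'remove_all' is hypothetical, and the guard against an empty substring is an added assumption to avoid looping forever.

```shell
remove_all() {
    _s=$1
    _sub=$2
    [ -n "$_sub" ] || { printf '%s\n' "$_s"; return; }
    while :; do
        case $_s in
            # Keep the text before the first occurrence plus the text after it.
            *"$_sub"*) _s=${_s%%"$_sub"*}${_s#*"$_sub"} ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$_s"
}

remove_all "a-b-c-d" "-"
```

Because the loop re-checks after every removal, an occurrence that only forms once earlier text is removed is deleted too.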
Iterate through the contents of a directory with a 'for' loop. Beware: if the target directory is empty (or the glob matches nothing), the pattern is left as a literal string, '*' included! Check that each result exists before operating on it.
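A minimal sketch of the existence check. The scratch directory and the helper name 'count_entries' are assumptions made for the demonstration.

```shell
demo=${TMPDIR:-/tmp}/for-glob-demo.$$
mkdir -p "$demo/full" "$demo/empty"
: > "$demo/full/a.txt"
: > "$demo/full/b.txt"

count_entries() {
    _n=0
    for _entry in "$1"/*; do
        # With no match, $_entry is the literal pattern, which does not exist.
        # The -L test keeps dangling symlinks, which fail -e.
        [ -e "$_entry" ] || [ -L "$_entry" ] || continue
        _n=$((_n + 1))
    done
    printf '%s\n' "$_n"
}

full_count=$(count_entries "$demo/full")
empty_count=$(count_entries "$demo/empty")
rm -rf "$demo"
printf '%s %s\n' "$full_count" "$empty_count"
```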
Group commands with either braces or parentheses. Braces execute the group in the current shell environment; parentheses spawn a subshell for the group.
When using braces, all commands must be delimited, e.g. with a semicolon or a newline.
This is handy when several commands need to use data from a single pipe.
Parentheses enable the shell to execute commands without modifying the current environment.
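Both kinds of grouping, sketched with made-up data and variable names:

```shell
# Braces: 'read' and 'cat' consume the same pipe, one after the other.
shared=$(
    printf '%s\n' header row1 row2 | {
        IFS= read -r first
        printf 'first=%s\n' "$first"
        cat
    }
)

# Parentheses: the assignment and the 'cd' stay inside the subshell.
setting=original
start_dir=$PWD
( setting=changed; cd / )

printf '%s\n' "$shared" "$setting"
```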
':' behaves identically to 'true'. It saves an external call but is less readable.
When writing a here-document, '<<-' strips leading tabs from the input lines, including from the delimiter. This is useful for managing indentation in a script.
Here-documents can also be used to feed a variable into a command. This avoids piping, from 'printf' for example. It can also be used to bypass argument length limits in resource-constrained environments.
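A minimal sketch of feeding a variable to a command through a here-document. The helper name 'upper' is hypothetical. Note that the here-document appends a newline after the variable's value.

```shell
upper() {
    # $1 reaches 'tr' on standard input, not as an argument.
    tr '[:lower:]' '[:upper:]' <<EOF
$1
EOF
}

data='alpha beta gamma'
shout=$(upper "$data")
printf '%s\n' "$shout"
```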
Use 'exec' to manipulate file descriptors.
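A short sketch that opens, uses and closes a descriptor with 'exec'. The choice of descriptor 4 and the scratch file path are arbitrary assumptions.

```shell
log=${TMPDIR:-/tmp}/exec-fd-demo.$$

exec 4> "$log"            # open fd 4 for writing
printf 'first\n' >&4
printf 'second\n' >&4
exec 4>&-                 # close fd 4

exec 4< "$log"            # reopen fd 4 for reading
IFS= read -r line1 <&4
IFS= read -r line2 <&4    # continues where the previous read stopped
exec 4<&-

rm -f "$log"
printf '%s %s\n' "$line1" "$line2"
```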
Swap the STDOUT and STDERR of a command. The following command will print the contents of "/tmp/exists.txt" to STDERR and an error message about "/tmp/not-exists.txt" to STDOUT.
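One common way to write the swap uses a spare descriptor (3 here, chosen arbitrarily). The sketch creates '/tmp/exists.txt' itself so that it is self-contained; the wrapper function 'swapped' and the two capture variables exist only to show where each stream ends up.

```shell
printf 'file contents\n' > /tmp/exists.txt
rm -f /tmp/not-exists.txt

swapped() {
    # 3>&1 saves stdout, 1>&2 points stdout at stderr,
    # 2>&3 points stderr at the saved stdout, 3>&- closes the spare fd.
    cat /tmp/exists.txt /tmp/not-exists.txt 3>&1 1>&2 2>&3 3>&-
}

# 'cat' exits nonzero because one file is missing, hence the '|| :'.
on_stdout=$(swapped 2>/dev/null) || :
on_stderr=$(swapped 2>&1 >/dev/null) || :

rm -f /tmp/exists.txt
printf 'stdout got: %s\nstderr got: %s\n' "$on_stdout" "$on_stderr"
```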
The following pseudo-ternary looks like a replacement for 'if'/'then'/'else', but it has a flaw.
If the second command returns a nonzero exit status, the third command will execute, even if the first command returns 0.
You can partially emulate the intended behavior by grouping the second command and returning 0 at the end of the group via 'true'. This causes the return status of the second command to be overwritten. Only use a pseudo-ternary if you are sure you don't need the return value of the second command, or if you don't need to check whether the second command succeeds or fails (for example, if the result will be checked later).
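The flaw and the grouped workaround, sketched with 'true' and 'false' standing in for real commands. The variable names are illustrative.

```shell
# Flawed form: the second command prints "yes" but returns nonzero,
# so the third command runs as well.
flawed=$(true && { echo yes; false; } || echo no)

# Grouped form: the trailing 'true' masks the second command's status.
grouped=$(true && { echo yes; false; true; } || echo no)

# When the first command fails, only the third command runs.
else_branch=$(false && { echo yes; true; } || echo no)

printf '%s\n' "$flawed" "$grouped" "$else_branch"
```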
Calculate the day of the week with Zeller's Congruence. This replaces 'date +%w' and avoids a subshell. 0 is Saturday, 1 is Sunday, and so on. Note that this numbering differs from 'date +%w', where 0 is Sunday.
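A sketch of Zeller's Congruence for the Gregorian calendar. It uses the '+ 5 * J' form of the formula so that the intermediate sum never goes negative. The function name 'day_of_week' is hypothetical. Pass the numbers without leading zeros, since shell arithmetic reads '08' as invalid octal.

```shell
day_of_week() {
    # $1: year, $2: month (1-12), $3: day of month
    _y=$1
    _m=$2
    _d=$3
    # January and February count as months 13 and 14 of the previous year.
    if [ "$_m" -lt 3 ]; then
        _m=$((_m + 12))
        _y=$((_y - 1))
    fi
    _k=$((_y % 100))     # year within the century
    _j=$((_y / 100))     # zero-based century
    printf '%s\n' "$(( (_d + 13 * (_m + 1) / 5 + _k + _k / 4 + _j / 4 + 5 * _j) % 7 ))"
}

day_of_week 2000 1 1     # prints 0 (Saturday)
```

The function prints its result, so capturing it with command substitution does spawn a subshell; assign to a variable inside the function instead if that matters.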
Replacements for Non-POSIX Commands
Replace '!' (historical command execution):
Use 'fc'. By default it edits the selected commands before running them, making it more robust than '!'.
List a range of commands in reverse order.
Execute the last command without editing.
Replace 'dos2unix -O':
Remove the final carriage return from any line that "ends" with one using 'sed'.
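A minimal sketch. The helper name 'strip_cr' is hypothetical. The carriage return is generated with 'printf' because '\r' inside a 'sed' regular expression is not guaranteed by POSIX.

```shell
cr=$(printf '\r')            # a literal carriage return

strip_cr() {
    # Reads the named files, or standard input when none are given.
    sed "s/$cr\$//" "$@"
}

# Show the result as hex so the missing 0d bytes are visible.
cleaned_hex=$(printf 'one\r\ntwo\r\n' | strip_cr | od -A n -t x1 | tr -d ' \n')
printf '%s\n' "$cleaned_hex"
```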
It's a good idea to verify that the input file is not binary before running this command.
Replace 'xxd -p':
Create a hex dump of data using 'od', then remove spaces with 'tr'.
The outputted hex is identical, but the formatting is not: 'od' prints 16 bytes per line, while by default 'xxd' prints 30 bytes per line. To emulate 'xxd -p -c0' (no column size limit), remove newlines as well as spaces.
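A sketch of the 'od' plus 'tr' pipeline in its no-column-limit form, using a made-up input string:

```shell
# -A n suppresses the offset column; -t x1 prints one hex byte at a time.
hex=$(printf 'Hello, POSIX!' | od -A n -t x1 | tr -d ' \n')
printf '%s\n' "$hex"
```

Dropping '\n' from the 'tr' argument keeps the line layout that 'od' produces.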
Replace 'xxd -r -p':
'fold' the hex dump into single bytes and loop over them. 'bc' converts each hexadecimal byte to octal, then 'printf' prints the raw byte corresponding to the octal. Each line outputted by 'fold' must contain exactly 2 characters. Hexadecimal digits must be in uppercase. The dump can contain newlines (\n) if they do not split bytes; no other non-digit characters can be present.
Note: this is orders of magnitude slower than 'xxd -r', since it spawns two subshells for each byte of output. It is only useful for very small amounts of data.
awk
The scripting language implemented by 'awk' is Turing-complete, and as such it can replace any 'grep' or 'sed' command. This can reduce the number of external dependencies in a script at the cost of being more verbose. For example, the following is a replacement for 'grep -q':
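One possible shape for such a replacement, as a sketch. The wrapper name 'awk_grep_q' is hypothetical. Two differences from 'grep -q' are worth keeping in mind: 'awk' patterns are extended regular expressions, and 'awk -v' processes backslash escapes in the assigned value.

```shell
awk_grep_q() {
    # $1: pattern; remaining arguments: files (standard input if none)
    _pat=$1
    shift
    # Stop at the first match; the END rule turns "found" into exit status 0.
    awk -v pat="$_pat" '$0 ~ pat { found = 1; exit } END { exit !found }' "$@"
}

if printf 'one\ntwo\n' | awk_grep_q 'tw'; then
    echo "match"
fi
```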
'awk' is the only POSIX source of randomness of any kind ($RANDOM and /dev/random are undefined). It prints random numbers between 0 and 1 using rand(). Initialize the random-number generator by calling srand() first.
To mimic the behavior of $RANDOM, set minimum and maximum values of 0 and 32767 respectively, then scale rand() between them.
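A minimal sketch of the scaling. The variable names 'min' and 'max' inside the 'awk' program are illustrative.

```shell
random=$(awk 'BEGIN {
    srand()                          # seed from the time of day
    min = 0
    max = 32767
    # rand() is in [0, 1), so the result is an integer from min to max.
    print int(min + rand() * (max - min + 1))
}')
printf '%s\n' "$random"
```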
Beware: this is not cryptographically secure, and it may not be random at all! POSIX specifies that srand() should use seconds since the epoch as a seed by default. This means in a given second, every random number generated by 'awk' will be the same. Some implementations require changing seeds manually. Wait at least 1 second after each call to give the seed time to change.
Because the default seed for srand() is the seconds since the epoch, you can use 'awk' to print the epoch, which POSIX 'date' does not support.
POSIX 'awk' can perform floating-point arithmetic and includes functions like sin(x), exp(x) and log(x). 'awk' can thus replace 'bc' for basic calculations; if you are already using 'awk' in a script, this can save a dependency.
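A few illustrative calculations; the variable names are made up.

```shell
e=$(awk 'BEGIN { printf "%.4f\n", exp(1) }')
ratio=$(awk 'BEGIN { print 10 / 4 }')
hypotenuse=$(awk -v a=3 -v b=4 'BEGIN { print sqrt(a * a + b * b) }')
printf '%s\n' "$e" "$ratio" "$hypotenuse"
```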
cat
POSIX 'cat' buffers output by default. If it is piped into another command, the command may not receive data until 'cat' outputs a sufficiently large chunk or even the whole input. Use '-u' to unbuffer.
The vast majority of 'cat' implementations today do not buffer output. They accept '-u' for compatibility but internally ignore it.
command
Checking the output of 'command -v' uses a subshell unnecessarily.
Instead, check the exit status of 'command -v' directly. You may want to suppress its output.
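A minimal sketch of the direct check. The helper name 'have' is hypothetical.

```shell
have() {
    # Succeeds if the shell can find a command called $1; output is discarded.
    command -v "$1" >/dev/null 2>&1
}

if have sh; then
    sh_status=found
else
    sh_status=missing
fi
printf '%s\n' "$sh_status"
```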
date
POSIX 'date' does not support numerical timezones, so it cannot write ISO 8601-compliant local date-times. You must use UTC and hardcode the timezone string.
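A sketch of the UTC form, with the 'Z' designator hardcoded in the format string:

```shell
stamp=$(date -u '+%Y-%m-%dT%H:%M:%SZ')
printf '%s\n' "$stamp"
```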
echo
POSIX 'echo' is ambiguous, and different implementations can use different features by default while still being POSIX-compliant. For consistent results, use 'printf'.
expr
When running a test with 'expr' (such as regular expressions), you can use its exit status in conditional statements directly. You may want to suppress its output.
Some implementations of 'expr' give an error if the string is empty. Use an extra fixed character to avoid this.
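A minimal sketch combining both points. The helper name 'is_number' is hypothetical, and 'X' is the extra fixed character, added to both the string and the pattern.

```shell
is_number() {
    # 'expr' patterns are basic regular expressions anchored at the start.
    expr "X$1" : 'X[0-9][0-9]*$' >/dev/null
}

if is_number 123; then echo "123 is a number"; fi
if ! is_number ""; then echo "the empty string is not"; fi
```

The prefix also protects against operands that 'expr' would otherwise read as operators.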
find
POSIX 'find' does not have the -mindepth or -maxdepth options. However, if a given name or path is a directory, -prune prevents its traversal. To emulate '-mindepth 1 -maxdepth 1' (list the contents of a directory but not subdirectories), prune all paths that do not match the target path for 'find'. When paired with the -type flag, this is a useful one-liner.
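A sketch of the '-prune' approach on a scratch directory (the paths are assumptions). Note that '-path' treats its argument as a pattern, so a target path containing glob characters would need escaping.

```shell
demo=${TMPDIR:-/tmp}/find-prune-demo.$$
mkdir -p "$demo/sub"
: > "$demo/top.txt"
: > "$demo/sub/nested.txt"

# The starting directory itself fails '! -path' and is skipped, but still
# descended into. Everything inside it is pruned, then filtered by type.
files=$(find "$demo" ! -path "$demo" -prune -type f)
dirs=$(find "$demo" ! -path "$demo" -prune -type d)

rm -rf "$demo"
printf '%s\n' "$files" "$dirs"
```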
fold
POSIX does not support substring addressing, so traditional methods of iterating through a string cannot be used. 'fold' works around this limitation by splitting a string into substrings of a fixed length, separated by newlines.
The above sample sentence was sourced from this Reddit post.
mkfifo
FIFO (first in, first out) special files, also called named pipes, look like regular files on disk but behave like pipes. Processes can send data to each other ephemerally by reading from and writing to the FIFO. Data only flows while the FIFO is open for both reading and writing: opening it for writing blocks until a reader opens it, opening it for reading blocks until a writer opens it, and a read returns EOF once every writer has closed it. Use '&' to unblock writes to a FIFO within a script.
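A minimal sketch of a background writer paired with a foreground reader. The FIFO path is an assumption.

```shell
fifo=${TMPDIR:-/tmp}/fifo-demo.$$
rm -f "$fifo"
mkfifo "$fifo"

# Without '&' this line would block until something opens the FIFO for reading.
printf 'hello through the pipe\n' > "$fifo" &

IFS= read -r message < "$fifo"
wait                      # reap the background writer
rm -f "$fifo"
printf '%s\n' "$message"
```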
You can open FIFOs as a file descriptor. If you open it for both reading and writing, the FIFO does not block on writes (up to the size of the pipe buffer). Note that 'cat' blocks in this example because the FIFO, which is open for reading, never sends an EOF.
POSIX.1-2024 introduced the ability for 'read' to split logical lines with a user-specified delimiter rather than always using a newline. You can append a NUL byte to the output of a command, redirect the output to the FIFO file descriptor, then 'read' from the FIFO into a variable using a NUL delimiter. This puts the entire command output into the variable without looping over lines and without spawning a subshell.
Note: if the filesystem on which you create the FIFO is mounted on memory, this will likely be faster than a subshell. Otherwise, accessing the FIFO uses the disk, which is slower than a subshell.
FIFOs are easily subject to race conditions. Do not perform multiple reads or writes on one FIFO simultaneously.
printf
Send an audible alert to the user.
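The alert is the BEL character, which 'printf' writes with the '\a' escape. Whether the terminal actually beeps depends on its settings. The function name 'alert' is hypothetical.

```shell
alert() {
    printf '\a'
}

alert
```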
tee
The output of 'tee' is guaranteed to be unbuffered, unlike 'cat'.
test ([)
POSIX specifies one test that does not reference paths, strings or integers: '-t <FD>'. This test checks if the specified file descriptor <FD> is a terminal.
By checking if 0 (STDIN) is a terminal, you can determine if input is coming from an interactive session or being piped from somewhere else.
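A minimal sketch; the function name 'stdin_kind' is hypothetical.

```shell
stdin_kind() {
    if [ -t 0 ]; then
        echo terminal
    else
        echo redirected
    fi
}

stdin_kind               # depends on how the script was started
stdin_kind < /dev/null   # prints "redirected"
```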
Use 'test' and string manipulation to check for substrings. Remove the largest part of the string that begins with the substring and compare the result to the original. If they are not equal, the shell found an instance of the substring to remove, so the substring is present in the original string. This is useful for one-liners.
This enables you to check if a string has multiple lines without invoking 'wc', though it is syntactically ugly.
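Both checks, sketched with hypothetical helper names. The 'nl' variable holds a literal newline so that the multi-line test stays on one line.

```shell
has_substring() {
    # ${1%%"$2"*} removes the largest suffix that begins with $2.
    [ "${1%%"$2"*}" != "$1" ]
}

nl='
'
is_multiline() {
    [ "${1%%"$nl"*}" != "$1" ]
}

has_substring "hello world" "o w" && echo "substring present"
is_multiline "one${nl}two" && echo "more than one line"
```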
A tweak checks if the substring occurs more than once. Remove the smallest part of the string that begins with the substring, and compare it with removing the largest part of the string that begins with the substring. They will be unequal if the substring appears 2 or more times.
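A sketch of that comparison; the function name 'occurs_twice' is hypothetical.

```shell
occurs_twice() {
    # '%' cuts at the last occurrence of $2, '%%' at the first.
    # The two results differ only when there are at least two occurrences.
    [ "${1%"$2"*}" != "${1%%"$2"*}" ]
}

occurs_twice banana an && echo "'an' appears more than once"
```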
uuencode
'uuencode' can encode data in RFC 4648-compliant Base64. However, its output contains control information, including a mandatory file name to which 'uudecode' will write decoded data. If you only want the Base64-encoded data without the control information, use 'sed' to remove the first line and the last line of the output. The output file name given to 'uuencode' can be anything--it is "x" in the below example.
Miscellany
Send the output of a command to multiple file descriptors simultaneously. This example prints data to both STDOUT and STDERR. Most shells, such as bash, write the PID of the background process to STDERR; use a subshell to discard this data.
To edit a file in-place without using a temporary file, invoke the editing command in a subshell, then print the result to the original file in the current shell. This forces the shell to evaluate the output of the editing command before the original file is truncated.
A one-liner using 'printf' is sufficient for relatively small text files.
This is subject to maximum argument length constraints, which can be an issue in embedded environments. A slightly more robust method is to use a here-document and write back to the file using 'cat'.
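Both variants, sketched on a scratch file (the path and the edits are assumptions). In the here-document form, the here-document is written before the '>' redirection so that it is expanded before the file is truncated. Command substitution strips trailing newlines, and both forms add exactly one back.

```shell
file=${TMPDIR:-/tmp}/inplace-demo.$$
printf 'foo one\nfoo two\n' > "$file"

# One-liner: the command substitution finishes before '>' truncates the file.
printf '%s\n' "$(sed 's/foo/bar/' "$file")" > "$file"
after_printf=$(cat "$file")

# Here-document variant: the data travels on standard input, not as an argument.
cat <<EOF > "$file"
$(tr '[:lower:]' '[:upper:]' < "$file")
EOF
after_heredoc=$(cat "$file")

rm -f "$file"
printf '%s\n' "$after_printf" "$after_heredoc"
```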
Both of these methods will misbehave when editing binary files, because the shell's parameter expansion strips NUL bytes from the output of the editing command. However, depending on the implementation, 'awk' might retain such bytes. The following 'awk' program collects the entire output of the editing command into an internal array before writing it to the original file.
All of these methods have two other downsides. The amount of data that can be edited at once is limited to the size of available memory. Further, when writing data back, the original file will be in an inconsistent state until the write is complete, because part of the file's contents exist only in memory. If a crash occurs, the data is lost. It is always best to use a temporary file.
POSIX specifies only 3 device files:
- /dev/console: A special file pointing to the system console.
- /dev/null: A special file that discards all data written to it. Reading returns EOF.
- /dev/tty: A special file pointing to the terminal that controls the current process group.
Specifically, "random", "urandom" and "zero" are not specified.
---
[Last updated: 2026-04-26]