shell: cleaner cancellable until

A recurring pattern in my workflow has been to reboot a machine and then wait for SSH access to be available. One issue is that the latency until completion falls squarely within "too short to do something else, too long to stare at my screen".

To avoid getting distracted and having another context switch, I wrote a short shell function:

waitfor () {
    port="${2:-22}"
    until ping -W1 -c1 "$1"; do :; done
    echo "Ping ok."
    until nc -w 1 -z "$1" "${port}"; do :; done
    echo "Port ${port} is open."
}

It can be used to chain the discovery of the open port with, e.g., a notification:

# Usage:
$ waitfor srv0 2223 && notify-send "Host 'srv0' is ready."

So far so good. The function is simple and has minimal dependencies: a POSIX shell, ping and netcat. Ping could be dropped if ICMP is blocked and TCP access is all that matters, but that is usually not the case for me.
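For the case where ICMP is blocked, a minimal sketch of the TCP-only variant mentioned above could look like this. The name `waitfor_tcp` is hypothetical; the netcat flags are the same ones used in `waitfor`, and the port defaults to 22 as an assumption.

```shell
# Hypothetical TCP-only variant: skips the ICMP check entirely.
# $1: host, $2: port (assumed to default to 22).
waitfor_tcp () {
    port="${2:-22}"
    # nc -z only probes the port; -w 1 caps each attempt at one second.
    until nc -w 1 -z "$1" "${port}"; do :; done
    echo "Port ${port} is open."
}
```

It is called the same way, e.g. `waitfor_tcp srv0 2223`, and has the same cancellation problem described next.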

The issue with this implementation, however, is a slightly messy behavior that grated on me every time I used it and then cancelled. When SIGINT is sent, it is received either by ping or netcat, or by the shell executing the until loop. In the former case, since the loop is written to expect failure and to stop only on success, it simply carries on; cue spamming C-c to stop.
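To make the failure mode concrete, here is a small sketch, assuming the interrupted command simply exits with a non-zero status. `fake_ping` is a hypothetical stand-in, not a real tool. The loop cannot tell an interruption from an ordinary failure, so it retries:

```shell
# Hypothetical stand-in for ping: fails twice (as if interrupted), then succeeds.
attempts=0
fake_ping () {
    attempts=$((attempts + 1))
    [ "${attempts}" -ge 3 ] && return 0
    return 1   # until only sees "non-zero", whatever the cause
}

until fake_ping; do :; done
echo "Loop ended after ${attempts} attempts."
# prints "Loop ended after 3 attempts."
```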

Not a big issue for sure, but I wanted to know how it could be fixed, preferably while keeping it entirely POSIX. In the end I came up with the following:

# Run $@ until it succeeds, but allow SIGINT to cancel.
_cancellable_until () {(
    cancel="false"
    # Retry in the background so this subshell stays free to handle SIGINT.
    until "$@" > /dev/null; do :; done &
    trap 'kill $!; cancel="true"' INT
    wait $!
    trap - INT
    if [ "${cancel}" = "true" ]; then
        return 1
    fi
)}

# ping $1 until it is up and port ${2:-22} is open.
waitfor () {
    port="${2:-22}"
    _cancellable_until ping -W1 -c1 "$1" || return 1
    echo "Ping ok."
    _cancellable_until nc -w 1 -z "$1" "${port}" || return 1
    echo "Port ${port} is open."
}

First thing to note: the loop body is empty, so this is of course not iso-functional with a real until loop. A separate body was of no use to me, and I preferred to avoid the complexity.

The actual loop is executed as a background job. Immediately after starting it, SIGINT is trapped in the current shell so that, when it is received, kill sends its default signal (SIGTERM) to the last background job and the cancellation is recorded. The current shell then waits for the job to complete: either the command passed as arguments ("$@") finally succeeds, or SIGINT is issued.

If SIGINT is received, the cancellation will make the function return a non-zero exit value, allowing the caller to be aware of it, and stop immediately.
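As an illustration of how a caller might act on that exit status, here is a small sketch. `waitfor_report` is a hypothetical wrapper, not part of the original function, and it assumes `waitfor` returns non-zero when cancelled, as described above.

```shell
# Hypothetical wrapper: report readiness or cancellation based on
# the exit status of waitfor (assumed to be defined as in this post).
waitfor_report () {
    if waitfor "$@"; then
        echo "Host '$1' is ready."
    else
        status=$?   # exit status of the failed waitfor call
        echo "Wait for '$1' cancelled (status ${status})." >&2
        return "${status}"
    fi
}
```

The same idea is what makes `waitfor srv0 2223 && notify-send ...` skip the notification after a cancellation.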

Additionally, the whole of _cancellable_until is executed in another subshell (started by the () within the function's {}), which avoids printing the job-control messages and reduces visual noise.

The intended behavior is achieved, but at what cost! All of this to have a clean failure loop cancellation. Maybe more advanced shells could have better primitives, but making waitfor portable and clean was more complex than it should have been.