Detect Keyboard Keys and Mouse Clicks in Bash Scripts




Despite the richness of bash’s options for interactive programming, there is still no easy way for the shell to deal with keys, such as function keys, that generate multiple characters. For that, this tutorial presents the key-funcs library of functions. The second major section of this tutorial describes how to use the mouse in shell scripts and provides a demonstration program.

Between those sections, we’ll deal with checking user input for validity and the history library. Most people use bash’s history library only at the command line. We’ll use it in scripts, and this tutorial will show how that is done, by using the history command in a rudimentary script for editing a multifield record.

Single-Key Entry

When writing an interactive script, you might want a single key to be pressed without requiring the user to press Enter. The portable way to do that is to use stty and dd:

stty -echo -icanon min 1
_KEY=$(dd count=1 bs=1 2>/dev/null)
stty echo icanon

Using three external commands every time you need a key press is overkill. When you need to use a portable method, you can usually make the first call to stty once at the beginning of the script and the second at the end, often in an EXIT trap:

trap 'stty echo icanon' EXIT

Bash, on the other hand, doesn’t need to call any external commands. It may still be a good idea to use stty to turn off echoing at the beginning and back on before exiting. This will prevent characters from showing up on the screen when the script is not waiting for input.
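A minimal sketch of that setup follows. The function name quiet_tty and the variable _saved_tty are hypothetical, not part of key-funcs. Saving the settings with stty -g and restoring them in the EXIT trap puts the terminal back in whatever state it started in, rather than assuming that echo and icanon were on:

```bash
# quiet_tty is a hypothetical helper, not part of key-funcs.
quiet_tty() {
    # Do nothing when standard input is not a terminal (e.g., in a pipeline)
    [ -t 0 ] || return 1
    _saved_tty=$(stty -g)             # record the current settings
    trap 'stty "$_saved_tty"' EXIT    # restore them however the script exits
    stty -echo                        # keep stray keystrokes off the screen
}

quiet_tty || echo "stdin is not a terminal; tty settings left alone" >&2
```

The guard on -t 0 keeps the script usable when its input is redirected, where stty would otherwise fail.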

Function Library, key-funcs

The functions in this section comprise the key-funcs library. It begins with two variable definitions, shown here in Listing 15-1.

To get a single keystroke with bash, you can use the function in Listing 15-2.

First, the field separator is set to an empty string so that read doesn’t ignore a leading space (it’s a valid keystroke, so you want it); the -r option disables backslash escaping, -s turns off echoing of keystrokes, and -n1 tells bash to read a single character only.

The -d '' option tells read not to regard a newline (or any other character) as the end of input; this allows a newline to be stored in a variable. Because read is told to stop after the first key is received (-n1), it doesn’t read forever.

The last argument uses ${@:-_KEY} to add options or a variable name to the list of arguments. You can see its use in the _keys function in Listing 15-3. (Note that if you use an option without also including a variable name, the input will be stored in $REPLY.)
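The listing itself is not reproduced here. A minimal sketch that matches the description above, and is not necessarily the author’s exact code, might look like this. The here-strings stand in for the keyboard so the example can run non-interactively:

```bash
_key() {
    # IFS=   keep a leading space       -r  keep backslashes as typed
    # -s     do not echo the keystroke  -n1 stop after one character
    # -d ''  do not treat a newline as the end of input
    IFS= read -r -s -n1 -d '' "${@:-_KEY}"
}

# Non-interactive check: a here-string supplies the "keystrokes"
_key <<< " x"          # stores a single space in $_KEY
_key answer <<< "y"    # stores "y" in $answer instead
printf '[%s] [%s]\n' "$_KEY" "$answer"
```

With no arguments the key lands in $_KEY; passing a name (here the hypothetical variable answer) stores it there instead.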

Note  For this to work on earlier versions of bash or on Mac OS X, add the variable name to the read command, such as IFS= read -r -s -n1 -d '' _KEY "${1:-_KEY}". Otherwise, you have to look in $REPLY for the key press that was read.

The _key function can be used in a simple menu, as shown in Listing 15-3.
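As an illustration only, a menu built on _key could be sketched as follows. The function name menu and its entries are placeholders, and _key is repeated so the example is self-contained:

```bash
_key() { IFS= read -r -s -n1 -d '' "${@:-_KEY}"; }

# menu is a hypothetical demo; the entries are placeholders
menu() {
    printf '%s\n' "1. First item" "2. Second item" "q. Quit"
    printf 'Select: '
    _key                      # one key press, no Enter needed
    printf '\n'
    case $_KEY in
        1) echo "You chose 1" ;;
        2) echo "You chose 2" ;;
        q) return 1 ;;
        *) echo "Invalid selection: $_KEY" >&2 ;;
    esac
}

# Interactive use: keep showing the menu until q is pressed
# while menu; do :; done
```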

Although _key is a useful function by itself, it has its limitations (Listing 15-4). It can store a space, a newline, a control code, or any other single character, but what it doesn’t do is handle keys that return more than one character: function keys, cursor keys, and a few others.

These special keys return ESC (0x1B, which is kept in a variable $ESC) followed by one or more characters. The number of characters varies according to the key (and the terminal emulation), so you cannot ask for a specific number of keys. Instead, you have to loop until one of the terminating characters is read. This is where it helps to use bash’s built-in read command rather than the external dd.

The while : loop calls _key with the argument -t1, which tells read to time out after one second, and the name of the variable in which to store the keystroke. The loop continues until a key in $ESC_END is pressed or read times out, leaving $__KX empty.
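The loop described above can be sketched as follows. This is an assumption-laden reconstruction, not the original listing: the result variable _KEYS and the exact contents of ESC_END are guesses, and the letter O is handled separately because xterm sends ESC O P for F1.

```bash
_key() { IFS= read -r -s -n1 -d '' "${@:-_KEY}"; }

_keys() {
    # Assumed set of characters that can end a key sequence
    local ESC_END='[[:alpha:]~^$@]'
    _KEYS=                      # assumed name for the collected characters
    while :
    do
        __KX=
        _key -t1 __KX           # give up after one second of silence
        _KEYS=$_KEYS$__KX
        case $__KX in
            O) ;;                       # ESC O P style keys: keep reading
            "" | $ESC_END) break ;;     # timed out, or sequence complete
        esac
    done
}

# Simulate the characters that follow ESC when F1 is pressed in rxvt
_keys <<< "[11~"
printf '%s\n' "$_KEYS"
```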

The timeout is only a partially satisfactory method of detecting the escape key by itself. This is a case where dd works better than read, because its timeout (set with stty’s time parameter) can be specified in increments of one-tenth of a second.

To test the functions, use _key to get a single character; if that character is ESC, call _keys to read the rest of the sequence, if any. The following snippet assumes that _key and _keys are already defined and pipes each keystroke through hexdump -C to show its contents:

while :
do
  _key
  case $_KEY in
      $ESC) _keys
            _KEY=$ESC$_KEYS   ## assumes _keys stores the rest in $_KEYS
            ;;
  esac
  printf "%s" "$_KEY" | hexdump -C | {
               read a b
               printf "   %s\n" "$b"
             }
  case "$_KEY" in q) break ;; esac
done

Unlike the output sequences, which work everywhere, there is no homogeneity among key sequences produced by various terminal emulators. Here is a sample run, in an rxvt terminal window, of pressing F1, F12, up arrow, Home, and q to quit:

   1b 5b 31 31 7e                |.[11~|
   1b 5b 32 34 7e                |.[24~|
   1b 5b 41                      |.[A|
   1b 5b 35 7e                   |.[5~|
   71                            |q|

Here are the same keystrokes in an xterm window:

   1b 4f 50                      |.OP|
   1b 5b 32 34 7e                |.[24~|
   1b 5b 41                      |.[A|
   1b 5b 48                      |.[H|
   71                            |q|

Finally, here they are as produced by a Linux virtual console:

   1b 5b 5b 41                   |.[[A|
   1b 5b 32 34 7e                |.[24~|
   1b 5b 41                      |.[A|
   1b 5b 31 7e                   |.[1~|
   71                            |q|

All the terminals tested fit into one of these three groups, at least for unmodified keys.

The codes stored in $_KEY can be interpreted either directly or in a separate function. It is better to keep the interpretation in a function that can be replaced for use with different terminal types. For example, if you are using a Wyse60 terminal, the command source wy60-keys would set the replacement keys.

Listing 15-5 shows a function, _esc2key, that works for the various terminals on a Linux box, as well as in putty in Windows. It converts the character sequence into a string describing the key, for example, UP, DOWN, F1, and so on.
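The following is a deliberately partial sketch of such a conversion function, not the original listing. It maps only sequences that appear in the sample runs above plus the standard cursor keys; the CSI variable, the calling convention (sequence passed as $1) and the result variable _ESC2KEY are assumptions:

```bash
ESC=$'\e'
CSI=$ESC[        # assumed helper: ESC followed by a left bracket

_esc2key() {
    case $1 in
        "${CSI}A") _ESC2KEY=UP ;;
        "${CSI}B") _ESC2KEY=DOWN ;;
        "${CSI}C") _ESC2KEY=RIGHT ;;
        "${CSI}D") _ESC2KEY=LEFT ;;
        "${CSI}H" | "${CSI}1~") _ESC2KEY=HOME ;;              # xterm, Linux console
        "${CSI}11~" | "${ESC}OP" | "${CSI}[A") _ESC2KEY=F1 ;; # rxvt, xterm, console
        "${CSI}24~") _ESC2KEY=F12 ;;
        *) _ESC2KEY=$1 ;;        # unknown sequence: pass it through unchanged
    esac
}

_esc2key "${CSI}A"
printf '%s\n' "$_ESC2KEY"
```

Keeping the table in one function means a different terminal type only needs a replacement for this function, as suggested above.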

You can wrap the _key and _esc2key functions into another function, called get_key (Listing 15-6), which returns either the single character pressed or, in the case of multicharacter keys, the name of the key.

In bash-4.x, you can use a simpler function to read keystrokes. The get_key function in Listing 15-7 takes advantage of the capability of read’s -t option to accept fractional times. It reads the first character then waits for one-ten-thousandth of a second for another character. If a multicharacter key was pressed, there will be one to read within that time. If not, it will fall through the remaining read statements before another key can be pressed.
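A rough sketch of that approach is shown here; it is not the original listing. The KEY_DELAY override and the cut-down _esc2key stand-in are assumptions added so the example is self-contained, and bash 4 or later is required for the fractional -t value:

```bash
ESC=$'\e'

_esc2key() {     # cut-down stand-in for the real conversion function
    case $1 in
        "$ESC[A") _ESC2KEY=UP ;;
        "$ESC[B") _ESC2KEY=DOWN ;;
        *) _ESC2KEY=$1 ;;
    esac
}

get_key() {
    # KEY_DELAY is a hypothetical override; the default is 1/10000 second
    local c1 c2 c3 c4 c5 delay=${KEY_DELAY:-.0001}
    IFS= read -r -s -n1 -d '' c1 || return
    # The rest of a multicharacter key is already waiting to be read;
    # for a single-character key, these reads simply time out
    IFS= read -r -s -n1 -d '' -t "$delay" c2
    IFS= read -r -s -n1 -d '' -t "$delay" c3
    IFS= read -r -s -n1 -d '' -t "$delay" c4
    IFS= read -r -s -n1 -d '' -t "$delay" c5
    case $c1 in
        "$ESC") _esc2key "$c1$c2$c3$c4$c5"
                _KEY=$_ESC2KEY ;;
        *) _KEY=$c1 ;;
    esac
}

# Interactive use:
# get_key; printf 'You pressed: %s\n' "$_KEY"
```

Five reads cover the longest sequences in the sample runs (ESC plus four characters); longer sequences would need more reads or a loop.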

Whenever you want to use cursor or function keys in a script, or for any single-key entry, you can source key-funcs and call get_key to capture key presses. Listing 15-8 is a simple demonstration of using the library.

The script in Listing 15-9 prints a block of text on the screen. It can be moved around the screen with the cursor keys, and the colors can be changed with the function keys. The odd-numbered function keys change the foreground color; the even-numbered keys change the background.

History in Scripts

In the readline functions, history -s was used to place a default value into the history list. In those examples, only one value was stored, but it is possible to store more than one value in history or even to use an entire file. Before adding to the history, you should (in most cases) clear it:

history -c

By using more than one history -s command, you can store multiple values:

history -s Genesis
history -s Exodus

With the -r option, you can read an entire file into history. This snippet puts the names of the first five books of the Bible into a file and reads that into the history:

cut -d: -f1 "$kjv" | uniq | head -5 > pentateuch
history -r pentateuch

The readline functions use history if the bash version is less than 4, but read’s -i option with version 4 (or greater). There are times when it might be more appropriate to use history rather than -i even when the latter is available. A case in point is when the new input is likely to be very different from the default but there is a chance that it might not be.

For history to be available, you must use the -e option with read. This also gives you access to other key bindings defined in your .inputrc file.
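Putting these pieces together, a script might preload the history and then prompt with read -e, roughly as follows. The helper name load_defaults and the variable book are hypothetical:

```bash
# load_defaults is a hypothetical helper name
load_defaults() {
    local value
    history -c                  # start from an empty history list
    for value in "$@"
    do
        history -s "$value"     # the last value stored is the first recalled
    done
}

load_defaults Genesis Exodus Leviticus

# With -e, the up-arrow key steps back through the stored values.
# Prompt only when there is a terminal to read from.
if [ -t 0 ]; then
    read -ep "Book: " book
    printf 'You chose: %s\n' "$book"
fi
```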

Sanity Checking

Sanity checking is testing input for the correct type and a reasonable value. If a user inputs Jane for her age, it’s obviously wrong: the data is of the wrong type. If she enters 666, it’s the correct type but almost certainly an incorrect value. The incorrect type can easily be detected with the valint script or function. You can use the rangecheck function to check for a reasonable value.
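Neither function is listed here, so the versions below are stand-ins: the names come from the text, but the bodies, the argument order of rangecheck, and the check_age wrapper with its 0 to 120 limits are assumptions made for illustration.

```bash
valint() {       # succeed if $1 is an integer, with an optional leading minus
    case ${1#-} in
        "" | *[!0-9]*) return 1 ;;
    esac
}

rangecheck() {   # USAGE (assumed): rangecheck VALUE LOW HIGH
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

check_age() {    # hypothetical wrapper tying the two checks together
    if ! valint "$1"; then
        echo "wrong type"
    elif ! rangecheck "$1" 0 120; then
        echo "unreasonable value"
    else
        echo "ok"
    fi
}

check_age Jane
check_age 666
check_age 42
```

Checking the type first matters: the numeric tests in rangecheck would produce an error message if handed a non-integer such as Jane.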

Sometimes the error is more problematic, or even malicious. Suppose a script asks for a variable name and then uses eval to assign a value to it:

read -ep "Enter variable name: " var
read -ep "Enter value: " val
eval "$var=\$val"

Now, suppose the entry goes like this:

Enter variable name: rm -rf *;name
Enter value: whatever

The command that eval will execute is as follows:

rm -rf *;name=whatever

Poof! All your files and subdirectories are gone from the current directory. It could have been prevented by checking the value of var with the validname function:

validname "$var" && eval "$var=\$val" || echo Bad variable name >&2
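The validname function is not shown in this tutorial. One possible implementation, offered as a sketch, accepts only names that start with a letter or underscore and contain nothing but letters, digits, and underscores:

```bash
validname() {
    case $1 in
        # Reject: empty, bad first character, or any bad character at all
        "" | [!a-zA-Z_]* | *[!a-zA-Z0-9_]*) return 1 ;;
    esac
}

var='rm -rf *;name'
val=whatever
if validname "$var"; then
    eval "$var=\$val"
else
    echo "Bad variable name: $var" >&2   # the dangerous string is never evaluated
fi
```

Because the check runs before eval, the malicious entry from the example is rejected without any part of it being executed.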

When editing a database, checking that there are no invalid characters is an important step. For example, in editing /etc/passwd (or a table from which it is created), you must make sure that there are no colons in any of the fields. Figure 15-1 adds some humor to this discussion.

Form Entry

The script in Listing 15-10 is a demonstration of handling user input with a menu and history. It uses the key-funcs library to get the user’s selection and to edit password fields. It has a hard-coded record and doesn’t read the /etc/passwd file. It checks for a colon in an entry and prints an error message if one is found.
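The colon check can be as small as a single case statement. In this sketch the function name checkfield is hypothetical:

```bash
# checkfield is a hypothetical name for the colon test
checkfield() {
    case $1 in
        *:*) printf 'Field may not contain a colon: %s\n' "$1" >&2
             return 1 ;;
    esac
}

checkfield "John Doe" && echo "accepted"
checkfield "/bin/bash:/bin/sh" || echo "rejected"
```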

The record is read into an array from a here document. A single printf statement prints the menu, using a format string with seven blanks and the entire array as its arguments.
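As a sketch of that technique (not the original listing), the following reads a made-up record into an array and prints all seven fields with one printf call. The record contents, the field labels, and the function name show_menu are invented for the example:

```bash
# A made-up record, one field per line, in /etc/passwd field order
record=()
while IFS= read -r field
do
    record+=( "$field" )
done <<EOF
jdoe
x
1000
1000
John Doe
/home/jdoe
/bin/bash
EOF

show_menu() {    # one printf; each %s consumes the next array element
    printf '  1. Login name:  %s
  2. Password:    %s
  3. User ID:     %s
  4. Group ID:    %s
  5. Full name:   %s
  6. Home dir:    %s
  7. Shell:       %s
' "${record[@]}"
}

show_menu
```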

Reading the Mouse

On the Linux console_codes man page, there is a section labeled “mouse tracking.” Interesting! It reads: “The mouse tracking facility is intended to return xterm-compatible mouse status reports.” Does that mean the mouse can be used in shell scripts?

According to that man page, mouse tracking is available in two modes: X10 compatibility mode, which sends an escape sequence on button press, and normal tracking mode, which sends an escape sequence on both button press and release. Both modes also send modifier-key information.

To test this, printf "\e[?9h" was first entered at a terminal window. This is the escape sequence that sets the “X10 Mouse Reporting (default off): Set reporting mode to 1 (or reset to 0)”. If you press the mouse button, the computer will beep and print “FB” on the screen. Repeating the mouse click at various points on the screen will net more beeps and “&% -( 5. =2 H7 T= ]C fG rJ }M.”
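For use in a script, the sequences can be wrapped in small functions. The function names here are hypothetical; \e[?9h and \e[?9l switch X10 reporting on and off, and \e[?1000h and \e[?1000l are the corresponding sequences that the console_codes man page gives for normal (X11) tracking:

```bash
mouse_x10_on()     { printf '\e[?9h'; }      # report button presses only
mouse_x10_off()    { printf '\e[?9l'; }
mouse_normal_on()  { printf '\e[?1000h'; }   # report presses and releases
mouse_normal_off() { printf '\e[?1000l'; }

# When writing to a terminal, make sure reporting is switched off on exit
if [ -t 1 ]; then
    trap 'mouse_x10_off; mouse_normal_off' EXIT
fi
```

Turning reporting off in an EXIT trap avoids leaving the terminal printing mouse codes at the next shell prompt.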

A mouse click sends six characters: ESC[Mbxy. The first three characters are common to all mouse events, the fourth identifies the button pressed, and the final two are the x and y locations of the mouse. To confirm this, save the input in a variable and pipe it to hexdump:

$ printf "\e[?9h"
$ read x
^[[M!MO            ## press mouse button and enter
$ printf "%s" "$x" | hexdump -C
00000000  1b 5b 4d 21 4d 4f                       |.[M!MO|

The first three appear as expected, but what are the final three? According to the man page, the lower two bits of the button character tell which button has been pressed; the upper bits identify the active modifiers. The x and y coordinates are the ASCII values to which 32 has been added to take them out of the range of control characters. The ! is 1, " is 2, and so on.

That gives us a 1 for the mouse button (0x21 = 33; the lower two bits of 33 are 1), which means button 2, since 0 to 2 are buttons 1, 2, and 3, respectively, and 3 is release. The x and y coordinates are 45 (0x4d = 77; 77 - 32 = 45) and 47 (0x4f = 79; 79 - 32 = 47).

Surprisingly, although this information about mouse tracking comes from the Linux console_codes man page, these escape codes do not work in all Linux consoles. They work in xterm, rxvt, and gnome-terminal on Linux and FreeBSD. They can also be used on FreeBSD and NetBSD via ssh from a Linux rxvt terminal window. They do not work in a KDE konsole window.

You now know that mouse reporting works (in most xterm windows), and you can get information from a mouse click on the standard input. That leaves two questions: How do you read the information into a variable (without having to press Return), and how can the button and xy information be decoded in a shell script?

With bash, use the read command’s -n option with an argument to specify the number of characters. To read the mouse, six characters are needed:

read -n6 x

This is not adequate for a real script (not all input will be mouse clicks, and you will want to get single keystrokes), but it suffices to demonstrate the concept.

The next step is to decode the input. For the purposes of this demonstration, you can assume that the six characters do indeed represent a mouse click and that the first three characters are ESC, [, and M. Only the last three are of interest here, so extract them into three separate variables using POSIX parameter expansion:

m1=${x#???}    ## Remove the first 3 characters
m2=${x#????}   ## Remove the first 4 characters
m3=${x#?????}  ## Remove the first 5 characters

Then convert the first character of each variable to its ASCII value. This uses a POSIX printf extension: “If the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote.”

printf -v mb "%d" "'$m1"
printf -v mx "%d" "'$m2"
printf -v my "%d" "'$m3"

Finally, interpret the ASCII values. For the mouse button, do a bitwise AND with 3 and add 1 so that the buttons are numbered from 1. For the x and y coordinates, subtract 32:

## Values > 127 may come back as negative (signed) numbers; add 256 to wrap them
[ $mx -lt 0 ] && mx=$(( 256 + $mx ))
[ $my -lt 0 ] && my=$(( 256 + $my ))

BUTTON=$(( ($mb & 3) + 1 ))
MOUSEX=$(( $mx - 32 ))
MOUSEY=$(( $my - 32 ))
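The extraction, conversion, and arithmetic steps can be gathered into one function and checked against the click captured earlier (ESC[M!MO). The function name decode_mouse and the guard that rejects anything that is not a six-character mouse report are additions for this sketch:

```bash
decode_mouse() {    # hypothetical wrapper; $1 is the six characters ESC [ M b x y
    local x=$1 m1 m2 m3 mb mx my
    case $x in
        $'\e[M'???) ;;          # looks like a mouse report
        *) return 1 ;;          # anything else is rejected
    esac
    m1=${x#???}                 # button character onward
    m2=${x#????}                # x character onward
    m3=${x#?????}               # y character
    printf -v mb '%d' "'$m1"    # numeric value of the first character
    printf -v mx '%d' "'$m2"
    printf -v my '%d' "'$m3"
    BUTTON=$(( (mb & 3) + 1 ))  # lower two bits, numbered from 1
    MOUSEX=$(( mx - 32 ))
    MOUSEY=$(( my - 32 ))
}

decode_mouse $'\e[M!MO' &&
    printf 'button %d at column %d, row %d\n' "$BUTTON" "$MOUSEX" "$MOUSEY"
```

For the sample click this reports button 2 at column 45, row 47, matching the hand calculation above.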

Putting it all together, the script in Listing 15-11 prints the mouse’s coordinates whenever you press a mouse button.

There are two sensitive areas on the top row. Clicking the left one toggles the mouse reporting mode between reporting only a button press and reporting the release as well. Clicking the right one exits the script.


Bash has a rich set of options for interactive programming. In this tutorial, you learned how to leverage that to read any keystroke, including function keys and others that return more than a single character.
