I do this with shell scripting and xdotool. For example, I have bound Meta+V to a shell script which runs my yt-dlp wrapper script by finding a terminal window and sending the following keystrokes to it:
- Ctrl+A: move cursor to the beginning of the line
- Ctrl+K: delete everything from the cursor to the end of the line
- Space, Y, T, P: type the name of the script, ytp, after a space so that the command does not get saved in the shell history (in bash, this relies on HISTCONTROL containing ignorespace or ignoreboth)
- Return: run the script
This is what the script looks like:
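#!/bin/bash
# Only proceed if the clipboard holds something that looks like a URL.
CLIPBOARD="$(xclip -o -sel c)"
[[ $CLIPBOARD = http* ]] || exit 1
TERMINAL_NAME="Konsole"
TERMINAL_EXEC="/usr/bin/konsole"
CAPS=0
# Collect the IDs of all windows whose class matches the terminal.
readarray -t WINDOW_LIST < <(xdotool search --all --class "$TERMINAL_NAME")
for WINDOW in "${WINDOW_LIST[@]}"; do
    # Empty output is taken to mean the window was activated without errors.
    OUTPUT=$(xdotool windowactivate "$WINDOW" 2>&1)
    if [[ -z $OUTPUT ]]; then
        # Caps Lock would turn "ytp" into "YTP", so switch it off temporarily.
        if [[ $(xset -q) =~ (Caps Lock: *on) ]]; then
            CAPS=1
            xdotool key Caps_Lock
        fi
        xdotool key --clearmodifiers --window "$WINDOW" ctrl+a ctrl+k space y t p Return
        # Restore Caps Lock if it was on before.
        if (( CAPS )); then
            xdotool key Caps_Lock
        fi
        # unstick keys if stuck
        sleep 0.5 && xdotool keyup Meta_L Meta_R Alt_L Alt_R Super_L Super_R
        exit 0
    fi
done
# Otherwise, no usable terminal window was found: start a new one running ytp.
"$TERMINAL_EXEC" -e "$HOME/bin/ytp"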
The script also checks if the clipboard contains a URL (actually, a string starting with “http”), because if it doesn’t then running my ytp script is pointless.
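The prefix test could be made slightly stricter. What follows is only a minimal sketch, assuming that only http:// and https:// links matter; the function name looks_like_url is made up for illustration and is not part of the script above:

```shell
#!/bin/bash
# Hypothetical helper (not from the original script):
# succeed only if the argument starts with http:// or https://.
looks_like_url() {
    case "$1" in
        http://*|https://*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With that in place, a line such as `looks_like_url "$CLIPBOARD" || exit 1` could stand in for the `http*` pattern match. It still does not validate the URL; it just rejects strings like “httpfoo”.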
This is how I watch all online videos; my browser does not play multimedia.
I use the exact same approach to program keyboard macros for use in, say, a text editor, or wherever I need a sequence of keys typed exactly.
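To give a rough idea of what such a macro can look like, here is a minimal sketch. The function names send_macro and duplicate_line and the XDOTOOL variable are hypothetical, and the key sequence assumes an editor with conventional Home/End and Ctrl+C/Ctrl+V bindings:

```shell
#!/bin/bash
# Hypothetical macro helpers; none of these names come from the script above.
# XDOTOOL can be pointed at another command (e.g. echo) for a dry run.
XDOTOOL="${XDOTOOL:-xdotool}"

# Send each argument as a key press to the currently focused window.
send_macro() {
    "$XDOTOOL" key --clearmodifiers "$@"
}

# Example macro: duplicate the current line.
# Home, Shift+End selects the line; Ctrl+C copies it;
# End, Return opens a new line below; Ctrl+V pastes the copy.
duplicate_line() {
    send_macro Home shift+End ctrl+c End Return ctrl+v
}
```

A script that sources these functions and calls duplicate_line can then be bound to a key in the same way as the Meta+V script. Setting XDOTOOL=echo beforehand prints the key sequence instead of sending it, which is handy for checking a macro before letting it loose on an editor.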
Here is another example, a video of an xdotool-based shell script automatically playing Snek:
Of course, such UI automation is one of the wonderful things about using X11. If you use Wayland, you’re out of luck (there are some similar utilities, but they all suck); consider using X11.