Shutdown and reboot problems

I’m having issues where during a shutdown or reboot, my system hangs for a long time. My system info is as follows:

System:
  Kernel: 6.4.3-arch1-2 arch: x86_64 bits: 64 compiler: gcc v: 13.1.1
    parameters: initrd=\dd3288db44ed4f3c8eed9562f23181e1\6.4.3-arch1-2\initrd
    archisobasedir=arch archisolabel=EOS_202112 rw
    root=UUID=792fd0e5-0942-45cc-b07b-400213890a85 rootfstype=btrfs
    rootflags=subvol=@ quiet resume=/dev/sdb2 loglevel=3 resume_offset=1840384
    nowatchdog nvidia-drm.modeset=1 nomsi
    systemd.machine_id=dd3288db44ed4f3c8eed9562f23181e1
  Desktop: KDE Plasma v: 5.27.6 tk: Qt v: 5.15.10 info: latte-dock
    wm: kwin_wayland vt: 1 dm: SDDM Distro: EndeavourOS base: Arch Linux
Machine:
  Type: Laptop System: HP product: HP Pavilion Laptop 14-bf1xx
    v: Type1ProductConfigId serial: <superuser required> Chassis: type: 10
    serial: <superuser required>
  Mobo: HP model: 83CE v: 59.38 serial: <superuser required> UEFI: Insyde
    v: F.31 date: 10/30/2017
Battery:
  ID-1: BAT1 charge: 21.9 Wh (60.5%) condition: 36.2/37.1 Wh (97.6%)
    volts: 11.5 min: 11.6 model: Hewlett-Packard PABAS0241231 type: Li-ion
    serial: <filter> status: discharging
CPU:
  Info: model: Intel Core i7-8550U bits: 64 type: MT MCP arch: Coffee Lake
    gen: core 8 level: v3 note: check built: 2017 process: Intel 14nm family: 6
    model-id: 0x8E (142) stepping: 0xA (10) microcode: 0xF2
  Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache:
    L1: 256 KiB desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB
    L3: 8 MiB desc: 1x8 MiB
  Speed (MHz): avg: 1400 high: 2000 min/max: 400/4000 scaling:
    driver: intel_pstate governor: powersave cores: 1: 2000 2: 800 3: 2000
    4: 2000 5: 849 6: 2000 7: 800 8: 752 bogomips: 32012
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3
  Vulnerabilities:
  Type: itlb_multihit status: KVM: VMX unsupported
  Type: l1tf mitigation: PTE Inversion
  Type: mds mitigation: Clear CPU buffers; SMT vulnerable
  Type: meltdown mitigation: PTI
  Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
  Type: retbleed mitigation: IBRS
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: IBRS, IBPB: conditional, STIBP: conditional,
    RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds mitigation: Microcode
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: Intel UHD Graphics 620 vendor: Hewlett-Packard driver: i915
    v: kernel arch: Gen-9.5 process: Intel 14nm built: 2016-20 ports:
    active: eDP-1 empty: HDMI-A-1 bus-ID: 00:02.0 chip-ID: 8086:5917
    class-ID: 0300
  Device-2: NVIDIA GM108M [GeForce 940MX] vendor: Hewlett-Packard
    driver: nouveau v: kernel alternate: nvidia_drm,nvidia non-free: 535.xx+
    status: current (as of 2023-07) arch: Maxwell code: GMxxx
    process: TSMC 28nm built: 2014-19 pcie: gen: 1 speed: 2.5 GT/s lanes: 4
    link-max: gen: 3 speed: 8 GT/s bus-ID: 01:00.0 chip-ID: 10de:134d
    class-ID: 0302
  Device-3: Chicony HP Wide Vision HD Camera driver: uvcvideo type: USB
    rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-5:3 chip-ID: 04f2:b5d6
    class-ID: 0e02
  Display: wayland server: X.org v: 1.21.1.8 with: Xwayland v: 23.1.2
    compositor: kwin_wayland driver: X: loaded: modesetting unloaded: nvidia
    dri: iris,nouveau gpu: i915,nouveau display-ID: 0
  Monitor-1: eDP-1 res: 1920x1080 size: N/A modes: N/A
  API: OpenGL v: 4.6 Mesa 23.1.3 renderer: Mesa Intel UHD Graphics 620 (KBL
    GT2) direct-render: Yes
Audio:
  Device-1: Intel Sunrise Point-LP HD Audio vendor: Hewlett-Packard
    driver: snd_hda_intel v: kernel alternate: snd_soc_skl,snd_soc_avs
    bus-ID: 00:1f.3 chip-ID: 8086:9d71 class-ID: 0403
  API: ALSA v: k6.4.3-arch1-2 status: kernel-api
    tools: alsactl,alsamixer,amixer
  Server-1: JACK v: 1.9.22 status: off tools: N/A
  Server-2: PipeWire v: 0.3.74 status: off with: pipewire-media-session
    status: active tools: pw-cli
  Server-3: PulseAudio v: 16.1 status: active with: 1: pulseaudio-alsa
    type: plugin 2: pulseaudio-jack type: module tools: pacat,pactl,pavucontrol
Network:
  Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
    vendor: Hewlett-Packard driver: r8168 v: 8.051.02-NAPI modules: r8169 pcie:
    gen: 1 speed: 2.5 GT/s lanes: 1 port: 3000 bus-ID: 02:00.0
    chip-ID: 10ec:8168 class-ID: 0200
  IF: eno1 state: down mac: <filter>
  Device-2: Intel Wireless 7265 driver: iwlwifi v: kernel pcie: gen: 1
    speed: 2.5 GT/s lanes: 1 bus-ID: 03:00.0 chip-ID: 8086:095a class-ID: 0280
  IF: wlan0 state: up mac: <filter>
  IF-ID-1: docker0 state: down mac: <filter>
  IF-ID-2: tailscale0 state: unknown speed: -1 duplex: full mac: N/A
Bluetooth:
  Device-1: Intel Bluetooth wireless interface driver: btusb v: 0.8 type: USB
    rev: 2.0 speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-4:2 chip-ID: 8087:0a2a
    class-ID: e001
  Report: rfkill ID: hci0 rfk-id: 1 state: up address: see --recommends
Drives:
  Local Storage: total: 1.36 TiB used: 645.06 GiB (46.2%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/sda maj-min: 8:0 vendor: Seagate model: ST1000LM035-1RK172
    size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
    tech: HDD rpm: 5400 serial: <filter> fw-rev: RSM7 scheme: GPT
  ID-2: /dev/sdb maj-min: 8:16 vendor: Western Digital
    model: WDS500G2B0B-00YS70 size: 465.76 GiB block-size: physical: 512 B
    logical: 512 B speed: 6.0 Gb/s tech: SSD serial: <filter> fw-rev: 20WD
    scheme: GPT
Partition:
  ID-1: / raw-size: 464.76 GiB size: 464.76 GiB (100.00%)
    used: 318.74 GiB (68.6%) fs: btrfs dev: /dev/sdb2 maj-min: 8:18
  ID-2: /home raw-size: 464.76 GiB size: 464.76 GiB (100.00%)
    used: 318.74 GiB (68.6%) fs: btrfs dev: /dev/sdb2 maj-min: 8:18
  ID-3: /var/log raw-size: 464.76 GiB size: 464.76 GiB (100.00%)
    used: 318.74 GiB (68.6%) fs: btrfs dev: /dev/sdb2 maj-min: 8:18
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
  ID-1: swap-1 type: file size: 20 GiB used: 0 KiB (0.0%) priority: -2
    file: /swap/swapfile
Sensors:
  System Temperatures: cpu: 45.0 C pch: 37.5 C mobo: N/A
  Fan Speeds (RPM): N/A
Info:
  Processes: 301 Uptime: 10m wakeups: 1 Memory: total: 16 GiB
  available: 15.54 GiB used: 4.05 GiB (26.1%) Init: systemd v: 253
  default: graphical tool: systemctl Compilers: gcc: 13.1.1 clang: 15.0.7
  Packages: pm: pacman pkgs: 1897 libs: 386 tools: paru Shell: fish v: 3.6.1
  running-in: tmux: inxi: 3.3.28

Here are my logs.

As you’ll see, the shutdown starts at 12:15:23, some issues start occurring at 12:15:25 and the system basically stops doing anything after 12:15:27 after saying Stopped Home Area Manager. Nothing else happens until I shut it down manually using Magic SysRq.

I’m not sure what to do to fix this. It happens every time I shut down or reboot. This log is from a Wayland session, but it happens in X sessions as well. If anyone has any suggestions, please let me know.

I read in another topic that other users on KDE Plasma and Cinnamon were having trouble with the new Nvidia drivers on those DEs.
Jul 19 12:15:25 pavilion kwin_wayland[1484]: kwin_wayland_drm: Presentation failed! Permission denied
I see a lot of these errors. I would try downgrading the Nvidia driver to see whether the shutdown problems go away.

If you let it sit on a shutdown, time it. Is it about 90 seconds before it powers off or restarts? If so, there’s an errant process that isn’t shutting down that eventually gets killed.

It takes around twice that.

If it is hitting systemd’s 90-second stop timeout once or twice, those timeouts are usually visible on the console as the shutdown proceeds. Since you didn’t mention or post a picture of anything like that, I guess you don’t see them?

Maybe you need to shut down KDE Plasma and drop to multi-user mode before starting the shutdown?

sudo systemctl isolate multi-user.target

will kill the graphical session and switch to multi-user mode, where you should log in as root, or log in to your account and then run sudo -s. Then you can use

shutdown now

and hopefully you will see whether systemd is timing out on some units, and which ones, since those messages won’t be in the log because systemd-journald has already been shut down by then.
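
If nothing useful shows up on the console, another option is the approach from systemd’s shutdown debugging notes: boot with systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M printk.devkmsg=on on the kernel command line and drop a small script into /usr/lib/systemd/system-shutdown/, which systemd-shutdown runs right before the final power-off or reboot. A minimal sketch, where the function name shutdown_debug_script and the file name debug.sh are only illustrative:

```shell
# Prints a late-shutdown hook that saves the kernel log buffer to /shutdown-log.txt.
# The function name is illustrative; the hook body follows systemd's debugging notes.
shutdown_debug_script() {
    cat <<'EOF'
#!/bin/sh
# Root is read-only at this point, so make it writable briefly to save the log.
mount -o remount,rw /
dmesg > /shutdown-log.txt
mount -o remount,ro /
EOF
}

# Install (not run here):
#   shutdown_debug_script | sudo tee /usr/lib/systemd/system-shutdown/debug.sh
#   sudo chmod +x /usr/lib/systemd/system-shutdown/debug.sh
```

After the next slow shutdown, /shutdown-log.txt should show which units or processes systemd was waiting for.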

I think we’re on to something here. I got

[15342.335923] systemd-shutdown[1]: Waiting for process: 102370 (umount.nfs4)

I have an nfs4 mount, mounted via autofs. That’s probably the cause of this. I tried running

sudo umount -a -f -t nfs4

before shutting down and that seems to get rid of this delay.

Then, I added

[Service]
ExecStop=umount -a -f -t nfs4

by using

sudo systemctl edit autofs

but that doesn’t seem to do anything.

Edit: Running it manually isn’t a silver bullet either. (I guess autofs mounts the drive before I’m able to reboot.) Anyway, I get this after waiting almost 3 minutes.


Also, port 41641, mentioned in the UFW BLOCK messages, is supposed to be related to Tailscale, which is part of the autofs mount setup.

I would create a shutdown service to unmount the NFSv4 share. Since the service should execute before shutdown, it may speed the process up.

[Unit]
Description=Run Task as Shutdown
DefaultDependencies=no
Before=shutdown.target

[Service]
Type=oneshot
ExecStart=umount -a -f -t nfs4
TimeoutStartSec=0

[Install]
WantedBy=shutdown.target

Maybe this can give you a start.

After adding and enabling this service, it seemed to work on the first reboot. But when I rebooted again, it didn’t do anything. Well, it did something: the errors are slightly different now.

If it is Tailscale dependent, or has other dependencies, then you need to codify the dependencies in the service unit. Also, rather than hacking the unit files for standard units, you are better off making a drop-in directory in /etc/systemd/system/name.service.d and adding the snippet there; this causes significantly less hassle on upgrade.

Maybe you need to specify some ports to lock down that udp traffic to a predictable port and then allow it thru the firewall. Or maybe lock to ipv4 not v6 too, depends on what you are doing and how you want to do it.
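
As a sketch of the drop-in idea: systemd stops units in the reverse of their start order, so ordering autofs.service after the Tailscale daemon should make autofs stop while the tunnel is still up. This assumes the Tailscale unit is called tailscaled.service and that the NFS server is only reachable through it; the function name and the drop-in file name are made up for the example.

```shell
# Prints a drop-in for autofs.service that orders it after the Tailscale daemon.
# Assumption: the daemon's unit is named tailscaled.service on this system.
autofs_dropin() {
    cat <<'EOF'
[Unit]
After=tailscaled.service
EOF
}

# Install (not run here):
#   sudo mkdir -p /etc/systemd/system/autofs.service.d
#   autofs_dropin | sudo tee /etc/systemd/system/autofs.service.d/10-after-tailscale.conf
#   sudo systemctl daemon-reload
```

Afterwards, systemctl cat autofs.service should list the drop-in below the packaged unit.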

After some digging around, it looks like this is happening because the network goes down before autofs is shut down, so autofs is unable to unmount properly. The only method that consistently remedies this is running sudo systemctl stop autofs before shutdown.

Adding this to the systemd unit you provided does not work. Neither does a script inside /etc/NetworkManager/dispatcher.d/pre-down.d/. This is very weird, as I think none of these should be needed in the first place: autofs.service has After=...network.target network-online.target..., which should make systemd stop autofs before the network goes down.

If there were a way to specify a script to be run before anything else during the shutdown sequence, I think that should solve this issue.
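
One common systemd pattern for "run this early in shutdown" is a oneshot unit that does nothing at start, stays active via RemainAfterExit=yes, and does its real work in ExecStop=. Because stop order is the reverse of start order, its ExecStop= runs before any unit named in After= is stopped. A rough sketch that I have not tested on this setup; the unit name nfs-early-umount.service, the function name, and tailscaled.service are assumptions:

```shell
# Prints a unit whose ExecStop= runs before the units listed in After= are stopped.
# Assumptions: tailscaled.service is the Tailscale unit; the unit name is hypothetical.
nfs_early_umount_unit() {
    cat <<'EOF'
[Unit]
Description=Unmount NFS shares while the network is still up (sketch)
After=network-online.target tailscaled.service autofs.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/true
ExecStop=/usr/bin/umount -a -f -t nfs4
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target
EOF
}

# Install (not run here):
#   nfs_early_umount_unit | sudo tee /etc/systemd/system/nfs-early-umount.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now nfs-early-umount.service
```

Calling systemctl stop autofs from ExecStop= here would likely just block, since that stop job is itself ordered to wait until this unit has finished stopping.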

I have never tried it, but have you tried chaining the commands? Something like
systemctl stop autofs && umount -a -f -t nfs4

If the above fails to run in the service, you may just want to create a bash script containing the above command and place the script in either /usr/bin or /usr/local/bin.

None of these work. Only running sudo systemctl stop autofs manually stops the delay.

It sounds like systemd doesn’t have the correct dependencies for shutdown. I’ve never had to look into shutdown dependencies before, but I’m sure systemd can do it; it is just a matter of how much effort it takes to understand and adjust the configuration to make it work. Seems like you aren’t the only one with this type of requirement.

So I figured out a rather hacky way to solve this. I created a script containing just systemctl stop autofs and added it to KDE as a logout script. Then, to avoid entering a password every time at logout, I created the following file.

/etc/polkit-1/rules.d/autofs-service.rules
--------------------------------------------
polkit.addRule(function(action, subject) {
    if (action.id == "org.freedesktop.systemd1.manage-units" &&
        subject.isInGroup("usergroup")) {
        if (action.lookup("unit") == "autofs.service") {
            var verb = action.lookup("verb");
            if (verb == "start" || verb == "stop" || verb == "restart") {
                return polkit.Result.YES;
            }
        }
    }
});

It works. But if possible, I’d like a nicer solution. But that’d probably entail figuring out a way so that the services are stopped in proper order.

Not an alternative I would have even thought of trying. But … if it works for you then it works. You probably don’t log in to your desktop/laptop remotely to update and reboot it, which I do often enough across multiple hosts that I would be looking to understand systemd well enough for a dependency solution.

It’s a laptop. I almost never ssh into it, so that’s not really an issue. I hope someone figures out a systemd way of doing it, though.

I’m currently trying to convince systemd to unmount /var/log after shutting down systemd-journald, so that might inform things.
