[Resolved] Kernel 5.13.x: service failed to start

About a week ago I upgraded my zen kernel from 5.12.15 to 5.13.1 on an Arch system (hope it is alright to post about it here) and noticed that a service won’t start on this system. I posted about it on the Arch forum: https://bbs.archlinux.org/viewtopic.php?id=267977

Today I installed linux-5.13.4 in the hope that the issue would have been resolved but unfortunately the issue still seems to be there. I don’t know if this is Arch-specific or has to do with the kernel upstream.

I have a Fedora install on the same machine and I’m waiting for the kernel upgrade to 5.13 to see whether the issue is present there as well.

I wonder if anyone else has stumbled upon this as well.

inxi -Fxxxz

System: Kernel: 5.13.4-arch1-1 x86_64 bits: 64 compiler: gcc v: 11.1.0 Desktop: GNOME 40.3 tk: GTK 3.24.30 wm: gnome-shell
dm: GDM 40.0 Distro: Arch Linux
Machine: Type: Laptop System: Dell product: XPS 13 9380 v: N/A serial: Chassis: type: 10 serial:
Mobo: Dell model: 0KTDY6 v: A00 serial: UEFI: Dell v: 1.13.1 date: 03/25/2021
Battery: ID-1: BAT0 charge: 42.7 Wh (100.0%) condition: 42.7/52.0 Wh (82.2%) volts: 8.6 min: 7.6
model: LGC-LGC6.73 DELL H754V8C type: Li-ion serial: status: Full
CPU: Info: Quad Core model: Intel Core i7-8565U bits: 64 type: MT MCP arch: Kaby Lake note: check rev: B cache:
L2: 8 MiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 32012
Speed: 699 MHz min/max: 400/4600 MHz Core speeds (MHz): 1: 699 2: 1009 3: 721 4: 1904 5: 1453 6: 867 7: 700 8: 700
Graphics: Device-1: Intel WhiskeyLake-U GT2 [UHD Graphics 620] vendor: Dell driver: i915 v: kernel bus-ID: 00:02.0
chip-ID: 8086:3ea0 class-ID: 0300
Device-2: Microdia Integrated_Webcam_HD type: USB driver: uvcvideo bus-ID: 1-5:2 chip-ID: 0c45:6723 class-ID: 0e02
Display: wayland server: X.Org 1.21.1.2 compositor: gnome-shell driver: loaded: i915
note: n/a (using device driver) - try sudo/root resolution: 1920x1080~60Hz s-dpi: 96
OpenGL: renderer: Mesa Intel UHD Graphics 620 (WHL GT2) v: 4.6 Mesa 21.1.5 direct render: Yes
Audio: Device-1: Intel Cannon Point-LP High Definition Audio vendor: Dell driver: snd_hda_intel v: kernel bus-ID: 00:1f.3
chip-ID: 8086:9dc8 class-ID: 0403
Sound Server-1: ALSA v: k5.13.4-arch1-1 running: yes
Sound Server-2: JACK v: 1.9.19 running: no
Sound Server-3: PulseAudio v: 14.99.2-1-g36fcf running: no
Sound Server-4: PipeWire v: 0.3.32 running: yes
Network: Device-1: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter vendor: Rivet Networks Killer 1435 Wireless-AC
driver: ath10k_pci v: kernel port: efa0 bus-ID: 02:00.0 chip-ID: 168c:003e class-ID: 0280
IF: wlp2s0 state: up mac:
Bluetooth: Device-1: Foxconn / Hon Hai type: USB driver: btusb v: 0.8 bus-ID: 1-7:3 chip-ID: 0489:e0a2 class-ID: e001
Report: rfkill ID: hci0 rfk-id: 1 state: up address: see --recommends
Drives: Local Storage: total: 465.76 GiB used: 58.43 GiB (12.5%)
ID-1: /dev/nvme0n1 vendor: Samsung model: SSD 970 EVO 500GB size: 465.76 GiB speed: 31.6 Gb/s lanes: 4
rotation: SSD serial: rev: 2B2QEXE7 scheme: GPT
Partition: ID-1: / size: 50 GiB used: 21.51 GiB (43.0%) fs: btrfs dev: /dev/nvme0n1p5
ID-2: /boot/efi size: 126 MiB used: 550 KiB (0.4%) fs: vfat dev: /dev/nvme0n1p4
ID-3: /home size: 50 GiB used: 21.51 GiB (43.0%) fs: btrfs dev: /dev/nvme0n1p5
ID-4: /var/log size: 50 GiB used: 21.51 GiB (43.0%) fs: btrfs dev: /dev/nvme0n1p5
Swap: ID-1: swap-1 type: partition size: 8 GiB used: 0 KiB (0.0%) priority: -2 dev: /dev/nvme0n1p7
ID-2: swap-2 type: zram size: 1024 MiB used: 0 KiB (0.0%) priority: 100 dev: /dev/zram0
Sensors: System Temperatures: cpu: 46.0 C mobo: N/A
Fan Speeds (RPM): cpu: 0 fan-2: 0
Info: Processes: 255 Uptime: 16m wakeups: 2323 Memory: 7.42 GiB used: 1.51 GiB (20.3%) Init: systemd v: 249 Compilers:
gcc: 11.1.0 Packages: 950 pacman: 939 flatpak: 11 Shell: Bash v: 5.1.8 running-in: gnome-terminal inxi: 3.3.04

I don’t use anything Thunderbolt, and I’m not sure how to interpret the following output, but it seems different from yours:

$ systemctl --failed
  UNIT LOAD ACTIVE SUB DESCRIPTION
0 loaded units listed.
$ systemctl status bolt
○ bolt.service - Thunderbolt system service
     Loaded: loaded (/usr/lib/systemd/system/bolt.service; static)
     Active: inactive (dead)
       Docs: man:boltd(8)

Regular kernel, I don’t have linux-zen installed, sorry.


I just updated and checked on the regular kernel. I have no failed services.

[ricklinux@eos-kde ~]$ systemctl --failed
  UNIT LOAD ACTIVE SUB DESCRIPTION
0 loaded units listed.
[ricklinux@eos-kde ~]$ 

Same here, as I don’t have Thunderbolt.

[ricklinux@eos-kde ~]$ systemctl status bolt
○ bolt.service - Thunderbolt system service
     Loaded: loaded (/usr/lib/systemd/system/bolt.service; static)
     Active: inactive (dead)
       Docs: man:boltd(8)
[ricklinux@eos-kde ~]$ 

@pebcak

Do you have a Thunderbolt port on the XPS? If so, did you try enabling and starting the service?


Thanks @ReemZ and @ricklinux for your replies.

Yes. This is from lspci -vmm:

Slot:	03:00.0
Class:	PCI bridge
Vendor:	Intel Corporation
Device:	JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
PhySlot:	12
Rev:	02

Slot:	04:00.0
Class:	PCI bridge
Vendor:	Intel Corporation
Device:	JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
Rev:	02

Slot:	04:01.0
Class:	PCI bridge
Vendor:	Intel Corporation
Device:	JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
Rev:	02

Slot:	04:02.0
Class:	PCI bridge
Vendor:	Intel Corporation
Device:	JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
Rev:	02

Slot:	04:04.0
Class:	PCI bridge
Vendor:	Intel Corporation
Device:	JHL6540 Thunderbolt 3 Bridge (C step) [Alpine Ridge 4C 2016]
Rev:	02

Slot:	05:00.0
Class:	System peripheral
Vendor:	Intel Corporation
Device:	JHL6540 Thunderbolt 3 NHI (C step) [Alpine Ridge 4C 2016]
SVendor:	Dell
SDevice:	Device 08af
Rev:	02

Slot:	39:00.0
Class:	USB controller
Vendor:	Intel Corporation
Device:	JHL6540 Thunderbolt 3 USB Controller (C step) [Alpine Ridge 4C 2016]
SVendor:	Dell
SDevice:	Device 08af
Rev:	02
ProgIf:	30

I have never needed to enable and start bolt.service since it was on by default. It got started at boot with 5.12.xx kernels and it does so with the LTS kernel as well. I suspect there might be a “bug” in the 5.13 series or something of the sort. Trying to restart the service will fail as well:

sudo systemctl restart bolt.service         
Job for bolt.service failed because a fatal signal was delivered causing the control process to dump core.
See "systemctl status bolt.service" and "journalctl -xeu bolt.service" for details.
 systemctl status bolt.service 
× bolt.service - Thunderbolt system service
     Loaded: loaded (/usr/lib/systemd/system/bolt.service; static)
     Active: failed (Result: core-dump) since Wed 2021-07-21 21:31:27 CEST; 1min 26s ago
       Docs: man:boltd(8)
    Process: 10242 ExecStart=/usr/lib/boltd (code=dumped, signal=SEGV)
   Main PID: 10242 (code=dumped, signal=SEGV)
        CPU: 39ms
journalctl -xeu bolt.service 
Jul 21 21:31:27 arch-gnome boltd[10242]: probing: adding /sys/devices/pci0000:00/0000:00:1d.0/0000:03:00.0 to roots
Jul 21 21:31:27 arch-gnome boltd[10242]: [d2030000-0080-domain0                    ] bootacl: synchronizing journal
Jul 21 21:31:27 arch-gnome boltd[10242]: security level set to 'user'
Jul 21 21:31:27 arch-gnome boltd[10242]: [d2030000-0080-domain0                    ] connected: as domain0 [user] (/sys/devices/pci0000:00/0000:00:1d.0/0000:03:00.0/0000:04:00.0/0000:05:00.0/domain0)
Jul 21 21:31:27 arch-gnome boltd[10242]: [d2030000-0080-XPS 9380                   ] udev: failed to get device info: could not read 'authorized': No such file or directory
Jul 21 21:31:27 arch-gnome systemd[1]: bolt.service: Main process exited, code=dumped, status=11/SEGV
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ An ExecStart= process belonging to unit bolt.service has exited.
░░ 
░░ The process' exit code is 'dumped' and its exit status is 11.
Jul 21 21:31:27 arch-gnome systemd[1]: bolt.service: Failed with result 'core-dump'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ The unit bolt.service has entered the 'failed' state with result 'core-dump'.
Jul 21 21:31:27 arch-gnome systemd[1]: Failed to start Thunderbolt system service.
░░ Subject: A start job for unit bolt.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit bolt.service has finished with a failure.
░░ 
░░ The job identifier is 4874 and the job result is failed.
Jul 21 21:31:27 arch-gnome systemd[1]: bolt.service: Scheduled restart job, restart counter is at 10.
░░ Subject: Automatic restarting of a unit has been scheduled
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ Automatic restarting of the unit bolt.service has been scheduled, as the result for
░░ the configured Restart= setting for the unit.
Jul 21 21:31:27 arch-gnome systemd[1]: Stopped Thunderbolt system service.
░░ Subject: A stop job for unit bolt.service has finished
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A stop job for unit bolt.service has finished.
░░ 
░░ The job identifier is 4995 and the job result is done.
Jul 21 21:31:27 arch-gnome systemd[1]: bolt.service: Start request repeated too quickly.
Jul 21 21:31:27 arch-gnome systemd[1]: bolt.service: Failed with result 'core-dump'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ The unit bolt.service has entered the 'failed' state with result 'core-dump'.
Jul 21 21:31:27 arch-gnome systemd[1]: Failed to start Thunderbolt system service.
░░ Subject: A start job for unit bolt.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit bolt.service has finished with a failure.
░░ 
░░ The job identifier is 4995 and the job result is failed.

sudo dmesg | grep boltd
[35848.815330] boltd[10117]: segfault at 20 ip 00007fca3a921547 sp 00007ffe5c27d8c8 error 4 in libc-2.33.so[7fca3a7e7000+14b000]
[35848.815458] audit: type=1701 audit(1626895867.426:270): auid=4294967295 uid=0 gid=0 ses=4294967295 subj==unconfined pid=10117 comm="boltd" exe="/usr/lib/boltd" sig=11 res=1
[35854.969314] boltd[10149]: segfault at 20 ip 00007f0955746547 sp 00007ffcef982508 error 4 in libc-2.33.so[7f095560c000+14b000]
[35854.969429] audit: type=1701 audit(1626895873.580:296): auid=4294967295 uid=0 gid=0 ses=4294967295 subj==unconfined pid=10149 comm="boltd" exe="/usr/lib/boltd" sig=11 res=1
[35861.155769] boltd[10178]: segfault at 20 ip 00007f220ee45547 sp 00007ffd430726b8 error 4 in libc-2.33.so[7f220ed0b000+14b000]
[35861.155837] audit: type=1701 audit(1626895879.766:320): auid=4294967295 uid=0 gid=0 ses=4294967295 subj==unconfined pid=10178 comm="boltd" exe="/usr/lib/boltd" sig=11 res=1
[35867.769888] boltd[10217]: segfault at 20 ip 00007f2848650547 sp 00007ffc00b13d08 error 4 in libc-2.33.so[7f2848516000+14b000]
[35867.770008] audit: type=1701 audit(1626895886.380:356): auid=4294967295 uid=0 gid=0 ses=4294967295 subj==unconfined pid=10217 comm="boltd" exe="/usr/lib/boltd" sig=11 res=1
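
A side note on reading the `segfault` lines above, as a minimal sketch rather than a proper debugging recipe. On x86-64 the kernel prints the instruction pointer (`ip`) and the library mapping as `[base+size]`, so `ip - base` gives the offset of the faulting instruction inside `libc-2.33.so`. The `error` value is the page-fault error code, printed in hex, whose low three bits encode present/write/user. The function names `segv_offset` and `pf_error_decode` are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for reading kernel "segfault at ..." lines (x86-64).

# Print the offset of the faulting instruction inside the mapped library,
# i.e. ip minus the base address taken from the trailing "[base+size]".
segv_offset() (
    ip=$(printf '%s\n' "$1" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    base=$(printf '%s\n' "$1" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    [ -n "$ip" ] && [ -n "$base" ] || exit 1
    printf '0x%x\n' "$(( 0x$ip - 0x$base ))"
)

# Decode the low three bits of an x86 page-fault error code (given in hex).
pf_error_decode() (
    code=$(( 0x$1 ))
    if [ $(( code & 4 )) -ne 0 ]; then mode=user; else mode=kernel; fi
    if [ $(( code & 2 )) -ne 0 ]; then access=write; else access=read; fi
    if [ $(( code & 1 )) -ne 0 ]; then cause=protection-violation; else cause=page-not-present; fi
    printf '%s %s %s\n' "$mode" "$access" "$cause"
)

# First segfault line from the dmesg output above.
line='[35848.815330] boltd[10117]: segfault at 20 ip 00007fca3a921547 sp 00007ffe5c27d8c8 error 4 in libc-2.33.so[7fca3a7e7000+14b000]'
segv_offset "$line"     # prints 0x13a547
pf_error_decode 4       # prints: user read page-not-present
```

Run over all four lines above, `segv_offset` gives the same value, 0x13a547, so boltd crashes at the same spot in libc every time. Together with the fault address 0x20 and error code 4 (a user-mode read of an unmapped page), that pattern is typical of reading through a NULL or near-NULL pointer, though it does not say which component is at fault.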

I wonder if it’s just related to Gnome?

I have the Thunderbolt software installed on my Plasma system, but no Thunderbolt device.

But it did work as expected before with linux-zen-5.12.xx and it still does with linux-lts-5.10.52.

Well then, as you say, it must be an issue with the kernel updates.

I suspect that. This is a dual-boot system Arch-Fedora. I haven’t checked on Fedora today to see if the update to 5.13 has arrived. Then we could perhaps know if the issue is in Arch or in 5.13 in general. I’ll report when Fedora gets the 5.13 kernel.

I have the same problem on KDE Plasma. I think it has something to do with the security keys required when connecting a thunderbolt device.

Shouldn’t that just be checked every time a device is connected, not at boot?

I think it loads the service at boot and, if there is a problem, reports it then.

I got my kernel updated to
Linux fedora 5.13.4-200.fc34.x86_64 #1 SMP Tue Jul 20 20:27:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
on my Fedora 34 on the same machine as above. The bolt.service starts at boot and runs normally:

systemctl status bolt.service 
● bolt.service - Thunderbolt system service
     Loaded: loaded (/usr/lib/systemd/system/bolt.service; static)
     Active: active (running) since Sat 2021-07-24 14:26:01 CEST; 7min ago
       Docs: man:boltd(8)
   Main PID: 821 (boltd)
     Status: "authmode: enabled, force-power: unset"
      Tasks: 4 (limit: 9058)
     Memory: 2.3M
        CPU: 120ms
     CGroup: /system.slice/bolt.service
             └─821 /usr/libexec/boltd

Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-domain0                    ] bootacl: synchronizing journal
Jul 24 14:26:01 fedora boltd[821]: security level set to 'user'
Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-domain0                    ] connected: as domain0 [user] (/sys/devices/pci0000:00/0000:00:1d.0/0>
Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-XPS 9380                   ] udev: failed to get device info: could not read 'authorized': No suc>
Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-XPS 9380                   ] parent is (null)...
Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-XPS 9380                   ] store: updating device
Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-XPS 9380                   ] connected: unknown (/sys/devices/pci0000:00/0000:00:1d.0/0000:03:00.>
Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-domain0                    ] dbus: exported domain at /org/freedesktop/bolt/domains/d2030000_0080>
Jul 24 14:26:01 fedora boltd[821]: [d2030000-0080-XPS 9380                   ] dbus: exported device at /org/freedesktop/bolt/devices/d2030000_0080>
Jul 24 14:26:01 fedora systemd[1]: Started Thunderbolt system service.

So the problem seems to be on the Arch side (cannot speak for other distros using 5.13 kernel).


You can downgrade the kernel or bolt while the Arch side is working on it (https://bugs.archlinux.org/task/71569). You could also patch bolt (https://bbs.archlinux.org/viewtopic.php?pid=1984770#p1984770).

This worked for me (Endeavour/Plasma):

sudo pacman -U https://archive.archlinux.org/packages/b/bolt/bolt-0.8-3-x86_64.pkg.tar.zst


Welcome to EnOS forum @mercibe and thanks for your reply!
What puzzles me is the fact that bolt.service starts and runs as normal with bolt 0.9.1-1 on the LTS kernel. However, there seems to be some recent development on the 5.13 kernel front. See the last comment by @loqs at https://bugs.archlinux.org/task/71569


A revert of the changes in the 5.13 kernel that cause the Thunderbolt issue is underway:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git/commit/?h=usb-linus

Working again!

Comment by Steve Balboa (steveb) - Friday, 13 August 2021, 20:19 GMT
Problem solved with 5.13.10-arch1-1

https://bugs.archlinux.org/task/71569
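
For anyone landing here later, a quick way to check whether the running kernel is at or past the 5.13.10 release reported as fixed above. This is only a sketch: the function name `kernel_at_least` is made up for illustration, it relies on GNU `sort -V`, and it compares version numbers only, so it does not test for the Thunderbolt fix itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel release $1 is >= version $2.
kernel_at_least() (
    ver=${1%%-*}    # strip the distro suffix, e.g. "-arch1-1"
    min=$2
    # With version sort, the smaller of the two ends up on the first line.
    lowest=$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n 1)
    [ "$lowest" = "$min" ]
)

if kernel_at_least "$(uname -r)" 5.13.10; then
    echo "running kernel is 5.13.10 or newer"
else
    echo "running kernel is older than 5.13.10"
fi
```

Note that an LTS kernel such as 5.10.52 will be reported as older even though bolt.service ran fine on it in this thread, so the check is only meaningful within the 5.13 series and later.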