"journalctl -b -1" not working anymore

I usually use the command “journalctl -b -1” to see the journal for the previous boot. But this stopped working after I had to do a REISUB reboot of the hung PC.

Now it says:

9# journalctl -b -1
Data from the specified boot (-1) is not available: No data available

And all other old boot logs are not accessible either, e.g. -b -20

Nevertheless, I can see that all the old logs are still there. If I run a plain journalctl, I can see all the logs starting at the end of April. Even the requested -b -1 boot log is there: I can see the amdgpu crash and my subsequent REISUB reboot.


Jun 10 12:03:15 rakete kernel: kernel BUG at mm/slub.c:314!
Jun 10 12:03:15 rakete kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
Jun 10 12:03:15 rakete kernel: CPU: 15 PID: 13154 Comm: kworker/15:0 Tainted: P           OE     5.12.9-zen1-1-zen #1
Jun 10 12:03:15 rakete kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ULTRA/X570 AORUS ULTRA, BIOS F33 05/21/2021
Jun 10 12:03:15 rakete kernel: Workqueue: events drm_sched_job_timedout [gpu_sched]
Jun 10 12:03:15 rakete kernel: RIP: 0010:__slab_free+0x292/0x5b0
Jun 10 12:03:15 rakete kernel: Code: c9 0f 84 95 00 00 00 48 8b 44 24 78 65 48 2b 04 25 28 00 00 00 0f 85 80 02 00 00 48 8d 65 d8 5b 41 5c>
Jun 10 12:03:15 rakete kernel: RSP: 0018:ffffabcb80b1fc80 EFLAGS: 00010246
Jun 10 12:03:15 rakete kernel: RAX: ffff99bac3db6300 RBX: ffff99bac3db6200 RCX: ffff99bac3db6200
Jun 10 12:03:15 rakete kernel: RDX: 000000008020001f RSI: ffffd1a8440f6d00 RDI: ffff99bac0042a00
Jun 10 12:03:15 rakete kernel: RBP: ffffabcb80b1fd30 R08: 0000000000000001 R09: ffffffffc17659e1
Jun 10 12:03:15 rakete kernel: R10: ffff99c9ff242400 R11: ffffabcb80b1fa40 R12: ffffd1a8440f6d00
Jun 10 12:03:15 rakete kernel: R13: ffff99bac3db6200 R14: ffff99bac0042a00 R15: ffff99bac3db6200
Jun 10 12:03:15 rakete kernel: FS:  0000000000000000(0000) GS:ffff99c9bedc0000(0000) knlGS:0000000000000000
Jun 10 12:03:15 rakete kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 10 12:03:15 rakete kernel: CR2: 00005609a5bd1920 CR3: 0000000135a24000 CR4: 0000000000350ee0
Jun 10 12:03:15 rakete kernel: Call Trace:
Jun 10 12:03:15 rakete kernel:  ? __flush_work.isra.0+0x18e/0x210
Jun 10 12:03:15 rakete kernel:  ? kfd_gtt_sa_free+0x56/0x80 [amdgpu]
Jun 10 12:03:15 rakete kernel:  ? kernel_queue_uninit+0x81/0xe0 [amdgpu]
Jun 10 12:03:15 rakete kernel:  kernel_queue_uninit+0x81/0xe0 [amdgpu]
Jun 10 12:03:15 rakete kernel:  stop_cpsch+0xa0/0xc0 [amdgpu]
Jun 10 12:03:15 rakete kernel:  kgd2kfd_pre_reset+0x56/0x80 [amdgpu]
Jun 10 12:03:15 rakete kernel:  amdgpu_device_gpu_recover.cold+0x2a8/0xa58 [amdgpu]
Jun 10 12:03:15 rakete kernel:  amdgpu_job_timedout+0x128/0x150 [amdgpu]
Jun 10 12:03:15 rakete kernel:  drm_sched_job_timedout+0x64/0xe0 [gpu_sched]
Jun 10 12:03:15 rakete kernel:  process_one_work+0x214/0x3e0
Jun 10 12:03:15 rakete kernel:  worker_thread+0x4d/0x470
Jun 10 12:03:15 rakete kernel:  ? process_one_work+0x3e0/0x3e0
Jun 10 12:03:15 rakete kernel:  kthread+0x181/0x1b0
Jun 10 12:03:15 rakete kernel:  ? __kthread_init_worker+0x50/0x50
Jun 10 12:03:15 rakete kernel:  ret_from_fork+0x22/0x30
Jun 10 12:03:15 rakete kernel: Modules linked in: cfg80211 ccm algif_aead cbc des_generic libdes ecb algif_skcipher cmac md4 algif_hash af>
Jun 10 12:03:15 rakete kernel:  vboxnetflt(OE) vboxnetadp(OE) vboxdrv(OE) pkcs8_key_parser sg crypto_user
Jun 10 12:03:15 rakete kernel: ---[ end trace 009f5ee5bcaadc0c ]---
Jun 10 12:03:15 rakete kernel: RIP: 0010:__slab_free+0x292/0x5b0
Jun 10 12:03:15 rakete kernel: Code: c9 0f 84 95 00 00 00 48 8b 44 24 78 65 48 2b 04 25 28 00 00 00 0f 85 80 02 00 00 48 8d 65 d8 5b 41 5c>
Jun 10 12:03:15 rakete kernel: RSP: 0018:ffffabcb80b1fc80 EFLAGS: 00010246
Jun 10 12:03:15 rakete kernel: RAX: ffff99bac3db6300 RBX: ffff99bac3db6200 RCX: ffff99bac3db6200
Jun 10 12:03:15 rakete kernel: RDX: 000000008020001f RSI: ffffd1a8440f6d00 RDI: ffff99bac0042a00
Jun 10 12:03:15 rakete kernel: RBP: ffffabcb80b1fd30 R08: 0000000000000001 R09: ffffffffc17659e1
Jun 10 12:03:15 rakete kernel: R10: ffff99c9ff242400 R11: ffffabcb80b1fa40 R12: ffffd1a8440f6d00
Jun 10 12:03:15 rakete kernel: R13: ffff99bac3db6200 R14: ffff99bac0042a00 R15: ffff99bac3db6200
Jun 10 12:03:15 rakete kernel: FS:  0000000000000000(0000) GS:ffff99c9bedc0000(0000) knlGS:0000000000000000
Jun 10 12:03:15 rakete kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 10 12:03:15 rakete kernel: CR2: 00005609a5bd1920 CR3: 0000000135a24000 CR4: 0000000000350ee0
Jun 10 12:03:43 rakete smartd[4257]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 111 to 110
Jun 10 12:04:00 rakete kernel: amdgpu: Move buffer fallback to memcpy unavailable
Jun 10 12:04:00 rakete kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Jun 10 12:04:00 rakete kernel: amdgpu: Move buffer fallback to memcpy unavailable
Jun 10 12:04:00 rakete kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Jun 10 12:04:00 rakete kernel: amdgpu: Move buffer fallback to memcpy unavailable
Jun 10 12:04:00 rakete kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Jun 10 12:05:00 rakete kernel: amdgpu: Move buffer fallback to memcpy unavailable
Jun 10 12:05:00 rakete kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Jun 10 12:05:00 rakete kernel: amdgpu: Move buffer fallback to memcpy unavailable
Jun 10 12:05:00 rakete kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Jun 10 12:05:00 rakete kernel: amdgpu: Move buffer fallback to memcpy unavailable
Jun 10 12:05:00 rakete kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Jun 10 12:05:00 rakete kernel: amdgpu: Move buffer fallback to memcpy unavailable
Jun 10 12:05:00 rakete kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -19!
Jun 10 12:05:01 rakete crond[13330]: pam_unix(crond:session): session opened for user matthias(uid=1000) by (uid=0)
-- Boot 00000000000000000000000000000000 --
-- Boot 09d9dc88ec4a42cfb4abfb9563fc87a8 --
Jun 10 12:12:01 rakete kernel: Linux version 5.12.9-zen1-1-zen (linux-zen@archlinux) (gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #1 Z>
Jun 10 12:12:01 rakete kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-linux-zen root=UUID=0a765f87-6eca-4e05-bd1a-36ac4ba3fb8f rw audit=0 >
Jun 10 12:12:01 rakete kernel: x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
Jun 10 12:12:01 rakete kernel: x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
Jun 10 12:12:01 rakete kernel: x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
Jun 10 12:12:01 rakete kernel: x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
Jun 10 12:12:01 rakete kernel: x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'compacted' format.

What I find odd is the line:

-- Boot 00000000000000000000000000000000 --

Could that be causing the issue? How can I fix that?

This kind of sounds like the journal has become corrupted (though this is only an “educated guess”).

You could try moving the journal files out of the way, rebooting, and seeing whether you can access the regenerated files.
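A minimal sketch of the "move the journal files" step, assuming the persistent journal lives under /var/log/journal/<machine-id> (the path that appears in the debug output later in this thread). The helper name set_aside_journal and the backup location /var/log/journal-backup are made up for illustration. Moving only the machine-ID subdirectory leaves /var/log/journal itself in place, which journald's default Storage=auto mode needs in order to keep writing persistent logs.

```shell
#!/bin/sh
# Move one journal directory into a backup location instead of deleting it.
# set_aside_journal is a hypothetical helper name, not a systemd tool.
set_aside_journal() {
    src="$1"        # e.g. /var/log/journal/<machine-id>
    backup="$2"     # e.g. /var/log/journal-backup (assumed location)
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    mkdir -p -- "$backup" || return 1
    dest="$backup/$(basename -- "$src")-$(date +%Y%m%d-%H%M%S)"
    mv -- "$src" "$dest" || return 1
    echo "$dest"    # print where the old files ended up
}

# Intended use, as root, followed by a reboot:
#   set_aside_journal "/var/log/journal/$(cat /etc/machine-id)" /var/log/journal-backup
```

The old logs should remain readable afterwards with journalctl --directory= pointed at the printed path, so nothing is lost if the regenerated journal turns out not to help.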

Works for me on Ryzen.

http://ix.io/3psj

Check the journal database:

journalctl --verify

and review the systemd-journald unit logs:

journalctl -b -u systemd-journald
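
If the verify run reports problems, it can help to know exactly which file is affected. A rough sketch, assuming journalctl accepts --verify together with --file= on your systemd version; verify_journal_files is a made-up helper name. It checks each file in a directory separately and lists the ones that fail, including the "*.journal~" files that journald sets aside after an unclean shutdown.

```shell
#!/bin/sh
# Verify every journal file in a directory one at a time and list failures.
# verify_journal_files is a hypothetical helper name.
verify_journal_files() {
    dir="$1"                              # e.g. /var/log/journal/<machine-id>
    status=0
    for f in "$dir"/*.journal "$dir"/*.journal~; do
        [ -e "$f" ] || continue           # skip glob patterns that matched nothing
        if ! journalctl --verify --file="$f" >/dev/null 2>&1; then
            echo "FAIL $f"
            status=1
        fi
    done
    return "$status"                      # non-zero if any file failed
}

# Intended use, as root:
#   verify_journal_files "/var/log/journal/$(cat /etc/machine-id)"
```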

When I do journalctl --list-boots I see a list of 265 boot logs without any error message.

Then I tried this:

20# export SYSTEMD_LOG_LEVEL=debug
21# journalctl -b -6
[...]
Encountered invalid entry while bisecting, cutting algorithm short. (1)
Attempt to move to uninitialized object: 9519184
Encountered invalid entry while bisecting, cutting algorithm short. (1)
Whoopsie! We found a boot ID but can't read its last entry.
Data from the specified boot (-6) is not available: No data available
Root directory /run/log/journal removed.
Directory /var/log/journal/4bd88beaa35549b5922de02c8064cbf1 removed.
Root directory /var/log/journal removed.
mmap cache statistics: 6916 context cache hit, 1392 window list hit, 540 miss

This seems to be the culprit: Whoopsie! We found a boot ID but can't read its last entry.

That must have happened during the REISUB reboot.

Any idea how to fix a bad or missing boot ID?

As far as I can tell there’s no way to repair the file itself, so the question is whether you need to retain historical journal entries for any reason.

If not, you can vacuum the journal which will remove the corrupted files, e.g.

sudo journalctl --vacuum-time=1w

will retain the past week of journals and entries.
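
One caveat: vacuuming only operates on archived journal files, not the active ones, so it is usually worth rotating first. A small sketch of that sequence, with vacuum_journal as a made-up helper name and 1w as the default retention from the command above; run it as root.

```shell
#!/bin/sh
# Rotate so the active files become archived, then vacuum by age.
# vacuum_journal is a hypothetical helper name.
vacuum_journal() {
    keep="${1:-1w}"                        # retention span, default one week
    journalctl --disk-usage                # size before
    journalctl --rotate || return 1        # archive the currently active files
    journalctl --vacuum-time="$keep" || return 1
    journalctl --disk-usage                # size after
}

# Intended use, as root:
#   vacuum_journal 1w
```

If the old entries might still be wanted, copy the journal directory somewhere else before running this, since vacuumed files are deleted.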


I agree that it is probably good to cut off the bad piece of the journal. But I want to see if a repair is possible. I have created an issue for systemd to see what the experts are saying.
