Linux 6.9 seems stuck on one core

The Linux 6.9 Kernel has been very problematic for me. When I boot into it, all my applications seem stuck running on only one core rather than spread over all the cores my CPU offers. This slows down my applications and games greatly.

I’ve got an older motherboard and a pre-Ryzen CPU, an AMD FX-6300. When booting a 6.8 series kernel, my performance is just fine. When booting a 6.9 kernel, nothing I do seems to stop the system from running everything it can on core 0 until it pegs at 100% usage, while the other cores sit almost completely unused. I’ve tried using cpupower to set all my cores to the ‘ondemand’ governor, but that doesn’t seem to accomplish much. I’ve been researching the problem and haven’t found much to illuminate what might be happening. I’ve used both mainline and zen kernels with very similar results. I know the 6.9 kernel gained better support for preferred cores on Ryzen systems, but the FX-6300 predates Ryzen by quite a bit. I’m at a loss and trying to figure out the best way to move forward.
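For anyone wanting to confirm the same "everything on core 0" symptom with numbers, here is a minimal sketch that samples /proc/stat twice and prints how busy each core was in between. The function names (parse_proc_stat, busy_percent) are made up for illustration, and the one-second sampling interval is an arbitrary assumption.

```python
import os
import time


def parse_proc_stat(text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; the aggregate line is just "cpu".
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        values = [int(v) for v in fields[1:9]]
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values)
        stats[fields[0]] = (total - idle, total)
    return stats


def busy_percent(before, after):
    """Per-core busy share (0-100) between two parse_proc_stat() samples."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        elapsed = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as f:
        first = parse_proc_stat(f.read())
    time.sleep(1.0)  # assumed sampling interval
    with open("/proc/stat") as f:
        second = parse_proc_stat(f.read())
    usage = busy_percent(first, second)
    for cpu in sorted(usage, key=lambda name: int(name[3:])):
        print(f"{cpu}: {usage[cpu]:5.1f}% busy")
```

Run it under load on both kernels. If the problem is what I describe, 6.9 should show one core near 100% and the rest near idle, while 6.8 should show the load spread out.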

Any help is appreciated.

Here’s the output of inxi -FAZ for my system running the 6.8 kernel that works best for me:

System:
Kernel: 6.8.9-zen1-2-zen arch: x86_64 bits: 64
Desktop: Cinnamon v: 6.0.4 Distro: EndeavourOS
Machine:
Type: Desktop Mobo: Gigabyte model: GA-970A-UD3 serial:
BIOS: Award v: F4 date: 10/13/2011
CPU:
Info: 6-core model: AMD FX-6300 bits: 64 type: MT MCP cache: L2: 6 MiB
Speed (MHz): avg: 1405 min/max: 1400/3500 cores: 1: 1400 2: 1423 3: 1400
4: 1400 5: 1409 6: 1400
Graphics:
Device-1: NVIDIA TU117 [GeForce GTX 1650] driver: nvidia v: 550.67
Display: x11 server: X.Org v: 21.1.13 with: Xwayland v: 24.1.0 driver: X:
loaded: nvidia unloaded: modesetting gpu: nvidia,nvidia-nvswitch
resolution: 1920x1080~60Hz
API: EGL v: 1.5 drivers: nvidia,swrast
platforms: gbm,x11,surfaceless,device
API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: nvidia mesa v: 550.67
renderer: NVIDIA GeForce GTX 1650/PCIe/SSE2
Audio:
Device-1: AMD SBx00 Azalia driver: snd_hda_intel
Device-2: NVIDIA driver: snd_hda_intel
API: ALSA v: k6.8.9-zen1-2-zen status: kernel-api
Server-1: PipeWire v: 1.0.7 status: active
Network:
Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
driver: r8169
IF: enp3s0 state: up speed: 1000 Mbps duplex: full mac: 50:e5:49:c8:17:9a
IF-ID-1: tun0 state: unknown speed: 10000 Mbps duplex: full mac: N/A
Drives:
Local Storage: total: 18.41 TiB used: 5.77 TiB (31.3%)
ID-1: /dev/sda vendor: Toshiba model: HDWF180 size: 7.28 TiB
ID-2: /dev/sdb vendor: Toshiba model: MG03ACA400 size: 3.64 TiB
ID-3: /dev/sdc vendor: PNY model: CS900 240GB SSD size: 223.57 GiB
ID-4: /dev/sdd vendor: Seagate model: Backup+ Desk size: 4.55 TiB
type: USB
ID-5: /dev/sde vendor: Western Digital model: WD My Book 1130
size: 2.73 TiB type: USB
Partition:
ID-1: / size: 214.77 GiB used: 101.83 GiB (47.4%) fs: btrfs dev: /dev/dm-1
ID-2: /home size: 214.77 GiB used: 101.83 GiB (47.4%) fs: btrfs
dev: /dev/dm-1
ID-3: /var/log size: 214.77 GiB used: 101.83 GiB (47.4%) fs: btrfs
dev: /dev/dm-1
Swap:
ID-1: swap-1 type: partition size: 8.8 GiB used: 0 KiB (0.0%) dev: /dev/dm-0
Sensors:
System Temperatures: cpu: 31.4 C mobo: N/A gpu: nvidia temp: 36 C
Fan Speeds (rpm): N/A gpu: nvidia fan: 56%
Info:
Memory: total: 16 GiB available: 15.6 GiB used: 2.42 GiB (15.5%)
Processes: 278 Uptime: 1h 2m Shell: Zsh inxi: 3.3.34

Is there an auto overclock feature turned on in the BIOS? I also notice your BIOS is version F4 from 2011. There are 4 newer BIOS versions with newer AMD AGESA for this board. I assume the motherboard is revision 1.0/1.1.

Edit: If turbo mode is enabled, it may not be able to throttle properly.

It is a 1.1 board. I flashed the latest BIOS available from Gigabyte’s website, but it’s still showing the same behavior when booting into the mainline 6.9 kernel.

Factorio is one of my favorite test loads. On 6.8.9, it spreads out over all 6 cores while loading all its textures to the GPU. On 6.9, it picks just one core and sticks with it.
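A rough way to check this for any process is to look at which CPU each of its threads last ran on. The sketch below reads the "processor" field (field 39) from /proc/<pid>/task/<tid>/stat and counts threads per CPU. The helper names are hypothetical, and the PID is whatever you pass on the command line (it defaults to the script itself).

```python
import os
import sys
from collections import Counter


def last_cpu_from_stat(stat_line):
    """Extract the CPU a thread last ran on from one /proc/.../stat line."""
    # Field 2 (comm) is wrapped in parentheses and may contain spaces or
    # parentheses itself, so split after the LAST ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 (processor) is rest[36].
    return int(rest[36])


def thread_cpu_histogram(pid):
    """Count how many threads of `pid` last ran on each CPU."""
    counts = Counter()
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                counts[last_cpu_from_stat(f.read())] += 1
        except OSError:
            continue  # thread exited between listing and reading
    return counts


if __name__ == "__main__" and os.path.isdir("/proc/self/task"):
    has_pid = len(sys.argv) > 1 and sys.argv[1].isdigit()
    pid = int(sys.argv[1]) if has_pid else os.getpid()
    for cpu, n in sorted(thread_cpu_histogram(pid).items()):
        print(f"cpu{cpu}: {n} thread(s)")
```

This is only a snapshot, so it is worth running a few times while the game is loading. It shows where threads last ran, not how much CPU time each one used.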

The motherboard does have a ‘Performance Boost’ setting. I’ve left it set to ‘Auto’.

Have you tried the LTS kernel?

I have tried the LTS kernel, currently 6.6.32. It’s in the same boat as 6.8.9, i.e., it works normally. The problem seems to be limited to the 6.9 series.

So you can use the LTS kernel for now?
Maybe the default kernel gets a fix later.

And you could/should file a bug report with the kernel developers. Fortunately this behaviour is reproducible and arrived with the 6.9 line, so I guess it can be fixed more easily than the weird from-time-to-time issues :wink:

https://bugzilla.kernel.org/

I’d considered filing a kernel bug, but kernel.org strongly discourages individuals filing their own bugs in favor of going through their distribution.

Right now, that’s my fallback plan. With a system as old as mine, I’m not buying myself a whole lot by using bleeding edge kernels. However, bleeding edge eventually becomes LTS, and since this problem has now gone through two dot releases with the same behavior, I felt it best to start mentioning it, especially since I can’t seem to find anyone else who has the same combo of hardware and kernel issue.

The AUR has lots of LTS kernels, so that might be a fallback also in the long run unless they fix the mainline kernel.

Have you tried disabling it?

I have tried disabling it, but with no change.

Reporting it would require you to build your own kernel, but it is all doable: https://www.kernel.org/doc/html/latest/admin-guide/reporting-issues.html

I tried again today with the 6.9.3 mainline and zen kernels, but with the same results. 6.6.32 LTS works just fine, as does 6.8.9 zen.

I’m considering:

a) trying to bring this up with the Arch packagers
b) trying to build my own 6.9 kernel. I’m still reading about how Arch handles kernel builds.

Any advice for either of those would be quite welcome.

OpenSUSE has a bug open for this:

https://bugzilla.opensuse.org/show_bug.cgi?id=1225968

And there is a discussion thread in their forum but with no solution:

Thank you so much for pointing this thread out! It wasn’t there when I posted this issue here. I chimed in on the openSUSE thread to add confirmation. This does appear to be a kernel issue and not a distro-specific issue.

So this has apparently hit the LKML from the Arch side of things:

https://lore.kernel.org/all/7skhx6mwe4hxiul64v6azhlxnokheorksqsdbp7qw6g2jduf6c@7b5pvomauugk/

and has a patch submitted:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a693b9c95abd4947c2d06e05733de5d470ab6586

that will hopefully prove to be a fix for those of us affected, eventually on all distros.

My understanding of the bug is that the Linux kernel is misidentifying FX/Piledriver-series CPUs (like mine, the one in the LKML post, and the one in the SUSE forum post) as Zen-series CPUs, and that using the wrong SMT topology causes the ‘locked on one core’ behavior.
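If that understanding is right, the difference should be visible in what the kernel exposes under /sys/devices/system/cpu/cpuN/topology/. Here is a small sketch that prints core_id and thread_siblings_list for each CPU so the output can be compared between a 6.8 and a 6.9 boot. The function names are made up for illustration, and I am assuming (not asserting) that the sibling groupings are where the two kernels would differ.

```python
import glob
import os
import re


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as "0-1" or "0,2-3" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_topology(base="/sys/devices/system/cpu"):
    """Return {cpu: (core_id, sibling_cpus)} as reported by the running kernel."""
    topology = {}
    for path in glob.glob(os.path.join(base, "cpu[0-9]*")):
        cpu = int(re.search(r"cpu(\d+)$", path).group(1))
        topo_dir = os.path.join(path, "topology")
        try:
            with open(os.path.join(topo_dir, "core_id")) as f:
                core_id = int(f.read())
            with open(os.path.join(topo_dir, "thread_siblings_list")) as f:
                siblings = parse_cpu_list(f.read())
        except OSError:
            continue  # offline CPU or topology files not present
        topology[cpu] = (core_id, siblings)
    return topology


if __name__ == "__main__":
    for cpu, (core_id, siblings) in sorted(read_topology().items()):
        print(f"cpu{cpu}: core_id={core_id} thread_siblings={siblings}")
```

Save the output under each kernel and diff the two files. Running `lscpu -e` gives a similar view if you would rather not run a script.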

Thank you all so much for looking into this with me. Thank you mbod for pointing out the SUSE discussion. I’ve weighed in there and am tangentially mentioned in the SUSE bug report.

This is one of the many reasons why the Linux community is amazing. :trophy:
