FPS drops in games

Hi!

I know a thread is going on already, but my system is a little different and I’m unsure the same solutions apply to me.

I’m running a 7950x3d, 7900XTX and 96GB of RAM. I have a swapfile on NVME.

What I’m seeing is random periodic drops in FPS. I was just playing Insurgency Sandstorm and saw it there. I play Minecraft and see it there too. I’ll be playing and suddenly the game freezes for ~5 seconds and then keeps going.

This time I happened to have btop open on my second monitor - I saw nothing that looked like another process coming along. I did see the GPU load drop to zero for those 5 seconds though.

I’m running the normal kernel (not zen) and I run my games with feral game mode (gamemoderun %command% in Steam). I’m using the default gamemode config located here: https://github.com/FeralInteractive/gamemode/blob/master/example/gamemode.ini

See the dips in the screenshot of btop:

Are there some good places to start?

Too much ram available, system can’t find the entries anymore :wink:

Neither Minecraft nor Insurgency should be an issue for your hardware. Therefore I can only provide the usual advice to check that your system, the motherboard’s BIOS and such are up to date.

Please also provide at least the output of inxi -Fxxc0z | eos-sendlog and your journalctl, as described in the guide on how to include systemlogs in your post.

Hardware info: https://0x0.st/856-.txt
Journalctl: https://0x0.st/856Z.txt

I played again with mangohud running to track my temps a little better - never did either CPU or GPU break the mid 60s, and maybe 5 minutes in I got the freeze.

My BIOS might be a little out of date, but nothing egregious - maybe the March 2025 version.

I looked at my journalctl as well and see this - I don’t remember which minute the freeze happened in:

Jul 27 06:43:36 citadel steam[3081247]: wine: setpriority -10 for pid -1 failed: 3
Jul 27 06:44:10 citadel wireplumber[1824]: wp-event-dispatcher: <WpAsyncEventHook:0x55d7c20c5230> failed: <WpSiStandardLink:0x55d7c1fc36e0> link failed: some node was destroyed before the link was created
Jul 27 06:44:19 citadel wireplumber[1824]: wp-event-dispatcher: <WpAsyncEventHook:0x55d7c20c5230> failed: <WpSiStandardLink:0x55d7c2633c20> link failed: 1 of 1 PipeWire links failed to activate
Jul 27 06:46:18 citadel steam[3081247]: wine: setpriority -10 for pid -1 failed: 3
Jul 27 06:46:38 citadel steam[3081247]: wine: setpriority -10 for pid -1 failed: 3
Jul 27 06:46:39 citadel steam[3081247]: wine: setpriority -10 for pid -1 failed: 3
Jul 27 06:51:05 citadel steam[3081247]: (process:3093100): GLib-GObject-CRITICAL **: 06:51:05.094: g_object_unref: assertion 'G_IS_OBJECT (object)' failed

Additionally, I had set amdgpu.ppfeaturemask=0xffffffff before - I’ve changed my kernel command line to this now:

BOOT_IMAGE=/@/boot/vmlinuz-linux root=UUID=b7ffddb0-a536-4e30-9aff-d106ca80fb3d rw rootflags=subvol=@ nowatchdog nvme_load=YES loglevel=3 amdgpu.ppfeaturemask=0xfff73fff

I will retest later today with that. If the issue is still there, I can try the Zen kernel, I guess?

There is a newer BIOS version available for your motherboard.

The two feature masks you mention differ, in case you haven’t noticed.

According to the amdgpu article in the Arch wiki, the way to identify the correct feature mask for your GPU is to run this command in bash and use its output:

$ printf 'amdgpu.ppfeaturemask=0x%x\n' "$(($(cat /sys/module/amdgpu/parameters/ppfeaturemask) | 0x4000))"

Yes - I understand that it’s different. I looked at the header file for this. The command says I want amdgpu.ppfeaturemask=0xfff77fff; however, the one I pasted disables PP_OVERDRIVE_MASK. I figured that if I have performance issues, turning off overclocking seems like a good move. I had PP_GFXOFF_MASK on before, which it seems can cause trouble. I have no clue what PP_GFX_DCS_MASK is and why it should be on or off.

The reason I had it set to 0xffffffff before is because that is what CoreCtrl tells you to do.
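For anyone comparing the masks in this thread, here is a minimal sketch that decodes the three bits discussed above. The bit values are what I believe the PP_FEATURE_MASK enum in the kernel's amd_shared.h defines (only the 0x4000 overdrive bit is confirmed by the wiki command quoted earlier), so treat them as assumptions and check them against your kernel source. The function name is made up for illustration.

```python
# Bit values as I understand them from the kernel's amd_shared.h
# (PP_FEATURE_MASK enum). Assumption: verify against your kernel version.
PP_BITS = {
    "PP_OVERDRIVE_MASK": 0x4000,
    "PP_GFXOFF_MASK": 0x8000,
    "PP_GFX_DCS_MASK": 0x80000,
}


def decode_ppfeaturemask(mask: int) -> dict[str, bool]:
    """Return, for each bit of interest, whether it is set in the mask."""
    return {name: bool(mask & bit) for name, bit in PP_BITS.items()}


if __name__ == "__main__":
    # The three masks mentioned in this thread.
    for mask in (0xFFFFFFFF, 0xFFF77FFF, 0xFFF73FFF):
        flags = decode_ppfeaturemask(mask)
        enabled = [name for name, on in flags.items() if on] or ["none of the three"]
        print(f"{mask:#010x}: {', '.join(enabled)}")
```

Under those assumed bit values, 0xfff77fff and 0xfff73fff differ only in the overdrive bit, and neither has GFXOFF set.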

I’ve updated firmware as well and will retest in a few hours.

Edit: Is it worth turning off the iGPU? I see activity on it occasionally while gaming and keeping an eye on btop though I’m unsure what might be using it.

Thanks.

Issue persists. What I’ve done so far:

  1. GPU feature mask change
  2. Tried with and without Feral gamemode - issue is more pronounced without gamemode
  3. Disabled iGPU
  4. Updated motherboard firmware
  5. Closed all apps except Steam & Game
  6. Removed CoolerControl in case that was causing thermal issues

I’ve noticed I can alt/tab just fine while the game is hung. Desktop works fine, game is still hung for a second or two when I alt-tab back into the game. So it’s not my whole system hanging, but any games seem to hang.

Boot logs show no errors at time of the hang.

Unfiltered logging from the time a hang happened:

Jul 27 20:27:39 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:26492928 avail:73728 max:15360 skip:69888
Jul 27 20:27:39 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:26566656 avail:40960 max:15360 skip:37120
Jul 27 20:27:39 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:26607616 avail:196608 max:15360 skip:192768
Jul 27 20:27:40 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:26811904 avail:57856 max:15360 skip:54016
Jul 27 20:27:40 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:26869760 avail:16384 max:15360 skip:12544
Jul 27 20:27:40 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:26886144 avail:139264 max:15360 skip:135424
Jul 27 20:27:40 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27025408 avail:81920 max:15360 skip:78080
Jul 27 20:27:40 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27115008 avail:66048 max:15360 skip:62208
Jul 27 20:27:41 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27181056 avail:139264 max:15360 skip:135424
Jul 27 20:27:41 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27320320 avail:106496 max:15360 skip:102656
Jul 27 20:27:41 citadel rtkit-daemon[1659]: Supervising 5 threads of 3 processes of 1 users.
Jul 27 20:27:41 citadel rtkit-daemon[1659]: Supervising 5 threads of 3 processes of 1 users.
Jul 27 20:27:41 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27426816 avail:73728 max:15360 skip:69888
Jul 27 20:27:45 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27500544 avail:1277952 max:15360 skip:1274112
Jul 27 20:27:45 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:28778496 avail:73728 max:15360 skip:69888
Jul 27 20:27:45 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:28852224 avail:81920 max:15360 skip:78080
Jul 27 20:27:46 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:28934144 avail:303104 max:15360 skip:299264
Jul 27 20:27:46 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:29237248 avail:24576 max:15360 skip:20736
Jul 27 20:27:50 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:29269504 avail:1516032 max:15360 skip:1512192
Jul 27 20:27:53 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:30785536 avail:1236992 max:15360 skip:1233152
Jul 27 20:27:54 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:32022528 avail:311296 max:15360 skip:307456
Jul 27 20:27:56 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:32333824 avail:729088 max:15360 skip:725248
Jul 27 20:27:56 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:33062912 avail:73728 max:15360 skip:69888

In a short gaming session, the hangs happened at 8:08, 8:13 and 8:27. No real cadence to them.
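To line the hangs up with the log more easily, a small sketch like the following could pull out only the large pipewire-pulse overruns from a saved journal excerpt. It assumes the line format shown above; the function name and the threshold are hypothetical, and reading the skip field as a size that grows with the length of the stall is my interpretation, not something PipeWire documents here.

```python
import re

# Matches the pipewire-pulse "overrun recover" lines quoted above.
OVERRUN = re.compile(
    r"^(?P<stamp>\w{3} +\d+ \d{2}:\d{2}:\d{2}) .*overrun recover "
    r"read:\d+ avail:\d+ max:\d+ skip:(?P<skip>\d+)"
)


def large_overruns(lines, threshold=500_000):
    """Return (timestamp, skip) pairs for overruns at or above threshold.

    The default threshold is an arbitrary illustrative value,
    not a PipeWire constant.
    """
    hits = []
    for line in lines:
        match = OVERRUN.search(line)
        if match and int(match.group("skip")) >= threshold:
            hits.append((match.group("stamp"), int(match.group("skip"))))
    return hits


# Three lines copied from the excerpt above.
sample = [
    "Jul 27 20:27:41 citadel rtkit-daemon[1659]: Supervising 5 threads of 3 processes of 1 users.",
    "Jul 27 20:27:41 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27426816 avail:73728 max:15360 skip:69888",
    "Jul 27 20:27:45 citadel pipewire-pulse[2019]: mod.protocol-pulse: 0x557837b75be0: [Insurgency: Sandstorm] overrun recover read:27500544 avail:1277952 max:15360 skip:1274112",
]

if __name__ == "__main__":
    for stamp, skip in large_overruns(sample):
        print(stamp, skip)
```

In the excerpt, the largest skip values follow the multi-second gaps between timestamps, which would fit the audio client stalling together with the game rather than causing the hang.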

To add more info - I’ve used a spare SSD to install EOS fresh. I installed Steam on top of that (being sure I selected vulkan-radeon/radv) and ran the game. NOTHING else was installed beyond what it took to run Steam, and then Feral gamemode on top of that. I didn’t change groups for gamemode, for example.

Same issue repros. This is making me think that since people haven’t been complaining about the 7900XTX on Linux that I have something going on hardware wise.

I shouldn’t have a power issue - I have a RM1000x powering the system with 3x PCIe power cables (eg, no splitters).

Does anyone have some things to try? I’m at my wits’ end on this.

I would try to update the BIOS to its latest version, which is version 3265, released on July 1st.
That may include some mitigations, but I can’t tell for sure.

Additionally, in the excerpts of your journalctl, there are CPU-related segmentation faults:

Jul 27 01:45:04 citadel kernel: CacheThread_Blo[2937995]: segfault at 4c ip 00007f0576706df4 sp 00007f051f7f8f00 error 6 in libcef.so[6505df4,7f05723e3000+9a92000] likely on CPU 7 (core 7, socket 0)
...
Jul 27 06:35:14 citadel kernel: vkcube[3088186]: segfault at 7fae3c44632a ip 00007fae3c44632a sp 00007ffc7b42db90 error 14 likely on CPU 27 (core 11, socket 0)
Jul 27 06:35:14 citadel kernel: Code: Unable to access opcode bytes at 0x7fae3c446300.

All I can tell is that it won’t hurt to update your BIOS and to check if resizable BAR support is activated.

Hope this helps, if not a complete journalctl instead of a partial one would be useful to diagnose this further.

On a different note, I can’t tell how your 96GB of RAM is distributed among the physical modules. If it’s 2x48GB or 4x24GB of identical make & model, I wouldn’t worry. But unconventional RAM configurations or mismatched sizes can result in weird issues and should be avoided.

Done already and didn’t help.

VKcube was run as a part of a UI to configure mangohud since I didn’t want to delve into config files.

I don’t see any segfaults elsewhere and the mangohud UI app is…bad. It takes a few minutes to open.

It is turned on - I think this confirms it’s working?

[    7.040097] [drm] Detected VRAM RAM=24560M, BAR=32768M

I’ll see about a full one with a gaming session. In the meantime I’ve asked XFX for a new vBIOS and switched the GPU over to the high performance vBIOS. Will test soon.

I also have the AMD GPU top installed and will keep it open on the side to see if voltages dip or something.

It’s 2x48 in the correct slots per Asus’s manual, but good thought.

Maybe try using just one RAM stick and see if the issue persists. If it does, try the other one (and also try different slots). Maybe you have a faulty stick. Also check in the BIOS whether XMP or EXPO is activated (if it is, try deactivating it, or the other way around).

If it’s not the RAM, my other suggestion is to look at the Proton settings in Steam. I don’t know the game you are playing, but I had something similar once or twice, and changing the Proton version solved it. In one case I changed from Proton 6 to 9, and another time, with a different game, from 9 to Experimental. I always had weird frame drops and freezing, but changing versions solved it.

Jul 27 01:45:04 citadel kernel: CacheThread_Blo[2937995]: segfault at 4c ip 00007f0576706df4 sp 00007f051f7f8f00 error 6 in libcef.so[6505df4,7f05723e3000+9a92000] likely on CPU 7 (core 7, socket 0)

There are two segmentation faults in your log, and they aren’t related to each other. The earlier one (above) occurred several hours before the other and has nothing to do with vkcube. It happened in libcef.so, most likely because Steam uses an embedded Chromium client under the hood.
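If more segfaults need checking, a rough sketch like this could pull the process name, the faulting library (if the kernel names one) and the CPU out of each line. It assumes the kernel message format seen in the two lines quoted in this thread; the function name is made up for illustration.

```python
import re

# Based on the two kernel segfault lines quoted in this thread.
SEGFAULT = re.compile(
    r"kernel: (?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r".*?error (?P<error>\d+)(?: in (?P<object>\S+?)\[)?"
    r".*?likely on CPU (?P<cpu>\d+)"
)


def parse_segfault(line: str):
    """Return a dict of fields for a kernel segfault line, or None."""
    match = SEGFAULT.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["pid"] = int(fields["pid"])
    fields["cpu"] = int(fields["cpu"])
    return fields


if __name__ == "__main__":
    line = (
        "Jul 27 06:35:14 citadel kernel: vkcube[3088186]: segfault at "
        "7fae3c44632a ip 00007fae3c44632a sp 00007ffc7b42db90 error 14 "
        "likely on CPU 27 (core 11, socket 0)"
    )
    print(parse_segfault(line))
```

For the two lines here it would report libcef.so for the first fault and no named object for the vkcube one, which matches the point that the two are unrelated.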

It is. If it were not enabled, it would report BAR=256M.
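The same check can be sketched in a few lines, assuming the [drm] line format quoted above. The rule used here (the BAR is at least as large as VRAM when resizable BAR is active) is a heuristic consistent with the two values mentioned in this thread, not an official test, and the function name is hypothetical.

```python
import re

# Matches the amdgpu boot message quoted in this thread.
DRM_LINE = re.compile(r"Detected VRAM RAM=(\d+)M, BAR=(\d+)M")


def rebar_looks_enabled(dmesg_line: str):
    """Return True/False for a matching line, or None if it does not match.

    Heuristic (assumption): with resizable BAR active, the BAR covers
    all of VRAM; without it, the BAR is much smaller (e.g. 256M).
    """
    match = DRM_LINE.search(dmesg_line)
    if match is None:
        return None
    vram_mb, bar_mb = int(match.group(1)), int(match.group(2))
    return bar_mb >= vram_mb


if __name__ == "__main__":
    print(rebar_looks_enabled("[    7.040097] [drm] Detected VRAM RAM=24560M, BAR=32768M"))
```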

Furthermore:

API: Vulkan v: 1.4.321 surfaces: N/A device: 0 type: discrete-gpu
    driver: mesa radv device-ID: 1002:744c device: 1 type: integrated-gpu
    driver: mesa radv device-ID: 1002:164e

I would deactivate the integrated Radeon graphics via BIOS permanently.

I turned off EXPO, and I had already turned the iGPU off yesterday. To be sure, I also turned off my Wi-Fi controller and onboard Ethernet, since I use a discrete X550-T1 and a USB DAC.

I also went ahead and tried out CachyOS using their native Steam client, their proton etc.

I’ve also tried various Proton 9, 10 and GE 10 versions, with the same issue in all of them.

I also ran Superposition for a bit - I’ve seen no drops yet, but I need to run it for 15 minutes and see what happens. Maybe the issues are isolated to a few games and the games are at fault? I played some Teardown and saw no issues, but it doesn’t push the card to the power limit. I’ll have to pick another game to play for a bit to try.

I looked at dmesg logs for any errors and saw none.

I’ll look for more segfaults.

You could check via vulkaninfo --summary which driver is actually being used.
The Arch wiki suggests the mesa driver instead of amdgpu / amdvlk. The linked article also explains how to avoid amdvlk being used as the default.

https://0x0.st/8Rcc.txt

Devices:
========
GPU0:
        apiVersion         = 1.4.311
        driverVersion      = 25.1.6
        vendorID           = 0x1002
        deviceID           = 0x744c
        deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName         = AMD Radeon RX 7900 XTX (RADV NAVI31)
        driverID           = DRIVER_ID_MESA_RADV
        driverName         = radv
        driverInfo         = Mesa 25.1.6-arch1.1
        conformanceVersion = 1.4.0.0
        deviceUUID         = 00000000-0300-0000-0000-000000000000
        driverUUID         = 414d442d-4d45-5341-2d44-525600000000

I should be using the correct driver there. I’m using vulkan-radeon, not amdgpu.
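As a quick way to confirm this from saved output, here is a minimal sketch that maps each deviceName to its driverName. It assumes the `key = value` layout of the vulkaninfo --summary excerpt above, with deviceName listed before driverName for each GPU; the function name is made up for illustration.

```python
import re

FIELD = re.compile(r"^\s*(deviceName|driverName)\s*=\s*(.+?)\s*$", re.MULTILINE)


def drivers_by_device(summary_text: str) -> dict[str, str]:
    """Map each deviceName to the driverName that follows it."""
    result = {}
    current_device = None
    for key, value in FIELD.findall(summary_text):
        if key == "deviceName":
            current_device = value
        elif current_device is not None:
            result[current_device] = value
    return result


# Excerpt copied from the output above.
excerpt = """GPU0:
        deviceType         = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        deviceName         = AMD Radeon RX 7900 XTX (RADV NAVI31)
        driverID           = DRIVER_ID_MESA_RADV
        driverName         = radv
        driverInfo         = Mesa 25.1.6-arch1.1
"""

if __name__ == "__main__":
    print(drivers_by_device(excerpt))
```

A driverName of radv for the 7900 XTX is what you would expect with vulkan-radeon installed.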