RDNA 3 7900 XTX memory clock stuck at 96 MHz

After upgrading my kernel to 6.5.5, the memory clock is no longer reaching the higher pstates, which causes games to run poorly.

I tried some advice I found on the Arch forum for an RDNA 2 card with the same problem, but no luck.

This is the output of inxi -Faz:

System:
  Kernel: 6.5.5-arch1-1 arch: x86_64 bits: 64 compiler: gcc v: 13.2.1
    clocksource: tsc available: hpet,acpi_pm
    parameters: BOOT_IMAGE=/boot/vmlinuz-linux
    root=UUID=859cc0a9-bb8c-435f-a0a6-41312b22f388 rw
    amdgpu.ppfeaturemask=0xffffffff loglevel=3 nowatchdog nvme_load=YES
    initcall_blacklist=simpledrm_platform_driver_init
  Desktop: KDE Plasma v: 5.27.8 tk: Qt v: 5.15.10 wm: kwin_x11 vt: 2
    dm: SDDM Distro: EndeavourOS base: Arch Linux
Machine:
  Type: Desktop System: Micro-Star product: MS-7C84 v: 1.0
    serial: <superuser required>
  Mobo: Micro-Star model: MAG X570 TOMAHAWK WIFI (MS-7C84) v: 1.0
    serial: <superuser required> UEFI: American Megatrends LLC. v: 1.B0
    date: 08/11/2022
Battery:
  Device-1: nintendo_switch_controller_battery_0003:057E:2009.0001 model: N/A
    serial: N/A charge: Full status: full
CPU:
  Info: model: AMD Ryzen 9 5900X bits: 64 type: MT MCP arch: Zen 3+ gen: 4
    level: v3 note: check built: 2022 process: TSMC n6 (7nm) family: 0x19 (25)
    model-id: 0x21 (33) stepping: 0 microcode: 0xA201016
  Topology: cpus: 1x cores: 12 tpc: 2 threads: 24 smt: enabled cache:
    L1: 768 KiB desc: d-12x32 KiB; i-12x32 KiB L2: 6 MiB desc: 12x512 KiB
    L3: 64 MiB desc: 2x32 MiB
  Speed (MHz): avg: 2869 high: 4543 min/max: 2200/5160 boost: enabled
    scaling: driver: acpi-cpufreq governor: schedutil cores: 1: 3627 2: 4533
    3: 3630 4: 3619 5: 2200 6: 3617 7: 2200 8: 2200 9: 2199 10: 2206 11: 2200
    12: 2196 13: 3619 14: 4543 15: 3620 16: 2200 17: 3620 18: 3629 19: 2199
    20: 2200 21: 2200 22: 2200 23: 2200 24: 2200 bogomips: 177684
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow mitigation: safe RET, no microcode
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines, IBPB: conditional, IBRS_FW,
    STIBP: always-on, RSB filling, PBRSB-eIBRS: Not affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: AMD Navi 31 [Radeon RX 7900 XT/7900 XTX] vendor: Sapphire NITRO+
    driver: amdgpu v: kernel arch: RDNA-3 code: Navi-3x process: TSMC n5 (5nm)
    built: 2022+ pcie: gen: 4 speed: 16 GT/s lanes: 16 ports: active: DP-1
    empty: DP-2,HDMI-A-1,HDMI-A-2 bus-ID: 2f:00.0 chip-ID: 1002:744c
    class-ID: 0300
  Display: x11 server: X.Org v: 21.1.8 compositor: kwin_x11 driver: X:
    loaded: amdgpu unloaded: modesetting,radeon alternate: fbdev,vesa
    dri: radeonsi gpu: amdgpu display-ID: :0 screens: 1
  Screen-1: 0 s-res: 2560x1440 s-dpi: 96 s-size: 677x381mm (26.65x15.00")
    s-diag: 777mm (30.58")
  Monitor-1: DP-1 mapped: DisplayPort-0 model: Dell S2721DGF
    serial: <filter> built: 2021 res: 2560x1440 dpi: 109 gamma: 1.2
    size: 597x336mm (23.5x13.23") diag: 685mm (27") ratio: 16:9 modes:
    max: 2560x1440 min: 720x400
  API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi
    device: 1 drv: swrast gbm: drv: kms_swrast surfaceless: drv: radeonsi x11:
    drv: radeonsi inactive: wayland
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 23.2.1-arch1.1
    glx-v: 1.4 direct-render: yes renderer: AMD Radeon RX 7900 XTX (gfx1100
    LLVM 16.0.6 DRM 3.54 6.5.5-arch1-1) device-ID: 1002:744c memory: 23.44 GiB
    unified: no
  API: Vulkan v: 1.3.264 layers: 6 device: 0 type: discrete-gpu name: AMD
    Radeon RX 7900 XTX (RADV GFX1100) driver: mesa radv v: 23.2.1-arch1.1
    device-ID: 1002:744c surfaces: xcb,xlib
Audio:
  Device-1: AMD Navi 31 HDMI/DP Audio driver: snd_hda_intel v: kernel pcie:
    gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 2f:00.1 chip-ID: 1002:ab30
    class-ID: 0403
  Device-2: AMD Starship/Matisse HD Audio vendor: Micro-Star MSI
    driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s lanes: 16
    bus-ID: 31:00.4 chip-ID: 1022:1487 class-ID: 0403
  Device-3: Corsair VIRTUOSO XT Wireless Gaming Receiver
    driver: hid-generic,snd-usb-audio,usbhid type: USB rev: 2.0 speed: 12 Mb/s
    lanes: 1 mode: 1.1 bus-ID: 3-3:2 chip-ID: 1b1c:0a64 class-ID: 0300
    serial: <filter>
  API: ALSA v: k6.5.5-arch1-1 status: kernel-api
    tools: alsactl,alsamixer,amixer
  Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
  Server-2: PipeWire v: 0.3.80 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Realtek RTL8125 2.5GbE vendor: Micro-Star MSI driver: r8169
    v: kernel pcie: gen: 2 speed: 5 GT/s lanes: 1 port: f000 bus-ID: 26:00.0
    chip-ID: 10ec:8125 class-ID: 0200
  IF: enp38s0 state: up speed: 1000 Mbps duplex: full mac: <filter>
  Device-2: Intel Wi-Fi 6 AX200 driver: iwlwifi v: kernel pcie: gen: 2
    speed: 5 GT/s lanes: 1 bus-ID: 28:00.0 chip-ID: 8086:2723 class-ID: 0280
  IF: wlan0 state: up mac: <filter>
  IF-ID-1: vmnet1 state: unknown speed: N/A duplex: N/A mac: <filter>
  IF-ID-2: vmnet8 state: unknown speed: N/A duplex: N/A mac: <filter>
Bluetooth:
  Device-1: Intel AX200 Bluetooth driver: btusb v: 0.8 type: USB rev: 2.0
    speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 1-4:3 chip-ID: 8087:0029
    class-ID: e001
  Report: btmgmt ID: hci0 rfk-id: 0 state: down bt-service: enabled,running
    rfk-block: hardware: no software: yes address: <filter> bt-v: 5.2 lmp-v: 11
    status: discoverable: no pairing: no
Drives:
  Local Storage: total: 6.37 TiB used: 1.77 TiB (27.8%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung
    model: SSD 970 EVO Plus 1TB size: 931.51 GiB block-size: physical: 512 B
    logical: 512 B speed: 31.6 Gb/s lanes: 4 tech: SSD serial: <filter>
    fw-rev: 2B2QEXM7 temp: 52.9 C scheme: GPT
  ID-2: /dev/sda maj-min: 8:0 vendor: Samsung model: SSD 860 QVO 2TB
    size: 1.82 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    tech: SSD serial: <filter> fw-rev: 2B6Q scheme: GPT
  ID-3: /dev/sdb maj-min: 8:16 vendor: Samsung model: SSD 870 QVO 4TB
    size: 3.64 TiB block-size: physical: 512 B logical: 512 B speed: 6.0 Gb/s
    tech: SSD serial: <filter> fw-rev: 2B6Q scheme: GPT
Partition:
  ID-1: / raw-size: 931.22 GiB size: 915.53 GiB (98.32%)
    used: 725.08 GiB (79.2%) fs: ext4 dev: /dev/nvme0n1p2 maj-min: 259:2
  ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
    used: 312 KiB (0.1%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
  Alert: No swap data was found.
Sensors:
  System Temperatures: cpu: 54.1 C mobo: N/A gpu: amdgpu temp: 51.0 C
    mem: 64.0 C
  Fan Speeds (rpm): N/A gpu: amdgpu fan: 0
Info:
  Processes: 505 Uptime: 2m wakeups: 1 Memory: total: 32 GiB
  available: 31.27 GiB used: 4.79 GiB (15.3%) Init: systemd v: 254
  default: graphical tool: systemctl Compilers: gcc: 13.2.1 clang: 16.0.6
  Packages: pm: pacman pkgs: 1337 libs: 345 tools: yay Shell: Zsh v: 5.9
  running-in: konsole inxi: 3.3.30

This is the output of
cat /sys/kernel/debug/dri/0/amdgpu_pm_info

GFX Clocks and Power:
        96 MHz (MCLK)
        1733 MHz (SCLK)
        1960 MHz (PSTATE_SCLK)
        1249 MHz (PSTATE_MCLK)
        754 mV (VDDGFX)
        57.0 W (average GPU)

GPU Temperature: 54 C
GPU Load: 42 %
MEM Load: 26 %

SMC Feature Mask: 0x0003ebb871ffffff
VCN: Disabled

Clock Gating Flags Mask: 0x3bc08030d
        Graphics Fine Grain Clock Gating: On
        Graphics Medium Grain Clock Gating: On
        Graphics Medium Grain memory Light Sleep: Off
        Graphics Coarse Grain Clock Gating: On
        Graphics Coarse Grain memory Light Sleep: On
        Graphics Coarse Grain Tree Shader Clock Gating: Off
        Graphics Coarse Grain Tree Shader Light Sleep: Off
        Graphics Command Processor Light Sleep: Off
        Graphics Run List Controller Light Sleep: Off
        Graphics 3D Coarse Grain Clock Gating: Off
        Graphics 3D Coarse Grain memory Light Sleep: Off
        Memory Controller Light Sleep: On
        Memory Controller Medium Grain Clock Gating: On
        System Direct Memory Access Light Sleep: Off
        System Direct Memory Access Medium Grain Clock Gating: Off
        Bus Interface Medium Grain Clock Gating: On
        Bus Interface Light Sleep: Off
        Unified Video Decoder Medium Grain Clock Gating: Off
        Video Compression Engine Medium Grain Clock Gating: Off
        Host Data Path Light Sleep: Off
        Host Data Path Medium Grain Clock Gating: Off
        Digital Right Management Medium Grain Clock Gating: Off
        Digital Right Management Light Sleep: Off
        Rom Medium Grain Clock Gating: Off
        Data Fabric Medium Grain Clock Gating: Off
        VCN Medium Grain Clock Gating: Off
        Host Data Path Deep Sleep: Off
        Host Data Path Shutdown: On
        Interrupt Handler Clock Gating: On
        JPEG Medium Grain Clock Gating: Off
        Repeater Fine Grain Clock Gating: On
        Perfmon Clock Gating: On
        Address Translation Hub Medium Grain Clock Gating: On
        Address Translation Hub Light Sleep: On

I’m not really sure how to proceed with debugging or addressing this issue.

I’ve been reading forum threads including this one:

to try some of the easy fixes

Have you checked that it is using amd pstates?

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver

The output of this was:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
acpi-cpufreq

Does that mean my pstates are disabled?

That’s CPU only.

That’s GPU.

You are talking about different things.

That makes sense. My issue is with my GPU only (as far as I can tell). Is there a similar command that might help reveal what the problem is?

Yes. Also, I didn’t realize you were referring to the GPU only.
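
For the GPU side, a rough equivalent is to read amdgpu's sysfs files instead of the cpufreq ones. The sketch below assumes the card is card0 (it can be card1 on some systems) and that pp_dpm_mclk lists one level per line with a trailing * on the active one; active_level is only an illustrative helper name, not part of any tool.

```shell
#!/bin/sh
# Print the active DPM level from a pp_dpm_* style listing read on stdin.
# amdgpu marks the active level with a trailing "*", e.g. "0: 96Mhz *".
active_level() {
    awk '/\*/ { print $2 }'
}

# Assumption: the GPU is card0; adjust if your system enumerates it differently.
dev=/sys/class/drm/card0/device

if [ -r "$dev/pp_dpm_mclk" ]; then
    # Memory clock level the driver currently reports as active.
    printf 'active mclk: %s\n' "$(active_level < "$dev/pp_dpm_mclk")"
    # Power management mode (e.g. auto, low, high, manual).
    printf 'perf level:  %s\n' "$(cat "$dev/power_dpm_force_performance_level")"
fi
```

If the active memory clock level stays at the lowest entry while a game is running, that matches the symptom in the amdgpu_pm_info output above.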

Are there any other modules or anything else I should investigate before submitting a bug report? I’m not sure whether it’s a driver/Mesa issue or a kernel problem.

With my setup I use the following with amdgpu.

https://wiki.archlinux.org/title/AMDGPU
https://wiki.archlinux.org/title/Hardware_video_acceleration

lib32-mesa
vulkan-radeon
lib32-vulkan-radeon
libva-mesa-driver
lib32-libva-mesa-driver
mesa-vdpau
lib32-mesa-vdpau

libva-utils
vdpauinfo

After some additional fiddling, I’ve been able to work around it by reducing my monitor’s refresh rate to 144 Hz, which lets the memory clock scale up and down.
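
On an X11 session the same refresh-rate change can be made from a terminal with xrandr. This is a minimal sketch, assuming the output name DisplayPort-0 and the 2560x1440 mode shown in the inxi report above; check `xrandr --query` for the names and rates your own setup offers. The change does not persist across sessions.

```shell
#!/bin/sh
# Assumptions: X11 session; output name and mode are taken from the inxi
# report above (DP-1 mapped to DisplayPort-0, 2560x1440).
output=DisplayPort-0
mode=2560x1440
rate=144

# Only call xrandr when it is installed and an X display is available.
if command -v xrandr >/dev/null 2>&1 && [ -n "${DISPLAY:-}" ]; then
    xrandr --output "$output" --mode "$mode" --rate "$rate"
else
    echo "xrandr or DISPLAY not available; nothing changed" >&2
fi
```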

I believe this is a repeat of a kernel bug that existed in 6.3 and 6.4, similar to
https://bbs.archlinux.org/viewtopic.php?id=287990

The bug still exists after an upgrade to 6.5.7.