Need help with replacing open source AMD GPU drivers with proprietary

I’m desperate about the problems I reported previously on the forum: they persist and badly harm my user experience in many cases, wasting my time at the very least. I am now almost sure that the cause lies in the GPU drivers. I did everything the Wiki says in the Installation part, but when I get to the part “How to ensure you are using AMDGPU-PRO driver”, I of course get AMD instead of Advanced Micro Devices, Inc. I tried to google my way through this task, but whenever someone asks for help with it, the responses are mostly like “Actually most people don’t need proprietary drivers for AMD :nerd_face:”, or the problems reported are simply not related to mine to begin with.
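For reference, the wiki check mentioned here comes down to reading the OpenGL vendor string. Below is a minimal sketch of it, assuming `glxinfo` (from mesa-utils) is installed; the helper name `classify_gl_vendor` is made up for illustration, and the two vendor strings are the ones quoted in this post.

```shell
#!/bin/sh
# Map an OpenGL vendor string to the driver stack it indicates.
# classify_gl_vendor is a hypothetical helper, not part of any package.
classify_gl_vendor() {
    case "$1" in
        "Advanced Micro Devices, Inc.") echo "proprietary (AMDGPU-PRO)" ;;
        "AMD")                          echo "open source (Mesa)" ;;
        *)                              echo "unknown: $1" ;;
    esac
}

# Read the vendor string of the OpenGL driver that is active right now.
if command -v glxinfo >/dev/null 2>&1; then
    vendor=$(glxinfo 2>/dev/null | grep "OpenGL vendor string" | cut -f2 -d":" | xargs)
    classify_gl_vendor "$vendor"
else
    echo "glxinfo not found (on Arch it is in the mesa-utils package)"
fi
```

Getting AMD here does not necessarily mean the installation failed. As far as I understand it (an assumption, not something confirmed in this thread), the AUR packages leave Mesa as the default and the proprietary libraries are only used when an application is started through a wrapper, such as the vk_pro prefix mentioned further down.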

The problem I face is that the system freezes and crashes. I encounter a graphical bug where an open window breaks up into squares with empty spaces in between (you can see the desktop wallpaper through these empty squares), while audio keeps playing if any media is on. Once this bug has occurred, no commands get through except REISUB/REISUO. On top of that, I encounter minor visual bugs every now and then; these look as if the GPU were damaged or overheated, which is not the case, as everything worked properly when I tested switching the laptop to Windows 10.

Nothing helped me so far, and the only thing I positively didn’t try is switching from open-source drivers to proprietary. I’d be tremendously thankful if you’d guide me through this procedure.

What reinforced this view is that running the app that crashes this way most frequently with the vk_pro prefix, which I found here, has kinda helped so far: this particular app shows signs of improvement.

$ inxi -FGa:

System:
  Host: AsusTUF Kernel: 6.12.1-arch1-1 arch: x86_64 bits: 64 compiler: gcc
    v: 14.2.1 clocksource: tsc avail: acpi_pm
    parameters: initrd=\32322911e5a24c1eaae815dc3a803759\6.12.1-arch1-1\initrd
    nvme_load=YES nowatchdog rw
    rd.luks.uuid=43df9ab0-a5fb-4db6-9f79-4e83838c9e9b
    root=/dev/mapper/luks-43df9ab0-a5fb-4db6-9f79-4e83838c9e9b
    systemd.machine_id=32322911e5a24c1eaae815dc3a803759
  Desktop: Xfce v: 4.18.1 tk: Gtk v: 3.24.43 wm: xfwm4 v: 4.18.0
    with: xfce4-panel tools: xfce4-screensaver vt: 7 dm: LightDM v: 1.32.0
    Distro: EndeavourOS base: Arch Linux
Machine:
  Type: Laptop System: ASUSTeK product: ASUS TUF Gaming A16 FA617NS_FA617NS
    v: 1.0 serial: <superuser required>
  Mobo: ASUSTeK model: FA617NS v: 1.0 serial: <superuser required>
    uuid: <superuser required> UEFI: American Megatrends LLC. v: FA617NS.410
    date: 06/15/2023
Battery:
  ID-1: BAT0 charge: 84.1 Wh (100.0%) condition: 84.1/90.0 Wh (93.5%)
    volts: 17.1 min: 15.9 model: AS3GWYF3KC GA50358 type: Unknown serial: 08F2
    status: full
CPU:
  Info: model: AMD Ryzen 7 7735HS with Radeon Graphics bits: 64 type: MT MCP
    arch: Zen 3+ gen: 3 level: v3 note: check built: 2022 process: TSMC n6 (7nm)
    family: 0x19 (25) model-id: 0x44 (68) stepping: 1 microcode: 0xA404102
  Topology: cpus: 1x dies: 1 clusters: 1 cores: 8 threads: 16 tpc: 2
    smt: enabled cache: L1: 512 KiB desc: d-8x32 KiB; i-8x32 KiB L2: 4 MiB
    desc: 8x512 KiB L3: 16 MiB desc: 1x16 MiB
  Speed (MHz): avg: 1397 min/max: 400/4829 boost: enabled scaling:
    driver: amd-pstate-epp governor: powersave cores: 1: 1397 2: 1397 3: 1397
    4: 1397 5: 1397 6: 1397 7: 1397 8: 1397 9: 1397 10: 1397 11: 1397 12: 1397
    13: 1397 14: 1397 15: 1397 16: 1397 bogomips: 102248
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm
  Vulnerabilities:
  Type: gather_data_sampling status: Not affected
  Type: itlb_multihit status: Not affected
  Type: l1tf status: Not affected
  Type: mds status: Not affected
  Type: meltdown status: Not affected
  Type: mmio_stale_data status: Not affected
  Type: reg_file_data_sampling status: Not affected
  Type: retbleed status: Not affected
  Type: spec_rstack_overflow status: Vulnerable: Safe RET, no microcode
  Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
    prctl
  Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
    sanitization
  Type: spectre_v2 mitigation: Retpolines; IBPB: conditional; IBRS_FW;
    STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not
    affected
  Type: srbds status: Not affected
  Type: tsx_async_abort status: Not affected
Graphics:
  Device-1: Advanced Micro Devices [AMD/ATI] Navi 33 [Radeon RX 7600/7600
    XT/7600M XT/7600S/7700S / PRO W7600] vendor: ASUSTeK driver: amdgpu
    v: kernel arch: RDNA-3 code: Navi-33 built: 2023+ pcie: gen: 4
    speed: 16 GT/s lanes: 8 ports: active: none empty: DP-1, HDMI-A-1,
    Writeback-1, eDP-1 bus-ID: 03:00.0 chip-ID: 1002:7480 class-ID: 0300
  Device-2: Advanced Micro Devices [AMD/ATI] Rembrandt [Radeon 680M]
    vendor: ASUSTeK driver: amdgpu v: kernel arch: RDNA-2 code: Navi-2x
    process: TSMC n7 (7nm) built: 2020-22 pcie: gen: 4 speed: 16 GT/s
    lanes: 16 ports: active: eDP-2 empty: DP-2, DP-3, DP-4, DP-5, DP-6,
    Writeback-2 bus-ID: 77:00.0 chip-ID: 1002:1681 class-ID: 0300 temp: 44.0 C
  Device-3: Sonix USB2.0 HD UVC WebCam driver: uvcvideo type: USB rev: 2.0
    speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 3-3:3 chip-ID: 2b7e:b685
    class-ID: 0e02
  Display: x11 server: X.Org v: 21.1.14 compositor: xfwm4 v: 4.18.0 driver:
    X: loaded: amdgpu unloaded: modesetting alternate: fbdev,vesa dri: radeonsi
    gpu: amdgpu display-ID: :0.0 screens: 1
  Screen-1: 0 s-res: 1920x1200 s-dpi: 96 s-size: 508x317mm (20.00x12.48")
    s-diag: 599mm (23.57")
  Monitor-1: eDP-2 mapped: eDP-1 model: BOE Display NE160WUM-NX2 built: 2022
    res: 1920x1200 hz: 165 dpi: 141 gamma: 1.2 size: 345x215mm (13.58x8.46")
    diag: 407mm (16") ratio: 16:10 modes: max: 1920x1200 min: 640x480
  API: EGL v: 1.5 hw: drv: amd radeonsi platforms: device: 0 drv: radeonsi
    device: 1 drv: radeonsi device: 2 drv: swrast gbm: drv: kms_swrast
    surfaceless: drv: radeonsi x11: drv: radeonsi inactive: wayland
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: amd mesa v: 24.2.7-arch1.1
    glx-v: 1.4 direct-render: yes renderer: AMD Radeon 680M (radeonsi rembrandt
    LLVM 18.1.8 DRM 3.59 6.12.1-arch1-1) device-ID: 1002:1681 memory: 500 MiB
    unified: no
Audio:
  Device-1: Advanced Micro Devices [AMD/ATI] Navi 31 HDMI/DP Audio
    vendor: ASUSTeK driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s
    lanes: 8 bus-ID: 03:00.1 chip-ID: 1002:ab30 class-ID: 0403
  Device-2: Advanced Micro Devices [AMD/ATI] Rembrandt Radeon High
    Definition Audio vendor: ASUSTeK driver: snd_hda_intel v: kernel pcie:
    gen: 4 speed: 16 GT/s lanes: 16 bus-ID: 77:00.1 chip-ID: 1002:1640
    class-ID: 0403
  Device-3: Advanced Micro Devices [AMD] ACP/ACP3X/ACP6x Audio Coprocessor
    vendor: ASUSTeK driver: snd_pci_acp6x v: kernel alternate: snd_pci_acp3x,
    snd_rn_pci_acp3x, snd_pci_acp5x, snd_acp_pci, snd_rpl_pci_acp6x,
    snd_pci_ps, snd_sof_amd_renoir, snd_sof_amd_rembrandt,
    snd_sof_amd_vangogh, snd_sof_amd_acp63, snd_sof_amd_acp70 pcie: gen: 4
    speed: 16 GT/s lanes: 16 bus-ID: 77:00.5 chip-ID: 1022:15e2 class-ID: 0480
  Device-4: Advanced Micro Devices [AMD] Family 17h/19h HD Audio
    vendor: ASUSTeK driver: snd_hda_intel v: kernel pcie: gen: 4 speed: 16 GT/s
    lanes: 16 bus-ID: 77:00.6 chip-ID: 1022:15e3 class-ID: 0403
  API: ALSA v: k6.12.1-arch1-1 status: kernel-api
    tools: alsactl,alsamixer,amixer
  Server-1: sndiod v: N/A status: off tools: aucat,midicat,sndioctl
  Server-2: PipeWire v: 1.2.7 status: active with: 1: pipewire-pulse
    status: active 2: wireplumber status: active 3: pipewire-alsa type: plugin
    4: pw-jack type: plugin tools: pactl,pw-cat,pw-cli,wpctl
Network:
  Device-1: Realtek RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet
    vendor: ASUSTeK driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s
    lanes: 1 port: e000 bus-ID: 05:00.0 chip-ID: 10ec:8168 class-ID: 0200
  IF: eno1 state: down mac: e8:9c:25:80:fe:a9
  Device-2: Realtek RTL8852BE PCIe 802.11ax Wireless Network
    vendor: AzureWave driver: rtw89_8852be v: kernel pcie: gen: 1
    speed: 2.5 GT/s lanes: 1 port: d000 bus-ID: 06:00.0 chip-ID: 10ec:b852
    class-ID: 0280
  IF: wlan0 state: up mac: f8:54:f6:c1:f6:1c
  IF-ID-1: virbr0 state: down mac: 52:54:00:53:b4:26
  Info: services: NetworkManager, systemd-timesyncd, wpa_supplicant
Bluetooth:
  Device-1: IMC Networks Bluetooth Radio driver: btusb v: 0.8 type: USB
    rev: 1.0 speed: 12 Mb/s lanes: 1 mode: 1.1 bus-ID: 5-1:2 chip-ID: 13d3:3571
    class-ID: e001 serial: 00e04c000001
  Report: btmgmt ID: hci0 rfk-id: 0 state: up address: F8:54:F6:C1:F6:1D
    bt-v: 5.2 lmp-v: 11 status: discoverable: yes pairing: yes class-ID: 6c010c
Drives:
  Local Storage: total: 476.94 GiB used: 261.05 GiB (54.7%)
  SMART Message: Unable to run smartctl. Root privileges required.
  ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Western Digital model: WD PC
    SN740 SDDPNQD-512G-1002 size: 476.94 GiB block-size: physical: 512 B
    logical: 512 B speed: 63.2 Gb/s lanes: 4 tech: SSD serial: 23232P803651
    fw-rev: 73101000 temp: 37.9 C scheme: GPT
Partition:
  ID-1: / raw-size: 475.94 GiB size: 467.4 GiB (98.21%)
    used: 260.78 GiB (55.8%) fs: ext4 dev: /dev/dm-0 maj-min: 254:0
    mapped: luks-43df9ab0-a5fb-4db6-9f79-4e83838c9e9b
Swap:
  Kernel: swappiness: 60 (default) cache-pressure: 100 (default) zswap: no
  ID-1: swap-1 type: file size: 512 MiB used: 0 KiB (0.0%) priority: -2
    file: /swapfile
Sensors:
  System Temperatures: cpu: 47.4 C mobo: 43.0 C
  Fan Speeds (rpm): cpu: 0
  GPU: device: amdgpu temp: 44.0 C device: amdgpu temp: 49.0 C mem: 52.0 C
    fan: 0 watts: 14.00
Info:
  Memory: total: 16 GiB note: est. available: 14.87 GiB used: 4.99 GiB (33.6%)
  Processes: 418 Power: uptime: 55m states: freeze,mem,disk suspend: s2idle
    wakeups: 0 hibernate: platform avail: shutdown, reboot, suspend, test_resume
    image: 5.93 GiB services: power-profiles-daemon, upowerd,
    xfce4-power-manager Init: systemd v: 256 default: graphical
    tool: systemctl
  Packages: pm: pacman pkgs: 1538 libs: 485 tools: yay Compilers:
    clang: 18.1.8 gcc: 14.2.1 Shell: Bash v: 5.2.37 running-in: xfce4-terminal
    inxi: 3.3.36

The “most people don’t need proprietary AMD drivers” crowd is pretty much right: more often than not the proprietary driver is actually worse, both in performance and stability.

Did you try this? https://aur.archlinux.org/pkgbase/amdgpu-pro-installer

It’s in the first paragraph of the wiki page, with the next one explaining who amdgpu-pro is for.

Personally, I think you would be better served by trying a different desktop environment than by the proprietary drivers. People are saying you don’t need them for a reason: they’re unlikely to help you, though trying isn’t harmful.

If you are on wayland, try X, if you’re on X, try wayland.

You could also just simply try to use i3 (X) and sway (Wayland) to see if you have these issues there as well. They’re good environments for testing things like this because they don’t come bundled with any stuff that could interfere with your main configuration like many of the bigger environments (kde, gnome, xfce, etc) do.

Well, I installed all the software included within it, as it was written in the Wiki page I included in the post.

I started off with KDE, switched to Gnome, then to that minimalistic one popular among Arch users (I don’t recall the name), then settled on Xfce4; the bugs are there no matter what. I hopped distros: Ubuntu, Mint, Debian, and came back to Endeavour OS when it turned out the distro is not the cause. I tried Wayland and X. I did everything; the only OS that I had no issues with was Windows 10, which I’m definitely not gonna use just because I don’t like it. Like I said, I’m desperate, and I’m sure the issue is the drivers, and I haven’t tried the proprietary AMD drivers yet.

you have 2 amd gpus, an igpu/apu and a dedicated one, do you experience the issue on your igpu as well? the 680m is an exceptionally good igpu so you shouldn’t have problems at least running things to test them, if you don’t have the issue on the igpu it sorta implies that you have a hardware problem.

And if you’re using wine/proton it might even be a dxvk or wine issue which can be solved in other ways.

If it’s only happening on one program, the issue is probably not driver deep (or maybe it is but should have a solution without going that far).

I’m not quite sure which hardware is the source of the issues. What I know for sure is that no load on Windows could ever make such bugs appear. I loaded my GPU with the heaviest stuff I had in stock, and there was no sign of these visual bugs or system crashes. The hardware issue was the first thing I excluded.

The bugs occur even when I work within Thunar, sorting objects, managing archives etc. I kinda got used to them; they are not a big deal at this point. But using a few particular apps leads to total system failure, followed by REISUB.

Well you’re doing it right, just keep throwing solutions at the problem until something sticks, then go from there.

You could try distro hopping: cachyos or plain arch if you wanna stay in the arch sphere. Gentoo might be a decent option for troubleshooting since it allows you to very finely configure the kernel (it’s a rabbit hole though!).

Maybe fedora and ubuntu just to see.

If you wanna try that, just clear up some space on your drive (you can shrink partitions with gparted, but be wary of data loss. Do NOT use kde partition manager! I’ve lost data with that every time, but never with gparted), which would allow you to try installing things on the freed-up space without losing your current installation. Or you could replace windows since you seem determined not to use it.

First off, there are 3 newer UEFI BIOS updates. You might want to update, as they are meant to optimize system performance. The second thing is that you have a hybrid laptop and it is rendering on the iGPU, not the 7600M. You need to switch graphics to use the dedicated GPU. Are you doing that? Your output shows you are using X11 and rendering on the iGPU (680M). Have you completely set it up to use vulkan-radeon, installed the proper files, and verified that Vulkan is in fact working on the graphics? As reported, most people do not use amdgpu-pro. It does not provide better performance than the amdgpu kernel module.

https://wiki.archlinux.org/title/AMDGPU

https://wiki.archlinux.org/title/Vulkan

Edit: If you want to try the proprietary drivers that is an option but I don’t think it’s the be all end all to your issues.

I tried distro hopping. Ubuntu, Mint and Debian are off the list. However, I didn’t try plain Arch, as I was kinda afraid that it’s gonna overwhelm me with complexity. Now that you’ve suggested it, paired with a dual boot option, maybe I should give it a try.

There’s no Windows on this laptop, it was a part of my “distro hopping” strategy. I used to need Windows for certain tasks several years ago, but nowadays it’s just Endeavour OS. But I’ll see what can be done in regard of dual boot Arch. I just may have freed enough free space recently to tinker with partitions. Hopefully, gparted is safe enough, as I’d like to keep as much of the current set up intact.

I tell you hwhat, both of you are most probably right. After installing the packages from the amdgpu-pro-installer, I launched Stalker 2 to check, and the performance worsened a lot. After deleting these packages, performance went back to normal. However, now I’m expecting those graphical issues again, we’ll see.

Could you please give me a hand regarding UEFI updates? I clearly remember trying, but it seems that I failed.

And yeah, this integrated-dedicated switching too seems to be a possible issue which I tried to deal with. But, again, it seems that I failed with that as well.

Updating the UEFI Bios is easy. It also depends whether you are doing it from Windows or Linux. If doing it from Windows you download the appropriate Bios update for Windows and it will be an .exe file. Very simple.

If doing it from EndeavourOS you will use EZ Flash Utility. So you download the appropriate Bios update for EZ Flash.

Instructions here for both:

https://www.asus.com/support/faq/1008859/

Edit: As a side note: installing EOS on an AMD system is mostly automatic. It will use amdgpu unless it is a GPU that requires radeon, which needs some additional configuration. Then, to use Vulkan, you need to install vulkan-radeon, and usually also the 32-bit lib files if gaming. There is some setup to make sure that everything is working, and I provided the link for Vulkan. It’s very easy, and again you don’t want to use amdvlk. If the proper files are installed, you should get output when you run the verification:

vulkaninfo
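
The full `vulkaninfo` output is very long, so here is a minimal sketch of a shorter check. It assumes the 32-bit package meant above is `lib32-vulkan-radeon` (my reading, the post does not name it) and that `vulkaninfo` comes from vulkan-tools; `list_vulkan_devices` is a made-up helper name.

```shell
#!/bin/sh
# Check that the Mesa Vulkan driver packages are installed (Arch-based systems only).
if command -v pacman >/dev/null 2>&1; then
    for pkg in vulkan-radeon lib32-vulkan-radeon; do
        if pacman -Q "$pkg" >/dev/null 2>&1; then
            echo "$pkg: installed"
        else
            echo "$pkg: missing"
        fi
    done
fi

# Keep only the device and driver names from the Vulkan summary.
# list_vulkan_devices is a hypothetical helper; it filters text from stdin.
list_vulkan_devices() {
    grep -E 'deviceName|driverName'
}

if command -v vulkaninfo >/dev/null 2>&1; then
    vulkaninfo --summary 2>/dev/null | list_vulkan_devices
else
    echo "vulkaninfo not found (on Arch it is in the vulkan-tools package)"
fi
```

On this laptop one would expect both the dedicated GPU and the 680M to be listed; if only one shows up, or only a software renderer, the Vulkan setup is probably incomplete.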

You can also set up amdgpu for hardware acceleration and accelerated video decoding, which also requires installing some additional files and some configuration. Verification with the following commands will show the proper output if it’s working:

vainfo
vdpauinfo

https://wiki.archlinux.org/title/Hardware_video_acceleration#AMD/ATI

BIOS updates are a bit dangerous, if you screw them up, ur motherboard usually becomes a paperweight. If ur on a laptop make triple sure you are downloading the latest bios for your exact model. The method then is usually to place that bios rom on a fat32 formatted USB drive, boot into the uefi menu and update the bios from the rom through the interface (interface varies between vendors).

To check which gpu you’re using to run a game, the best way imo is to use mangohud, you can configure it to show you the active GPU. That way you can know for sure if you’re at least using the correct gpu. I’m not sure how switching is done between amd igpu and dgpu, i have an nvidia dgpu.

You see, I generally don’t have issues with gaming. Funnily enough, it’s more usual for me to get crashes while using Firefox than while gaming. That’s why I might’ve sounded frustrated while trying to describe the situation.

Thanks for the advice about formatting the drive. I’ve got exFAT USB here, should format it to FAT32 first, I suppose.

Much obliged, on it :handshake:

if ur bios can read exfat then exfat is fine, but it’s not guaranteed that it can, it’s a tossup. fat32 on the other hand is the most portable filesystem in existence, almost any os (including pretty much all bioses, uefi or not) can read fat32.

Besides this portability it’s not a very good filesystem though. exfat is less portable since not everything can read it (still highly portable though), but it supports bigger files and some other niceties.
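
If the stick does need reformatting, the steps can be sketched as below. This is a dry run that only prints the commands; `/dev/sdX1` is a placeholder (an assumption, not your real device), and `mkfs.fat` comes from the dosfstools package. Formatting erases everything on the partition, so check the `lsblk` output before running anything for real.

```shell
#!/bin/sh
# Print (not run) the commands to put a FAT32 filesystem on a USB partition.
# DEVICE is a placeholder: look it up with lsblk and replace it first.
DEVICE="${DEVICE:-/dev/sdX1}"

# fat32_commands is a hypothetical helper that echoes the steps for one partition.
fat32_commands() {
    echo "lsblk -o NAME,SIZE,FSTYPE,LABEL,MOUNTPOINT"   # identify the USB stick
    echo "sudo umount $1"                               # make sure it is not mounted
    echo "sudo mkfs.fat -F 32 $1"                       # create the FAT32 filesystem
}

fat32_commands "$DEVICE"
```

Printing instead of executing is deliberate: picking the wrong device here would wipe a different disk.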

Reporting. BIOS updated successfully.

Vulkan packages for AMD installed. If I’m not mistaken, at least vulkan-radeon was indeed missing. Need a few days for observation to confirm that one of these helped. Gonna check how much uptime the computer’s gonna last after these too. For now I haven’t seen those glitches yet.

Okay, so the thing I’ve noticed so far is that when I set the iGPU as default through MESA_VK_DEVICE_SELECT, bugs are less likely to appear and no crashes have occurred so far. However, the performance of tasks such as games is poor. When I select the dGPU through this variable, game performance is good, but the bugs are back and crashes become possible as well. It seems that the main source of the issue is that the dGPU and iGPU conflict in one way or another. I’m 100% sure that it is not a hardware issue, as, once again, the experience on Windows was smooth.
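
For anyone following along, the selection described above can be sketched like this. The PCI IDs are taken from the chip-ID fields of the inxi output at the top of the thread; `run_on_gpu` is a made-up helper name, and the behaviour of the variable is as I understand it from the Mesa documentation (it takes a `vendor:device` pair, and `list` prints the selectable devices).

```shell
#!/bin/sh
# PCI IDs from the chip-ID fields of the inxi output earlier in the thread.
DGPU_ID="1002:7480"   # Navi 33, the dedicated GPU
IGPU_ID="1002:1681"   # Rembrandt Radeon 680M, the integrated GPU

# Run a command with the given GPU set as the default Vulkan device.
# run_on_gpu is a hypothetical helper name.
run_on_gpu() {
    gpu_id=$1
    shift
    MESA_VK_DEVICE_SELECT="$gpu_id" "$@"
}

# Ask the Mesa device-select layer which devices it can see.
if command -v vulkaninfo >/dev/null 2>&1; then
    MESA_VK_DEVICE_SELECT=list vulkaninfo --summary 2>&1 | head -n 20
fi

# Demonstration: the variable reaches the child process.
run_on_gpu "$DGPU_ID" sh -c 'echo "child sees MESA_VK_DEVICE_SELECT=$MESA_VK_DEVICE_SELECT"'
```

In Steam the equivalent would be a launch option such as `MESA_VK_DEVICE_SELECT=1002:7480 %command%`. Note that this only affects Vulkan applications; OpenGL programs are steered with DRI_PRIME instead.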

Can you disable your igpu? in the bios I mean? If your laptop is multiplexed you should be able to disable it somehow, those are becoming more common these days (if it’s not multiplexed your dgpu depends on your igpu to display images).

It’d destroy your battery life maybe to use it that way but if it’s an option it’s worth a try to see if it fixes this.

Also, consider trying to actually get to the root of the problem through logs (stdout, journalctl and dmesg). If you can accurately track down which error messages are related to your issues, you could try reporting them as bugs for either mesa or the kernel.
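
A minimal sketch of where to start with the logs after a crash and REISUB. It assumes the journal is persistent so that the previous boot is still available; the grep pattern is only my guess at what is relevant, and `filter_gpu_lines` is a made-up helper name.

```shell
#!/bin/sh
# Keep only kernel log lines that mention the GPU driver, a reset or a timeout.
# filter_gpu_lines is a hypothetical helper; the pattern is an assumption.
filter_gpu_lines() {
    grep -iE 'amdgpu|drm|reset|timeout'
}

if command -v journalctl >/dev/null 2>&1; then
    # -k: kernel messages only; -b -1: the previous boot (the one that crashed).
    journalctl -k -b -1 --no-pager 2>/dev/null | filter_gpu_lines | tail -n 50
else
    echo "journalctl not found"
fi
```

The last lines before the freeze are usually the interesting ones, and they are what a mesa or kernel bug report would need.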

Do you use Asusctl or Asus Linux ?

Furthermore, there are some hints on the Arch Wiki concerning the ASUS TUF Gaming A16 you are using.

  • suspend-to-RAM is broken on BIOS versions 311-313. The latest working BIOS version is 309, but it breaks suspend-to-RAM on Windows.

  • For Steam gaming start games with DRI_PRIME=1, for gaming and desktop graphic glitch issues see AMDGPU#Tear free rendering

  • If you experience green artifacts on eDP display, disable PanelSelfRefresh2 with the following kernel parameter amdgpu.dcdebugmask=0x200

I had no idea these existed. I installed them and am currently exploring what they are capable of.

It used to be one of the first workarounds I tried back in the day. The game doesn’t even start under the DRI_PRIME=1 command.

I googled how to do this one, and I wrote this line down in /etc/default/grub. This file didn’t exist before I created it with nano, so I’m not sure if it does anything whatsoever.

In general, some of these seem to have helped. Gonna give it a few days of testing before confirming anything, but I don’t see anything annoying so far.

I cannot believe I didn’t find this part PARTICULARLY about my model of laptop, holy Jesus. I honestly searched for Linux related issues regarding my model, and I couldn’t find this. Thank you. Let’s see what we’ve got.

You’ll need vulkan-mesa-layers to be installed, as well as lib32-vulkan-mesa-layers for 32-bit applications. Check whether these are installed. DRI_PRIME=1 should be entered in the launch options of the game, followed by %command%. So DRI_PRIME=1 %command% should be the complete launch option for the Steam game.

Alternatively, you can try DRI_PRIME=1! %command% as a launch option, which should force the dGPU to be used instead of the iGPU, if I’m not mistaken.
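
Before putting it into a launch option, it may be worth checking outside Steam what DRI_PRIME actually selects. A minimal sketch, assuming `glxinfo` from mesa-utils is installed; `renderer_for_prime` is a made-up helper name.

```shell
#!/bin/sh
# Print the OpenGL renderer line for a given DRI_PRIME value.
# renderer_for_prime is a hypothetical helper.
renderer_for_prime() {
    DRI_PRIME="$1" glxinfo -B 2>/dev/null | grep "OpenGL renderer"
}

if command -v glxinfo >/dev/null 2>&1; then
    echo "DRI_PRIME=0: $(renderer_for_prime 0)"
    echo "DRI_PRIME=1: $(renderer_for_prime 1)"
else
    echo "glxinfo not found (on Arch it is in the mesa-utils package)"
fi
```

The inxi output at the top of the thread shows the Radeon 680M as the default renderer, so the second line should name a different GPU if the switch works. If both lines are the same, the problem lies with the offloading itself rather than with the game.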

Did you even have green artifacts on your laptop’s built-in screen? eDP is the embedded DisplayPort that the internal display is connected to. If you didn’t see any green artifacts on it, you won’t need to do that or the following steps, which I’ll describe anyway.

Kernel parameters could be added to the kernel cmdline via the dracut drop-in file.

sudo nano /etc/dracut.conf.d/cmdline.conf

The file may not exist, create it and add the following contents

# /etc/dracut.conf.d/cmdline.conf
kernel_cmdline+=" amdgpu.dcdebugmask=0x200 "

Keep the spaces at the front and at the end, as dracut would complain otherwise.

I’m not certain if it is strictly required, but I usually run sudo dracut-rebuild when I’m adding kernel parameters for good measure.
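
Whichever file the parameter ends up in (the dracut drop-in above or the /etc/default/grub mentioned earlier), the running kernel reports what it was actually booted with, so the result can be checked after a reboot. A minimal sketch; `has_kernel_param` is a made-up helper name.

```shell
#!/bin/sh
# Succeed if the kernel command line read from stdin contains the exact parameter.
# has_kernel_param is a hypothetical helper.
has_kernel_param() {
    tr ' ' '\n' | grep -qxF -- "$1"
}

param="amdgpu.dcdebugmask=0x200"
if [ -r /proc/cmdline ]; then
    if has_kernel_param "$param" < /proc/cmdline; then
        echo "$param is active"
    else
        echo "$param is NOT on the running kernel's command line"
    fi
fi
```

The `parameters:` line of the inxi output at the top of the thread shows the same information. It starts with `initrd=\32322911e5a24c1eaae815dc3a803759\...`, which looks like a systemd-boot entry rather than GRUB; if that is right, a hand-created /etc/default/grub would not be read by anything. That is an inference from the output, not something confirmed in the thread.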