Kernel update 6.15.5-arch1-1 caught me: No login, extra graphics card appeared

Now the 6.15.5-arch1-1 update caught me cold: Laptop stuck at login, screen appears (sometimes dimmed) but mouse & system stuck and getting hot like hell.

There is an “AMD Radeon” sticker on this thingy, but I never saw anything but the integrated Intel graphics. After the last update, a never-before-seen AMD graphics card appears:

  Device-2: Advanced Micro Devices [AMD/ATI] Venus PRO [Radeon HD 8850M /
    R9 M265X] vendor: Toshiba driver: N/A alternate: radeon, amdgpu
    arch: GCN-1 code: Southern Islands process: TSMC 28nm built: 2011-20 pcie:
    gen: 3 speed: 8 GT/s lanes: 4 link-max: lanes: 16 bus-ID: 01:00.0
    chip-ID: 1002:6823 class-ID: 0380

Unfortunately, the laptop’s BIOS has no switch to enable/disable either graphics card.

I tried to edit the kernel line on boot, adding module_blacklist=amdgpu,radeon which brought the system up, and it’s now cool again, fan almost off. And works.
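To double-check that a boot really picked up the parameter, something like the following might help. It is only a sketch: `blacklisted_modules` is a helper name made up for this post, and it simply reads `/proc/cmdline` (or any file passed as the first argument) and prints whatever is listed after `module_blacklist=`.

```shell
# Print the module names given in a module_blacklist= kernel parameter.
# Reads /proc/cmdline by default; another file can be passed for testing.
# (blacklisted_modules is a hypothetical helper, not an existing tool.)
blacklisted_modules() {
    local cmdline_file="${1:-/proc/cmdline}"
    # One parameter per line, keep only the blacklist value,
    # then split its comma-separated list.
    tr ' ' '\n' < "$cmdline_file" 2>/dev/null \
        | sed -n 's/^module_blacklist=//p' \
        | tr ',' '\n'
}

blacklisted_modules
```

On a boot that used the edited kernel line, this should list amdgpu and radeon; no output means the parameter was not on the command line.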

Now why does booting bring up radeon-related errors, and why does inxi suddenly show the AMD card (which it never did before, neither on Arcolinux nor on EOS)?

How can I disable that obscure graphics card correctly? I mean it worked fine before, using only the Intel, and I’m not gaming.

For completeness, the current inxi -Farz (system being booted manually using the above module_blacklist=amdgpu,radeon):

https://0x0.st/8GQe.txt

Current lsmod:

https://0x0.st/8GQL.txt

Current lspci (Radeon appears here too, didn’t ever before):

https://0x0.st/8GQp.txt

That is a pretty obscure GPU you have there. :slight_smile:

I am not sure if that is a bug or a fix in the latest kernel. amdgpu supports GCN 1 (Southern Islands) if you explicitly enable it. Venus is also GCN 1, but I am not sure whether amdgpu actually supports it.

So it is possible that something was fixed, enabling the proper detection of your GPU. Alternatively, it is possible something is broken, causing the improper detection of your GPU. :sweat_smile:

Either way, blacklisting it is one way to “make it go away”. From a purely academic standpoint, I would wonder if enabling SI support would make it work properly or not.

I tried the following, which seems to work, but is that the correct way?

  1. Created a new file blacklist-amd-radeon.conf:
    blacklist amdgpu
    blacklist radeon
    
  2. Ran
    sudo reinstall-kernels
    
  3. Rebooted. Seems to work…
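One way to confirm the steps above took effect after the reboot is to look at what is actually loaded. A minimal sketch, assuming `/proc/modules` is readable (it holds the same data `lsmod` prints); `loaded_modules` is a name invented for this example. Keep in mind that a `blacklist` line in modprobe.d only stops automatic loading by alias, so a module could still be loaded explicitly, whereas `module_blacklist=` on the kernel line refuses it altogether.

```shell
# Print which of the named modules appear in a /proc/modules-style file.
# Usage: loaded_modules FILE MODULE...
# (loaded_modules is a hypothetical helper for illustration.)
loaded_modules() {
    local modules_file="$1"
    shift
    local name
    for name in "$@"; do
        # The module name is the first space-separated column.
        if grep -q "^${name} " "$modules_file" 2>/dev/null; then
            printf '%s\n' "$name"
        fi
    done
}

loaded="$(loaded_modules /proc/modules amdgpu radeon)"
if [ -z "$loaded" ]; then
    echo "neither amdgpu nor radeon is loaded"
else
    echo "still loaded: $loaded"
fi
```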

Personally, I would probably add it to /etc/kernel/cmdline and then run sudo reinstall-kernels, but as long as it works, I guess either way is fine.
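For the /etc/kernel/cmdline route, the edit could be sketched roughly like this. Assumptions: the file is a single line of space-separated parameters, `add_cmdline_param` is a made-up helper, and the `quiet rw` contents are just a placeholder. The demo deliberately works on a scratch file rather than the real one.

```shell
# Append a parameter to a single-line kernel command line file,
# unless it is already there. Usage: add_cmdline_param FILE PARAM
# (add_cmdline_param is a hypothetical helper for illustration.)
add_cmdline_param() {
    local file="$1" param="$2"
    local current
    # Read the file as one line and drop trailing whitespace.
    current="$(tr '\n' ' ' < "$file" | sed 's/[[:space:]]*$//')"
    case " $current " in
        *" $param "*) return 0 ;;   # already present, nothing to do
    esac
    printf '%s %s\n' "$current" "$param" > "$file"
}

# Demo on a scratch copy with placeholder contents.
scratch="$(mktemp)"
echo 'quiet rw' > "$scratch"
add_cmdline_param "$scratch" 'module_blacklist=amdgpu,radeon'
cat "$scratch"
rm -f "$scratch"
```

On the real system the same edit would be applied to /etc/kernel/cmdline as root, followed by sudo reinstall-kernels as mentioned above.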

You mean, out of academic interest, set

radeon.si_support=0 amdgpu.si_support=1

?
:thinking:

Yes.

Let’s try. Can’t break it more than I did before… :rofl:
BRB

So I quickly modified my (ex-) /etc/modprobe.d/blacklist-amd-radeon.conf:

# blacklist amdgpu
# blacklist radeon

# Toshiba L70-B Southern Islands (GCN 1)
options radeon si_support=0
options amdgpu si_support=1

Did a sudo reinstall-kernels, dracut didn’t error.

Rebooted.
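To see whether the options were actually picked up, the values can usually be read back from sysfs. A sketch under assumptions: the modules are loaded and the running kernel exposes `si_support` under `/sys/module/<module>/parameters/`, which may depend on how the kernel was built. `show_si_support` is a name invented here.

```shell
# Print the si_support value for radeon and amdgpu as seen in sysfs.
# An alternative root directory can be passed for testing.
# (show_si_support is a hypothetical helper for illustration.)
show_si_support() {
    local root="${1:-/sys/module}"
    local mod param
    for mod in radeon amdgpu; do
        param="$root/$mod/parameters/si_support"
        if [ -r "$param" ]; then
            printf '%s.si_support=%s\n' "$mod" "$(cat "$param")"
        else
            # Module not loaded, or the parameter is not exposed.
            printf '%s.si_support: not available\n' "$mod"
        fi
    done
}

show_si_support
```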

Some odd errors in dmesg: https://0x0.st/8GQ3.txt

inxi -Gaz now says:

Graphics:
  Device-1: Intel 4th Gen Core Processor Integrated Graphics vendor: Toshiba
    driver: i915 v: kernel arch: Gen-7.5 process: Intel 22nm built: 2013 ports:
    active: eDP-1 empty: HDMI-A-1,VGA-1 bus-ID: 00:02.0 chip-ID: 8086:0416
    class-ID: 0300
  Device-2: Advanced Micro Devices [AMD/ATI] Venus PRO [Radeon HD 8850M /
    R9 M265X] vendor: Toshiba driver: amdgpu v: kernel alternate: radeon
    arch: GCN-1 code: Southern Islands process: TSMC 28nm built: 2011-20 pcie:
    gen: 3 speed: 8 GT/s lanes: 4 link-max: lanes: 16 bus-ID: 01:00.0
    chip-ID: 1002:6823 class-ID: 0380
  Device-3: Chicony TOSHIBA Web Camera - HD driver: uvcvideo type: USB
    rev: 2.0 speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 2-1.3:4
    chip-ID: 04f2:b448 class-ID: 0e02
  Display: x11 server: X.Org v: 21.1.18 with: Xwayland v: 24.1.8 driver: X:
    loaded: modesetting alternate: fbdev,intel,vesa dri: crocus gpu: i915
    display-ID: :0 screens: 1
  Screen-1: 0 s-res: 1600x900 s-dpi: 96 s-size: 423x238mm (16.65x9.37")
    s-diag: 485mm (19.11")
  Monitor-1: eDP-1 model: LG Display 0x0396 built: 2012 res: mode: 1600x900
    hz: 60 scale: 100% (1) dpi: 106 gamma: 1.2 size: 382x215mm (15.04x8.46")
    diag: 438mm (17.3") ratio: 16:9 modes: 1600x900
  API: EGL v: 1.5 hw: drv: intel crocus platforms: device: 0 drv: crocus
    device: 1 drv: swrast gbm: drv: crocus surfaceless: drv: crocus x11:
    drv: crocus inactive: wayland
  API: OpenGL v: 4.6 compat-v: 4.5 vendor: intel mesa v: 25.1.5-arch1.1
    glx-v: 1.4 direct-render: yes renderer: Mesa Intel HD Graphics 4600 (HSW
    GT2) device-ID: 8086:0416 memory: 1.46 GiB unified: yes
  Info: Tools: api: eglinfo,glxinfo x11: xdpyinfo, xprop, xrandr

System appears to run, but I’m a little unsure if it’s in a good state… and how can I find which GPU it is actually using?
I mean, having 2 GPUs for one LCD screen seems a bit overkill…

It should be using the Intel GPU unless you explicitly tell it to use the AMD GPU.

That being said, unless you actually want to use the AMD GPU, you are just wasting power by enabling it. If you were happy with the system the way it was before, you might as well blacklist it again.

glxinfo | grep 'renderer string'
DRI_PRIME=1 glxinfo | grep 'renderer string'

If that ‘works’, you could go about testing some things using the dGPU.

I don’t know if you do anything graphically intensive, gaming for example. If you use Steam, you might try running something with the game launch options

DRI_PRIME=1 %command%

But I don’t know if any of this is applicable to you.

[matthias@toshi-mch modprobe.d]$ glxinfo | grep 'renderer string'
OpenGL renderer string: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
[matthias@toshi-mch modprobe.d]$ DRI_PRIME=1 flxinfo | grep 'renderer string'
bash: flxinfo: command not found
[matthias@toshi-mch modprobe.d]$ DRI_PRIME=1 glxinfo | grep 'renderer string'
OpenGL renderer string: Mesa Intel(R) HD Graphics 4600 (HSW GT2)
[matthias@toshi-mch modprobe.d]$ 
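Both commands reporting the Intel renderer could mean the AMD card never became usable as a second render device, though that is only a guess. One way to look into it is to list the DRM render nodes and the kernel driver behind each. This is a sketch: `list_render_nodes` is a made-up helper, and it assumes the usual `/sys/class/drm/renderD*` layout.

```shell
# List DRM render nodes together with the kernel driver bound to each.
# An alternative root directory can be passed for testing.
# (list_render_nodes is a hypothetical helper for illustration.)
list_render_nodes() {
    local root="${1:-/sys/class/drm}"
    local node driver
    for node in "$root"/renderD*; do
        [ -e "$node" ] || continue      # glob matched nothing
        if [ -L "$node/device/driver" ]; then
            # The symlink target ends in the driver name, e.g. .../i915
            driver="$(basename "$(readlink "$node/device/driver")")"
        else
            driver="none"
        fi
        printf '%s %s\n' "$(basename "$node")" "$driver"
    done
}

list_render_nodes
```

If only one node shows up, DRI_PRIME=1 has nothing else to switch to, which would fit the output above.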

No gaming here whatsoever :wink:

I wonder about the many dmesg errors, surely a bad sign?

My tendency would be to blacklist it again, I’m thinking… Far fewer boot errors then.

Hope it’s not needed for the HDMI port. Then again, before it worked, with just the i915 stuff. Hm.

It isn’t likely. Laptops where the ports are connected directly to the discrete GPU are uncommon.

Could any of you make sense of the dmesg errors?

Maybe really better to blacklist it again? (Could also make for a little more battery life—it only holds 2½–3 hours anyway.)

I only wonder why inxi shows it suddenly (and never before), even when the modules are blacklisted?
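This does not explain why the card went unnoticed before, but it may show why a blacklist cannot hide it: the blacklist only keeps a driver from loading, while the device itself is still enumerated on the PCI bus, which is what tools like lspci and inxi report (hence the `driver: N/A` in the inxi output). A sketch that lists display-class PCI devices straight from sysfs; `list_display_devices` is a name made up for this post.

```shell
# List PCI devices whose class starts with 0x03 (display controllers),
# with the bound kernel driver, if any.
# An alternative root directory can be passed for testing.
# (list_display_devices is a hypothetical helper for illustration.)
list_display_devices() {
    local root="${1:-/sys/bus/pci/devices}"
    local dev class driver
    for dev in "$root"/*; do
        [ -r "$dev/class" ] || continue
        class="$(cat "$dev/class")"
        case "$class" in
            0x03*) ;;                   # display controller, keep it
            *) continue ;;
        esac
        if [ -L "$dev/driver" ]; then
            driver="$(basename "$(readlink "$dev/driver")")"
        else
            driver="none"               # device present, no driver bound
        fi
        printf '%s class=%s driver=%s\n' "$(basename "$dev")" "$class" "$driver"
    done
}

list_display_devices
```

With the modules blacklisted, the AMD card at 01:00.0 would be expected to appear here with no driver bound, next to the Intel GPU with i915.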

Well, I went for the blacklist option again, for the time being. Need this laptop to work, and that seems to generate much fewer error messages on boot.

Still new ones, all regarding ACPI. Mysterious.

$ sudo dmesg | grep -iE "acpi.*(warn|error|abort)"
[    0.200138] ACPI Warning: Time parameter 255 us > 100 us violating ACPI spec, please fix the firmware. (20240827/exsystem-142)
[    0.224029] acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_ERROR)
[   18.492605] ACPI Warning: SystemIO range 0x0000000000001828-0x000000000000182F conflicts with OpRegion 0x0000000000001800-0x000000000000187F (\PMIO) (20240827/utaddress-204)
[   18.492617] ACPI Warning: SystemIO range 0x0000000000000840-0x000000000000084F conflicts with OpRegion 0x0000000000000800-0x000000000000085F (\_SB.PCI0.PEG0.PEGP.GPIO) (20240827/utaddress-204)
[   18.492623] ACPI Warning: SystemIO range 0x0000000000000840-0x000000000000084F conflicts with OpRegion 0x0000000000000800-0x0000000000000BFF (\GPR) (20240827/utaddress-204)
[   18.492630] ACPI Warning: SystemIO range 0x0000000000000830-0x000000000000083F conflicts with OpRegion 0x0000000000000800-0x000000000000085F (\_SB.PCI0.PEG0.PEGP.GPIO) (20240827/utaddress-204)
[   18.492635] ACPI Warning: SystemIO range 0x0000000000000830-0x000000000000083F conflicts with OpRegion 0x0000000000000800-0x000000000000083F (\GPRL) (20240827/utaddress-204)
[   18.492640] ACPI Warning: SystemIO range 0x0000000000000830-0x000000000000083F conflicts with OpRegion 0x0000000000000800-0x0000000000000BFF (\GPR) (20240827/utaddress-204)
[   18.492646] ACPI Warning: SystemIO range 0x0000000000000800-0x000000000000082F conflicts with OpRegion 0x0000000000000800-0x000000000000085F (\_SB.PCI0.PEG0.PEGP.GPIO) (20240827/utaddress-204)
[   18.492651] ACPI Warning: SystemIO range 0x0000000000000800-0x000000000000082F conflicts with OpRegion 0x0000000000000800-0x000000000000083F (\GPRL) (20240827/utaddress-204)
[   18.492656] ACPI Warning: SystemIO range 0x0000000000000800-0x000000000000082F conflicts with OpRegion 0x0000000000000800-0x0000000000000BFF (\GPR) (20240827/utaddress-204)
[   18.492661] ACPI Warning: SystemIO range 0x0000000000000800-0x000000000000082F conflicts with OpRegion 0x0000000000000810-0x0000000000000813 (\IO_D) (20240827/utaddress-204)
[   19.442839] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.GFX0.DD02._BCL], AE_NOT_FOUND (20240827/psargs-332)
[   19.442850] ACPI Error: Aborting method \_SB.PCI0.PEG0.PEGP.DD02._BCL due to previous error (AE_NOT_FOUND) (20240827/psparse-529)

Thanks for your help anyway!