Cannot boot using nvidia-dkms, have to install nvidia

Hmm, this log is gone now. Let me try installing this version again.

And the system hard-locked. I’ll see if this provides something useful:

DKMS make.log for nvidia-545.29.02 for kernel 6.6.3-arch1-1 (x86_64)
Sun 03 Dec 2023 08:45:45 AM PST
make[1]: Entering directory '/usr/lib/modules/6.6.3-arch1-1/build'
warning: the compiler differs from the one used to build the kernel
  The kernel was built by: gcc (GCC) 13.2.1 20230801
  You are using:           cc (GCC) 13.2.1 20230801
  SYMLINK /var/lib/dkms/nvidia/545.29.02/build/nvidia/nv-kernel.o
  SYMLINK /var/lib/dkms/nvidia/545.29.02/build/nvidia-modeset/nv-modeset-kernel.o
 CONFTEST: hash__remap_4k_pfn
 CONFTEST: set_pages_uc
 CONFTEST: list_is_first
 CONFTEST: set_memory_uc
 CONFTEST: set_memory_array_uc
 CONFTEST: set_pages_array_uc
 CONFTEST: ioremap_cache
 CONFTEST: ioremap_wc
 CONFTEST: ioremap_driver_hardened
 CONFTEST: ioremap_driver_hardened_wc
 CONFTEST: ioremap_cache_shared
 CONFTEST: pci_get_domain_bus_and_slot
 CONFTEST: get_num_physpages
 CONFTEST: pde_data
 CONFTEST: xen_ioemu_inject_msi
 CONFTEST: phys_to_dma
 CONFTEST: get_dma_ops
 CONFTEST: dma_attr_macros
 CONFTEST: dma_map_page_attrs
 CONFTEST: write_cr4
 CONFTEST: of_find_node_by_phandle
 CONFTEST: of_node_to_nid
 CONFTEST: pnv_pci_get_npu_dev
 CONFTEST: of_get_ibm_chip_id
 CONFTEST: pci_bus_address
 CONFTEST: pci_stop_and_remove_bus_device
 CONFTEST: pci_rebar_get_possible_sizes
 CONFTEST: wait_for_random_bytes
 CONFTEST: register_cpu_notifier
 CONFTEST: cpuhp_setup_state
 CONFTEST: dma_map_resource
 CONFTEST: get_backlight_device_by_name
 CONFTEST: timer_setup
 CONFTEST: pci_enable_msix_range
 CONFTEST: kernel_read_has_pointer_pos_arg
 CONFTEST: kernel_write_has_pointer_pos_arg
 CONFTEST: dma_direct_map_resource
 CONFTEST: tegra_get_platform
 CONFTEST: tegra_bpmp_send_receive
 CONFTEST: flush_cache_all
 CONFTEST: vmf_insert_pfn
 CONFTEST: jiffies_to_timespec
 CONFTEST: ktime_get_raw_ts64
 CONFTEST: ktime_get_real_ts64
 CONFTEST: full_name_hash
 CONFTEST: pci_enable_atomic_ops_to_root
 CONFTEST: vga_tryget
 CONFTEST: cc_platform_has
 CONFTEST: seq_read_iter
 CONFTEST: unsafe_follow_pfn
 CONFTEST: drm_gem_object_get
 CONFTEST: drm_gem_object_put_unlocked
 CONFTEST: add_memory_driver_managed
 CONFTEST: device_property_read_u64
 CONFTEST: devm_of_platform_populate
 CONFTEST: of_dma_configure
 CONFTEST: of_property_count_elems_of_size
 CONFTEST: of_property_read_variable_u8_array
 CONFTEST: of_property_read_variable_u32_array
 CONFTEST: i2c_new_client_device
 CONFTEST: i2c_unregister_device
 CONFTEST: of_get_named_gpio
 CONFTEST: devm_gpio_request_one
 CONFTEST: gpio_direction_input
 CONFTEST: gpio_direction_output
 CONFTEST: gpio_get_value
 CONFTEST: gpio_set_value
 CONFTEST: gpio_to_irq
 CONFTEST: icc_get
 CONFTEST: icc_put
 CONFTEST: icc_set_bw
 CONFTEST: dma_buf_export_args
 CONFTEST: dma_buf_ops_has_kmap
 CONFTEST: dma_buf_ops_has_kmap_atomic
 CONFTEST: dma_buf_ops_has_map_atomic
 CONFTEST: dma_buf_ops_has_map
 CONFTEST: dma_buf_has_dynamic_attachment
 CONFTEST: dma_buf_attachment_has_peer2peer
 CONFTEST: dma_set_mask_and_coherent
 CONFTEST: devm_clk_bulk_get_all
 CONFTEST: get_task_ioprio
 CONFTEST: mdev_set_iommu_device
 CONFTEST: offline_and_remove_memory
 CONFTEST: wait_on_bit_lock_argument_count
 CONFTEST: radix_tree_empty
 CONFTEST: radix_tree_replace_slot
 CONFTEST: pnv_npu2_init_context
 CONFTEST: cpumask_of_node
 CONFTEST: ioasid_get
 CONFTEST: mm_pasid_drop
 CONFTEST: migrate_vma_setup
 CONFTEST: mmget_not_zero
 CONFTEST: mmgrab
 CONFTEST: iommu_sva_bind_device_has_drvdata_arg
 CONFTEST: vm_fault_to_errno
 CONFTEST: find_next_bit_wrap
 CONFTEST: acpi_video_backlight_use_native
 CONFTEST: drm_dev_unref
 CONFTEST: drm_reinit_primary_mode_group
 CONFTEST: get_user_pages_remote
 CONFTEST: get_user_pages
 CONFTEST: pin_user_pages_remote
 CONFTEST: pin_user_pages
 CONFTEST: drm_gem_object_lookup
 CONFTEST: drm_atomic_state_ref_counting
 CONFTEST: drm_driver_has_gem_prime_res_obj
 CONFTEST: drm_atomic_helper_connector_dpms
 CONFTEST: drm_connector_funcs_have_mode_in_name
 CONFTEST: drm_connector_has_vrr_capable_property
 CONFTEST: drm_framebuffer_get
 CONFTEST: drm_dev_put
 CONFTEST: drm_format_num_planes
 CONFTEST: drm_connector_for_each_possible_encoder
 CONFTEST: drm_rotation_available
 CONFTEST: drm_vma_offset_exact_lookup_locked
 CONFTEST: nvhost_dma_fence_unpack
 CONFTEST: dma_fence_set_error
 CONFTEST: fence_set_error
 CONFTEST: sync_file_get_fence
 CONFTEST: drm_aperture_remove_conflicting_pci_framebuffers
 CONFTEST: drm_fbdev_generic_setup
 CONFTEST: drm_connector_attach_hdr_output_metadata_property
 CONFTEST: drm_helper_crtc_enable_color_mgmt
 CONFTEST: is_export_symbol_gpl_of_node_to_nid
 CONFTEST: drm_crtc_enable_color_mgmt
 CONFTEST: drm_atomic_helper_legacy_gamma_set
 CONFTEST: is_export_symbol_gpl_sme_active
 CONFTEST: is_export_symbol_present_swiotlb_map_sg_attrs
 CONFTEST: is_export_symbol_present_swiotlb_dma_ops
 CONFTEST: is_export_symbol_present___close_fd
 CONFTEST: is_export_symbol_present_close_fd
 CONFTEST: is_export_symbol_present_get_unused_fd
 CONFTEST: is_export_symbol_present_get_unused_fd_flags
 CONFTEST: is_export_symbol_present_nvhost_get_default_device
 CONFTEST: is_export_symbol_present_nvhost_syncpt_unit_interface_get_byte_offset
 CONFTEST: is_export_symbol_present_nvhost_syncpt_unit_interface_get_aperture
 CONFTEST: is_export_symbol_present_tegra_dce_register_ipc_client
 CONFTEST: is_export_symbol_present_tegra_dce_unregister_ipc_client
 CONFTEST: is_export_symbol_present_tegra_dce_client_ipc_send_recv
 CONFTEST: is_export_symbol_present_dram_clk_to_mc_clk
 CONFTEST: is_export_symbol_present_get_dram_num_channels
 CONFTEST: is_export_symbol_present_tegra_dram_types
 CONFTEST: is_export_symbol_present_pxm_to_node
 CONFTEST: is_export_symbol_present_screen_info
 CONFTEST: is_export_symbol_present_i2c_bus_status
 CONFTEST: is_export_symbol_present_tegra_fuse_control_read
 CONFTEST: is_export_symbol_present_tegra_get_platform
 CONFTEST: is_export_symbol_present_pci_find_host_bridge
 CONFTEST: is_export_symbol_present_tsec_comms_set_init_cb
 CONFTEST: is_export_symbol_present_tsec_comms_send_cmd
 CONFTEST: is_export_symbol_present_tsec_comms_clear_init_cb
 CONFTEST: is_export_symbol_present_tsec_comms_alloc_mem_from_gscco
 CONFTEST: is_export_symbol_present_tsec_comms_free_gscco_mem
 CONFTEST: is_export_symbol_present_memory_block_size_bytes
 CONFTEST: crypto
 CONFTEST: is_export_symbol_present_int_active_memcg
 CONFTEST: dma_ops
 CONFTEST: swiotlb_dma_ops
 CONFTEST: noncoherent_swiotlb_dma_ops
 CONFTEST: vm_fault_has_address
 CONFTEST: vm_insert_pfn_prot
 CONFTEST: vmf_insert_pfn_prot
 CONFTEST: vm_ops_fault_removed_vma_arg
 CONFTEST: kmem_cache_has_kobj_remove_work
 CONFTEST: sysfs_slab_unlink
 CONFTEST: proc_ops
 CONFTEST: timespec64
 CONFTEST: vmalloc_has_pgprot_t_arg
 CONFTEST: mm_has_mmap_lock
 CONFTEST: pci_channel_state
 CONFTEST: pci_dev_has_ats_enabled
 CONFTEST: remove_memory_has_nid_arg
 CONFTEST: add_memory_driver_managed_has_mhp_flags_arg
 CONFTEST: num_registered_fb
 CONFTEST: pci_driver_has_driver_managed_dma
 CONFTEST: vm_area_struct_has_const_vm_flags
 CONFTEST: backing_dev_info
 CONFTEST: memory_failure_has_trapno_arg
 CONFTEST: mm_context_t
 CONFTEST: vm_fault_t
 CONFTEST: mmu_notifier_ops_invalidate_range
 CONFTEST: mmu_notifier_ops_arch_invalidate_secondary_tlbs
 CONFTEST: migrate_vma_added_flags
 CONFTEST: migrate_device_range
 CONFTEST: handle_mm_fault_has_mm_arg
 CONFTEST: handle_mm_fault_has_pt_regs_arg
 CONFTEST: mempolicy_has_unified_nodes
 CONFTEST: mempolicy_has_home_node
 CONFTEST: mpol_preferred_many_present
 CONFTEST: mmu_interval_notifier
 CONFTEST: drm_bus_present
 CONFTEST: drm_bus_has_bus_type
 CONFTEST: drm_bus_has_get_irq
 CONFTEST: drm_bus_has_get_name
 CONFTEST: drm_driver_has_device_list
 CONFTEST: drm_driver_has_legacy_dev_list
 CONFTEST: drm_driver_has_set_busid
 CONFTEST: drm_crtc_state_has_connectors_changed
 CONFTEST: drm_init_function_args
 CONFTEST: drm_helper_mode_fill_fb_struct
 CONFTEST: drm_master_drop_has_from_release_arg
 CONFTEST: drm_driver_unload_has_int_return_type
 CONFTEST: drm_atomic_helper_crtc_destroy_state_has_crtc_arg
 CONFTEST: drm_atomic_helper_plane_destroy_state_has_plane_arg
 CONFTEST: drm_mode_object_find_has_file_priv_arg
 CONFTEST: dma_buf_owner
 CONFTEST: drm_connector_list_iter
 CONFTEST: drm_atomic_helper_swap_state_has_stall_arg
 CONFTEST: drm_driver_prime_flag_present
 CONFTEST: drm_gem_object_has_resv
 CONFTEST: drm_crtc_state_has_async_flip
 CONFTEST: drm_crtc_state_has_pageflip_flags
 CONFTEST: drm_crtc_state_has_vrr_enabled
 CONFTEST: drm_format_modifiers_present
 CONFTEST: drm_vma_node_is_allowed_has_tag_arg
 CONFTEST: drm_vma_offset_node_has_readonly
 CONFTEST: drm_display_mode_has_vrefresh
 CONFTEST: drm_driver_master_set_has_int_return_type
 CONFTEST: drm_driver_has_gem_free_object
 CONFTEST: drm_prime_pages_to_sg_has_drm_device_arg
 CONFTEST: drm_driver_has_gem_prime_callbacks
 CONFTEST: drm_crtc_atomic_check_has_atomic_state_arg
 CONFTEST: drm_gem_object_vmap_has_map_arg
 CONFTEST: drm_plane_atomic_check_has_atomic_state_arg
 CONFTEST: drm_device_has_pdev
 CONFTEST: drm_crtc_state_has_no_vblank
 CONFTEST: drm_mode_config_has_allow_fb_modifiers
 CONFTEST: drm_has_hdr_output_metadata
 CONFTEST: dma_resv_add_fence
 CONFTEST: dma_resv_reserve_fences
 CONFTEST: reservation_object_reserve_shared_has_num_fences_arg
 CONFTEST: drm_connector_has_override_edid
 CONFTEST: drm_master_has_leases
 CONFTEST: drm_file_get_master
 CONFTEST: drm_modeset_lock_all_end
 CONFTEST: drm_connector_lookup
 CONFTEST: drm_connector_put
 CONFTEST: drm_driver_has_dumb_destroy
 CONFTEST: fence_ops_use_64bit_seqno
 CONFTEST: drm_aperture_remove_conflicting_pci_framebuffers_has_driver_arg
 CONFTEST: drm_mode_create_dp_colorspace_property_has_supported_colorspaces_arg
 CONFTEST: dom0_kernel_present
 CONFTEST: nvidia_vgpu_kvm_build
 CONFTEST: nvidia_grid_build
 CONFTEST: nvidia_grid_csp_build
 CONFTEST: pm_runtime_available
 CONFTEST: pci_class_multimedia_hd_audio
 CONFTEST: drm_available
 CONFTEST: vfio_pci_core_available
 CONFTEST: mdev_available
 CONFTEST: cmd_uphy_display_port_init
 CONFTEST: cmd_uphy_display_port_off
 CONFTEST: memory_failure_mf_sw_simulated_defined
 CONFTEST: drm_atomic_available
 CONFTEST: is_export_symbol_gpl_refcount_inc
 CONFTEST: is_export_symbol_gpl_refcount_dec_and_test
 CONFTEST: drm_alpha_blending_available
 CONFTEST: is_export_symbol_present_drm_gem_prime_fd_to_handle
 CONFTEST: is_export_symbol_present_drm_gem_prime_handle_to_fd
 CONFTEST: ib_peer_memory_symbols

Latest available packages are:

  • nvidia-dkms 545.29.06-1
  • linux 6.6.3.arch1-1
  • linux-headers 6.6.3.arch1-1

These should be compatible with each other.

The old packages were a test from earlier when working with an offline install, just to see what would happen if we triggered the rebuild when updating to the same package version that ships with the offline install.

I can update to latest as a test, but I suspect it will be the same result.

Alright, I got it.

make error 139, segmentation fault. Isn’t this what ibt=off was for?
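For anyone reading along: the 139 follows the usual shell convention where a status above 128 means the process was killed by signal number (status - 128), so 139 is signal 11, SIGSEGV. Here is a minimal Python sketch of that arithmetic. The helper name describe_make_status is made up for illustration and is not part of make or dkms.

```python
import signal


def describe_make_status(status: int) -> str:
    """Explain a shell-style exit status such as the 139 reported by make.

    Hypothetical helper for illustration; assumes the common convention
    that a status above 128 encodes 128 + the terminating signal number.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {status}"


print(describe_make_status(139))  # killed by SIGSEGV
print(describe_make_status(2))    # exited with status 2
```

So the "Error 139" line and the "Segmentation fault (core dumped)" line are reporting the same kind of failure: a process started by make crashed while generating the conftest headers.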

make[3]: *** [/var/lib/dkms/nvidia/545.29.06/build/Kbuild:182: /var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/dma_buf_export_args.h] Segmentation fault (core dumped)
make[3]: *** Waiting for unfinished jobs....
 CONFTEST: cpumask_of_node
 CONFTEST: ioasid_get
make[3]: *** [/var/lib/dkms/nvidia/545.29.06/build/Kbuild:184: /var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/icc_set_bw.h] Error 139
make[3]: *** Deleting file '/var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/icc_set_bw.h'
make[2]: *** [/usr/lib/modules/6.6.3-arch1-1/build/Makefile:1913: /var/lib/dkms/nvidia/545.29.06/build] Error 2
make[1]: *** [Makefile:234: __sub-make] Error 2
make[1]: Leaving directory '/usr/lib/modules/6.6.3-arch1-1/build'
make: *** [Makefile:82: modules] Error 2
DKMS make.log for nvidia-545.29.06 for kernel 6.6.3-arch1-1 (x86_64)
Sun 03 Dec 2023 09:14:43 AM PST
make[1]: Entering directory '/usr/lib/modules/6.6.3-arch1-1/build'
warning: the compiler differs from the one used to build the kernel
  The kernel was built by: gcc (GCC) 13.2.1 20230801
  You are using:           cc (GCC) 13.2.1 20230801
  SYMLINK /var/lib/dkms/nvidia/545.29.06/build/nvidia/nv-kernel.o
  SYMLINK /var/lib/dkms/nvidia/545.29.06/build/nvidia-modeset/nv-modeset-kernel.o
 CONFTEST: hash__remap_4k_pfn
 CONFTEST: set_pages_uc
 CONFTEST: list_is_first
 CONFTEST: set_memory_uc
 CONFTEST: set_memory_array_uc
 CONFTEST: set_pages_array_uc
 CONFTEST: ioremap_cache
 CONFTEST: ioremap_wc
 CONFTEST: ioremap_driver_hardened
 CONFTEST: ioremap_driver_hardened_wc
 CONFTEST: pci_get_domain_bus_and_slot
 CONFTEST: ioremap_cache_shared
 CONFTEST: get_num_physpages
 CONFTEST: pde_data
 CONFTEST: xen_ioemu_inject_msi
 CONFTEST: phys_to_dma
 CONFTEST: dma_attr_macros
 CONFTEST: get_dma_ops
 CONFTEST: dma_map_page_attrs
 CONFTEST: write_cr4
 CONFTEST: of_node_to_nid
 CONFTEST: of_find_node_by_phandle
 CONFTEST: pnv_pci_get_npu_dev
 CONFTEST: of_get_ibm_chip_id
 CONFTEST: pci_bus_address
 CONFTEST: pci_stop_and_remove_bus_device
 CONFTEST: pci_rebar_get_possible_sizes
 CONFTEST: wait_for_random_bytes
 CONFTEST: register_cpu_notifier
 CONFTEST: cpuhp_setup_state
 CONFTEST: dma_map_resource
 CONFTEST: get_backlight_device_by_name
 CONFTEST: timer_setup
 CONFTEST: pci_enable_msix_range
 CONFTEST: kernel_read_has_pointer_pos_arg
 CONFTEST: kernel_write_has_pointer_pos_arg
 CONFTEST: dma_direct_map_resource
 CONFTEST: tegra_get_platform
 CONFTEST: tegra_bpmp_send_receive
 CONFTEST: flush_cache_all
 CONFTEST: vmf_insert_pfn
 CONFTEST: jiffies_to_timespec
 CONFTEST: ktime_get_raw_ts64
 CONFTEST: ktime_get_real_ts64
 CONFTEST: full_name_hash
 CONFTEST: pci_enable_atomic_ops_to_root
 CONFTEST: vga_tryget
 CONFTEST: cc_platform_has
 CONFTEST: seq_read_iter
 CONFTEST: unsafe_follow_pfn
 CONFTEST: drm_gem_object_get
 CONFTEST: drm_gem_object_put_unlocked
 CONFTEST: device_property_read_u64
 CONFTEST: add_memory_driver_managed
 CONFTEST: devm_of_platform_populate
 CONFTEST: of_dma_configure
 CONFTEST: of_property_count_elems_of_size
 CONFTEST: of_property_read_variable_u8_array
 CONFTEST: i2c_new_client_device
 CONFTEST: of_property_read_variable_u32_array
 CONFTEST: i2c_unregister_device
 CONFTEST: of_get_named_gpio
 CONFTEST: devm_gpio_request_one
 CONFTEST: gpio_direction_input
 CONFTEST: gpio_direction_output
 CONFTEST: gpio_get_value
 CONFTEST: gpio_set_value
 CONFTEST: gpio_to_irq
 CONFTEST: icc_get
 CONFTEST: icc_put
 CONFTEST: icc_set_bw
 CONFTEST: dma_buf_ops_has_kmap
 CONFTEST: dma_buf_ops_has_kmap_atomic
 CONFTEST: dma_buf_ops_has_map
 CONFTEST: dma_buf_ops_has_map_atomic
 CONFTEST: dma_buf_has_dynamic_attachment
 CONFTEST: dma_buf_attachment_has_peer2peer
 CONFTEST: dma_set_mask_and_coherent
 CONFTEST: devm_clk_bulk_get_all
 CONFTEST: get_task_ioprio
 CONFTEST: mdev_set_iommu_device
 CONFTEST: offline_and_remove_memory
 CONFTEST: wait_on_bit_lock_argument_count
 CONFTEST: radix_tree_empty
 CONFTEST: radix_tree_replace_slot
 CONFTEST: pnv_npu2_init_context
make[3]: *** [/var/lib/dkms/nvidia/545.29.06/build/Kbuild:182: /var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/dma_buf_export_args.h] Segmentation fault (core dumped)
make[3]: *** Waiting for unfinished jobs....
 CONFTEST: cpumask_of_node
 CONFTEST: ioasid_get
make[3]: *** [/var/lib/dkms/nvidia/545.29.06/build/Kbuild:184: /var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/icc_set_bw.h] Error 139
make[3]: *** Deleting file '/var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/icc_set_bw.h'
make[2]: *** [/usr/lib/modules/6.6.3-arch1-1/build/Makefile:1913: /var/lib/dkms/nvidia/545.29.06/build] Error 2
make[1]: *** [Makefile:234: __sub-make] Error 2
make[1]: Leaving directory '/usr/lib/modules/6.6.3-arch1-1/build'
make: *** [Makefile:82: modules] Error 2
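Since the interesting lines are buried between hundreds of CONFTEST lines, here is a rough Python sketch for pulling just the failed targets out of a make.log. It assumes make's usual "*** [makefile:line: target] reason" error format as seen above. The names find_failures and crashed_targets are hypothetical, and the sample text is copied from the log in this post.

```python
import re

# Matches make's error lines, e.g.
#   make[3]: *** [<makefile>:<line>: <target>] <reason>
MAKE_ERROR = re.compile(
    r"^make(?:\[\d+\])?: \*\*\* \[(?P<where>[^\]]+)\] (?P<reason>.+)$"
)


def find_failures(log_text):
    """Return (target, reason) pairs for every failed make target in the log."""
    failures = []
    for line in log_text.splitlines():
        match = MAKE_ERROR.match(line.strip())
        if match:
            # "where" looks like "<makefile>:<line>: <target>"; keep the target.
            target = match.group("where").rsplit(": ", 1)[-1]
            failures.append((target, match.group("reason")))
    return failures


def crashed_targets(failures):
    """Keep only targets whose reason points at a segfault (status 139)."""
    return [target for target, reason in failures
            if "Segmentation fault" in reason or reason == "Error 139"]


# Sample lines copied from the log above.
SAMPLE = """\
 CONFTEST: pnv_npu2_init_context
make[3]: *** [/var/lib/dkms/nvidia/545.29.06/build/Kbuild:182: /var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/dma_buf_export_args.h] Segmentation fault (core dumped)
make[3]: *** Waiting for unfinished jobs....
make[3]: *** [/var/lib/dkms/nvidia/545.29.06/build/Kbuild:184: /var/lib/dkms/nvidia/545.29.06/build/conftest/compile-tests/icc_set_bw.h] Error 139
make: *** [Makefile:82: modules] Error 2
"""

for target, reason in find_failures(SAMPLE):
    print(f"{reason}: {target}")
```

To run it against a real log you would read the file instead of SAMPLE, for example the make.log under /var/lib/dkms/nvidia/545.29.06/build/ (path assumed from the build directory shown in the log). If different conftest targets crash from one run to the next, that would be consistent with unstable hardware rather than a bug in one particular test.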

Eureka, looks like @joekamprad was correct in that it was either a CPU or RAM issue.

I iteratively stepped back my motherboard settings.

  1. First I tried turning off TPM, then sudo pacman -S nvidia-dkms: hard lock.
  2. Then I tried turning off XMP, then sudo pacman -S nvidia-dkms: hard lock.
  3. I restored optimized defaults on my motherboard, which reset 2 CPU-specific settings and 1 CPU/RAM setting:
  • BCLK 100 MHz lock → Disabled
  • Enhanced Turbo → Auto
  • Undervolt Protection → Auto

With these reset, I ran sudo pacman -S nvidia-dkms; dkms and dracut rebuilt successfully and I could boot into KDE.

So it was an “overclock” issue. I suspect Enhanced Turbo, as this allows 2 cores to reach 5.8 GHz and the rest 5.5 GHz on a 13900K.

This is a feature of Intel CPUs and shouldn’t cause this issue. I am going to turn each setting back on one by one to determine which is causing it.

EDIT It was/is Enhanced Turbo

You might also try updating your BIOS if it isn’t already at the latest rev.

I am on the latest of the latest for this board, so in reality I should maybe roll back a revision.

First I will try to determine what setting causes it.

EDIT It was/is Enhanced Turbo

Thank you all for bearing with me in this frustrating journey :smiley:
I am marking this as solved

Now to figure out why the GRUB install gives me 2 boot options in UEFI.

I have 1 endeavouros option and 1 UEFI OS option, both coming from the GRUB install.

Could just be my BIOS being dumb; it doesn’t happen with systemd-boot.

UEFI OS entries are generated automatically by the firmware.
The GRUB install creates the entry starting with endeavouros; that name is taken from the GRUB config option
GRUB_DISTRIBUTOR='EndeavourOS'
Both should work to load GRUB.
