Hello, EndeavourOS community.
I have a very weird problem that, for some reason, only affects Overwatch. Like the title says, it crashes my SATA SSD after I’ve been playing for around 15 minutes: the SSD fails and its filesystem gets remounted as “ro,relatime”. No other game gives me this problem, and if I install Overwatch anywhere else, nothing happens either. I’ve played other games installed on the same SSD, like Doom 2016, for quite a while, and nothing goes wrong.
My laptop doesn’t have multiple M.2 slots, so I’m using one of those adapters that replace the optical drive with a hard drive bay. That’s why this is a SATA SSD. If it were just about playing the game, I would install it on my internal HDD to get rid of the problem, but I’d like to solve this issue.
My laptop is an Asus GL553VE. Here is the inxi -Fazy output:
System:
Kernel: 5.15.86-1-lts arch: x86_64 bits: 64 compiler: gcc v: 12.2.0
parameters: BOOT_IMAGE=/boot/vmlinuz-linux-lts
root=UUID=6b8ca19e-d3b7-4a50-9393-04fa285d1c68 rw
cryptdevice=UUID=0e26d6df-863d-4932-9c69-d9f47dbd6364:luks-0e26d6df-863d-4932-9c69-d9f47dbd6364
root=/dev/mapper/luks-0e26d6df-863d-4932-9c69-d9f47dbd6364
resume=/dev/mapper/luks-440ff5e6-9b55-4c5e-ae3b-7f76440cae39 loglevel=3
nowatchdog nvme_load=YES nvidia-drm.modeset=1
Desktop: KDE Plasma v: 5.26.5 tk: Qt v: 5.15.8 wm: kwin_x11 vt: 1 dm: SDDM
Distro: EndeavourOS base: Arch Linux
Machine:
Type: Laptop System: ASUSTeK product: GL553VE v: 1.0
serial: <superuser required>
Mobo: ASUSTeK model: GL553VE v: 1.0 serial: <superuser required>
UEFI: American Megatrends v: GL553VE.308 date: 04/29/2019
Battery:
ID-1: BAT0 charge: 26.7 Wh (98.5%) condition: 27.1/48.2 Wh (56.1%)
volts: 16.1 min: 14.4 model: Simplo SDI ICR18650 type: Li-ion
serial: <filter> status: N/A cycles: 33
CPU:
Info: model: Intel Core i7-7700HQ bits: 64 type: MT MCP arch: Kaby Lake
gen: core 7 level: v3 note: check built: 2018 process: Intel 14nm family: 6
model-id: 0x9E (158) stepping: 9 microcode: 0xF0
Topology: cpus: 1x cores: 4 tpc: 2 threads: 8 smt: enabled cache:
L1: 256 KiB desc: d-4x32 KiB; i-4x32 KiB L2: 1024 KiB desc: 4x256 KiB
L3: 6 MiB desc: 1x6 MiB
Speed (MHz): avg: 3346 high: 3487 min/max: 800/3800 scaling:
driver: intel_pstate governor: powersave cores: 1: 3349 2: 3378 3: 3168
4: 3487 5: 3361 6: 3301 7: 3323 8: 3404 bogomips: 44798
Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Vulnerabilities:
Type: itlb_multihit status: KVM: VMX disabled
Type: l1tf mitigation: PTE Inversion; VMX: conditional cache flushes, SMT
vulnerable
Type: mds mitigation: Clear CPU buffers; SMT vulnerable
Type: meltdown mitigation: PTI
Type: mmio_stale_data mitigation: Clear CPU buffers; SMT vulnerable
Type: retbleed mitigation: IBRS
Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via
prctl and seccomp
Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer
sanitization
Type: spectre_v2 mitigation: IBRS, IBPB: conditional, RSB filling,
PBRSB-eIBRS: Not affected
Type: srbds mitigation: Microcode
Type: tsx_async_abort status: Not affected
Graphics:
Device-1: Intel HD Graphics 630 vendor: ASUSTeK driver: i915 v: kernel
arch: Gen-9.5 process: Intel 14nm built: 2016-20 ports: active: eDP-1
empty: HDMI-A-1 bus-ID: 00:02.0 chip-ID: 8086:591b class-ID: 0300
Device-2: NVIDIA GP107M [GeForce GTX 1050 Ti Mobile] vendor: ASUSTeK
driver: nvidia v: 525.78.01 alternate: nouveau,nvidia_drm non-free: 525.xx+
status: current (as of 2022-12) arch: Pascal code: GP10x process: TSMC 16nm
built: 2016-21 pcie: gen: 1 speed: 2.5 GT/s lanes: 16 link-max: gen: 3
speed: 8 GT/s bus-ID: 01:00.0 chip-ID: 10de:1c8c class-ID: 0302
Device-3: Realtek USB2.0 HD UVC WebCam type: USB driver: uvcvideo
bus-ID: 1-6:2 chip-ID: 0bda:57f5 class-ID: 0e02 serial: <filter>
Display: x11 server: X.Org v: 21.1.6 compositor: kwin_x11 driver: X:
loaded: intel,nvidia unloaded: modesetting alternate: fbdev,nouveau,nv,vesa
dri: i965 gpu: i915 display-ID: :0 screens: 1
Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.00x11.22")
s-diag: 582mm (22.93")
Monitor-1: eDP-1 mapped: eDP1 model: LG Display 0x046f built: 2016
res: 1920x1080 hz: 60 dpi: 143 gamma: 1.2 size: 340x190mm (13.39x7.48")
diag: 395mm (15.5") ratio: 16:9 modes: 1920x1080
API: OpenGL v: 4.6 Mesa 22.3.2 renderer: Mesa Intel HD Graphics 630 (KBL
GT2) direct render: Yes
Audio:
Device-1: Intel CM238 HD Audio vendor: ASUSTeK driver: snd_hda_intel
v: kernel bus-ID: 00:1f.3 chip-ID: 8086:a171 class-ID: 0403
Sound API: ALSA v: k5.15.86-1-lts running: yes
Sound Server-1: PulseAudio v: 16.1 running: no
Sound Server-2: PipeWire v: 0.3.63 running: yes
Network:
Device-1: Intel Wireless 7265 driver: iwlwifi v: kernel pcie: gen: 1
speed: 2.5 GT/s lanes: 1 bus-ID: 02:00.0 chip-ID: 8086:095a class-ID: 0280
IF: wlan0 state: up mac: <filter>
Device-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet
vendor: ASUSTeK driver: r8169 v: kernel pcie: gen: 1 speed: 2.5 GT/s lanes: 1
port: d000 bus-ID: 03:00.0 chip-ID: 10ec:8168 class-ID: 0200
IF: enp3s0 state: down mac: <filter>
Bluetooth:
Device-1: Intel Bluetooth wireless interface type: USB driver: btusb v: 0.8
bus-ID: 1-8:3 chip-ID: 8087:0a2a class-ID: e001
Report: rfkill ID: hci0 rfk-id: 8 state: down bt-service: enabled,running
rfk-block: hardware: no software: yes address: see --recommends
Drives:
Local Storage: total: 1.6 TiB used: 963.09 GiB (58.9%)
SMART Message: Unable to run smartctl. Root privileges required.
ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Toshiba model: N/A
size: 238.47 GiB block-size: physical: 512 B logical: 512 B speed: 31.6 Gb/s
lanes: 4 type: SSD serial: <filter> rev: 57XA4104 temp: 50.9 C scheme: GPT
ID-2: /dev/sda maj-min: 8:0 vendor: HGST (Hitachi) model: HTS721010A9E630
size: 931.51 GiB block-size: physical: 4096 B logical: 512 B speed: 6.0 Gb/s
type: HDD rpm: 7200 serial: <filter> rev: A3J0 scheme: GPT
ID-3: /dev/sdb maj-min: 8:16 vendor: Western Digital
model: WDS500G2B0A-00SM50 size: 465.76 GiB block-size: physical: 512 B
logical: 512 B speed: 6.0 Gb/s type: SSD serial: <filter> rev: 00WD
scheme: GPT
Partition:
ID-1: / raw-size: 229.37 GiB size: 224.71 GiB (97.97%)
used: 94.52 GiB (42.1%) fs: ext4 dev: /dev/dm-0 maj-min: 254:0
mapped: luks-0e26d6df-863d-4932-9c69-d9f47dbd6364
ID-2: /boot/efi raw-size: 300 MiB size: 299.4 MiB (99.80%)
used: 472 KiB (0.2%) fs: vfat dev: /dev/nvme0n1p1 maj-min: 259:1
Swap:
Kernel: swappiness: 60 (default) cache-pressure: 100 (default)
ID-1: swap-1 type: partition size: 8.8 GiB used: 0 KiB (0.0%) priority: -2
dev: /dev/dm-1 maj-min: 254:1
mapped: luks-440ff5e6-9b55-4c5e-ae3b-7f76440cae39
Sensors:
System Temperatures: cpu: 56.0 C pch: 62.5 C mobo: N/A
Fan Speeds (RPM): cpu: 3200
Info:
Processes: 312 Uptime: 3d 14h 15m wakeups: 8 Memory: 31.24 GiB
used: 10.97 GiB (35.1%) Init: systemd v: 252 default: graphical
tool: systemctl Compilers: gcc: 12.2.0 clang: 14.0.6 Packages: 1517
pm: pacman pkgs: 1509 libs: 465 tools: pamac,yay pm: flatpak pkgs: 8
Shell: Bash v: 5.1.16 running-in: konsole inxi: 3.3.24
The SSD in question is a Western Digital SA510. I tried checking out the logs the other day when it crashed again, and this is what dmesg showed:
[190270.545621] ata3.00: exception Emask 0x50 SAct 0x80 SErr 0x48c0800 action 0xe frozen
[190270.545628] ata3.00: irq_stat 0x0c000040, interface fatal error, connection status changed
[190270.545631] ata3: SError: { HostInt CommWake 10B8B LinkSeq DevExch }
[190270.545637] ata3.00: failed command: READ FPDMA QUEUED
[190270.545639] ata3.00: cmd 60/98:38:00:1e:68/00:00:0f:00:00/40 tag 7 ncq dma 77824 in
res 40/00:3c:00:1e:68/00:00:0f:00:00/40 Emask 0x50 (ATA bus error)
[190270.545650] ata3.00: status: { DRDY }
[190270.545656] ata3: hard resetting link
[190272.315625] ata3: SATA link down (SStatus 0 SControl 300)
[190272.627703] ata3: hard resetting link
[190273.017985] ata3: SATA link down (SStatus 0 SControl 300)
[190273.082794] ata3: hard resetting link
[190273.407063] ata3: SATA link down (SStatus 0 SControl 300)
[190273.407070] ata3.00: disabled
[190273.407076] ahci 0000:00:17.0: port does not support device sleep
[190273.407089] sd 2:0:0:0: [sdb] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=2s
[190273.407091] sd 2:0:0:0: [sdb] tag#7 Sense Key : Illegal Request [current]
[190273.407093] sd 2:0:0:0: [sdb] tag#7 Add. Sense: Unaligned write command
[190273.407095] sd 2:0:0:0: [sdb] tag#7 CDB: Read(10) 28 00 0f 68 1e 00 00 00 98 00
[190273.407096] blk_update_request: I/O error, dev sdb, sector 258481664 op 0x0:(READ) flags 0x80700 phys_seg 14 prio class 0
[190273.407116] ata3: EH complete
[190273.407119] sd 2:0:0:0: rejecting I/O to offline device
[190273.407120] blk_update_request: I/O error, dev sdb, sector 692471528 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 0
[190273.407128] blk_update_request: I/O error, dev sdb, sector 258481664 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[190273.407134] blk_update_request: I/O error, dev sdb, sector 487092368 op 0x1:(WRITE) flags 0x800 phys_seg 3 prio class 0
[190273.407146] blk_update_request: I/O error, dev sdb, sector 692471528 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[190273.407148] Aborting journal on device dm-2-8.
[190273.407152] ata3.00: detaching (SCSI 2:0:0:0)
[190273.407160] blk_update_request: I/O error, dev sdb, sector 429450288 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
[190273.407164] EXT4-fs error (device dm-2) in ext4_reserve_inode_write:5752: Journal has aborted
[190273.407165] EXT4-fs warning (device dm-2): ext4_end_bio:344: I/O error 10 writing to inode 21635228 starting block 53680518)
[190273.407169] blk_update_request: I/O error, dev sdb, sector 692471528 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[190273.407171] Buffer I/O error on device dm-2, logical block 53680518
[190273.407172] EXT4-fs error (device dm-2): mpage_map_and_submit_extent:2513: inode #21635228: comm kworker/u16:2: mark_inode_dirty error
[190273.407175] EXT4-fs error (device dm-2): mpage_map_and_submit_extent:2515: comm kworker/u16:2: Failed to mark inode 21635228 dirty
[190273.407177] EXT4-fs error (device dm-2) in ext4_writepages:2833: Journal has aborted
[190273.407183] blk_update_request: I/O error, dev sdb, sector 486807552 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[190273.407186] blk_update_request: I/O error, dev sdb, sector 486807552 op 0x1:(WRITE) flags 0x800 phys_seg 1 prio class 0
[190273.407188] Buffer I/O error on dev dm-2, logical block 60850176, lost sync page write
[190273.407189] EXT4-fs error (device dm-2): ext4_journal_check_start:83: comm kworker/u16:2: Detected aborted journal
[190273.407191] blk_update_request: I/O error, dev sdb, sector 692471528 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[190273.407192] JBD2: Error -5 detected when updating journal superblock for dm-2-8.
[190273.407215] EXT4-fs warning (device dm-2): ext4_end_bio:344: I/O error 10 writing to inode 21635228 starting block 53680519)
[190273.407223] EXT4-fs error (device dm-2): ext4_journal_check_start:83: comm kworker/u16:10: Detected aborted journal
[190273.407225] Buffer I/O error on dev dm-2, logical block 0, lost sync page write
[190273.407229] EXT4-fs (dm-2): previous I/O error to superblock detected
[190273.407231] EXT4-fs (dm-2): previous I/O error to superblock detected
[190273.407258] Buffer I/O error on dev dm-2, logical block 0, lost sync page write
[190273.407366] Buffer I/O error on dev dm-2, logical block 0, lost sync page write
[190273.407374] EXT4-fs (dm-2): I/O error while writing superblock
[190273.407374] EXT4-fs (dm-2): I/O error while writing superblock
[190273.407375] EXT4-fs (dm-2): I/O error while writing superblock
[190273.407376] EXT4-fs (dm-2): Remounting filesystem read-only
[190273.407376] EXT4-fs (dm-2): Remounting filesystem read-only
[190273.407378] EXT4-fs (dm-2): failed to convert unwritten extents to written extents -- potential data loss! (inode 21635228, error -30)
[190273.407379] EXT4-fs (dm-2): ext4_writepages: jbd2_start: 7168 pages, ino 21635229; err -30
[190273.407382] Buffer I/O error on device dm-2, logical block 53680519
[190273.407384] Buffer I/O error on device dm-2, logical block 53680520
[190273.407386] Buffer I/O error on device dm-2, logical block 53680521
[190273.407387] Buffer I/O error on device dm-2, logical block 53680522
[190273.407388] Buffer I/O error on device dm-2, logical block 53680523
[190273.407412] Buffer I/O error on dev dm-2, logical block 29, lost async page write
[190273.407416] Buffer I/O error on dev dm-2, logical block 60293121, lost async page write
[190273.407418] Buffer I/O error on dev dm-2, logical block 60293122, lost async page write
[190273.407420] Buffer I/O error on dev dm-2, logical block 86559358, lost async page write
[190273.465638] sd 2:0:0:0: [sdb] Synchronizing SCSI cache
[190273.465734] sd 2:0:0:0: [sdb] Synchronize Cache(10) failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[190273.465737] sd 2:0:0:0: [sdb] Stopping disk
[190273.465743] sd 2:0:0:0: [sdb] Start/Stop Unit failed: Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
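For anyone reading the log above, here is a minimal sketch of how the SErr bitmask in the first line maps to the flag names printed in the "SError: { ... }" line. The bit-to-name table is an assumption based on the SATA SError register layout as the kernel's libata prints it, and the function names are hypothetical helpers, not part of any tool.

```python
import re

# Bit positions of the SATA SError register and the short names the kernel
# prints for them (assumed from libata; not taken from the post itself).
SERR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(value):
    """Return the names of all SError flags set in `value`, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


def serr_from_dmesg_line(line):
    """Extract the hexadecimal SErr value from an 'ataN.NN: exception' line."""
    match = re.search(r"SErr (0x[0-9a-fA-F]+)", line)
    return int(match.group(1), 16) if match else None


line = ("[190270.545621] ata3.00: exception Emask 0x50 SAct 0x80 "
        "SErr 0x48c0800 action 0xe frozen")
print(decode_serr(serr_from_dmesg_line(line)))
# prints ['HostInt', 'CommWake', '10B8B', 'LinkSeq', 'DevExch']
```

Most of the flags set here (10B8B, LinkSeq, DevExch) describe the SATA link rather than the data on the drive, which fits the "connection status changed" message. That does not prove the caddy is at fault, but it suggests looking at the connection path first.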
I looked up troubleshooting for these messages, and for the most part people claim either that the drive is about to die or that the SATA cables are malfunctioning. Some people who got similar dmesg output solved the issue by swapping their SATA cables for better-quality ones. My drive seems to be in really good health according to smartctl, so I suspect the optical caddy it’s installed in.
One thing that has caught my attention lately is that whenever my laptop boots, opening this drive (it’s encrypted with LUKS) always halts the boot process and takes about 3-4 seconds to complete. The other drives, including the internal HDD (which is connected directly to the motherboard’s SATA port), take about a second.
These drives are configured in /etc/fstab like this:
/dev/mapper/sdb1 /home/hugazo/Disco\040D ext4 defaults,noatime 0 2
/dev/mapper/sda1 /home/hugazo/Disco\040F ext4 defaults 0 2
/dev/mapper/sdb1 is the problematic SSD in the caddy, and /dev/mapper/sda1 is the internal HDD connected directly to the motherboard’s SATA port.
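In case the \040 in those mount points looks odd: fstab fields cannot contain literal spaces, so a space is written as a backslash followed by its octal code. A small sketch of how such an entry splits into its six fields, with hypothetical helper names and assuming only the two-line layout shown above:

```python
import re


def unescape_fstab(field):
    """Decode backslash + three octal digits (e.g. \\040 for a space)."""
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def parse_fstab_line(line):
    """Split one fstab entry into its six whitespace-separated fields."""
    device, mount_point, fs_type, options, dump, fsck_pass = line.split()
    return {
        "device": unescape_fstab(device),
        "mount_point": unescape_fstab(mount_point),
        "fs_type": fs_type,
        "options": options.split(","),
        "dump": int(dump),
        "fsck_pass": int(fsck_pass),
    }


ssd = parse_fstab_line(r"/dev/mapper/sdb1 /home/hugazo/Disco\040D ext4 defaults,noatime 0 2")
hdd = parse_fstab_line(r"/dev/mapper/sda1 /home/hugazo/Disco\040F ext4 defaults 0 2")
print(ssd["mount_point"], ssd["options"])
print(hdd["mount_point"], hdd["options"])
```

So the only difference between the two entries, apart from the device and mount point, is the noatime option on the SSD.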
Has anyone had this happen to them? Should I run some specific tests to try and narrow the cause down? If it ends up being the caddy, does anyone have a recommendation for one that doesn’t malfunction like this?