Kernel memory management misbehaviour

Hi,
I am running EndeavourOS with kernel 5.14.1-zen1-1-zen. I have 8GB of physical RAM and 16GB of swap, and top shows both correctly. The system has run for years without any issue. For the past few days (maybe since the last kernel update) it has been running out of memory. Startup is normal (Plasma) and all applications start normally. RAM is used up to 7.6GB for applications and buffers, while swap is not used. After some hours the browsers die. No matter which browser (Opera, Firefox, Chromium), none of them can open websites because of “memory lack”. top shows the same as before: no swap used, and physical memory eaten up until only 200MB is left. It seems swapping does not start, but why? How can I debug this?
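
For anyone who wants to check the same symptom on their own machine, here is a minimal sketch in Python. It assumes a Linux system where `/proc/meminfo` exposes the usual `MemAvailable`, `SwapTotal` and `SwapFree` fields. The function names `parse_meminfo` and `swap_report` are made up for this illustration and are not part of any tool mentioned in the thread.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_report(info):
    """Summarise available RAM and swap usage in MiB."""
    total = info.get("SwapTotal", 0)
    free = info.get("SwapFree", 0)
    return {
        "mem_available_mib": info.get("MemAvailable", 0) // 1024,
        "swap_total_mib": total // 1024,
        "swap_used_mib": (total - free) // 1024,
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():  # only present on Linux
        print(swap_report(parse_meminfo(meminfo.read_text())))
```

If available memory is down to a couple of hundred MiB while used swap stays at 0, that matches the behaviour described above.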

First try switching to the LTS kernel to see if it really is being caused by the kernel or some other application which was updated.

I know this might not be of much help in your situation but I have been running the same kernel on a test machine since a couple of days ago and I am not having any such issue.

specs
System:    Kernel: 5.14.1-zen1-1-zen x86_64 bits: 64 compiler: gcc v: 11.1.0 
           parameters: initrd=\arch-cinnamon\intel-ucode.img initrd=\arch-cinnamon\initramfs-linux-zen.img rw 
           root=UUID=1d9646ea-0edf-423b-85a7-88a429c6314b rootflags=subvol=@arch-cinnamon 
           lsm=landlock,lockdown,yama,apparmor,bpf nowatchdog zswap.enabled=0 
           Desktop: Cinnamon 5.0.5 tk: GTK 3.24.30 info: plank wm: Muffin vt: 2 dm: GDM 40.1 Distro: Arch Linux 
Machine:   Type: Laptop System: Dell product: XPS 13 9380 v: N/A serial: <filter> Chassis: type: 10 serial: <filter> 
           Mobo: Dell model: 0KTDY6 v: A00 serial: <filter> UEFI: Dell v: 1.14.0 date: 05/27/2021 
Battery:   ID-1: BAT0 charge: 30.6 Wh (71.7%) condition: 42.7/52.0 Wh (82.2%) volts: 8.4 min: 7.6 
           model: LGC-LGC6.73 DELL H754V8C type: Li-ion serial: <filter> status: Charging 
CPU:       Info: Quad Core model: Intel Core i7-8565U bits: 64 type: MT MCP arch: Kaby Lake note: check family: 6 
           model-id: 8E (142) stepping: B (11) microcode: EA cache: L2: 8 MiB 
           flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx bogomips: 31999 
           Speed: 800 MHz min/max: 400/4600 MHz Core speeds (MHz): 1: 800 2: 800 3: 800 4: 800 5: 800 6: 745 7: 701 8: 800 
           Vulnerabilities: Type: itlb_multihit status: KVM: VMX disabled 
           Type: l1tf status: Not affected 
           Type: mds mitigation: Clear CPU buffers; SMT vulnerable 
           Type: meltdown status: Not affected 
           Type: spec_store_bypass mitigation: Speculative Store Bypass disabled via prctl and seccomp 
           Type: spectre_v1 mitigation: usercopy/swapgs barriers and __user pointer sanitization 
           Type: spectre_v2 mitigation: Full generic retpoline, IBPB: conditional, IBRS_FW, STIBP: conditional, RSB filling 
           Type: srbds mitigation: Microcode 
           Type: tsx_async_abort status: Not affected 
Graphics:  Device-1: Intel WhiskeyLake-U GT2 [UHD Graphics 620] vendor: Dell driver: i915 v: kernel bus-ID: 00:02.0 
           chip-ID: 8086:3ea0 class-ID: 0300 
           Device-2: CN09357GLOG008CLACSJA01 Integrated_Webcam_HD type: USB driver: uvcvideo bus-ID: 1-5:2 chip-ID: 0c45:6723 
           class-ID: 0e02 
           Display: x11 server: X.Org 1.20.13 compositor: muffin driver: loaded: modesetting unloaded: vesa 
           alternate: fbdev,intel display-ID: :0 screens: 1 
           Screen-1: 0 s-res: 1920x1080 s-dpi: 96 s-size: 508x285mm (20.0x11.2") s-diag: 582mm (22.9") 
           Monitor-1: eDP-1 res: 1920x1080 hz: 60 dpi: 166 size: 293x162mm (11.5x6.4") diag: 335mm (13.2") 
           OpenGL: renderer: Mesa Intel UHD Graphics 620 (WHL GT2) v: 4.6 Mesa 21.2.1 direct render: Yes 
Audio:     Device-1: Intel Cannon Point-LP High Definition Audio vendor: Dell driver: snd_hda_intel v: kernel 
           alternate: snd_soc_skl,snd_sof_pci_intel_cnl bus-ID: 00:1f.3 chip-ID: 8086:9dc8 class-ID: 0403 
           Sound Server-1: ALSA v: k5.14.1-zen1-1-zen running: yes 
           Sound Server-2: JACK v: 1.9.19 running: no 
           Sound Server-3: PulseAudio v: 15.0 running: no 
           Sound Server-4: PipeWire v: 0.3.34 running: yes 
Network:   Device-1: Qualcomm Atheros QCA6174 802.11ac Wireless Network Adapter vendor: Rivet Networks Killer 1435 Wireless-AC 
           driver: ath10k_pci v: kernel port: efa0 bus-ID: 02:00.0 chip-ID: 168c:003e class-ID: 0280 
           IF: wlan0 state: up mac: <filter> 
Bluetooth: Device-1: N/A type: USB driver: btusb v: 0.8 bus-ID: 1-7:3 chip-ID: 0489:e0a2 class-ID: e001 
           Report: bt-adapter note: tool can't run ID: hci0 rfk-id: 0 state: down bt-service: disabled rfk-block: hardware: no 
           software: no address: N/A 
Drives:    Local Storage: total: 465.76 GiB used: 81.07 GiB (17.4%) 
           SMART Message: Required tool smartctl not installed. Check --recommends 
           ID-1: /dev/nvme0n1 maj-min: 259:0 vendor: Samsung model: SSD 970 EVO 500GB size: 465.76 GiB block-size: 
           physical: 512 B logical: 512 B speed: 31.6 Gb/s lanes: 4 rotation: SSD serial: <filter> rev: 2B2QEXE7 scheme: GPT 
Partition: ID-1: / raw-size: 456.76 GiB size: 456.76 GiB (100.00%) used: 80.84 GiB (17.7%) fs: btrfs dev: /dev/nvme0n1p2 
           maj-min: 259:2 
Swap:      Kernel: swappiness: 60 (default) cache-pressure: 100 (default) 
           ID-1: swap-1 type: file size: 8 GiB used: 0 KiB (0.0%) priority: -2 file: /swap/swapfile 
Sensors:   System Temperatures: cpu: 38.0 C mobo: N/A 
           Fan Speeds (RPM): cpu: 0 fan-2: 0 
Info:      Processes: 256 Uptime: 35m wakeups: 1800 Memory: 7.42 GiB used: 3.07 GiB (41.4%) Init: systemd v: 249 
           tool: systemctl Compilers: gcc: 11.1.0 Packages: 626 pacman: 615 lib: 163 flatpak: 11 Shell: Bash v: 5.1.8 
           running-in: gnome-terminal inxi: 3.3.05

Running the LTS kernel is only possible temporarily because I use applications which depend on the zen kernel. But of course for testing that is OK. I have now booted the LTS kernel and unused memory has increased by about 1GB (now 1.4GB is “free”). I started tons of applications and swapping started. I ran all browsers simultaneously and there was no memory shortage. I will compare with the zen kernel. Looks like an interesting issue.

OK, I booted the PC again with the zen kernel, started half of the applications I had started under the LTS kernel, and the browsers ran out of memory. It is related to the zen kernel.

I continued starting applications. Now the previously started applications crash, and newly started ones stay alive only a short time until the next started application kicks them out, too. All the time swap stays at the “not in use” level. That is not the behaviour I expected…
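
One way to confirm that nothing is being swapped at all is to watch the kernel's swap counters. This is a rough sketch, assuming a Linux system whose `/proc/vmstat` contains the `pswpin` and `pswpout` counters (pages swapped in and out since boot). The helper names and the 5-second interval are arbitrary choices for this example.

```python
import time
from pathlib import Path

INTERVAL_SECONDS = 5  # arbitrary sampling interval for this sketch


def swap_counters(text):
    """Extract pswpin/pswpout from /proc/vmstat-style text."""
    counters = {"pswpin": 0, "pswpout": 0}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in counters:
            counters[parts[0]] = int(parts[1])
    return counters


def swap_delta(before, after):
    """Pages swapped in/out between two samples."""
    return {key: after[key] - before[key] for key in before}


if __name__ == "__main__":
    vmstat = Path("/proc/vmstat")
    if vmstat.exists():  # only present on Linux
        first = swap_counters(vmstat.read_text())
        time.sleep(INTERVAL_SECONDS)
        second = swap_counters(vmstat.read_text())
        print(swap_delta(first, second))
```

If both deltas stay at 0 while applications are being killed for lack of memory, the kernel is not moving any pages to swap.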

If it’s a zen kernel issue, all you can really do is wait.

linux-zen comes with LRU patches. I do not know the details about LRU, but it is memory-management related. There are open issues for it. Example:

You might want to ask for help on github.

Many thanks. I have filed an issue report, too.

Today the new kernel 5.14.1-zen2-1-zen arrived in testing and I can confirm it starts swapping flawlessly. The issue is gone; now to wait until it reaches stable…
