Sometimes, Endeavour OS stops opening new applications

kiamlaluno · September 7, 2022, 7:40pm

This is an issue I have had at least three times in the past month, regardless of the desktop environment I used.

I was running commands from the shell (Bash or Zsh) when, all of a sudden, the shell stopped responding: the last executed command didn’t output anything, and the shell didn’t show its prompt even after ten minutes.
I was able to close the terminal, which asked for confirmation since the shell was still running a command. I could still use the applications that were already open, but opening a new application from the desktop environment menu would not succeed.
Even shutting down the computer from the menu didn’t have any effect. I had to shut down the computer using the shut-down button.

After I restarted the computer, all was back to normal.

Why does this happen?

The shell command I was executing doesn’t seem to be the cause, since it was different each time. The last time, I was running paru for the second time, after the first run reported that no updates were available; the time before that, I was testing the output returned by getopt.

jonathon · September 7, 2022, 7:53pm

The first two things that come to mind are bad RAM and a failing disk, so you should check those.

anon11595408 · September 7, 2022, 8:12pm

If you are using KDE on ext4, turn off HD-watch or any similar service that tries to diagnose your HDD in the background.

Two or three years ago it told me that my HDD was old, was getting errors, and would soon need to be replaced.

After reformatting and never using that KDE service again (I always had to explicitly disable it after a reinstall), the warning has never reappeared, and the same HDD has kept working properly since.

kiamlaluno · September 10, 2022, 4:48pm

I used Mate System Monitor to check what happens. This is what I noticed when I used Firefox and it became unresponsive.

I closed Firefox; its window disappeared without any confirmation message
I re-opened Firefox from the Mate Classic Menu
The Starting Firefox window appeared in the Window List for a few seconds
The System Monitor showed two Firefox processes; the child process’s status was Zombie

I could close those processes from the System Monitor, but without restarting the computer, Firefox and other applications didn’t show up anymore.

For Firefox, I have now deselected the option to open previous windows and tabs.

I am not sure Firefox is the culprit for the issue I have. Every time I had this issue, Firefox was open, but it was not necessarily the application I was using when I noticed the problem I described in the OP. (It could be that I left it open in the background while I was testing some scripts I edited based on posts I found on various forums.)

Triby · September 11, 2022, 6:35am

Please avoid forcing a shutdown with the button, as it can cause issues. Instead, press Ctrl+Alt+F2 to switch to a TTY, log in as root, and run systemctl poweroff to shut down or systemctl reboot to reboot.

The next time the issue happens, open a terminal after rebooting and type:

journalctl -b -1 | eos-sendlog

This will generate the log of the previous boot (not the current one) and upload it; just copy the link and paste it here. Maybe there will be some information about what’s happening, and it would be easier to find a solution.

Pudge · September 11, 2022, 2:44pm

Good advice. Do what @Triby suggested first, but if that doesn’t work, try @Kresimir’s Magic Carpet ride… I mean Magic SysRq Key (REISUB).

You need to ENABLE REISUB BEFORE you need it.
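
For reference, the Magic SysRq functions are governed by the kernel.sysrq sysctl: 0 disables them, 1 enables all of them, and larger values are a bitmask of allowed functions. Here is a minimal shell sketch that checks whether a given value permits every key in REISUB. The function name reisub_allowed is made up for illustration, and the bit values follow the kernel's SysRq documentation as I understand it, so treat it as a sketch rather than a definitive check.

```shell
# reisub_allowed is a hypothetical helper, not an existing tool.
# To enable everything persistently, a common approach is a drop-in such as
# /etc/sysctl.d/99-sysrq.conf (example file name) containing:
#   kernel.sysrq = 1
# followed by: sudo sysctl --system
reisub_allowed() {
    # Use the value given as $1, or read the live setting from procfs.
    value="${1:-$(cat /proc/sys/kernel/sysrq)}"
    # Bits needed: 4 (R, unraw keyboard), 64 (E and I, signal processes),
    # 16 (S, sync), 32 (U, remount read-only), 128 (B, reboot).
    needed=$((4 + 64 + 16 + 32 + 128))
    if [ "$value" -eq 1 ] || [ $((value & needed)) -eq "$needed" ]; then
        echo "REISUB allowed"
    else
        echo "REISUB not fully allowed"
    fi
}
```

Calling reisub_allowed with no argument checks the running system; passing a number checks that value instead.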

Pudge

kiamlaluno · September 13, 2022, 9:34pm

I did as suggested by @Triby and @Pudge.

I was using Firefox and opened a new tab to do a search, but the browser didn’t show anything. I then opened Vivaldi to run the same search, but I got the same result.
I restarted the computer using the Magic SysRq Key and ran journalctl -b -1 | eos-sendlog from Mate Terminal.

Unfortunately, after uploading 584 MB, all that command returned was https://clbin.com/.

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  584M    0    19  100  584M      0   386k  0:25:48  0:25:48 --:--:--  336k
https://clbin.com/

Is there any way to reduce the log size? I guess the log size is the issue.
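
One possible way to shrink the log before uploading is to filter it in the pipeline. journalctl can already narrow its output with -p (priority, for example -p err) and -n (number of lines); the sketch below goes one step further and drops lines that match a pattern before keeping only the tail. The function name shrink_log and the example pattern are assumptions for illustration, not part of eos-sendlog or journalctl.

```shell
# shrink_log is a hypothetical helper: it reads a log on stdin, drops lines
# matching a pattern, and keeps only the last N of the remaining lines.
shrink_log() {
    # $1: extended regular expression for the lines to drop
    # $2: number of trailing lines to keep (default 5000)
    grep -Ev -- "$1" | tail -n "${2:-5000}"
}

# Possible usage (the pattern is a placeholder for whatever floods the log):
#   journalctl -b -1 --no-pager | shrink_log 'some noisy pattern' 5000 | eos-sendlog
```

Filtering before the upload keeps the part of the log closest to the hang, which is usually the most interesting part.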

anon11595408 · September 13, 2022, 10:22pm

Check this out:

https://wiki.archlinux.org/title/Systemd/Journal#Journal_size_limit

jonathon · September 13, 2022, 11:07pm

Something is definitely wrong with your system: that many journal entries for a single boot suggests that something is spamming the journal or consistently failing. Maybe look at the journal yourself to see if you can tell what’s going on?
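
A quick way to see what is filling the journal is to count lines per source. This is a minimal sketch, assuming journalctl's default "short" output, where the fifth whitespace-separated field is the source (for example kernel: or systemd[1]:). The function name top_sources is made up for illustration.

```shell
# top_sources is a hypothetical helper: it reads journal lines on stdin and
# prints the busiest sources, most frequent first.
top_sources() {
    # $1: how many sources to show (default 10)
    awk '{ src = $5                        # e.g. "kernel:" or "systemd[1]:"
           sub(/\[[0-9]+\]:$/, "", src)    # strip a trailing "[PID]:"
           sub(/:$/, "", src)              # strip a trailing ":"
           count[src]++ }
         END { for (s in count) print count[s], s }' |
        sort -rn | head -n "${1:-10}"
}

# Possible usage for the previous boot:
#   journalctl -b -1 --no-pager | top_sources
```

If one source accounts for nearly all lines, that is probably the one to investigate first.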

Running out of memory (and/or filling tmpfs locations) would definitely be a reason for applications to fail to open.

kiamlaluno · September 14, 2022, 9:35am

I think I am currently experiencing three different issues. The first is the flood of messages in the journal, which AER logs while it corrects errors received from the Realtek RTL8821AE 802.11ac PCIe Wireless Network Adapter; the second is the one I described here; the third is that, apparently, the journal service is sometimes unable to stop or restart. (I keep seeing an error about that when I shut down or restart the computer.)

I checked the memory. EndeavourOS and the applications I run don’t use more than 3.3 GB (out of the 7.7 GB reported by Mate System Monitor).
I also checked the space used by tmpfs locations. They don’t use more than 12 MB.
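
The same two checks can be repeated from a terminal when the problem shows up again: free -h and df -h -t tmpfs print memory and tmpfs usage directly. The sketch below computes used memory from /proc/meminfo-style input as MemTotal minus MemAvailable. That definition of "used" is an assumption and may differ slightly from what Mate System Monitor reports; the function name mem_used_mib is made up for illustration.

```shell
# mem_used_mib is a hypothetical helper: it reads /proc/meminfo-style text on
# stdin and prints used memory in MiB, defined here as MemTotal - MemAvailable.
mem_used_mib() {
    awk '/^MemTotal:/     { total = $2 }   # values are in kB
         /^MemAvailable:/ { avail = $2 }
         END { printf "%d\n", (total - avail) / 1024 }'
}

# Possible usage:
#   mem_used_mib < /proc/meminfo
#   df -h -t tmpfs
```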

Thanks to @ricklinux, who helped me in “Is there a way to avoid some messages are logged in the journal?”, I can now see all the errors and informational messages I was not able to see until now. (It’s like a new door has been opened for me!)

ricklinux · September 16, 2022, 5:50pm

Can we look at this again now that you have suppressed the error messages? Post links.

sudo dmesg | eos-sendlog

journalctl -b -0 | eos-sendlog

kiamlaluno · September 16, 2022, 6:55pm

https://clbin.com/jUD0Y

https://clbin.com/8OI6Z

ricklinux · September 16, 2022, 10:13pm

@kiamlaluno

Maybe you should try the kernel parameter

acpi_osi='Windows 2020'

kiamlaluno · September 16, 2022, 11:08pm

Whoops, I had just shut down the computer to check how installing Endeavour OS on my laptop went.

Is there any “sign” I would see if adding that parameter went wrong?

ricklinux · September 17, 2022, 12:25am

Did you reinstall? That kernel parameter may or may not help. It may make things worse or possibly give new problems. One doesn’t know until they try it.

Edit: Or does it help with your issues?

kiamlaluno · September 17, 2022, 5:28am

I did it now, since yesterday I was heading to bed. I mean, I added the kernel parameter. (I didn’t reinstall Endeavour OS on this computer. I reinstalled it on the laptop.)
I edited the /etc/default/grub file, ran sudo grub-mkconfig -o /boot/grub/grub.cfg, and rebooted. I checked the journal and I didn’t see any new “entry” reporting new errors.
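
One way to confirm that a parameter added this way actually reached the running kernel is to look at /proc/cmdline after the reboot. Below is a minimal shell sketch; the function name cmdline_has is made up for illustration. A value that contains a space may be quoted differently in /proc/cmdline than in /etc/default/grub, so searching for the key alone (acpi_osi=) is probably safer than searching for the full value.

```shell
# cmdline_has is a hypothetical helper, not an existing tool.
cmdline_has() {
    # $1: text to look for, e.g. 'pci=noaer' or 'acpi_osi='
    # $2: command line to search (defaults to the running kernel's)
    cmdline="${2:-$(cat /proc/cmdline)}"
    case "$cmdline" in
        *"$1"*) echo "present" ;;
        *)      echo "missing" ;;
    esac
}

# Possible usage after rebooting:
#   cmdline_has 'acpi_osi='
#   cmdline_has 'pci=noaer'
```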

I will see what happens in the next few days.

kiamlaluno · September 18, 2022, 8:58am

So far, I haven’t had that issue again. The last time it happened, I was adding pci=noaer to the kernel parameters, on September 15/16.

After I added acpi_osi='Windows 2020' to the kernel parameters, I haven’t been hit by the issue.

I checked the following post to see if they fixed a bug that could cause the issue I am experiencing. (I know, the fact I was using Firefox when I was not able to open new applications is just a coincidence.)

I only found this note, which doesn’t match my case, as Firefox was simply not able to connect to sites. Besides, Firefox 104.0.2 was released on September 6; if the bug they fixed had been causing the issue I reported, I should not have had that issue on September 15/16.

Fixed an issue causing some users to crash in out-of-memory conditions (bug 1774155).

I will see what happens in the next 30 days.

ricklinux · September 18, 2022, 1:46pm

This is and was confusing to me. That’s why I asked if you reinstalled. So you originally had the flooding of error messages, and I asked you to try pci=noaer. This got rid of the flood of error messages related to PCIe, and this was causing Firefox and other apps to not function properly. Did this kernel parameter alleviate this issue?

Then I said maybe you should try acpi_osi='Windows 2020'

Was it already working properly after pci=noaer, or was it the acpi_osi='Windows 2020' kernel parameter that made the difference? If it is working fine, then I would leave both.

kiamlaluno · September 18, 2022, 2:10pm

I apologize for the confusion.

The last time I had this issue was after I added pci=noaer to the /etc/default/grub file, but before sudo grub-mkconfig -o /boot/grub/grub.cfg had returned and I could reboot the computer. That happened on September 16.
On September 17, I added acpi_osi='Windows 2020' to the /etc/default/grub file, ran sudo grub-mkconfig -o /boot/grub/grub.cfg, and rebooted the computer.

The issue I described doesn’t happen every time I use the computer. It could be a week before it happens again, or it could happen after two days. That’s why I said I am waiting to see whether it happens again.

ricklinux · September 18, 2022, 3:52pm

Just to be sure: I also don’t know what else you have installed or what changes you may have made.