This is an issue I have had at least three times in the past month, regardless of the desktop environment I was using.
I was running commands from the shell (Bash or Zsh) when, all of a sudden, the shell stopped responding: the last command I ran didn’t output anything, and the shell didn’t show its prompt even after ten minutes.
I was able to close the terminal, which asked for confirmation because the shell was still running a command. I could still use the applications that were already open, but opening a new application from the desktop environment menu did not succeed.
Even shutting down the computer from the menu didn’t have any effect. I had to shut down the computer using the power button.
After I restarted the computer, all was back to normal.
Why does this happen?
It doesn’t seem that the shell command I was executing was the cause, as it was different each time. The last time, I was running paru for the second time, after the first run reported that no updates were available; the time before that, I was testing the output returned by getopt.
I used Mate System Monitor to check what was happening. This is what I noticed when I was using Firefox and it became unresponsive.
I closed Firefox; its window disappeared without any confirmation message
I re-opened Firefox from the Mate Classic Menu
The Starting Firefox window appeared in the Window List for a few seconds
The System Monitor showed two Firefox processes; the child process’s status was Zombie
I could close those processes from the System Monitor, but without restarting the computer, Firefox and other applications didn’t show up anymore.
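For anyone who wants to check for zombie processes without a graphical monitor, here is a minimal sketch. It assumes the procps version of ps; the helper name list_zombies is made up for illustration. A zombie is a process that has already exited but whose parent has not yet collected its exit status, so the parent PID is usually the more interesting column.

```shell
#!/usr/bin/env bash
# list_zombies (hypothetical helper): read `ps -eo pid,ppid,stat,comm`
# output on stdin and print "PID PPID COMMAND" for every process whose
# state starts with Z (zombie). The first line is the ps header, so skip it.
list_zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}

# Usage on a live system:
#   ps -eo pid,ppid,stat,comm | list_zombies
```

If the same parent PID keeps showing up, that parent is the process worth looking at.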
For Firefox, I now deselected the option to open previous windows and tabs.
I am not sure Firefox is the culprit for the issue I have. Every time I had this issue, Firefox was open, but it was not necessarily the application I was using when I noticed the problem I described in the OP. (It could be that I left it open in the background while I was testing some scripts I edited based on posts I found on various forums.)
Please avoid doing this, as it can cause issues. Instead, press Ctrl+Alt+F2 to switch to a TTY, log in as root, and try systemctl poweroff to shut down, or systemctl reboot to reboot.
The next time the issue happens, after rebooting, open a terminal and type:
journalctl -b -1 | eos-sendlog
This will take the log of the previous boot (not the current one) and upload it. Just copy the link and paste it here. Maybe there will be some information about what’s happening, which would make it easier to find a solution.
I was using Firefox and opened a new tab to search for something, but the browser didn’t show anything. I then opened Vivaldi to run the same search, but I got the same result.
I restarted the computer using the Magic SysRq key and ran journalctl -b -1 | eos-sendlog from Mate Terminal.
Unfortunately, after uploading 584 MB, all that command returned was https://clbin.com/.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  584M    0    19  100  584M      0   386k  0:25:48  0:25:48 --:--:--  336k
Is there any way to reduce the log size? I guess the log size is the issue.
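One possible way to shrink the upload, as a rough sketch: filter the journal by priority with journalctl's -p option, then cap the number of lines before piping to eos-sendlog. The helper name cap_lines and the 5000-line default are arbitrary choices for illustration. If whatever floods the journal logs at warning level or above, the priority filter alone may not help much, which is why the line cap is there as well.

```shell
#!/usr/bin/env bash
# cap_lines (hypothetical helper): keep only the last N lines of the
# log arriving on stdin. N defaults to 5000, an arbitrary choice.
cap_lines() {
    tail -n "${1:-5000}"
}

# Usage: previous boot, warnings and worse only, last 5000 lines:
#   journalctl -b -1 -p warning --no-pager | cap_lines 5000 | eos-sendlog
```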
Something is definitely wrong with your system. That amount of journal entries for a single boot might indicate something spamming the journal or something consistently failing. Maybe look at the journal yourself to see if you can spot what’s going on?
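A quick way to see what is spamming the journal is to count identical messages. This is only a sketch; top_messages is a made-up helper name. It relies on journalctl's -o cat output mode, which prints the bare message text without timestamps, so repeated messages collapse into one counted line.

```shell
#!/usr/bin/env bash
# top_messages (hypothetical helper): count identical lines on stdin and
# print the N most frequent ones, most frequent first (default N = 10).
top_messages() {
    sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Usage: the ten most repeated messages of the previous boot:
#   journalctl -b -1 -o cat --no-pager | top_messages 10
```

Messages that embed changing values (addresses, counters) will not group together, so the counts are a lower bound.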
Running out of memory (and/or filling tmpfs locations) would definitely be a reason for applications to fail to open.
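Both conditions can be checked from a terminal or a TTY while the problem is happening. A minimal sketch follows; mem_available_percent is a made-up helper that reads the MemTotal and MemAvailable fields of /proc/meminfo, and the optional file argument exists only so the function can be tried on a saved copy.

```shell
#!/usr/bin/env bash
# mem_available_percent (hypothetical helper): print the percentage of RAM
# the kernel reports as still available, read from a meminfo-style file
# (defaults to /proc/meminfo).
mem_available_percent() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Usage:
#   mem_available_percent     # percentage of RAM still available
#   df -h -t tmpfs            # usage of every mounted tmpfs location
```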
I think I am currently experiencing three different issues. The first is the messages that flood the journal, which are logged by AER while it corrects errors received from the Realtek RTL8821AE 802.11ac PCIe Wireless Network Adapter; the second is the one I described here; the third is that, apparently, the journal service is sometimes unable to stop or restart. (I keep seeing an error about that when I shut down or restart the computer.)
I checked the memory. Endeavour OS and the applications I run don’t use more than 3.3 GB (out of the 7.7 GB reported by Mate System Monitor).
I also checked the space used by tmpfs locations. They don’t use more than 12 MB.
I did it just now, since yesterday I was heading to bed. I mean, I added the kernel parameter. (I didn’t reinstall Endeavour OS on this computer. I reinstalled it on the laptop.)
I edited the /etc/default/grub file, ran sudo grub-mkconfig -o /boot/grub/grub.cfg, and rebooted. I checked the journal and I didn’t see any new entries reporting errors.
So far, I haven’t had that issue again. The last time it happened, I was adding pci=noaer to the kernel parameters, on September 15/16.
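For reference, the edit can also be scripted. The sketch below is an illustration only: add_kernel_param is a made-up helper, it assumes the GRUB_CMDLINE_LINUX_DEFAULT line in the file is wrapped in double quotes, and it uses GNU sed's -i option. Parameters containing |, & or backslashes would need escaping first, and it is worth backing up the file before touching it.

```shell
#!/usr/bin/env bash
# add_kernel_param (hypothetical helper): append a parameter to the
# GRUB_CMDLINE_LINUX_DEFAULT line of a grub defaults file.
# Assumes the value is wrapped in double quotes.
add_kernel_param() {
    local param="$1" file="${2:-/etc/default/grub}"
    # Leave the file alone if the parameter is already on that line.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    # Insert " <param>" just before the closing double quote.
    sed -i "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file"
}

# Usage (as root), followed by the regeneration step from the post:
#   cp /etc/default/grub /etc/default/grub.bak
#   add_kernel_param pci=noaer
#   grub-mkconfig -o /boot/grub/grub.cfg
```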
After I added acpi_osi='Windows 2020' to the kernel parameters, I haven’t been hit by the issue.
I checked the following post to see whether they had fixed a bug that could cause the issue I am experiencing. (I know, the fact that I was using Firefox when I was not able to open new applications is just a coincidence.)
I only found this note, which doesn’t match my case, as Firefox was simply not able to connect to sites. Also, Firefox 104.0.2 was released on September 6; if the bug they fixed had been causing the issue I reported, I should not have had that issue on September 15/16.
Fixed an issue causing some users to crash in out-of-memory conditions (bug 1774155).
This is and was confusing to me. That’s why I asked if you reinstalled. So you originally had the flood of error messages and I asked you to try pci=noaer. This got rid of the flood of PCIe-related error messages, which was causing Firefox and other apps to not function properly. Did this kernel parameter alleviate the issue?
Then I said maybe you should try acpi_osi="Windows 2020"
Was it already working properly after pci=noaer, or was it the acpi_osi="Windows 2020" kernel parameter that made the difference? If it is working fine, then I would leave both.
The last time I had this issue was after I added pci=noaer to the /etc/default/grub file, but before sudo grub-mkconfig -o /boot/grub/grub.cfg had returned and I could reboot the computer. That happened on September 16.
On September 17, I added acpi_osi='Windows 2020' to the /etc/default/grub file, ran sudo grub-mkconfig -o /boot/grub/grub.cfg, and rebooted the computer.
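To confirm which parameters the running kernel actually received after a reboot, /proc/cmdline can be checked. A small sketch, with has_kernel_param as a made-up helper name: it matches whole space-separated tokens, so it works for pci=noaer. A value that contains a space, such as the acpi_osi one, may appear with different quoting on the command line, so for that one it is safer to read the output of cat /proc/cmdline by eye.

```shell
#!/usr/bin/env bash
# has_kernel_param (hypothetical helper): succeed if the given parameter
# appears as a whole token on the kernel command line.
# The optional second argument is a file to read instead of /proc/cmdline.
has_kernel_param() {
    local param="$1" file="${2:-/proc/cmdline}"
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}

# Usage:
#   has_kernel_param pci=noaer && echo "pci=noaer is active"
#   cat /proc/cmdline
```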
The issue I described doesn’t happen every time I use the computer. It could be a week before it happens again, or it could happen after two days. That’s why I said I am waiting for it to happen again.