06-13-2024, 07:26 AM
(This post was last modified: 07-02-2024, 12:34 PM by Tim Curtis. Edit Reason: Marked as solved)
Hi Folks,
There's quite a tale here, so please bear with me. The tl;dr is that moOde 9 on a Pi3A+ becomes catatonic, probably because of something in the wifi.
I upgraded my Pi Zero W to a Pi3A+ in preparation for the 64-bit-only moOde 9 series. Even back on 8.3.9 I ran into wifi issues: the syslog would periodically show "wlan0 disconnected" and the player would vanish off the network for a bit. wlan0 would then re-attach, but the mDNS entries wouldn't work until the wifi connection on the client end was reset. After a lot of troubleshooting (new power supply, new SD card, re-flashing, minimal config, etc.), including returning the Pi and getting a new one to rule out hardware, I traced that problem to my router scanning for interference and changing channel. Once I turned that "feature" off, it became stable. The Pi Zero W in the same situation never saw that problem, nor does the Pi4B+ I have on the same wifi network. I conclude that the wifi chip in the 3 must be a different beast.
Now, on to moOde 9. I find that the 3 will go into a catatonic state, I think mostly when there is a good deal of two-way network traffic (scanning the library, swapping from screen to screen while configuring, and so on). By catatonic I mean that the UI will swap happily to pages that have already been loaded (I assume using the browser cache) but will show either the spinner or the "reconnect" message indefinitely elsewhere. CTRL+F5 gives a "website not found" message in this state. When connected via SSH the catatonia is more total. The thing just stops. No input is recognised and no commands run; if I'm tailing a log, it simply shows the entry from when the freeze began and never any more. If I start a new SSH session in this state using the mDNS host name, the window will sit waiting for the password prompt forever. I've waited several hours and seen no response. SSH using the IP results in "no route to host". My router reports that the device is still attached.
This is why I call it catatonic: it is as if all the devices on the network think the Pi is "there" in some sense. They don't even say it isn't responding. There are no "busy" type messages and no "connection lost" messages; they just wait patiently for a communication that never comes and never times out. The only thing that wakes it all up is to pull the power and start the Pi3 up again. If I do this while leaving the SSH sessions or the UI connected, they remain unresponsive until the boot completes, at which point the UI refreshes and the SSH sessions report a broken pipe and disconnect.
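For anyone wanting to pin down when the Pi drops off, the symptoms above (hang vs. "no route to host") can be told apart from a client machine with something like the following. This is only a rough sketch, not a diagnosis: the function name probe and its result labels are made up for illustration, and it assumes SSH is listening on the default port 22.

```python
import errno
import socket


def probe(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Make one TCP connection attempt and classify how it ended.

    The labels returned here are illustrative, not any standard.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        # The host name (e.g. an mDNS .local name) could not be resolved.
        return "name-resolution-failed"
    except socket.timeout:
        # Nothing answered within `timeout` seconds.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            # What the ssh client reports as "no route to host".
            return "no-route"
        return f"error: {exc}"
```

Calling probe() every few seconds against both the host name and the IP address, and printing the result with a timestamp, should show the moment the answers change from "open" to something else.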
Given my experience with the "smart" feature on my ISP router (VirginMedia Hub 3 for those who know/care), I decided to finally buy my own (a Linksys M2000) thinking that would sort it out. No joy, exactly the same (although everything else on my network is a LOT happier).
What do the logs have to say? Pretty much nothing. There's no syslog anymore, obviously, but tailing the journal (journalctl -f) shows no smoking gun. The log just stops. The final message is the last of whatever it did before, often something to do with time sync or log cleanup. After a reboot, the first entries in the log carry the time and date of the last freeze, and only once the clock syncs with the network do the timestamps jump to the correct time. I don't know if that's at all relevant, other than that it shows just how little was going on in the Pi when it froze.
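Since the timestamps jump once the clock syncs, the freeze can be located by looking for large gaps between consecutive journal entries. Below is a minimal sketch of that idea. It assumes the journal was exported with ISO timestamps (journalctl -o short-iso --no-pager); the names parse_ts and find_gaps and the five-minute threshold are my own choices for illustration.

```python
from datetime import datetime, timedelta


def parse_ts(line: str):
    """Return the leading short-iso timestamp of a journal line, or None."""
    token = line.split(" ", 1)[0]
    try:
        return datetime.strptime(token, "%Y-%m-%dT%H:%M:%S%z")
    except ValueError:
        # Boot separators, blank lines and continuation lines land here.
        return None


def find_gaps(lines, threshold=timedelta(minutes=5)):
    """Return (before, after) timestamp pairs that are more than
    `threshold` apart. Lines without a timestamp are skipped."""
    gaps = []
    prev = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue
        if prev is not None and ts - prev > threshold:
            gaps.append((prev, ts))
        prev = ts
    return gaps
```

Feeding the exported journal lines to find_gaps() gives, for each gap, the last timestamp before the jump and the first one after it. The first of the pair is roughly when the Pi last wrote anything.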
I've totally drawn a blank on how to diagnose this further. I even installed Wireshark, but ran out of talent on first launch (wow, I do not understand networks...). I know it's not something in moOde; it must be related to the Bookworm underpinning, but I've not found a way to trigger the issue using just Raspberry Pi OS. I take heart from the fact that others have noticed a difference in network behaviour on Pi3 models and hope I'm not alone in this. Has anyone else seen Pi3 wifi instability, or does anyone have any idea where I can look for a clue as to the cause?
----------------
Robert