Ok, so here are my stats and some musings from several days of messing about.
This is a completely clean system with no configuration other than adding the SMB share. With the system idle, pings to the NAS (claudia.local) look like this:
Code:
--- claudia.local ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 99151ms
rtt min/avg/max/mdev = 2.331/5.264/10.978/1.334 ms
The network stats look like this:
Code:
master@testbed:~ $ iwconfig
lo no wireless extensions.
wlan0 IEEE 802.11 ESSID:"TeawithRuby"
Mode:Managed Frequency:2.412 GHz Access Point: xx:xx:xx:xx:xx:xx
Bit Rate=72.2 Mb/s Tx-Power=31 dBm
Retry short limit:7 RTS thr:off Fragment thr:off
Power Management:off
Link Quality=49/70 Signal level=-61 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0
master@testbed:~ $ ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 493 bytes 42733 (41.7 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 493 bytes 42733 (41.7 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.73 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 fe80::e37c:6082:21f3:16fe prefixlen 64 scopeid 0x20<link>
ether b8:27:eb:39:c6:5a txqueuelen 1000 (Ethernet)
RX packets 10615 bytes 4576639 (4.3 MiB)
RX errors 0 dropped 4 overruns 0 frame 0
TX packets 3800 bytes 4264896 (4.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
So far so good, now we kick off a scan and ping looks like this:
Code:
--- claudia.local ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 99462ms
rtt min/avg/max/mdev = 1.398/10.543/61.529/10.859 ms
The average doubled, but there's no packet loss or anything. I'd expect it to be slower; there's more going on on the network, after all.
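For anyone wanting to compare the two ping runs side by side, here is a minimal Python sketch. The two summary lines are copied from the outputs above; the helper names `parse_rtt` and `ratios` are just made up for illustration.

```python
import re

# Summary lines copied from the idle and during-scan ping runs above.
IDLE = "rtt min/avg/max/mdev = 2.331/5.264/10.978/1.334 ms"
SCAN = "rtt min/avg/max/mdev = 1.398/10.543/61.529/10.859 ms"

def parse_rtt(line):
    """Return min/avg/max/mdev (in ms) from a ping rtt summary line."""
    m = re.search(r"= ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms", line)
    if m is None:
        raise ValueError("not a ping rtt summary line")
    return dict(zip(("min", "avg", "max", "mdev"), map(float, m.groups())))

def ratios(before, after):
    """How many times larger each figure is in the second run."""
    return {key: after[key] / before[key] for key in before}

idle = parse_rtt(IDLE)
scan = parse_rtt(SCAN)
for name, factor in ratios(idle, scan).items():
    print(f"{name}: {idle[name]:.3f} -> {scan[name]:.3f} ms ({factor:.1f}x)")
```

On these numbers the average is about 2x, while max and mdev grow by more (roughly 5.6x and 8.1x), so the scan adds jitter more than it adds steady latency.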
The network stats don't really look any different either. This is from after the scan had run for a minute or so and I'd run 100 pings:
Code:
master@testbed:~ $ iwconfig
lo no wireless extensions.
wlan0 IEEE 802.11 ESSID:"TeawithRuby"
Mode:Managed Frequency:2.412 GHz Access Point: xx:xx:xx:xx:xx:xx
Bit Rate=65 Mb/s Tx-Power=31 dBm
Retry short limit:7 RTS thr:off Fragment thr:off
Power Management:off
Link Quality=50/70 Signal level=-60 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0
master@testbed:~ $ ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 1320 bytes 103290 (100.8 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1320 bytes 103290 (100.8 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.73 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 fe80::e37c:6082:21f3:16fe prefixlen 64 scopeid 0x20<link>
ether b8:27:eb:39:c6:5a txqueuelen 1000 (Ethernet)
RX packets 292022 bytes 415853125 (396.5 MiB)
RX errors 0 dropped 6 overruns 0 frame 0
TX packets 108325 bytes 13774061 (13.1 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
I didn't have anything playing and wasn't touching the UI at all during this run of the test. Every so often while the scan was running I checked the mpd log for activity and re-ran the network commands. This last capture was taken seconds before the system froze:
Code:
master@testbed:~ $ ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 2810 bytes 211474 (206.5 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 2810 bytes 211474 (206.5 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
wlan0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.73 netmask 255.255.255.0 broadcast 192.168.1.255
inet6 fe80::e37c:6082:21f3:16fe prefixlen 64 scopeid 0x20<link>
ether b8:27:eb:39:c6:5a txqueuelen 1000 (Ethernet)
RX packets 1076258 bytes 1574774233 (1.4 GiB)
RX errors 0 dropped 7 overruns 0 frame 0
TX packets 373034 bytes 36784306 (35.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
Again, nothing is shouting out at me. The irritating thing is that even if there were a smoking gun in these stats when the issue hits, I couldn't see it because of the frozen state.
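One possible way around the frozen-state problem is to write the counters to disk every second, so the last sample before the freeze is still in the file after a power cycle. This is only a sketch: it assumes the usual Linux `/proc/net/dev` layout (the same counters `ifconfig` shows), and the names `read_counters` and `log_forever` and the log path are hypothetical. The `fsync` call asks the kernel to push each line out to storage, though a hard freeze could still lose the final sample.

```python
import os
import time

def read_counters(iface, text):
    """Pick the rx/tx counters for one interface out of /proc/net/dev text."""
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == iface:
            f = [int(x) for x in rest.split()]
            # Columns: 8 receive fields followed by 8 transmit fields.
            return {
                "rx_bytes": f[0], "rx_packets": f[1], "rx_errs": f[2], "rx_drop": f[3],
                "tx_bytes": f[8], "tx_packets": f[9], "tx_errs": f[10], "tx_drop": f[11],
            }
    raise KeyError(iface)

def log_forever(path, iface="wlan0", interval=1.0):
    """Append one timestamped sample per interval, forcing each line to disk."""
    with open(path, "a") as out:
        while True:
            with open("/proc/net/dev") as src:
                counters = read_counters(iface, src.read())
            fields = " ".join(f"{k}={v}" for k, v in counters.items())
            out.write(f"{time.time():.0f} {fields}\n")
            out.flush()
            os.fsync(out.fileno())  # best effort to get the line onto storage
            time.sleep(interval)
```

Started before the scan with something like `log_forever("/home/master/netlog.txt")` (path is a placeholder), the tail of the file after a reboot would show whether drops or errors jumped just before the lockup.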
Other things I have tried: thinking it might be mDNS related somehow, I mounted the NAS using the IP reserved for it in the router's DHCP. Same behaviour. The share never shows up in the "Scan" in the moode configuration screen (it never has on any version), so something isn't 100% right in how the share is exposed on the network. It does always mount when the path is added manually, though, and it works happily on my Pi4 9.x devices (one ethernet, one wifi) and worked fine on all 8.x and earlier versions (back to about 6, when I started). I've used some tools on my Android phone to check the mDNS broadcast, and _smb._tcp scans find the NAS correctly. Maybe there's mileage in running the moode scan code manually in case that sheds some light? But then, mDNS is probably not the problem since it freezes with the IP as well. I'm clutching at straws.
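To separate name resolution from the SMB service itself, a small check like the following could be run from the Pi against both the `.local` name and the reserved IP. It is a rough sketch, not the moode scan code: it only tests whether a TCP connection to port 445 (the standard SMB port) opens, and the function name `smb_port_open` is made up here.

```python
import socket
import time

def smb_port_open(host, port=445, timeout=3.0):
    """Return (ok, seconds): did a TCP connect succeed, and how long did it
    take including any name lookup."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:  # covers lookup failures, refusals and timeouts
        ok = False
    return ok, time.monotonic() - start
```

Calling `smb_port_open("claudia.local")` and then the same with the NAS's reserved IP would show whether the name lookup adds delay or fails outright. If both connect quickly, that would be one more hint that mDNS is not the culprit.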
One thing I can be pretty certain of is that it is related to the SMB share on the NAS. If I don't mount, scan or play from the NAS, the system is fine. The switching about in the UI or between stations was coincidental; just a scan will kick off the issue most times. On the rare occasion that the scan doesn't freeze, playing from the NAS will eventually trigger it. I've listened to the radio for hours with no issue. I suspect it must be something the Synology does at its end that is the trigger, but I'm at a loss to work out what that might be. I've turned off the disc power management just in case, so the discs never sleep. That was a long shot, since why would the discs sleep while a scan was underway? Anyway, it achieved nothing.
Next payday, I'll get a Zero 2 and see if that plays ball, and maybe try again every time there's a kernel update just in case. I can't see any way to get any useful diagnostics for anyone to look at, so I'll thank you for your time and move on, I think. The Pi3 will have to find another job.