Local display function fails with HDMI in moOde 7.0.1
#1
Shortly after Christmas, Alain reported a strange behavior in his moOde 7.0.1 player (initially, posts #37, #84) in which it became unresponsive after a period of time. These posts and the subsequent interchange established it happens only when he enables moOde's local display with a small HDMI LCD display attached and is a consequence of a Chromium process consuming available memory. He reported he does not see the same issue with moOde 6.7.1 installed on the same hardware.

I spent some time over the weekend trying to pin down the mechanism using an RPi3B+; deadly dull work because it takes hours for the fail to occur on my system after making any change and rebooting.

1) I confirmed that the behavior happens with moOde 7.0.1 and that it doesn't happen with moOde 6.7.1. Note that between these two releases, the underlying Raspberry Pi OS went from 10.4 to 10.6 and the installed Chromium went from 78.0.3904.108 to 84.0.4147.141 whereas AFAIK there was no significant change to moOde's implementation of the local display.

2) I confirmed that the behavior happens only when the HDMI output is involved, even if no HDMI display is connected to the RPi. No such behavior occurs with the Raspberry Pi 7" Touch Display, which is also driven by moOde's local display function but which is not an HDMI display (it uses the DSI port). As an aside, I believe this is why the test team didn't notice the behavior until the issue was raised. AFAIK none of us routinely uses an HDMI local display.

3) The startup of the local display goes like this:

a) When the local display function is enabled, xinit reads the commands in /home/pi/.xinitrc and starts an X11 server running an instance of the Chromium browser with a set of suitable options.


b) The Chromium process in turn spawns 8 more Chromium processes, which perform various functions and communicate with each other via shared memory.
    - One of these Chromium processes is type=renderer, started with a number of options established by the primary process (i.e., not set by moOde).

4) I wrote a simple bash script which logs data I considered significant, sleeps for 60 seconds, then repeats.

Note - all elapsed times reported in the following are approximate

5) While the moOde player sits idle with the local display enabled, this script reports that for roughly 25 minutes of CPU time (about 4 hours of wall-clock time on an RPi3B+ with other processes competing for CPU cycles) everything is pretty stable. The free memory in the system is ca. 260 MB and the VmData segment in the type=renderer process is ca. 170 MB.

6) Then I begin to see the VmData segment in the type=renderer process grow and the free memory in the system fall. The process grabs about 20 MB/min. The oom_score of this process begins to climb.

7) after about 10 min (wall clock time) of this, the system starts reducing the amount of memory available for buffers and cache in order to keep feeding this process.

8) after about another 20 min (wall clock), so much memory has been consumed that the system is struggling to run other processes. The 60-sec sleep cycle of my script begins stretching out...2 min...20 min... while IO activity begins to climb as the system tries to move pages of memory to make room.

9) the status of the type=renderer process changes from a rhythmic S/R (sleep/run) cycle to a "D" status which is usually called uninterruptible sleep.

10) Time stretches seemingly to infinity. The type=renderer process eventually becomes a zombie (maybe the OOM killer got it, but I can't speak to this). Normal memory distribution is restored. At some point, the process is removed from the process table. Somewhere in here, syslog errors are thrown concerning blocked kworker tasks, mmc_rescan, etc.

11) Normal function of moOde is restored at some point, albeit without the local display because its renderer is no more. My logging doesn't say when, but I suppose when the memory is redistributed.
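The logger from step 4 was not posted, so what follows is only a minimal sketch of how such a script might collect the values discussed in steps 5 through 9. It assumes a Linux /proc filesystem; the function names (proc_field, log_sample) and the log path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for a once-a-minute logger. These names are
# illustrative and are not taken from the script described in this post.

# Print the value of one "Key: value" field from a /proc-style file,
# e.g. VmData or State from /proc/<pid>/status, MemFree from /proc/meminfo.
# $1 = field name without the colon, $2 = path to the file
proc_field() {
    awk -v key="$1:" '$1 == key { print $2; exit }' "$2"
}

# Emit one timestamped log line for the process with the given PID.
# $1 = PID of the process to watch (e.g. the type=renderer process)
log_sample() {
    pid="$1"
    printf '%s pid=%s state=%s VmData_kB=%s MemFree_kB=%s oom_score=%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$pid" \
        "$(proc_field State "/proc/$pid/status")" \
        "$(proc_field VmData "/proc/$pid/status")" \
        "$(proc_field MemFree /proc/meminfo)" \
        "$(cat "/proc/$pid/oom_score")"
}

# Example use (not run here): pick the renderer PID, then sample every 60 s.
#   pid=$(pgrep -f -- '--type=renderer' | head -n 1)
#   while :; do log_sample "$pid" >> /tmp/renderer.log; sleep 60; done
```

Reading /proc directly keeps each sample cheap, which matters once the system is already short of memory and every extra process start is slow.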

What's not happening? This is not thread exhaustion - no new processes or threads are thrown. This is not shared memory exhaustion - the size of /dev/shm does not expand.
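A minimal sketch of how these two checks could be made, assuming a Linux /proc filesystem and a POSIX df. The function names are hypothetical, and this is not necessarily how the observations above were obtained.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative only.

# Print the thread count from a /proc/<pid>/status-style file.
# $1 = path to the status file
thread_count() {
    awk '$1 == "Threads:" { print $2; exit }' "$1"
}

# Print the used space, in kB, of the filesystem holding the given path.
# df -P gives the portable layout; the third column of line 2 is "Used".
# $1 = path, e.g. /dev/shm
fs_used_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# Example use (not run here):
#   thread_count "/proc/$pid/status"
#   fs_used_kb /dev/shm
```

If both numbers stay flat from sample to sample while VmData climbs, that supports ruling out thread and shared-memory exhaustion.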

Incidentally, since in one post Alain mentioned he modified an hdmi setting in /boot/config.txt, I should note that even clearing all the hdmi settings on my installation did not change the issue. Neither is this issue caused by some library limitation.



Root cause?

So, the issue would seem likely due to a change in Chromium which occurred between v78 and v84, although I can't begin to interpret the commits to its code base which occurred. Less likely but I suppose also possible, it's due to a change in Raspberry Pi OS. I can't explain, for example, why using the DSI interface instead of the HDMI interface would cause a difference, and this certainly is down at the OS level.

Internet searches turned up only a very few hits on some of the items I report above and none of the hits has revealed useful information. Published lists of unpublished Chromium options don't offer anything which appears to me to be useful. I don't know enough about the Chromium developer community to know how to go about asking for enlightenment. The Chromium user forums are full of unanswered questions. It still might be we are somehow misusing Chromium, but if there really is an issue then the devs ought to know about it.



Possible remediation?

1) I suppose the moOde watchdog script could be augmented to monitor some key data point (vmstat output, for example) and trigger a response when needed. Possibilities which come to mind:
a) simply restart xinit. This is pretty brutal and would cause the same initial screen-flash sequence we see when we start the local display. I consider it very distracting.
b) if this behavior truly occurs only when Chromium remains idle, then contrive to give it something innocuous to do from time to time. I like this better than a) but I haven't yet determined what might do the trick. Again, any screen activity would be distracting.
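As a rough illustration of the monitoring idea in 1), here is a sketch of a low-memory test a watchdog could call. It reads MemAvailable from /proc/meminfo rather than vmstat, which is a substitution made here for simplicity. The function name, the 100 MB threshold in the example, and the placeholder response are all assumptions and are not part of moOde's watchdog.

```shell
#!/bin/sh
# Hypothetical check; not taken from moOde's watchdog script.

# Succeed (exit status 0) when MemAvailable is below the threshold.
# $1 = threshold in kB
# $2 = meminfo-style file to read (defaults to /proc/meminfo)
low_memory() {
    awk -v limit="$1" '
        $1 == "MemAvailable:" { found = 1; low = ($2 + 0 < limit + 0) }
        END { if (found && low) exit 0; exit 1 }
    ' "${2:-/proc/meminfo}"
}

# Example use (not run here), with an assumed 100 MB threshold:
#   if low_memory 102400; then
#       echo "low memory: local display needs attention"   # placeholder response
#   fi
```

Whatever response is chosen, triggering it on available memory rather than on elapsed time would avoid restarting the display on systems that never show the leak.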

Other alternatives
2) put the Chromium processes in their own memory-limited cgroup. This might keep the rest of moOde running but one still has to deal with the OOM issue.
3) try to install an earlier version of the Chromium package to see if the issue resolves. I'm not interested in pursuing this approach.
4) try enabling swap on the RPi3B+ or moving to a larger-memory RPi4B (Chief Brody to Quint "you're gonna need a bigger boat!", Jaws 1975). This might ameliorate the situation, but it seems to me a very unsatisfactory solution.
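For alternative 2), an untested sketch of a precondition check. It assumes the kernel lists its cgroup controllers in /proc/cgroups and that, as I understand is the default on Raspberry Pi kernels, the memory controller is disabled unless cgroup_enable=memory is added to /boot/cmdline.txt. The function name and the 300M figure in the comment are illustrative assumptions.

```shell
#!/bin/sh
# Hypothetical helper; the name is illustrative only.

# Succeed (exit status 0) when the memory controller is enabled.
# /proc/cgroups columns: subsys_name hierarchy num_cgroups enabled
# $1 = cgroups-style file to read (defaults to /proc/cgroups)
memory_cgroup_enabled() {
    awk '$1 == "memory" { enabled = $4 }
         END { if (enabled == 1) exit 0; exit 1 }' "${1:-/proc/cgroups}"
}

# Example use (not run here). If the controller is enabled, the Chromium
# command from .xinitrc could be wrapped along these lines (assumed limit):
#   systemd-run --scope -p MemoryLimit=300M <chromium command from .xinitrc>
```

A limit like this would only contain the damage; the renderer would still be killed when it reaches the cap, so the display would still need restarting.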


Regards,
Kent
#2
Interesting. I can hook up an HDMI display to one of my systems and see if I can repro the memory consumption issue.

I'm not seeing any memory issues, though, on a 3B+ with the HDMI port ON but no display connected. This particular 3B+ runs the production release image with all default settings and is on 24/7. Below are memory stats from a few minutes ago.

Code:
pi@moode:~ $ free -h
             total        used        free      shared  buff/cache   available
Mem:          924Mi        73Mi       654Mi        12Mi       196Mi       784Mi
Swap:            0B          0B          0B
pi@moode:~ $
Enjoy the Music!
moodeaudio.org | Mastodon Feed | GitHub
#3
(01-04-2021, 05:41 PM)Tim Curtis Wrote: Interesting. I can hook up an HDMI display to one of my systems and see if I can repro the memory consumption issue.

I'm not seeing any memory issues, though, on a 3B+ with the HDMI port ON but no display connected. This particular 3B+ runs the production release image with all default settings and is on 24/7. Below are memory stats from a few minutes ago.

Code:
pi@moode:~ $ free -h
             total        used        free      shared  buff/cache   available
Mem:          924Mi        73Mi       654Mi        12Mi       196Mi       784Mi
Swap:            0B          0B          0B
pi@moode:~ $

With moOde's local display enabled? Long term, I see the same, but only after all the sturm und drang in the middle period.

Regards,
Kent
#4
Oops I forgot to turn on Local Display. I'll do that and see what shakes.
#5
Hi Guys,
When I play a song and then pause it, the player gets unresponsive in 20 minutes. If you don't play anything, you could wait hours before seeing anything.
#6
(01-04-2021, 06:10 PM)Alaini93 Wrote: Hi Guys,
When I play a song and then pause it, the player gets unresponsive in 20 minutes. If you don't play anything, you could wait hours before seeing anything.

You mentioned this before, but it's an aspect I was not able to repro.

Case in point, when I saw this reply, I started playing a track, paused one minute into it, and started my test logger again. It's been running over an hour without seeing any climb in memory usage (yet!).

Perhaps there's some other aspect of your configuration I'm not duplicating, but that's a mystery for another day. My first order of business is to deal with the case I can repro.

Regards,
Kent
#7
I'm not able to repro on either of my systems.

One is a 3B+ (Eth) with HDMI on, Local UI on but no display connected. The other is an Allo USBRidge SIG (CM3+ WiFi adapter) with HDMI on, Local UI on and an HDMI display connected and showing the UI.

Both systems sitting idle on a paused track for about 45 mins.

Memory looks normal.
Code:
pi@rp3:~ $ free -h
             total        used        free      shared  buff/cache   available
Mem:          924Mi       272Mi       249Mi        55Mi       403Mi       545Mi
Swap:            0B          0B          0B

pi@moode:~ $ free -h
             total        used        free      shared  buff/cache   available
Mem:          924Mi       198Mi       304Mi        26Mi       421Mi       644Mi
Swap:            0B          0B          0B
#8
I also tested both systems playing radio for about 10 mins to see if there was any oddness in memory usage, and there was none.
#9
If your tests turn out like mine, you have hours to go yet :)
#10
Memory usage stayed the same for an hour in my tests, which suggests (to me) that it's not suddenly going to go south once the time exceeds an hour. If that is what's happening, though, then we should see more reports of something like this, since many users have connected displays and leave their Pis on for many hours.

The other behavior I wasn't able to repro was "When playing a song and pause, it get unresponsive in 20 minutes".