Help with crash report

Dear all,

I’m struggling with frequent interpreter crashes while developing my WFSCollider-Class-Library and Unit-Lib Quarks. They provide a graphical composition interface, originally created for the Game of Life WFS system but also used for many other things. I’ve recently added a lot of new features, including work with Menu / MenuAction, and ironed out a lot of inconsistencies and bugs. However, especially when working on larger projects, the SC interpreter tends to crash quite often. This isn’t completely new, but it seems to be happening more now. I’ve looked at many possible causes (among other things, I found out that Menu doesn’t get garbage collected properly and remains in the QObject heap forever if you don’t explicitly destroy it, as do the MenuActions inside it). After a lot of cleaning up, though, the crashes still happen. I wonder if any of you who have a deeper understanding of macOS crash reports could take a look with me. I’ve posted one here:

Perhaps it is possible to spot where the issue may be from the crash report? To reproduce this crash you’d need to install the WFSCollider-Class-Library quark and basically just fool around in it for some time, quickly opening and closing windows, replacing UMaps/Units, etc. Or open a large project and hit the (i) button a couple of times, editing all selected objects in one very large window. In general I think it is Qt related, as it happens when opening and/or closing windows, but I’m a bit out of debugging options…

cheers & thanks,
Wouter

It seems to be a seg fault related to Qt.

I believe there is no one working right now on the qt side. If you’re on a big project right now, maybe the solution would be to avoid sclang qt for now. Maybe another kind of interface?

Hmm, that’s unfortunate. I don’t know of any other GUI scheme that could currently replace all the functionality we have in here, and besides that, the lib is very much fine-tuned to the graphical possibilities. In general it seems that if I just create a lot of views and then destroy them at some point, SC will crash when new views need to be created. It happens less quickly now that I properly destroy Menus, but still. The same happens with QImage (which I stopped using for this reason): after creating and showing 3 images or so, the interpreter crashes. It also sometimes crashes at recompile or quit. Due to recent changes I’m now creating more views in total (a lot of UserViews), and it seems that’s what is triggering these crashes. Some kind of memory fill-up/leak perhaps. Is there any prospect of anyone looking into the Qt stuff in the near future? Should it be updated to a more recent version? And in general, would a “bleeding edge” build of SC bring any change to this problem, or is the Qt stuff untouched since 3.13.0?

Yes, it sounds like a memory leak all right. You could also try running part of the GUI in a separate sclang process. Maybe that would make you feel safer during a concert, since not everything would crash at once.

Have you tried recent versions? Or running the interpreter with more memory?

The stable build is always recommended, of course. I never understand going to a stage with a recent unstable build

(GDB is a debugger that you can interact with; maybe it helps to get more hints.)

I haven’t tried recent builds yet, are there any that I can download somewhere?

About running the interpreter with more memory; I looked into that but couldn’t figure out how to. Is that possible somehow?

Looking at gcInfo etc. it doesn’t look very unusual btw, within sclang things seem to be cleaned up properly.

Several threads seem to be involved in garbage collection operations (PyrGC::Collect() and related functions). This might suggest a concurrency issue where an object was being collected or finalized while still in use… Are the more complex GUI behaviors you described (for instance, those going through QWidget::event(QEvent*) and QApplication::notify()) recent? Killing and restarting complex GUI stuff, etc.?

As for the memory, I think you may need to hard-code it in the C++ source and recompile, if I remember correctly.

Hopefully the C++ gurus are coming soon to correct me))))

In what I recently made there are two new things: the use of Menu / MenuAction, and in general creating (quite a lot) more Views. There is some complex behavior in the sense that most GUI elements use some form of MVC interaction with the objects they represent. This has been in there for a long time, but what could be new about it is that there are simply more of them now. I’ve noticed before that I need to be very careful with setting things in GUI elements that have just been closed, and I’ve eliminated most of those cases over time. But that doesn’t really seem to be what’s happening in this case, as it was mostly problematic with fast-changing parameters via MIDI etc. (which I’m not doing now during these tests).

The Menus did seem to cause issues; I’ve basically moved over from using PopUpMenu to my own creation based on a StaticText that creates a Menu when clicked. As this seemed to cause issues, I worked out that Menus aren’t garbage collected at all (contrary to what the docs say), and even if I manually do it by calling .destroy after use, the MenuActions remain in the QObject heap. I made a Menu:deepDestroy method for this and that solves that issue, and I’m also only creating the menus now when the StaticText is actually clicked. In the tests that I did, no actual Menus were created and still the interpreter crashes after opening a large window (with a lot of Views in it) a couple of times…
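
A minimal sketch of what such a helper could look like, written as a plain function rather than the actual Menu:deepDestroy method (which isn’t shown in this thread). It assumes that Menu:actions returns the MenuActions held by the menu; the name ~deepDestroyMenu is made up for illustration, and submenus are not handled:

```supercollider
(
// Hypothetical helper, not the WFSCollider implementation.
// Assumption: menu.actions returns the MenuActions the menu holds.
~deepDestroyMenu = { |menu|
	// destroy every action first, so none is left behind in the QObject heap
	menu.actions.do { |action| action.destroy };
	// then destroy the menu itself
	menu.destroy;
};
)
```

It would be called once the menu is no longer needed, e.g. `~deepDestroyMenu.(myMenu)`.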

Tricky thing is that I haven’t been able to reproduce the crashes with code outside the WFS lib. Creating 10.000 UserViews in a window over and over again doesn’t seem to cause any issues…
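
For reference, a stress test of this kind could look roughly like the sketch below. This is not the exact code used here; the number of rounds, the number of views, and the wait times are arbitrary assumptions:

```supercollider
(
// Hypothetical stress test: repeatedly fill a window with UserViews and close it.
{
	20.do {
		var win = Window("stress", Rect(100, 100, 800, 600)).front;
		// lay out 1000 small UserViews in a grid of 40 columns
		1000.do { |i|
			UserView(win, Rect((i % 40) * 20, (i div: 40) * 20, 18, 18))
				.background_(Color.rand);
		};
		0.5.wait;
		win.close; // closing the window removes all of its child views
		0.5.wait;
	};
	"stress test finished".postln;
}.fork(AppClock); // GUI calls have to run on AppClock
)
```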

Btw the WFSCollider-Class-Lib has a history to it: development started about 20 years ago and it’s grown ever since. It’s become a very useful tool for all kinds of things, especially spatial audio design. I sure hope we can make it behave in a (more) stable way again…

I’ve had similar problems removing views in TX Modular. After much trial and error, I now use this method instead of removing them directly:

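	// Hide a view immediately, then remove it after a delay instead of removing it directly.
	*deferRemoveView {arg holdView;
		if (holdView.notNil, {
			if (holdView.notClosed, {
				// take the view out of play right away
				holdView.focus(false);
				holdView.visible_(false);
				// re-check after 2 seconds, since the view may have been closed in the meantime
				{if (holdView.notNil and: {holdView.notClosed}, {holdView.remove})}.defer(2);
			});
		});
	}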

This may be over-cautious, but I now seldom get problems.
Hope that helps.
Paul

Sounds like a nice solution; I did something similar with the Menus in some places. Meanwhile I’ve located the memory leak, and it was in my code after all (I was deepCopying a Dictionary with Fonts and QPalettes in it where I only needed a shallow copy with one value changed). My crash test doesn’t crash anymore now :-). Thanks for the help and tips!
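
For anyone running into the same thing, the difference between the two kinds of copy can be sketched as follows. The dictionary contents and key names are made up for illustration and are not taken from the WFSCollider code; Font and Color simply stand in for whatever objects the dictionary holds:

```supercollider
(
// Hypothetical style dictionary; keys and values are placeholders.
var original = IdentityDictionary[
	\font -> Font("Helvetica", 12),
	\color -> Color.black
];

// Shallow copy: a new dictionary that shares the same value objects.
var shallow = original.copy;
// Deep copy: a new dictionary plus a fresh copy of every value inside it.
var deep = original.deepCopy;

// Changing one entry in the shallow copy leaves the original untouched.
shallow[\color] = Color.red;

(shallow[\font] === original[\font]).postln; // is the Font object shared?
(deep[\font] === original[\font]).postln;    // or was a new Font created?
(original[\color] == Color.black).postln;    // original entry after the change
)
```

If only one value needs to differ, the shallow copy avoids creating a new object for every entry each time the dictionary is copied.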

cheers,
Wouter

PS: do give WFSCollider a try, it’s had a complete overhaul in view of the hardware upgrade of our Game of Life WFS system. I’ll make a separate post in a while here and on socials to announce.

But it shouldn’t crash the interpreter, it’s still a bug, right?

Hi Wouter,

There seem to be a few memory leak issues that have come in.

Interestingly I’ve noticed a crash that seems to happen sometimes on clicking a RoundButton. Perhaps that’s a clue.

I’ve not had a moment to find a consistent reproducer. Do you have one, even if it’s complicated? Ie some code that consistently leads to a crash at the same point from a clean start?

@smoge or @TXMod do either of you have a consistent reproducer?

My experience has been in the past that when I am working quickly, pressing gui buttons that cause views to be removed and new views built (and maybe other objects created as well), sometimes - but NOT consistently - I seem to get crashes.
With the method posted before, by making the view invisible and removing it 2 secs later with defer, I don’t seem to get problems.
It didn’t seem to be linked to one particular action, but more about working fast.
Intuitively this feels like there can be some delay in Qt doing things behind the scenes, so you need to give it time to work before updating - but I’ve no concrete evidence (and no deep understanding of Qt).
Best,
Paul

Next time it happens, post the crash log here please.

Have been retesting my system now without the defer but I can’t reproduce the crashes.
Other things have also changed in the code in recent months, so it might have been something else that was causing it. It’s so hard to know with intermittent crashes, although they did seem to stop when I added defer.
Will test more in coming days and will post here if I get any more crashes.
Best,
Paul

I’m not familiar with GC (so please, if I’m talking nonsense, point me in the right direction), but after reading the last reports, especially Wouter’s one with many threads all showing the same problems… it seems that it could be one of the following:

  1. Something is overwhelming the garbage collector.
  2. Concurrency issues? Some timing aspect between garbage collection cycles and the instantiation of new GUI objects. If the garbage collector is activated just as resources are being allocated to a new GUI, there could be conflicts.
  3. Maybe the circular reference counting, in certain situations (concurrent? or large graphs?), is not able to complete the job, and the leftover data conflicts with the new allocations, which happen to have very similar sizes and could end up around the same place in memory.

The workarounds that DO WORK right now make the old object miss inclusion in the ‘free’ list, which suggests that a fix could be based on timing adjustments.

Is there a way to trigger the GC from the interpreter? This could be used to see what would happen changing some variants…

I don’t think those are likely, tbh, given how the GC works.

The GC does an incremental amount of collection on each allocation. It uses and re-uses pre-allocated blocks. I think the most likely explanations, in descending order of likelihood, are: 1. something in the Qt-related code, 2. a bug introduced into the GC code, 3. a bug in Qt itself. 1 and 2 are much more probable than 3.

Thank you. Would Wilson and Johnstone’s “Real-Time Non-Copying Garbage Collection” from a 1993 ACM workshop (and another one also cited in the SC book) still qualify as relevant documentation after all these years? I’ll try to find them.

I plan to write a blog post addressing the stark lack of comments in the SC source code, comparing it with the extensive documentation found in other language implementations (for example, GHC has more comment lines than code in many files), and analyzing the implications for each development model. In all my reading and studying, I don’t think I’ve ever seen a project of this size with so few comments. Still, I was able to contribute an improvement/bugfix to the scheduling-related code :sweat_smile: Probably luck, but my point is to think together about it.

I can’t remember what the relevant reference was, but that sounds right. There’s some discussion in the Writing Primitives file, as well as in the SC Language chapter of the SC Book.