It’s OK, don’t worry.
Thank you for trying to help.
Your help is very useful, thank you.
Here’s where I’m at right now:
I started SuperCollider as usual from the IDE.
In a terminal, I execute:
ps x | grep sclang
to get the pid. The command returns:
5147 ? SLl 0:31 /usr/local/bin/sclang -i scqt
6152 pts/1 S+ 0:00 grep --color=auto sclang
If I execute thisProcess.pid in the IDE, I get this in the post window:
5147
So I assume this is the pid.
Then in a terminal:
sudo gdb --args sclang 5147
Then I type continue at the gdb prompt and I get “The program is not being run.”
This is the whole output from gdb:
sudo gdb --args sclang 5147
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from sclang...
(gdb) continue
The program is not being run.
(gdb)
From there, if I quit the interpreter manually, nothing happens in gdb.
I start over from the beginning, but this time I execute run at the gdb prompt before (or after) continue, and I get this output:
(gdb) run
Starting program: /usr/local/bin/sclang 6554
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffe2a17640 (LWP 7403)]
[New Thread 0x7fffe2216640 (LWP 7404)]
compiling class library...
[New Thread 0x7fffe10f3640 (LWP 7405)]
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-root'
[New Thread 0x7fffd21ff640 (LWP 7406)]
[New Thread 0x7fffd18bd640 (LWP 7407)]
[New Thread 0x7fffd10bc640 (LWP 7408)]
[New Thread 0x7fffd08bb640 (LWP 7409)]
[New Thread 0x7fffc3bff640 (LWP 7410)]
[New Thread 0x7fffc33fe640 (LWP 7411)]
[New Thread 0x7fffc29fd640 (LWP 7412)]
[Detaching after fork from child process 7413]
[New Thread 0x7fffc0fff640 (LWP 7414)]
[New Thread 0x7fffaffff640 (LWP 7415)]
[Thread 0x7fffaffff640 (LWP 7415) exited]
[Thread 0x7fffc0fff640 (LWP 7414) exited]
[New Thread 0x7fffc0fff640 (LWP 7416)]
[7400:7400:1224/213415.664760:ERROR:zygote_host_impl_linux.cc(90)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.
[Thread 0x7fffc29fd640 (LWP 7412) exited]
[Thread 0x7fffc33fe640 (LWP 7411) exited]
[Thread 0x7fffc3bff640 (LWP 7410) exited]
[Thread 0x7fffd08bb640 (LWP 7409) exited]
[Thread 0x7fffd10bc640 (LWP 7408) exited]
[Thread 0x7fffd18bd640 (LWP 7407) exited]
[Thread 0x7fffd21ff640 (LWP 7406) exited]
terminate called without an active exception
Thread 1 "sclang" received signal SIGABRT, Aborted.
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140736997897216) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
I don’t know what to do from there.
Any help is very welcome
I tried another way:
I started SuperCollider as usual from the IDE.
I execute thisProcess.pid in the IDE, I get this in the post window:
8799
Then in a terminal:
sudo gdb -p 8799
This is the output of gdb:
sudo gdb -p 8799
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04.2) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 8799
[New LWP 8828]
[New LWP 8829]
[New LWP 8830]
[New LWP 8833]
[New LWP 8834]
[New LWP 8835]
[New LWP 8836]
[New LWP 8837]
[New LWP 8838]
[New LWP 8839]
[New LWP 8842]
[New LWP 8848]
[New LWP 8849]
[New LWP 8850]
[New LWP 8851]
[New LWP 8856]
[New LWP 8857]
[New LWP 8858]
[New LWP 8859]
[New LWP 8860]
[New LWP 8865]
[New LWP 8866]
[New LWP 8867]
[New LWP 8883]
[New LWP 8884]
[New LWP 8885]
[New LWP 9002]
[New LWP 9003]
[New LWP 9018]
[New LWP 9020]
[New LWP 9665]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007947a5d18c3f in __GI___poll (fds=0x79478c0053c0, nfds=4, timeout=95) at ../sysdeps/unix/sysv/linux/poll.c:29
29 ../sysdeps/unix/sysv/linux/poll.c: No such file or directory.
(gdb)
At the gdb prompt I type continue and get this:
(gdb) continue
Continuing.
[Thread 0x794758ff9640 (LWP 9665) exited]
From there, if I exit the interpreter manually, I get this in gdb:
[Thread 0x79471affd640 (LWP 9003) exited]
[Thread 0x794710ff9640 (LWP 9002) exited]
Thread 1 "sclang" received signal SIGTERM, Terminated.
__futex_abstimed_wait_common64 (private=128, cancel=true, abstime=0x0, op=265, expected=9018, futex_word=0x79471a7fc910) at ./nptl/futex-internal.c:57
57 ./nptl/futex-internal.c: No such file or directory.
Does it look better? Do you think I’m ready to debug sclang?
You’re using Git? I.e., you downloaded the sources with git clone? If so, execute either git log (extensive output) or git reflog (short output) from inside the repository directory. You will see a list of commit hashes with some description of each commit, e.g.:
$ git reflog
8d166ea06 (HEAD -> 3.14, origin/3.14) HEAD@{0}: pull: Fast-forward
20323d4db HEAD@{1}: pull: Fast-forward
4ac3a5899 HEAD@{2}: checkout: moving from 3.13 to 3.14
7fe0b680f (origin/3.13, 3.13) HEAD@{3}: checkout: moving from develop to 3.13
1c4ccdf3c (develop) HEAD@{4}: pull: Fast-forward
f70e312df HEAD@{5}: checkout: moving from 3.13 to develop
7fe0b680f (origin/3.13, 3.13) HEAD@{6}: checkout: moving from event_syntax_subtleties to 3.13
b2b8ad6cd HEAD@{7}: commit: set variable 'a' to nil before demonstrating the effect of using a variable as key in an Event
8059e86a8 HEAD@{8}: rebase (finish): returning to refs/heads/event_syntax_subtleties
...
… the topmost hash is the commit you’re on.
hth, Stefan
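If only the current commit is needed, a shorter route than reading the reflog is to ask Git for it directly. This is a minimal sketch, assuming it is run from inside the cloned repository; the function name `current_commit` is made up for illustration.

```shell
# Print the abbreviated hash of the commit that is currently checked out.
# `current_commit` is a hypothetical helper name, not part of Git.
current_commit() {
    git rev-parse --short HEAD
}

# Usage, from inside the repository directory:
#   current_commit
```

The printed hash should match the topmost entry of git reflog.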
Hi,
Thank you for this information. Yes, I downloaded the source with git clone.
Hello,
The exact commit of my build is 602ab7463:
git reflog
602ab7463 (HEAD -> develop, origin/develop, origin/HEAD) HEAD@{0}: clone: from https://github.com/supercollider/supercollider.git
The --args indicates that you wish to pass 5147 as an argument to sclang. That’s not correct. The intent here is for gdb to use 5147 to find the process to attach to.
So where I wrote in my post sudo gdb sclang xxx, the command that you should have issued is:
sudo gdb sclang 5147
I hadn’t forgotten to include --args; no need for correction. The instructions as given would have worked. However I’m glad at least that you found another command syntax that worked.
Yes, go ahead and continue debugging – wait for a crash and then do where in the gdb session.
(Also, yes, it’s easier to get the sclang pid from thisProcess.pid. However, in the ps output, the first word of the command string tells you which process is which.)
hjh
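To make the two steps above concrete (pick the sclang pid out of the ps listing, then attach to it rather than passing it as a program argument), here is a minimal sketch. The helper names `sclang_pid_from_ps` and `gdb_attach_command` are invented for this example. The gdb options it prints (`-p`, `-ex`) and the `set pagination off` command are standard gdb features. The second helper only prints the command so that it can be reviewed before being run.

```shell
# Read `ps x`-style output on stdin (columns: PID TTY STAT TIME COMMAND)
# and print the pid of the first process whose command is sclang.
# The `grep sclang` line is skipped because its command word is "grep".
sclang_pid_from_ps() {
    awk '$5 == "sclang" || $5 ~ /\/sclang$/ { print $1; exit }'
}

# Print (do not run) a gdb command line that attaches to the given pid,
# turns off the output pager, and lets the process continue.
gdb_attach_command() {
    printf 'sudo gdb -p %s -ex "set pagination off" -ex continue\n' "$1"
}

# Usage:
#   pid=$(ps x | sclang_pid_from_ps)
#   gdb_attach_command "$pid"
```

After a crash, typing where at the (gdb) prompt prints the backtrace; with pagination off, a long backtrace is not interrupted by pager prompts.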
Sorry for the confusion, I did indeed make a mistake in the command.
I will wait until the interpreter crashes again and will let you know.
Thank you for all your advice.
FWIW I have also experienced this behavior, seemingly at random, on 3.13. The IDE will refuse to compile, sometimes even just fail to open a file, until you “fix” the problem. How did I fix it? I don’t recall exactly, but whatever I did made absolutely no sense: for example, I deleted a comment and it was magically fixed, or something like that. Seems like a major bug. It happened again just now and I finally googled it, which brought me here. I thought maybe it was some weird ASCII character the IDE doesn’t like? Maybe it somehow got into one of my comments, or came in when I copy/pasted something?
Thanks for the report.
On its surface, this looks like a different issue, so I’d like to confirm something before perhaps splitting the topic.
Have you observed sclang crashing? That is, things are going fine and then “interpreter stopped forcefully”? If it’s something other than “stopped forcefully,” then it isn’t a crash (which would make it a different issue).
There are a few known issues in the way that the IDE parses a document into tokens. One of those is that a syntax error in an earlier part of a document causes the IDE to fail to recognize later regions (blocks in parentheses).
// Oops, didn't close a paren here
((1+1).postln;
// now this block can't be run with one key command
(
var a = 10.rand;
a * 5
)
I suspect that your issue is more like this, in which case the current topic (using a C++ debugger to identify the cause of a sclang crash) is irrelevant to you.
hjh
It wasn’t a syntax issue; if that were the case I would never have said anything. It was clearly the interpreter going haywire, inexplicably, because of something in the file or in how it was treating the file. What’s particularly diabolical about this bug is that it won’t surface in your current server session. In my experience, anyway, it only surfaces after you shut down the server and boot it up again, say, the next day, having noticed no problem in your previous session. By then you could have hours of work from that session, and you’re now locked out of loading your file with the error “interpreter has crashed or stopped forcefully” until you fix it. If you can’t figure it out, you’ll have to go back to a backup file, and you’ve just lost all your work from the previous session.
Thankfully the most I stood to lose was maybe 30-40 minutes of work but I managed to resolve the issue anyway. I went back to my notes and I had marked down that this was solved by deleting out a comment block at the top of the file and copy/pasting the working code underneath it (the code was GUI-related, FWIW) from an older version of the file. There was no syntax difference.
I see – this wasn’t clear from your first post, because that post makes references to the IDE refusing to compile or failing to open a file (where “compiling” could depend on the boundaries of the code block that were sent over to the interpreter [which boundaries are determined by the IDE], and where the IDE opening a file has nothing to do with the interpreter at all). Framing the issue initially in terms of the IDE is naturally going to make it seem like a different issue.
“… you’re now locked out of loading your file with the error ‘interpreter has crashed or stopped forcefully’ until you fix it…”
When and how is the file being loaded? For that matter, what type of file is it? .scd interactive code will be submitted to the interpreter in one way, after sclang initialization, while a .sc class definition will be compiled a different way (during startup).
“… it won’t surface in your current server session…”
It’s also worth nailing down what you mean by a server session. For instance, I almost never reboot the server without also recompiling the class library (resetting the language). Why? Because, during a session, I create a lot of objects that are pointing to server resources. If I reboot the server, then all of those objects are invalid. At least for my use cases, it’s simpler to just blow away everything in the language and start over, rather than trying to release only the invalid objects. So in my usage habits, I have no concept of a server session that is distinct from a language session. (And I tend not to leave SC open during system sleep.)
(But I could imagine other scenarios where there might be a large data set in sclang memory that’s expensive to recompute, in which case it would make sense not to blow that away.)
Basically… to proceed, we need to understand a likely sequence of events where the issue occurs. I do understand that it’s sporadic and that this might not be easy to write out, but at this time, I have zero idea of your usage pattern. It’s clear to you, but it isn’t clear to readers: “… after you shut down the server and boot it up again…” Is it really only the server that’s rebooting? Or does this include recompiling the class library? Rebooting the interpreter? Relaunching the IDE? At what point in whichever process is sclang crashing? (And the above questions, about what type of file it is, how/when it’s being “compiled” etc.)
Do you have a broken file that can reproduce the issue, so that someone else can test it?
hjh
Next time it happens I’ll try to preserve the file. It’s just a plain Windows setup: double-clicking an .scd file to open it, not using any custom classes or .sc files. Sorry about being unclear. When I said shut down the server and boot it up again, I meant just closing the IDE and reopening it, I guess. When it happened I didn’t use ‘quit server’ / reboot server / kill all servers, it was just opening it the next day. It seems the problem doesn’t surface until the current server process is killed and a new one started, and the memory allocations / startup parameters are reinitialized / reset or whatever. Whatever happens during that process is somehow leading to the file not being able to be read, because it doesn’t happen in the current running server.
I’m not the most efficient programmer, and that might lead to bloated data structures or unnecessary memory use; maybe that has something to do with it. When I have my current program set up it’s using around 2 GB depending on the samples I have loaded. I do have one particular GUI panel that takes 2-3 minutes to load, and if I kill all the nodes/threads (using ‘ctrl + .’) I’ll leave that panel open so I don’t have to sit and wait, but everything else I close and reopen. I’m not sure what the difference is on the backend between that GUI panel’s objects staying in memory and killing all the nodes, other than one being used for visuals and the other for actual audio, but I’m pretty sure the server instance stays the same.
In that case, there’s one possibility that could explain why document contents could take down the language at startup.
In SC code, you can access contents of an IDE document using the .string or .selectedString methods. These use a “text mirror” in sclang memory. (Why not just get the contents from the IDE? Because the IDE is a separate process – so we’d have to send a request and wait for a reply, making the programming interface more complicated, more like aControlBus.get.)
When the IDE starts up, it sends a command to sclang with every document’s contents, path, selection etc. From this, the text mirror gets reset.
There could be some weird combination of bytes at this stage that causes something to break. Probably the best way to find out what that might be is to have access to a document that causes the problem. (Here… we’ve observed that the IDE ↔ language communication is more fragile in Windows than in Mac or Linux. Not yet fully understood why that is.)
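One cheap check while waiting for a reproducing file is to scan a suspect document for bytes outside plain ASCII, such as characters pasted in from a web page. This is only a sketch: `find_non_ascii` is a made-up helper name, it assumes a POSIX-style shell with grep (on Windows, something like Git Bash or WSL), and finding such bytes would not by itself prove that they cause the crash.

```shell
# Print every line (with its line number) that contains a byte outside
# printable ASCII and ordinary whitespace. LC_ALL=C makes grep treat the
# file as raw bytes; -a makes it process the file as text even if it
# looks binary. Returns non-zero when no such line is found.
find_non_ascii() {
    LC_ALL=C grep -a -n '[^[:print:][:space:]]' "$1"
}

# Usage (the file name is a placeholder):
#   find_non_ascii my_piece.scd
```

Any line it reports is a candidate to retype by hand and then retest.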
For future, to keep terminology clear – it’s helpful to think of SC in terms of three nested layers: SC IDE = front end; the IDE launches (and owns) the interpreter process called sclang; then sclang launches (and owns) the server process.
When you close the IDE, this causes the subordinate process (sclang) to quit as well. Quitting sclang also quits the server.
The problem you’re describing seems to be between the IDE and language entirely. The server is not involved at these levels. I’m concerned here that continuing to describe the problem in terms of rebooting the server will confuse other people who get involved in troubleshooting.
“When it happened I didn’t use ‘quit server’ / reboot server / kill all servers…” Exactly – so it isn’t a server-specific problem! To cause the problem, you have to quit a higher-level process (and in fact, you’re closing the top-level process, the IDE).
“it’s using around 2 GB depending on the samples I have loaded…” samples = server stuff = almost certainly irrelevant to an IDE ↔ language problem.
Basically, we need to reproduce the problem with a file.
hjh
Hello,
Reading the description of @danknugz, I feel like the error message is the same but the cause looks different.
What is described does not correspond to the problem of this thread in my opinion.
Anyway, I’m systematically attaching gdb to sclang and I’m still waiting for the interpreter to crash.
We’ll see.
That’s almost certainly true, yes. The scenarios leading up to the crash are not at all the same. Well, let’s see, no problem to split the thread when more information comes in.
hjh
Hello community,
I wish you a happy new year.
I finally got the “Interpreter has crashed or stopped forcefully. [Exit code: 11]” error again.
This is the gdb output:
Thread 1 "sclang" received signal SIGSEGV, Segmentation fault.
PyrGC::DLInsertAfter (obj=0x557c0c48d900, after=0x557bfa1891c0, this=0x557bfa187440) at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/GC.h:219
219 after->next->prev = obj;
(gdb) where
#0 PyrGC::DLInsertAfter (obj=0x557c0c48d900, after=0x557bfa1891c0,
this=0x557bfa187440)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/GC.h:219
#1 PyrGC::ToBlack (obj=0x557c0c48d900, this=0x557bfa187440)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/GC.h:243
#2 PyrGC::ScanOneObj (this=this@entry=0x557bfa187440)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/GC.cpp:545
#3 0x0000557bde101ce8 in PyrGC::Collect (this=0x557bfa187440)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/GC.cpp:676
#4 0x0000557bde102085 in PyrGC::Allocate (
inRunCollection=<optimized out>, sizeclass=3, inNumBytes=112,
this=0x557bfa187440)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/GC.h:307
#5 PyrGC::NewFrame (this=0x557bfa187440, inNumBytes=112,
inFlags=inFlags@entry=0, inFormat=inFormat@entry=1,
inAccount=<optimized out>)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/GC.cpp:393
#6 0x0000557bde1015ee in blockValue (g=0x557bde45f3a0 <gVMGlobals>,
numArgsPushed=2)
at /home/fabien/SuperCollider_source/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:939
#7 0x0000557bde12daac in doPrimitive (g=0x557bde45f3a0 <gVMGlobals>,
meth=0x557bfca46500, numArgsPushed=<optimized out>)
at /home/fabien/SuperCollider_source/supercollider/lang/LangPrimSource/PyrPrimitive.cpp:3888
#8 0x0000557bde102a78 in Interpret (g=0x557bfa187440,
g@entry=0x557bde45f3a0 <gVMGlobals>)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/PyrInterpreter3.cpp:3035
#9 0x0000557bde1bb3e8 in runInterpreter (g=0x557bde45f3a0 <gVMGlobals>,
selector=<optimized out>, numArgsPushed=<optimized out>)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/PyrInterpreter3.cpp:127
#10 0x0000557bde1bfd4b in runLibrary (selector=<optimized out>)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/PyrLexer.cpp:2274
#11 0x0000557bde1f8a7f in SC_LanguageClient::tickLocked (
this=this@entry=0x557bfa019f00, nextTime=nextTime@entry=0x7ffe47fd61b0)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/SC_LanguageClient.cpp:277
#12 0x0000557bde11db86 in QtCollider::LangClient::tick (
this=0x557bfa019ef0)
at /home/fabien/SuperCollider_source/supercollider/QtCollider/LanguageClient.cpp:92
#13 0x00007c04680e733f in QObject::event(QEvent*) ()
from /lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x00007c0468f6c713 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () from /lib/x86_64-linux-gnu/libQt5Widgets.so.5
#15 0x00007c04680b9e3a in QCoreApplication::notifyInternal2(QObject*, QEvent*) () from /lib/x86_64-linux-gnu/libQt5Core.so.5
#16 0x00007c04681123eb in QTimerInfoList::activateTimers() ()
from /lib/x86_64-linux-gnu/libQt5Core.so.5
#17 0x00007c0468112d34 in ?? () from /lib/x86_64-linux-gnu/libQt5Core.so.5
#18 0x00007c046711bcbb in g_main_context_dispatch ()
from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#19 0x00007c0467171258 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#20 0x00007c0467119363 in g_main_context_iteration ()
from /lib/x86_64-linux-gnu/libglib-2.0.so.0
#21 0x00007c04681130b8 in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib/x86_64-linux-gnu/libQt5Core.so.5
#22 0x00007c04680b875b in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () from /lib/x86_64-linux-gnu/libQt5Core.so.5
#23 0x00007c04680c0cf4 in QCoreApplication::exec() ()
from /lib/x86_64-linux-gnu/libQt5Core.so.5
#24 0x0000557bde11d9ed in non-virtual thunk to QtCollider::LangClient::commandLoop() ()
at /home/fabien/SuperCollider_source/supercollider/QtCollider/LanguageClient.cpp:37
#25 0x0000557bde10ef40 in SC_TerminalClient::run (this=0x557bfa019f00,
argc=<optimized out>, argv=<optimized out>)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/SC_TerminalClient.cpp:275
#26 0x0000557bde0fcd12 in main (argc=3, argv=0x7ffe47fd6798)
at /home/fabien/SuperCollider_source/supercollider/lang/LangSource/cmdLineFuncs.cpp:27
(gdb) continue
Continuing.
Couldn't get registers: No such process.
(gdb) [Thread 0x7c03d2ffd640 (LWP 2517) exited]
[Thread 0x7c03d37fe640 (LWP 2515) exited]
[Thread 0x7c03d3fff640 (LWP 2496) exited]
[Thread 0x7c03ce7f4640 (LWP 2495) exited]
[Thread 0x7c03cdff3640 (LWP 2378) exited]
[Thread 0x7c03cd7f2640 (LWP 2377) exited]
[Thread 0x7c03ccff1640 (LWP 2376) exited]
[Thread 0x7c03f0ff9640 (LWP 2360) exited]
[Thread 0x7c03f17fa640 (LWP 2359) exited]
[Thread 0x7c03f1ffb640 (LWP 2358) exited]
[Thread 0x7c03f27fc640 (LWP 2353) exited]
[Thread 0x7c03f2ffd640 (LWP 2352) exited]
[Thread 0x7c03f37fe640 (LWP 2351) exited]
[Thread 0x7c03f3fff640 (LWP 2350) exited]
[Thread 0x7c0418ff9640 (LWP 2349) exited]
[Thread 0x7c0419ffb640 (LWP 2348) exited]
[Thread 0x7c041b7fe640 (LWP 2344) exited]
[Thread 0x7c042cfff640 (LWP 2342) exited]
[Thread 0x7c042dbfe640 (LWP 2341) exited]
[Thread 0x7c042e3ff640 (LWP 2335) exited]
[Thread 0x7c042f7fe640 (LWP 2332) exited]
[Thread 0x7c043cfff640 (LWP 2330) exited]
[Thread 0x7c0448dfd640 (LWP 2329) exited]
[Thread 0x7c04495fe640 (LWP 2328) exited]
[Thread 0x7c0449dff640 (LWP 2327) exited]
[Thread 0x7c0450d66640 (LWP 2324) exited]
[Thread 0x7c04527ff640 (LWP 2319) exited]
[Thread 0x7c0453972640 (LWP 2316) exited]
[Thread 0x7c0454173640 (LWP 2315) exited]
[Thread 0x7c045439f400 (LWP 2291) exited]
[Thread 0x7c042ffff640 (LWP 2331) exited]
[New process 2291]
Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
Thank you for your help
Is this still using a commit from 2023 (602ab7463)? If so, you’re on your own, unfortunately. There have been a few changes and fixes to the GC between then and when 3.14 was released. If you can reproduce this on either the current release version, 3.14.1, or on 3.15 dev, preferably with some code that reproduces the bug, I’ll try to find some time to dig deeper.
This is the same commit (from 2023).
In this case I lost several months of work.
Well I can see 4 options.
- Upgrade to 3.14.1 - if bug still exists, we can try to fix it.
- Downgrade to 3.13 - only good if the bug was introduced in 3.14.
- Go through all the commits from the past 3 years that might affect the GC, cherry picking them into your own sc distribution, seeing if they fix the problem.
- Go the other way, start with 3.14.1 and slowly remove commits.
The last two options are a lot of work, as you are effectively creating your own SC distribution. I’d strongly recommend upgrading and trying to fix any issues that arise.
Dev builds should never be used for a performance piece, and they should always be kept up to date. They should only be used when you are looking to help develop sc by submitting bug reports / being a guinea pig.
If you had some code that reproduced this issue, I might know exactly which commit you need to include, but as this isn’t easy to reproduce and seems to occur only in longer-running pieces, I’d just be guessing (#6261 maybe?). There are also a few PRs in 3.15 that could solve this if it wasn’t fixed in 3.14.
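For the cherry-picking route, a first step could be to list only the commits that touched the GC sources named in the backtrace (GC.h and GC.cpp under lang/LangSource) between the old commit and a newer ref. This is a minimal sketch: `commits_touching` is a made-up helper, and the newer ref in the usage line (`origin/develop`) is an assumption; substitute whichever tag or branch you are comparing against.

```shell
# List commits reachable from NEW but not from OLD that modified any of
# the given paths, oldest first, one line per commit.
# Arguments: OLD NEW PATH...
commits_touching() {
    old=$1
    new=$2
    shift 2
    git log --oneline --reverse "$old..$new" -- "$@"
}

# Usage, from inside the supercollider repository:
#   commits_touching 602ab7463 origin/develop \
#       lang/LangSource/GC.h lang/LangSource/GC.cpp
```

The resulting list is only a set of candidates; a commit elsewhere in the language sources could equally be the relevant one.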