Why no line numbers in SC runtime errors?

dantepfer · March 3, 2025, 10:52pm

First of all, huge thanks to all who contribute to the SC codebase. I’ve been using SC extensively for years (see for example my Tiny Desk Concert) and love it.

A perhaps dumb question: to me, the single most annoying thing about SC is the lack of line numbers in its runtime errors. I find it slows down bug-hunting enormously. There must be a reason that SC, unlike most other programming languages, doesn’t give the line number where the error took place, and I’d love to know why if anyone knowledgeable has a moment.

Thanks!

jamshark70 · March 4, 2025, 2:13am

To implement that properly, we would need:

1. Editor integration (with every supported editor) so that the SC compiler knows the line number offset of the selected code block. That is, when you compile a Java file, you’re compiling the whole file, so it’s necessary only to count lines. In SC, the selected code block may begin at line 0, or line 10, or line 537. So the sclang compiler can’t identify the absolute line number in the document based only on the supplied text. Editors would need to send a line number offset, and the sclang backend would have to pass that into the compiler. Entirely technically possible, though editors that didn’t implement that change wouldn’t benefit.
2. The data structure for functions would need to include a mapping between bytecode indices and line numbers. The SC interpreter doesn’t know the program text, only the bytecodes stored in the FunctionDef or Method object. When an error occurs, the bytecode index must be known somewhere – so to report a line number with the error, there needs to be a lookup to identify which line the bytecode index belongs to.

I suppose for a proper C++ programmer (which isn’t me…), neither of these is especially difficult – but they haven’t been done yet, so, we don’t have line numbers.

hjh

smoge · March 4, 2025, 2:49am

So, while it’s technically possible and not necessarily conceptually complex, calling it “not especially difficult” oversimplifies it. As I understand it, changing the bytecode format to include line number information would require profound modifications to the interpreter (and probably to its data structures as well).

Also, performance comes up regularly in discussions here, and the additional metadata (lookups, memory usage) would have an impact on that too.

dantepfer · March 4, 2025, 3:11am

Thanks much, @jamshark70 and @smoge. Curious if others find this to be something important. As far as I can tell, it doesn’t seem like a common complaint. Is it something that you both would ideally like to have added? Or do you not get frustrated with SC’s error reporting?

smoge · March 4, 2025, 3:22am

Yes, it would be a significant enhancement—no doubt about it.

But I don’t see anyone rewriting core language/interpreter code or extending data structures/function objects to store source information.

I don’t mean there aren’t other ways to improve our error reporting system.
I don’t see this happening that way. That’s all.

jamshark70 · March 4, 2025, 4:06am

Yes, I agree, I should walk back from that a bit.

But… one could add a member variable to FunctionDef to store e.g.:

(
{ |x, y|
	var out = atan2(y, x);
	out * 3  // nonsense, but...
}.def.dumpByteCodes
)

BYTECODES: (12)
  0   31		 PushTempZeroVar 'y'
  1   30		 PushTempZeroVar 'x'
  2   0E 16    SendSpecialBinaryArithMsgX 'atan2'
  4   80 02    StoreTempVar 'out'
  6   32		 PushTempZeroVar 'out'
  7   2C 03    PushInt 3
  9   B0       TailCallReturnFromFunction
 10   E2       SendSpecialBinaryArithMsg '*'
 11   F2       BlockReturn

// then the bytecode-line map could be:
[ 0, 3, 6, 4 ]

Meaning “bytecode 0 starts line 3, then bytecode 6 starts line 4”. To look up a line, do a linear search for the first i where array[i * 2] > index, then report array[i * 2 - 1] as the line number. So if it died on atan2, that’s bytecode index 2: i == 0 fails, but i == 1 passes, so it would report line 3.
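The lookup described above could be sketched in sclang as follows. This is only an illustration under assumptions: `~lineForByteCode` is a hypothetical helper, FunctionDef has no such map today, and a real implementation would live in the C++ interpreter. The loop is written as "keep the last pair whose start is at or before the index", which gives the same answer as the search above and also covers a failure in the last line, where no later pair exists.

```supercollider
(
// Hypothetical helper, not part of the class library.
// map is a flat array of pairs: [bytecodeStart, line, bytecodeStart, line, ...]
~lineForByteCode = { |map, index|
	var line = map[1];  // line of the first pair
	var i = 1;
	// advance while the next pair starts at or before the failing bytecode
	while { (i * 2 < map.size) and: { map[i * 2] <= index } } {
		line = map[i * 2 + 1];
		i = i + 1;
	};
	line
};

~map = [0, 3, 6, 4];  // the example map from this post
~lineForByteCode.(~map, 2).postln;   // atan2 at bytecode 2 -> 3
~lineForByteCode.(~map, 10).postln;  // '*' at bytecode 10 -> 4
)
```

A linear search seems reasonable here because the lookup would only run when an error is being reported, and the map has one pair per source line of the function.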

No change to bytecode format. Building the array would be… interesting… in the compiler though.

hjh

smoge · March 4, 2025, 4:21am

That sounds more realistic. Clever solution.

However, even after a few weeks of work by someone familiar with the compiler, much testing with all sorts of code would still be needed.

jordan · March 4, 2025, 7:24am

A little while ago I came up with a solution for when you have a ‘doesNotUnderstand’ error. This works because (almost) all the information you need is already in the debug frame object, except for where the error occurred, so the code has to search through all the code in the call stack in text format to find it. This is entirely an SC solution, with no C++. See below for an example.

Storing the character index for each bytecode would also work and make lookup easier. However, you’d still have to walk up the call stack to find the meaningful bit of code. One issue is DoesNotUnderstand as this doesn’t fail when you write 1.asdfds, but only when the DoesNotUnderstand error isn’t caught — sending any random message is valid behaviour in smalltalks. This means figuring out exactly where an ‘error’ has occurred is harder in SC than other languages.

jamshark70 wrote: “The SC interpreter doesn’t know the program text”

It is accessible in the DebugFrame object.

{
	\asdf;
	1.getBackTrace.functionDef.sourceCode
}.()

The one exception to this seems to be when you have a function object written in a method. Further, methods delete their source code, but store the file name and the charPos so it can be reloaded.

Here’s an example of what I got working. The code is a mess and needs refactoring, but it works; I’ve just been busy.

In file /home/jordan/Work/software/quarks/Piping/Classes/ScaleExt.sc: line 38
BorkedClass:throwVariable:
		var b = 'good variable';
		"someting".postln;
		a + 1;
		b.bad_message
        ^^^^^^^^^^^^^
		//a.ahahahahaha(42)
	}

Variable 'b' with value good variable (Symbol), did not understand the message 'bad_message'.

For now, it can only give you the file name and line number when the code is in a class. When the code is passed at runtime, you would still get the body of the code, but not a file name, and the line number would be relative to the function body. I don’t think this is a deal breaker though.
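To give a rough idea of the text-search step, here is a minimal sketch, not the actual code behind the report above. It assumes the source text and the failing selector are already in hand (in practice they would come from walking the call stack), and `~underlineSelector` is a hypothetical name. Unlike the report above, it underlines only the selector rather than the whole expression.

```supercollider
(
// Hypothetical helper: post a source string and underline the first
// textual occurrence of the selector that was not understood.
~underlineSelector = { |source, selector|
	var name = selector.asString;
	var done = false;
	source.split($\n).do { |text|
		var pos = text.find(name);  // index of the first match, or nil
		text.postln;
		if(done.not and: { pos.notNil }) {
			// keep tabs in the padding so the carets line up with the text
			var pad = String.fill(pos, { |i| if(text[i] == $\t) { $\t } { $  } });
			(pad ++ String.fill(name.size, $^)).postln;
			done = true;
		};
	};
};

~underlineSelector.("{\n\tvar b = 'good variable';\n\tb.bad_message\n}", \bad_message);
)
```

A plain text search like this can match the wrong place, for example when the same selector appears earlier in a comment or string, which is one reason storing a character index per bytecode would make the lookup more reliable.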

dantepfer · March 5, 2025, 4:57am

That looks amazing @jordan — even a partial solution would, in my opinion, be a huge improvement over what we have now.

scztt · March 10, 2025, 2:34pm

If it’s interesting, here’s a verrrrrrry old proof of concept of adding line number metadata to compiled code, which facilitates line numbers for errors as well as some other debugging functionality. This code is originally at least 10 years old, so I’d probably write this a bit differently now, but I think it’s more or less the right way to do this.

dantepfer · March 11, 2025, 12:56am

In my totally humble opinion, as someone who’s used SC quite a lot but never contributed to the source, this should be a top priority for SuperCollider. The bug-hunting experience in SC is vastly more frustrating than it is in pretty much any other modern programming language, and leads to wasted time. Just having a line number would make a gigantic difference. It seems from this discussion that there are already several viable efforts in this direction. How do other people feel about it?

dantepfer · March 11, 2025, 1:00am

I wanted to mention, @jamshark70, that I don’t think absolute line numbers are essential at all. Just having relative line numbers would make a night-and-day difference. It’s fairly trivial to find a line number from execution start yourself, whereas “message < not understood” (for example) tells you absolutely nothing about where the error occurred (and very often has little to do with the actual presence of < in your own code, for reasons that yes, I do understand).

jamshark70 · March 11, 2025, 2:25am

I agree that relative line numbers are better than no line numbers at all. But I think that keeping “execution start” in your mind may be less trivial than you think.

The case I was thinking of is something like this: you start off with some version of a couple of functions, loading them both in one code block.

~heresAFunction = {
	...
	...
	...
	...
};

~anotherFunction = {
	...
	...
	...
};

Case 1: You edit anotherFunction and, for whatever reason, instead of re-running the entire block, you select just anotherFunction and its definition, and execute it as its own block. Now, when you get an error report with relative line numbers, you have to remember that the functions were loaded differently, and that line 1 may belong to either function. (We’re not always organized when experimenting with code… you could say “put them in a (…) block and always run them together” but sometimes people also load buffers or other persistent resources in their init block, in which case they would select just the bit they want to redefine and avoid leaking buffers. In practice it’s very easy to be inconsistent about block start, and I wouldn’t go so far as to say this is the user’s fault – it’s part of the freewheeling SC ethos.)

Case 2: You added lines to heresAFunction and reran only that definition. Now anotherFunction’s line numbers are wrong.

… but the stack trace might – admittedly it isn’t always possible to find from the stack trace (I recently had trouble tracking down an error due to stack corruption), but it usually contains valuable clues.

In any case, I agree with:

no line numbers < relative line numbers < absolute line numbers

hjh

dantepfer · March 11, 2025, 6:43am

Great points. A lot of my SC coding is just executing the entire document, so the scenario above wasn’t immediately on my mind. In any case, I’m glad to see that there does seem to be genuine interest in the topic, and that several people have already made proofs of concept for it. Hopefully it goes somewhere.

jordan · March 11, 2025, 7:13am

One solution is just to print the source and highlight the error line. All code from runtime is available to print. Functions in the class library don’t have the source, but do have stable line numbers.
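A minimal sketch of that idea, assuming the relative line of the error is already known (which is exactly the part that is missing today). `~postWithMarker` is a hypothetical helper, and the sketch guards against functions that have no stored source text.

```supercollider
(
// Hypothetical helper: post source text with a marker on one line.
// errorLine is 1-based and relative to the start of the source text.
~postWithMarker = { |source, errorLine|
	if(source.isNil) {
		"no source text stored for this function".postln;
	} {
		source.split($\n).do { |text, i|
			var prefix = if(i + 1 == errorLine) { "--> " } { "    " };
			(prefix ++ text).postln;
		};
	};
};

// A function compiled at runtime; it is never called here,
// so bad_message does not actually throw.
~func = {
	var b = 'good variable';
	b.bad_message
};

// Line 3 of the stored text is assumed to be the failing line.
~postWithMarker.(~func.def.sourceCode, 3);
)
```

For class library methods, the same helper could be fed text reloaded from the file name and character position that the method stores.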

jordan · March 11, 2025, 7:21am

Looks like a great addition! Just curious, can you see which bytecode is being executed in each frame on the call stack?

dantepfer · March 12, 2025, 1:36am

I’d be very happy with that!