Hi all,
As part of the extension/quark cycle, I am working on implementing a new cyclic algorithm: the discrete logarithm. A priori it works as it should, except that it returns weird results when big numbers are involved. I know this is due to the limits of the SuperCollider implementation, but I would like either to add a condition that prints a warning when numbers reach the limit of the SC implementation, or to implement this algorithm in C++, in order to remove any whimsical results.
Here is the draft code:
(
~discreteLog = { |g, p, i=1, res|
    // g ** i mod p for the current exponent i
    var sm = ~squareAndMultiply.(g, i) % p;
    // or
    // var sm = (g ** i).asInteger % p;
    if (res.isNil.not && res.asArray.includes(sm))
    // or
    // if (res.isNil.not && (res.asArray[0] == sm))
    {
        res
    }
    {
        if ((i > 1000) || (sm <= 0))
        { "you reached the limits of ...".error; res }
        { res = res.add(sm); ~discreteLog.(g, p, i+1, res) }
    }
};

// square-and-multiply exponentiation: computes n ** exp from the binary digits of exp
~squareAndMultiply = { |n, exp|
    var res = 1;
    exp.asDigits(2).do { |it| res = res * res; if (it == 1) { res = res * n } };
    res
}
)
~discreteLog.(27, 23);
// the result should be: [ 4, 16, 18, 3, 12, 2, 8, 9, 13, 6, 1 ]
Have you tried using Float instead of Integer? 27.0 instead of 27 in this case. Then, you will have a 64-bit floating-point number, and the range will be much, much larger.
(Or maybe change the code to convert it to Float. I haven’t checked the quark yet)
Nowadays there are good algorithms for arbitrarily big numbers; it may become possible to implement them in sclang in the future.
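For a rough sense of the current ranges (just a quick sketch, nothing from the quark itself): Integer tops out at 2^31 - 1, while a 64-bit Float keeps integers exact all the way up to 2^53.
2147483647;                       // largest Integer in sclang (2^31 - 1)
((2 ** 53) + 1) == (2 ** 53);     // -> true: above 2^53, consecutive integers collapse
((2 ** 52) + 1) == (2 ** 52);     // -> false: below 2^53, every integer is still exact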
I confirm it works with Float.
It's just that, for this kind of computation, SuperCollider is rather limited in comparison with C++ or Common Lisp.
Yes, let's see in the future whether algorithms for arbitrarily big numbers become relevant within SC.
…wait, that means Float can represent exact integers up to 2^53, i.e. its integer range is larger than Integer's 2^31 by a factor of 2^22 (~4 million).
okay I just checked the source code and I'm so confused...
Here is some code displaying this oddity.
(1 << 62).postln.asBinaryString(64).postln
// 0000000000000000000000000000000000000000000000000000000000000000
(1 << 63).postln.asBinaryString(64).postln
// 0000000000000000000000000000000000000000000000000000000000000000
// so at this point you'd probably expect it just to be zeroing
// out everything after 32.... nope
(1 << 64).postln.asBinaryString(64).postln
// 0000000000000000000000000000000000000000000000000000000000000001
I wonder if there is a reason for this, or if it was forgotten?
Also, if there were a compiler flag to set int to 64 bits (which would be completely valid per the C++ spec), would all sclang ints suddenly be 64-bit? Could you do this with macros?
The thing about 32-bit Floats vs. Doubles is that, as I understand it, there is not much point in using 32-bit floating point.
Floats (probably 32-bits) are almost always a bad idea, unless you Really Know What You Are Doing. Use Doubles. There’s rarely a speed disadvantage—modern machines will use the same floating-point unit for both. With Doubles, you are much less likely to hang yourself with numerical errors. One time when Float might be a good idea is if you have a lot of them, say a giant array of Floats.
That test is on a 64-bit architecture; I imagine the performance on a 32-bit system (I'm thinking of the Raspberry Pi in particular) might be very different. I haven't been able to find one, though…
I think the primary reason why sclang uses 64-bit floating-point numbers but only 32-bit integers is that the original 32-bit interpreter uses a technique called NaN-tagging (AKA NaN-boxing). The idea is to use certain NaN bit patterns in the upper 16 bits of a 64-bit float to designate the object type, so that the lower 48 bits can be used to store other types. On 32-bit systems, pointers are 32 bits wide, so that is no problem, but you are also confined to 32-bit integers *). See lang/LangSource/PyrSlot32.h.
On 64-bit systems, sclang actually uses simple tagged unions, see lang/LangSource/PyrSlot64.h. **) Theoretically, sclang could have “upgraded” to 64-bit integers, but that would also have required updating the whole code base, which probably would have been too much work and too error-prone, so they just kept the 32-bit integer type.
*) Actually, with NaN-tagging you could use 48-bit integers (= 64-bit integer with the upper 16 bits masked off).
**) Actually, NaN-tagging does work for 64-bit systems, assuming that only the lower 48 bits of pointers are valid and the OS does not use the upper 16 bits (which is typically the case).
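You can see that confinement from the language side too (a quick check; I'm assuming the underlying C int wraps in the usual two's-complement way):
2147483647 + 1;     // -> -2147483648: the 32-bit integer slot silently wraps around
2147483647.0 + 1;   // -> 2147483648.0: the same sum as a 64-bit Float has plenty of headroom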
The different subclasses of RawArray do not imply other types in the language, right? One could not use a 32-bit float, for example, since it would create problems with all algebraic operations without design changes. Maybe that's the same reason an additional Integer type would be a problem too?
Int8Array - 8-bit integer
Int16Array - 16-bit integer
Int32Array - 32-bit integer
FloatArray - 32-bit floating point
DoubleArray - 64-bit floating point
SymbolArray - symbols
The different subclasses of RawArray do not imply other types in the language, right?
That’s a good point! They do not necessarily refer to actual types in the language. There is still only a single floating-point type and a single integer type (not counting Char).
Note that it is always safe to cast between float and double; in the worst case you just lose some bits of precision. In the case of FloatArray, the values are internally kept as an array of (32-bit) floats, but whenever you access individual members, they are necessarily converted back to (64-bit) double.
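For example (a quick round-trip check; the exact printed digits may vary):
f = FloatArray[0.1];
f[0].class;     // -> Float, i.e. a 64-bit double, even though the storage is 32-bit
f[0] == 0.1;    // -> false: 0.1 was rounded to 32-bit precision when it was stored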
With integers it’s a bit more complicated: signed integer overflow is undefined behavior in C/C++, so you cannot simply save any 32-bit number in, say, an 8-bit signed integer. I just quickly checked and sclang does not do any range checks, so the following code would actually trigger undefined behavior:
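(An Int8Array is just one example here; the exact value you read back depends on how the cast truncates.)
a = Int8Array.newClear(1);
a.put(0, 1000);   // 1000 does not fit in a signed 8-bit integer; no range check is done
a[0];             // reads back a truncated value (-24 if only the low 8 bits are kept)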
What I find confusing is that the slot stores an int64, but all functions use just the normal int (which could be as low as 16-bit according to the standard…). Also that the char is big?
Do you think a change to fixed-width integers might be worthwhile?
I’ve made a draft PR of this; it’s only a few hundred lines, mostly done with CLion’s refactoring of function signatures, and it took an hour or two. I’ve run the unit tests and haven’t found any bugs (on Linux)… which probably means there aren’t enough tests.
I haven’t got the Windows 32-bit CI to work yet, but that might be more interesting.
Ah, you’re right, I didn’t catch this! But yeah, they didn’t bother to update the whole code base to 64-bit integers – which is understandable in a way because it would be a massive undertaking.
(which could be as low as 16-bit according to the standard…)
Only on some microcontrollers
Also that the char is big?
Yes. What is even more confusing is that the setter (SetChar) uses a char parameter, but the getter (slotRawChar) returns an int…
The case of Char is a bit curious in general. Is there any other (popular) scripting language that has a dedicated integer type for characters? I guess the idea was that Char should have its own set of methods and not behave as a number; in particular, it should not participate in arithmetic operations. That’s why it only inherits from Magnitude, but not SimpleNumber. However, it would be very inefficient if it were a regular (heap-allocated) object, so it had to be a built-in type.
A good behavior could be to wrap around, stay within the type range, and print a warning.
At least that’s what happens for Char, see the _AsAscii primitive:
SetChar(a, slotRawInt(a) & 255);
.asAscii and .asDigit are the only methods for explicitly creating a Char.
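So, if I read that primitive correctly, out-of-range values are not rejected but simply masked:
44.asAscii;     // -> $,
300.asAscii;    // -> $, as well, because only the low 8 bits are kept (300 & 255 == 44)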
However, this is the current implementation of .put for Int8Array and Int16Array, see putIndexedSlot:
case obj_int16:
    if (NotInt(c))
        return errWrongType;
    ((int16*)(obj->slots))[index] = slotRawInt(c);
    break;
case obj_int8:
    if (NotInt(c))
        return errWrongType;
    ((int8*)(obj->slots))[index] = slotRawInt(c);
    break;
As you can see, there are no range checks. (Not to mention the pointer casts that violate strict aliasing rules.)