RingBuffer even slower than using a List

I checked out RingBuffer today. As I understand it, the point of using a circular buffer is speed, since there’s never any reshuffling of items. So I was surprised to find that the RingBuffer class is slower than doing the same operation the supposedly inefficient way:

//***********List

a = List.new;
100.do{|i| a.add((hi: i))}
//writing:
(
var n = Main.elapsedTime;

a.add((hi: 99));
a.removeAt(0);

(Main.elapsedTime - n).postln;
)
//—> ~ 1 to 2 e-5 s

//reading:
(
var n = Main.elapsedTime;

a.do{|item|
   item.hi = item.hi + 1;
};

(Main.elapsedTime - n).postln;
)
//—> ~ 1.5 e-4 s


//***********RingBuffer

a = RingBuffer.new(101); //need to initialize with one more than we want! so strange
100.do{|i| a.add((hi: i))}
(
var n = Main.elapsedTime;

a.overwrite((hi: 99));

(Main.elapsedTime - n).postln;
)
//—> ~ 2 to 4 e-5 s
//—> writing to RingBuffer is slower than doing it with a List!!!

//reading:
(
var n = Main.elapsedTime;

a.do{|item|
   item.hi = item.hi + 1;
};

(Main.elapsedTime - n).postln;
)
//—> ~ 1.5 e-4 s
//—> reading RingBuffer is about the same as reading from the List




//testing that time measurement is functioning the way I think it is:
(
var n = Main.elapsedTime;
(Main.elapsedTime - n).postln;
)
//—> ~ 1 e-7
//so yes, works fine

Also, isn’t it strange that RingBuffer is able to hold one less than its initialization size number of items? I can’t understand why this would’ve been done this way.

I pointed out some issues in the help for RingBuffer here, too.

One thing is, a benchmark should usually span a large number of iterations. If the duration you’re measuring is extremely short, then the margin of error is likely on the same order as the duration being measured, meaning that you can’t have high confidence in the measurement. But if you measure once for a loop performing the operation 100,000 times, then the margin of error is a fraction of a percent and the confidence can be higher.

Also, avoid unnecessary operations in a benchmark. Events aren’t needed here. If it’s just simple numbers being written into the collection, then the measurement will be more focused on the operations that you’re testing.
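Putting those two suggestions together, a benchmark might look something like this (a sketch only; the iteration count and buffer size are arbitrary, and plain Integers replace the Events):

```supercollider
// Time one loop of many iterations, writing plain Integers rather than Events.
(
var a = List.new;
100.do { |i| a.add(i) };
{
	100000.do {
		a.add(999);
		a.removeAt(0);
	}
}.bench;
)
```

`Function:bench` posts the elapsed time for the whole loop; dividing by the iteration count gives a per-operation estimate with far less measurement noise than timing a single call.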

I had a few minutes for a quick test – it’s true that RingBuffer is slower for a 100-item size bound.

The time required for an Array removeAt(0) should be linear in the array size – very fast for small arrays, slower for large arrays. If you benchmark the array/list version with 10000 items rather than 100, then array/list is much slower. Meanwhile, RingBuffer’s pop and add operations don’t iterate, so we would expect constant time. I found that RingBuffer performs about the same whether the maximum size is 100 or 10000.

At 100 items, array/list are faster; at 10000 items, RingBuffer is faster.

Array removeAt is done in a primitive, which gives it a big speed boost but it can’t escape the O(n) behavior of larger n requiring more time.
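The size dependence described above can be seen with a quick sketch like the following (illustrative only; the absolute times will depend on your machine):

```supercollider
// Compare the cost of removeAt(0) at two different array sizes.
// Each loop keeps the array at a constant size: add one, remove one.
(
[100, 10000].do { |size|
	var a = Array.series(size);
	("size: " ++ size).postln;
	{
		10000.do {
			a = a.add(0);    // Array.add may return a new object, so reassign
			a.removeAt(0);   // O(n): shifts all remaining elements down
		}
	}.bench;
}
)
```

The expectation is that the 10000-element run takes far longer per operation than the 100-element run, while an equivalent loop over a RingBuffer would take roughly the same time at either size.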

Bug fixes are welcome. (I.e., I don’t think there is a “why it’s done that way.”)

hjh

I also think there have been some misconceptions about it. It’s a textbook case of asymptotic complexity telling you little about performance for small inputs.

And there is yet another factor to make the situation even worse.

SuperCollider’s RingBuffer is implemented entirely in sclang code, not as a primitive like the other collection classes. (Even so, its operations still have constant, predictable complexity and running time.)

At the same time, primitives optimize constant factors, not algorithmic complexity: RingBuffer’s operations would remain O(1), and removal from the front of an Array or List would remain O(n).

This image, even if not exact, illustrates a bit of that: it shows why ring buffers work so well in low-level, real-time code. It is not that they are “faster”; they are predictable.

Thanks much, @smoge and @jamshark70 — this all makes good sense. The key is in the lack of primitive implementation for RingBuffer — because outside of that, there’s just no reason why a circular buffer should be slower than a List, even for very small arrays. Might be worth implementing a primitive! I’ll ponder that, now that I’ve gotten my feet wet with HungarianSolve.

Regarding fixing what does seem to be a bug (the size being one less than the initialization size): first of all, it’s mentioned (although cryptically) in the help, so that seems to indicate it was on purpose; secondly, wouldn’t fixing it create potential issues with backward-compatibility? If you think it’s a good idea to fix it, I’m happy to.

I really don’t think there is a “bug”. And I am not sure that implementing it as a primitive would help much in most situations.

I really don’t think there is a “bug”

The ‘bug’ I’m referring to is not the lack of optimization, but the fact that RingBuffer has a capacity which is 1 less than the initialization size. I.e., if you do

a = RingBuffer(10)

the RingBuffer can only hold 9 values. Do you see a reason for that?

I believe the reason RingBuffer holds one less item than its initialization size stems from a classic implementation challenge with circular buffers: distinguishing between full and empty states.

The way circular buffers typically work is by tracking two pointers: a write pointer (where the next item will be added) and a read pointer (where the next item will be read). Here’s where things get tricky, though: when the buffer is empty, these two pointers point to the same position. But if you fill the buffer completely and the write pointer wraps around, it would also end up at the same position as the read pointer. So now you can’t tell whether your buffer is empty or full!

There are a few ways to solve this. The most common approach, which sclang seems to use, is to simply waste one slot: you always keep one position empty and consider the buffer “full” when the write pointer is one position behind the read pointer. Sure, you lose one slot, but the implementation stays clean and simple.

The alternatives would be to track a separate counter for how many items are in the buffer (which adds overhead) or to use a boolean flag to track whether it’s full (which adds complexity).

While the discussion doesn’t reveal why SuperCollider’s developers specifically chose this approach, the “waste one slot” method is perfectly reasonable. It’s reliable, avoids extra state variables, and honestly, losing one slot is usually no big deal. It makes the code cleaner and less prone to bugs, even if it does confuse users who expect RingBuffer(10) to actually hold 10 items!
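The “waste one slot” convention can be sketched from scratch like this (a minimal illustration of the index arithmetic, not RingBuffer’s actual source):

```supercollider
(
// Two indices into a backing array, moving forward modulo the array size.
// One slot is always kept empty, so a 10-slot array holds at most 9 items.
var array = Array.newClear(10);
var readPos = 0, writePos = 0;

var add = { |item|
	// full when advancing the write index would land on the read index
	if((writePos + 1) % array.size != readPos) {
		array[writePos] = item;
		writePos = (writePos + 1) % array.size;
	}
};

var pop = {
	var item;
	// empty when both indices coincide
	if(readPos != writePos) {
		item = array[readPos];
		readPos = (readPos + 1) % array.size;
	};
	item
};

9.do { |i| add.value(i) }; // fills the buffer to capacity
add.value(99);             // buffer is full: this write is silently dropped
pop.value.postln;          // oldest item comes out first
)
```

Note how “empty” (readPos == writePos) and “full” (writePos one step behind readPos) are distinguishable only because that one slot is sacrificed.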

Not directly related, but it gives perspective on these kinds of things:

‘Proofs and Refutations’

smoge’s illustration is very good! O(1) only says that an operation is independent of the number of elements; it doesn’t say anything about the cost of the operation itself. The important property is that the cost remains constant.

The key is in the lack of primitive implementation for RingBuffer — because outside of that, there’s just no reason why a circular buffer should be slower than a List, even for very small arrays.

First, I don’t think that we need a primitive implementation. I can’t imagine a case where the performance of RingBuffer would matter in a real-life application.

Second, I should point out that sclang’s List class is just a thin wrapper over Array. Array has the annoying property that .add may return a new object, so you always have to reassign the variable:

a = [ 1, 2, 3, 4 ];
a = a.add(5); // we must reassign to 'a'!

List simply holds an internal Array instance and manages the reassignment for you:

a = List.with(1, 2, 3, 4);
a.add(5); // no need to reassign 'a'!

RingBuffer is similar to List in that it also manages an internal Array instance, but the add and pop methods need additional code to check the read and write indices. So it’s not surprising that the methods are slower (for small element counts).

the RingBuffer can only hold 9 values. Do you see a reason for that?

smoge’s explanation above is correct. However, I agree that it’s confusing that RingBuffer(10) can only hold 9 elements. Sometimes we actually need the exact number of elements. (For example, using a ring buffer to track the last N pitches played by an instrument.)

Here’s a possible fix:

	*new { | size, collectionClass(Array) |
		^this.newCopyArgs(collectionClass.newClear(size + 1), 0, 0)
	}

	// return maximum capacity.
	maxSize {
		^array.size - 1;
	}

I think this should do the trick, but I haven’t tested it.

EDIT: scratch that. The help file already clearly documents the current behavior:

Initial size. The collection will be able to hold one minus this number of values.

We can’t really change that without breaking existing projects, so I guess we just have to live with it :slight_smile:


Yo–

Yes, after one more minute of thinking, you are right. I stand corrected, and I withdraw what I said about an “optimization” rationale behind the issue.

(Maybe on the C++ dsp level, it would make sense, but even then, it would be less critical than the predictability of the code.)

Besides, for it to count as an “optimization”, the number of elements would have to be so large that I can’t imagine a practical case for it.

So, to be clear: there is no bug; it’s a simple and good implementation. One of the “standard” solutions, in a sense.

Nothing much to see here, homies

I’m out
Peace

Not directly related, but it gives perspective on these kinds of things:
Love this reference!

The question in my mind, as someone who doesn’t know about the implementation of ring buffers, is whether it is useful to expect the user to understand these inner workings. The purpose of object-oriented programming is generally to shield callers of an object from how it works internally, and to this effect, it would seem to make a lot more sense to me to just add a slot so that it holds as many objects as we asked it to. After all, that’s the expected behavior with Array and List and so on.

We can’t really change that without breaking existing projects, so I guess we just have to live with it :slight_smile:

Yeah, that was my fear — even though I think it would make a lot of sense to adjust it as you suggested (by simply adding a slot internally), it would break backward-compatibility. Too bad!

Separately, am I crazy, or does that sentence in the help make absolutely no sense? I filed a bug about it. Both of these make sense:

The collection will be able to hold this minus one number of values.
The collection will be able to hold one fewer than this number of values.

Separately, am I crazy, or does that sentence in the help make absolutely no sense?

You’re absolutely right! I didn’t even notice :smiley: What about:

The actual capacity will be the given size minus one element, see the maxSize method.

(I’m not a native speaker, though.)


that wording seems great to me, @Spacechild1