a List is what you want for anything that’s growing
There are a couple of concerns here: 1, ease of adding elements (List is the winner here), and 2, performance (List will never beat Array).
A List is just a wrapper for an array. It’s easier to add things because it does a = a.add(thing)
internally, for you, saving you from the trouble of having to remember the reassignment for yourself. (The overhead of reallocating arrays is still there, just hidden from the visible code.) But add
dispatches to Array’s add
method – two method calls for each add
– so it will always be slower.
Five benchmarks:
- Wrong use of Array:add. This is the worst in terms of performance, because every expanded array created by
add
is discarded = huge garbage-collector load. And the final array is the wrong size. - Correct use of Array:add, without pre-allocating the number of elements.
a = a.add(...)
is necessary. - Correct use of Array:add, with pre-allocation. This is a bit faster because it doesn’t have to garbage-collect intermediate arrays.
- List, without pre-allocating.
- List, with pre-allocation. For 4 and 5, I believe the overhead of the double method calls outweighs other considerations – comparison of these two is probably random.
(
var n = 10000000;
bench {
var a = Array.new;
n.do { a.add(1.0.rand) };
a.size.debug("unallocated, no reassignment");
};
bench {
var a = Array.new;
n.do { a = a.add(1.0.rand) };
a.size.debug("unallocated, with reassignment");
};
bench {
var a = Array.new(n);
n.do { a.add(1.0.rand) };
a.size.debug("pre-allocated, no reassignment");
};
bench {
var l = List.new;
n.do { l.add(1.0.rand) };
l.size.debug("List, unallocated");
};
bench {
var l = List.new(n);
n.do { l.add(1.0.rand) };
l.size.debug("List, pre-allocated");
};
)
unallocated, no reassignment: 1 // BZZZT wrong
time to run: 2.589702371 seconds.
unallocated, with reassignment: 10000000
time to run: 1.229955393 seconds.
pre-allocated, no reassignment: 10000000
time to run: 0.95807274900005 seconds.
List, unallocated: 10000000
time to run: 2.022845863 seconds.
List, pre-allocated: 10000000
time to run: 2.051731603 seconds.
In the 8.5 second example, it’s actually flatten
that kills it. Merely restructuring the code (as fredrik did) to avoid flatten
is enough. The distinction between List and Array isn’t really significant here.
( //~8.5sec
var a = [1, 2, 3, 4, 5, 6, 7, 8];
bench { ~b = (a.size.factorial).asInteger.collect{arg i; a.permute(i)}; };
bench { ~b = ~b.flatten.asStream };
)
time to run: 0.067028030000074 seconds.
time to run: 15.500484482 seconds.
hjh
PS Apparently my laptop is not very good