Loops

This is where my bias for a certain sort of descriptive programming as essential to maintainability comes out.

By "descriptive" in this context I mean "code which is explicit about what the intent of the developer is". (A side effect of this sort of declaration is usually small, named functions, which are good for other reasons. But this post isn't about functions, as such. It's about loops.)

The for loop was introduced as part of the structured programming revolution. (It's not a fundamental: it's easy to create with goto, the minus operator, and if, the real primitives*, but it "tames" the goto into a reliable and meaningful higher-level concept.)

* Ok, jz, jnz, add, sub, and cmp are the real primitives. I'm not talking assembler here.
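To make the "easy to create with goto" point concrete, here is a minimal sketch in go (which still has goto) of the same summation written once as a for loop and once with if and goto. The function names are invented for the illustration.

```go
package main

import "fmt"

// sumBelowFor adds the integers 0..lim-1 with an ordinary for loop.
func sumBelowFor(lim int) int {
	total := 0
	for i := 0; i < lim; i++ {
		total += i
	}
	return total
}

// sumBelowGoto is the same loop spelled out with if and goto:
// initialise, test, run the body, step, jump back to the test.
func sumBelowGoto(lim int) int {
	total := 0
	i := 0
top:
	if i < lim {
		total += i
		i++
		goto top
	}
	return total
}

func main() {
	fmt.Println(sumBelowFor(5), sumBelowGoto(5)) // prints "10 10"
}
```

The two functions do the same work; the for form simply puts the initialisation, the test and the step in one place where a reader expects to find them.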

The classic C for loop is

int i;
...
for (i = 0; i < lim; ++i) { contents, possibly referencing i }

Recent languages have provided syntactic sugar for iterating over something other than pure integral values (and sometimes over pure integral values as well; see iota). Almost every "modern" language now supports a range-based for loop of some sort. For today I'm going to compare the support in go with the approach taken by C++ STL algorithms, which effectively abolish explicit loops in favour of library functions taking functions as parameters.

Go is simple when what you want to do is built into go. (This is a tautology, obviously.) As you move away from that, it becomes less intuitive to follow. (Not worse than C, but not necessarily much better.)

Go provides a range loop, which uses go's support for multiple values to iterate over both an index (or key) and a value at the same time. This allows easy implementation of logic resembling for_each_n, and it avoids the need to index into the slice or map on every iteration.

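Compared to C++, though, it has a problem: the range loop has traditionally not been extensible. It's confined to built-in types: slices and maps, plus arrays, strings and channels. (Go 1.23 and later also allow ranging over iterator functions, which narrows the gap; the examples here stick to the built-in forms.) (The C++ range loop allows user-created types, as long as they provide the appropriate interface to act as ranges. I'm skipping over the C++ range for loop here as I think that it should normally be treated as superseded by the C++20 range/views versions of the STL algorithms.)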

There's another reason to prefer C++: the greater expressiveness of the STL algorithms, which declare intent more clearly than the use of a single kind of loop - literally single in go, where for is used for while - in which all the logic has to be internal to the loop.

Consider the examples below. In all cases we have a range (vector (or actually, just about any single-valued container)/map in C++, slice/map in go) called r; a const function bool bar(T t) which operates on the type in the collection, and another function T foo(T t) which transforms the argument into another value of the same type (and which may be a closure). In some cases there are destination ranges as well, as required by the examples.

Case 1:

This is the simplest case: a simple for loop. In this instance foo(), for the purposes of the example, is taken to be a closure affecting the state of the data it references and returning void, for maximum simplicity.

The go version is:

for _, v := range r {
    foo(v)
}

This is nice and simple, compared with the C-style equivalent, which requires indexing into the slice on each iteration:

for i := 0; i < len(r); i++ {
    foo(r[i])
}

(Note, though, that the two kinds of loop can behave differently in go, even if it makes no difference in this particular case. In the old-fashioned loop r[i] names a different element, at a different location in memory, on each iteration, because it is a different offset into the array. In the range form v is a copy of the element, not a reference to it, and before Go 1.22 the same variable was reused on every iteration. This can lead to subtle bugs when its address is taken or it is captured by a closure: the variable has to be copied before it can be passed into any context which uses it outside the loop. From Go 1.22 onwards each iteration gets a fresh variable.)
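The capture issue is easiest to see with closures that outlive the loop. The following is a minimal sketch, with an invented helper name, of the usual defensive idiom: copy the range variable inside the body so each closure gets its own value whichever go version compiles it.

```go
package main

import "fmt"

// collect returns one closure per element of r; each closure reports the
// element it was created for.
func collect(r []int) []func() int {
	var fns []func() int
	for _, v := range r {
		v := v // per-iteration copy: required before Go 1.22, harmless after
		fns = append(fns, func() int { return v })
	}
	return fns
}

func main() {
	for _, f := range collect([]int{1, 2, 3}) {
		fmt.Println(f()) // prints 1, then 2, then 3
	}
}
```

Without the copy, a toolchain older than Go 1.22 would have every closure share the single loop variable.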

C++ is even more laconic:

std::ranges::for_each(r, foo);

though the number of characters is actually greater in this version because of the namespace qualification.

Conceptually, both say the same thing: iterate over every element and apply foo to every element.

Note, though, that in C++ exactly the same code will handle a generated series (e.g. an iota_view, or a user-defined type), or a filtered view. For go, you need to modify the loop for the first:

for v := r.generate(); !r.IsEnd(v); v = r.generate() {
    foo(v)
}

... and the second isn't even an available concept.
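For a concrete picture of the hand-written generator loop, here is a minimal sketch. It does not use the generate/IsEnd shape above; instead it assumes a closure that returns a value plus an "ok" flag, which is one common way to write this in go. The names countdown and drain are invented for the illustration.

```go
package main

import "fmt"

// countdown returns a generator: each call yields the next value
// (from, from-1, ..., 1) and reports false once the series is exhausted.
func countdown(from int) func() (int, bool) {
	n := from
	return func() (int, bool) {
		if n <= 0 {
			return 0, false
		}
		n--
		return n + 1, true
	}
}

// drain runs the generator to completion with a three-clause for loop,
// which is the modification the range loop forces on you.
func drain(next func() (int, bool)) []int {
	var out []int
	for v, ok := next(); ok; v, ok = next() {
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(drain(countdown(3))) // prints [3 2 1]
}
```

The loop header now carries the iteration protocol itself, so every call site has to repeat it correctly.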

Let's make what we're doing a trifle more complicated.

Case 2:

Copy all the elements matching a criterion to a second unsorted collection (vec will be a vector in C++, a slice in go):

std::ranges::copy_if(r, std::back_inserter(vec), bar);

We have the extra wordiness of supplying the back_inserter, but it does declare on its face exactly what it does.

for _, v := range r {
    if bar(v) {
        vec = append(vec, v)
    }
}

Again, because of the length of the names in C++, the go version is not really longer by the standards of wc (well, actually, about the same in characters, longer in lines). But you have to look inside the loop to determine what is happening. The cost is probably fairly similar, especially if you reserve adequate space in the target first.
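Reserving space in go means giving the target slice its capacity up front. A minimal sketch, assuming int elements and an invented copyIf helper:

```go
package main

import "fmt"

// copyIf appends the elements of r that satisfy keep to a slice whose
// capacity is reserved in advance, so append never has to reallocate.
func copyIf(r []int, keep func(int) bool) []int {
	vec := make([]int, 0, len(r)) // length 0, capacity for the worst case
	for _, v := range r {
		if keep(v) {
			vec = append(vec, v)
		}
	}
	return vec
}

func main() {
	isEven := func(n int) bool { return n%2 == 0 }
	fmt.Println(copyIf([]int{1, 2, 3, 4, 5, 6}, isEven)) // prints [2 4 6]
}
```

This over-reserves when few elements match, which is the same trade-off as calling reserve on the vector in C++.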

Now for a numeric example. I'm using C++23 fold_left rather than accumulate, but (as with the other C++ algorithm calls) this could be converted back to an earlier standard by using the version with iterators, in this case under a different name (std::accumulate).

Case 3:

auto res = std::ranges::fold_left(r, 0, std::plus<int>());

res := 0
for _, v := range r {
    res += v
}

The same general observations apply as in case 2, with one other tweak: the C++ example is a function returning a value, where the go example modifies a value in the scope in which the loop is embedded. To really do what the C++ version does, you need

func accumulate(r []int) int {
    rval := 0
    for _, v := range r {
        rval += v
    }
    return rval
}

res := accumulate(r)

This is starting to get a bit more complicated, though the last form does provide encapsulation for multiple uses.
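If you want the encapsulation once rather than per type, go's generics (1.18 and later) allow a hand-written fold. This is a sketch of my own helper, not a standard library function; the name foldLeft is borrowed from the C++ algorithm purely for comparison.

```go
package main

import "fmt"

// foldLeft combines the elements of r from left to right, starting from
// init and applying op to the running result and each element in turn.
func foldLeft[T, A any](r []T, init A, op func(A, T) A) A {
	acc := init
	for _, v := range r {
		acc = op(acc, v)
	}
	return acc
}

func main() {
	sum := foldLeft([]int{1, 2, 3, 4}, 0, func(a, v int) int { return a + v })
	fmt.Println(sum) // prints 10
}
```

The loop is still there, but it is written once and the call site states the intent, which is the property the C++ version gets from the library.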

Let's make that one a bit more complicated: swap the int for a string, and require that only values which satisfy a predicate be added (concatenated). We'll ignore the "wrap in a function" version for go.

Case 4:

auto res = std::ranges::fold_left(r | std::views::filter(bar), ""s, std::plus<std::string>());

res := ""
for _, v := range r {
    if bar(v) {
        res += v
    }
}

The C++ version visibly separates the filtering logic from the accumulating logic. In a case like this, where the filter is just a named function, that's a minor (though still useful) advantage; as the complexity of the data increases it takes more attention to look into the go loop to see what is going on. If this were one-time logic, written inline in the raw loop in go but passed as a lambda to the filter view adaptor in C++, it would make more of a difference.
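The nearest go equivalent to that separation is to pull the filter out into a helper of your own. The sketch below assumes string elements and invented names (filter, concatIf); note that, unlike a C++ view, the helper is eager and builds an intermediate slice.

```go
package main

import (
	"fmt"
	"strings"
)

// filter returns a new slice holding the elements of r for which keep is
// true. It is eager: the result is allocated and filled immediately.
func filter(r []string, keep func(string) bool) []string {
	var out []string
	for _, v := range r {
		if keep(v) {
			out = append(out, v)
		}
	}
	return out
}

// concatIf keeps the two steps visibly apart: filter first, then join.
func concatIf(r []string, keep func(string) bool) string {
	return strings.Join(filter(r, keep), "")
}

func main() {
	r := []string{"alpha", "", "beta", "", "gamma"}
	nonEmpty := func(s string) bool { return s != "" }
	fmt.Println(concatIf(r, nonEmpty)) // prints alphabetagamma
}
```

The one-time predicate can now be passed as a function literal at the call site, at the price of writing and maintaining the helper yourself.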

Now let's look at something more complex at a higher level. We want to process a set of values from a map, transforming them before processing. We also want to process them twice: first getting a minimum value, and secondly (a little later) dividing the output of the transform into two ordered collections based on a predicate. (In this case the names swap roles relative to the earlier cases: bar is the transform and foo is the predicate.)

Case 5:
Here's the C++:

auto r2 = r | std::views::values | std::views::transform(bar);

auto res = std::ranges::min(r2);
...
std::ranges::partition_copy(r2, std::inserter(s1, s1.end()), std::inserter(s2, s2.end()), foo);

Note that we've used two sets as our target collections, which gives us the ordering out of the box. We've also been able to capture the value and transform logic in a lazily-evaluated view, so we don't need to repeat the logic.

For go:

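// r is a map, so there is no r[0] to start from: track the first value seen
first := true
var res T // T stands for whatever type bar returns
for _, v := range r {
    s := bar(v)
    if first || cmp.Less(s, res) {
        res, first = s, false
    }
}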
...
for _, v := range r {
    s := bar(v)
    if foo(s) {
        s1 = append(s1, s)
    } else {
        s2 = append(s2, s)
    }
}
slices.Sort(s1)
slices.Sort(s2)

go has no native sorted slices, so we have to sort after assigning the values.
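For comparison, here is a self-contained sketch of the same task with the transform pulled into one helper so that it is written only once. Everything specific here is an assumption for the example: a map[string]int, int results, the helper names, and sort.Ints from the older sort package in place of slices.Sort so that it builds on older toolchains.

```go
package main

import (
	"fmt"
	"sort"
)

// transformed applies f once to every value of m and returns the results.
// Map iteration order is unspecified, so the result is in no particular order.
func transformed(m map[string]int, f func(int) int) []int {
	out := make([]int, 0, len(m))
	for _, v := range m {
		out = append(out, f(v))
	}
	return out
}

// minOf returns the smallest element; ok is false for an empty slice.
func minOf(vals []int) (lowest int, ok bool) {
	if len(vals) == 0 {
		return 0, false
	}
	lowest = vals[0]
	for _, v := range vals[1:] {
		if v < lowest {
			lowest = v
		}
	}
	return lowest, true
}

// partitionSorted splits vals by pred and returns both halves sorted.
func partitionSorted(vals []int, pred func(int) bool) (yes, no []int) {
	for _, v := range vals {
		if pred(v) {
			yes = append(yes, v)
		} else {
			no = append(no, v)
		}
	}
	sort.Ints(yes)
	sort.Ints(no)
	return yes, no
}

func main() {
	r := map[string]int{"a": 3, "b": -1, "c": 4, "d": 2}
	vals := transformed(r, func(n int) int { return n * 10 })
	lowest, _ := minOf(vals)
	s1, s2 := partitionSorted(vals, func(n int) bool { return n > 0 })
	fmt.Println(lowest, s1, s2) // prints -10 [20 30 40] [-10]
}
```

This removes the repetition, but only by adding three helpers, and the intermediate slice is evaluated eagerly rather than lazily. Sorted slices also keep duplicates, which the C++ sets would discard.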

At this point we've reached the level of "much easier to determine what's going on", i.e. more maintainable, in C++. There are also a lot more places in the go code to make errors that the compiler won't catch, and there's repetition of logic.

This is, in some ways, not really fair to go. Go isn't a direct competitor to C++; it could probably be better described as "partway between C and Java". It has a large set of libraries which deliver high-level functionality such as networking at high-level protocols, image processing and various encodings, and explicit support for concurrency through goroutines, but it has (when used carefully) a relatively low overhead compared to Java. (Yes, it uses GC, but it also makes careful use of the stack; that doesn't give you the control C or C++ do, but it should normally be cheaper than Java, and of course it doesn't rely on a virtual machine.) As a string manipulation language for simple applications which require only arrays and hash maps it also competes with Perl and Python, and has better typing than either.

An experienced C++ developer might whip C++ out for small utilities - I do, sometimes - but it's normally deployed for larger projects, and/or projects where you want low-level control over resources (e.g. reliably scoped lifetimes, fine control of memory) plus high performance, but also OO methodology (or you'd use C). C++ won't surprise you with a garbage collector running.

The larger your project, the more the minor maintainability issues will add up. C++ requires more knowledge to use effectively but properly used modern C++ allows the deployment of more tools to tackle a full set of problems, and allows code to be more expressive and thus more maintainable.

(The exception to this better support for large projects is that go has full support for modules, whereas compiler support for C++20 modules is lagging. It will get there eventually.)
