Posts

Simplicity and maintainability

We generally consider simple, efficient, and maintainable to be related: that something improved in one of the three ways will likely benefit in others. However, there's an ambiguity in "simple", related to one in "maintainable". Consider the following case: You have a delimited set of instances of some human-readable structure. For this to be a useful example, the number of cases should be somewhere between five and maybe a hundred, and it should be open - that is, there should be a probability that more instances will be added as time goes on. An example might be specifications of message structures in an ASCII format for an evolving protocol. These structures also aren't simple discrete units. They come in pairs or even triplets, with there being a base type and one or more derived types. The derivation is fairly simple for a human to do and could be put into a set of rules for an algorithm. You need all of these cases to be defined for a program to r...

Removing consecutive if tests

Consider a set of exclusive tests followed by actions. That is: if (test1()) doSomething(); else if (test2()) doSomethingElse(); else if (test3()) ... If you get the data for the conditions and the actions into a single convenient parameter, if necessary by creating an arbitrary struct, then you have: Param foo(...); if (test1(foo)) doSomething(foo); else if (test2(foo)) doSomethingElse(foo); else if (test3(foo)) ... This is exactly equivalent, logically, to the sequence: using func_pair = std::pair<std::function<bool(const Param&)>, std::function<void(Param&)>>; std::vector<func_pair> vec; auto iter = std::ranges::find_if(vec, [&foo](const func_pair& inVal) -> bool { return inVal.first(foo); }); if (iter != vec.end()) iter->second(foo); if the vector has been populated with the appropriate tests. (This is not exactly the Command pattern, which is more a generalization of the switch statement. I introduced this a few posts ago, in ...
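As a minimal sketch of the table-driven form described above, the following compiles on its own. The fields of Param, the dispatch and makeTable helpers, and the three example tests are assumptions made for illustration; they are not from the post. The sketch uses std::find_if so that it also builds before C++20; std::ranges::find_if(table, pred) is the equivalent C++20 spelling.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parameter bundle: the data the tests read and the actions modify.
struct Param {
    int value = 0;
    std::string log;
};

using func_pair = std::pair<std::function<bool(const Param&)>,
                            std::function<void(Param&)>>;

// Runs the action paired with the first test that passes.
// Returns false when no test matches (a chain with no final else).
bool dispatch(const std::vector<func_pair>& table, Param& foo) {
    auto iter = std::find_if(table.begin(), table.end(),
                             [&foo](const func_pair& entry) {
                                 // Both callables must be set before use.
                                 assert(entry.first && entry.second);
                                 return entry.first(foo);
                             });
    if (iter == table.end()) return false;
    iter->second(foo);
    return true;
}

// Example table. Order matters, exactly as it does in the if / else if chain.
std::vector<func_pair> makeTable() {
    return {
        {[](const Param& p) { return p.value < 0; },
         [](Param& p) { p.log = "negative"; }},
        {[](const Param& p) { return p.value == 0; },
         [](Param& p) { p.log = "zero"; }},
        {[](const Param& p) { return p.value < 10; },
         [](Param& p) { p.log = "small"; }},
    };
}
```

Because the search stops at the first passing test, the later tests are never evaluated, which matches the short-circuit behaviour of the original chain. Adding a case means adding a row to the table rather than editing control flow.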

Emergent Design

Before talking about emergent design, I always feel that I should suggest going and looking at Jim Coplien's discussions of the problems with "agile" design. The problem is, Coplien's right. So-called emergent design - that is, design generated by an atomic, bottom-up approach in incremental steps - will work only if someone is (or a few people in very tight communication are) working in conformity to an implicit but detailed and coherent overall plan, in which case it's not really emergent at all. Continuous refactoring can preserve and even improve an underlying good design, but it's a very long way to get from a heap of well-executed details to anything like an optimal system architecture. The other problem is, we don't really have a choice. The alternative is detailed top-down design - that is, waterfall - but that requires a requirements stage which is detailed, thorough, and subtle, and no organization I've been in in the past decade and a half w...

Enums and classes

One general piece of advice which I sometimes see is to avoid enums entirely and instead prefer the use of classes to represent what would otherwise be enum values. (This is not meaningful in Java, where enums are classes with full extensibility.) Instead of enum class OfficeNames { LAUDS = 0, PRIME = 1, TERCE = 2, SEXT = 3, NONE = 4, VESPERS = 5, COMPLINE = 6 }; one would define an abstract class as a parent: class OfficeName { public: virtual ~OfficeName(); }; and then implement a set of inheriting concrete classes, which should be stateless: class LaudsName: public OfficeName { public: ~LaudsName() override; }; // etc. Even if you don't have any virtual functions which do something, this makes it impossible to cast an integer directly to the enum type (which can create surprises in handling what you may think is a strictly limited set of values). The bigger driver of the advice is that it allows the removal of switch statements. Instead of: void MyClass::performUpdate(con...
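The two points above (an integer can be cast into an enum, and a switch can be replaced by a virtual call) can be sketched as follows. This is an illustration only: the describe() virtual, the free functions describeEnum and describeOffice, and the VespersName class are hypothetical and are not the post's performUpdate example, whose body is cut off in the excerpt.

```cpp
#include <string>

// Enum version, as in the post.
enum class OfficeNames { LAUDS = 0, PRIME = 1, TERCE = 2, SEXT = 3,
                         NONE = 4, VESPERS = 5, COMPLINE = 6 };

// Hypothetical consumer: every function like this needs its own switch,
// plus a fallback for integers that were cast into the enum type.
std::string describeEnum(OfficeNames name) {
    switch (name) {
        case OfficeNames::LAUDS:   return "Lauds";
        case OfficeNames::VESPERS: return "Vespers";
        // remaining cases omitted for brevity
        default:                   return "unknown";
    }
}

// Class version: the set of values is the set of concrete classes.
class OfficeName {
public:
    virtual ~OfficeName() = default;
    virtual std::string describe() const = 0;  // hypothetical virtual
};

class LaudsName : public OfficeName {
public:
    std::string describe() const override { return "Lauds"; }
};

class VespersName : public OfficeName {
public:
    std::string describe() const override { return "Vespers"; }
};

// The switch disappears: the call is dispatched on the dynamic type.
std::string describeOffice(const OfficeName& name) {
    return name.describe();
}
```

With the enum, static_cast<OfficeNames>(42) compiles and reaches the default branch; with the classes there is no comparable way to produce an out-of-range value.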

Function sizes

Having expressed my discomfort with the idea of a fixed function size, what do I think should be the maximum size of a function? First of all, counting semicolons is pointless. There is no difference in essentials between foo(generateX().createParam()); and auto x = generateX(); auto param = x.createParam(); foo(param); other than the greater scope in which the two temporary variables live. (I note that many developers tend to be more comfortable with the latter. I see much more of int n = std::accumulate(...); process(n, param2); /* n not used further */ than I do of process(std::accumulate(...), param2); in general code.) There is an obvious outer limit on a normal maximum size: too large to be seen on one screen. We will allow the screen to be that of a large monitor. However, even that has its exceptions. Assume that you have received a message from an external source and that the type is expressed as a string. Based on that type we create a new message object...

Thoughts on Refactoring and Design: Five Lines

One (very decent) book on refactoring has the title Five Lines of Code in which five lines is taken as the upper limit on a function (where five means "count semicolons and control statements"). In addition, it has two rules which tightly restrict what those five lines can be: 1) Only one if statement per function, at the top of the function. This statement would be the "one thing" that function does. 2) No if plus else, or switch, controls for data one has control over (i.e. part of one's own source and not from a standard or third-party library). Instead, late bindings using polymorphism are prescribed. A corollary of this is the elimination of enums in favour of substantive classes (except in languages like Java, where enums are substantive classes). The author does note that the number five might vary a bit by domain. You can certainly do this in C++ -- the TypeScript examples are easily convertible to C++ -- but it may be worth putting forward some concer...
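As a rough illustration of the first rule (one if statement per function, at the top), here is a small before-and-after sketch in C++. The function names and the summing task are invented for the example and do not come from the book or the post.

```cpp
#include <vector>

// Before: the if sits inside the loop body, so the function both
// iterates and decides.
int sumPositiveBefore(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) {
        if (v > 0) total += v;
    }
    return total;
}

// After: the if is the first and only statement of its own function.
void addIfPositive(int v, int& total) {
    if (v > 0) total += v;
}

// The loop now only iterates and delegates the decision.
int sumPositiveAfter(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) addIfPositive(v, total);
    return total;
}
```

Both versions compute the same result; the rule trades one slightly longer function for two very short ones.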