CWEB
After a gap of about two decades(!) I decided to revisit composition in Knuth/Levy's CWEB on a C++ project.
First, some technical notes:
1) I was unable to get the ctwill macros to work with my TeX system (despite the fact that it is all part of a single distribution). Admittedly, I didn't try too terribly hard, but it means that my output contains no mini-indexes or other ctwill features.
2) I write code using Emacs, and this meant also adjusting to switching between cweb-mode and c++-mode. A number of functions I had written in Emacs Lisp were not directly portable to the new context, as they depended on file naming conventions. (If I were to take up use of CWEB more heavily, they would not be too difficult to modify: search backwards for @(...@>= to determine the current output file name.)
3) My initial inclination was to use CWEB includes to generate one TeX file for the project. I abandoned this both because it ended up causing some size-related problems and because I think it is inconsistent with what I consider a better use of the documentation model, namely tracking modules (see below).
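As a rough illustration of that backwards search, here is a minimal sketch in C++ (not the Emacs Lisp I would actually use). The function name current_output_file is hypothetical, and the sketch assumes the buffer is available as a single string. It accepts both the @(name@>= and the @(name@>+= forms and deliberately ignores corner cases such as an escaped @@.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the file name of the nearest "@(name@>=" (or
// "@(name@>+=") header that starts before `cursor`, or an empty string if
// there is none. Simplification: escaped "@@" sequences are not handled.
std::string current_output_file(const std::string& buffer, std::size_t cursor) {
    std::size_t pos = std::min(cursor, buffer.size());
    while (pos > 0) {
        // Last "@(" that starts strictly before pos.
        const std::size_t open = buffer.rfind("@(", pos - 1);
        if (open == std::string::npos) break;
        const std::size_t close = buffer.find("@>", open + 2);
        if (close != std::string::npos) {
            // Only a header followed by "=" or "+=" defines output-file code.
            const bool defines =
                buffer.compare(close + 2, 1, "=") == 0 ||
                buffer.compare(close + 2, 2, "+=") == 0;
            if (defines) return buffer.substr(open + 2, close - open - 2);
        }
        pos = open;  // keep searching further back
    }
    return std::string();
}
```

Called with the buffer contents and the cursor offset, it gives the name of the file that the code under the cursor will be tangled into, which is what the file-name-dependent editor functions need.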
With that out of the way, what were my conclusions?
First, matching my recollections, the process of writing code in a discursive mode does indeed change the nature of the coding process, and for the better. There is less need to hold everything in your head, as it is natural to sketch out design immediately adjacent to the actual implementation, and to do so in a mode easier to read and revisit.
Secondly, the styles of OO coding which have become much more normative since the 1990s (small classes and functions, extension via delegation rather than inheritance, programming to interfaces, and dependency injection) all work very well with CWEB, as the macro model allows grouping for reading which maximizes clarity of code which naturally wants to be compiled in separate compilation units.
Thirdly, and this is important, CWEB's tangle mechanisms make it equally straightforward to work at a small module level regardless of compiler support at an organizational level. My current compiler's support for C++ modules is very limited and buggy ("crashes compiler" levels of buggy), but the organization of a CWEB file expressing a conceptual small module is essentially the same regardless of whether one uses modules or a traditional .h/.cpp model of file organization. You not only get a logical grouping of classes but you also get a discussion tying them together at the conceptual level.
As always, the major drawback lies not in the technical end of it, nor even in the learning curve for basic composition in CWEB, but in the fact that the approach pays off if, and only if, the developer is fluent in the natural language used by the team (e.g. English), is reasonably competent at prose composition, and is committed to documenting thought processes during composition. CWEB is very much not a model which fits with documenting after writing the code.
The other, minor drawback is that if a single .w file generates a number of headers and .cpp files, the standard build systems cannot detect whether a change affects a header, a single implementation file, or only the documentation, and build times thus increase. If one models one's files on conceptual small modules, then a migration to using actual modules as the generated code files (given adequate compiler support) would provide a partial solution for this. It's also a reason not to use CWEB includes, as that tends to magnify the problem.
There is nothing preventing using CWEB for some files in a project and having other files edited directly. However, this implies that there is a good way of defining which files are complex or interesting enough to benefit from the additional information provided by the literate programming process, and which are simple and straightforward enough not to benefit. It probably makes more sense to use or not use it on a project by project basis, or at least a directory by directory basis. (A library might be implemented differently from another library in the same codebase, for example.)
I found it more practical to keep my unit tests in separate files (and, as is my custom, in a separate subdirectory). The process of creating the tests, and the interaction between that process and writing the substantive code, is invisible from the point of view of the flow of the documentation.
In particular, the literate programming paradigm is pretty much in opposition to the TDD model: it puts general design, and discussion of design, up front; if you don't agree with Jim Coplien about the importance of up-front design, and just want to have it bubble up from below, this model is not for you.
To be more concrete about it: CWEB encourages a model in which (within a given output file) one sketches out the logic as a series of single-line descriptions of high-level functionality (mixed in with some literal code as a framework or as interstitial glue), each described in a conceptual way. Each such line is then expanded into a more detailed discussion of the design and implementation issues, accompanied by a mix of code and further such single-line descriptions. This goes on until you reach a level where the expansion has no single-line components. Because the single-line descriptions are expanded as macros, the unit of comprehension - the unit where one is concerned with cyclomatic complexity - is not the function but rather the block of code represented by the macro.
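To make the expansion step tangible, here is a toy sketch in C++ of how named sections nest. It is a much simplified stand-in for what ctangle does, not its actual implementation: the Sections type and the expand function are hypothetical names, section bodies are plain strings, and a reference is written @<name@> as in CWEB.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical model: section name -> body. A body may contain references of
// the form @<name@>, which stand for the body of the named section.
using Sections = std::map<std::string, std::string>;

// Recursively replaces every @<name@> reference with the named section's
// expanded body. The depth limit is a crude guard against cyclic definitions.
std::string expand(const Sections& sections, const std::string& name,
                   int depth = 0) {
    if (depth > 64) throw std::runtime_error("sections nested too deeply");
    const auto it = sections.find(name);
    if (it == sections.end())
        throw std::runtime_error("undefined section: " + name);
    const std::string& body = it->second;
    std::string out;
    std::size_t pos = 0;
    for (;;) {
        const std::size_t open = body.find("@<", pos);
        if (open == std::string::npos) {
            out += body.substr(pos);  // no more references: copy the rest
            break;
        }
        const std::size_t close = body.find("@>", open + 2);
        if (close == std::string::npos)
            throw std::runtime_error("unterminated reference in: " + name);
        out += body.substr(pos, open - pos);  // literal code before the reference
        out += expand(sections, body.substr(open + 2, close - open - 2),
                      depth + 1);
        pos = close + 2;
    }
    return out;
}
```

A top-level section whose body is just @<Read the input@> followed by @<Print the total@> reads as an outline, while the compiler only ever sees the flattened text. That is why the block behind each single-line description, rather than the enclosing function, becomes the unit one reasons about.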
It is critical for the benefits of the discipline that this captures a process, rather than being a back-formation from already-generated code. The iterative model in TDD, which, in its iterations, keeps revising the same code to reach a final form, and which is driven feature-by-feature -- you need a concrete feature to test -- is an entirely different process.
Tests are ancillary to this process. They validate the implementations; in principle, complete test coverage of a well-written CWEB file would have one test for each bit of functionality large enough to warrant substantive discussion. You could, in principle, generate unit tests within the flow of a CWEB program. The obvious way to do so, without extending cweave, would be to embed the discussion of each test and the associated code block inside a footnote attached to the corresponding block of discussion. This does work, but it looks a little odd: cweave doesn't know about footnotes, so although the relevant code blocks get put into the footnote, they are numbered as part of the overall monotonically increasing section sequence. It would also make for a footnote-heavy format. (Shifting the notes to endnotes is also possible, but that would mean that the tests would be devoid of any context, and the visible sequence of numbers would skip.)