Psalm Specifications
One domain I decided to use was: create a program which can use simple XML inputs plus built-in knowledge to generate the Day Hours of the old breviary for any given day. The attraction of this project was twofold. First, the data itself is available via old archive.org copies of English translations of the Sarum Breviary, as modified by hardcopy versions of both the Sarum and Roman breviaries; although I would have to extract and structure data, I would not have to input it all by hand. Secondly, the rules for determining what is to be presented are, shall we say, non-trivial; there's a reason breviaries tend to come with multiple ribbons for keeping track of multiple places simultaneously. (In addition, this is a domain I know enough about that I would not flounder in the complexities that had to be resolved.)
The complexity may be gauged by the fact that the library I wrote representing the domain is somewhat over 28,000 lines of code (that's going by wc and some lines are single characters such as { or }, but it's still a substantial project).
That said, let me start with a very low-level area of the breviary library: representing psalms (the office is essentially built around psalms and parts of psalms).
Rather, more precisely: specifying psalms, or parts of psalms.
In the western breviary we can have three types of things treated as "psalms": first, entire psalms; second, one or more parts of psalms which are standardized sections (Psalm 119/118 is entirely structured this way); third, an extracted subset of a psalm which is not such a standard block (there is one instance of this, in Compline). I will be using the Coverdale / AV numberings and not the Vulgate numberings, so it is 119 from now on. One of these units will (most of the year, with a few exceptions for some psalms which are said back-to-back) be followed by the recitation of the Gloria Patri, and in some cases each such section will be surrounded by a repeated antiphon.
So our first concrete task is the creation of a class to represent such a psalm specification. It can be used to control what psalm verses get generated at a given point when the output is being formatted.
The specifications have one of three separate formats:
e.g. "23" --> Psalm 23
e.g. "119:1" --> Psalm 119, Section 1
e.g. "31:1-6" --> The first six verses of Psalm 31 (this is a special case).
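The three formats can be told apart from the punctuation alone. As a minimal sketch (the enum SpecKind and the function classifySpec are hypothetical names of my own for illustration, not part of the library described here):

```cpp
#include <cassert>
#include <string>

// Hypothetical names for illustration only.
enum class SpecKind { WholePsalm, Section, VerseRange };

// Classify a specification string by its punctuation:
//   "23"     -> whole psalm
//   "119:1"  -> numbered section of a psalm
//   "31:1-6" -> explicit verse range within a psalm
inline SpecKind classifySpec(const std::string &spec)
{
  const auto colon = spec.find(':');
  if (colon == std::string::npos)
    return SpecKind::WholePsalm;
  if (spec.find('-', colon + 1) != std::string::npos)
    return SpecKind::VerseRange;
  return SpecKind::Section;
}

// Usage: each example format from the list above maps to one kind.
inline void checkClassifySpec()
{
  assert(classifySpec("23") == SpecKind::WholePsalm);
  assert(classifySpec("119:1") == SpecKind::Section);
  assert(classifySpec("31:1-6") == SpecKind::VerseRange);
}
```

This sketch does no validation; it assumes the string has already been checked for well-formedness.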
Without further ado, here's the initial class interface:
class PsalmSpec : public IPsalmSpec
{
public:
PsalmSpec() {}
explicit PsalmSpec(std::span<std::string> inPsalms);
PsalmSpec(const std::string &inFirst, const std::string &inSecond)
: m_psalms{ Validate(inFirst), Validate(inSecond) }
{ }
explicit PsalmSpec(const std::string &inPsalm):
m_psalms{ Validate(inPsalm) }
{ }
~PsalmSpec() override;
void formattedPrint(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum = false) const override;
static const std::string& Validate(const std::string &inVal);
private:
std::vector<std::string> m_psalms;
};
A couple of C++ version observations: (1) brace initialization (C++11) is used in the member initializer lists; in C++03 initializing a vector that way was impossible and would have required calls to push_back() after the vector was created, much as the span constructor below still does; (2) std::span (C++20) is used for another constructor, which hides the type of the underlying container behind a view -- it could be an array or a vector but we neither know nor care. In pre-C++20 code this would have had to be a const reference to an array, a const reference to a vector, or a C-style pointer and length; (3) C++11-style override is used for virtual functions.
The more recent language features generally tend to make the code a little cleaner, but they're nothing earth-shaking.
More generally, the class is immutable and implements a single-function interface for managing output. We'll look at why the interface exists rather than just a simple concrete class below. Even though it implements an interface, it does support all the standard data class calls via the automatically-generated functions.
The constructor with the span is deferred to the cpp file:
PsalmSpec::PsalmSpec(std::span<std::string> inPsalms)
{
std::ranges::for_each(inPsalms, [&](const auto &p) { m_psalms.push_back(Validate(p)); });
}
This is short and simple, made so in part by the use of a lambda expression; it was the original implementation.
This could, though, be implemented using a more specific algorithm (std::transform) or using a transforming view. Let's look at that.
The equivalent use of std::transform is
PsalmSpec::PsalmSpec(std::span<std::string> inPsalms)
{
std::ranges::transform(inPsalms, std::back_inserter(m_psalms),
[](const auto &p) { return Validate(p); });
}
The biggest difference this makes is that the function no longer needs to be a closure with a reference to the environment; in theory you could just pass Validate() itself as the argument. (The compiler will probably optimize away the wrapper in any case.) It is unlikely to be any faster, though, since std::back_inserter still calls push_back() once per element, just as the for_each version does; reserving capacity up front would help both equally. The other difference is that the algorithm itself tells you the overall shape of the action.
The other alternative as of C++20 is to use the views transform capability:
std::ranges::copy(inPsalms | std::views::transform([](const
std::string& inStr) { return Validate(inStr); }),
std::back_inserter(m_psalms));
All this really does, in this use case, is move the transformation call from being an algorithm argument to being a stage in a views pipeline. It's not clear to me that there is any particular benefit of one form over the other; std::views::transform is more useful when the algorithm to be extended is not std::copy.
Note that in all these cases the use of the ranges version of the algorithms means that there is no more reliance on explicit iterators at the call site -- the iterators are now internal, which allows not only for the simplification of the syntax but also for (at least potentially) a slightly more efficient implementation. (Note that in this case the use of span guarantees that the objects are in contiguous memory, and template specialization might make use of that.)
The use of Validate() is mildly interesting. Its use keeps us from having simpler initializations; what does it do?
Originally the function for handling output of the data had to do several checks on the format of the stored record. This tangled up the logic and could be potentially more expensive (construction takes place once, output potentially many times).
When the specification formats (referenced above) are being processed, std::stoi() gets called, which throws std::invalid_argument if the string does not begin with something it can parse as a number. We would like to fail fast, during the construction phase.
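The relevant std::stoi behaviour can be seen in isolation. This is a small sketch; stoiRejects is a hypothetical helper name, and the inputs are just the example formats from earlier.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Returns true when std::stoi rejects the input with std::invalid_argument.
inline bool stoiRejects(const std::string &text)
{
  try {
    (void)std::stoi(text);
    return false;
  } catch (const std::invalid_argument &) {
    return true;
  }
}

// Usage: parsing stops at the first character that cannot be part of the number.
inline void checkStoi()
{
  assert(std::stoi("119:1") == 119); // stops at ':'
  assert(std::stoi("31:1-6") == 31);
  assert(stoiRejects(":1"));         // nothing to convert
  assert(stoiRejects(""));
  assert(!stoiRejects(" 23"));       // leading whitespace is skipped
}
```

Because parsing simply stops at the colon, the psalm number can be read straight from the front of a section spec, but a spec that begins with a separator fails only at the point of conversion unless it is checked earlier.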
So the function below was introduced:
const std::string&
PsalmSpec::Validate(const std::string &inVal)
{
if (!std::isdigit(inVal[0]))
throw std::runtime_error("Invalid beginning for psalm spec " + inVal);
int i = 1;
while (inVal[i] != '\0')
{
char c = inVal[i++];
if (!std::isdigit(c) && (c != '-') && (c != ':'))
throw std::runtime_error("Invalid character in psalm spec " + inVal);
if (((c == '-') || (c == ':')) && !std::isdigit(inVal[i]))
throw std::runtime_error("Invalid character sequence in psalm spec "
+ inVal);
}
return inVal;
}
This guarantees that the strings interned will conform at least in the places where unexpected exceptions might be generated once the object is created. (The first character must be a digit, and a digit must follow a dash or a colon.) The function simply returns its argument; on failure the exception thrown will guarantee that the object itself is never fully initialized.
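To see what these rules accept and reject, here is a hypothetical restatement of them as a free function returning bool (isWellFormedSpec is my name for it; it is not part of the class). It casts to unsigned char before calling std::isdigit, which the standard library requires for character values that might be negative.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical restatement of the Validate rules:
//   - the first character must be a digit;
//   - only digits, '-' and ':' may follow;
//   - every '-' or ':' must be followed by a digit.
inline bool isWellFormedSpec(std::string_view spec)
{
  auto isDigit = [](char c) {
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
  };
  if (spec.empty() || !isDigit(spec.front()))
    return false;
  for (std::size_t i = 1; i < spec.size(); ++i)
  {
    const char c = spec[i];
    const bool isSeparator = (c == '-') || (c == ':');
    if (!isDigit(c) && !isSeparator)
      return false;
    // A separator may not end the string or precede another separator.
    if (isSeparator && (i + 1 == spec.size() || !isDigit(spec[i + 1])))
      return false;
  }
  return true;
}

// Usage: the three documented formats pass, malformed ones do not.
inline void checkWellFormed()
{
  assert(isWellFormedSpec("23"));
  assert(isWellFormedSpec("119:1"));
  assert(isWellFormedSpec("31:1-6"));
  assert(!isWellFormedSpec(""));
  assert(!isWellFormedSpec("31:"));
  assert(!isWellFormedSpec("31:-6"));
}
```

Note that these rules are deliberately loose: a string such as "1:2:3" passes, since the aim is only to rule out inputs that would make the later integer conversions throw.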
Refactoring PsalmSpec
Finally, let's look at the formatting functionality, which will lead to some significant refactoring. This was a first cut, logically correct:
void PsalmSpec::formattedPrint(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum) const
{
std::ranges::for_each(m_psalms, [&](const auto &p) {
if (std::size_t index = p.find(':'); index != std::string::npos)
{
if (p == "31:1-6"s) // Special case for compline
{
inFormatter.formatHeading("Part of Psalm 31"s);
const Psalm &ps = inPsalter.getPsalm(31);
std::ranges::for_each(std::ranges::iota_view{ 1, 7 }, // half-open: verses 1..6, assuming 1-based verse numbers
[&](const int inIndex) {
ps.formatVerse(inFormatter, inIndex);
});
}
else
{
inFormatter.formatHeading("Psalm "s + p);
inPsalter.getPsalm(std::stoi(p))
.formatSection(inFormatter, std::stoi(p.substr(index + 1)));
}
}
else
{
inFormatter.formatHeading("Psalm "s + p);
inPsalter.getPsalm(std::stoi(p)).formatSection(inFormatter, 1);
}
if (!inIsTriduum && (p != "63"s) && (p != "114"s) && (p != "148"s)
&& (p != "149"s))
inFormatter.formatGloriaPatri();
});
}
This is where the somewhat quirky nature of the domain starts to show itself. Programming around the Breviary is full of exceptions.
I'll defer talking about the two interfaces for another post, as they would be a distraction from the main point; let it suffice for now to say that the IEncapsulatedOfficeFormatter abstract class supports generic formatted output based on logical categorization of data and the IPsalter class represents storage of Psalm objects, and that (as can be seen) the Psalm objects know how to apply the formatter to their contents.
On the language features front, this uses the range version of for_each and iota_view (both C++20), the declaration of a scoped variable inside the if test (C++17), string literals with the s suffix (C++14), and lambdas and stoi (C++11). A similar, though less tight, implementation could be done in C++03, but it would require an explicit functor class to apply to the values, and the iota_view logic would need something like a Boost counting iterator. All iteration would be explicit rather than implicit.
It's ugly, though, and potentially inefficient -- as with the validation, this has logic which could be determined earlier on, at construction time, and separating out the various branches would also make for cleaner code.
We'll start by interning objects rather than strings, and moving the formatting logic to the objects. This gives us:
class PsalmSpec : public IPsalmSpec
{
class PsalmRep
{
public:
PsalmRep(const std::string &inVal) : m_rep(inVal) {}
void format(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter, const bool inIsTriduum) const;
private:
std::string m_rep;
};
public:
PsalmSpec() {}
explicit PsalmSpec(std::span<std::string> inPsalms);
PsalmSpec(const std::string &inFirst, const std::string &inSecond)
: m_psalms{ Validate(inFirst), Validate(inSecond) }
{ }
explicit PsalmSpec(const std::string &inPsalm)
: m_psalms{ Validate(inPsalm) }
{ }
~PsalmSpec() override;
void formattedPrint(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum = false) const override;
static auto Validate(const std::string &inVal) -> PsalmRep;
private:
std::vector<PsalmRep> m_psalms;
};
The logic for PsalmSpec::PsalmRep::format is still exactly as it was in PsalmSpec::formattedPrint, and will not be shown here. Validate is now declared with the newer trailing return type syntax and returns a PsalmRep by value (benefiting from RVO); that temporary is then used to initialize the vector element. Its deployment in the constructors is unchanged.
formattedPrint() itself is simplified:
void
PsalmSpec::formattedPrint(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum) const
{
std::ranges::for_each(m_psalms, [&](const auto &p) {
p.format(inFormatter, inPsalter, inIsTriduum);
});
}
Now that we are not just passing around string references, the general constructor does benefit from the transform view version, if we change std::copy to std::move:
std::ranges::move(inPsalms | std::views::transform(Validate),
std::back_inserter(m_psalms));
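as we might avoid one copy-construction operation (although, because the view yields temporaries, std::ranges::copy would already move each one into the vector, so the gain is likely nil). Eliminating the lambda also makes the call shorter and clearer.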
We now want to refactor the formatting logic so that on a given iteration the tests are simpler and faster; the main constraint being that we also need to support the copy-construction.
We introduce a strategy class, IFormatter, and we provide three concrete implementations. Besides the format operation, the class provides a clone() so that a PsalmRep holding one can still be copy-constructed:
class PsalmRep
{
class IFormatter
{
public:
virtual ~IFormatter();
virtual std::unique_ptr<IFormatter> clone() const = 0;
virtual void format(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum) const = 0;
};
class StandardFormatter : public IFormatter
{
public:
StandardFormatter(const std::string &inVal);
~StandardFormatter() override;
std::unique_ptr<IFormatter>
clone() const override
{
return std::make_unique<StandardFormatter>(*this);
}
void format(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum) const override;
private:
std::string m_rep;
int m_index;
bool m_suppressGloriaPatri;
};
class SectionFormatter : public IFormatter
{
public:
SectionFormatter(const std::string &inVal);
~SectionFormatter() override;
std::unique_ptr<IFormatter>
clone() const override
{
return std::make_unique<SectionFormatter>(*this);
}
void
format(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter, const bool inIsTriduum) const override;
private:
std::string m_rep;
int m_index;
int m_section;
};
class ComplineFormatter : public IFormatter
{
public:
~ComplineFormatter() override;
std::unique_ptr<IFormatter>
clone() const override
{
return std::make_unique<ComplineFormatter>();
}
void
format(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter, const bool inIsTriduum) const override;
};
...
private:
std::unique_ptr<IFormatter> m_rep;
};
These have specialized, and simpler, formatting logic.
The standard formatter (no section specification) shifts the determination of special case Gloria Patri suppression to its constructor:
PsalmSpec::PsalmRep::StandardFormatter::StandardFormatter(
const std::string &inVal)
: m_rep(inVal), m_index(std::stoi(inVal)),
m_suppressGloriaPatri((inVal == "63"s) || (inVal == "114"s)
|| (inVal == "148"s) || (inVal == "149"s))
{ }
Note that this is the only one of the three versions which now needs this special logic. It also does the conversion to an integral index on construction, moving that work out of the formatting.
void PsalmSpec::PsalmRep::StandardFormatter::format(
const IEncapsulatedOfficeFormatter &inFormatter, const IPsalter &inPsalter,
const bool inIsTriduum) const
{
inFormatter.formatHeading("Psalm "s + m_rep);
inPsalter.getPsalm(m_index).formatSection(inFormatter, 1);
if (!inIsTriduum && !m_suppressGloriaPatri)
inFormatter.formatGloriaPatri();
}
The section formatter validates its inputs on construction, and, likewise, does all integral conversions on construction:
PsalmSpec::PsalmRep::SectionFormatter::SectionFormatter(
const std::string &inVal)
: m_rep(inVal), m_index(std::stoi(inVal))
{
if (std::size_t index = m_rep.find(':'); index != std::string::npos)
{
m_section = std::stoi(m_rep.substr(index + 1));
}
else
throw std::runtime_error("Section formatter passed invalid parameter: "
+ inVal);
}
The logic for formatting is now closer to the standard form:
void PsalmSpec::PsalmRep::SectionFormatter::format(
const IEncapsulatedOfficeFormatter &inFormatter, const IPsalter &inPsalter,
const bool inIsTriduum) const
{
inFormatter.formatHeading("Psalm "s + m_rep);
inPsalter.getPsalm(m_index).formatSection(inFormatter, m_section);
if (!inIsTriduum)
inFormatter.formatGloriaPatri();
}
The Compline formatter doesn't even need data:
void PsalmSpec::PsalmRep::ComplineFormatter::format(
const IEncapsulatedOfficeFormatter &inFormatter, const IPsalter &inPsalter,
const bool inIsTriduum) const
{
inFormatter.formatHeading("Part of Psalm 31"s);
const Psalm &ps = inPsalter.getPsalm(31);
std::ranges::for_each(
std::ranges::iota_view{ 1, 7 }, // half-open: verses 1..6, assuming 1-based verse numbers
[&](const int inIndex) { ps.formatVerse(inFormatter, inIndex); });
if (!inIsTriduum)
inFormatter.formatGloriaPatri();
}
Construction of the object now does the work of distinguishing the types:
PsalmSpec::PsalmRep::PsalmRep(const std::string &inVal)
{
if (std::size_t index = inVal.find(':'); index != std::string::npos)
{
if (inVal == "31:1-6"s)
m_rep = std::make_unique<ComplineFormatter>();
else
m_rep = std::make_unique<SectionFormatter>(inVal);
}
else
m_rep = std::make_unique<StandardFormatter>(inVal);
}
We do need to provide explicit copy/move constructors and assignment operators now, to handle that unique_ptr:
PsalmRep(const PsalmRep& inRep): m_rep(inRep.m_rep->clone()) { }
// noexcept lets std::vector move (rather than clone) elements when it reallocates
PsalmRep(PsalmRep&& inRep) noexcept: m_rep(std::move(inRep.m_rep)) { }
PsalmRep& operator=(const PsalmRep& inRep)
{
if (this == &inRep)
return *this;
m_rep = inRep.m_rep->clone(); // clone() already yields a temporary; no std::move needed
return *this;
}
PsalmRep& operator=(PsalmRep&& inRep) noexcept
{
if (this == &inRep)
return *this;
m_rep = std::move(inRep.m_rep);
return *this;
}
~PsalmRep() { }
This has now segregated the logic so that each function is comparatively easy to understand.
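The clone-based copying can be tried out in miniature. The following is a sketch only: IStrategy, WholePsalm and Holder are hypothetical names standing in for IFormatter, one concrete formatter and PsalmRep. Unlike the listing above, the copy operations here also guard against a moved-from (empty) source, which is my own addition.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class IStrategy
{
public:
  virtual ~IStrategy() = default;
  virtual std::unique_ptr<IStrategy> clone() const = 0;
  virtual std::string describe() const = 0;
};

class WholePsalm : public IStrategy
{
public:
  explicit WholePsalm(int inIndex) : m_index(inIndex) {}
  std::unique_ptr<IStrategy> clone() const override
  {
    return std::make_unique<WholePsalm>(*this);
  }
  std::string describe() const override
  {
    return "Psalm " + std::to_string(m_index);
  }
private:
  int m_index;
};

// Value-like wrapper: copying it deep-copies the strategy via clone().
class Holder
{
public:
  explicit Holder(std::unique_ptr<IStrategy> inImpl) : m_impl(std::move(inImpl)) {}
  Holder(const Holder &other)
    : m_impl(other.m_impl ? other.m_impl->clone() : nullptr) {}
  Holder(Holder &&) noexcept = default;
  Holder &operator=(const Holder &other)
  {
    if (this != &other)
      m_impl = other.m_impl ? other.m_impl->clone() : nullptr;
    return *this;
  }
  Holder &operator=(Holder &&) noexcept = default;
  ~Holder() = default;

  std::string describe() const { return m_impl ? m_impl->describe() : std::string{}; }
  const IStrategy *get() const { return m_impl.get(); }
private:
  std::unique_ptr<IStrategy> m_impl;
};

// Usage: a copy owns a distinct strategy object with the same behaviour.
inline void checkHolderCopy()
{
  Holder original(std::make_unique<WholePsalm>(23));
  Holder copy(original);
  assert(copy.describe() == "Psalm 23");
  assert(copy.get() != original.get());
}
```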
It's worth noting that there are extensive unit tests supporting this, and that the PsalmSpec class, as a fundamental object, gets a heavy workout not only in its own unit test but in tests of all the other classes which make use of it. These tests were used to validate the refactoring.
Other Variants
We aren't done with looking at handling variation yet. PsalmSpec inherits from
class IPsalmSpec
{
public:
virtual ~IPsalmSpec();
virtual void formattedPrint(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum = false) const = 0;
};
for a reason.
There's a special set of psalm specifications for the minor hours (Prime, Terce, Sext, and None) which make use of paired parts of Psalm 119, with both parts said with one Gloria Patri. These are all the psalms for Terce, Sext and None, and some of the psalms for Prime. (Note that the psalms used reflect those of the much older monastic Roman Breviary and the Sarum Breviary, and not the more recent distribution of the psalms which you will find in the Breviary of Pius XII.)
Those specs can be much more heavily simplified.
We therefore have
class Psalm119PsalmSpec : public IPsalmSpec
{
public:
Psalm119PsalmSpec(const Psalm *inPsalm, const std::string &inSection1,
const std::string &inSection2);
~Psalm119PsalmSpec() override;
void formattedPrint(const IEncapsulatedOfficeFormatter &inFormatter,
const IPsalter &inPsalter,
const bool inIsTriduum = false) const override;
void formatWithoutPsalter(const IEncapsulatedOfficeFormatter &inFormatter,
const bool inIsTriduum = false) const;
private:
const Psalm *m_psalm;
std::pair<std::pair<std::string, int>, std::pair<std::string, int>>
m_sections;
};
The constructor not only pre-generates the integral values but uses a pre-lookup of the psalm (its lifetime is guaranteed throughout the life of the program: the Psalter object is created very early on as a common resource and is not altered until the end of the program).
Psalm119PsalmSpec::Psalm119PsalmSpec(const Psalm *inPsalm,
const std::string &inSection1,
const std::string &inSection2)
: m_psalm(inPsalm), m_sections{
std::make_pair(std::make_pair(inSection1, std::stoi(inSection1)),
std::make_pair(inSection2, std::stoi(inSection2)))
}
{ }
Note that this follows current guidelines in using a bare pointer for non-owning use. Using a pointer rather than a reference makes the object assignable. It would also have been possible to set an instance of Psalm 119 as a class static variable and initialize it when the Psalter class is being initialized.
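The assignability point can be checked at compile time. In this sketch Psalm119Stub, HoldsReference and HoldsPointer are hypothetical types of mine, not the library's:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the long-lived Psalm object.
struct Psalm119Stub
{
  int id;
};

struct HoldsReference
{
  const Psalm119Stub &psalm;
};

struct HoldsPointer
{
  const Psalm119Stub *psalm;
};

// A reference member deletes the implicit copy assignment operator;
// a pointer member leaves it intact. Copy construction works either way.
static_assert(!std::is_copy_assignable_v<HoldsReference>);
static_assert(std::is_copy_assignable_v<HoldsPointer>);
static_assert(std::is_copy_constructible_v<HoldsReference>);

// Usage: the pointer-holding object can be re-targeted by assignment.
inline void checkAssignable()
{
  const Psalm119Stub first{ 1 };
  const Psalm119Stub second{ 2 };
  HoldsPointer a{ &first };
  const HoldsPointer b{ &second };
  a = b;
  assert(a.psalm == &second);
}
```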
Formatting is also short and simple. The non-virtual function formatWithoutPsalter provides an implementation which is explicit about the lack of any later dependency on a psalter, and the interface call (needed in Prime) merely calls the simpler implementation:
void Psalm119PsalmSpec::formattedPrint(
const IEncapsulatedOfficeFormatter &inFormatter, const IPsalter &inPsalter,
const bool inIsTriduum) const
{
formatWithoutPsalter(inFormatter, inIsTriduum);
}
void
Psalm119PsalmSpec::formatWithoutPsalter(
const IEncapsulatedOfficeFormatter &inFormatter,
const bool inIsTriduum) const
{
auto printer = [&](const std::pair<std::string, int> &inVals) {
inFormatter.formatHeading("Psalm 119:" + inVals.first);
m_psalm->formatSection(inFormatter, inVals.second);
};
printer(m_sections.first);
printer(m_sections.second);
if (!inIsTriduum)
inFormatter.formatGloriaPatri();
}
As far as language features go, a stand-alone lambda assigned to an auto variable has been available since C++11. It doesn't buy us a lot of simplicity -- each of the lines where it is used can be unrolled to two lines -- but it does allow applying the DRY principle.
Another post will deal with how all this is deployed in specific offices.