Parsing Chapters

Every office has a short chapter, which is a verse or two from some biblical book.  The source of the text is conventionally provided as well, down to the chapter level, but not the verse level.

In its normal form, a chapter element is a simple representation of this:

      <chapter src="Isa. ii">it shall come to pass in the last days, that the mountain of the Lord's house shall be established in the top of the mountains, and shall be exalted above the hills; and all nations shall flow unto it.</chapter>

However, there are a couple of exceptions.

In some cases, Terce and Second Vespers use the chapter of Lauds.  When this occurs, in order to avoid repetition, we use:

      <chapter ref="Lauds"/>

In addition, some chapters are used throughout a season; this happens only in Lent and Passiontide.  For these we have:

      <chapter ref="LENT"/>

or

      <chapter ref="PASSIONTIDE"/>

These are expanded to the explicit values during parsing.

Chapter Tag

The Chapter Tag has to manage two kinds of reference in the ref attribute, as well as handle the src attribute, which is used only when ref is absent.

class ChapterTag : public BreviaryTag
{
public:
  enum class SeasonalChapterValue {
    None,
    Lent,
    Passiontide
  };

  ChapterTag(): BreviaryTag(GetName()) {}
  ~ChapterTag() override;

  bool isLaudsReference() const { return m_laudsReference; }
  SeasonalChapterValue getSeasonalChapter() const { return m_seasonal; }
  const std::string &getSrc() const { return m_src; }

  static const std::string GetName() { return "chapter"; }

private:
  int allowedAttributeCount() const override { return 2; }
  std::span<std::string> getAllowedAttributes() const override;
  bool validate(std::string_view inAttribute,
                std::string_view inValue) const override;
  void setValue(std::string_view inAttribute,
                std::string_view inValue) override;

  bool checkMandatoryAttributes() const override
  {
    if (hasAttribute("ref"))
      {
        if (hasAttribute("src"))
          return false;
        return isClosed();
      }
    else
      return !isClosed() && hasAttribute("src");
  }

  SeasonalChapterValue m_seasonal = SeasonalChapterValue::None;
  bool m_laudsReference = false;
  std::string m_src;
};

The implementations for handling the attributes are straightforward:

std::span<std::string> ChapterTag::getAllowedAttributes() const
{
  static std::array<std::string, 2> rval{ "ref", "src" };
  return rval;
}

bool ChapterTag::validate(std::string_view inAttribute,
                          std::string_view inValue) const
{
  if (inAttribute == "ref")
    return ((inValue == "Lauds") || (inValue == "LENT")
            || (inValue == "PASSIONTIDE"));
  return !inValue.empty();
}

void ChapterTag::setValue(std::string_view inAttribute,
                          std::string_view inValue)
{
  if (inAttribute == "ref")
    {
      if (inValue == "Lauds")
        m_laudsReference = true;
      else if (inValue == "LENT")
        m_seasonal = SeasonalChapterValue::Lent;
      else
        m_seasonal = SeasonalChapterValue::Passiontide;
    }
  else
    m_src = inValue;
}

Note that it would be possible in principle to do some minor formal checking on the src attribute -- first letter capitalized, allow only letters, numbers, spaces and periods, contains at least one space, last word has to be a valid roman numeral -- but, especially as this is merely informational, the trouble seemed to be excessive, so we simply accept any content as valid.
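For what it's worth, the check described above might look something like the following sketch.  None of this is in the actual code: isLowercaseRoman() and isPlausibleSrc() are hypothetical names, the roman numeral pattern only covers values below 400, and "first letter capitalized" is read here as "the first alphabetic character is a capital", so that a book name with a leading number would still pass.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string_view>

// Hypothetical helper: true when inWord is a non-empty lowercase roman
// numeral below 400, which is ample for chapter numbers.
bool isLowercaseRoman(std::string_view inWord)
{
  static const std::regex pattern("c{0,3}(xc|xl|l?x{0,3})(ix|iv|v?i{0,3})");
  return !inWord.empty()
         && std::regex_match(inWord.begin(), inWord.end(), pattern);
}

// Hypothetical stricter check for the src attribute.
bool isPlausibleSrc(std::string_view inValue)
{
  // Only letters, digits, spaces and periods are allowed.
  auto isAllowed = [](unsigned char c) {
    return std::isalnum(c) || c == ' ' || c == '.';
  };
  if (inValue.empty()
      || !std::all_of(inValue.begin(), inValue.end(), isAllowed))
    return false;

  // The first alphabetic character must be a capital.
  const auto firstLetter
      = std::find_if(inValue.begin(), inValue.end(),
                     [](unsigned char c) { return std::isalpha(c) != 0; });
  if (firstLetter == inValue.end()
      || !std::isupper(static_cast<unsigned char>(*firstLetter)))
    return false;

  // There must be at least one space, and the last word must be a numeral.
  const auto lastSpace = inValue.rfind(' ');
  if (lastSpace == std::string_view::npos)
    return false;
  return isLowercaseRoman(inValue.substr(lastSpace + 1));
}
```

With this in place, validate() could return isPlausibleSrc(inValue) for the src attribute instead of merely testing for emptiness, at the cost of rejecting any source that doesn't follow the convention.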

Chapter Element

class ChapterElement : public TextElementBase
{
public:
  ChapterElement(std::string_view inText, const OfficeNames inOffice);
  ChapterElement(const ILaudsChapter &inLaudsChapter, std::string_view inText,
                 const OfficeNames inOffice);
  ~ChapterElement() override;

  const std::string &getStartTagName() const override
  {
    return m_tag.getName();
  }

  const std::string &getSrc() const { return m_src; }

private:
  ChapterTag m_tag;
  std::string m_src;

  void setSourceAndText(std::string_view inText);
  void setSeasonalValues(const ChapterTag::SeasonalChapterValue inValue,
                         const OfficeNames inOffice);
};

The ChapterElement class has two constructors: one for contexts where a Lauds chapter is available and may be referenced, and one for everything else.

ChapterElement::ChapterElement(std::string_view inText,
                               const OfficeNames inOffice):
    TextElementBase(inText, ChapterTag::GetName())
{
  auto val = m_tag.set(inText.substr(0, getLength()));
  if (!val.has_value())
    {
      throw OfficeParseException(val.error(), inText.substr(0, getLength()));
    }
  if (m_tag.isClosed())
    {
      if (m_tag.getSeasonalChapter() != ChapterTag::SeasonalChapterValue::None)
        {
          setSeasonalValues(m_tag.getSeasonalChapter(), inOffice);
          return;
        }
      else
        throw OfficeParseException(
            "Unexpected closed chapter element in non-Terce/non-Vespers context", inText);
    }
  setSourceAndText(inText);
}

ChapterElement::ChapterElement(const ILaudsChapter &inLaudsChapter,
                               std::string_view inText,
                               const OfficeNames inOffice):
    TextElementBase(inText, ChapterTag::GetName())
{
  auto val = m_tag.set(inText.substr(0, getLength()));
  if (!val.has_value())
    {
      throw OfficeParseException(val.error(), inText.substr(0, getLength()));
    }
  if (m_tag.isClosed())
    {
      if (m_tag.isLaudsReference())
        {
          m_src = inLaudsChapter.getChapterSrc();
          m_text = inLaudsChapter.getChapterText();
        }
      else
        setSeasonalValues(m_tag.getSeasonalChapter(), inOffice);
    }
  else
    {
      setSourceAndText(inText);
    }
}

setSourceAndText() handles the common logic in the two constructors for the "standard" form:

void ChapterElement::setSourceAndText(std::string_view inText)
{
  m_src = m_tag.getSrc();
  std::string_view body(inText.substr(getLength()));
  setTextFromBody(body);
  body.remove_prefix(incrementLength(m_text.length()));
  incrementLength(checkEndTag(body));
}

and setSeasonalValues() does the conversion from the seasonal enum values to the expected content:

void ChapterElement::setSeasonalValues(
    const ChapterTag::SeasonalChapterValue inValue, const OfficeNames inOffice)
{
  switch (inValue)
    {
      using enum ChapterTag::SeasonalChapterValue;
    case Lent:
      {
        switch (inOffice)
          {
            using enum OfficeNames;
          case LAUDS:
            m_src = "Joel ii";
            m_text
                = "Turn ye even unto Me, saith the Lord, with all your heart, and with fasting, and with weeping, and with mourning. And rend your heart, and not your garments, and turn unto the Lord your God.";
            break;
          case TERCE:
            m_src = "Joel ii";
            m_text
                = "Turn unto the Lord your God; for He is gracious and merciful ; slow to anger, and of great kindness, and repenteth Him of the evil.";
            break;
          case SEXT:
            m_src = "Isa. lv";
            m_text
                = "Let the wicked forsake his way, and the unrighteous man his thoughts : and let him return unto the Lord, and He will have mercy upon him, and to our God, for He will abundantly pardon.";
            break;
          case NONE:
            m_src = "Isa. lviii";
            m_text
                = "Deal thy bread to the hungry, and bring the poor that are cast out to thy house : when thou seest the naked, cover thou him ; and hide not thyself from thine own flesh, saith the Lord Almighty.";
            break;
          case VESPERS:
            m_src = "Ezek. xviii";
            m_text
                = "The soul that sinneth, it shall die. The son shall not bear the iniquity of the father, neither shall the father bear the iniquity of the son, saith the Lord Almighty.";
            break;
          default:
            return;
          }
      }
      break;
    case Passiontide:
      {
        switch (inOffice)
          {
            using enum OfficeNames;
          case LAUDS:
            m_src = "Jer. xi";
            m_text
                = "The Lord hath given me knowledge of it, and I know it : then Thou shewedst me their doings. But I was like a lamb or an ox that is brought to the slaughter.";
            break;
          case TERCE:
            m_src = "Isa. l";
            m_text
                = "I hid not my face from shame and spitting. For the Lord God will help me : therefore shall I not be confounded.";
            break;
          case SEXT:
            m_src = "Isa. l";
            m_text
                = "For the Lord God will help me; therefore shall I not be confounded : therefore have I set my face like a flint, and I know that I shall not be ashamed.";
            break;
          case NONE:
            m_src = "Jer. xvii";
            m_text
                = "Let them be confounded that persecute me, but let not me be confounded ; let them be dismayed, but let not me be dismayed : bring upon them the day of evil, and destroy them with double destruction, O Lord our God.";
            break;
          case VESPERS:
            m_src = "Lament. iii";
            m_text
                = "O Lord, Thou hast pleaded the causes of my soul : Thou hast redeemed my life.";
            break;
          default:
            return;
          }
      }
      break;
    default:
      return;
    }
}

I could, of course, have made all the text in the breviary configurable, which would allow, for example, a change in the translation used.  However, the data I have to work from in electronic form is all AV, and the likelihood of my having the energy, even given the inclination, to convert it at some later date to (for example) the Vulgate is vanishingly small.  So functions like this quite deliberately hardcode their data.
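Hardcoding does not have to mean a nested switch, though.  As a sketch of one alternative, the same data could sit in a lookup table keyed on season and office.  The Season and Office enums (including the Compline enumerator), the SeasonalChapter struct and findSeasonalChapter() are hypothetical stand-ins for the real types, and only two of the ten entries are reproduced.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-ins for ChapterTag::SeasonalChapterValue and OfficeNames.
enum class Season { Lent, Passiontide };
enum class Office { Lauds, Terce, Sext, None, Vespers, Compline };

struct SeasonalChapter
{
  std::string src;
  std::string text;
};

// Returns the hardcoded chapter for a season and office, if the table has one.
std::optional<SeasonalChapter> findSeasonalChapter(Season inSeason,
                                                   Office inOffice)
{
  // Only two entries are shown; the remaining ones would follow the same shape.
  static const std::map<std::pair<Season, Office>, SeasonalChapter> table{
    { { Season::Lent, Office::Vespers },
      { "Ezek. xviii",
        "The soul that sinneth, it shall die. The son shall not bear the iniquity of the father, neither shall the father bear the iniquity of the son, saith the Lord Almighty." } },
    { { Season::Passiontide, Office::Vespers },
      { "Lament. iii",
        "O Lord, Thou hast pleaded the causes of my soul : Thou hast redeemed my life." } },
  };

  const auto it = table.find({ inSeason, inOffice });
  if (it == table.end())
    return std::nullopt;
  return it->second;
}
```

A miss returns std::nullopt, which corresponds to the default: return branches in the switch.  Whether this reads better than the switch is a matter of taste; the data is just as hardcoded either way.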

Minor Refactoring

At this point, on reviewing the code, I asked myself (again; I'd asked it before but not with enough thought): why don't I encapsulate that pattern of checking the std::expected value?

The compiler support is missing for the monadic functions associated with std::expected.  Once it is available, I would like to convert

  auto val = m_tag.set(inText.substr(0, getLength()));
  if (!val.has_value())
    {
      throw OfficeParseException(val.error(), inText.substr(0, getLength()));
    }
  ...

to

m_tag.set(inText.substr(0, getLength())).or_else(...).and_then(...);

So any encapsulation would need to be easily convertible to the newer form on a compiler upgrade.

Passing a function that handles the successful case into another function, which then invokes it, will work:

class TagSetter
{
public:
  template <typename T>
  void
  set(T &inTag,
      std::function<void(const T &, const int)> inProcessor,
      std::string_view inText) const
  {
    auto val = inTag.set(inText);
    if (!val.has_value())
      {
        throwOnError(val.error(), inText);
      }
    else
      {
        inProcessor(inTag, val.value());
      }
  }

private:
  void throwOnError(BreviaryTag::ErrorTypes inType, std::string_view inText) const;
};

What happens when it's put in place?

The first constructor above would now read:

ChapterElement::ChapterElement(std::string_view inText,
                               const OfficeNames inOffice):
    TextElementBase(inText, ChapterTag::GetName())
{
  auto l = [&](const ChapterTag& inVal, const int inAttributes)
  {
    if (inVal.isClosed())
      {
        if (inVal.getSeasonalChapter() != ChapterTag::SeasonalChapterValue::None)
          {
            setSeasonalValues(inVal.getSeasonalChapter(), inOffice);
            return;
          }
        else
          throw OfficeParseException(
             "Unexpected closed chapter element in non-Terce/non-Vespers context",
             inText);
      }
    setSourceAndText(inText);
  };

  TagSetter().set<ChapterTag>(m_tag, l, inText.substr(0, getLength()));
}

We haven't gained very much, but there's still some immediate gain:

1) We've encapsulated the repeated lines regarding testing the error value and throwing.  Admittedly, we've replaced that with a boilerplate call to TagSetter.set() and the need to set up a lambda, but there's a minor cleanup nevertheless.

2) The expressed logic is now all main case handling rather than error handling, which is always a bonus.

3) On a new compiler, we now have to change to monadic functions in only one place.

4) In some cases with multiple constructors, though not here, we could possibly move some repeated logic from the main class to the functor (which would mean setting up a class hierarchy, not using lambdas).  This makes sense only if the commonalities don't manipulate elements in the main class, as otherwise it makes sense to retain them as member functions. 
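Point 4 is easier to judge with something concrete.  The following is only a sketch under stated assumptions: DemoTag, SetResult, ProcessorBase, NoReferenceProcessor, setTag() and describe() are all invented for illustration, SetResult stands in for the std::expected returned by the real set(), and std::runtime_error stands in for OfficeParseException.

```cpp
#include <functional>
#include <stdexcept>
#include <string>
#include <string_view>

// Stand-in for the std::expected returned by the real set(): either an
// attribute count or an error message.
struct SetResult
{
  bool ok = false;
  int attributes = 0;
  std::string error;
};

// Stand-in for a BreviaryTag subclass; it only looks at '<', '=' and "/>".
class DemoTag
{
public:
  SetResult set(std::string_view inText)
  {
    if (inText.empty() || inText.front() != '<')
      return { false, 0, "text does not start with a tag" };
    int count = 0;
    for (const char c : inText)
      if (c == '=')
        ++count;
    m_closed = (inText.size() >= 2)
               && (inText.substr(inText.size() - 2) == "/>");
    return { true, count, {} };
  }
  bool isClosed() const { return m_closed; }

private:
  bool m_closed = false;
};

// The shared dispatch lives in the base functor; each context overrides
// only the hooks that differ.
class ProcessorBase
{
public:
  virtual ~ProcessorBase() = default;
  void operator()(const DemoTag &inTag, const int inAttributes)
  {
    if (inTag.isClosed())
      handleReference(inAttributes);
    else
      handleBody(inAttributes);
  }

private:
  virtual void handleReference(int inAttributes) = 0;
  virtual void handleBody(int inAttributes) = 0;
};

// A context in which a closed (reference) tag is not acceptable.
class NoReferenceProcessor : public ProcessorBase
{
public:
  std::string outcome;

private:
  void handleReference(int) override
  {
    throw std::runtime_error("unexpected closed element");
  }
  void handleBody(int inAttributes) override
  {
    outcome = "body with " + std::to_string(inAttributes) + " attribute(s)";
  }
};

// Same shape as TagSetter::set() above, minus the template.
void setTag(DemoTag &inTag,
            std::function<void(const DemoTag &, const int)> inProcessor,
            std::string_view inText)
{
  const SetResult val = inTag.set(inText);
  if (!val.ok)
    throw std::runtime_error(val.error);
  inProcessor(inTag, val.attributes);
}

// Hypothetical caller: std::ref stops std::function from copying the functor.
std::string describe(std::string_view inText)
{
  DemoTag tag;
  NoReferenceProcessor processor;
  setTag(tag, std::ref(processor), inText);
  return processor.outcome;
}
```

Because std::function copies its target, the functor goes through std::ref so that whatever it records is still visible to the caller afterwards.  That extra step is part of the cost of this approach, and it reinforces the caveat above: it only pays off when the shared logic doesn't need to reach back into the main class.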

