Clarifying functionality

Many conversions of loops to STL forms are near to net-zero in terms of complexity changes:

From

for (int i = 0: i != r.size(); ++i)

    vec.emplace_back(r[i]);

to

for (auto iter = r.begin(): iter != r.end(); ++iter)

    vec.emplace_back(*iter);

to

std::for_each (r.begin(), r.end(), [&vec](const auto& inVal) {

    vec.emplace_back(inVal);

});

to

std::copy(r.begin(), r.end(), std::back_inserter(vec));

to

std::ranges::copy(r, std::back_inserter(vec));

gives us five versions of the same operation in the same ballpark as far as figuring out what is going on, with only the last really giving us a noticeably shorter form because of the simpler form of the range algorithms.  (The last version is also almost certainly more efficient than the first version.) The biggest difference is that one can basically read the last one from left to right as "copy the contents of r onto the back of a vector vec".

Some changes, though, can do a great deal to clarify functionality via separation of responsibilities.

I dealt with a recent example in the context of a bit of low-level parsing.

Here is the old loop and its context:

  const static std::string target("<" + HymnTag::GetTagName());

  std::string_view rest = inText;

  for (auto index = rest.find(target); index != std::string_view::npos;

       index = rest.find(target))

    {

      rest.remove_prefix(index);

      if (auto index2 = rest.find('>'); index2 == std::string_view::npos)

        throw OfficeParseException(BreviaryTag::ErrorTypes::BadlyFormedElement,

                                   "Missing close paren for "

                                       + std::string(rest.substr(0, 20)));

      else

        {

          if (extractHymn(inUse, rest, index2))

            return;

          rest.remove_prefix(index2 + 1);

        }

    }

It's not that easy to figure out exactly what's happening, because the control is distributed between the use of find() to reset an index (which is then used once inside the loop), and the resetting of the rest view for the next passage through the for control block, which takes place within the loop.  In addition, there's logic to cut short the iteration by returning from the function inside the loop.

Reworking this into an STL-compatible model separates responsibility much more cleanly.

A range model controls the incrementing of the offsets in the text to be processed:

class HymnSourceRange

{

public:

  class Iterator

  {

  public:

    using difference_type = std::ptrdiff_t;

    using value_type = std::string_view;

    Iterator(std::string_view inView): m_view(inView) { resetView(); }

    bool operator==(const bool inVal) const { return m_status == inVal; }

    std::string_view operator*() const { return m_view; }

    Iterator &operator++()

    {

      m_view.remove_prefix(target.length());

      resetView();

      return *this;

    }

    void operator++(int) // post-incrementable

    {

      ++*this;

    }


  private:

    void resetView()

    {

      if (auto index = m_view.find(target); index == std::string_view::npos)

        {

          m_status = false;

          m_view = ""sv;

        }

      else if (index != 0)

        {

          m_view.remove_prefix(index);

        }

    }


    std::string_view m_view;

    bool m_status = true;

    static inline std::string target{ "<"

                                      + BreviaryLib::HymnTag::GetTagName() };

  };

  HymnSourceRange(std::string_view inRange): m_view(inRange) {}

  Iterator begin() { return Iterator(m_view); }

  bool end() { return false; }

private:

  std::string_view m_view;

};

Separating this logic out allows the loop to be rewritten in a range for_each model.

We then separate the concerns in the logic by (1) putting the early termination condition into a take_while view and (2) putting the check for an irregular attribute into a filter which throws on failure:

  bool found = false;

  std::ranges::for_each(

      HymnSourceRange(inText)

          | std::views::take_while(

              [&found](std::string_view inView) { return !found; })

          | std::views::filter([](std::string_view inView) {

              if (inView.find('>') == std::string_view::npos)

                throw OfficeParseException(

                    BreviaryTag::ErrorTypes::BadlyFormedElement,

                    "Missing close paren for " + std::string(inView));

              return true;

            }),

      [&](std::string_view inView) { found = extractHymn(inUse, inView); });

The body of the loop is now just one line which does the actual substantive logic and resets the found control flag.

Trying to read the loop as the copy example was readable is not quite as easy, but it's still doable": "until you run out of data or until it succeeds try to extract a hymn from the data, checking for a formally invalid tag end in the process"

The icing on the cake is that similar (though less complicated) logic is used in parsing another element type, and the range class can be generalized as a simple template to support both use cases.

Comments

Popular posts from this blog

Boundaries

Overview

Considerations on an Optimization