LT Project: Author

The LtLibrary Author class is designed to do two things: provide access to the author's last name, and provide a lexicographic sort key based on the author's full name.  The expected format of the author name is "Smith, John", or a single name such as "Aristotle".  For the sort key, punctuation and spaces are removed and the string is converted to lower case.

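#include <string>
#include <string_view>

namespace LtLibrary
{
class Author {
public:
  // Stores inName and derives the sort key from it.
  void set(const std::string &inName);
  // Stores inName and derives the sort key from inCompName.
  void set(const std::string &inName, const std::string &inCompName);

  auto empty() const { return m_name.empty(); }
  // Three-way comparison on the normalized sort key.
  auto compare(const Author &inOther) const
  {
    return m_compName.compare(inOther.m_compName);
  }
  // Substring test against the name as supplied (std::string::contains is C++23).
  auto containsName(std::string_view inName) const { return m_name.contains(inName); }
  std::string_view getLastName() const;

private:
  std::string m_name;      // name as supplied
  std::string m_compName;  // lower case, punctuation and spaces removed
};
}

inline bool operator==(const LtLibrary::Author &inFirst, const LtLibrary::Author &inSecond)
{
  return inFirst.compare(inSecond) == 0;
}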

Note two things about this in particular:

1) It is possible for two different author names to compare equal, although it would be rare: "Smith, Ed" would compare as equivalent to "Smith, E.D.", since both reduce to the same key.  This is an edge case which is probably wrong by true bibliographic conventions but is acceptable here. (The aim is to ensure that e.g. "Smith, E.D.", "Smith, E. D.", "Smith, E D" all compare as equivalent; all might appear in LibraryThing records.)

2) The only constructor is, deliberately, the implicit default constructor, which leaves the object in a consistent but incomplete (empty) state.  This reflects the fact that the object will be declared before the record which sets it has been parsed.
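The collision described in point 1 can be checked with a small standalone sketch. makeSortKey is a hypothetical helper, not part of the class; it is assumed to apply the same normalization as set(), but uses a plain loop and std::tolower so that it builds without Boost.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the Author class: builds the same kind of
// comparison key that set() produces (lower case, no punctuation, no spaces).
// A plain loop with std::tolower stands in for Boost and std::erase_if.
std::string makeSortKey(std::string_view inName)
{
  std::string key;
  key.reserve(inName.size());
  for (const unsigned char c : inName) {
    if (std::ispunct(c) || std::isspace(c))
      continue;  // drop punctuation and whitespace
    key.push_back(static_cast<char>(std::tolower(c)));
  }
  return key;
}
```

With this helper, "Smith, E.D.", "Smith, E. D.", "Smith, E D" and "Smith, Ed" all reduce to the key "smithed", which is why they compare equal.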

The single-argument set() forwards to the two-argument version, which uses Boost to get a lower-case copy and then C++20 erase_if to get rid of unwanted characters.

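#include <cctype>
#include <boost/algorithm/string/case_conv.hpp>

void Author::set(const std::string &inName) { set(inName, inName); }

void Author::set(const std::string &inName, const std::string &inCompName)
{
  m_name = inName;
  m_compName = boost::algorithm::to_lower_copy(inCompName);
  // Take the character as unsigned char: passing a negative char value to
  // std::ispunct or std::isspace is undefined behavior.
  std::erase_if(m_compName, [](const unsigned char x) -> bool { return std::ispunct(x) || std::isspace(x); });
}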
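For comparison, here is a sketch of the same character stripping written with the pre-C++20 erase-remove idiom. stripPunctAndSpace is a hypothetical helper introduced only for illustration; it is not taken from the project.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: strips punctuation and whitespace in place using the
// erase-remove idiom, the longer form that std::erase_if replaces.
void stripPunctAndSpace(std::string &ioText)
{
  ioText.erase(std::remove_if(ioText.begin(), ioText.end(),
                              [](unsigned char c) {
                                return std::ispunct(c) || std::isspace(c);
                              }),
               ioText.end());
}
```

The logic is identical, but the caller has to name the range twice and remember the trailing end() argument to erase, which is easy to get wrong.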

Returning a string_view for the last name keeps getLastName() simple: the view can be trimmed cheaply with remove_suffix, and nothing is copied.

std::string_view Author::getLastName() const
{
  std::string_view rval(m_name);
  if (auto sz = m_name.find(','); sz != std::string::npos)
    rval.remove_suffix(m_name.length() - sz);
  return rval;
}
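One consequence worth keeping in mind is that the returned view points into m_name, so it is only valid while the Author exists and has not been set() again. The sketch below uses two hypothetical free functions (lastNameOf and ownedLastName, neither of which is in the project) to show the same trimming on an arbitrary string and an explicit copy for callers that need to keep the result.

```cpp
#include <string>
#include <string_view>

// Hypothetical free-function equivalent of getLastName(), for illustration.
// The result refers to the caller's buffer; no characters are copied.
std::string_view lastNameOf(std::string_view inName)
{
  if (const auto pos = inName.find(','); pos != std::string_view::npos)
    inName.remove_suffix(inName.size() - pos);  // drop the comma and the rest
  return inName;
}

// Hypothetical helper: makes an owning copy when the last name must outlive
// the string it was taken from.
std::string ownedLastName(const std::string &inName)
{
  return std::string(lastNameOf(inName));
}
```

Calling lastNameOf on "Smith, John" gives a view of "Smith", and a name with no comma, such as "Aristotle", comes back unchanged.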

Overall, this is a fairly short concrete class, but it shows the net effect of having the newer library facilities available to simplify the code and to express intent. set() would be significantly longer and messier without erase_if. A version of getLastName() which returned a std::string would be no wordier, but it would be more expensive:

std::string Author::getLastName() const
{
  std::size_t sz = m_name.find(',');
  if (sz != std::string::npos)
    return m_name.substr(0, sz);
  return m_name;
}

The operation on the string_view only adjusts a length, whereas this version has to generate a new string on every call. Using C++23 contains is clearer and slightly neater than "m_name.find(inName) != std::string::npos"; since it is specified in terms of find, it should cost the same.
