Iterable Messages
I have been messing around with parsing and building delimited-field messages, using the FIX protocol's basic structure (key=value for fields, single-character field separator) as a simple model to work with. This is mainly to see what effect the new C++ features (notably the use of string views) have on performance.
I have three classes with essentially the same logical interface.The first is a map of string views which owns the string on which the views are based. It allows access to the fields by key, and can be iterated over, but it is designed to be otherwise as cheap as possible. No ordering is done on the keys: so good for lookups, not good for generation unless the generating logic imposes an order or unless order does not matter. (In the FIX header order does matter, but in the FIX body it is not, strictly, important, although most implementations use a conventional numeric order.) Its projected use would be as an interim step on receiving a message: parse the string, check a few fields to determine where to dispatch the message, dispatch the message unchanged to another processing context, whether another thread or another application.
The content model is extremely simple:
std::string m_base;
std::unordered_map<std::string_view, std::string_view> m_data;
(Experience showed an unordered_map, unsurprisingly, to outperform a map, and map ordering is useless here, as we don't want either lexicographic or numerical field order.)
Optimization is set to -O3, using g++12.
I put together a small test looking at construction costs, costs associated with simple accesses, and costs associated with iteration over the contents. All the usual caveats apply to quick-and-dirty tests with no careful warmup and probably unrealistic cache conditions, but all I'm looking for is some general indicators of performance.
Construction: 936 microseconds to do 5000 instances, plus an access
Access: 108 (ditto)
Iteration: 680 (ditto)
The next step up is a map which allows for the same sort of key lookup but iterates over the values in the order they appear in the original message. This is more usable in most contexts for data generation. (Copying the generated fields will copy them in the right order.) This uses boost MultiIndex.
struct Field
{
std::string_view key;
std::string_view value;
};
typedef boost::multi_index::multi_index_container<
Field, boost::multi_index::indexed_by<
boost::multi_index::sequenced<>,
boost::multi_index::hashed_unique<boost::multi_index::member<
Field, std::string_view, &Field::key> > > >
storage_type;
std::string m_base;
storage_type m_data;
This gives:
Ordered strings construction: 884
Ordered strings access: 61 (uses hashing)
Ordered strings iteration: 1049 (uses sequential)
This is actually better all along the line than the simple version. (The version without optimization on has construction as more expensive, which is what you would intuitively expect.) We can, therefore, essentially forget about actually using the first version. Note that iteration is essentially useless in the first variant and should not be compared.
There is one downside of this model: the class is more expensive to copy than you might expect. If you do a simple copy, it will "work" but the indexed storage will continue to reference the old underlying string rather than the new one. You essentially have to rebuild the index on copying to avoid lifetime issues. It's best to make it noncopyable; if you need to copy a message around, it's better to use the older-style map below and avoid successive reparsings of the string.
Finally, we have an old-style baseline (which is also useful for modifiable records) which uses std::string rather than std::string_view. It uses the boost MultiIndex model, following the conclusion above.
struct Field
{
std::string key;
std::string value;
};
typedef boost::multi_index::multi_index_container<
Field, boost::multi_index::indexed_by<
boost::multi_index::sequenced<>,
boost::multi_index::hashed_unique<boost::multi_index::member<
Field, std::string, &Field::key> > > >
storage_type;
This does not need to take a copy of the full message.
The constructor does make some use of std::string_view in parsing the initial string, even though it does not store the extracts as string_views. This adjustment saves about one-fifth of the time which it would take if only std::strings were used as intermediaries.
Modifiable ordered strings: 2159
Modifiable ordered strings access: 86
Modifiable ordered strings iteration: 675
Different runs generate different numbers; the numbers above are representative. But they are representative: the relative rankings in terms of speed remain consistent across runs. (Without optimization on, the two classes depending on the Boost MultiIndex perform relatively worse; presumably the underlying implementation is more affected by the compilation optimizations than is the implementation of the unordered_map.)
The one clear conclusion is that yes, indeed, the cost of using std::string is visible at the construction time, even though that version avoids copying a long string to use as the base of the map. The second more tentative conclusion is that the differing models used in the implementation of the multi-index containers optimize better than the standard STL ones (I note this because I have seen some discussion of it elsewhere on the web) and that there is little point for using the "simple" form even in contexts where ordering is not necessary.
Finally, we have an old-style baseline (which is also useful for modifiable records) which uses std::string rather than std::string_view. It uses the boost MultiIndex model, following the conclusion above.
struct Field
{
std::string key;
std::string value;
};
typedef boost::multi_index::multi_index_container<
Field, boost::multi_index::indexed_by<
boost::multi_index::sequenced<>,
boost::multi_index::hashed_unique<boost::multi_index::member<
Field, std::string, &Field::key> > > >
storage_type;
This does not need to take a copy of the full message.
The constructor does make some use of std::string_view in parsing the initial string, even though it does not store the extracts as string_views. This adjustment saves about one-fifth of the time which it would take if only std::strings were used as intermediaries.
Modifiable ordered strings: 2159
Modifiable ordered strings access: 86
Modifiable ordered strings iteration: 675
Different runs generate different numbers; the numbers above are representative. But they are representative: the relative rankings in terms of speed remain consistent across runs. (Without optimization on, the two classes depending on the Boost MultiIndex perform relatively worse; presumably the underlying implementation is more affected by the compilation optimizations than is the implementation of the unordered_map.)
The one clear conclusion is that yes, indeed, the cost of using std::string is visible at the construction time, even though that version avoids copying a long string to use as the base of the map. The second more tentative conclusion is that the differing models used in the implementation of the multi-index containers optimize better than the standard STL ones (I note this because I have seen some discussion of it elsewhere on the web) and that there is little point for using the "simple" form even in contexts where ordering is not necessary.
Comments
Post a Comment