Small utilities, other languages

Sometimes, all you want is a small, stand-alone utility.

I strongly recommend J. Guy Davidson and Kate Gregory's Beautiful C++: 30 Core Guidelines for Writing Clean, Safe, and Fast Code, which expands on some of the more important C++ Core Guidelines (available at https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). It got me thinking: the guidelines are small, useful blocks of text. The website is designed for browsing, but the individual blocks also lend themselves to cookie-style access, one at a time. Why not write a small utility that takes a downloaded copy and extracts guidelines either by their reference number (e.g. F.1 for "the first guideline in Functions") or, in a cookie-style manner, at random?

C++ is overkill for something this small, which amounts to a few bits of text manipulation, so I decided to use Go instead. The whole utility is under 400 lines of code.

So let's start by looking from the top down.  Here's main():

 0 func main() {
       if len(os.Args) <= 1 {
           usage()
           os.Exit(1)
       }
       var factory TesterFactory
       factory.SetValue(os.Args[1])
       var record PrintableRecord
       if os.Args[1] != "-r" && len(os.Args) > 2 {
           for i := 2; i < len(os.Args); i++ {
10             if strings.HasPrefix(os.Args[i], "-s") { // safe even for very short arguments
                   record.Suppress(os.Args[i][2:])
               }
           }
       }

       f, err := os.Open("/home/james/Downloads/CppCoreGuidelines.md")
       if err != nil {
           fmt.Println("Source file did not exist or could not be opened")
           os.Exit(1)
20     }
       scanner := bufio.NewScanner(f)

       tester, headingType := factory.CreateTester()

       if len(headingType) > 0 {
           headings := MakeHeadings(headingType)
           if !headings.IsKnownHeader() {
               fmt.Println("Section format must contain a . or be known heading type")
               os.Exit(1)
30         }
           for scanner.Scan() {
               headings.CheckCandidate(scanner.Text())
           }
           headings.Print()
       } else {
           inRecord := record.FindEntryBeginning(tester, scanner)
           for inRecord {
               if scanner.Scan() {
                   inRecord = record.ProcessNewLine(scanner.Text())
40             } else {
                   break
               }
           }
           record.Print()
       }
       f.Close()
   }

We can unpick each function or struct in order of appearance.

Line 2:

usage() is a one-line standalone function that prints a usage message when no argument is given:

func usage() {
    fmt.Println("Usage: extractGuideline section-abbreviation|-r [-s<paragraphs to suppress>]*\n\nSection abbreviation may be stem for listing headings")
}

Lines 5-6:

TesterFactory interprets the first command-line parameter and decides which kind of test to generate for selecting the section to display.

Here's its definition:

type TesterFactory struct {
    sectionValue string
    randValue    int
}

It has two associated methods:

func (tf *TesterFactory) SetValue(val string) {
    if val == "-r" {
        // 430 headings, counted 0..429; 0 is reserved to mean "not random",
        // so draw from 1..429 only.
        tf.randValue = 1 + rand.Intn(430-1)
    } else {
        tf.sectionValue = val
        tf.randValue = 0
    }
}

(Note that assigning 0 to the random value ensures that only one type of tester can be selected. Because 0 doubles as the "not random" marker, the random draw itself must never be 0. The magic number 430 is the number of actual headings that might be selected from. In a larger application it would be a named constant.)
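As a minimal sketch of what the named-constant version might look like: the names headingCount and pickHeading are hypothetical and do not appear in the utility, and the draw is shifted so that it can never produce the reserved value 0.

```go
package main

import (
	"fmt"
	"math/rand"
)

// headingCount is a hypothetical name for the magic number 430.
const headingCount = 430

// pickHeading returns a value in 1..headingCount-1. It never returns 0,
// so 0 can keep its meaning of "no random selection requested".
func pickHeading() int {
	return 1 + rand.Intn(headingCount-1)
}

func main() {
	fmt.Println(pickHeading())
}
```

If the number of headings in the downloaded file changes, only the constant needs updating.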

func (tf TesterFactory) CreateTester() (BeginningTester, string) {
    if tf.randValue > 0 {
        return &RandomBeginningTester{tf.randValue}, ""
    } else if strings.Index(tf.sectionValue, ".") > 0 {
        return &SpecificBeginningTester{tf.sectionValue}, ""
    } else {
        return &NullBeginningTester{}, tf.sectionValue
    }
}

We use the abstraction BeginningTester as the return type for the factory:

type BeginningTester interface {
    MatchesHeader(s string, count int) (bool, int)
}

This returns a bool for the primary question (does this match the rule) plus an integer representing the index into the string at which one wants to start printing -- this is a side-effect of the logic for lookup and it avoids having to determine it all over again.
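To make the two return values concrete, here is a small sketch of how a caller can use the index to slice the title straight out of the heading line. The function name matchSection and the sample heading line are assumptions for illustration; the real file's heading layout may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// matchSection reports whether line is the heading for rule id (e.g. "F.1")
// and, if so, the index just past the closing anchor tag, which is where
// the printable title starts.
func matchSection(line, id string) (bool, int) {
	ind := strings.Index(line, "</a>"+id+":")
	if ind == -1 {
		return false, -1
	}
	return true, ind + len("</a>")
}

func main() {
	// Assumed heading layout, for illustration only.
	line := `### <a name="Rf-package"></a>F.1: "Package" meaningful operations as carefully named functions`
	if ok, start := matchSection(line, "F.1"); ok {
		fmt.Println(line[start:]) // the title, without the markup in front of it
	}
}
```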

There are two substantive implementations.  The RandomBeginningTester is defined as:

type RandomBeginningTester struct {
    randValue int
}

func (r RandomBeginningTester) MatchesHeader(s string, count int) (bool, int) {
    if count == r.randValue {
        ind := strings.Index(s, "</a>")
        if ind != -1 {
            return true, ind + 4
        }
    }
    return false, -1
}

The SpecificBeginningTester is defined as:

type SpecificBeginningTester struct {
    val string
}

func (r SpecificBeginningTester) MatchesHeader(s string, count int) (bool, int) {
    if matchesPossibleBeginning(s) {
        ind := strings.Index(s, "</a>"+r.val+":")
        if ind != -1 {
            return true, ind + 4
        }
    }
    // Always report -1 on failure, as the other testers do.
    return false, -1
}

There is also a NullBeginningTester. Its MatchesHeader() is never actually called; it exists only so that the factory has something to return when a list of headings is wanted rather than a single section.

type NullBeginningTester struct {
}

func (r NullBeginningTester) MatchesHeader(s string, count int) (bool, int) {
    return false, -1
}

The NullBeginningTester is returned along with a string representing the class of headers looked for (this is determined by checking for the "." which is part of any full specification).  So the calling context (we will get to that) uses the latter value to determine what to do with the factory output.
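The three-way decision can be summarised in a few lines. This is only a sketch: classify is a hypothetical helper that returns a description instead of a tester, and it tests for "-r" directly rather than going through randValue as the factory does.

```go
package main

import (
	"fmt"
	"strings"
)

// classify describes which path the first argument selects:
// a random entry, one specific entry, or a list of headings.
func classify(arg string) string {
	switch {
	case arg == "-r":
		return "random entry"
	case strings.Index(arg, ".") > 0: // a "." after at least one character
		return "specific entry " + arg
	default:
		return "heading list for " + arg
	}
}

func main() {
	for _, arg := range []string{"-r", "F.1", "F"} {
		fmt.Println(arg, "->", classify(arg))
	}
}
```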

Line 7:

The PrintableRecord is declared here even though it might not be used (if a list of headings is required) so that it can be referenced during the parsing of arguments.

type PrintableRecord struct {
    Title
    Sections
}

This is a composite record: it embeds Title and Sections, so their methods can be called directly on a PrintableRecord.

func (pr PrintableRecord) Print() {
    if !pr.Empty() {
        pr.PrintTitle()
        pr.PrintBody()
    }
}

func (pr *PrintableRecord) FindEntryBeginning(tester BeginningTester, scanner *bufio.Scanner) bool {
    inRecord := false
    ind := -1
    count := 0
    flag := scanner.Scan()
    for !inRecord && flag {
        s := scanner.Text()
        inRecord, ind = tester.MatchesHeader(s, count)
        if matchesPossibleBeginning(s) {
            count++
        }
        if inRecord {
            pr.SetTitle(s, ind)
        }
        flag = scanner.Scan()
    }
    // A match only counts if there is still input left after the heading.
    return flag && inRecord
}

Many of the methods are promoted from the embedded (anonymous) structs. There is the Title:

type Title struct {
    Name string
}

func (t Title) PrintTitle() {
    fmt.Println(t.Name)
    fmt.Println(strings.Repeat("=", len(t.Name)) + "\n")
}

func (t Title) Empty() bool {
    return len(t.Name) == 0
}

func (t *Title) SetTitle(s string, ind int) {
    t.Name = s[ind:]
}

and the slightly more complex Sections:

type Sections struct {
    values     []Section
    toSuppress []string
}

func (ss *Sections) addNewSection(s string) {
    var lines []string
    // Drop the leading "##### " and upper-case the name. Names are stored
    // upper-cased, so -s arguments must be upper case to match them.
    value := strings.ToTitle(s[6:])
    lines = append(lines, strings.Repeat("-", len(value)))
    ss.values = append(ss.values, Section{value, lines})
}

func (ss *Sections) addLine(s string) {
    ss.values[len(ss.values)-1].AddLine(s)
}

func (ss Sections) isTerminatingHeader(s string) bool {
    return strings.HasPrefix(s, "### <a name=") ||
        strings.HasPrefix(s, "## <a name=") ||
        strings.HasPrefix(s, "# <a name=")
}

func (ss *Sections) Suppress(s string) {
    if len(s) > 0 {
        ss.toSuppress = append(ss.toSuppress, s)
    }
}

func (ss Sections) PrintBody() {
    for i := 0; i < len(ss.values); i++ {
        if !ss.values[i].IsInList(ss.toSuppress) {
            ss.values[i].Print()
        }
    }
}

func (ss *Sections) ProcessNewLine(s string) bool {
    if ss.isTerminatingHeader(s) {
        return false
    } else if strings.HasPrefix(s, "##### ") {
        ss.addNewSection(s)
    } else if len(ss.values) > 0 {
        ss.addLine(s)
    }
    return true
}

Sections has several private methods which keep the longer ProcessNewLine() method readable.  It contains a slice of Section objects:

type Section struct {
    name  string
    lines []string
}

func (s Section) Print() {
    fmt.Println(strings.ToTitle(s.name))
    for j := 0; j < len(s.lines); j++ {
        fmt.Println(s.lines[j])
    }
}

func (s *Section) AddLine(val string) {
    s.lines = append(s.lines, val)
}

func (s Section) IsInList(inList []string) bool {
    return slices.Index(inList, s.name) != -1
}
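The way ProcessNewLine() groups lines under their "##### " sub-headings can be tried out in isolation. The sketch below is a simplified stand-in, not the utility's code: the names section and collect are hypothetical, it takes its input as a string, and it stops only at "### " rule headings rather than at all three heading levels.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// section is a reduced stand-in for the Section struct above.
type section struct {
	name  string
	lines []string
}

// collect groups the lines of one entry under its "##### " sub-headings
// and stops at the next rule heading. Lines before the first sub-heading
// are ignored, as in ProcessNewLine().
func collect(text string) []section {
	var out []section
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		s := sc.Text()
		switch {
		case strings.HasPrefix(s, "### <a name="):
			return out
		case strings.HasPrefix(s, "##### "):
			out = append(out, section{name: strings.TrimPrefix(s, "##### ")})
		case len(out) > 0:
			last := &out[len(out)-1]
			last.lines = append(last.lines, s)
		}
	}
	return out
}

func main() {
	text := "ignored\n##### Reason\nBecause.\n##### Example\ncode here\n" +
		"### <a name=\"next\"></a>F.2: Next\n##### Reason\nnot reached\n"
	for _, sec := range collect(text) {
		fmt.Println(sec.name, len(sec.lines))
	}
	// prints:
	// Reason 1
	// Example 1
}
```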

Lines 8-11:

Having already set the factory with the first parameter, we iterate through any additional parameters looking to see if there are any sections to suppress (more than one can be specified).
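A minimal sketch of that argument scan, pulled out into a function so it can be tried on its own. The name suppressed is hypothetical; strings.CutPrefix (Go 1.20 and later) is used here as one way to test for the prefix and strip it in a single step.

```go
package main

import (
	"fmt"
	"strings"
)

// suppressed returns the names given via -s<name> arguments. Arguments
// that do not start with -s, or that name nothing, are skipped.
func suppressed(args []string) []string {
	var names []string
	for _, arg := range args {
		if name, ok := strings.CutPrefix(arg, "-s"); ok && name != "" {
			names = append(names, name)
		}
	}
	return names
}

func main() {
	fmt.Println(suppressed([]string{"-sEXAMPLE", "x", "-s", "-sNOTE"}))
	// prints: [EXAMPLE NOTE]
}
```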

Lines 16-21:

These are spent opening the input file.

Line 23:

This is where we make use of the factory.  As noted above, the headingType (which will be empty unless we are generating a list of headings) is tested first.

Line 26:

Here we generate the Headings list, using a factory function which sets up a map for checking:

type Headings struct {
    theType    string
    values     []string
    expansions map[string]string
}

func MakeHeadings(tp string) Headings {
    return Headings{theType: tp, expansions: map[string]string{
        "A":    "Architectural ideas",
        "C":    "Classes and class hierarchies",
        "CP":   "Concurrency and parallelism",
        "CPL":  "C-style programming",
        "Con":  "Constants and immutability",
        "E":    "Error handling",
        "ES":   "Expressions and statements",
        "Enum": "Enumerations",
        "F":    "Functions",
        "I":    "Interfaces",
        "NR":   "Non-Rules and myths",
        "P":    "Philosophy",
        "Per":  "Performance",
        "R":    "Resource management",
        "SF":   "Source files",
        "SL":   "The Standard Library",
        "T":    "Templates and generic programming",
    }}
}

func SortHeadings(a, b string) int {
    // Take the text after the first "." of each heading...
    s1 := a[strings.Index(a, ".")+1:]
    s2 := b[strings.Index(b, ".")+1:]
    // ...and keep only the part before the ":". Treat the headings as
    // equal, rather than panicking, if either has no ":".
    end1 := strings.Index(s1, ":")
    end2 := strings.Index(s2, ":")
    if end1 == -1 || end2 == -1 {
        return 0
    }
    ival1, err1 := strconv.Atoi(s1[:end1])
    ival2, err2 := strconv.Atoi(s2[:end2])
    if err1 != nil || err2 != nil {
        return 0
    }
    return ival1 - ival2
}

func (h Headings) Print() {
    if len(h.values) > 0 {
        s := h.expansions[h.theType]
        fmt.Println(s)
        fmt.Println(strings.Repeat("=", len(s)) + "\n")
        slices.SortFunc(h.values, SortHeadings)
        for i := 0; i < len(h.values); i++ {
            fmt.Println(h.values[i])
        }
    }
}

We sort by number, so we have to define an appropriate sorting function rather than just calling slices.Sort().
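To see why the plain sort is not enough, the following sketch compares the two on three made-up headings. The helper names ruleNumber and byRuleNumber are hypothetical, and the comparator is written with strings.Cut and cmp.Compare (Go 1.21 and later) rather than the index arithmetic used above.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"strconv"
	"strings"
)

// ruleNumber extracts N from a heading of the form "F.N: title".
// The bool is false when the heading does not have that shape.
func ruleNumber(h string) (int, bool) {
	_, rest, ok := strings.Cut(h, ".")
	if !ok {
		return 0, false
	}
	num, _, ok := strings.Cut(rest, ":")
	if !ok {
		return 0, false
	}
	n, err := strconv.Atoi(num)
	return n, err == nil
}

// byRuleNumber orders headings numerically. Headings it cannot parse
// are treated as equal.
func byRuleNumber(a, b string) int {
	na, okA := ruleNumber(a)
	nb, okB := ruleNumber(b)
	if !okA || !okB {
		return 0
	}
	return cmp.Compare(na, nb)
}

func main() {
	hs := []string{"F.10: ten", "F.2: two", "F.1: one"}

	slices.Sort(hs) // lexical: "F.10" sorts before "F.2"
	fmt.Println(hs)

	slices.SortFunc(hs, byRuleNumber) // numeric: F.1, F.2, F.10
	fmt.Println(hs)
}
```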

func (h Headings) IsKnownHeader() bool {
    return len(h.expansions[h.theType]) > 0
}

func (h *Headings) CheckCandidate(s string) {
    if matchesPossibleBeginning(s) {
        offset := strings.Index(s, "</a>"+h.theType+".")
        if offset != -1 {
            h.values = append(h.values, s[offset+4:])
        }
    }
}

Lines 27-34:

All the logic for reading in and displaying the headings.  Because we are never in a "special" state the loop can just use the scanner.Scan() return value for the loop control and it needs only a one-line body. (It takes as many lines to reject a bad value as it does to do the actual processing, at this level.)

Line 36:

The FindEntryBeginning() function hides a loop over the beginning of the file, until the entry selected appears.  It returns true, having set the title, if a match is found.  This is where the interface call to MatchesHeader is made.
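The counting side of that loop, as used for random selection, can be sketched on its own. Everything here is an assumption for illustration: matchesPossibleBeginning() is not shown in this post, so isRuleHeading stands in for it with a simple prefix test, and nthTitle is a hypothetical helper that works on a string rather than the real file.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// isRuleHeading is a stand-in for matchesPossibleBeginning(), whose body
// is not shown; the prefix test is an assumption.
func isRuleHeading(line string) bool {
	return strings.HasPrefix(line, "### <a name=")
}

// nthTitle returns the title of the n-th rule heading (counting from 0),
// that is, the text following the closing </a> tag.
func nthTitle(text string, n int) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(text))
	count := 0
	for sc.Scan() {
		line := sc.Text()
		if !isRuleHeading(line) {
			continue
		}
		if count == n {
			if ind := strings.Index(line, "</a>"); ind != -1 {
				return line[ind+len("</a>"):], true
			}
			return "", false
		}
		count++
	}
	return "", false
}

func main() {
	text := "intro\n### <a name=\"a\"></a>F.1: first\nbody\n" +
		"### <a name=\"b\"></a>F.2: second\nmore body\n"
	title, ok := nthTitle(text, 1)
	fmt.Println(title, ok) // prints: F.2: second true
}
```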

Lines 37-43:

This loop is explicit on the face of the function; there seemed to be no obvious object to anchor it to, and the body is very, very short.  ProcessNewLine() is one of the methods promoted from the embedded Sections struct, so it can be called on the PrintableRecord directly.
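For readers new to Go, this is what method promotion through embedding looks like in miniature. The types below are cut-down stand-ins that reuse the names Title and Sections for familiarity; Record and Summary are hypothetical.

```go
package main

import "fmt"

// Title and Sections are reduced stand-ins for the structs in this post.
type Title struct{ Name string }

func (t *Title) SetTitle(s string) { t.Name = s }

type Sections struct{ lines []string }

func (ss *Sections) ProcessNewLine(s string) bool {
	ss.lines = append(ss.lines, s)
	return true
}

// Record embeds both structs, so their methods and fields are promoted
// and can be used as if they had been declared on Record itself.
type Record struct {
	Title
	Sections
}

func (r Record) Summary() string {
	return fmt.Sprintf("%s (%d lines)", r.Name, len(r.lines))
}

func main() {
	var r Record
	r.SetTitle("F.1: example")     // promoted from Title
	r.ProcessNewLine("first line") // promoted from Sections
	fmt.Println(r.Summary())       // prints: F.1: example (1 lines)
}
```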

Line 44:

Print what we have collected.

Alternatives

There is a simpler alternative hiding under that implementation.

At present, we collect all the data in an entry and print it only once the whole entry has been read.  It would certainly be possible to dump the lines to the screen directly as they are read.  The current model was chosen as being more open to extension in the future: for example, an extension where more than one entry is wanted and the entries are to be ordered by some criterion other than the default order, or an extension to print only those entries containing given keywords (where printability cannot be determined until the whole entry has been read in).  To switch to the simple I/O model one would need to make only minor changes:

1) Create interfaces corresponding to the Title and Sections structs.

2) Write new structs which use different implementations of SetTitle() and ProcessNewLine() which simply dump data to the output rather than storing them.

3) Use the new structs in place of the older ones, probably using a factory to handle the struct generation.
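The three steps above might look roughly like this. It is a sketch under stated assumptions, not part of the utility: the interface names TitleSetter and LineProcessor, the StreamingRecord struct and the render helper are all hypothetical, and the terminating-header test is reduced to the "### " case.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Step 1: interfaces matching the methods main() relies on.
type TitleSetter interface {
	SetTitle(s string, ind int)
}

type LineProcessor interface {
	ProcessNewLine(s string) bool
}

// Step 2: an implementation that writes each line as soon as it is seen
// instead of storing it for later.
type StreamingRecord struct {
	out io.Writer
}

func (r StreamingRecord) SetTitle(s string, ind int) {
	name := s[ind:]
	fmt.Fprintln(r.out, name)
	fmt.Fprintln(r.out, strings.Repeat("=", len(name)))
}

func (r StreamingRecord) ProcessNewLine(s string) bool {
	if strings.HasPrefix(s, "### <a name=") {
		return false // next entry reached
	}
	fmt.Fprintln(r.out, s)
	return true
}

// Compile-time checks that StreamingRecord satisfies both interfaces.
var (
	_ TitleSetter   = StreamingRecord{}
	_ LineProcessor = StreamingRecord{}
)

// render drives a StreamingRecord over some lines and returns the output.
func render(title string, ind int, lines []string) string {
	var sb strings.Builder
	rec := StreamingRecord{out: &sb}
	rec.SetTitle(title, ind)
	for _, l := range lines {
		if !rec.ProcessNewLine(l) {
			break
		}
	}
	return sb.String()
}

func main() {
	fmt.Print(render("F.1: example", 0,
		[]string{"body", "### <a name=\"n\"></a>F.2: next", "ignored"}))
}
```

Step 3 would then be a factory that returns either the storing or the streaming implementation behind these interfaces. Note that the streaming version cannot honour the -s option for a section it has already started printing unless it decides at the "##### " line, which is one reason the collecting model is easier to extend.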
