Comparing strings

Comparing strings is inherently different from comparing numbers. Numbers have constant, universally meaningful values. To evaluate the relationship between the magnitudes of two strings, you must make a lexical comparison. Lexical comparison means that when you test a character to see if it is ?greater than? or ?less than? another character, you are actually comparing the numeric representation of those characters as specified in the collating sequence of the character set being used. Most often this will be the ASCII collating sequence, which assigns the printable characters for the English language numbers in the range 32 through 127 decimal. In the ASCII collating sequence, the first ?character? in the list is the space, followed by several common punctuation marks, and then uppercase and lowercase letters. With respect to the alphabet, this means that the letters nearer the front have lower ASCII values than those nearer the end. With these details in mind, it becomes easier to remember that when a lexical comparison that reports s1 is ?greater than? s2, it simply means that when the two were compared, the first differing character in s1 came later in the alphabet than the character in that same position in s2.

C++ provides several ways to compare strings, and each has advantages. The simplest to use are the nonmember, overloaded operator functions: operator ==, operator != operator >, operator <, operator >=, and operator <=.

The overloaded comparison operators are useful for comparing both full strings and individual string character elements.

Notice in the following code fragment the flexibility of argument types on both the left and right side of the comparison operators. For efficiency, the string class provides overloaded operators for the direct comparison of string objects, quoted literals, and pointers to C-style strings without having to create temporary string objects.

The c_str( ) function returns a const char* that points to a C-style, null-terminated string equivalent to the contents of the string object. This comes in handy when you want to pass a string to a standard C function, such as atoi( ) or any of the functions defined in the <cstring> header. It is an error to use the value returned by c_str( ) as non-const argument to any function.

You won?t find the logical not (!) or the logical comparison operators (&& and ||) among operators for a string. (Neither will you find overloaded versions of the bitwise C operators &, |, ^, or ~.) The overloaded nonmember comparison operators for the string class are limited to the subset that has clear, unambiguous application to single characters or groups of characters.

The compare( ) member function offers you a great deal more sophisticated and precise comparison than the nonmember operator set. It provides overloaded versions that allow you to compare two complete strings, part of either string to a complete string, and subsets of two strings. The following example compares complete strings:

The swap( ) function in this example does what its name implies: it exchanges the contents of its object and argument. To compare a subset of the characters in one or both strings, you add arguments that define where to start the comparison and how many characters to consider. For example, we can use the overloaded version of compare( ):

s1.compare(s1StartPos, s1NumberChars, s2, s2StartPos, s2NumberChars);

In the examples so far, we have used C-style array indexing syntax to refer to an individual character in a string. C++ strings provide an alternative to the s[n] notation: the at( ) member. These two indexing mechanisms produce the same result in C++ if all goes well:

There is one important difference, however, between [ ] and at( ). When you try to reference an array element that is out of bounds, at( ) will do you the kindness of throwing an exception, while ordinary [ ] subscripting syntax will leave you to your own devices:

Responsible programmers will not use errant indexes, but should you want to benefits of automatic index checking, using at( ) in place of [ ] will give you a chance to gracefully recover from references to array elements that don?t exist. Execution of this program on one of our test compilers gave the following output:
invalid string position

The at( ) member throws an object of class out_of_range, which derives (ultimately) from std::exception. By catching this object in an exception handler, you can take appropriate remedial actions such as recalculating the offending subscript or growing the array. Using string::operator[]( ) gives no such protection and is as dangerous as char array processing in C.

Share
Tweet
Share
Pin