## Generic algorithms

Algorithms are at the core of computing. Being able to write an algorithm once and have it work with any type of sequence makes your programs both simpler and safer. The ability to customize algorithms at runtime has revolutionized software development.

The subset of the standard C++ library known as the Standard Template Library (STL) was originally designed around generic algorithms: code that processes sequences of any type of values in a type-safe manner. The goal was to use predefined algorithms for almost every task, instead of hand-coding loops every time you need to process a collection of data.

### Stream iterators

Like any good software library, the Standard C++ Library attempts to provide convenient ways to automate common tasks. We mentioned in the beginning of this tutorial that you can use generic algorithms in place of looping constructs. So far, however, our examples have still used an explicit loop to print their output. Since printing output is one of the most common tasks, you would hope for a way to automate that too.

That's where stream iterators come in. A stream iterator allows you to use a stream as either an input or an output sequence. To eliminate the output loop in the CopyInts2.cpp program, for instance, you can do something like the following.

```cpp
// Uses an output stream iterator
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <iterator>
using namespace std;

bool gt15(int x) { return 15 < x; }

int main() {
  int a[] = {10, 20, 30};
  const size_t SIZE = sizeof a / sizeof a[0];
  remove_copy_if(a, a + SIZE,
                 ostream_iterator<int>(cout, "\n"), gt15);
} ///:~
```

In this example we've replaced the output sequence b in the third argument to remove_copy_if( ) with an output stream iterator, which is an instance of the ostream_iterator class template declared in the `<iterator>` header.

It is just as easy to write to a file instead of to cout, of course. All you have to do is provide an output file stream instead of cout:

```cpp
// Uses an output file stream iterator
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
using namespace std;

bool gt15(int x) { return 15 < x; }

int main() {
  int a[] = {10, 20, 30};
  const size_t SIZE = sizeof a / sizeof a[0];
  ofstream outf("ints.out");
  remove_copy_if(a, a + SIZE,
                 ostream_iterator<int>(outf, "\n"), gt15);
} ///:~
```
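The same pattern should work with any output stream, not only cout or a file. As a minimal sketch, assuming a std::ostringstream as the target and a hypothetical helper named filterToString (neither appears in the original examples), the following captures the output in a string so the effect of the gt15 predicate is easy to inspect:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

bool gt15(int x) { return 15 < x; }

// Hypothetical helper (not from the original examples): runs the same
// remove_copy_if call, but sends the output to an in-memory stream.
std::string filterToString() {
  int a[] = {10, 20, 30};
  const std::size_t SIZE = sizeof a / sizeof a[0];
  std::ostringstream out;
  // Elements for which gt15 returns true are skipped; the rest are
  // written to out, each followed by the "\n" delimiter.
  std::remove_copy_if(a, a + SIZE,
                      std::ostream_iterator<int>(out, "\n"), gt15);
  return out.str();
}
```

Calling filterToString( ) yields the string "10\n", since 20 and 30 are both greater than 15 and are therefore left out of the copy.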

An input stream iterator allows an algorithm to get its input sequence from an input stream. This is accomplished by having both the constructor and operator++( ) read the next element from the underlying stream and by overloading operator*( ) to yield the value previously read. Since algorithms require two pointers to delimit an input sequence, you can construct an istream_iterator in two ways, as you can see in the program that follows.

```cpp
// Uses an input stream iterator
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include "../require.h"
using namespace std;

bool gt15(int x) { return 15 < x; }

int main() {
  ifstream inf("someInts.dat");
  assure(inf, "someInts.dat");
  remove_copy_if(istream_iterator<int>(inf),
                 istream_iterator<int>(),
                 ostream_iterator<int>(cout, "\n"), gt15);
} ///:~
```

The first argument to remove_copy_if( ) in this program attaches an istream_iterator object to the input file stream containing ints. The second argument uses the default constructor of the istream_iterator class. This call constructs a special value of istream_iterator that indicates end-of-file, so that when the first iterator finally encounters the end of the physical file, it compares equal to the value `istream_iterator<int>()`, which lets the algorithm terminate.
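To make the roles of the two iterators more concrete, here is a minimal sketch that steps an istream_iterator by hand instead of passing it to an algorithm. It assumes a std::istringstream as the input source so that no data file is needed; the helper name readInts is hypothetical.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects all whitespace-separated ints in text
// by stepping an istream_iterator manually.
std::vector<int> readInts(const std::string& text) {
  std::istringstream in(text);
  std::istream_iterator<int> it(in);  // attached to the stream
  std::istream_iterator<int> end;     // default-constructed: end-of-stream
  std::vector<int> result;
  while (it != end) {       // becomes equal to end once a read fails
    result.push_back(*it);  // operator* yields the value already read
    ++it;                   // operator++ reads the next value
  }
  return result;
}
```

With the input "10 20 30", readInts( ) returns a vector holding 10, 20, and 30; with an empty string, the first iterator compares equal to the end value immediately and the result is empty.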

### Algorithm complexity

Using a software library is a matter of trust. You trust the implementers not only to provide correct functionality but also to make the functions execute as efficiently as possible. It's better to write your own loops than to use algorithms that degrade performance.

To guarantee quality library implementations, the C++ standard not only specifies what an algorithm should do, but how fast it should do it and sometimes how much space it should use. Any algorithm that does not meet the performance requirements does not conform to the standard. The measure of an algorithm's operational efficiency is called its complexity.

When possible, the standard specifies the exact number of operations an algorithm should use. The count_if( ) algorithm, for example, returns the number of elements in a sequence satisfying a given predicate. The following call to count_if( ), if applied to a sequence of integers similar to the examples earlier in this tutorial, yields the number of integer elements that are greater than 15:

```cpp
size_t n = count_if(a, a + SIZE, gt15);
```

Since count_if( ) must look at every element exactly once, it is specified to apply its predicate exactly as many times as there are elements in the sequence. The copy( ) algorithm has a similar specification: it performs exactly one assignment per element.
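One way to observe this guarantee is to instrument the predicate. The following is a minimal sketch under stated assumptions: the global counter g_calls, the predicate gt15Counted, and the helper countWithCalls are hypothetical names introduced only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical instrumentation: number of times the predicate has run.
static std::size_t g_calls = 0;

bool gt15Counted(int x) {
  ++g_calls;
  return 15 < x;
}

// Returns (number of elements greater than 15, predicate applications).
std::pair<std::size_t, std::size_t> countWithCalls() {
  int a[] = {10, 20, 30};
  const std::size_t SIZE = sizeof a / sizeof a[0];
  g_calls = 0;
  const std::size_t n =
      static_cast<std::size_t>(std::count_if(a, a + SIZE, gt15Counted));
  return std::make_pair(n, g_calls);
}
```

For the three-element array, countWithCalls( ) reports 2 matching elements and 3 predicate applications, one per element.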

Other algorithms can be specified to take at most a certain number of operations. The find( ) algorithm searches through a sequence in order until it encounters an element equal to its third argument:

```cpp
int* p = find(a, a + SIZE, 20);
```

It stops as soon as the element is found and returns a pointer to that first occurrence. If it doesn't find one, it returns a pointer one position past the end of the sequence (a + SIZE in this example). Therefore, find( ) is said to make at most a number of comparisons equal to the number of elements in the sequence.
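The "at most" bound can be illustrated by counting equality comparisons. This is only a sketch: the wrapper type Counted, the FindResult struct, and the helper searchFor are hypothetical, and the count for a successful search reflects how common library implementations behave rather than an exact figure required by the standard.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical wrapper whose operator== counts how often it is called.
struct Counted {
  int value;
  static std::size_t comparisons;
};
std::size_t Counted::comparisons = 0;

bool operator==(const Counted& lhs, const Counted& rhs) {
  ++Counted::comparisons;
  return lhs.value == rhs.value;
}

struct FindResult {
  std::size_t index;        // position of the match, or SIZE if absent
  std::size_t comparisons;  // equality comparisons find( ) performed
};

// Searches the sequence {10, 20, 30} for target.
FindResult searchFor(int target) {
  Counted a[] = {{10}, {20}, {30}};
  const std::size_t SIZE = sizeof a / sizeof a[0];
  const Counted key = {target};
  Counted::comparisons = 0;
  Counted* p = std::find(a, a + SIZE, key);
  FindResult r = {static_cast<std::size_t>(p - a), Counted::comparisons};
  return r;
}
```

Searching for 20 stops at index 1 after two comparisons, while searching for a value that is not present, such as 99, examines all three elements and returns index 3, which corresponds to a + SIZE.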

Sometimes the number of operations an algorithm takes cannot be measured with such precision. In such cases, the standard specifies the algorithm's asymptotic complexity, which is a measure of how the algorithm behaves with large sequences compared to well-known formulas. A good example is the sort( ) algorithm, which the standard says takes "approximately n log n comparisons on average" (n is the number of elements in the sequence). Such complexity measures give a "feel" for the cost of an algorithm and at least give a meaningful basis for comparing algorithms. As you'll see, the find( ) member function for the set container has logarithmic complexity, which means that the cost of searching for an element in a set will, for large sets, be proportional to the logarithm of the number of elements. This is much smaller than the number of elements for large n, so it is always better to search a set by using its find( ) member function rather than by using the generic find( ) algorithm. […]
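The difference between the two searches can be made visible by counting comparisons. The sketch below is illustrative only: the Key type, the SearchCost struct, and the compareSearchCost helper are hypothetical names, and the exact count for the member function depends on the library's tree implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>

// Hypothetical key type that counts every < and == comparison.
struct Key {
  int value;
  static std::size_t comparisons;
};
std::size_t Key::comparisons = 0;

bool operator<(const Key& a, const Key& b) {
  ++Key::comparisons;
  return a.value < b.value;
}
bool operator==(const Key& a, const Key& b) {
  ++Key::comparisons;
  return a.value == b.value;
}

struct SearchCost {
  bool found;               // both searches located the target
  std::size_t memberFind;   // comparisons made by set::find
  std::size_t genericFind;  // comparisons made by std::find
};

// Builds a set holding 0..n-1 and searches for the largest element,
// which is the worst case for the linear std::find.
SearchCost compareSearchCost(int n) {
  std::set<Key> s;
  for (int i = 0; i < n; ++i) {
    Key k = {i};
    s.insert(k);
  }
  const Key target = {n - 1};
  SearchCost cost;

  Key::comparisons = 0;
  std::set<Key>::const_iterator viaMember = s.find(target);
  cost.memberFind = Key::comparisons;

  Key::comparisons = 0;
  std::set<Key>::const_iterator viaGeneric =
      std::find(s.begin(), s.end(), target);
  cost.genericFind = Key::comparisons;

  cost.found = viaMember != s.end() && viaGeneric != s.end();
  return cost;
}
```

For a set of 1024 elements, the generic find( ) needs 1024 comparisons to reach the last element, whereas the member function needs only a few dozen at most, in line with its logarithmic complexity.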