Typesetting notes: @I(...) surrounds italic text, and @b(...) is for bold face. Due to the heavy use of programming keywords in the text, it would be impossible to read without some representation for the alternate font. You can use a search and replace to convert these to your preferred format (like ` ' or [ ] for reading online), to the printer control codes you need, or to the correct commands to import into your word processor.

This is the first C++ lesson. The subject is Operators. This is adapted from material in my upcoming book. Please direct any comments and criticism to me on the forum or by email.

--John

C++ Operators
Copyright 1990 by John M. Dlugosz

Say you have defined a @I(complex) type to work with complex numbers. In C, to add two complex numbers you would have to have a function like @I(complex add(complex,complex)). An expression of any size would end up looking more like LISP than C.

In C++, you can overload operators. In this case, I would much rather say @I(c=a+b;) than @I(c=add(a,b);). Naturally, I can do exactly that. The name of an operator is the reserved word @I(operator) followed by the symbol of the operator being referred to. In this case, @I(operator+) is what I am after. @I(operator+) can be treated just like any other function name. This is how I declare the function: @I(complex operator+ (complex,complex);) It is exactly like @I(add()) except the name of the function is now @I(operator+). I can use it just like I did above, as @I(c=operator+(a,b);) That is hardly an improvement! But I can also use the operator in its natural operator syntax, as @I(c=a+b;)

The operators are just like functions with a special name, but some restrictions apply. You can only define those operators that exist, with the correct number of parameters. The types of the parameters are much more flexible, and that flexibility is the whole point of defining them. However, at least one parameter to an operator function must be of a class type.
So, you could not define @I(void operator+ (char*, char*);) because all arguments are of built-in types.

Operators can be member functions. In this case, the first argument is the receiver and the function is declared with one fewer argument than it actually takes. So if I defined operator+ as a member, I would have @I(complex complex::operator+ (complex);) It can be used as @I(c=a.operator+(b);) as a member function with a funny name, or as the normal infix operator. In the infix syntax, the left argument is taken as the object being operated on.

@SECTION(available operators)

These are the operators that can be overloaded:

new delete   @I(storage allocation)
+ - *   @I(both unary and binary forms)
&   @I(unary and binary, unary is special)
=   @I(assignment operator is special)
/ % ^ | ~ ! < > <= >= << >> == !=
+= -= *= /= %= ^= &= |= <<= >>= ->*
&& || ,   @I(built-in forms evaluate left to right)
++ --   @I(prefix and postfix, special way to distinguish)
-> () []   @I(special in various ways)

In addition, @I(conversion operators) are also defined with the operator keyword. Conversion operators are covered in a later chapter.

Some operators can be used in more than one way. For example, operator- can be used as unary (one argument) or binary (two arguments). All operators keep the same order of precedence when they are redefined.

@SECTION(defining an operator)

Back to the example of complex addition. Here is a @I(complex) class that has operator+= and operator+ defined for complex numbers.

@BEGIN(LISTING)
//example using overloaded operators to define arithmetic
//on complex numbers.
class complex {
    double x, y;
public:
    complex (double xx, double yy) { x=xx; y=yy; }
    complex& operator+= (const complex& b);
    friend complex operator+ (const complex& a, const complex& b);
};

complex& complex::operator+= (const complex& b)
{
    x += b.x;
    y += b.y;
    return *this;
}

complex operator+ (const complex& a, const complex& b)
{
    complex temp= a;
    return temp += b;
}
@END(LISTING)

The operator+= is defined as a member function. It takes one argument in addition to the implicit @I(this). The line @I(return temp += b;) is equivalent to @I(return temp.operator+=(b);). Remember, operators defined as members are declared with one fewer argument than they actually take.

@SECTION(standard input and output)

The @I(stream) library in C++ uses overloaded operators for standard input and output. The operator<< is used as a "put to", and operator>> is used as a "get from". There are classes @I(ostream) and @I(istream) to handle output and input. Functions such as:

@BEGIN(LISTING)
ostream& operator<< (ostream&, int);
ostream& operator<< (ostream&, const char*);
istream& operator>> (istream&, int&);
istream& operator>> (istream&, char*);
@END(LISTING)

are used to handle input and output. Each operator returns its first argument as the function result, so calls can be chained together. The standard input and output are available as variables @I(cin) and @I(cout).

@BEGIN(LISTING)
cout << "hello world!\n";
cout << "the answer is:" << x;
cin >> x >> y;   //read values into x and y
@END(LISTING)

You can define your own output and input operators to operate on your own types.
@BEGIN(LISTING)
ostream& operator<< (ostream& o, const complex& c)
{
    //I'm a friend, and have access to c.x and c.y
    o << '(' << c.x << ',' << c.y << ')';
    return o;
}

istream& operator>> (istream& i, complex& c)
{
    double x, y;
    char ch;   //soaks up the '(' ',' and ')' punctuation
    i >> ch >> x >> ch >> y >> ch;
    c= complex(x,y);
    return i;
}
@END(LISTING)

The input operator is a little overkill for such a simple structure-- you could have made it a friend and just read in the x and y components directly. But this illustrates a more general technique. You read the data needed to create an object, and then call a constructor to put it together.

In general, use @I(cin >> x;) to read a value, and @I(cout << x;) to write a value. That is all you have to remember until you get to the chapter on streams.

@SECTION(operator=)

Normally when you write an assignment such as @I(x=y;) the value of y is copied to x simply by copying the bits. The assignment operator lets you define how assignment will take place instead. Consider a class that contains pointers. When you assign one instance to another, what you really want is a "deep copy" where the things pointed to are also duplicated, so the new copy has its own.

@BEGIN(LISTING)
class C {
    char* name;
    int age;
public:
    // other members omitted for example...
    C& operator= (const C&);   //assignment operator
};

C& C::operator= (const C& x)
{
    if (this == &x) return *this;   //assigning to myself: nothing to do
    age= x.age;               //copy each element
    free (name);              //free `name' before overwriting it!
    name= strdup (x.name);    //make unique copy
    return *this;
}
@END(LISTING)

The assignment operator must be a member. It should be used to provide assignment, and not for anything else. The parameter can be any type though. In fact, you can have multiple assignment operators defined for a class, so you can assign various things to it.

@BEGIN(LISTING)
class String {
    char* s;
    int len;
public:
    String& operator= (const String&);
    String& operator= (const char*);
    String& operator= (char);
};
@END(LISTING)

In class C, notice that the operator= returns a @I(C&) and in fact returns @I(*this).
This allows chaining of assignment and the use of the assignment in a larger expression, as with the built-in operator. The built-in assignment operator returns its first argument as its result, so you can do things like @I(foo(a=b+1);) and @I(a=b=c;). It is common practice to continue this tradition by returning @I(*this) from operator=.

Assignment in C does not return an lvalue, but my supplied operator= does. In C++, this has been extended to the built-in definitions as well. This means that you could say something like @I(p= &(c=b);) which takes the address of the return value from assignment, even for built-in types. You could not do that in C: you would get the error @I(argument to `&' must be an lvalue).

You can define operator= to return anything you like. Generally it returns @I(*this) or has no return value (defined as void), but there may be reasons to do otherwise.

The operator= is unique in that it is not inherited. Actually, it is inherited in a special way. If you have a class that has members or base classes that have an operator= defined, and you do not provide an operator= in this class, the compiler generates one for you that calls operator= for those pieces of the class that have an operator= defined, and copies the rest.

@BEGIN(LISTING)
// example of a class with a member that needs operator=
class D {
    int x;
    C y;   // C from above example
public:
    D& operator= (D&);
};

D& D::operator= (D& second)
{
    x= second.x;   //normal copy
    y= second.y;   //calls C::operator=()
    return *this;
}
@END(LISTING)

Since class D contains a member that should not be copied in the plain way, it defines an operator= that copies it correctly by calling the operator= on that member. However, if I had not written @I(D::operator=()) then the compiler would still have done the same thing! This is sometimes called the Miranda rule: if you do not write an operator=, one will be provided for you.
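Here is a rough sketch of the Miranda rule at work. It is not one of the original listings: @I(Name) and @I(Person) are made-up classes standing in for C and D, and the code is written for a current compiler rather than the ones discussed in this lesson. @I(Person) declares no operator= at all, yet assigning one @I(Person) to another still goes through @I(Name::operator=).

```cpp
#include <cassert>   // for the usage check shown after the listing
#include <cstring>

// Hypothetical stand-in for class C: owns a private heap copy of a name.
class Name {
    char* text;
    static char* dup(const char* s) {
        char* p = new char[std::strlen(s) + 1];
        std::strcpy(p, s);
        return p;
    }
public:
    Name(const char* s) : text(dup(s)) {}
    Name(const Name& other) : text(dup(other.text)) {}
    ~Name() { delete[] text; }
    Name& operator=(const Name& other) {
        if (this != &other) {              // guard against self-assignment
            char* copy = dup(other.text);  // make the new copy first
            delete[] text;                 // then release the old one
            text = copy;
        }
        return *this;
    }
    const char* get() const { return text; }
};

// Hypothetical stand-in for class D. No operator= is written here, so the
// compiler supplies one: it copies `age` plainly and calls
// Name::operator= for `who`.
struct Person {
    int age;
    Name who;
    Person(int a, const char* n) : age(a), who(n) {}
};

// Returns true when the compiler-generated assignment made a deep copy.
bool generated_assignment_is_deep() {
    Person a(30, "Ann");
    Person b(40, "Bob");
    b = a;                                  // compiler-generated Person::operator=
    return b.age == 30
        && std::strcmp(b.who.get(), "Ann") == 0
        && b.who.get() != a.who.get();      // two separate buffers
}
```

Calling @I(assert(generated_assignment_is_deep());) should pass: the two objects end up with equal names held in separate buffers, which is what a member-by-member assignment promises.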
Beware though: if you do write an operator= for a class, make sure you indeed copy everything in it.

@b(Version Notes) In C++ versions prior to 2.0, there was no Miranda rule. The operator= was not inherited at all. If you did not provide a new operator= in a derived class, or did not write an operator= in a class with members that needed it, you got a bit-for-bit copy anyway.

@SECTION(operator[])

The operator[] can be overloaded, and it is one of the special ones. First of all, the syntax is different. Writing @I(x[y];) will call @I(x.operator[](y);) Second, it must be a member function. Here is an example of a vector class that uses this operator. This implements an array that grows as needed to accommodate new elements.

@BEGIN(LISTING)
#include <stream.h>   //header name assumed; it was lost from this listing

typedef int eltype;

class vector {
    eltype* elements;
    int capacity;
public:
    vector (int startsize= 10);
    ~vector() { delete[capacity] elements; }
    int size() { return capacity; }
    eltype& operator[] (int index);
};

vector::vector (int startsize)
{
    capacity= startsize;
    elements= new eltype[startsize];
}

eltype& vector::operator[] (int index)
{
    if (index >= capacity) {   //out of bounds
        eltype* new_array= new eltype[index+1];
        //copy only the elements that exist in the old array
        for (int loop= 0; loop < capacity; loop++)
            new_array[loop]= elements[loop];
        delete[capacity] elements;
        elements= new_array;
        capacity= index+1;
    }
    return elements[index];
}

main()
{
    vector a;
    for (;;) {
        char c;
        int index, value;
        cout << "operation (r,w,s,l,q)? ";
        cin >> c;
        switch (c) {
        case 'q':
            goto done;
        case 'r':   //read test
            cout << " index ";
            cin >> index;
            if (index < 0 || index >= a.size())
                cout << "index out of range.";
            else
                cout << "contains " << a[index];
            break;
        case 'w':   //write test
            cout << " index and value ";
            cin >> index >> value;
            a[index]= value;
            break;
        case 's':   //size?
            cout << "array holds " << a.size() << " elements.";
            break;
        case 'l':   //list them
            for (index= 0; index < a.size(); index++)
                cout << "\n [" << index << "] " << a[index];
            break;
        }
        cout << '\n';
    }
done:
    cout << "program finished.\n";
}
@END(LISTING)

The operator[] returns an element of the array, and returns it by reference. This allows it to be used on the left hand side of an assignment, as seen in the write test in the main program.

This is a typical application-- to access a collection of elements. The parameter to operator[] can be of any type though. For example, an associative array could associate strings and numbers. Index it with a number and it returns the string, and index it with a string and it returns the number! Interesting? Here is the code:

@BEGIN(LISTING)
#include <stream.h>   //header names assumed; they were lost from this listing
#include <string.h>

/* associative array of strings */

typedef char* eltype;

class asa {   //associative string array
    eltype* elements;
    int capacity;
public:
    asa (int startsize= 10);
    ~asa() { delete[capacity] elements; }
    int size() { return capacity; }
    eltype& operator[] (int index);
    int operator[] (eltype s);   //the 'backwards' version
};

asa::asa (int startsize)
{
    capacity= startsize;
    elements= new eltype[startsize];
}

int asa::operator[] (eltype s)
{
    int loop;   //declared here because it is used after the for loop
    for (loop= 0; loop < capacity; loop++)
        if (!strcmp(s,elements[loop])) return loop;   //found it
    //did not find it-- add it
    (*this)[loop]= strdup(s);
    return loop;
}

eltype& asa::operator[] (int index)
{
    if (index >= capacity) {   //out of bounds
        eltype* new_array= new eltype[index+1];
        //copy only the elements that exist in the old array
        for (int loop= 0; loop < capacity; loop++)
            new_array[loop]= elements[loop];
        delete[capacity] elements;
        elements= new_array;
        capacity= index+1;
    }
    return elements[index];
}

main()
{
    asa a;
    for (;;) {
        char c;
        int index;
        char value[40];
        cout << "operation (r,w,b,s,l,q)? ";
        cin >> c;
        switch (c) {
        case 'q':
            goto done;
        case 'b':   // 'backwards' read test
            cout << "enter string ";
            cin >> value;
            cout << "found at " << a[value];
            break;
        case 'r':   //read test
            cout << " index ";
            cin >> index;
            if (index < 0 || index >= a.size())
                cout << "index out of range.";
            else
                cout << "contains " << a[index];
            break;
        case 'w':   //write test
            cout << " index and value ";
            cin >> index >> value;
            a[index]= strdup(value);
            break;
        case 's':   //size?
            cout << "array holds " << a.size() << " elements.";
            break;
        case 'l':   //list them
            for (index= 0; index < a.size(); index++)
                cout << "\n [" << index << "] " << a[index];
            break;
        }
        cout << '\n';
    }
done:
    cout << "program finished.\n";
}
@END(LISTING)

This program was based heavily on the previous one. I started with the vector class. A global search and replace changed the name from @I(vector) to @I(asa), and simply changing the typedef of eltype changed the type to operate on char*'s instead of ints. Then I added the new function: @I(int asa::operator[] (eltype s)). It simply searches the array for a matching string, and adds the string at the end if it is not found.

The test driver also required only minor changes. Changing the definition of the variable @I(value) to the new type also changed the behavior of all the input and output statements that use it. All I had to do was add a new case to try the string look up.

Note that this program is not all that great. The simple stream input does not let me enter strings with a space in them, the array is not initialized so listing it or searching it can cause problems, and it is up to the user of the class to free up pointers before they are overwritten.

@SECTION(operator())

The operator() is another strange one. It is sometimes called the @I(function-call operator). Like operator[], it must be a member. The way to call it is easier shown than explained. Look how it is used:

@BEGIN(LISTING)
class C {
    //stuff...
public:
    void operator() (int x);
};

C x;
//call operator()
x(5);
x.operator()(5);   //same thing
@END(LISTING)

Here, @I(x) is a variable of type @I(C). Using the name as if it were a function will call operator(). When defining operator(), you give a parameter list just like any other function, so you have two sets of parentheses in the definition. When calling it, you just have the one set.

The operator() can be defined to have any number of arguments-- it is the only operator that can do this.

@BEGIN(LISTING)
void C::operator()();   //no args
int C::operator() (char* s, int y);   //2 args
@END(LISTING)

So what is the point in having such an operator? For one, it can be used in a manner similar to the operator[]. Here is a vector class that uses operator[] to access an element with range checking, and operator() to access an element without range checking.

@BEGIN(LISTING)
// example of using operator() on a vector similar to operator[]
typedef int eltype;

class vector {
    eltype* contents;
    int first, last;   //bounds of the array
public:
    vector (int first, int last);
    ~vector() { delete[last-first+1] contents; }
    eltype& operator[] (int index);   //access with range checking
    eltype& operator() (int index)    //access without range checking
        { return contents[index-first]; }
};

vector::vector (int f, int l)
{
    first= f;
    last= l;
    contents= new eltype[l-f+1];
}

eltype& vector::operator[] (int index)
{
    if (index < first || index > last) {
        //deal with the error somehow. In this case,
        //I'll return a special internal value that can
        //be written to without trashing memory.
        static eltype dump;
        return dump;
    }
    return contents[index-first];
}
@END(LISTING)

If @I(a) is of type vector, you could access @I(a[5]) or @I(a(5)) which are similar in meaning. Having both [] and () available gives two different ways to subscript a vector.

Another thing you can do with the operator() is to take advantage of its ability to have different numbers of arguments.
The operator[] can only take one argument, but you might have a matrix class that takes two subscripts. You could use operator() to subscript the class instead.

@BEGIN(LISTING)
// example of using operator() instead of operator[]
// so I can use two arguments.
#include <assert.h>

const int matsize= 3;
typedef double eltype;

class square {
    eltype data[matsize][matsize];
public:
    eltype& operator() (int x, int y);
};

eltype& square::operator() (int x, int y)
{
    //valid subscripts run from 0 through matsize-1
    assert (x >= 0 && y >= 0 && x < matsize && y < matsize);
    return data[x][y];
}
@END(LISTING)

Given a variable @I(M) of type @I(square), you could write @I(M(1,2)) to access an element. In the class definition, the eltype definition is used as usual. In this example, the size of the matrix is specified with @I(matsize) as well. To change the size, you only need to change this definition. All other parts of the code reference the size by this name. In C, you would have to use a #define for this purpose. In C++, a const variable can be used in constant expressions, such as the size of an array definition.

Another variation on this theme is a string class which uses operator() to take a substring. Writing @I(s(3,7)) would return the third through seventh characters in the string s.

Another time operator() is used is when a class only has one method, or one very important method. Consider an iterator class. It steps through a linked list, returning the next element each time the @I(next()) method is called. Rather than calling it @I(next()), an iterator sometimes uses operator() for that purpose.

@BEGIN(LISTING)
// example of using operator() in an iterator
class node {
    friend class iterator;   //grant class iterator access to my private data
    node* next;
public:
    char* data;
    void insert_after (node*);   //insert a node after this node.
};

class iterator {
    node* p;   //keep track of my position in the list
public:
    node* operator()();   //advance to the next position
    iterator (node* n) { p= n; }
};

void node::insert_after (node* p)
{
    //insert p after this node
    p->next= next;
    next= p;
}

node* iterator::operator()()
{
    //return the node and advance to the next node
    if (!p) return p;   //end of the line, don't advance
    node* temp= p;
    p= p->next;
    return temp;
}
@END(LISTING)

@SECTION(operator->)

The operator-> is the strangest one of all. It must be a member. How it is called, and what it does, takes some explaining.

The operator-> must be defined to return a pointer type or a class type that itself has an operator-> defined. The call is made with an object on the left and a member name on the right, such as @I(x->a), but the @I(a) is not an argument to the function. The operator-> is applied to @I(x), and then the result is used on the left side of -> again. The @I(a) will be the name of a member, not an argument of any kind.

The example @I(x->a) is equivalent to @I((x.operator->())->a). The operator is "slipped in" to the member access. Here is an example of using operator-> to implement a "smart pointer".

@BEGIN(LISTING)
//example of using operator-> to create a "smart pointer"
class C {
    //members go here...
public:
    int x;   //a public data member
    void dosomething();   //member function
};

class Cptr {
    C* p;
public:
    Cptr() { p=0; }   //always initialized
    Cptr& operator= (C* ptr) { p= ptr; return *this; }
    C* operator->();
};

extern void error();   //report an error, somehow

C* Cptr::operator->()
{
    if (!p) error();   //oops! NULL pointer
    return p;
}
@END(LISTING)

The smart pointer class holds a pointer to a C, and has operator-> defined on it so it will return that pointer.
You could have:

@BEGIN(LISTING)
C x;
Cptr p;
p= &x;
int y= p->x;   //refer to element in C
p->dosomething();   //even member functions
@END(LISTING)

The operator-> is a unary operator that returns a C*, and then the -> operation is redone with that return value on the left. So @I(p->x) is equivalent to @I((p.operator->())->x;). If that still confuses you, remember that the operator is simply a member function with a funny name. Calling it @I(fetch()) would let you say @I((p.fetch())->x;) which is perhaps clearer.

@SECTION(operator&)

The unary form of operator& is unusual because it is already defined for all class types. It normally takes the address of the object. You can redefine it to do anything you want. Usually it is redefined to inform the system that an object is having its address taken. It is sometimes used in debug code to report to the programmer when objects have their address taken.

@BEGIN(LISTING)
someclass* someclass::operator& ()
{
    report_on (this);   //tell the programmer
    return this;        //and do what I came for.
}
@END(LISTING)

This operator can be defined as a member function as shown above, or as a non-member function.

You should watch out for pitfalls when using an operator&. Namely, how do you take the address of an object if operator& is defined? This suggests one use of operator& is to make a class where you cannot take the address of an object-- operator& is private or causes a run-time error to be printed.

In the example of operator-> a smart pointer was illustrated. The Cptr type had an operator= that let you assign a C* to a Cptr. Instead, you could use operator& to make you use smart pointers exclusively: define a @I(Cptr C::operator&();) so that taking the address of a C object gives a smart pointer directly.

@SECTION(operator,)

The comma operator is not all that unusual. It is rarely used because the comma is used so much in C++ already.
It is unusual in the respect that the comma is already defined between class types, so you have to be careful sometimes in knowing how overloading will resolve. The built-in comma operator forces left-to-right evaluation. That can be handy. An overloaded comma is an ordinary function call, and older compilers do not promise the order in which its operands are evaluated. The order of precedence is very low, so you will usually need parentheses around the comma expression.

@b(come up with an example)

@SECTION(operator++ and operator--)

These two are similar, and I'll just talk about operator++. The same comments apply to operator--.

There are two forms of ++ for built-in types. As a C programmer, you know that @I(y= x++;) and @I(y= ++x;) have different meanings. In C++ you can define both the prefix and postfix forms.

Defining the prefix form is exactly what you would expect, since it is the same as any other unary operator. A member function such as @I(C::operator++()) or a nonmember such as @I(operator++(C&)) will do.

The postfix form is defined with an extra argument. This second argument is an int, and is always passed a value of 0. Defining @I(C::operator++(int)) or @I(operator++ (C&,int)) will define a postfix operator.

Here is the smart pointer example again, showing prefix and postfix operators added.

@BEGIN(LISTING)
//example of using operator++ with the smart pointers
class C {
    //members go here...
public:
    int x;   //a public data member
    void dosomething();   //member function
};

class Cptr {
    C* p;
public:
    Cptr() { p=0; }   //always initialized
    Cptr& operator= (C* ptr) { p= ptr; return *this; }
    C* operator->();   //see example in operator-> section
    Cptr& operator++();   //preincrement
    Cptr operator++(int);   //postincrement
};

Cptr& Cptr::operator++()
{
    //preincrement
    ++p;
    return *this;
}

Cptr Cptr::operator++ (int)
{
    //postincrement
    Cptr temp= *this;
    ++p;
    return temp;   //return original unmodified copy
}
@END(LISTING)

@b(Version Note) In C++ versions prior to 2.1, the postfix form was not available. The only way to define an operator++ or operator-- was with one argument.
The compiler called this same function for either prefix or postfix use. For compatibility, do not use such a function as a postfix operator.
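
To see the difference between the two forms in use, here is a small sketch. It is my own illustration rather than part of the smart pointer listing: @I(IntCursor) and the two helper functions are made-up names, and the code is written for a current compiler. The cursor walks an array of ints; the postfix form hands back the position it held before advancing, and the prefix form hands back the cursor after advancing.

```cpp
#include <cassert>   // for the usage check shown after the listing

// Hypothetical cursor over an int array, in the spirit of Cptr above.
class IntCursor {
    int* p;
public:
    explicit IntCursor(int* start) : p(start) {}
    int& operator*() const { return *p; }

    IntCursor& operator++() {      // prefix: advance, then yield this cursor
        ++p;
        return *this;
    }
    IntCursor operator++(int) {    // postfix: the unnamed int only marks the form
        IntCursor old = *this;     // remember where I was
        ++p;
        return old;                // yield the position before the advance
    }
};

// Reads through the postfix form: the element at the starting position.
int read_then_advance(int* data) {
    IntCursor c(data);
    return *c++;                   // parsed as *(c++)
}

// Reads through the prefix form: the element after the starting position.
int advance_then_read(int* data) {
    IntCursor c(data);
    return *++c;
}
```

Given @I(int data[3] = {10, 20, 30};) the first helper should return 10 and the second 20, matching what @I(y= *x++;) and @I(y= *++x;) do with a plain pointer.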