Yes, there are two new and two delete operators (they are not functions).
The new() and delete() are called to allocate and free a single object and the new[]() and
delete[]() are called to allocate and free an array of objects. You should always use delete()
with new() and delete[]() with new[]().
Examples:
SomeObject *x = new SomeObject; // use new()
SomeObject *y = new SomeObject(initialization, parameters, see, constructors); // use new()
x->someMethod(); // call some method for x
...
delete y; // use delete()
delete x; // use delete()
SomeObject *x = new SomeObject [20]; // use new[](), allocate 20 objects!
SomeObject *y = new SomeObject [20]; // use new[](); every element is default-constructed, constructor arguments cannot be passed
x[0].someMethod(); // call some method for the zero-th object
y[19].someMethod(); // call last y object's method
...
delete [] y; // must use delete[]() since it is an array
delete [] x; // use delete[]()
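The pairing rule can be checked with a small sketch. Counted is a hypothetical class (not part of OTD) that counts its live instances, so the asserts show that new[] runs 20 constructors and that delete[] runs the 20 matching destructors.

```cpp
#include <cassert>

// Hypothetical class that counts how many instances are alive.
struct Counted {
    static int alive;
    Counted()  { ++alive; }
    ~Counted() { --alive; }
};
int Counted::alive = 0;

void demoNewDelete() {
    Counted *one  = new Counted;       // new: one constructor call
    Counted *many = new Counted[20];   // new[]: 20 constructor calls
    assert(Counted::alive == 21);

    delete one;                        // matches new: one destructor call
    assert(Counted::alive == 20);

    delete [] many;                    // matches new[]: 20 destructor calls
    assert(Counted::alive == 0);
}
```

Mixing the forms (delete on an array, or delete[] on a single object) is undefined behaviour, which is why the sketch only shows the matching pairs.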
Older compilers accept a number inside the brackets of the delete, but it is ignored; standard C++ does not allow it.
All of the following constructions are accepted by the g++ compiler on host, but some are not supported by the dcc compiler (for target).
Supported by the dcc compiler are:
unsigned long int **ul = new unsigned long int*; // 'one pointer to an unsigned long int'
char * ch = new char [10]; // 'array of characters'
RTDataObject *it = new RTDataObject [10]; // 'array of RTDataObjects'
RTDataObject **it = new RTDataObject* [10]; // 'array of pointers to RTDataObjects'
void **it = new void* [10]; // 'array of pointers (to something)'
void **it = new (void*) [10]; // 'array of pointers (to something)'
unsigned long int **ul = new unsigned long int* [10]; // 'array of pointers to unsigned long ints'
unsigned long int **ul = new (unsigned long int*) [10]; // 'array of pointers to unsigned long ints'
unsigned long int *ul = new unsigned long int [10]; // 'array of unsigned long ints'
unsigned long int *ul = new (unsigned long int) [10]; // 'array of unsigned long ints'
Not supported by the dcc compiler are:
char *ch = new (char) [10]; // 'array of characters'
RTDataObject *it = new (RTDataObject) [10]; // 'array of RTDataObjects'
RTDataObject **it = new (RTDataObject*) [10]; // 'array of pointers to RTDataObjects'
Conclusions:
Below are three examples to describe the difference between a shallow copy, a deep copy and a member-wise copy. A standard copy-constructor in C++ does a member-wise copy!
In the following, I use an RTString as an example. An RTString has internally a
pointer to the contents of the string, so in a picture it looks like this:
Shallow copy: only copies the pointer, so the object is shared.
example:
RTString *shallow_copy = &original; // only the address is copied,
                                    // the rest is shared
Deep copy: makes a completely separate object, including its contents.
example:
RTString deep_copy = original; // in C++ this would result in a member-wise copy,
                               // but OTD has overridden this behaviour to a deep copy
Member-wise copy: this is the default C++ behaviour, but in OTD it is overridden, so that a deep copy
is done when you write something that is normally a member-wise copy. But be aware of
this and double-check when you're not sure.
example:
RTPointer member_copy = original; // the pointer object is copied,
                                  // but what it points to is not duplicated
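The three kinds of copy can be compared side by side in plain C++. This is only a sketch: Text and Handle are hypothetical stand-ins (not RTString or RTPointer), with Text written to copy deeply and Handle left with the compiler-generated member-wise copy.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical string class that owns its buffer and copies it deeply.
class Text {
public:
    Text(const char *s) : buf(new char[std::strlen(s) + 1]) { std::strcpy(buf, s); }
    Text(const Text &other) : buf(new char[std::strlen(other.buf) + 1]) {
        std::strcpy(buf, other.buf);        // deep copy: duplicate the contents
    }
    ~Text() { delete [] buf; }
    char *data() { return buf; }
private:
    Text &operator=(const Text &);          // assignment left out of this sketch
    char *buf;
};

// Hypothetical handle that keeps the compiler-generated (member-wise) copy.
struct Handle {
    char *target;
};

void demoCopies() {
    Text original("abc");

    Text *shallow = &original;              // shallow: same object, same buffer
    assert(shallow->data() == original.data());

    Text deep = original;                   // deep: separate object and buffer
    assert(deep.data() != original.data());
    assert(std::strcmp(deep.data(), "abc") == 0);

    Handle h1 = { original.data() };
    Handle h2 = h1;                         // member-wise: the pointer value is copied
    assert(&h1 != &h2 && h1.target == h2.target);
}
```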
This guideline states that we should only use the EC++ subset of C++. On page 36 of the A revision is a list of C++ elements that are not part of EC++ and consequently should not be used by us. Below is the list with my comments in red. All of these are typical C++ (i.e. non-C) elements, so C programmers have probably never heard of them. But anyway, these are the more exotic elements of C++.
C++ can convert many types into many other types. This is convenient, but also
hazardous:
It's convenient in int i = 3; float f = i + 1.8;
It's hazardous in protocol.send(message, &myObject); when an RTPointer is
expected as data. In this case, the &myObject is cast to void* and the send(int, void*)
is used instead of the (intended!) send(int, RTDataObject);
I think it's good to write out the implicit casts (i.e. make them explicit), but also
know the rules that apply for implicit casts.
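The send() hazard can be reproduced with a minimal sketch. Port, DataObject and Message are hypothetical stand-ins, not the real protocol classes; the two overloads only report which one was chosen.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the real RTDataObject / RTEndPortRef API.
struct DataObject { virtual ~DataObject() {} };
struct Message : DataObject {};

enum Chosen { ByReference, ByVoidPointer };

struct Port {
    Chosen send(int, const DataObject &) { return ByReference; }
    Chosen send(int, void * = 0)         { return ByVoidPointer; }
};

void demoSend() {
    Port port;
    Message myObject;

    // &myObject is a Message*, which converts implicitly to void*.
    assert(port.send(1, &myObject) == ByVoidPointer);

    // Passing the object itself selects the intended overload.
    assert(port.send(1, myObject) == ByReference);

    // Writing the conversion out makes the choice visible in the code.
    assert(port.send(1, static_cast<const DataObject &>(myObject)) == ByReference);
}
```

The first call compiles without any diagnostic, which is what makes this kind of mistake easy to miss.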
Parts of the text below are from
Overload resolution [i.e. which overload of a particular function to use] involves conversions, which may be needed to match a function signature to the types written in a call. [When the types differ from what is expected, a conversion (cast) must be done. If this cast is not written, C++ will insert an implicit cast according to the following rules of priority]
T | <=> | T& |
T[] | ==> | T* |
T(argtypes) | ==> | T(*)(argtypes) |
T | ==> | const T |
T | ==> | volatile T |
T* | ==> | const T* |
T* | ==> | volatile T* |
Built-in conversions of built-in types around built-in operators are the 'usual arithmetic conversions' of ANSI C:
When C++ encounters a function call with N arguments, it considers all visible function overloads of that name. It identifies the overloads eligible to be called with N arguments (including those using default arguments). Then it determines which of the eligible overloads can match the arguments provided, and attempts to find a 'best' match. To be a best match, an overload must match every argument at least as well as every other eligible overload, and match at least one argument better.
For example:
int f(float, long);
int f(double, unsigned);
No conversion at all is a better match than promotion, and promotion a better match than a signed-unsigned change.
f(1, 1); // call: f(int, int)
The choices to match an overload to the call are:
f(int=>float, int=>long)
f(int=>double, int=>unsigned)
The first overload provides a better match on the second argument, and an equally good match on the first argument.
f(0U, 1); // call: f(unsigned, int)
The first overload provides a better match on the second argument and an equally good match on the first.
f(1.0F, 0); // call: f(float, int)
The first overload provides better matches on both arguments.
f(1.0, 0); // call: f(double, int)
The first overload provides a better match on the second argument but the second overload provides a better match on the first argument. The call is ambiguous.
Practical example:
protocol.send() has 2 overloads:
int RTEndPortRef::send(int, const RTDataObject&, int prio = General);
int RTEndPortRef::send(int, void* data = 0, int prio = General);
Which one is used for the following calls?
protocol.send(aMessage);
// ==> protocol.send(aMessage, 0, General);
The second overload is the only one that can be called with 1 parameter.
protocol.send(aMessage, "the data");
// ==> protocol.send(aMessage, (void*)"the data", General);
The first overload requires the second parameter to be converted from const char* to RTString (an RTString constructor (level 4)) and an up-cast to RTDataObject (a trivial cast (level 1), since RTDataObject is a base-class of RTString).
The second overload only requires the second argument to be converted from const char* to void* (a built-in conversion (level 3)), so the second is a closer match.
protocol.send(aMessage, 1);
// ==> protocol.send(aMessage, (void*)1, General);
The same reasoning as for the string above applies here: the const int would have to be converted to an RTInteger by the RTInteger constructor, which is a worse match than the cast from const int to void*.
In C++ it is possible to declare variables in the initialization part of the
for(), like in:
for (int i = 0; i < length; i++)
{
do something;
}
What is the scope of i in this case? In other words: "Where is i known?"
This is a question they have discussed in the C++ world for a while, and two
answers exist in practice.
Under the original rule, i's scope is as if it is declared just before the for().
int i;
for (i = 0; i < length; i++)
{
do something;
}
This means that i is known after the for() as well, so it is an
error to declare it again. E.g.:
for (int i = 0; i < length; i++)
{
replaceSpace(s[i],'_');
}
for (int i = 0; i < length; i++)
{
makeCapital(s[i]);
}
Under this rule, the code-fragment is only ok if the declaration in the second for() is omitted.
Now the problem...
These rules took a while to take shape, and the ISO C++ standard ended up with
the opposite rule: the declaration is local to the for() statement itself. The
compiler we use (g++) follows the standard rule. This means
that after the for(), the i is not known anymore and can be
declared again (even stronger: it must be declared again if you want to use it).
So in g++, the fragment with the two declarations is correct and the version
without the second declaration will not compile!
This translation is compiler specific and should therefore not be used! I.e.:
After the body of a for(), don't make any assumptions about the
existence of variables declared in the initialization part of the
for().
If you want to be on the safe side: Don't declare variables in the initialization
part of the for().
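A sketch of the safe variant: the loop variable is declared once, before both loops, so the code does not depend on either scoping rule. The function tidy() and its behaviour are made up for this illustration; they only stand in for replaceSpace() and makeCapital().

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: replaces spaces by '_' and upper-cases ASCII letters in place.
void tidy(char *s) {
    int length = static_cast<int>(std::strlen(s));
    int i;                                   // declared once, before both loops
    for (i = 0; i < length; i++) {
        if (s[i] == ' ') s[i] = '_';
    }
    for (i = 0; i < length; i++) {           // reuse i, no second declaration
        if (s[i] >= 'a' && s[i] <= 'z') s[i] = static_cast<char>(s[i] - 'a' + 'A');
    }
}

void demoTidy() {
    char text[] = "a b c";
    tidy(text);
    assert(std::strcmp(text, "A_B_C") == 0);
}
```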
bool result = ~false;
if (result == true)
{
printf("This is good\n");
}
else
{
printf("This is not good\n");
}
What do you think is printed in this example? I don't know! It's compiler specific!
Why?
A computer really only works with numbers, so there must be a mapping from true/false
to numbers. In C/C++, the definition of true and false is as follows:
What should you do?
bool result = ~false;
if (result) ...
If we want to take numbers into the boolean domain, we should use the comparison operators:
FILE *f = fopen("something","r");
bool success = (f != NULL);
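Why the comparison with true can fail is easiest to see when bool is not a built-in type. The sketch below assumes such an environment and models it with a hypothetical LegacyBool typedef; with a built-in bool the value of ~false is converted back to true, so the outcome differs.

```cpp
#include <cassert>

// Assumption: a pre-standard environment where bool is just an int typedef.
typedef int LegacyBool;
const LegacyBool LEGACY_FALSE = 0;
const LegacyBool LEGACY_TRUE  = 1;

void demoBool() {
    LegacyBool result = ~LEGACY_FALSE;       // bitwise not: all bits set, not 1
    assert(result != LEGACY_TRUE);           // so "== true" fails here
    assert(result);                          // yet it is non-zero, i.e. still true

    LegacyBool negated = !LEGACY_FALSE;      // logical not always yields 1
    assert(negated == LEGACY_TRUE);

    int number = 42;
    bool nonZero = (number != 0);            // a comparison maps a number to a bool
    assert(nonZero);
}
```

Testing the value directly (if (result)) and using ! instead of ~ give the same answer in both environments.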
The idea behind a constant reference is the following:
In the BDH OTD model, we are now going to rely more and more on functions and
inheriting them instead of separate non-inheriting state machines.
This poses the problem of "Which function will be called?", since several
functions may apply (see also "Implicit conversion rules
of C++").
Aspects:
What is overloading?
If at a particular point in the program, several functions with the same
name, but with different parameters and/or const-ness are visible (i.e. they
are in scope), those functions are said to be "overloads of each other".
All overloads can be called, because they are discriminated by signature.
Example:
class RTString : public RTDataObject
{
public:
RTString(const RTString&);
RTString(const char*);
char* getContents(void);
const char* getContents(void) const;
};
What is overriding?
Overriding is when a sub-class defines a function with exactly the same
signature as in the base-class. This sub-class function overrides the
base-class function. In effect, the sub-class function masks the base-class
function.
By default, only the override is visible and the overridden is not.
Example:
class RTDataObject : public RTObject
{
public:
virtual RTDataObject* copy(void) const;
...
};
class RTString : public RTDataObject
{
public:
virtual RTDataObject* copy(void) const;
...
};
Here we see a function that is overridden in every class. If we normally
call copy() in an RTString-method, we get the override and
the override is said to mask the overridden function.
(It is possible to call the RTDataObject's copy() directly
by calling RTDataObject::copy().)
The danger is that one wants to use override (i.e. substitute base-class behaviour with sub-class behaviour), but one doesn't use the exact same signature, so that an overload is created and not an override. In this case, the function that should be overridden is still visible!
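A minimal sketch of this danger, with made-up classes Base, Good and Oops: forgetting const changes the signature, so Oops adds a new function instead of replacing the base-class one.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};

struct Good : Base {
    virtual int id() const { return 2; }   // same signature: an override
};

struct Oops : Base {
    virtual int id() { return 3; }         // const forgotten: a different signature
};

void demoOverride() {
    Good good;
    Oops oops;
    const Base &g = good;
    const Base &o = oops;

    assert(g.id() == 2);     // sub-class behaviour, as intended
    assert(o.id() == 1);     // base-class behaviour: Base::id() const was never overridden
    assert(oops.id() == 3);  // the new function is only found through the Oops type itself
}
```

Compilers that support C++11 can catch this when the sub-class function is marked with the override specifier; older compilers accept the code silently.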
The rules for determining which function to call in an expression are:
Example:
int f(int); // this is ::f(int), because it is defined in global namespace
class A
{
public:
int f(int);
int f(void);
int f(char) const;
};
class B : public A
{
public:
int g(void) { return f(); }
};
class C : public B
{
public:
int f(int size = 0);
int f(char);
};
int main(void)
{
int i;
A a;
const C c;
i = f(1); // see note 1
i = a.f(2); // see note 2
i = c.f(); // see note 3
i = c.f('a'); // see note 4
i = c.g(); // see note 5
return 0;
}
note 1:
step 1: ::f(int) is the only function with this name that is visible.
step 2: f(1) matches on f(int).
step 3: const-ness is ok.
note 2:
step 1: Visible are: A::f(int), A::f(void) and A::f(char) const (::f(int)
is masked).
step 2: f(2) matches on A::f(int).
step 3: const-ness is ok.
note 3:
step 1: Visible are only C::f(int size = 0) and C::f(char). Declaring f in
class C hides every f inherited from A, not only A::f(int).
step 2: f() matches C::f(int size = 0), using the default argument.
step 3: c is a const object and C::f(int size = 0) is non-const, so the
compiler rejects the call.
note 4:
step 1: Visible are: C::f(int size = 0) and C::f(char) (all A::f overloads
are hidden).
step 2: f('a') matches best on C::f(char).
step 3: because c is a const object and C::f(char) is non-const, the compiler
rejects the call. A::f(char) const is only reached by writing c.A::f('a').
note 5:
for g(), B::g(void) is called, that's easy. But now f()...
step 1: f() is called from class B, so only A::f(int), A::f(void) and A::f(char) const
are visible.
step 2: f() matches A::f(void).
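step 3: const-ness is not ok: c is const and B::g(void) is non-const, so as written the compiler rejects this call too. With a non-const object of class C, A::f(void) is called.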
Note that class C also knows C::f(int size = 0), which also matches the call, but
because the actual call is done from class B, this is also the class where the visibility
is determined.
Compare this result with note 3.
Every object occupies a certain amount of memory (pointed to by the this pointer). When inheriting, a subclass gets all that the baseclass has, so also the data. So, with single inheritance, the first part of an object's memory is the same as that of its baseclass and the rest is specific for that class, see the example below:
class A { int i; };
class B : public A { int j; };
class C : public B { int k; };
class D : public A { int x; };
memory-map:
Below are a few examples where this can happen:
Example 1: Return as baseclass
SomeBaseClass bad(void)
{
SomeSubClass a;
// ...
return a;
}
SomeBaseClass& good(void)
{
// Cannot return a reference to a local variable, therefore, make the variable
// on the heap (or return a member variable).
SomeSubClass *a = new SomeSubClass;
// ...
return *a;
}
Slicing occurs in bad() at the return, where only the baseclass-part is copied and returned. Remember that the lifespan of a local variable ends when the function ends (i.e. just after the return), therefore, a must be copied.
Example 2: Assign to baseclass
SomeBaseClass a;
SomeSubClass b;
a = b;
a's assignment (SomeBaseClass::operator=()) is used and that one only copies the baseclass stuff.
(SomeSubClass)a = b;
It is not legal to cast a baseclass to a subclass.
SomeBaseClass *a; // now a pointer!
SomeSubClass b;
a = &b; // do not really copy
a = new SomeSubClass(b); // make a new object, by using the copy-constructor
Example 3: Using a baseclass reference
SomeSubClass a, b;
SomeBaseClass &c = a; // nothing wrong here, just a reference
c = b; // should copy contents of b into a (both of type SomeSubClass)
In the last assignment, again the SomeBaseClass::operator=() of c is used, so the object b is sliced. The initialization of c from a is not a problem, since only a reference is made and the object is not copied.
A general rule is: When using a baseclass
to manipulate a subclass, use pointers or references, so that the data doesn't
get copied. If you really want to copy it, use the copy-method (available for
RTDataObject) or see to it that the copy is done in the right way (probably, you
can add some comments in the code :-) ).
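The general rule can be illustrated with a short sketch. Shape and Square are hypothetical classes, and their copy() method only mimics the idea of the RTDataObject copy-method; it is not the OTD implementation.

```cpp
#include <cassert>

struct Shape {
    virtual ~Shape() {}
    virtual int corners() const { return 0; }
    virtual Shape *copy() const { return new Shape(*this); }
};

struct Square : Shape {
    virtual int corners() const { return 4; }
    virtual Shape *copy() const { return new Square(*this); }
};

void demoSlicing() {
    Square square;

    Shape sliced = square;          // copies only the Shape part: sliced
    assert(sliced.corners() == 0);

    Shape &ref = square;            // no copy, still the full Square
    assert(ref.corners() == 4);

    Shape *dup = ref.copy();        // the copy-method keeps the dynamic type
    assert(dup->corners() == 4);
    delete dup;
}
```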
Twice the same method?
As described in Dangers of overloading and overriding,
Class::method() and Class::method() const have different signatures
and thus are overloads and not overrides. Therefore, it is possible to define the
following:
class Data
{
private:
int x;
int y;
public:
int getX(void) const { return x; }
int getY(void) const { return y; }
bool setX(int val) { x = val; return true; }
bool setY(int val) { y = val; return true; }
};
class DataClass
{
private:
Data data;
public:
Data& getData(void) { return data; }
const Data& getData(void) const { return data; }
};
The two getData methods are the interesting part. It looks a bit overdone, but otherwise, the following code-fragments wouldn't work:
const DataClass data1;
int x = data1.getData().getX(); // everything is const here ==> safe!
data1.getData().setX(5); // won't work, since result of getData() is const
                         // and setX() is non-const.
DataClass data2;
data2.getData().setX(5); // does work: because data2 is non-const, the non-const
                         // version of getData() is taken, which also
                         // returns non-const.
Please have a look at this. It may look a bit weird at first, but this is when const shows its importance: You can return something different in case of const.
There is one thing that I didn't mention and that is why a const object prefers the const overload and a non-const object prefers the non-const overload. This is explained in the next section, Twice the same method? one step further.
As explained in the previous section, Twice the same
method?, different overloads can be chosen, depending on the const-ness of
the object itself. How that is done is the topic of this section.
At the end of this section, two (gcc 2.95.2) errors will be explained:
"passing `const String' as `this' argument of `String::operator char *()' discards qualifiers"
and
"choosing `String::operator char *()' over `String::operator const char *() const'
for conversion from `String' to `const char *'
because conversion sequence for the argument is better"
Let's take a look at this example:
class String
{
public:
String(const char* other = NULL);
String(const String& other);
operator char*(void);
operator const char*(void) const;
};
The advantage here is that the String::operator const char*() may be optimized, because it is only used when the buffer is read-only!
String string;
const String const_string;
char *pstr;
const char *const_pstr;
pstr = string; // should call operator char*()
const_pstr = const_string; // should call operator const char*()
The reason that this works is that the string object is passed as the this pointer to the method as an implicit argument. If you wrote this argument explicitly (this is just for explanation, it's not valid C++!!), you would get something like this:
'real' C++ | Showing the implicit this pointer |
---|---|
operator char*(void) | operator char*(String* this) |
operator const char*(void) const | operator const char*(const String* this) |
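Which operator is picked can be observed with a small sketch. This String is a hypothetical, stripped-down version of the class above that records the last operator called; it also includes one of the mixed assignments discussed next.

```cpp
#include <cassert>

// Hypothetical String that records which conversion operator ran last.
class String {
public:
    String() : lastCall(0) { buffer[0] = '\0'; }
    operator char *()             { lastCall = 1; return buffer; }
    operator const char *() const { lastCall = 2; return buffer; }
    mutable int lastCall;   // mutable so the const operator can record itself
private:
    char buffer[16];
};

void demoConversions() {
    String string;
    const String const_string;

    char *pstr = string;                      // this is a String*
    assert(string.lastCall == 1);

    const char *const_pstr = const_string;    // this is a const String*
    assert(const_string.lastCall == 2);

    string.lastCall = 0;
    const_pstr = string;    // non-const object: operator char*() still wins
    assert(string.lastCall == 1);

    (void)pstr;
    (void)const_pstr;
}
```

In the last assignment the match on the implicit this argument decides, not the type of the result, which is the situation the second gcc message describes.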
What about the other possible assignments, I hear you ask? They are:
pstr = const_string;
const_pstr = string;
const_pstr = (char*)string; // good
const_pstr = (const char*)string; // not good
If you think this looks weird, the gcc 2.95.2 compiler agrees, so it issues a warning for this case:
class A
{
private:
const int maxsize;
int *array;
int size;
public:
A(int, int*);
};
A::A(int _size, int* _array)
: maxsize(100) //initializer
{
array = _array; //could also be done in initializer
size = min(_size, maxsize); //could also be done in initializer
}
A::maxsize cannot be set in the constructor-body, because it's const and you cannot assign to a const member. A const can only be initialized once and never assigned to. Compare this to normal C:
const int x = 0; //initialization is allowed
x = 10; //assignment isn't
A baseclass constructor is also called in the initializer (you can put entire expressions in the initializer):
class B : public A
{
private:
const char *name;
public:
B(void);
};
B::B(void)
: A(50, new int[50]), name("B-class") //initializer with expressions
{
//empty body
}
Virtual methods are the way C++ implements polymorphism. But when is which
function called? For that it is convenient to know how it is implemented at a
lower level. This helps to get rid of the mystic "The function is determined
at runtime".
Before we know which function is called, C++ tries to match its actual
arguments to the function signatures that are defined. This is called
argument matching and is described in
Implicit conversion rules of C++. Also const
and non-const functions play a part in that, see
Dangers of overloading and overriding and
Twice the same method?. And of course, we
always have to be aware of Slicing, which can
occur if you copy a subclass.
For calling an ordinary function, you only need the code-pointer where the
code is. Let's say we have the following piece of code:
void f(int) {}
f(3);
Then the function-call results in the following assembly code:
push 3 ;put the argument on the stack
call f ;jump to the function-code
Conclusion: A normal function is just a piece of code. Because a normal function only describes code, it can only be used to describe algorithms.
For a method, it is a bit different, because a method also needs access to
the object on whose behalf it's called. For this, it needs an implicit,
hidden, extra argument: the this-pointer.
Let's say we have the following piece of code:
class A
{
private:
int a;
public:
void f(int i) { a = i; }
};
A a;
a.f(3);
This method-call results in the following assembly code:
push &a ;arg1: a pointer to the calling object, the this-pointer
push 3 ;arg2: the explicit argument
call f ;jump to the method code
And this is how the function is translated into assembly code:
mov *(arg1 + 1), arg2
return
arg1 is the implicit this-pointer. 1 is the offset of the
first member variable, a. This instruction thus moves the value of
arg2 (3) into where the variable a is in the calling object.
Conclusion: A method is a function that has access to the data of the object on whose behalf it's called. Because a member function is associated with data, it can be used for a variety of things: algorithms (with or without state), access to member variables.
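The hidden argument can also be written out in C++ itself. In this sketch A_f and its parameter name self are made up for the illustration (this is reserved for the compiler), and member a is public only so that the asserts can read it.

```cpp
#include <cassert>

struct A {
    int a;
    void f(int i) { a = i; }               // 'real' C++ method
};

// The same operation as a free function with the this-pointer written out.
void A_f(A *self, int i) { self->a = i; }

void demoThisPointer() {
    A x, y;
    x.f(3);          // the compiler passes &x implicitly
    A_f(&y, 3);      // same effect with the object passed explicitly
    assert(x.a == 3 && y.a == 3);
}
```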
If we now have the case of virtual methods, the called method is only known
at run-time. See for example the next piece of code:
class A
{
private:
int a;
public:
virtual void f(int);
virtual void g(int);
};
class B : public A
{
public:
virtual void f(int); //this is an override
virtual void h(int);
};
A a; //object of type A
B b; //object of type B
A *pa; //pointer to object of type A (its static type)
pa = &a; //pointer is now pointing to object of type A (its current dynamic type)
pa->f(3); //call f(3) on behalf of object a to which pa is pointing
pa = &b; //pointer is now pointing to object of type B (its current dynamic type)
pa->f(3); //call f(3) on behalf of object b to which pa is pointing
First pa is made to point to object a and the function A::f(int)
is called, because pa is pointing to an object of type A.
Later, pa is made to point to object b and the function B::f(int)
is called, because pa is pointing to an object of type B.
How is this accomplished?
This is done by means of a Virtual Method Table (VMT). This VMT is a
table that is defined with each class that has virtual methods.
The last four statements are translated into the following assembly code
(Note that the compiler knows that f is a virtual method, so the generated
assembly code is different from the code for a non-virtual method-call):
mov pa, &a ;pa is now pointing to object a
push pa ;arg1: a pointer to the calling object, the this-pointer
push 3 ;arg2: the explicit argument
mov VMT, *(pa+0) ;VMT is the pointer to the Virtual Method Table
call *(VMT+0) ;jump to the method code A::f(int)
mov pa, &b ;pa is now pointing to object b
push pa ;arg1: a pointer to the calling object, the this-pointer
push 3 ;arg2: the explicit argument
mov VMT, *(pa+0) ;VMT is the pointer to the Virtual Method Table
call *(VMT+0) ;jump to the method code B::f(int)
Explanation:
pa is the this-pointer of the object for which we want to call
a method. As said before, every object, that has virtual methods, has a pointer
in its data that points to the VMT of the class. This pointer is stored as the
first field in the object's data (i.e. the data to which the this-pointer
is pointing). Therefore, *(pa+0) is the value
of this pointer to the VMT. See also the figure below.
This VMT is used to get a pointer to the function that should be called.
Note that the code to call A::f(int) and B::f(int) is the same!
The only difference is that a different VMT is retrieved, because pa is
pointing to a different object. Therefore, the actual function called is
determined by the object pa is pointing to at run-time !
About the figure: The rectangular boxes are associated with classes and the
rounded boxes are associated with objects. The VMTs contain pointers to the
defined virtual methods for that class. VMTA points to
A::f(int) and A::g(int). VMTB points to
B::f(int), because this is an override, B::h(int), which is a
new method and also A::g(int), because it is not overridden and
inherited from class A.
a and b are instances of the classes A and B and
thus both contain an instance of member variable a. They also contain
a pointer to the proper VMT.
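The mechanism can be imitated by hand with function pointers. This is only a sketch of the idea under simplifying assumptions: ObjA, ObjB, Method and callF are invented names, and a real compiler is free to lay out its tables differently.

```cpp
#include <cassert>

struct ObjA;                                   // forward declaration
typedef int (*Method)(ObjA *self, int arg);    // one slot in a VMT

struct ObjA {
    const Method *vmt;   // first field: pointer to the class's VMT
    int a;
};
struct ObjB { ObjA base; };   // B's memory starts with its A part

int A_f(ObjA *self, int i) { self->a = i;     return 'A'; }
int A_g(ObjA *,     int)   {                  return 'g'; }
int B_f(ObjA *self, int i) { self->a = i * 2; return 'B'; }   // the override
int B_h(ObjA *,     int)   {                  return 'h'; }   // new in B

const Method VMT_A[] = { A_f, A_g };        // slot 0: f, slot 1: g
const Method VMT_B[] = { B_f, A_g, B_h };   // f replaced, g inherited, h added

// The same code for every object: fetch the VMT, call slot 0.
int callF(ObjA *pa, int arg) { return pa->vmt[0](pa, arg); }

void demoVmt() {
    ObjA a = { VMT_A, 0 };
    ObjB b = { { VMT_B, 0 } };
    assert(callF(&a, 3) == 'A' && a.a == 3);
    assert(callF(&b.base, 3) == 'B' && b.base.a == 6);
    assert(b.base.vmt[1] == a.vmt[1]);      // g is shared, not overridden
}
```

The return values ('A', 'B', ...) exist only so the asserts can tell which function ran.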
Conclusion: A virtual method can be used to separate the interface and the
implementation: the interface is defined by the base-class (this method
can even be abstract) and the implementation is defined by the subclass. The
consequence is that the implementation can be different for every subclass.
Why would we want this?
One reason is because we want to have different implementations for an
algorithm (called the Strategy Pattern). Another reason is because the
algorithm needs information that only the subclass has (this is called the
Template Method).
This VMT-pointer is basically what is known as the dynamic type and Run-Time Type Information (RTTI) can also be based on it: If we have an object, we can compare its VMT-pointer to all the known VMTs and the one that matches, is of the current dynamic type.
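In standard C++ the dynamic type can be queried with typeid and dynamic_cast. The sketch below uses two throw-away classes; whether a given compiler implements these operators through the VMT-pointer is an implementation detail that the code does not rely on.

```cpp
#include <cassert>
#include <typeinfo>

struct A { virtual ~A() {} };   // at least one virtual method is required
struct B : A {};

void demoDynamicType() {
    A a;
    B b;

    A *pa = &a;                           // static type A*, dynamic type A
    assert(typeid(*pa) == typeid(A));
    assert(dynamic_cast<B *>(pa) == 0);   // not a B, so the cast yields null

    pa = &b;                              // static type unchanged, dynamic type B
    assert(typeid(*pa) == typeid(B));
    assert(dynamic_cast<B *>(pa) == &b);
}
```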
C++ uses the keyword virtual to declare both a virtual function and its
overrides in sub-classes. Non-virtual functions are the default and don't require
special keywords.
Java only knows virtual functions (as far as I know), so all functions are
virtual and don't require any special keywords.
C# has some more keywords. Like in C++, the default is non-virtual. To declare
a function as virtual, use the keyword virtual. To declare an override
for a virtual function, use the keyword override. To declare an override
for a non-virtual function, use the keyword new (which is unrelated
to the memory allocation operator).