Managed Extensions for C++

Managed Extensions for C++ let you create .NET code with C++ and reuse existing code with .NET. Richard Grimes explains how.

Author: Richard Grimes

Last updated: Nov 2001

There is little doubt that the .NET Framework represents a wonderful technology and does live up to most of the claims from Microsoft. However, it’s not going to succeed in the real world unless it can work with existing native code. Thankfully, Microsoft has taken this into consideration, particularly if the native code in question is C++. 
     This is because the new C++ compiler has a split personality. It allows you to compile ISO compliant C++, with better compliance than any C++ compiler so far produced by Microsoft. The same compiler allows you to compile code that will run under the .NET runtime. Such code is called ‘managed’ code, because it is designed to be managed by the .NET runtime. Traditional code, written for compilation directly into machine code, is called ‘unmanaged’ code. A version of the new compiler is given away free as part of the .NET SDK.
     The .NET runtime has two features that are particularly significant: its ability to just-in-time (JIT) compile and execute Microsoft intermediate language (IL), and its garbage collector (GC), which tracks object usage and automatically releases an object’s memory when it is no longer required. The new Microsoft C++ compiler allows you to write C++ code that will be compiled to IL, and to create new data types that will be allocated on the GC managed heap.
     The interesting point is that you can combine native code that uses unmanaged data with managed code in the same source file. This means that you can use your existing C++ code in .NET applications. The C++ compiler with Managed Extensions is the only .NET compiler that allows you to mix native and IL code in this way.

Reference Types
The Managed Extensions extend the C++ language with the keywords shown in the table ‘New C++ keywords’, and these keywords will only compile if you use the /clr (Common Language Runtime) switch. Note that the code you write using the Managed Extensions for C++ is .NET code, and so the object model follows the rules defined by the .NET Framework. This means only single implementation inheritance is allowed, although multiple interface inheritance is supported.
     In addition to these new keywords and the new compiler switch, the Managed Extensions add two new pragmas, managed and unmanaged. When you compile code using the /clr switch, all code will be compiled to IL regardless of whether the classes are managed or not. The exceptions are methods that contain code that cannot be converted to IL, and code for which you use the new pragmas to indicate that you do not want IL. A method cannot be compiled to IL if it contains __asm statements, uses the varargs API, or calls setjmp/longjmp.
     This code shows the new pragmas in action:

void DoSomething()
{
  // all code is compiled to IL
  // data created with new is
  // managed
}

#pragma unmanaged
void SomethingElse()
{
  // all code is compiled to 
  // native code
  // no data is managed
}
#pragma managed

Under .NET, code is compiled into assemblies and scoped by namespaces. Assemblies are the unit of deployment in .NET and can be contained in an EXE, but more usually in one or more DLLs. All types in .NET are completely described in metadata, and metadata is possibly the most important part of .NET because it is through metadata that the runtime can enforce type safety.
     If you want to use a class in another assembly then you have to import the metadata from that assembly. This is done in C++ with the #using directive. The core of the .NET base class library is contained in a DLL called mscorlib.dll, and so every C++ source file under .NET should include the following:

#using <mscorlib.dll>
using namespace System;

The ‘using namespace’ line is not strictly required; it merely allows you to use the types in the System namespace without having to explicitly scope them.
     Shown below is a simple example of a .NET class. The first point to note is the strange use of the public keyword on the class. This indicates that the class is accessible outside of the current assembly. You can regard this as being similar to using __declspec(dllexport) on a function in unmanaged code to allow the function to be called outside of the DLL that houses it.

public __gc class MyType{};
public __gc class CMyClass
{
  int m_x;
  MyType* m_t;
public protected:
  CMyClass() : m_x(0), m_t(new MyType){}
  ~CMyClass() {/* release other resources */}
  __property int get_X(){return m_x;}
  __property void set_X(int value){m_x = value;}
  void Dispose(){/* release scarce resources */}
};

     The class is marked with the __gc keyword, which indicates that instances of the class must be created with the .NET new operator and are created on the managed heap. This class has an implicit base class of Object in the System namespace. All instances of this class are maintained by the garbage collector, and so you never need to call delete.
     The next point to note is the use of the public protected access specifier on the class members. The complete range of access specifiers is shown in the panel ‘Class member access specifiers’. In this example, the access specifier states that the members in CMyClass can be accessed by any code in the current assembly, and by code in external assemblies only if it is in classes derived from CMyClass.
     Strictly speaking .NET classes do not have destructors because you are no longer responsible for destroying them. This is the responsibility of the GC which determines when an instance is no longer being referenced and then, at some time later (potentially a lot later), frees the memory used by the instance. Contrast this to native C++ classes where you either explicitly destroy an object by calling delete, or create the object on the stack frame and allow the scope of the stack frame to determine the lifetime of the object. In both cases the developer determines the lifetime of the object and hence when the destructor will be called. 
     In the .NET case the developer has little control over when the object’s memory is freed. Microsoft recommends that developers use the Dispose pattern, demonstrated by the Dispose() method of CMyClass above. That is, your class implements a method that the calling code calls explicitly when the object is no longer needed. This Dispose method releases any resources that the object holds:

CMyClass* c = new CMyClass;
// use the object
// object is not needed, so release
// its resources
c->Dispose();
c = NULL;

Here, I call Dispose() to immediately release any resources that CMyClass holds. Although I have created the object with new, this is the managed version of the new operator so I do not need to call delete. Instead, I use the trick of assigning NULL to the pointer when I no longer want to use the object. This removes my reference to the object, and once no references remain the GC will treat the object as a candidate for release. This may hasten the freeing of its memory, but I still don’t determine when this will happen.
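For readers more at home in standard C++, the idea behind Dispose can be sketched without the Managed Extensions. This is only an analogy and not managed code: the class ScarceResource, its members and the helper UseAndDispose() are hypothetical names invented for illustration, and the destructor here acts as a simple safety net rather than as a .NET finalizer.

```cpp
#include <cassert>

// Hypothetical class, used only to illustrate the pattern in standard C++.
class ScarceResource
{
  bool m_open;      // true while the (imaginary) resource is held
  int  m_releases;  // counts how many times the resource was really released
public:
  ScarceResource() : m_open(true), m_releases(0) {}

  // Explicit, caller-driven clean-up. Safe to call more than once.
  void Dispose()
  {
    if (m_open)
    {
      m_open = false;
      ++m_releases;
    }
  }

  // Safety net: release the resource if the caller forgot to.
  ~ScarceResource() { Dispose(); }

  bool IsOpen() const { return m_open; }
  int ReleaseCount() const { return m_releases; }
};

// Example use: dispose explicitly; a repeated call is harmless.
inline int UseAndDispose()
{
  ScarceResource r;
  r.Dispose();
  r.Dispose();            // second call does nothing
  assert(!r.IsOpen());
  return r.ReleaseCount();
}
```

The design choice worth noting is that Dispose() is idempotent, so it does not matter whether the caller, the safety net, or both end up calling it.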
     If you have a resource that you don’t need to release immediately then you can use the Finalize pattern. In this case you give your class a destructor and the C++ compiler will generate a Finalize method that will call the code in the destructor. When the .NET runtime sees that an object has a Finalize method, it will delay the freeing of the object’s memory until the GC’s finalize thread has called the object’s Finalize method.
     This technique has the potential to keep your object alive far longer than you would expect, and for this reason Microsoft discourages you from using it. Note that if you have a destructor, and you explicitly call delete on an object pointer, then the destructor will be called and the .NET runtime will be told not to schedule the Finalize method to be called by the finalize thread. However, delete does not free the object, so the object pointer will be valid but the object that it points to will no longer have allocated resources.
     The CMyClass class also demonstrates how to declare a property. This property is read-write, but you can make your properties read-only or write-only by implementing only the get_ or set_ method. The name of this property is X and it allows you to treat this as if it is a data member in the class from a language point of view. However, note that properties do not represent storage, so they can be members of interfaces and they are not called when an object is serialized (only data members, which under .NET are called ‘fields’, are serialized).

Value Types
An instance of a __gc type will always be created on the GC managed heap. Create an array of 10,000 instances, for example, and the array will contain 10,000 pointers each pointing to one of 10,000 objects on the GC managed heap. On current systems a pointer takes up 32 bits, so if the object is not significantly larger than 32 bits this represents an inefficient use of memory. In order to overcome this, the .NET Framework defines value types. An instance of a value type is created inline, and all the primitive data types, such as integers, floating point or Boolean, are value types. For example:

double pi = 3.1415;

In this code the variable pi is created on the stack and will exist as long as the stack frame.
     The CMyClass class shown earlier has a value type (m_x is a 32-bit integer) and a reference type (m_t is a pointer to an instance of MyType) as data members. The amount of memory taken up by an instance of CMyClass includes the storage for m_x and the pointer m_t. In other words, because m_x is a value type it lives on the managed heap as part of the CMyClass object. The MyType object, on the other hand, lives elsewhere on the managed heap and is referred to by the m_t pointer. In general, value types are referred to directly, whereas reference types are always referenced through pointers. 
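The memory argument can be made concrete with a rough standard C++ analogy. The sketch below is not managed code and the names Point, InlineBytes() and IndirectBytes() are hypothetical; it also ignores allocator overhead and the per-object header that the .NET runtime adds, so the real cost of the indirect layout would be higher than this estimate.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical small type, about the size of one integer.
struct Point { int x; };

// Values stored inline: the array itself holds the data.
inline std::size_t InlineBytes(std::size_t n)
{
  std::vector<Point> values(n);
  return values.size() * sizeof(Point);
}

// Values stored by reference: one pointer per item in the array,
// plus a separately allocated object for each item.
inline std::size_t IndirectBytes(std::size_t n)
{
  std::vector<std::unique_ptr<Point>> refs;
  refs.reserve(n);
  for (std::size_t i = 0; i < n; ++i)
    refs.push_back(std::unique_ptr<Point>(new Point()));
  return refs.size() * (sizeof(std::unique_ptr<Point>) + sizeof(Point));
}
```

For 10,000 items, IndirectBytes() is always larger than InlineBytes() because every item pays for a pointer as well as for its own data.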
     Value types are implicitly derived from the class ValueType and are implicitly __sealed. This means that you cannot derive from a value type, but you can call the methods of ValueType. The following code accesses the property X on a CMyClass object:

c->X = 42;
Console::WriteLine(S"The value is {0}", __box(c->X));

In the first line I assign the property a value, and thus set_X() is called. In the second line I retrieve the property, so get_X() is called. However, get_X() returns an int, which is a value type. The static method WriteLine() of the Console class is overloaded, but does not have a version that takes a string and an int. Instead, the nearest match is one that takes a string and an Object*. Object is the base class of reference types, so this implies there must be a mechanism to convert a value type to a reference type. This is the purpose of the __box() operator.
     The __box() operator performs a task known as ‘boxing’, where the runtime creates an object and copies into it the value of the value type parameter. The format string in WriteLine() indicates that it will format a string into the placeholder “{0}”, so WriteLine() calls ToString() on the boxed value to get the string value of the item.
     Boxing involves creating a ‘hidden’ object, and because this inevitably has an effect on the GC managed heap, managed C++ requires that you explicitly specify this is what you want (other languages such as C# do this automatically). The opposite process, unboxing, does not consume extra memory because you are accessing a value within an existing object, so this happens automatically in C++, as it does in other .NET languages.
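The mechanics of boxing can be sketched in standard C++. This is an analogy only, not what the runtime actually does: Object, BoxedInt, Box() and Unbox() are hypothetical names standing in for System::Object and the compiler-generated box, and the sketch handles int alone.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for System::Object: anything that can describe itself.
struct Object
{
  virtual ~Object() {}
  virtual std::string ToString() const = 0;
};

// A 'box' is a heap object that holds a copy of a value.
struct BoxedInt : Object
{
  int value;
  explicit BoxedInt(int v) : value(v) {}
  std::string ToString() const override { return std::to_string(value); }
};

// Boxing: allocate a new object and copy the value into it.
inline std::unique_ptr<Object> Box(int v)
{
  return std::unique_ptr<Object>(new BoxedInt(v));
}

// Unboxing: read the value back out of an existing box; nothing is allocated.
inline int Unbox(const Object& o)
{
  const BoxedInt* b = dynamic_cast<const BoxedInt*>(&o);
  assert(b != nullptr);  // the sketch assumes the object really is a boxed int
  return b->value;
}
```

Because the box holds a copy, changing the original variable after calling Box() leaves the boxed value untouched, which is why boxing costs an allocation while unboxing does not.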
     Notice that, because an int is a value type (specifically the type Int32), you are allowed to call the ValueType methods on it. For example, the following two lines of code will compile:

String* s1 = (c->X).ToString();
String* s2 = (42).ToString();

Also note that the format string has a prefix of ‘S’. This indicates that the string is a managed string (of type String). If you miss out this prefix the compiler will automatically generate code to convert the unmanaged string to a managed string.

Program Entry Point
Every executable must have an entry point. In C++ this is either main() or WinMain(). The .NET Framework allows global methods, but of all the .NET languages only Managed C++ lets you declare them; the other languages only allow methods to be declared as part of a class. You access a global method from outside the assembly where it is defined by using an instance of the Module class for the module in which it is defined.
     When the compiler sees a main() or WinMain() method in your code it generates code to initialise the CRT and makes the method the entry point for the assembly. By default the C++ compiler calls the linker with the /SUBSYSTEM:CONSOLE switch, so if you want a GUI application then you have to specify that the linker is called with the /SUBSYSTEM:WINDOWS switch. For example, here is a managed application that prints a message to the console:

// compile with /clr
#using <mscorlib.dll>
using namespace System;

int main()
{
  Console::WriteLine("Hello DNJ!");
  return 0;
}

And here is some code that creates a message box that displays the message:

// compile with /clr /link /SUBSYSTEM:WINDOWS
#using <mscorlib.dll>
#using <System.Windows.Forms.dll>
using namespace System;
using namespace System::Windows::Forms;

extern "C" int __stdcall WinMain(void*, void*, char*, int)
{
  MessageBox::Show("Hello DNJ!", S"Test");
  return 0;
}

Arrays
The other main area where you’ll notice a big difference is arrays. Managed C++ arrays are instances of the System::Array class, but you only get a subset of its facilities. For example, System::Array allows you to specify the lower bound, but managed C++ arrays always have a lower bound of 0.
     To declare and access an array you use square brackets:

Int32 a[,] = new Int32[3, 2];
for (int x = 0; x < 3; x++)
{
  for (int y = 0; y < 2; y++)
  {
    a[x, y] = x + y;
  }
}

This looks odd to a C++ programmer. The array a holds exactly six items, arranged as a single rectangular block of three by two. Contrast this with the unmanaged version, where a two dimensional array is in fact a one dimensional array in which each item is itself a one dimensional array.
     The data in this array is created on the managed heap, but you have no guarantee as to whether this memory is contiguous. The runtime will do a bounds check whenever you access an array, which means you cannot use indexes outside of the declared range as you can with unmanaged arrays in C. This prevents many memory access bugs and plugs a potentially dangerous security risk.
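To show what a rectangular, bounds-checked array means in practice, here is a minimal standard C++ sketch. It is not how System::Array is implemented: RectArray, At(), Length() and FillAndSum() are hypothetical names, and storing the items in one contiguous std::vector is a choice made for this sketch, not a guarantee the runtime gives you.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical rectangular array of int with a lower bound of 0 in each dimension.
class RectArray
{
  std::size_t m_rows, m_cols;
  std::vector<int> m_data;  // rows * cols items, zero-initialised
public:
  RectArray(std::size_t rows, std::size_t cols)
    : m_rows(rows), m_cols(cols), m_data(rows * cols, 0) {}

  std::size_t Length() const { return m_data.size(); }

  // Checked access: throws instead of touching memory outside the array.
  int& At(std::size_t x, std::size_t y)
  {
    if (x >= m_rows || y >= m_cols)
      throw std::out_of_range("index outside the declared bounds");
    return m_data[x * m_cols + y];
  }
};

// Fills a 3 by 2 array in the same way as the loop above and returns the sum of its items.
inline int FillAndSum()
{
  RectArray a(3, 2);
  int sum = 0;
  for (std::size_t x = 0; x < 3; x++)
  {
    for (std::size_t y = 0; y < 2; y++)
    {
      a.At(x, y) = static_cast<int>(x + y);
      sum += a.At(x, y);
    }
  }
  assert(a.Length() == 6);
  return sum;
}
```

An access such as a.At(3, 0) throws std::out_of_range in this sketch, which mirrors the way the runtime rejects an index outside the declared range rather than reading or writing neighbouring memory.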
     At the start of this article I talked about reusing code, and I said that the best way to do this in .NET is to use C++. The Managed Extensions for C++ allow you to call native code from IL code, and to use classes that have unmanaged data with code that uses managed data. In order to make this possible, Microsoft has had to extend the C++ language. However, there is no support in Visual Studio .NET for using C++ to write GUI code: the language is simply too complicated for Microsoft to write the parser needed by the Windows Forms Designer, which generates C# and VB.NET code. 
     That said, it is not Microsoft’s intention that you use C++ to write full .NET applications. Instead it makes more sense to use it to write .NET wrappers around your existing native code. Managed Extensions for C++ is the only .NET language that allows you to do this, which makes it a powerful tool in your .NET toolbox.


New C++ keywords

__abstract
Indicates that the class is abstract. You cannot use this class directly; you must derive a class from it.

__box 
Used to convert a value type to an object.

__delegate 
Used to declare a ‘delegate’ (a type safe function pointer).

__event
Declares an event member, which allows you to generate an event through a delegate.

__finally
Used with the C++ ‘try’ keyword to provide a block of code that is executed when the try block is left. Previous versions of Visual C++ only allowed this keyword with a __try block and SEH (structured exception handling).

__gc
Indicates that a type is managed by the garbage collector.

__hook
Used to ‘hook’ an event handler to a class that can generate events.

__identifier
Allows a name that clashes with a C++ keyword (for example, the name of a .NET type or member) to be used as an identifier.

__interface
Used to declare an interface. Use with __gc for a managed interface.

__nogc
Indicates that the type is not managed by the .NET garbage collector.

__pin
‘Pins’ an object in memory (useful when you want to pass a pointer to a managed type to unmanaged code).

__property
Indicates that a method is a get or set method on a property.

__sealed
Indicates that the class cannot be a base class.

__try_cast
Casting operator that throws an exception if the cast fails.

__typeof
Returns a Type object for a class.

__unhook
Used to ‘unhook’ an event handler from an event generation class.

__value
Indicates that a type is managed, but is allocated on the stack and not on the managed heap.


Class member access specifiers

In general, class member access specifiers have two parts. The first part describes the accessibility of the member from code within the same assembly, while the second part describes accessibility from code outside the assembly. Where the two parts are the same, as in ‘public public:’, only one part needs to be used, as in ‘public:’.

Copyright © Matt Publishing. All rights reserved. No part of this site may be reproduced without the prior consent of the copyright holder.