Professional .NET Framework 2.0
by Joe Duffy
April 2006, Paperback

Ordinary Calls (call)

You might have already guessed the primary native-code difference between an ordinary call and a virtual call from descriptions elsewhere. Simply put, a virtual call consults the method-table of the object on which the method is being invoked to find the method-table slot to call through, while an ordinary call uses the token supplied at the call-site to resolve the method-table slot address at JIT compile time. Slot offsets for virtual calls are also fixed at JIT time, so virtual calls remain quite fast. Method-tables are laid out so that an overriding method occupies the same slot as the base-class virtual method it overrides, which ensures that the slot index for a particular method doesn't depend on the runtime type.

Normal method calls (i.e., the IL call instruction, or callvirts to non-virtual methods) are very fast. The JIT Compiler is able to burn the precise address of the target method-table slot at the call-site because it knows the location at compile time.

Let's consider an example:

int ff = f.f("Hi", 10, 10);
int bf = b.f("Hi", 10, 10);

In this case, we're calling the method f as defined on Foo. Although we use the b variable in the second line to make the call, f is non-virtual and thus the call always goes through Foo's definition. The jitted native code for both (in this example, IA-32 code) will be nearly identical:

mov   ecx,esi 
mov   edx,dword ptr ds:[01B4303Ch] 
push  0Ah  
push  0Ah  

Remember, the first two arguments are passed in ECX and EDX, respectively. Our this pointer (constructed above with the Foo f = new Foo() C# code) resides in ESI, and thus we simply mov it into ECX. Then we move the pointer to the string "Hi" into EDX; the exact address will clearly differ from program to program. Since we are passing two additional arguments beyond the two that are stored in registers, we pass them on the machine's stack; 0Ah is hexadecimal for the integer 10, so we push it twice, once for each argument.

Lastly, we make a call to a statically known address. This address refers to the method-table slot, in this case Foo::f's, and is discovered at JIT compile time by matching the supplied method token with the internal CLR method-table data structure:

call FFFC0D28

The second call — through the b variable — differs only in that it passes b's value in the ECX register. The target address of the call is the same:

mov   ecx,edi 
mov   edx,dword ptr ds:[01B4303Ch] 
push  0Ah  
push  0Ah  
call  FFFC0D28 

After performing the call to FFFC0D28 in this example, the JIT stub will either jmp straight to the jitted code or invoke the JIT compiler (with a call) if the method's code has not yet been compiled.
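
The excerpt never shows the declarations behind f and b, so the following is only a minimal sketch of what they might look like. It assumes a hypothetical Bar class deriving from Foo; the method body and the OrdinaryCallSketch wrapper are invented for illustration. Because f is non-virtual, both calls bind to Foo's definition regardless of which variable is used:

```csharp
using System;

// Hypothetical declarations: the article does not show Foo or Bar,
// so this body is an assumption made for illustration only.
public class Foo
{
    // Non-virtual: the call-site is bound to Foo::f at JIT compile time.
    public int f(string s, int x, int y) { return s.Length + x + y; }
}

// Bar inherits f unchanged; there is nothing to override.
public class Bar : Foo { }

public static class OrdinaryCallSketch
{
    public static int[] Run()
    {
        Foo f = new Foo();
        Bar b = new Bar();
        int ff = f.f("Hi", 10, 10); // this = f
        int bf = b.f("Hi", 10, 10); // this = b, same target method
        return new int[] { ff, bf };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine("{0} {1}", r[0], r[1]); // prints "22 22"
    }
}
```

Only the this argument differs between the two calls, which mirrors the single differing mov in the jitted code above.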

Virtual Method Calls (callvirt)

A virtual method call is very much like an ordinary call, except that it must look up the target of the call at runtime based on the this object. For example, consider this code:

int fg = f.g("Hi", 10, 10);
int bg = b.g("Hi", 10, 10);

The manner in which the this pointer and its arguments are passed is identical to the call example above. ESI is moved into ECX for the dispatch on f and EDI is moved into ECX for the dispatch on b. The difference is that the call target can't be burned into the call-site. Instead, we indirectly go through the method-table to get at the address:

mov   eax,dword ptr [ecx] 
call  dword ptr [eax+38h]

We first dereference ECX, which holds the this pointer, and store the result, the address of the object's method-table, in EAX. The call instruction then adds 38h to the address in EAX to locate the correct slot in that method-table and calls the address stored in the slot. Because the table's address was obtained through the this pointer, f and b lead to different method-tables, and thus the call through b ends up going through its overridden version. Remember, we stated above that all classes in a hierarchy use the same offsets for methods, meaning that this same offset can be used for all derived classes.

The full IA-32 for this calling sequence (using the f variable) is:

mov   ecx,esi 
mov   edx,dword ptr ds:[01B4303Ch] 
push  0Ah  
push  0Ah  
mov   eax,dword ptr [ecx] 
call  dword ptr [eax+38h]

Again, the only difference when b is used is that EDI, instead of ESI, is moved into ECX.
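
To make the slot lookup concrete, here is a toy model of the mechanism written in C#. It is not the CLR's actual data structure: the Slot delegate, the Obj class, and the SlotG index (standing in for the 38h offset) are all assumptions chosen for illustration. The point it tries to show is that a derived table reuses the base table's slot index, so one call-site works for every type in the hierarchy:

```csharp
using System;

// Toy model only; names and layout are hypothetical, not CLR internals.
public delegate int Slot(object self, string s, int x, int y);

public static class MethodTableSketch
{
    // Stands in for the fixed offset (38h in the article's example).
    public const int SlotG = 0;

    static int FooG(object self, string s, int x, int y) { return x + y; }
    static int BarG(object self, string s, int x, int y) { return x * y; }

    // Foo's table holds Foo's implementation of g.
    public static readonly Slot[] FooTable = { FooG };

    // Bar's table starts as a copy of Foo's and overwrites the same slot.
    public static readonly Slot[] BarTable = BuildBarTable();

    static Slot[] BuildBarTable()
    {
        Slot[] table = (Slot[])FooTable.Clone();
        table[SlotG] = BarG;
        return table;
    }

    // An "object" here is just a reference to its method table.
    public sealed class Obj
    {
        public readonly Slot[] Table;
        public Obj(Slot[] table) { Table = table; }
    }

    // One call-site serves both types.
    public static int CallG(Obj self, string s, int x, int y)
    {
        Slot[] table = self.Table;          // like: mov eax,dword ptr [ecx]
        return table[SlotG](self, s, x, y); // like: call dword ptr [eax+38h]
    }

    public static void Main()
    {
        Console.WriteLine(CallG(new Obj(FooTable), "Hi", 10, 10)); // prints 20
        Console.WriteLine(CallG(new Obj(BarTable), "Hi", 10, 10)); // prints 100
    }
}
```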

Indirect Method Calls (calli)

C# doesn't supply a mechanism with which to emit a calli instruction in the IL. You can, of course, emit code using reflection, but an example would introduce more complexity than necessary. If you were to imagine that a calli sequence were being JIT compiled, the only difference introduced would be that the native call instruction would perform a call dword ptr [exx], where exx is the register in which the target address of the calli was found. That is, it calls the address to which the indirect pointer refers. All of the arguments would be passed in accordance with the signature token supplied to the calli instruction.
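
For readers who want to see the reflection route anyway, the following is a minimal sketch using System.Reflection.Emit. The CalliSketch class, its Add method, and the BinaryOp delegate are hypothetical names introduced here. The emitted IL pushes the two arguments, loads the address of Add with ldftn, and then performs the calli:

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public delegate int BinaryOp(int a, int b);

public static class CalliSketch
{
    // Hypothetical target for the indirect call.
    public static int Add(int a, int b) { return a + b; }

    public static BinaryOp BuildCaller()
    {
        MethodInfo target = typeof(CalliSketch).GetMethod("Add");
        Type[] paramTypes = new Type[] { typeof(int), typeof(int) };

        DynamicMethod dm = new DynamicMethod(
            "CallAddIndirect", typeof(int), paramTypes,
            typeof(CalliSketch).Module);

        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);       // first argument
        il.Emit(OpCodes.Ldarg_1);       // second argument
        il.Emit(OpCodes.Ldftn, target); // function pointer goes on top
        il.EmitCalli(OpCodes.Calli, CallingConventions.Standard,
                     typeof(int), paramTypes, null);
        il.Emit(OpCodes.Ret);

        return (BinaryOp)dm.CreateDelegate(typeof(BinaryOp));
    }

    public static void Main()
    {
        BinaryOp caller = BuildCaller();
        Console.WriteLine(caller(3, 4)); // prints 7
    }
}
```

The return type and parameter types passed to EmitCalli describe the call-site signature, which is what tells the JIT how to lay out the arguments.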

Dynamic Method Calls (Delegates, Others)

There is a range of dynamic method calls available. Many of them are part of the dynamic programming infrastructure supplied by reflection, and thus won't be explored in depth here. They are all variants on the same basic premise, which is that some piece of runtime functionality is able to look up the method-table information at runtime to make a method dispatch. The runtime can then, of course, make calls to this code as requested, based on information supplied by the programmer.

Delegates are an interesting special case of this capability. A delegate is essentially just a strongly typed function pointer type, an instance of which has two pieces of information: the target object (to be passed as this), and the target method token. Each delegate type has a special Invoke method whose signature matches the function over which it has been formed. The CLR supplies the implementation of this method, which enables it to perform lightweight dispatch to the underlying method.

A call to a delegate looks identical to a call to a normal method. The difference is that the target is the delegate's Invoke method-table slot instead of the actual underlying function. Arguments are laid out as with any other type of call (i.e., _fastcall). The implementation of Invoke simply patches the ECX register to contain the target object reference (supplied at delegate construction time) and uses the method token (also supplied at delegate construction time) to jump to the appropriate method-table slot. There is very little overhead in this process, which puts the cost of delegate dispatch at roughly one to two times that of a simple virtual method call.
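
A short sketch of the two pieces of information a delegate carries may help. The Op delegate type and the Counter class are hypothetical, introduced only for illustration. The Target and Method properties expose what was captured at construction time, and calling the delegate is the same as calling its Invoke method:

```csharp
using System;

// Hypothetical delegate type and target class for illustration.
public delegate int Op(string s, int x, int y);

public class Counter
{
    private readonly int bias;
    public Counter(int bias) { this.bias = bias; }
    public int Apply(string s, int x, int y) { return bias + s.Length + x + y; }
}

public static class DelegateSketch
{
    public static int[] Run()
    {
        Counter c = new Counter(100);
        Op op = new Op(c.Apply); // captures the target object and the method

        // The captured target is the very object we constructed above.
        if (!object.ReferenceEquals(op.Target, c))
            throw new InvalidOperationException("unexpected target");

        int viaCallSyntax = op("Hi", 10, 10);    // compiles to a call to Invoke
        int viaInvoke = op.Invoke("Hi", 10, 10); // the same call, spelled out
        return new int[] { viaCallSyntax, viaInvoke };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine("{0} {1}", r[0], r[1]); // prints "122 122"
    }
}
```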

The various other styles of method dispatch, such as Type.InvokeMember, MethodInfo.Invoke, and so forth, all add a certain level of overhead when compared to delegates, because they must go through the process of binding to the target method. This is the process of matching dynamic type, method name, and argument information to the list of known loaded types. Delegates typically don't suffer this penalty because the target method token is embedded in the IL. You may also dynamically construct and invoke delegates (e.g., with DynamicInvoke), which adds a comparable level of overhead for the construction and binding process. Another penalty associated with pure dynamic invocation is that these mechanisms tend to pass arguments as object[] arrays. The dispatching code inside the CLR must therefore unravel the array, transform its contents into the appropriate calling convention to perform the invocation, and then perform the necessary marshaling on the return.
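
The late-bound styles mentioned above can be compared side by side. This is a sketch only: the Target class and LateOp delegate are hypothetical, and no timing is attempted. Note that each route takes its arguments as an object[] and returns a boxed object that must be cast back:

```csharp
using System;
using System.Reflection;

// Hypothetical types for illustration.
public delegate int LateOp(string s, int x, int y);

public class Target
{
    public int f(string s, int x, int y) { return s.Length + x + y; }
}

public static class LateBoundSketch
{
    public static int[] Run()
    {
        Target t = new Target();
        object[] args = new object[] { "Hi", 10, 10 };

        // Look the method up by name, then invoke it.
        MethodInfo mi = typeof(Target).GetMethod("f");
        int viaMethodInfo = (int)mi.Invoke(t, args);

        // Bind and invoke in one step.
        int viaInvokeMember = (int)typeof(Target).InvokeMember(
            "f",
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null, t, args);

        // Construct a delegate at runtime and invoke it late-bound.
        Delegate d = Delegate.CreateDelegate(typeof(LateOp), t, mi);
        int viaDynamicInvoke = (int)d.DynamicInvoke(args);

        return new int[] { viaMethodInfo, viaInvokeMember, viaDynamicInvoke };
    }

    public static void Main()
    {
        int[] r = Run();
        Console.WriteLine("{0} {1} {2}", r[0], r[1], r[2]); // prints "22 22 22"
    }
}
```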

Wrapping Up

This was a very brief overview of something that is incredibly deep. More details, including the performance characteristics, and how you can play around with some of these implementation details through spelunking in the Visual Studio debugger, are outlined in this MSDN video.

This article is adapted from Professional .NET Framework 2.0 by Joe Duffy (Wrox, 2006, ISBN: 0-7645-7135-4), from chapter 3 "Inside the CLR." Joe is a Program Manager on the CLR Team at Microsoft, where he works on WinFX and the .NET Framework. Joe's other recent related article at Wrox.com is Common Type System (CTS): One Platform to Rule Them All.