Ordinary Calls (call)
You might have already guessed the primary native code difference between an ordinary call and a virtual call based on descriptions elsewhere. Simply put, a virtual call looks at the method-table of the object against which the method is dispatching to determine, at runtime, the method-table slot address to use for the call. An ordinary call just uses the token supplied at the call-site to determine the method-table slot address at compile time. Slot offsets for virtual calls are still determined statically at JIT time, so virtual calls remain quite fast. The method-table is laid out so that a virtual method and its overrides occupy the same slot throughout a class hierarchy, ensuring the index for a particular method doesn't depend on the runtime type.
Normal method calls (i.e., the IL call instruction, or callvirts to non-virtual methods) are very fast. The JIT compiler is able to burn the precise address of the target method-table slot into the call-site because it knows the location at compile time.
Let's consider an example:
int ff = f.f("Hi", 10, 10);
int bf = b.f("Hi", 10, 10);
In this case, we're calling the method f as defined on Foo. Although we use the b variable in the second line to make the call, f is non-virtual and thus the call always goes through Foo's definition. The jitted native code for both (in this example, IA-32 code) will be nearly identical:
mov ecx,esi
mov edx,dword ptr ds:[01B4303Ch]
push 0Ah
push 0Ah
Remember, the first two arguments are passed in ECX and EDX, respectively. Our this pointer (constructed earlier with the Foo f = new Foo() C# code) resides in ESI, and thus we simply mov it into ECX. Then we move the pointer to the string "Hi" into EDX; the exact address clearly will change based on your program. Since we are passing two additional parameters to the method beyond the two that are stored in registers, we pass them using the machine's stack; 0Ah is hexadecimal for the integer 10, so we push it onto the stack twice (once for each argument).
Lastly, we make a call to a statically known address. This address refers to the method-table slot, in this case Foo::f's, and is discovered at JIT compile time by matching the supplied method token with the internal CLR method-table data structure:
call FFFC0D28
The second call, through the b variable, differs only in that it passes b's value in the ECX register. The target address of the call is the same:
mov ecx,edi
mov edx,dword ptr ds:[01B4303Ch]
push 0Ah
push 0Ah
call FFFC0D28
After performing the call to FFFC0D28 in this example, the JIT stub will either jmp straight to the jitted code or invoke the JIT compiler (with a call) if the method's code has not yet been compiled.
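The article's definitions of Foo and the type behind b are not shown in this section, so the following is only a minimal sketch of what they might look like. The derived class name Bar, the method bodies, and the OrdinaryCallSketch helper are all assumptions made for illustration. It shows the source-level behavior described above: because f is non-virtual, the call through b lands in Foo's definition.

```csharp
using System;

// Hypothetical stand-ins for the article's types; only the
// virtual/non-virtual split is taken from the text.
public class Foo
{
    // Non-virtual: the JIT can bind the call-site to Foo::f's slot directly.
    public int f(string s, int x, int y) { return s.Length + x + y; }

    // Virtual: dispatched through the method-table of the runtime type.
    public virtual int g(string s, int x, int y) { return s.Length + x + y; }
}

// Assumed derived type for the variable b.
public class Bar : Foo
{
    public override int g(string s, int x, int y) { return s.Length * x * y; }
}

public static class OrdinaryCallSketch
{
    // Returns { ff, bf } for the two calls from the article's example.
    public static int[] Run()
    {
        Foo f = new Foo();
        Bar b = new Bar();
        int ff = f.f("Hi", 10, 10);
        int bf = b.f("Hi", 10, 10); // still Foo's f; Bar does not replace it
        return new int[] { ff, bf };
    }
}
```

Both results are identical here because both call-sites resolve to the same method, which mirrors the identical call target in the native code.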
Virtual Method Calls (callvirt)
A virtual method call is very much like an ordinary call, except that it must look up the target of the call at runtime based on the this object. For example, consider this code:
int fg = f.g("Hi", 10, 10);
int bg = b.g("Hi", 10, 10);
The manner in which the this pointer and its arguments are passed is identical to the call example above. ESI is moved into ECX for the dispatch on f, and EDI is moved into ECX for the dispatch on b. The difference is that the call target can't be burned into the call-site. Instead, we indirectly go through the method-table to get at the address:
mov eax,dword ptr [ecx]
call dword ptr [eax+38h]
We first dereference ECX, which holds the this pointer, and store the result in EAX. That result is the address of the object's method-table. Then we add 38h to the value in EAX to get at the correct slot in the method-table. Because this table's address was discovered using the this pointer, we will inspect a different method-table for b. Thus, the call through b will end up going through its overridden version. We then just call the address stored in that slot. Remember, we stated above that all classes in a hierarchy use the same offsets for methods, meaning that this same offset can be used for all derived classes.
The full IA-32 for this calling sequence (using the f variable) is:
mov ecx,esi
mov edx,dword ptr ds:[01B4303Ch]
push 0Ah
push 0Ah
mov eax,dword ptr [ecx]
call dword ptr [eax+38h]
Again, the only difference when b is used is that EDI, instead of ESI, is moved into ECX.
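The two-instruction lookup described in this section can be modeled in managed code. The sketch below is a toy model, not the CLR's actual data structures: MethodTableModel, ObjectModel, Slot, and the slot index SlotOfG are hypothetical names, with the index standing in for the 38h offset. Each object carries a reference to its type's table, and the call-site indexes a fixed slot in whatever table it finds.

```csharp
using System;

// Hypothetical slot signature: the first parameter plays the role of 'this'.
public delegate int Slot(object self, string s, int x, int y);

// Toy model of a per-type method-table: an array of code addresses.
public sealed class MethodTableModel
{
    public readonly Slot[] Slots;
    public MethodTableModel(Slot[] slots) { Slots = slots; }
}

// Toy model of an object: its first field refers to its type's table.
public sealed class ObjectModel
{
    public readonly MethodTableModel MethodTable;
    public ObjectModel(MethodTableModel methodTable) { MethodTable = methodTable; }
}

public static class VirtualCallSketch
{
    // Assumed slot index for g; the same index is used for every type.
    public const int SlotOfG = 0;

    static int FooG(object self, string s, int x, int y) { return s.Length + x + y; }
    static int BarG(object self, string s, int x, int y) { return s.Length * x * y; }

    public static readonly MethodTableModel FooTable =
        new MethodTableModel(new Slot[] { FooG });
    public static readonly MethodTableModel BarTable =
        new MethodTableModel(new Slot[] { BarG }); // override occupies the same slot

    public static int CallG(ObjectModel self, string s, int x, int y)
    {
        MethodTableModel mt = self.MethodTable;   // mov eax,dword ptr [ecx]
        return mt.Slots[SlotOfG](self, s, x, y);  // call dword ptr [eax+38h]
    }
}
```

The call-site code in CallG never changes; only the table reached through the object differs, which is why one offset serves the whole hierarchy.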
Indirect Method Calls (calli)
C# 2.0 doesn't supply a mechanism with which to emit a calli instruction in the IL. You can, of course, emit code using reflection, but an example would introduce more complexity than necessary. If you were to imagine that a calli sequence were being JIT compiled, the only difference introduced would be that the native call instruction would perform a call dword ptr [exx], where exx is the register in which the target address of the calli was found. That is, it calls the address to which the indirect pointer refers. All of the arguments would be passed in accordance with the method token supplied to the calli instruction.
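For readers who want to see a calli anyway, here is a minimal sketch using System.Reflection.Emit. The names CalliSketch, Twice, BuildCaller, and CallTwiceIndirectly are made up for this illustration, and the approach (passing the target's function pointer as an argument to a DynamicMethod) is just one possible way to do it, assuming a runtime that supports DynamicMethod.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class CalliSketch
{
    // Hypothetical target of the indirect call.
    public static int Twice(int x) { return x * 2; }

    // Builds a method (IntPtr target, int arg) -> int whose body is:
    // load the argument, load the target address, calli, ret.
    public static Func<IntPtr, int, int> BuildCaller()
    {
        DynamicMethod dm = new DynamicMethod(
            "CallThrough",
            typeof(int),
            new Type[] { typeof(IntPtr), typeof(int) },
            typeof(CalliSketch).Module);

        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_1);   // the int argument for the target
        il.Emit(OpCodes.Ldarg_0);   // the target address goes on top of the stack
        il.EmitCalli(OpCodes.Calli, CallingConventions.Standard,
                     typeof(int), new Type[] { typeof(int) }, null);
        il.Emit(OpCodes.Ret);

        return (Func<IntPtr, int, int>)dm.CreateDelegate(typeof(Func<IntPtr, int, int>));
    }

    public static int CallTwiceIndirectly(int value)
    {
        // Obtain a raw code address for Twice and call through it.
        IntPtr target = typeof(CalliSketch).GetMethod("Twice")
                                           .MethodHandle.GetFunctionPointer();
        return BuildCaller()(target, value);
    }
}
```

The signature passed to EmitCalli tells the JIT how to lay out the arguments, since the call-site has no method token of its own to consult.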
Dynamic Method Calls (Delegates, Others)
There is a range of dynamic method calls available. Many of them are part of the dynamic programming infrastructure supplied by reflection, and thus won't be explored in depth here. They are all variants on the same basic premise, which is that some piece of runtime functionality is able to look up the method-table information at runtime to make a method dispatch. The runtime can then, of course, make calls to this code as requested, based on information supplied by the programmer.
Delegates are an interesting special case of this capability. A delegate is essentially just a strongly typed function pointer type, an instance of which has two pieces of information: the target object (to be passed as this), and the target method token. Each delegate type has a special Invoke method whose signature matches the function over which it has been formed. The CLR supplies the implementation of this method, which enables it to perform lightweight dispatch to the underlying method.
A call to a delegate looks identical to a call to a normal method. The difference is that the target is the delegate's Invoke method-table slot instead of the actual underlying function. Arguments are laid out as with any other type of call (i.e., _fastcall). The implementation of Invoke simply patches the ECX register to contain the target object reference (supplied at delegate construction time) and uses the method token (also supplied at delegate construction time) to jump to the appropriate method-slot. There is very little overhead in this process, which puts the cost of delegate dispatch on the order of one to two times that of a simple virtual method call.
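At the source level, the two pieces of information a delegate carries are visible through its Target and Method properties. The following is a small sketch; the Combine delegate type, the Adder class, and the DelegateSketch helper are hypothetical names chosen for this example.

```csharp
using System;

// Hypothetical delegate type; its Invoke signature matches Adder.Add.
public delegate int Combine(string s, int x, int y);

public class Adder
{
    private readonly int bias;
    public Adder(int bias) { this.bias = bias; }

    public int Add(string s, int x, int y) { return bias + s.Length + x + y; }
}

public static class DelegateSketch
{
    public static Combine Create(Adder target)
    {
        // Captures both the target object and the target method.
        return new Combine(target.Add);
    }

    public static int Run()
    {
        Adder target = new Adder(100);
        Combine d = Create(target);

        // d(...) is shorthand for d.Invoke(...); the call-site passes the
        // delegate itself, and Invoke substitutes the stored target object.
        return d("Hi", 10, 10);
    }
}
```

Here d.Target refers to the Adder instance and d.Method describes Add, which corresponds to the object reference and method information that Invoke uses when it redirects the call.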
The various other styles of method dispatch, such as MethodInfo.Invoke and so forth, all add a certain level of overhead when compared to delegates, because they must go through the process of binding to the target method. This is the process of matching dynamic type, method name, and argument information to the list of known loaded types. Delegates typically don't suffer this penalty because the target method token is embedded in the IL. You may dynamically construct and invoke delegates (e.g., with DynamicInvoke), which adds a comparable level of overhead for the construction and binding process. Another penalty associated with pure dynamic invocation is that these mechanisms tend to pass arguments as an array of objects (object[]). This requires that the dispatching code inside the CLR transform that information into the appropriate calling convention to perform the invocation, by unraveling the array, and then perform the necessary marshaling on the return.
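The shape of these late-bound calls can be seen in a short sketch. The Greeter class, its Measure method, and the LateBoundSketch helper are hypothetical; MethodInfo.Invoke, Delegate.CreateDelegate, and DynamicInvoke are the standard reflection entry points. Note how the arguments travel as boxed objects in an array and the result comes back as object.

```csharp
using System;
using System.Reflection;

// Hypothetical target type for the late-bound calls.
public class Greeter
{
    public int Measure(string s, int x, int y) { return s.Length + x + y; }
}

public static class LateBoundSketch
{
    public static int ViaMethodInfo()
    {
        Greeter g = new Greeter();

        // Binding step: find the method by name at runtime.
        MethodInfo mi = typeof(Greeter).GetMethod("Measure");

        // Arguments are boxed into an object[]; the result is boxed too.
        object result = mi.Invoke(g, new object[] { "Hi", 10, 10 });
        return (int)result;
    }

    public static int ViaDynamicInvoke()
    {
        Greeter g = new Greeter();

        // Dynamically construct a delegate by method name, then invoke it
        // without static knowledge of its signature.
        Delegate d = Delegate.CreateDelegate(
            typeof(Func<string, int, int, int>), g, "Measure");
        return (int)d.DynamicInvoke("Hi", 10, 10);
    }
}
```

Both paths reach the same method as a direct call would, but each pays for the name lookup, the boxing of the int arguments, and the unboxing of the return value.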
This was a very brief overview of something that is incredibly deep. More details, including the performance characteristics, and how you can play around with some of these implementation details through spelunking in the Visual Studio debugger, are outlined in this MSDN video.
This article is adapted from Professional .NET Framework 2.0 by Joe Duffy (Wrox, 2006, ISBN: 0-7645-7135-4), from chapter 3 "Inside the CLR." Joe is a Program Manager on the CLR Team at Microsoft, where he works on WinFX and the .NET Framework. Joe's other recent related article at Wrox.com is Common Type System (CTS): One Platform to Rule Them All.