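Unless stated otherwise, all C++ code in this article was compiled with VC.
Generating assembly code with the VC compiler:
Run "cl filename.cpp /Fa" to produce an assembly listing for "filename.cpp". Because the code has not been optimized by the compiler, this listing is much easier to read than a disassembly of the finished EXE. Better still, the compiler writes comments into the asm file that map the C++ line numbers to the corresponding asm code.
Before running cl.exe, you must first run "C:\Program Files\Microsoft Visual Studio\VC98\Bin\VCVARS32.BAT" to set up the environment variables.
What does a C++ object actually look like in memory? Consider the following code: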
#include <stdio.h>
class test
{
public:
int m1;
int m2;
int m3;
virtual int function1(){return 1;}
virtual int function2(){return 2;}
int function3(){return 3;}
int function4(){return 4;}
};
int main()
{
test *ptr1=new test();
test *ptr2=new test();
printf("Size of test:\t%d\n",sizeof(test));
printf("Addr of ptr1:\t0x%08X\n",ptr1);
printf("Addr of m1:\t0x%08X\n",&(ptr1->m1));
printf("Addr of m2:\t0x%08X\n",&(ptr1->m2));
printf("Addr of m3:\t0x%08X\n",&(ptr1->m3));
printf("Addr of vtable:\t0x%08X\n",*(unsigned int *)((void *)ptr1));
printf("\n");
printf("Addr of ptr2:\t0x%08X\n",ptr2);
printf("Addr of m1:\t0x%08X\n",&(ptr2->m1));
printf("Addr of m2:\t0x%08X\n",&(ptr2->m2));
printf("Addr of m3:\t0x%08X\n",&(ptr2->m3));
printf("Addr of vtable:\t0x%08X\n",*(unsigned int *)((void *)ptr2));
printf("\n");
// The first word of the object is the vtable pointer; read slots 0 and 1 through it.
printf("Addr of vtable[0]:\t0x%08X\n",**((int**)ptr1));
printf("Addr of vtable[1]:\t0x%08X\n",*(*((int**)ptr1)+1));
// Printing member function pointers this way relies on VC's 32-bit representation.
printf("Addr of function1:\t0x%08X\n",&test::function1);
printf("Addr of function2:\t0x%08X\n",&test::function2);
printf("Addr of function3:\t0x%08X\n",&test::function3);
printf("Addr of function4:\t0x%08X\n",&test::function4);
return 0;
}
Compiled and run under VC, the program prints:
Size of test: 16
Addr of ptr1: 0x00340758
Addr of m1: 0x0034075C
Addr of m2: 0x00340760
Addr of m3: 0x00340764
Addr of vtable: 0x004060B0
Addr of ptr2: 0x00340770
Addr of m1: 0x00340774
Addr of m2: 0x00340778
Addr of m3: 0x0034077C
Addr of vtable: 0x004060B0
Addr of vtable[0]: 0x00401210
Addr of vtable[1]: 0x00401220
Addr of function1: 0x00401230
Addr of function2: 0x00401240
Addr of function3: 0x004011E0
Addr of function4: 0x004011F0
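It follows that a test object occupies 16 bytes in memory, laid out as follows:
offset 0x00: pvtable (4 bytes)
offset 0x04: m1 (4 bytes)
offset 0x08: m2 (4 bytes)
offset 0x0C: m3 (4 bytes)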
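The member offsets can also be computed by the program itself rather than read off the printed addresses. The following is only a minimal sketch: the helper name offset_in_object is hypothetical and does not appear in the original program, and the exact numbers depend on the compiler and on pointer width (a 64-bit build has an 8-byte vtable pointer, so its results differ from the 32-bit output above).

```cpp
#include <cassert>
#include <cstddef>

class test
{
public:
    int m1;
    int m2;
    int m3;
    virtual int function1() { return 1; }
    virtual int function2() { return 2; }
    int function3() { return 3; }
    int function4() { return 4; }
};

// Hypothetical helper: distance in bytes from the start of the object
// to one of its int members, computed from real addresses at run time.
inline std::ptrdiff_t offset_in_object(const test &obj, const int &member)
{
    const char *base = reinterpret_cast<const char *>(&obj);
    const char *addr = reinterpret_cast<const char *>(&member);
    assert(addr >= base);   // the member must lie inside (after the start of) obj
    return addr - base;
}
```

For a test object obj, offset_in_object(obj, obj.m1) would be expected to give 4 on the 32-bit VC build used in this article, since m1 was printed 4 bytes above the object's own address. Other compilers are free to arrange the object differently.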
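In this layout, pvtable is a pointer to the virtual function table. C++ relies on the vtable to implement dynamic binding: at run time, the function pointers stored in the vtable are used to call the appropriate virtual function. The program's output, however, does not quite match this model: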
Addr of vtable[0]: 0x00401210
Addr of vtable[1]: 0x00401220
Addr of function1: 0x00401230
Addr of function2: 0x00401240
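vtable[0] and vtable[1] do not match function1 and function2, although the addresses are very close to each other. To find out what is going on, let's look at the disassembly: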
:00401230 8B01 mov eax, dword ptr [ecx] ; load the vtable address into eax
:00401232 FF20 jmp dword ptr [eax] ; jump to function1 through vtable[0]
:00401234 CC int 03
... ...
:0040123F CC int 03
:00401240 8B01 mov eax, dword ptr [ecx] ; load the vtable address into eax
:00401242 FF6004 jmp [eax+04] ; jump to function2 through vtable[1]
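What these two stubs do can be written out by hand in C++. The following is a rough model under stated assumptions, not what VC actually generates: the names object, method, function1_impl, function2_impl, stub_slot0, stub_slot1 and make_object are all hypothetical, and the implicit this pointer (passed in ecx in the listing) is made an explicit parameter.

```cpp
#include <cassert>

struct object;                          // hand-written stand-in for a `test` instance
typedef int (*method)(object *self);    // each "virtual" takes an explicit this pointer

struct object
{
    const method *pvtable;              // first field, like the hidden pointer at offset 0
    int m1, m2, m3;
};

static int function1_impl(object *) { return 1; }
static int function2_impl(object *) { return 2; }

// One table shared by every object of this type.
static const method test_vtable[] = { function1_impl, function2_impl };

// Counterpart of "mov eax,[ecx]; jmp [eax]": load the table, go through slot 0.
inline int stub_slot0(object *self)
{
    assert(self != 0 && self->pvtable != 0);
    return self->pvtable[0](self);
}

// Counterpart of "mov eax,[ecx]; jmp [eax+04]": same, but through slot 1.
inline int stub_slot1(object *self)
{
    assert(self != 0 && self->pvtable != 0);
    return self->pvtable[1](self);
}

// Hypothetical helper: stores the table pointer, as a real constructor would.
inline object make_object()
{
    object o = { test_vtable, 0, 0, 0 };
    return o;
}
```

In this model the address of stub_slot0 differs from the address stored in test_vtable[0], which mirrors the mismatch between "Addr of function1" and "Addr of vtable[0]" in the output above.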
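Note: in a thiscall function call, the ecx register holds the object's this pointer. The two disassembled stubs are nearly identical: each fetches the address of the corresponding virtual function from the vtable and then jumps into that function. VC does not expose the real address of a virtual function here in order to preserve polymorphism: which function body actually runs depends on the vtable of the object being called, so the address cannot be determined before the program runs, nor should it be.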
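The polymorphism argument can be illustrated with a pointer to a member function. This is a small sketch in standard C++, assuming a subclass derived_test and a helper call_through that are hypothetical and not part of the original article. The same pointer value, taken from test::function1, ends up running different function bodies depending on the object it is applied to, which is why it cannot simply hold one fixed code address.

```cpp
#include <cassert>

class test
{
public:
    virtual int function1() { return 1; }
    virtual int function2() { return 2; }
};

// Hypothetical subclass, not in the original article.
class derived_test : public test
{
public:
    virtual int function1() { return 10; }   // overrides function1 only
};

typedef int (test::*test_method)();

// Calls whatever the pointer-to-member designates on the given object.
// For a virtual function the call is dispatched through the object's vtable.
inline int call_through(test &obj, test_method pmf)
{
    assert(pmf != 0);
    return (obj.*pmf)();
}
```

With test_method pmf = &test::function1, call_through returns 1 for a test object and 10 for a derived_test object, even though pmf itself never changes.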
(To be continued)
http://www.donews.net/tabris17/archive/2005/02/13/275979.aspx