The Efficiency of STL on VxWorks

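Enamored with the power of the STL and the generic programming ideas behind it, I used the STL when writing a program for VxWorks. Everything worked fine until the final performance tests, where I simply could not believe how slow the program was.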

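To investigate, I wrote the small program below as a comparison test. First, the results (elapsed system clock ticks, as returned by tickGet(), for 100,000 calls of each function):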

TestMap last: 201

TestVector last: 116

Test last: 2

Also note the size of the generated .o file:

test only: 38K

with map: 98K

with vector: 68K

(Following the advice in the VxWorks FAQ, I added -fno-exceptions -fno-rtti when compiling, but it seemed to make no difference.)

The test program is as follows:

#include <map>
#include <vector>
#include <utility>
#include <cstdio>
#include "tickLib.h"

using namespace std;

#define TEST_MAP
#define TEST_VECTOR

#ifdef TEST_MAP
// Build a three-element map, then look up one key.
int TestMap()
{
    map<int, int> test_map;
    test_map.insert(make_pair(1, 111));
    test_map.insert(make_pair(2, 222));
    test_map.insert(make_pair(3, 333));
    map<int, int>::iterator it = test_map.find(3);
    int ret = (*it).second;
    (void)ret;  // unused unless the printf below is enabled
    //printf("%d\n", ret);
    return 0;
}
#endif

#ifdef TEST_VECTOR
// Build a three-element vector, then scan it linearly.
int TestVector()
{
    vector<int> test_vec;
    test_vec.push_back(111);
    test_vec.push_back(222);
    test_vec.push_back(333);
    vector<int>::iterator it = test_vec.begin();
    for ( ; it != test_vec.end(); ++it)
        if (*it == 333) { /* comparison only; the result is discarded */ }
    return 0;
}
#endif

// Baseline: the same work on a plain stack array.
int test()
{
    int arr[5];
    arr[0] = 111;
    arr[1] = 222;
    arr[2] = 333;
    for (int i = 0; i < 3; i++)
        if (arr[i] == 333) { /* comparison only; the result is discarded */ }
    //printf("%d\n", arr[2]);
    return 0;
}

int main()
{
    int i, num = 100000;

#ifdef TEST_MAP
    unsigned long map_tick = tickGet();
    for (i = 0; i < num; i++)
        TestMap();
    // Cast so the value always matches the %lu format.
    printf("TestMap last: %lu\n", (unsigned long)(tickGet() - map_tick));
#endif

#ifdef TEST_VECTOR
    unsigned long vec_tick = tickGet();
    for (i = 0; i < num; i++)
        TestVector();
    printf("TestVector last: %lu\n", (unsigned long)(tickGet() - vec_tick));
#endif

    unsigned long base_tick = tickGet();
    for (i = 0; i < num; i++)
        test();
    printf("Test last: %lu\n", (unsigned long)(tickGet() - base_tick));

    return 0;
}
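To put the tick counts in perspective, they can be converted into time per call. The sketch below is an illustration only: MicrosPerCall is a hypothetical helper, and the clock rate is passed in as a parameter because the post does not state it (on the target it could be queried with sysClkRateGet()).

```cpp
#include <cassert>

// Hypothetical helper: convert an elapsed tick count into microseconds
// per call. ticks_per_second must be supplied by the caller; it is an
// assumption here, not a value taken from the measurements.
double MicrosPerCall(unsigned long elapsed_ticks,
                     unsigned long ticks_per_second,
                     unsigned long calls)
{
    assert(ticks_per_second > 0 && calls > 0);
    double seconds = static_cast<double>(elapsed_ticks) / ticks_per_second;
    return seconds * 1e6 / calls;
}
```

For example, if the system clock ran at 60 ticks per second (an assumption, not a measured value), MicrosPerCall(201, 60, 100000) would give about 33.5 microseconds per TestMap() call. With a different clock rate the figure scales accordingly.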

This result was utterly discouraging. But how can it possibly be this slow? And why is the compiled output so large?

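Thinking it over: generic programming relies heavily on templates. Templates are not as "simple" as macro substitution, but each time a container is instantiated for a particular type, it is roughly as if the container's entire implementation were macro-expanded for that type. This is also why a class defined as a template normally has to live entirely in a header file: it is only a generalized type (implementation code included), and the compiler needs the full definition to generate code wherever it is instantiated. (I have not studied how templates are actually implemented; this is only my guess.)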
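The idea that each instantiation gets its own copy of the code can be observed from within a program. A minimal sketch, using a hypothetical Probe template that is not part of the test program: every instantiation owns a separate static counter, which shows that Probe<int> and Probe<long> are distinct generated classes rather than one shared implementation.

```cpp
#include <cassert>

// Hypothetical helper: counts how many objects each instantiation created.
template <typename T>
struct Probe {
    // One counter per instantiation, because Count() itself is
    // generated separately for every T.
    static int& Count() { static int n = 0; return n; }
    Probe() { ++Count(); }
};

void DemonstrateSeparateInstantiations()
{
    const int ints_before = Probe<int>::Count();
    const int longs_before = Probe<long>::Count();

    Probe<int> a;
    Probe<int> b;
    Probe<long> c;

    // The two counters advance independently of each other.
    assert(Probe<int>::Count() == ints_before + 2);
    assert(Probe<long>::Count() == longs_before + 1);
}
```

The same holds for map<int, int> and vector<int>: each distinct set of template arguments used in a translation unit adds its own member functions to the object file, which is consistent with the size growth reported above.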

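That may explain why the compiled file is large, but why is it so slow? Surely vector, at least, should not be that slow? There is a saying about the efficiency of the STL: it is as efficient as what you would have written yourself. Looking at the program again: every call to TestVector() constructs a fresh vector, so push_back has to obtain memory through the allocator each time, and begin and end are function calls, with end invoked on every loop iteration. The array version in test() does none of this. Given these operations, the measured performance is probably not unreasonable.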
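One way to see the allocation cost is to count how often a vector's capacity changes while elements are appended, since each change means a new block was allocated and the existing elements were copied over. The following is a sketch under that assumption; CountCapacityChanges is a hypothetical helper, not part of the test program, and the exact count without reserve depends on the library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns how many times the capacity changed while
// appending `count` elements. Each change implies an allocate-and-copy step.
std::size_t CountCapacityChanges(std::size_t count, bool reserve_first)
{
    std::vector<int> v;
    if (reserve_first)
        v.reserve(count);  // one up-front allocation instead of several

    std::size_t changes = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++changes;
            last_capacity = v.capacity();
        }
    }
    assert(v.size() == count);
    return changes;
}
```

Calling reserve up front brings the number of capacity changes during the appends to zero. The single initial allocation still happens on every TestVector() call, though, because the vector is constructed afresh each time, whereas the stack array in test() needs no heap allocation at all.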

The STL is a good thing, but it is not what you want in every situation.