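I'm learning C++ by coding some data structures, and I noticed in my test code that the first call to Vector::add takes almost 4x longer than the second and third calls.
Could someone please explain why the first call is so expensive? I first thought it was because of the template (so I removed it), then because the function was not inlined (so I inlined it). Now I'm stumped and can only guess that it is due to the C++ runtime.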
src/vector.h
//#include <cstddef>
#include <iostream>
#include <chrono>
using namespace std;
class Vector {
public:
    explicit Vector(const int n);
    explicit Vector(const int n, const float val);
    float& operator[](const int i);
    inline int const length();
    inline void fill(const float val);
    inline void add(const float val)
    {
        chrono::steady_clock::time_point start = chrono::steady_clock::now();
        for (int i = 0; i < len; ++i) {
            arr[i] += val;
        }
        chrono::steady_clock::time_point end = chrono::steady_clock::now();
        cout << "inside add took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
             << "us.\n";
    }
    inline float sum();
private:
    float* arr;
    int len;
};
vector_test.cpp
#include "vector.h"
#include "yepCore.h"
#include "yepLibrary.h"
#include <iostream>
#include <chrono>
#include <vector>
#include <cstdlib>
using namespace std;
int main()
{
    /* Initialize the Yeppp! library */
    yepLibrary_Init();
    const int n = 5000000;
    float *a = (float*) calloc(n, sizeof(float));
    yepCore_Add_V32fS32f_V32f(a, 1, a, n);
    float sum = 0;
    //cout << "starting tests" << '\n';
    chrono::steady_clock::time_point start = chrono::steady_clock::now();
    Vector vec(n);
    chrono::steady_clock::time_point end = chrono::steady_clock::now();
    cout << "vec constructor took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
         << "us.\n";
    start = chrono::steady_clock::now();
    vec.add(1.0);
    end = chrono::steady_clock::now();
    cout << "1st vec add took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
         << "us.\n";
    start = chrono::steady_clock::now();
    vec.add(1.0);
    end = chrono::steady_clock::now();
    cout << "2nd vec add took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
         << "us.\n";
    start = chrono::steady_clock::now();
    vec.add(1.0);
    end = chrono::steady_clock::now();
    cout << "3rd vec add took " << chrono::duration_cast<chrono::microseconds>(end - start).count()
         << "us.\n";
    //std::cout << "a1 length = " << a1.length() << '\n';
    //std::cout << "a2[0] = " << a2[0] << '\n';
}
Output:
vec constructor took 8us.
inside add took 11918us.
1st vec add took 11947us.
inside add took 2379us.
2nd vec add took 2405us.
inside add took 2374us.
3rd vec add took 2405us
Best answer
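Turning on my telepathy skills, my guess is that you allocate memory for arr in Vector(const int n) but do not initialize its contents (which is why creating the vector is so fast). So the actual allocation by the OS happens when you first access the data, i.e. in the first add().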