Problem description
Edit: the huge difference in performance was due to a bug in the test. When set up properly, Eigen is 2 to 3 times faster.
I noticed that sparse matrix multiplication using the C++ Eigen library is much slower than using the Python scipy.sparse library. What takes ~0.03 seconds in scipy.sparse takes ~25 seconds in Eigen. Maybe I am doing something wrong in Eigen?
Here is the Python code:
from scipy import sparse
from time import time
import random as rn
N_VALUES = 200000
N_ROWS = 400000
N_COLS = 400000
rows_a = rn.sample(range(N_COLS), N_VALUES)
cols_a = rn.sample(range(N_ROWS), N_VALUES)
values_a = [rn.uniform(0,1) for _ in xrange(N_VALUES)]
rows_b = rn.sample(range(N_COLS), N_VALUES)
cols_b = rn.sample(range(N_ROWS), N_VALUES)
values_b = [rn.uniform(0,1) for _ in xrange(N_VALUES)]
big_a = sparse.coo_matrix((values_a, (cols_a, rows_a)), shape=(N_ROWS, N_COLS))
big_b = sparse.coo_matrix((values_b, (cols_b, rows_b)), shape=(N_ROWS, N_COLS))
big_a = big_a.tocsr()
big_b = big_a.tocsr()  # note: this converts big_a again, so the product below is effectively A * A
start = time()
AB = big_a * big_b;
end = time()
print 'time taken : {}'.format(end - start)
C++ code:
#include <iostream>
#include <cstdlib>
#include <vector>
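#include <algorithm>
#include <ctime>   // clock, CLOCKS_PER_SEC
#include <random>  // std::uniform_real_distribution, std::default_random_engine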
#include <Eigen/Dense>
#include <Eigen/Sparse>
using namespace Eigen;
std::vector<long> gen_random_sample(long min, long max, long sample_size);
double get_random_double(double min, double max);
std::vector<double> get_vector_of_rn_doubles(int length, double min, double max);
int main()
{
    long N_COLS = 400000;
    long N_ROWS = 400000;
    long N_VALUES = 200000;

    SparseMatrix<double> big_A(N_ROWS, N_COLS);
    std::vector<long> cols_a = gen_random_sample(0, N_COLS, N_VALUES);
    std::vector<long> rows_a = gen_random_sample(0, N_COLS, N_VALUES);
    std::vector<double> values_a = get_vector_of_rn_doubles(N_VALUES, 0, 1);
    for (int i = 0; i < N_VALUES; i++)
        big_A.insert(cols_a[i], cols_a[i]) = values_a[i];
    // big_A.makeCompressed(); // slows things down

    SparseMatrix<double> big_B(N_ROWS, N_COLS);
    std::vector<long> cols_b = gen_random_sample(0, N_COLS, N_VALUES);
    std::vector<long> rows_b = gen_random_sample(0, N_COLS, N_VALUES);
    std::vector<double> values_b = get_vector_of_rn_doubles(N_VALUES, 0, 1);
    for (int i = 0; i < N_VALUES; i++)
        big_B.insert(cols_b[i], cols_b[i]) = values_b[i];
    // big_B.makeCompressed();

    SparseMatrix<double> big_AB(N_ROWS, N_COLS);
    clock_t begin = clock();
    big_AB = (big_A * big_B); //.pruned();
    clock_t end = clock();
    double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
    std::cout << "Time taken : " << elapsed_secs << std::endl;
}
std::vector<long> gen_random_sample(long min, long max, long sample_size)
{
    // THE BUG: the sized constructor already creates sample_size zeros, and push_back
    // appends after them. The correct declaration is: std::vector<long> my_vector;
    std::vector<long> my_vector(sample_size);
    for (long i = min; i != max; i++)
    {
        my_vector.push_back(i);
    }
    std::random_shuffle(my_vector.begin(), my_vector.end());
    std::vector<long> new_vec = std::vector<long>(my_vector.begin(), my_vector.begin() + sample_size);
    return new_vec;
}
double get_random_double(double min, double max)
{
    // Static engine: a fresh default-seeded engine on every call would return the same value each time.
    static std::default_random_engine re;
    std::uniform_real_distribution<double> unif(min, max);
    return unif(re);
}
std::vector<double> get_vector_of_rn_doubles(int length, double min, double max)
{
    std::vector<double> my_vector(length);
    for (int i = 0; i < length; i++)
    {
        my_vector[i] = get_random_double(min, max);
    }
    return my_vector;
}
I compile with: g++ -std=c++11 -I/usr/include/eigen3 time_eigen.cpp -o my_exec -O2 -DNDEBUG
Am I missing a way to make the sparse multiplication fast?
Recommended answer
If you compile without -DNDEBUG, you will see that your matrices are corrupted, because you are inserting the same elements multiple times and the insert method does not allow this.
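To illustrate how repeated indices can end up in the insert calls, here is a minimal sketch that mirrors the sampler from the question next to a version that starts from an empty vector. The names sample_buggy, sample_fixed and count_duplicates are hypothetical helpers introduced only for this illustration, and std::shuffle with std::mt19937 stands in for the question's std::random_shuffle.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <set>
#include <vector>

// Mirrors the question's sampler: the sized constructor already holds
// sample_size zeros, and push_back appends min..max-1 after them.
std::vector<long> sample_buggy(long min, long max, long sample_size, std::mt19937& rng)
{
    std::vector<long> pool(sample_size);  // sample_size zeros
    for (long i = min; i != max; i++)
        pool.push_back(i);
    std::shuffle(pool.begin(), pool.end(), rng);
    return std::vector<long>(pool.begin(), pool.begin() + sample_size);
}

// Starts from an empty vector, so the pool holds each index exactly once.
// Assumes sample_size <= max - min.
std::vector<long> sample_fixed(long min, long max, long sample_size, std::mt19937& rng)
{
    std::vector<long> pool;
    pool.reserve(static_cast<std::size_t>(max - min));
    for (long i = min; i != max; i++)
        pool.push_back(i);
    std::shuffle(pool.begin(), pool.end(), rng);
    return std::vector<long>(pool.begin(), pool.begin() + sample_size);
}

// Number of entries that repeat a value already present in v.
std::size_t count_duplicates(const std::vector<long>& v)
{
    std::set<long> distinct(v.begin(), v.end());
    return v.size() - distinct.size();
}
```

With the question's parameters (a range of 400000 and a sample of 200000), about a third of the buggy pool consists of zeros, so index 0 is expected to be drawn many times, and each repeat becomes a repeated insert at the same position.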
Replace the insert calls with coeffRef(i,j) += value, or use a triplet list as recommended in the documentation. After this small fix, the C++ code takes 0.012s and the Python code takes 0.021s on my computer. Note that you cannot truly deduce which one is faster from these two numbers, because the input matrices are not exactly the same, but at least the timings are of the same order of magnitude.