Eigen稀疏矩阵乘法比python

Eigen稀疏矩阵乘法比python

本文介绍了C ++ Eigen稀疏矩阵乘法比python scipy.sparse慢得多的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

编辑:性能上的巨大差异是由于测试中的错误所致,如果设置正确, Eigen 的速度要快2到3倍

The huge difference in performance is due to a bug in the test, when set up properly Eigen is 2 to 3 times faster.

我注意到使用 C ++ 库比使用 Python 库。我在 scipy.sparse 的时间内实现了〜0.03 的秒,而在 Eigen中实现了〜25 秒内。也许我在Eigen中做错了什么?

I noticed that sparse matrix multiplication using C++ Eigen library is much slower than using Python scipy.sparse library. I achieve in scipy.sparse in ~0.03 seconds what I achieve in Eigen in ~25 seconds. Maybe I doing something wrong in Eigen?

此处Python代码:

Here Python code:

from scipy import sparse
from time import time
import random as rn

N_VALUES = 200000
N_ROWS = 400000
N_COLS = 400000

rows_a = rn.sample(range(N_COLS), N_VALUES)
cols_a = rn.sample(range(N_ROWS), N_VALUES)
values_a = [rn.uniform(0,1) for _ in xrange(N_VALUES)]

rows_b = rn.sample(range(N_COLS), N_VALUES)
cols_b = rn.sample(range(N_ROWS), N_VALUES)
values_b = [rn.uniform(0,1) for _ in xrange(N_VALUES)]

big_a = sparse.coo_matrix((values_a, (cols_a, rows_a)), shape=(N_ROWS, N_COLS))
big_b = sparse.coo_matrix((values_b, (cols_b, rows_b)), shape=(N_ROWS, N_COLS))

big_a = big_a.tocsr()
big_b = big_a.tocsr()

start = time()

AB = big_a * big_b;

end = time()

print 'time taken : {}'.format(end - start)

C ++代码:

#include <iostream>
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <Eigen/Dense>
#include <Eigen/Sparse>

using namespace Eigen;

std::vector<long> gen_random_sample(long min, long max, long sample_size);
double get_random_double(double min, double max);
std::vector<double> get_vector_of_rn_doubles(int length, double min, double max);

int main()
{

  long N_COLS = 400000;
  long N_ROWS = 400000;
  long N_VALUES = 200000;

  SparseMatrix<double> big_A(N_ROWS, N_COLS);
  std::vector<long> cols_a = gen_random_sample(0, N_COLS, N_VALUES);
  std::vector<long> rows_a = gen_random_sample(0, N_COLS, N_VALUES);
  std::vector<double> values_a = get_vector_of_rn_doubles(N_VALUES, 0, 1);

  for (int i = 0; i < N_VALUES; i++)
    big_A.insert(cols_a[i], cols_a[i]) = values_a[i];
  // big_A.makeCompressed(); // slows things down

  SparseMatrix<double> big_B(N_ROWS, N_COLS);
  std::vector<long> cols_b = gen_random_sample(0, N_COLS, N_VALUES);
  std::vector<long> rows_b = gen_random_sample(0, N_COLS, N_VALUES);
  std::vector<double> values_b = get_vector_of_rn_doubles(N_VALUES, 0, 1);

  for (int i = 0; i < N_VALUES; i++)
    big_B.insert(cols_b[i], cols_b[i]) = values_b[i];
  // big_B.makeCompressed();

  SparseMatrix<double> big_AB(N_ROWS, N_COLS);

  clock_t begin = clock();

  big_AB = (big_A * big_B); //.pruned();

  clock_t end = clock();
  double elapsed_secs = double(end - begin) / CLOCKS_PER_SEC;
  std::cout << "Time taken : " << elapsed_secs << std::endl;

}

std::vector<long> gen_random_sample(long min, long max, long sample_size)
{
  std::vector<long> my_vector(sample_size); // THE BUG, is right std::vector<long> my_vector

  for (long i = min; i != max; i++)
    {
      my_vector.push_back(i);
    }

  std::random_shuffle(my_vector.begin(), my_vector.end());

  std::vector<long> new_vec = std::vector<long>(my_vector.begin(), my_vector.begin() + sample_size);

    return new_vec;
}

double get_random_double(double min, double max)
{
   std::uniform_real_distribution<double> unif(min, max);
   std::default_random_engine re;
   double a_random_double = unif(re);
}

std::vector<double> get_vector_of_rn_doubles(int length, double min, double max)
{
  std::vector<double> my_vector(length);
  for (int i=0; i < length; i++)
    {
      my_vector[i] = get_random_double(min, max);
    }
  return my_vector;
}

我编译为: g ++ -std = c + +11 -I / usr / include / eigen3 time_eigen.cpp -o my_exec -O2 -DNDEBUG

我错过了一种方法

推荐答案

如果不使用 -DNDEBUG 进行编译,则稀疏乘法会快速吗? ,那么您会看到矩阵已损坏,因为您多次插入相同的元素,并且insert方法不允许这样做。

If you compile without -DNDEBUG, then you will see that your matrices are corrupted because you are inserting the same elements multiple times and the insert method does not allow this.

用<$ c $替换它们c> coeffRef(i,j)+ = value 或使用文档中建议的三元组列表。经过这一小小的修复后,在我的计算机上使用Python花费的 0.012s 0.021s 的C ++代码。请注意,由于输入矩阵并不完全相同,但是至少它们的顺序相同,所以无法从这两个数中真正推断出哪个更快。

Replace them with coeffRef(i,j) += value or use a triplet list as recommended in the documentation. After this small fix, it takes 0.012s for the C++ code, and 0.021s with Python on my computer. Note that you cannot truly deduce which one is faster from these two numbers as the input matrices are not exactly the same, but at least they are in the same order.

这篇关于C ++ Eigen稀疏矩阵乘法比python scipy.sparse慢得多的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

08-20 00:02