2D array with more than 65535^2 elements -> Array dimensions exceeded supported range

Question

I have a 64-bit PC with 128 GB of RAM and I'm using C# and .NET 4.5. I have the following code:

double[,] m1 = new double[65535, 65535];
long l1 = m1.LongLength;

double[,] m2 = new double[65536, 65536]; // Array dimensions exceeded supported range
long l2 = m2.LongLength;

I'm aware of <gcAllowVeryLargeObjects enabled="true" /> and I've set it to true.

Why can a multidimensional array not have more than 4294967295 elements? I saw the following answer: https://stackoverflow.com/a/2338797/7556646.
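To make the numbers concrete, here is a small sketch of the arithmetic behind the two allocations above. The class and method names are made up for illustration; the only fact used is the 4294967295 (UInt32.MaxValue) figure quoted in the question.

```csharp
using System;

static class ElementCountDemo
{
    // Element count of a rows x cols array, computed in 64-bit arithmetic
    // so that the product itself cannot silently wrap around.
    public static long ElementCount(long rows, long cols)
    {
        return checked(rows * cols);
    }

    // True when the count does not exceed UInt32.MaxValue (4294967295),
    // the limit quoted in the question.
    public static bool FitsInUInt32(long count)
    {
        return count <= uint.MaxValue;
    }

    static void Main()
    {
        long ok = ElementCount(65535, 65535);
        long tooBig = ElementCount(65536, 65536);
        Console.WriteLine("{0}: {1}", ok, FitsInUInt32(ok));
        Console.WriteLine("{0}: {1}", tooBig, FitsInUInt32(tooBig));
    }
}
```

65535 x 65535 is 4294836225 and still fits, while 65536 x 65536 is 4294967296, exactly one element past the limit.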

I also checked the documentation for gcAllowVeryLargeObjects and saw the following remark.

I cannot understand why this limit exists. Is there a workaround? Is it planned to remove this limit in an upcoming version of .NET?

I need the elements laid out that way in memory because I want to compute, for example, a symmetric eigenvalue decomposition using Intel MKL.

[DllImport("custom_mkl", CallingConvention = CallingConvention.Cdecl, ExactSpelling = true, SetLastError = false)]
internal static extern lapack_int LAPACKE_dsyevd(
    int matrix_layout, char jobz, char uplo, lapack_int n, [In, Out] double[,] a, lapack_int lda, [In, Out] double[] w);

Recommended answer

Disclaimer: this turned out a lot longer than expected.

There are multiple reasons why the CLR doesn't support large arrays on the managed heap.

Some of them are technical, some of them might be "paradigmatic".

This blog post goes into some of the reasons why there is a limitation. Essentially, there was a decision to limit the maximum size of (capital O) Objects because of memory fragmentation. The cost of implementing the handling of larger objects was weighed against the fact that not many use cases exist[ed] that would require such large objects, and those that did would, in most cases, be due to a design fallacy of the programmer. And since, for the CLR, everything is an Object, this limitation also applies to arrays. To enforce this limitation, array indexers were designed with signed integers.

But once you have made sure that your program design really requires such large arrays, you are going to need a workaround.

The blog post mentioned above also demonstrates that you can implement big arrays without going into unmanaged territory.

But as Evk has pointed out in the comments, you want to pass the array as a whole to an external function via PInvoke. That means you'll need the array on the unmanaged heap, or it will have to be marshaled during the call. And marshaling the whole thing is a bad idea with arrays this large.

So since the managed heap is out of the question, you'll need to allocate space on the unmanaged heap and use that space for your array.

Let's say you need 8 GB worth of space:

long size = (1L << 33);
IntPtr basePointer = System.Runtime.InteropServices.Marshal.AllocHGlobal((IntPtr)size);

Great! Now you have a region in virtual memory where you can store up to 8 GB worth of data.
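Memory obtained this way is not tracked by the garbage collector, so it has to be released explicitly. A minimal sketch of the allocate, use, free cycle might look like the following; the class and method names are hypothetical, and it uses a small block so that it runs on any machine.

```csharp
using System;
using System.Runtime.InteropServices;

static class UnmanagedBlockDemo
{
    // Allocates byteCount bytes (at least 8) on the unmanaged heap, writes one
    // double into the first slot and one into the last slot, reads both back,
    // and frees the block even if something in between throws.
    public static double[] RoundTrip(long byteCount, double first, double last)
    {
        IntPtr basePointer = Marshal.AllocHGlobal(new IntPtr(byteCount));
        try
        {
            IntPtr lastSlot = new IntPtr(basePointer.ToInt64() + byteCount - sizeof(double));
            Marshal.WriteInt64(basePointer, BitConverter.DoubleToInt64Bits(first));
            Marshal.WriteInt64(lastSlot, BitConverter.DoubleToInt64Bits(last));
            return new[]
            {
                BitConverter.Int64BitsToDouble(Marshal.ReadInt64(basePointer)),
                BitConverter.Int64BitsToDouble(Marshal.ReadInt64(lastSlot))
            };
        }
        finally
        {
            Marshal.FreeHGlobal(basePointer);
        }
    }

    static void Main()
    {
        double[] values = RoundTrip(1L << 20, 1.5, -2.5); // 1 MB instead of 8 GB
        Console.WriteLine("{0} {1}", values[0], values[1]);
    }
}
```

The same pattern should apply to the 8 GB block above; AllocHGlobal throws an OutOfMemoryException if the request cannot be satisfied.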

How do we turn this into an array?

There are two approaches in C#.

The first is the unsafe context. It lets you work with pointers, and pointers can be used like arrays. (In vanilla C they are often one and the same.)

If you have a good idea of how to realize 2D arrays via pointers, then this will be the best option for you.
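As a rough sketch of the pointer approach, the following treats an unmanaged block as a row-major 2D array of doubles. The names are made up for illustration, the code has to be compiled with the /unsafe switch, and it assumes no padding between rows.

```csharp
using System;
using System.Runtime.InteropServices;

static class UnsafeMatrixDemo
{
    // Fills a rows x cols row-major matrix on the unmanaged heap with i + j
    // and returns the sum of all elements. Requires the /unsafe compiler switch.
    public static unsafe double FillAndSum(long rows, long cols)
    {
        long bytes = checked(rows * cols * sizeof(double));
        IntPtr block = Marshal.AllocHGlobal(new IntPtr(bytes));
        try
        {
            double* a = (double*)block.ToPointer();

            // Element (i, j) lives at offset i * cols + j; the offset is a long,
            // so it is not limited to the 32-bit index range.
            for (long i = 0; i < rows; i++)
                for (long j = 0; j < cols; j++)
                    a[i * cols + j] = i + j;

            double sum = 0;
            for (long k = 0; k < rows * cols; k++)
                sum += a[k];
            return sum;
        }
        finally
        {
            Marshal.FreeHGlobal(block);
        }
    }

    static void Main()
    {
        Console.WriteLine(FillAndSum(3, 4));
    }
}
```

Note that nothing here checks bounds: an offset outside the block reads or corrupts whatever memory happens to be there.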

Here are pointers

The second is marshaling. You don't need the unsafe context, but you have to "marshal" your data between the managed heap and the unmanaged one instead. You'll still have to understand pointer arithmetic.

The two main functions you'll want to use are PtrToStructure and its reverse, StructureToPtr. With the first you get a copy of a value type (such as a double) out of a specified position on the unmanaged heap. With the second you put a copy of a value type onto the unmanaged heap.
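A minimal sketch of element access with these two functions could look like this. The helper names are hypothetical and, to keep it short, there is no bounds checking at all.

```csharp
using System;
using System.Runtime.InteropServices;

static class MarshalAccessDemo
{
    static readonly int DoubleSize = Marshal.SizeOf(typeof(double));

    // Address of element `index`, counted in doubles from basePointer.
    static IntPtr Slot(IntPtr basePointer, long index)
    {
        return new IntPtr(basePointer.ToInt64() + index * DoubleSize);
    }

    // Copies a double from the managed side into the unmanaged block.
    public static void Write(IntPtr basePointer, long index, double value)
    {
        // false: the target memory holds no previous structure to destroy.
        Marshal.StructureToPtr(value, Slot(basePointer, index), false);
    }

    // Copies a double out of the unmanaged block.
    public static double Read(IntPtr basePointer, long index)
    {
        return (double)Marshal.PtrToStructure(Slot(basePointer, index), typeof(double));
    }

    // Writes one value into a freshly allocated block and reads it back.
    public static double RoundTrip(long count, long index, double value)
    {
        IntPtr block = Marshal.AllocHGlobal(new IntPtr(count * DoubleSize));
        try
        {
            Write(block, index, value);
            return Read(block, index);
        }
        finally
        {
            Marshal.FreeHGlobal(block);
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip(4, 3, 2.25));
    }
}
```

Every access copies and boxes the value, so this route is convenient but noticeably slower than raw pointers for element-by-element work.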

Both approaches are "unsafe" in a sense. You'll need to know your pointers. Common pitfalls include:

  • Forgetting to check bounds rigorously
  • Mixing up the size of the elements
  • Messing up the alignment
  • Mixing up what kind of 2D array you want
  • Forgetting about padding with 2D arrays
  • Forgetting to free memory
  • Forgetting that memory has been freed and using it anyway

You'll probably want to turn your 2D array design into a 1D array design.
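Which 1D offset belongs to element (i, j) depends on the layout the receiving library expects. The sketch below shows the two usual mappings; the class and method names are made up, the two constants are the values LAPACKE uses for its matrix_layout argument, and the formulas assume a densely packed matrix without padding.

```csharp
using System;

static class Layout
{
    public const int RowMajor = 101; // LAPACK_ROW_MAJOR
    public const int ColMajor = 102; // LAPACK_COL_MAJOR

    // Maps the 2D index (i, j) of a rows x cols matrix to a 1D offset,
    // counted in elements, for a densely packed block.
    public static long Offset(int layout, long rows, long cols, long i, long j)
    {
        if (i < 0 || i >= rows || j < 0 || j >= cols)
            throw new IndexOutOfRangeException();

        if (layout == RowMajor) return i * cols + j; // rows stored one after another
        if (layout == ColMajor) return j * rows + i; // columns stored one after another
        throw new ArgumentException("Unknown layout", "layout");
    }

    static void Main()
    {
        Console.WriteLine(Offset(RowMajor, 3, 4, 1, 2));
        Console.WriteLine(Offset(ColMajor, 3, 4, 1, 2));
    }
}
```

For a 3 x 4 matrix, element (1, 2) ends up at offset 6 in row-major order and at offset 7 in column-major order, which is exactly the kind of mix-up listed among the pitfalls.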

In any case, you will want to wrap it all in a class with the appropriate checks and destructors.

What follows is a generic class that is "like" an array, based on the unmanaged heap.

Features include:

  • It has an index accessor that accepts 64-bit integers.
  • It restricts the types that T can become to value types.
  • It has bounds checking and is disposable.

Note that I don't do any type checking, so if Marshal.SizeOf fails to return the correct number, we fall into one of the pits mentioned above.

Features that you'll have to implement yourself include:

  • 2D accessors and 2D array arithmetic (this depends on what the other library expects; often it is something like p = x * size + y)
  • An exposed pointer for PInvoke purposes (or an internal call)

So use this only as an inspiration, if at all.

using System; // IDisposable, IntPtr and the exception types used below
using static System.Runtime.InteropServices.Marshal;

public class LongArray<T> : IDisposable where T : struct {
    private IntPtr _head;
    private Int64 _capacity;
    private UInt64 _bytes;
    private Int32 _elementSize;

    public LongArray(long capacity) {
        // Validate the argument; the _capacity field has not been assigned yet.
        if(capacity < 0) throw new ArgumentException("The capacity can not be negative");
        _elementSize = SizeOf(default(T));
        _capacity = capacity;
        _bytes = (ulong)capacity * (ulong)_elementSize;

        _head = AllocHGlobal((IntPtr)_bytes);
    }

    public T this[long index] {
        get {
            IntPtr p = _getAddress(index);

            T val = (T)System.Runtime.InteropServices.Marshal.PtrToStructure(p, typeof(T));

            return val;
        }
        set {
            IntPtr p = _getAddress(index);

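            // fDeleteOld is false: the block starts out uninitialized, so there is
            // no previous structure at this address that could be destroyed.
            StructureToPtr<T>(value, p, false);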
        }
    }

    protected bool disposed = false;
    public void Dispose() {
        if(!disposed) {
            FreeHGlobal((IntPtr)_head);
            disposed = true;
        }
    }

    protected IntPtr _getAddress(long index) {
        if(disposed) throw new ObjectDisposedException("Can't access the array once it has been disposed!");
        if(index < 0) throw new IndexOutOfRangeException("Negative indices are not allowed");
        if(!(index < _capacity)) throw new IndexOutOfRangeException("Index is out of bounds of this array");
        return (IntPtr)((ulong)_head + (ulong)index * (ulong)(_elementSize));
    }
}
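The two missing features could be sketched roughly as follows. UnmanagedMatrix is a hypothetical, self-contained helper for doubles only and is not part of the class above; it assumes a row-major layout without padding.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: a rows x cols matrix of doubles on the unmanaged heap
// with a 2D accessor and an exposed pointer.
public sealed class UnmanagedMatrix : IDisposable
{
    private IntPtr _head;
    public long Rows { get; private set; }
    public long Cols { get; private set; }

    public UnmanagedMatrix(long rows, long cols)
    {
        if (rows <= 0 || cols <= 0)
            throw new ArgumentOutOfRangeException("rows", "Dimensions must be positive");
        Rows = rows;
        Cols = cols;
        _head = Marshal.AllocHGlobal(new IntPtr(checked(rows * cols * sizeof(double))));
    }

    // Raw address of element (0, 0); only valid until Dispose is called.
    public IntPtr Pointer
    {
        get { ThrowIfDisposed(); return _head; }
    }

    public double this[long row, long col]
    {
        get { return BitConverter.Int64BitsToDouble(Marshal.ReadInt64(Address(row, col))); }
        set { Marshal.WriteInt64(Address(row, col), BitConverter.DoubleToInt64Bits(value)); }
    }

    // Row-major address arithmetic with bounds checks.
    private IntPtr Address(long row, long col)
    {
        ThrowIfDisposed();
        if (row < 0 || row >= Rows || col < 0 || col >= Cols)
            throw new IndexOutOfRangeException();
        return new IntPtr(_head.ToInt64() + (row * Cols + col) * sizeof(double));
    }

    private void ThrowIfDisposed()
    {
        if (_head == IntPtr.Zero) throw new ObjectDisposedException("UnmanagedMatrix");
    }

    public void Dispose()
    {
        if (_head != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(_head);
            _head = IntPtr.Zero;
        }
        GC.SuppressFinalize(this);
    }

    // Safety net in case Dispose was never called.
    ~UnmanagedMatrix()
    {
        if (_head != IntPtr.Zero) Marshal.FreeHGlobal(_head);
    }
}
```

To hand such a block to the native routine from the question, the a parameter of the LAPACKE_dsyevd declaration would presumably have to be declared as IntPtr instead of double[,] so that Pointer can be passed directly. That change is an assumption and has not been tested against MKL.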
