Problem description
Our software decompresses certain byte data through a GZipStream, which reads its data from a MemoryStream. The data is decompressed in blocks of 4 KB and written into another MemoryStream.
We've noticed that the process allocates far more memory than the size of the decompressed data.
Example: A compressed byte array of 2,425,536 bytes gets decompressed to 23,050,718 bytes. The memory profiler we use shows that the method MemoryStream.set_Capacity(Int32 value) allocated 67,104,936 bytes. That's a factor of 2.9 between reserved and actually written memory.
Note: MemoryStream.set_Capacity is called from MemoryStream.EnsureCapacity, which is itself called from MemoryStream.Write in our function.
Why does the MemoryStream reserve so much capacity, even though it only appends blocks of 4 KB?
Here is the code snippet that decompresses the data:
    private byte[] Decompress(byte[] data)
    {
        using (MemoryStream compressedStream = new MemoryStream(data))
        using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
        using (MemoryStream resultStream = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int iCount = 0;
            while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                resultStream.Write(buffer, 0, iCount);
            }
            return resultStream.ToArray();
        }
    }
Note: If relevant, this is the system configuration:
- Windows XP 32-bit
- .NET 3.5
- Compiled with Visual Studio 2008
Recommended answer
Because of the way MemoryStream expands its capacity, as the framework source shows:
    public override void Write(byte[] buffer, int offset, int count) {
        //... Removed Error checking for example
        int i = _position + count;
        // Check for overflow
        if (i < 0)
            throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
        if (i > _length) {
            bool mustZero = _position > _length;
            if (i > _capacity) {
                bool allocatedNewArray = EnsureCapacity(i);
                if (allocatedNewArray)
                    mustZero = false;
            }
            if (mustZero)
                Array.Clear(_buffer, _length, i - _length);
            _length = i;
        }
        //...
    }

    private bool EnsureCapacity(int value) {
        // Check for overflow
        if (value < 0)
            throw new IOException(Environment.GetResourceString("IO.IO_StreamTooLong"));
        if (value > _capacity) {
            int newCapacity = value;
            if (newCapacity < 256)
                newCapacity = 256;
            if (newCapacity < _capacity * 2)
                newCapacity = _capacity * 2;
            Capacity = newCapacity;
            return true;
        }
        return false;
    }

    public virtual int Capacity
    {
        //...
        set {
            //...
            // MemoryStream has this invariant: _origin > 0 => !expandable (see ctors)
            if (_expandable && value != _capacity) {
                if (value > 0) {
                    byte[] newBuffer = new byte[value];
                    if (_length > 0) Buffer.InternalBlockCopy(_buffer, 0, newBuffer, 0, _length);
                    _buffer = newBuffer;
                }
                else {
                    _buffer = null;
                }
                _capacity = value;
            }
        }
    }
So every time you hit the capacity limit, it doubles the capacity. It does this because the Buffer.InternalBlockCopy operation is slow for large arrays, so if the stream had to resize on every Write call, performance would drop significantly.
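To see where the profiler's 67,104,936 bytes might come from, the sketch below replays the EnsureCapacity policy from the listing above for the sizes in the question. It assumes that every Read call fills the whole 4,096-byte buffer (GZipStream is allowed to return fewer bytes), and the names GrowthSimulation, NextCapacity and TotalAllocated are made up for this illustration.

```csharp
using System;

static class GrowthSimulation
{
    // Mirrors the policy in the EnsureCapacity listing: at least 256 bytes,
    // and at least double the current capacity.
    static int NextCapacity(int capacity, int required)
    {
        int newCapacity = Math.Max(required, 256);
        if (newCapacity < capacity * 2)
            newCapacity = capacity * 2;
        return newCapacity;
    }

    // Returns the combined size of every backing array requested while
    // appending totalBytes in chunks of at most chunkSize bytes.
    // Assumption: each chunk is full except possibly the last one.
    public static long TotalAllocated(int totalBytes, int chunkSize,
        out int finalCapacity, out int allocations)
    {
        int capacity = 0;
        int length = 0;
        long allocated = 0;
        allocations = 0;
        while (length < totalBytes)
        {
            int count = Math.Min(chunkSize, totalBytes - length);
            int required = length + count;
            if (required > capacity)
            {
                capacity = NextCapacity(capacity, required);
                allocated += capacity; // a new byte[capacity] is created
                allocations++;
            }
            length = required;
        }
        finalCapacity = capacity;
        return allocated;
    }

    static void Main()
    {
        int finalCapacity, allocations;
        long total = TotalAllocated(23050718, 4096, out finalCapacity, out allocations);
        Console.WriteLine("{0} allocations, final capacity {1}, total {2} bytes",
            allocations, finalCapacity, total);
        // prints "14 allocations, final capacity 33554432, total 67104768 bytes"
    }
}
```

Under that assumption the total is within 168 bytes of the profiler figure, which suggests the profiler is reporting the combined size of every buffer set_Capacity ever allocated, not memory held at one time. The last buffer alone is 33,554,432 bytes, about 1.46 times the decompressed size. The 168-byte difference may be per-array object overhead, but that is a guess.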
A few things you could do to improve this: set the initial capacity to at least the size of your compressed array, and then grow the capacity by a factor smaller than 2.0 to reduce the amount of memory you are using.
    const double ResizeFactor = 1.25;

    private byte[] Decompress(byte[] data)
    {
        using (MemoryStream compressedStream = new MemoryStream(data))
        using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
        // Set the initial capacity to the compressed size + 25%.
        // The cast is required: the MemoryStream constructor takes an int, not a double.
        using (MemoryStream resultStream = new MemoryStream((int)(data.Length * ResizeFactor)))
        {
            byte[] buffer = new byte[4096];
            int iCount = 0;
            while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Resize to 125% instead of 200% before the stream does it itself.
                if (resultStream.Capacity < resultStream.Length + iCount)
                    resultStream.Capacity = (int)(resultStream.Capacity * ResizeFactor);
                resultStream.Write(buffer, 0, iCount);
            }
            return resultStream.ToArray();
        }
    }
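A smaller factor is a trade-off, not a free win: each resize wastes less, but there are more resizes and therefore more copying. The sketch below is a rough model of the loop above, not a measurement. GrowthFactorComparison and Simulate are hypothetical names, and the model assumes full 4,096-byte reads and an initial capacity of the compressed size plus 25% for both factors.

```csharp
using System;

static class GrowthFactorComparison
{
    // Models a buffer that starts at initialCapacity and is multiplied by
    // factor whenever the next chunk would not fit.
    public static void Simulate(long finalLength, long initialCapacity, double factor,
        int chunkSize, out int resizes, out long finalCapacity, out long totalAllocated)
    {
        long capacity = initialCapacity;
        long length = 0;
        resizes = 0;
        totalAllocated = capacity; // the initial buffer counts as an allocation
        while (length < finalLength)
        {
            long count = Math.Min(chunkSize, finalLength - length);
            if (capacity < length + count)
            {
                // Grow by the factor, but always far enough to hold this chunk.
                capacity = Math.Max((long)(capacity * factor), length + count);
                totalAllocated += capacity;
                resizes++;
            }
            length += count;
        }
        finalCapacity = capacity;
    }

    static void Main()
    {
        long compressed = 2425536;     // sizes taken from the question
        long decompressed = 23050718;
        foreach (double factor in new[] { 1.25, 2.0 })
        {
            int resizes;
            long finalCapacity, total;
            Simulate(decompressed, (long)(compressed * 1.25), factor, 4096,
                out resizes, out finalCapacity, out total);
            Console.WriteLine("factor {0}: {1} resizes, final capacity {2}, {3} bytes allocated in total",
                factor, resizes, finalCapacity, total);
        }
    }
}
```

For the sizes in the question, this model needs 10 resizes with a factor of 1.25 against 3 with a factor of 2.0, and the combined size of all buffers allocated along the way is larger with the smaller factor. What the smaller factor does buy is a tighter bound on unused space in the final buffer (at most 25% instead of up to 100%); where the last resize actually lands depends on the sizes involved.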
If you wanted to, you could use even fancier algorithms, such as resizing based on the current compression ratio:
    const double MinResizeFactor = 1.05;

    private byte[] Decompress(byte[] data)
    {
        using (MemoryStream compressedStream = new MemoryStream(data))
        using (GZipStream zipStream = new GZipStream(compressedStream, CompressionMode.Decompress))
        // Set the initial capacity to the compressed size + the minimum resize factor.
        using (MemoryStream resultStream = new MemoryStream((int)(data.Length * MinResizeFactor)))
        {
            byte[] buffer = new byte[4096];
            int iCount = 0;
            while ((iCount = zipStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                if (resultStream.Capacity < resultStream.Length + iCount)
                {
                    // The +1 is to prevent divide by 0 errors, it may not be necessary in practice.
                    double sizeRatio = ((double)resultStream.Position + iCount) / (compressedStream.Position + 1);
                    // Resize to minimum resize factor of the current capacity or the
                    // compressed stream length times the compression ratio + min resize
                    // factor, whichever is larger. Capacity is an int, so the result is cast.
                    resultStream.Capacity = (int)Math.Max(resultStream.Capacity * MinResizeFactor,
                        (sizeRatio + (MinResizeFactor - 1)) * compressedStream.Length);
                }
                resultStream.Write(buffer, 0, iCount);
            }
            return resultStream.ToArray();
        }
    }