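I have a program that pulls data from a database, caches it to a file, and then exports that data to several formats (Excel, Excel 2003, CSV). I am using OpenXML SDK 2.0 for the Excel work. The exports run in parallel (using Parallel.ForEach), and the data can be very large; some of the CSVs are 800 MB, for example. During these larger exports I have noticed that writing the XML documents hangs. If I have 8 exports running in parallel, at some point all of them will simply "pause". They all hang at the same spot: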
// this.Writer is an OpenXmlWriter which was created from a WorksheetPart.
this.Writer.WriteElement(new Cell()
{
    CellValue = new CellValue(value),
    DataType = CellValues.String
});
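When this happened I paused the debugger (VS2013 in this case) and noticed that all threads were blocked around the same section of code. Some were deeper inside the OpenXML SDK, but all of them originated from a call to OpenXmlWriter.WriteElement. I dug through the source with JustDecompile but found no answers. It appears that an intermediate stream is in use which writes to isolated storage, and for some reason that is what blocks. The underlying stream for each of these is a FileStream. A screenshot in the original post showed all of the parallel tasks (8 in this case) blocked in or beneath the OpenXmlWriter.WriteElement method. Here is the full stack of one of the hung threads, with annotations: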
WindowsBase.dll!MS.Internal.IO.Packaging.PackagingUtilities.CreateUserScopedIsolatedStorageFileStreamWithRandomName Normal
WindowsBase.dll!MS.Internal.IO.Packaging.PackagingUtilities.CreateUserScopedIsolatedStorageFileStreamWithRandomName(int retryCount, out string fileName)
WindowsBase.dll!MS.Internal.IO.Packaging.SparseMemoryStream.EnsureIsolatedStoreStream()
//---> Why are we writing to isolated storage at all?
WindowsBase.dll!MS.Internal.IO.Packaging.SparseMemoryStream.SwitchModeIfNecessary()
WindowsBase.dll!MS.Internal.IO.Zip.ZipIOFileItemStream.Write(byte[] buffer, int offset, int count)
System.dll!System.IO.Compression.DeflateStream.WriteDeflaterOutput(bool isAsync)
System.dll!System.IO.Compression.DeflateStream.Write(byte[] array, int offset, int count)
WindowsBase.dll!MS.Internal.IO.Packaging.CompressStream.Write(byte[] buffer, int offset, int count)
WindowsBase.dll!MS.Internal.IO.Zip.ProgressiveCrcCalculatingStream.Write(byte[] buffer, int offset, int count)
WindowsBase.dll!MS.Internal.IO.Zip.ZipIOModeEnforcingStream.Write(byte[] buffer, int offset, int count)
System.Xml.dll!System.Xml.XmlUtf8RawTextWriter.FlushBuffer()
System.Xml.dll!System.Xml.XmlUtf8RawTextWriter.WriteAttributeTextBlock(char* pSrc, char* pSrcEnd)
System.Xml.dll!System.Xml.XmlUtf8RawTextWriter.WriteString(string text)
System.Xml.dll!System.Xml.XmlWellFormedWriter.WriteString(string text)
DocumentFormat.OpenXml.dll!DocumentFormat.OpenXml.OpenXmlElement.WriteAttributesTo(System.Xml.XmlWriter xmlWriter)
DocumentFormat.OpenXml.dll!DocumentFormat.OpenXml.OpenXmlElement.WriteTo(System.Xml.XmlWriter xmlWriter)
DocumentFormat.OpenXml.dll!DocumentFormat.OpenXml.OpenXmlPartWriter.WriteElement(DocumentFormat.OpenXml.OpenXmlElement elementObject)
//---> At this point, threads seem to be blocking.
MyProject.Common.dll!MyProject.Common.Export.ExcelWriter.WriteLine(string[] values) Line 117
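It is worth mentioning that although 8 things are being exported at once (in this case), each individual exporter writes to many files serially. A given export may have 150 underlying files that it is exporting to: the input data is segmented and only a portion is written to each file. Basically, I cache a large amount of data from the database, then read it one row at a time and push each row, serially, into the stream that should contain it. The point is that with 8 exporters running, perhaps 1,000 files will be written in total, but only 8 are being actively written at any given time.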
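Best answer
I know this question is quite old, but this is a known Microsoft issue involving OpenXml and IsolatedStorageFile. You can read about the workaround at http://support.microsoft.com/kb/951731:
The IsolatedStorageFile class is not thread safe, and the IsolatedStorageFile instance is static and shared across all PackagePart objects. So when multiple PackagePart streams that buffer their data through the IsolatedStorageFile object are written to (flushing included) at the same time, the thread-safety problem in the IsolatedStorageFile class surfaces and results in a deadlock.
The basic idea is to wrap the PackagePart stream and lock the writes to it.
The article points to a sample with a wrapper stream. Here is the implementation: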
using System.IO;
using System.Threading;

// Wraps a PackagePart stream and serializes Write and Flush across all instances.
public class PackagePartStream : Stream
{
    private readonly Stream _stream;

    // Static on purpose: the isolated storage behind every part stream is shared,
    // so one lock has to cover every wrapper instance.
    private static readonly Mutex Mutex = new Mutex(false);

    public PackagePartStream(Stream stream)
    {
        _stream = stream;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        return _stream.Seek(offset, origin);
    }

    public override void SetLength(long value)
    {
        _stream.SetLength(value);
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return _stream.Read(buffer, offset, count);
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        Mutex.WaitOne(Timeout.Infinite, false);
        try
        {
            _stream.Write(buffer, offset, count);
        }
        finally
        {
            // Release even if the write throws, so other writers are not blocked forever.
            Mutex.ReleaseMutex();
        }
    }

    public override bool CanRead
    {
        get { return _stream.CanRead; }
    }

    public override bool CanSeek
    {
        get { return _stream.CanSeek; }
    }

    public override bool CanWrite
    {
        get { return _stream.CanWrite; }
    }

    public override long Length
    {
        get { return _stream.Length; }
    }

    public override long Position
    {
        get { return _stream.Position; }
        set { _stream.Position = value; }
    }

    public override void Flush()
    {
        Mutex.WaitOne(Timeout.Infinite, false);
        try
        {
            _stream.Flush();
        }
        finally
        {
            Mutex.ReleaseMutex();
        }
    }

    // Stream.Close() calls Dispose(true), so cleanup lives here instead of in a Close override.
    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            _stream.Dispose();
        }
        base.Dispose(disposing);
    }
}
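To check that a single static gate really does serialize writes across separate wrapper instances, a small probe can help. The sketch below is not taken from the KB article: `GatedStream`, `ProbeStream` and `GateDemo` are hypothetical names, and the gate uses a C# `lock` on a static object instead of a `Mutex`, which is an assumption that all writers live in the same process.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical probe: records how many threads are inside Write at the same time.
public sealed class ProbeStream : MemoryStream
{
    private static readonly object StatsLock = new object();
    private static int _active;
    public static int MaxActive { get; private set; }

    public override void Write(byte[] buffer, int offset, int count)
    {
        lock (StatsLock) { _active++; if (_active > MaxActive) MaxActive = _active; }
        Thread.Sleep(1); // widen the window in which an overlap could be observed
        base.Write(buffer, offset, count);
        lock (StatsLock) { _active--; }
    }
}

// Hypothetical wrapper: same idea as PackagePartStream, with lock instead of Mutex.
public sealed class GatedStream : Stream
{
    private static readonly object Gate = new object(); // shared by all instances
    private readonly Stream _inner;

    public GatedStream(Stream inner) { _inner = inner; }

    public override bool CanRead { get { return _inner.CanRead; } }
    public override bool CanSeek { get { return _inner.CanSeek; } }
    public override bool CanWrite { get { return _inner.CanWrite; } }
    public override long Length { get { return _inner.Length; } }
    public override long Position
    {
        get { return _inner.Position; }
        set { _inner.Position = value; }
    }

    public override long Seek(long offset, SeekOrigin origin) { return _inner.Seek(offset, origin); }
    public override void SetLength(long value) { _inner.SetLength(value); }
    public override int Read(byte[] buffer, int offset, int count) { return _inner.Read(buffer, offset, count); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        lock (Gate) { _inner.Write(buffer, offset, count); }
    }

    public override void Flush()
    {
        lock (Gate) { _inner.Flush(); }
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) { _inner.Dispose(); }
        base.Dispose(disposing);
    }
}

public static class GateDemo
{
    // Returns the highest number of simultaneous inner writes observed so far.
    public static int Run(int writers, int writesEach)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = writers };
        Parallel.For(0, writers, options, i =>
        {
            using (var gated = new GatedStream(new ProbeStream()))
            {
                var payload = new byte[16];
                for (int n = 0; n < writesEach; n++)
                {
                    gated.Write(payload, 0, payload.Length);
                }
            }
        });
        return ProbeStream.MaxActive;
    }
}
```

Because every inner write happens while the gate is held, `GateDemo.Run(4, 10)` should report 1. Writing to the `ProbeStream` instances directly, without the wrapper, would normally report a larger number on a multi-core machine.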
Usage example:
var worksheetPart = document.WorkbookPart.AddNewPart<WorksheetPart>();
var workSheetWriter = OpenXmlWriter.Create(new PackagePartStream(worksheetPart.GetStream()));
workSheetWriter.WriteStartElement(new Worksheet());
//rest of your code goes here ...
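As a rough illustration of how the rest of such a method might look, here is a sketch that streams rows through the wrapped part stream. It assumes the `PackagePartStream` class from this answer is in scope and that the OpenXML SDK is referenced; `SheetExport`, `WriteSheet` and the `rows` parameter are hypothetical names and not part of the original code.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

public static class SheetExport
{
    // Hypothetical helper: writes every row as string cells into the given worksheet part.
    public static void WriteSheet(WorksheetPart worksheetPart, IEnumerable<string[]> rows)
    {
        // All writes and flushes to the part go through the locking wrapper.
        using (var partStream = new PackagePartStream(worksheetPart.GetStream()))
        using (var writer = OpenXmlWriter.Create(partStream))
        {
            writer.WriteStartElement(new Worksheet());
            writer.WriteStartElement(new SheetData());

            foreach (var values in rows)
            {
                writer.WriteStartElement(new Row());
                foreach (var value in values)
                {
                    // Same cell shape as in the question.
                    writer.WriteElement(new Cell
                    {
                        CellValue = new CellValue(value),
                        DataType = CellValues.String
                    });
                }
                writer.WriteEndElement(); // Row
            }

            writer.WriteEndElement(); // SheetData
            writer.WriteEndElement(); // Worksheet
        }
    }
}
```

The `using` blocks dispose the writer before the wrapper, so any buffered XML is flushed through the locked `Write` and `Flush` paths and never goes to the part stream directly.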
Regarding "c# - OpenXML hangs while writing elements", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/21482820/