C++ map performance: Linux (30 seconds) vs. Windows (30 minutes)!

Problem description



I need to process a list of files. The processing action should not be repeated for the same file. The code I am using for this is:

#include <map>
#include <string>
#include <vector>

using namespace std;

vector<File*> gInputFileList; //Can contain duplicates, File has member sFilename
map<string, File*> gProcessedFileList; //Using map to avoid linear search costs

void processFile(File* pFile)
{
    //Note: operator[] inserts a NULL entry if the filename is not in the map yet
    File* pProcessedFile = gProcessedFileList[pFile->sFilename];
    if(pProcessedFile != NULL)
        return; //Already processed

    foo(pFile); //foo() is the action to do for each file
    gProcessedFileList[pFile->sFilename] = pFile;
}

int main() //standard C++ requires main to return int
{
    size_t n = gInputFileList.size(); //Using array syntax (iterator syntax also gives identical performance)
    for(size_t i = 0; i < n; i++){
        processFile(gInputFileList[i]);
    }
    return 0;
}

The code works correctly, but...

My problem is that when the input size is 1000, it takes 30 minutes - HALF AN HOUR - on Windows/Visual Studio 2008 Express. For the same input, it takes only 40 seconds to run on Linux/gcc!

What could be the problem? The action foo() takes only a very short time to execute, when used separately. Should I be using something like vector::reserve for the map?
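On the `vector::reserve` question: `std::map` is a tree-based container and has no `reserve` member, so there is nothing to pre-size. What can be tightened is the lookup, because `operator[]` runs twice for every new file and also inserts a null entry on a miss. The following is only a minimal sketch of a single-lookup alternative. The stripped-down `File` struct and the `processOnce` helper are hypothetical names, not part of the code above, and the `action` parameter stands in for `foo()`. With around 1000 entries, neither version is likely to explain a 30-minute run.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the real File class; only sFilename is known.
struct File {
    std::string sFilename;
};

// Calls `action` once per distinct filename and returns how many
// distinct files were processed. `action` stands in for foo().
template <typename Action>
std::size_t processOnce(const std::vector<File*>& input, Action action) {
    std::set<std::string> seen;
    std::size_t processed = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        File* file = input[i];
        // insert() returns a pair; .second is false when the name was already present.
        if (seen.insert(file->sFilename).second) {
            action(file);
            ++processed;
        }
    }
    return processed;
}
```

A set of filenames is enough here because the stored `File*` is never read back; a `std::map` with `insert` would work the same way if the pointer is needed later.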

EDIT, EXTRA INFORMATION

What foo() does is:

1. it opens the file
2. reads it into memory
3. closes the file
4. the contents of the file in memory are parsed
5. it builds a list of tokens; I'm using a vector for that.
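To make the steps above concrete, here is a rough sketch of what such a function could look like. It is not the actual `foo()`: the names `readWholeFile` and `tokenize` are hypothetical, and whitespace splitting is only a placeholder for the real parser, which is not shown in the question.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Steps 1-3: open the file, read all of it into memory, close it.
// Returns false if the file could not be opened.
bool readWholeFile(const std::string& path, std::string& contents) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        return false;
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();
    contents = buffer.str();
    return true;  // the ifstream destructor closes the file
}

// Steps 4-5: parse the in-memory text and build the token list.
// Splitting on whitespace is an assumption made for illustration only.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::istringstream stream(text);
    std::string token;
    while (stream >> token) {
        tokens.push_back(token);  // repeated appends to the token vector
    }
    return tokens;
}
```

Almost all of the container work in a function shaped like this is the repeated `push_back`, which would be consistent with the program usually being found inside a vector append when interrupted.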

Whenever I break the program (while running the program with the 1000+ files input set): the call-stack shows that the program is in the middle of a std::vector add.

Solution

In Microsoft Visual Studio, Debug builds take a global lock when accessing the Standard C++ Library, to protect against multithreading issues. This can cause a big performance hit. For instance, our full test code runs in 50 minutes on Linux/gcc, whereas it needs 5 hours on Windows with VC++ 2008. Note that this performance hit does not exist when compiling in Release mode, using the non-debug Visual C++ runtime.
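One way to check whether the build configuration is the cause is to time the same container work in a Debug and a Release build. The sketch below is an illustration under stated assumptions: `buildFlavor` and `timeVectorAppends` are hypothetical helper names, and `std::chrono` needs a C++11 compiler, so this would require something newer than Visual Studio 2008. It relies on the fact that MSVC defines the `_DEBUG` macro when compiling against the debug runtime (`/MTd` or `/MDd`).

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

// Reports which runtime flavor the translation unit was compiled for.
// MSVC defines _DEBUG for /MTd and /MDd; other compilers normally do not.
inline const char* buildFlavor() {
#ifdef _DEBUG
    return "debug runtime";
#else
    return "non-debug runtime";
#endif
}

// Appends `count` small strings to a vector and returns the elapsed
// time in milliseconds, or -1.0 if the vector ended up the wrong size.
inline double timeVectorAppends(std::size_t count) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    std::vector<std::string> tokens;
    for (std::size_t i = 0; i < count; ++i) {
        tokens.push_back("token");
    }
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    if (tokens.size() != count) {
        return -1.0;
    }
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Printing `buildFlavor()` next to the result of `timeVectorAppends` for a large count, once per configuration, gives a rough comparison; the absolute numbers depend on the machine and compiler settings.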

