Why is Sort-Object on Import-Csv slow for 1 million records?

Problem description

I need to sort CSV files by their first column (the column may differ). My CSV files have more than a million records, and the command below takes 10 minutes to run.

Is there any other way to optimize the code to speed up execution?

$CsvFile    = "D:\Performance\10_lakh_records.csv"
$OutputFile = "D:\Performance\output.csv"

Import-Csv $CsvFile | Sort-Object { $_.psobject.Properties.Value[1] } | Export-Csv -Encoding default -Path $OutputFile -NoTypeInformation

Recommended answer

You could try the [array]::Sort() static method, which might prove faster than Sort-Object, although it takes an extra step: you first need a one-dimensional array of all the values to sort on.
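To see what the two-array overload does in isolation, here is a minimal sketch on a tiny in-memory data set. The sample rows and the column names Id and Name are made up for illustration; they are not from the original question.

```powershell
# Hypothetical sample data standing in for the real CSV file
$data = @'
Id,Name
3,Charlie
1,Alice
2,Bob
'@ | ConvertFrom-Csv

# Name of the first column, read from the first record
$column = $data[0].PSObject.Properties.Name[0]

# Sort(keys, items): the first array holds the keys, the second array
# ($data) is reordered in place to follow the sorted keys
[array]::Sort($data.$column, $data)

# Collect the names in their new order
$sortedNames = $data.Name -join ','
$sortedNames
```

Note that Import-Csv and ConvertFrom-Csv return every value as a string, so the keys are compared as text. If the column holds numbers, a typed key array such as [int[]]$keys = $data.$column may be needed to get numeric order; whether that applies to your file is an assumption to check against your data.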

Try:

$CsvFile    = "D:\Performance\10_lakh_records.csv"
$OutputFile = "D:\Performance\output.csv"

# import the data
$data = Import-Csv -Path $CsvFile

# determine the name of the column to sort on (in this demo, the first column).
# if you already know the column name, skip this step and use the name as-is
$column = $data[0].PSObject.Properties.Name[0]

# use the Sort(Array, Array) overload method to sort the data by the
# values of the column you have chosen.
# see https://docs.microsoft.com/en-us/dotnet/api/system.array.sort?view=net-5.0#System_Array_Sort_System_Array_System_Array_
[array]::Sort($data.$column, $data)

$data | Export-Csv -Encoding default -Path $OutputFile -NoTypeInformation
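Whether this is really faster will depend on your data and PowerShell version, so it is worth measuring rather than assuming. The following is a rough sketch that times both approaches with Measure-Command on synthetic records; the row count and the property names Key and Value are assumptions chosen for the example, not part of the original question.

```powershell
# Synthetic records standing in for a real CSV import; the row count and
# the property names Key and Value are assumptions made for this sketch
$rows = foreach ($i in 1..20000) {
    [pscustomobject]@{ Key = [string](Get-Random -Maximum 1000000); Value = $i }
}

# Time the pipeline approach (Measure-Command discards the sorted output)
$sortObjectTime = Measure-Command { $rows | Sort-Object -Property Key }

# Time the static method on a copy, because it sorts the array in place
$copy = $rows.Clone()
$arraySortTime = Measure-Command { [array]::Sort($copy.Key, $copy) }

"Sort-Object  : {0:N0} ms" -f $sortObjectTime.TotalMilliseconds
"[array]::Sort: {0:N0} ms" -f $arraySortTime.TotalMilliseconds
```

The timings cover only the sort step. For a million-row file, Import-Csv and Export-Csv take time of their own, so timing the whole script as well gives a fuller picture.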

08-30 13:58