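I have a bunch of pipe-delimited files that were not properly escaped for carriage returns when they were generated, so I can't use CR or newline characters to delimit the rows. I do know, however, that each record must have exactly 7 fields.
Splitting the fields is easy with the CSV library in Ruby 1.9 by setting the 'col_sep' parameter, but the 'row_sep' parameter can't be set because there are newlines inside the fields.
Is there a way to parse a pipe-delimited file using a fixed number of fields as the row delimiter?
Thanks!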
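Best answer
Here's one way to do it:
Build a sample string of seven words with an embedded newline in the middle of the string. There are three lines' worth of data.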
text = (["now is the\ntime for all good"] * 3).join(' ').gsub(' ', '|')
puts text
# >> now|is|the
# >> time|for|all|good|now|is|the
# >> time|for|all|good|now|is|the
# >> time|for|all|good
Here's the process:
lines = []
chunks = text.gsub("\n", '|').split('|')
while (chunks.any?)
  lines << chunks.slice!(0, 7).join(' ')
end
puts lines
# >> now is the time for all good
# >> now is the time for all good
# >> now is the time for all good
So, that shows we can rebuild the rows.
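The same regrouping can also be sketched without emptying the array as it goes. This is only an alternative illustration, not part of the original answer: it uses `each_slice` from Ruby's `Enumerable`, and the variable names `fields` and `rows` are chosen here for the example.

```ruby
# Same sample data as above: three 7-field records with a newline inside each.
text = (["now is the\ntime for all good"] * 3).join(' ').gsub(' ', '|')

# Treat newlines as field separators, then split into individual fields.
fields = text.gsub("\n", '|').split('|')

# each_slice groups the fields seven at a time without mutating `fields`.
rows = fields.each_slice(7).to_a

puts rows.map { |row| row.join(' ') }
# >> now is the time for all good
# >> now is the time for all good
# >> now is the time for all good
```

Because `fields` is left intact, it can still be inspected afterwards, which `slice!` does not allow.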
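Assuming the words are actually columns from a pipe-delimited file, we can make the code do the real thing by removing the .join(' '):
require 'awesome_print' # third-party gem that provides ap

lines = []
chunks = text.gsub("\n", '|').split('|') # rebuild chunks; slice! emptied it above
while (chunks.any?)
  lines << chunks.slice!(0, 7)
end
ap lines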
# >> [
# >> [0] [
# >> [0] "now",
# >> [1] "is",
# >> [2] "the",
# >> [3] "time",
# >> [4] "for",
# >> [5] "all",
# >> [6] "good"
# >> ],
# >> [1] [
# >> [0] "now",
# >> [1] "is",
# >> [2] "the",
# >> [3] "time",
# >> [4] "for",
# >> [5] "all",
# >> [6] "good"
# >> ],
# >> [2] [
# >> [0] "now",
# >> [1] "is",
# >> [2] "the",
# >> [3] "time",
# >> [4] "for",
# >> [5] "all",
# >> [6] "good"
# >> ]
# >> ]
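For real files it may help to wrap this in a method that also checks the field count. The following is a minimal sketch under some assumptions: the helper name `regroup_records` and the constant `FIELDS_PER_RECORD` are made up for this example, and, as in the code above, every newline is treated as a field boundary.

```ruby
FIELDS_PER_RECORD = 7

# Hypothetical helper: regroups a newline-damaged, pipe-delimited string
# into records of `width` fields each.
def regroup_records(text, width = FIELDS_PER_RECORD)
  # Drop one trailing line ending, then turn remaining newlines into pipes.
  # The -1 limit keeps trailing empty fields, which split('|') would drop.
  fields = text.chomp.gsub(/\r?\n/, '|').split('|', -1)

  # A leftover partial record means the input does not fit the assumption.
  unless (fields.size % width).zero?
    raise ArgumentError, "#{fields.size} fields is not a multiple of #{width}"
  end

  fields.each_slice(width).to_a
end

p regroup_records("a|b|c\nd|e|f|\n")
# >> [["a", "b", "c", "d", "e", "f", ""]]
```

If a newline can occur inside a field value rather than between fields, that value would be split in two and the count check would likely fail, so the error acts as an early warning that the data needs closer inspection.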
Regarding "ruby - read a fixed number of pipe-delimited fields per row?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4083690/