Problem description
I have two files: one holds my data, and the other is a list of line numbers that I want to extract from the data file. Can I use awk to read in the line numbers file and then print the lines of the data file whose numbers match?
Example:
Data file:
This is the first line of my data
This is the second line of my data
This is the third line of my data
This is the fourth line of my data
This is the fifth line of my data
Line numbers file:
1
4
5
Output:
This is the first line of my data
This is the fourth line of my data
This is the fifth line of my data
I've only ever used awk and sed on the command line for really simple stuff. This is way beyond me, and I have been googling for an hour without finding an answer.
Recommended answer
One way with sed:
sed 's/$/p/' linesfile | sed -n -f - datafile
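To make the pipeline above easier to follow, here is a minimal sketch that writes the generated sed script to a file instead of piping it, so it can be inspected. The temporary directory, the sample contents, and the names script.sed and sed_out are assumptions for illustration only; they are not part of the original answer.

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'first line' 'second line' 'third line' 'fourth line' 'fifth line' > "$tmp/datafile"
printf '%s\n' 1 4 5 > "$tmp/linesfile"

# Step 1: append "p" to every line number, which yields a sed script.
sed 's/$/p/' "$tmp/linesfile" > "$tmp/script.sed"
cat "$tmp/script.sed"    # the script holds 1p, 4p and 5p, one per line

# Step 2: run the script with -n so that only the addressed lines are printed.
sed_out=$(sed -n -f "$tmp/script.sed" "$tmp/datafile")
printf '%s\n' "$sed_out"

rm -rf "$tmp"
```

Each generated command has the form "address p", meaning "print the line with this number". The -n option suppresses the default output, so nothing else is printed.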
You can use the same trick with awk:
sed 's/^/NR==/' linesfile | awk -f - datafile
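The awk variant works the same way: the first sed turns the list of numbers into an awk program. The sketch below saves that program to a file so it can be inspected. The sample files and the names prog.awk and awk_out are assumptions made for this example.

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 'first line' 'second line' 'third line' 'fourth line' 'fifth line' > "$tmp/datafile"
printf '%s\n' 1 4 5 > "$tmp/linesfile"

# Prefix every line number with "NR==" to build an awk program.
sed 's/^/NR==/' "$tmp/linesfile" > "$tmp/prog.awk"
cat "$tmp/prog.awk"    # the program holds NR==1, NR==4 and NR==5, one per line

# A pattern without an action prints the matching record.
awk_out=$(awk -f "$tmp/prog.awk" "$tmp/datafile")
printf '%s\n' "$awk_out"

rm -rf "$tmp"
```

Because each generated line is a pattern with no action, awk falls back to its default action and prints the record whose number matches.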
Edit - Huge files alternative
With a huge number of lines, it is not prudent to keep whole files in memory. In that case, one solution is to sort the numbers file and read it one line at a time. The following has been tested with GNU awk:
extract.awk
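BEGIN {
    # Read the first wanted line number; getline returns -1 on error and 0 at end of file.
    if ((getline n < linesfile) < 0) {
        # ERRNO is a GNU awk extension that describes the failure.
        print "Unable to open linesfile '" linesfile "': " ERRNO > "/dev/stderr"
        exit 1
    }
}
NR == n {
    print
    # Fetch the next wanted line number; stop once the list is exhausted or unreadable.
    if ((status = (getline n < linesfile)) <= 0) {
        if (status < 0)
            print "Unable to read linesfile '" linesfile "': " ERRNO > "/dev/stderr"
        exit
    }
}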
Run it like this:
awk -v linesfile="$linesfile" -f extract.awk infile
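Since extract.awk compares NR against one pending number at a time, it relies on the numbers file being in ascending order. A repeated or out-of-order number would leave it waiting for a line that has already passed. The following is a minimal sketch of preparing the numbers file first. The sample numbers and the name linesfile.sorted are assumptions, and the final step only runs if extract.awk and infile exist in the current directory.

```shell
# Hypothetical unsorted numbers file containing a duplicate.
tmp=$(mktemp -d)
printf '%s\n' 5 1 4 4 > "$tmp/linesfile"

# -n sorts numerically (so 10 comes after 9); -u drops repeated numbers.
sort -n -u "$tmp/linesfile" > "$tmp/linesfile.sorted"
sorted=$(cat "$tmp/linesfile.sorted")
printf '%s\n' "$sorted"

# Assumption: extract.awk and the data file infile are in the current directory.
if [ -f extract.awk ] && [ -f infile ]; then
    awk -v linesfile="$tmp/linesfile.sorted" -f extract.awk infile
fi

rm -rf "$tmp"
```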
Testing:
echo "2
4
7
8
10
13" | awk -v linesfile=/dev/stdin -f extract.awk <(paste <(seq 50e3) <(seq 50e3 | tac))
Output:
2 49999
4 49997
7 49994
8 49993
10 49991
13 49988