Using awk to pull specific lines from a file


Problem description

I have two files, one file is my data, and the other file is a list of line numbers that I want to extract from my data file. Can I use awk to read in my lines file, and then extract the lines that match the line numbers?

Example:

Data file:

This is the first line of my data
This is the second line of my data
This is the third line of my data
This is the fourth line of my data
This is the fifth line of my data

Line numbers file:

1
4
5

Output:

This is the first line of my data
This is the fourth line of my data
This is the fifth line of my data

I've only ever used command line awk and sed for really simple stuff. This is way beyond me and I have been googling for an hour without an answer.

Recommended answer

One way to do it with sed:

sed 's/$/p/' linesfile | sed -n -f - datafile
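
To see why this works, it can help to look at the intermediate sed script. The following is only a sketch: the scratch directory and the `script.sed` file name are assumptions made for illustration, and the script is written to a file instead of being piped through `-f -`, in case the local sed does not accept `-` as a script file.

```shell
# Hypothetical scratch files mirroring the example in the question
tmpdir=$(mktemp -d)
printf '%s\n' 1 4 5 > "$tmpdir/linesfile"
printf 'This is the %s line of my data\n' first second third fourth fifth > "$tmpdir/datafile"

# Step 1: append "p" to every line number, which turns the list into a sed script
sed 's/$/p/' "$tmpdir/linesfile" > "$tmpdir/script.sed"
script=$(cat "$tmpdir/script.sed")
printf '%s\n' "$script"

# Step 2: run the generated script; -n suppresses default output,
# so only the addressed lines (1, 4 and 5) are printed
result=$(sed -n -f "$tmpdir/script.sed" "$tmpdir/datafile")
printf '%s\n' "$result"
```

Each generated line such as `4p` is a sed command meaning "print line 4".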

You can use the same trick with awk:

sed 's/^/NR==/' linesfile | awk -f - datafile
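
Here the generated awk program consists of patterns such as `NR==4`, each of which prints the matching record by default. A different, commonly used awk idiom (not part of the answer above) reads the line numbers into an array first; the sketch below assumes the same two example files, created in a hypothetical scratch directory, and keeps the whole list of numbers in memory.

```shell
# Hypothetical scratch files mirroring the example in the question
tmpdir=$(mktemp -d)
printf '%s\n' 1 4 5 > "$tmpdir/linesfile"
printf 'This is the %s line of my data\n' first second third fourth fifth > "$tmpdir/datafile"

# While reading the first file (NR == FNR), remember each wanted line number.
# While reading the second file, print records whose line number was remembered.
picked=$(awk 'NR == FNR { want[$1]; next } FNR in want' "$tmpdir/linesfile" "$tmpdir/datafile")
printf '%s\n' "$picked"
```

One caveat with this idiom: if the line numbers file is empty, `NR == FNR` also holds for the data file, so nothing is printed.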

Edit - Huge files alternative

When the number of lines is huge, it is not prudent to keep whole files in memory. One solution in that case is to sort the line-numbers file and read it one line at a time. The following has been tested with GNU awk:

extract.awk

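BEGIN {
  # Read the first wanted line number; getline returns -1 on error
  if ((getline n < linesfile) < 0) {
    print "Unable to open linesfile '" linesfile "': " ERRNO > "/dev/stderr"
    exit 1
  }
}

NR == n {
  print
  # Fetch the next wanted line number; stop at end of file (0) or on error (-1)
  if ((getline n < linesfile) <= 0) {
    if (length(ERRNO))
      print "Unable to read linesfile '" linesfile "': " ERRNO > "/dev/stderr"
    exit
  }
}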

Run it like this:

awk -v linesfile="$linesfile" -f extract.awk infile

Testing:

echo "2
4
7
8
10
13" | awk -v linesfile=/dev/stdin -f extract.awk <(paste <(seq 50e3) <(seq 50e3 | tac))

Output:

2   49999
4   49997
7   49994
8   49993
10  49991
13  49988
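
Because extract.awk reads one number at a time and compares it with NR, it relies on the numbers being in ascending order without duplicates. A minimal sketch of preparing such a file with `sort`, assuming an unsorted list in a hypothetical scratch directory:

```shell
# Hypothetical unsorted list of line numbers with a duplicate entry
tmpdir=$(mktemp -d)
printf '%s\n' 5 1 4 1 > "$tmpdir/linesfile"

# -n sorts numerically, -u drops repeated numbers
sort -n -u "$tmpdir/linesfile" > "$tmpdir/linesfile.sorted"
sorted=$(cat "$tmpdir/linesfile.sorted")
printf '%s\n' "$sorted"
```

The sorted file can then be passed to the script through `-v linesfile=...` as shown earlier.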
