Problem description
I need help extracting one column of numbers from many different files and writing it to an output file.
Specifically, I want to extract the second column ($2) from each file (file1.txt, file2.txt, etc.) according to the value of the first column, then place all extracted columns in one file, out.txt.
The problem is that the first column has different intervals in each file:
file1:
0.50 x1
1.25 x2
1.50 x3
1.75 x4
2.00 x5
file2:
0.25 y1
0.50 y2
1.00 y3
1.25 y4
2.00 y5
Desired output:
0.25 y1
0.50 x1 y2
1.00 y3
1.25 x2 y4
1.50 x3
1.75 x4
2.00 x5 y5
What matters here is the format of the number, not its value. The first column is one or more digits, a dot, and one or more digits, which you can write as the regex \d+\.\d+
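As a quick check of that format, here is a minimal sketch using `\d+\.\d+` (one or more digits before the dot, as in the Python code further down). The helper name `split_line` and the sample lines are made up for illustration:

```python
import re

# Digits, a dot, digits, then whitespace and the rest of the line.
PATTERN = re.compile(r"(\d+\.\d+)\s+(.*)")

def split_line(line):
    """Return (number_text, rest) if the line starts with a decimal number, else None."""
    m = PATTERN.match(line)
    if m is None:
        return None
    # The number is returned as text, so its original formatting is preserved.
    return m.group(1), m.group(2)

print(split_line("0.50 x1"))
print(split_line("header row"))
```

Keeping the matched number as text rather than converting it to a float is a design choice of this sketch: it preserves the way the number was written in the file.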
If your file has more columns, the best way to extract a specific column is to run it through awk first. That way you can set the column number in a variable:
$ var=3
$ ls -l | awk '{print $'$var'}'
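The same column selection can also be sketched in Python. This is only an illustration: `nth_column` is a hypothetical helper name, and it assumes whitespace-separated columns, like awk's default field splitting.

```python
def nth_column(lines, n):
    """Yield the n-th whitespace-separated field of each line (1-based, like awk's $n)."""
    for line in lines:
        fields = line.split()
        # awk prints an empty string when the field does not exist; mirror that here.
        yield fields[n - 1] if len(fields) >= n else ""

sample = ["0.50 x1 extra", "1.25 x2"]
for value in nth_column(sample, 2):
    print(value)
```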
I don't think this is a task for bash (I'm not saying it is impossible), so I wrote my solution in Python:
import re, sys

num = {}
files = ['file1', 'file2']
for file in files:
    # read the current file line by line; the file is closed automatically
    with open(file, 'r') as f:
        for line in f:
            # group 1: the number in the first column, group 2: the rest of the line
            cont = re.match(r"(\d+\.\d+)\s(.*)", line)
            if cont is not None:
                key = float(cont.group(1))
                if key not in num:
                    num[key] = []
                num[key].append(cont.group(2))
# print each first-column value followed by the list of values collected for it
for key in num:
    sys.stdout.write(str(key) + ' ')
    print(num[key])
file1:
0.5 x1
0.8 x2
0.3 x3
file2:
1.3 y1
0.5 y2
0.0 y3
output (the order of the keys depends on how the Python version orders dict entries; nothing is sorted):
0.5 ['x1', 'y2']
0.0 ['y3']
1.3 ['y1']
0.3 ['x3']
0.8 ['x2']
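The output above is neither sorted nor in the layout the question asks for. A possible variant is sketched below, under two assumptions: the first column is kept as text (so 0.50 stays 0.50, which only works if every file writes the same value the same way), and rows are ordered by numeric value. The names `merge_columns` and `write_merged` are illustrative, not part of the original answer.

```python
import re
from collections import defaultdict

LINE = re.compile(r"(\d+\.\d+)\s+(.*)")

def merge_columns(paths):
    """Collect the second column of every file, grouped by the first column's text."""
    merged = defaultdict(list)
    for path in paths:
        with open(path) as f:
            for line in f:
                m = LINE.match(line)
                if m:
                    merged[m.group(1)].append(m.group(2).strip())
    # Sort rows numerically, but print each key exactly as it was written.
    return ["%s %s" % (key, " ".join(merged[key]))
            for key in sorted(merged, key=float)]

def write_merged(paths, out_path):
    """Write the merged rows to out_path, one row per line."""
    with open(out_path, "w") as out:
        for row in merge_columns(paths):
            out.write(row + "\n")
```

With the files from the question, this would be called as write_merged(['file1.txt', 'file2.txt'], 'out.txt').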