Extracting the first row of a table with bash

Problem description

A second question about processing log files that contain table patterns. I am analysing a large number of dlg text files located in the working directory. Each file contains a table (usually at the end of the log) in the following format:

 RMSD TABLE
    __________


_____________________________________________________________________
     |      |      |           |         |                 |
Rank | Sub- | Run  | Binding   | Cluster | Reference       | Grep
     | Rank |      | Energy    | RMSD    | RMSD            | Pattern
_____|______|______|___________|_________|_________________|___________
   1      1      7       -1.43      0.00    178.12           RANKING
   1      2     18       -0.96      1.88    177.35           RANKING
   2      1      4       -0.97      0.00    178.43           RANKING
   3      1     13       -0.60      0.00    178.03           RANKING
   4      1      5       -0.56      0.00    198.10           RANKING
   5      1     16       +0.01      0.00    189.71           RANKING
   6      1      3       +0.06      0.00    176.95           RANKING
   7      1     19       +0.10      0.00    177.27           RANKING
   8      1     17       +0.13      0.00    177.60           RANKING
   9      1      8       +0.20      0.00    177.05           RANKING
  10      1     20       +0.27      0.00    177.43           RANKING
  11      1     10       +0.34      0.00    176.33           RANKING
  12      1      6       +0.37      0.00    177.30           RANKING
  13      1      9       +0.44      0.00    175.48           RANKING
  14      1      2       +0.46      0.00    175.67           RANKING
  15      1     11       +0.84      0.00    177.52           RANKING
  15      2     12       +1.31      1.95    178.03           RANKING
  16      1     14       +1.29      0.00    201.01           RANKING
  17      1     15       +1.65      0.00    175.50           RANKING
  18      1      1       +1.96      0.00    186.83           RANKING

Run time 3.909 sec
Idle time 0.817 sec

The aim is to loop over all the .dlg files and take from each table the single line corresponding to its first row (ignoring the header), omitting the last column (normally provided for grep recognition). In the example table above, this is the first data row:

      1      1      7       -1.43      0.00    178.12

Then I need to append this line to final_log.txt together with the name of the log file (which should be written before it). Based on my very recent experience, a possible model for my bash workflow (for processing several files) may be:

#!/bin/bash
# name of the folder containing all *.dlg files to be analysed
prot='7000'
# all *.dlg files in that folder (an array keeps paths with spaces intact)
FILES=("$PWD/$prot"/*.dlg)
# make a final log
echo 'This is a list of processed files' > "$PWD/final_results.log"
# loop over all *.dlg files to extract the first row of the RMSD table to the final log file
for f in "${FILES[@]}"
do
  file_name2=$(basename "$f")
  # strip the .dlg suffix from the file name
  file_name="${file_name2%.dlg}"
  echo "Processing of $f..."
  # here is an expression for GREP to take the line from the table and save it to >> $PWD/final_results.log
done

Recommended answer

How about starting with this, assuming a gawk that supports nextfile:

gawk '$1~/[[:digit:]]/{ print FILENAME, substr($0,1,match($0,/[[:blank:]]+[^[:blank:]]+$/)-1);nextfile}' *.dlg
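
The one-liner above prints the file name and the first line whose first field contains a digit, cutting off the last whitespace-separated column, and then jumps to the next file. In a full log, earlier lines may also begin with a number, so a variant that keys on the RANKING marker in the last column may be more robust. The following is only a sketch under stated assumptions: the function name first_ranking_row is hypothetical, it assumes that table rows are the only lines whose last field is exactly RANKING, and it borrows the folder 7000 and the file final_results.log from the script in the question.

```shell
#!/bin/bash
# Sketch: for each given log, print "<log name> <first table row without the last column>".
# Assumption: only table rows end with the field RANKING (the "Grep Pattern" column).

first_ranking_row() {
  local f
  for f in "$@"; do
    [ -f "$f" ] || continue                  # skip a glob pattern that matched nothing
    awk -v name="$(basename "$f" .dlg)" '
      $NF == "RANKING" {
        sub(/[ \t]+RANKING[ \t]*$/, "")      # drop the grep-pattern column
        sub(/^[ \t]+/, "")                   # drop the leading padding
        print name, $0
        exit                                 # only the first row of the table is needed
      }' "$f"
  done
}

# Example call, using the paths from the question (hypothetical on other systems):
# first_ranking_row "$PWD"/7000/*.dlg >> "$PWD/final_results.log"
```

Because awk is started once per file and exits at the first match, this version does not depend on nextfile and should also work with awk implementations other than gawk.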
