A second question about processing log files that contain table patterns. I am analysing a large number of .dlg text files located in the working directory. Each file contains a table (usually near the end of the log) in the following format:
| | | | | |
Rank | Sub- | Run | Binding | Cluster | Reference | Grep
| Rank | | Energy | RMSD | RMSD | Pattern
1 1 7 -1.43 0.00 178.12 RANKING
1 2 18 -0.96 1.88 177.35 RANKING
2 1 4 -0.97 0.00 178.43 RANKING
3 1 13 -0.60 0.00 178.03 RANKING
4 1 5 -0.56 0.00 198.10 RANKING
5 1 16 +0.01 0.00 189.71 RANKING
6 1 3 +0.06 0.00 176.95 RANKING
7 1 19 +0.10 0.00 177.27 RANKING
8 1 17 +0.13 0.00 177.60 RANKING
9 1 8 +0.20 0.00 177.05 RANKING
10 1 20 +0.27 0.00 177.43 RANKING
11 1 10 +0.34 0.00 176.33 RANKING
12 1 6 +0.37 0.00 177.30 RANKING
13 1 9 +0.44 0.00 175.48 RANKING
14 1 2 +0.46 0.00 175.67 RANKING
15 1 11 +0.84 0.00 177.52 RANKING
15 2 12 +1.31 1.95 178.03 RANKING
16 1 14 +1.29 0.00 201.01 RANKING
17 1 15 +1.65 0.00 175.50 RANKING
18 1 1 +1.96 0.00 186.83 RANKING
Run time 3.909 sec
Idle time 0.817 sec
The aim is to loop over all the .dlg files and take a single line from each table, namely its first row (ignoring the header), omitting the last column (which is normally there for grep recognition). In the above example this is the first row after the header:
1 1 7 -1.43 0.00 178.12
Then I need to append this line to final_log.txt together with the name of the log file (which should come before it). Based on my very recent experience, a possible model for my Bash workflow (for processing several files) may be:
# name of the folder containing all *.dlg files to be analysed
# path to the folder with these *.dlg files
# make a final log
echo 'This is a list of processed files' > "$PWD/final_results.log"
# loop over all *.dlg files in order to extract the Clustering Histogram to the final log file
for f in $FILES
do
file_name2=$(basename "$f")
echo "Processing of $f..."
# here is an expression for GREP to take the line from the table and save it to >> $PWD/final_results.log
done
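One way the missing extraction step might be filled in is sketched below. It assumes that every data row of the table ends with the token RANKING and that no earlier line of a .dlg file does. The function names top_rank_row and summarize_dlg are hypothetical, introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. Assumption: table rows are the only lines whose last field is RANKING.

# Hypothetical helper: print the first RANKING row of one file, minus its last column.
top_rank_row() {
    # Emptying $NF rebuilds the record with single spaces; then trim the trailing space.
    awk '$NF == "RANKING" { $NF = ""; sub(/ +$/, ""); print; exit }' "$1"
}

# Hypothetical helper: summarize every *.dlg file in directory $1 into log file $2.
summarize_dlg() {
    local dir=$1 out=$2 f row
    echo 'This is a list of processed files' > "$out"
    for f in "$dir"/*.dlg; do
        [ -e "$f" ] || continue      # skip the literal pattern when nothing matches
        row=$(top_rank_row "$f")
        printf '%s %s\n' "$(basename "$f")" "$row" >> "$out"
    done
}
```

It could then be called as summarize_dlg "$PWD" "$PWD/final_results.log". Note that rebuilding the record collapses the original column spacing to single spaces.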
How about starting with this (assuming gawk, for nextfile):
gawk '$1~/[[:digit:]]/{ print FILENAME, substr($0,1,match($0,/[[:blank:]]+[^[:blank:]]+$/)-1);nextfile}' *.dlg
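The test on a leading digit fires on the first line whose first field contains any digit, which may be an earlier line if the log has one before the table. A hedged variant is sketched below that keys on the RANKING marker instead. It also avoids nextfile, so it should run under awk implementations other than gawk, at the cost of reading each file to the end. The function name dlg_top_ranks is hypothetical, and the sketch assumes RANKING appears as the last field only in table rows.

```shell
# Sketch only. Assumption: table rows are the only lines whose last field is RANKING.
# dlg_top_ranks is a hypothetical name; pass the .dlg files as arguments.
dlg_top_ranks() {
    awk '$NF == "RANKING" && !(FILENAME in seen) {
        seen[FILENAME] = 1                          # remember: first row already taken
        line = $0
        sub(/[ \t]+[^ \t]+[ \t]*$/, "", line)       # drop the last column
        sub(/^[ \t]+/, "", line)                    # drop leading indentation
        print FILENAME, line
    }' "$@"
}
```

For example, dlg_top_ranks *.dlg >> final_results.log would append one line per file, with the file name first.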