Problem description
I have a file with 8 columns using "|" as a delimiter, and I want to count the occurrence frequency of the words in the 8th column. I tried awk like this:
awk -F '{print $8}' | sort | uniq -c $FILE
but instead the whole file gets printed, and I can't understand what I am doing wrong.
EDIT: Now I get the output I want, as shown below:
1
2307 Internet Explorer
369 Safari
2785 Chrome
316 Opera
4182 Firefox
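but I can't understand where this "1" comes from.
Solution
You can just use awk to do this:
awk -F '|' '{freq[$8]++} END{for (i in freq) print freq[i], i}' file
This awk command uses | as the delimiter and an array freq keyed by $8. Each time it reads a record, it increments the count stored under that record's $8 value by 1.
Btw, you need to add the custom delimiter | to your command and pass the file to awk, like this:
awk -F '|' '{print $8}' file | sort | uniq -c
In your version, -F has no delimiter argument, so it consumes the program text as the field separator, and uniq -c $FILE reads the file directly instead of the pipe, which is why the whole file was printed.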
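As a rough sketch of how the counting behaves, the snippet below runs the awk approach on a small made-up file. The sample rows, the temporary file, the `NF >= 8 && $8 != ""` guard, and the final sort are all assumptions added for illustration and are not part of the original answer. The guard reflects one plausible explanation for the stray "1" in the question: a single record with an empty 8th field (for example a blank line) would be counted under an empty key.

```shell
#!/bin/sh
# Hypothetical sample data: 8 pipe-delimited columns, browser name in column 8.
# The blank line stands in for a record with no 8th field.
sample=$(mktemp)
cat > "$sample" <<'EOF'
a|b|c|d|e|f|g|Firefox
a|b|c|d|e|f|g|Chrome
a|b|c|d|e|f|g|Firefox

a|b|c|d|e|f|g|Internet Explorer
EOF

# Count the values in column 8. The pattern skips records that have fewer
# than 8 fields or an empty 8th field, which would otherwise show up as a
# count with no name next to it.
counts=$(awk -F '|' 'NF >= 8 && $8 != "" {freq[$8]++}
                     END {for (k in freq) print freq[k], k}' "$sample" |
         LC_ALL=C sort -k1,1nr -k2)

# awk's "for (k in freq)" order is unspecified, so the sort above makes the
# listing stable: highest count first, ties broken by name.
printf '%s\n' "$counts"

rm -f "$sample"
```

On this sample the blank line is ignored, and Firefox, with two occurrences, is listed first. If the guard is dropped, the blank line is counted under an empty key and appears as a bare count.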