

I would like to know whether there is any command or expression that returns only the file name in Hadoop. I need just the name of each file, but hadoop fs -ls prints the whole path.


I tried the command below, but I am wondering whether there is a better way to do it.

hadoop fs -ls <HDFS_DIR>|cut -d ' ' -f17


It seems hadoop fs -ls does not support any option to output just the file names, or even just the last column.


If you want to get the last column reliably, first squeeze every run of spaces into a single space, so that you can then address the last column by its field number:

hadoop fs -ls | sed '1d;s/  */ /g' | cut -d\  -f8
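To see what the squeeze step does without a cluster at hand, here is a minimal sketch that runs the same sed and cut stages over a hard-coded listing. The function name last_column and the sample listing are made up for illustration; the sample only imitates the usual eight-column layout of hadoop fs -ls output.

```shell
#!/bin/sh
# Hypothetical helper: reads `hadoop fs -ls`-style text on stdin and
# prints the 8th column (the path) of every line except the first.
last_column() {
    # 1d        drops the "Found N items" header line
    # s/  */ /g squeezes each run of spaces into a single space
    sed '1d;s/  */ /g' | cut -d ' ' -f8
}

# Made-up sample listing; real output comes from `hadoop fs -ls <dir>`.
last_column <<'EOF'
Found 2 items
-rw-r--r--   3 alice supergroup       1234 2015-08-04 23:42 /tmp/a.txt
drwxr-xr-x   - alice supergroup          0 2015-08-04 23:40 /tmp/logs
EOF
```

Without the squeeze, cut would count every single space as a field separator, which is why the field number in the question ended up as 17 and would shift whenever a column width changed.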


This gives you just the last column, but each file still carries its whole path. If you want just the file names, you can use basename, as @rojomoke suggests:

hadoop fs -ls | sed '1d;s/  */ /g' | cut -d\  -f8 | xargs -n 1 basename
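One caveat with the xargs step: by default xargs splits its input on whitespace and interprets quote characters, so a path containing either can be mangled before basename ever sees it. A possible alternative, sketched here under the assumption that no path contains a newline, is to strip everything up to the last slash with sed. The function name strip_dirs is hypothetical, and the sample paths stand in for the output of the earlier pipeline stages.

```shell
#!/bin/sh
# Hypothetical helper: reads one path per line on stdin and prints only
# the part after the last slash, without going through xargs.
strip_dirs() {
    # .* is greedy, so this removes everything up to the LAST slash
    sed 's|.*/||'
}

# Made-up sample paths; the second one would make a plain xargs fail
# because of the unmatched single quote.
printf '%s\n' '/tmp/a.txt' "/tmp/it's here.log" | strip_dirs
```

This replaces only the basename stage; the cut stage before it still truncates names that contain spaces.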

I also filtered out the first line, which says Found ?x items.

Note: beware that, as @felix-frank points out in the comments, the above command will not correctly preserve file names with multiple consecutive spaces. Hence a more correct solution, proposed by Felix:

hadoop fs -ls /tmp | sed 1d | perl -wlne'print +(split " ",$_,8)[7]'

