Problem description
I am trying to use the command hdfs dfs -du -h
to list the sizes of files and folders. The command I run is hdfs dfs -du -h /path_name/folder_name
and the result looks like this:
9.2 G 27.5 G /path_name/folder_name/xxx01.parquet
0 0 /path_name/folder_name/xxx02.parquet
19.9 M 59.6 M /path_name/folder_name/xxx03.parquet
I know the Hadoop command line borrows a lot from general file system commands, and that -du -h
lists folder/file sizes in human-readable form. However, taking the first result line as an example, what do the two numbers 9.2 G and 27.5 G
mean, respectively?
Thanks!
Recommended answer
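Your cluster's replication factor appears to be 3. The first number is the plain size of the file, and the second is the disk space it consumes once all replicas are counted. For example, the actual file size is 9.2 GB; because the replication factor is 3, the size with replicas is 27.5 GB. (9.2 x 3 is 27.6 rather than 27.5 only because -h rounds the displayed values.)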
The output columns are, in order: size, disk space consumed with all replicas, full path.
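To check the relationship between the two columns yourself, here is a minimal Python sketch that parses lines in the format shown in the question and computes the ratio of the second number to the first. The function names (parse_du_line, to_bytes) are made up for illustration, and the unit multipliers assume that -h uses binary units (1 K = 1024 bytes). It only handles the three-column layout shown above.

```python
import re

# Multipliers for the unit suffixes printed by -h (binary units assumed).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4, "P": 1024**5}

# Matches: <size> [unit] <consumed> [unit] <path>
LINE_RE = re.compile(
    r"^\s*([\d.]+)\s*([KMGTP]?)\s+([\d.]+)\s*([KMGTP]?)\s+(\S.*)$"
)


def to_bytes(number, unit):
    """Convert a displayed value such as ('9.2', 'G') to an approximate byte count."""
    return float(number) * UNITS[unit]


def parse_du_line(line):
    """Return (path, size_bytes, consumed_bytes, ratio) for one line of du -h output.

    ratio is consumed / size, or None for an empty file.
    """
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("unrecognized du line: %r" % line)
    size = to_bytes(m.group(1), m.group(2))
    consumed = to_bytes(m.group(3), m.group(4))
    ratio = consumed / size if size else None
    return m.group(5), size, consumed, ratio


if __name__ == "__main__":
    sample = [
        "9.2 G 27.5 G /path_name/folder_name/xxx01.parquet",
        "0 0 /path_name/folder_name/xxx02.parquet",
        "19.9 M 59.6 M /path_name/folder_name/xxx03.parquet",
    ]
    for line in sample:
        path, size, consumed, ratio = parse_du_line(line)
        shown = "n/a (empty file)" if ratio is None else "%.2f" % ratio
        print(path, "ratio:", shown)
```

For the sample lines, the ratio comes out just under 3 rather than exactly 3, because the values displayed by -h are already rounded. Rounding the ratio to the nearest integer gives the replication factor of 3.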