
Problem description

Does a block in the Hadoop Distributed File System (HDFS) store multiple small files, or does a block store only one file?

Solution

Multiple files are not stored in a single block: each block holds data from exactly one file. A single file, on the other hand, can be stored across multiple blocks. The mapping between each file and its block IDs is persisted in the NameNode.
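To make the file-to-block relationship concrete, here is a minimal sketch in plain Python. It is a toy model, not the HDFS API: the 128 MB block size is an assumption (a common default for `dfs.blocksize`), and the paths, sizes, and the `block_lengths` and `build_block_map` helpers are hypothetical names made up for illustration.

```python
MB = 1024 * 1024
BLOCK_SIZE = 128 * MB  # assumption: a common default for dfs.blocksize


def block_lengths(file_size, block_size=BLOCK_SIZE):
    """Return the length in bytes of each block a file would be split into."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    full_blocks, remainder = divmod(file_size, block_size)
    lengths = [block_size] * full_blocks
    if remainder:
        # The last block only holds the leftover bytes; it is not padded.
        lengths.append(remainder)
    return lengths


def build_block_map(files, block_size=BLOCK_SIZE):
    """Toy stand-in for the NameNode's file -> block-id mapping."""
    block_map = {}
    next_block_id = 0
    for path, size in files.items():
        block_ids = []
        for _ in block_lengths(size, block_size):
            # Every block gets a fresh id, so no block is shared between files.
            block_ids.append(next_block_id)
            next_block_id += 1
        block_map[path] = block_ids
    return block_map


# Hypothetical paths and sizes, for illustration only.
files = {
    "/data/small-a.txt": 4 * 1024,
    "/data/small-b.txt": 10 * 1024,
    "/data/big.log": 300 * MB,
}
block_map = build_block_map(files)

for path, block_ids in block_map.items():
    print(path, block_ids)
```

In this model the two small files each get their own block even though together they would fit in a fraction of one, while the 300 MB file spans three blocks.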

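According to Hadoop: The Definitive Guide, a file in HDFS that is smaller than a single block does not occupy a full block's worth of underlying storage.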

HDFS is designed to handle large files. If there are too many small files, the NameNode can become overloaded, because it holds the HDFS namespace (every file, directory, and block) in memory. Check this article on how to alleviate the problem with too many small files.
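A rough back-of-the-envelope sketch shows why many small files load the NameNode. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode memory per namespace object (file or block); that figure is an approximation, directories are ignored, and `estimate_namenode_bytes` is a hypothetical helper, not part of Hadoop.

```python
BYTES_PER_OBJECT = 150  # assumption: rule-of-thumb cost per file or block object


def estimate_namenode_bytes(num_files, blocks_per_file,
                            bytes_per_object=BYTES_PER_OBJECT):
    """Estimate NameNode memory: one object per file plus one per block."""
    objects = num_files * (1 + blocks_per_file)
    return objects * bytes_per_object


# Roughly the same data volume stored two ways (illustrative numbers):
# 10 million files of about 1 MB, each fitting in a single block ...
small_files = estimate_namenode_bytes(10_000_000, 1)
# ... versus 10 thousand files of about 1 GB, 8 blocks each at 128 MB.
large_files = estimate_namenode_bytes(10_000, 8)

print(f"small files: {small_files / 1e9:.2f} GB of NameNode memory")
print(f"large files: {large_files / 1e6:.2f} MB of NameNode memory")
```

Under these assumptions the small-file layout costs the NameNode on the order of gigabytes, while the large-file layout costs megabytes for a similar amount of data.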


08-06 10:10