Question
I have searched all the documentation I could find and still can't tell why there is a prefix, or what c000 means, in the file naming convention below:
file:/Users/stephen/p/spark/f1/part-00000-445036f9-7a40-4333-8405-8451faa44319-c000.snappy.parquet
Recommended answer
You should follow the "Talk is cheap, show me the code" approach. Not everything is documented, and sometimes the only way to find out is to read the source code.
Consider part-1-2_3-4.parquet:
1. Split/partition number.
2. Random UUID to prevent collisions between different (appending) write jobs.
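To make the layout concrete, here is a minimal Python sketch that splits a name of this shape into its parts. The regular expression and the helper `parse_part_file` are my own illustration, not anything Spark provides. The names `bucket_id` and `file_counter` for the last two slots are an assumption based on my reading of the Spark writer code; the list above only covers the first two.

```python
import re
import uuid

# Hypothetical pattern for names shaped like part-<1>-<2>[_<3>]-c<4>.<extension>.
# Slots 1 and 2 follow the answer above; the labels for slots 3 and 4 are assumptions.
PART_FILE = re.compile(
    r"^part-(?P<split>\d+)"                                       # 1: split/partition number
    r"-(?P<job_uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"  # 2: random UUID of the write job
    r"(?:_(?P<bucket_id>\d+))?"                                   # 3: optional, assumed to be a bucket id
    r"-c(?P<file_counter>\d+)"                                    # 4: assumed to be a per-task file counter
    r"\.(?P<extension>.+)$"                                       # codec and format, e.g. snappy.parquet
)


def parse_part_file(name):
    """Split a part-file name into its components, or raise ValueError."""
    match = PART_FILE.match(name)
    if match is None:
        raise ValueError(f"not a recognised part-file name: {name!r}")
    fields = match.groupdict()
    uuid.UUID(fields["job_uuid"])  # sanity check: raises ValueError if not a valid UUID
    return {
        "split": int(fields["split"]),
        "job_uuid": fields["job_uuid"],
        "bucket_id": None if fields["bucket_id"] is None else int(fields["bucket_id"]),
        "file_counter": int(fields["file_counter"]),
        "extension": fields["extension"],
    }


parsed = parse_part_file(
    "part-00000-445036f9-7a40-4333-8405-8451faa44319-c000.snappy.parquet"
)
print(parsed)
```

If the assumption about slot 4 holds, c000 would simply be the first file written by that task, with c001, c002 and so on appearing only when a task writes more than one file.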