Question
I am storing two million files in an Amazon S3 bucket. There is a given root (l1), a list of directories under l1, and each directory contains files. So my bucket looks something like the following:
l1/a1/file1-1.jpg
l1/a1/file1-2.jpg
l1/a1/... another 500 files
l1/a2/file2-1.jpg
l1/a2/file2-2.jpg
l1/a2/... another 500 files
....
l1/a5000/file5000-1.jpg
I would like to list the second-level entries as fast as possible, so I would like to get a1, a2, ..., a5000. I do not want to list all the keys, since that would take much longer.
I am open to using the AWS API directly; however, so far I have played with the right_aws gem in Ruby: http://rdoc.info/projects/rightscale/right_aws
There are at least two APIs in that gem. I tried using bucket.keys() in the S3 module and incrementally_list_bucket() in the S3Interface module. I can set the prefix and delimiter to list all of l1/a1/*, for example, but I cannot figure out how to list just the first level under l1. There is a :common_prefixes entry in the hash returned by incrementally_list_bucket(), but in my test sample it is not filled in.
Is this operation possible with the S3 API?
Thanks!
Recommended answer
right_aws allows you to do this as part of its underlying S3Interface class, but you can create your own method for easier (and nicer) use. Put this at the top of your code:
module RightAws
  class S3
    class Bucket
      # Returns the common prefixes found under `prefix`, grouped by `delimiter`.
      def common_prefixes(prefix, delimiter = '/')
        common_prefixes = []
        # The block is called once per chunk of the listing.
        @s3.interface.incrementally_list_bucket(@name, { 'prefix' => prefix, 'delimiter' => delimiter }) do |thislist|
          common_prefixes += thislist[:common_prefixes]
        end
        common_prefixes
      end
    end
  end
end
This adds the common_prefixes method to the RightAws::S3::Bucket class. Now, instead of calling mybucket.keys to fetch the list of keys in your bucket, you can use mybucket.common_prefixes to get an array of common prefixes. In your case:
mybucket.common_prefixes("l1/")
# => ["l1/a1/", "l1/a2/", ... "l1/a5000/"]
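To see why a prefix combined with a delimiter yields only the second-level entries, here is a small local simulation of how S3 groups keys into common prefixes. It makes no network calls; the helper name `simulate_common_prefixes` and the sample keys are assumptions made up for this illustration and are not part of right_aws or the S3 API. Note that each common prefix S3 reports ends with the delimiter.

```ruby
require 'set'

# Hypothetical helper: imitates S3's grouping rule on a local array of keys.
# Every key that starts with `prefix` and contains `delimiter` afterwards is
# rolled up into one entry that ends at (and includes) that delimiter.
def simulate_common_prefixes(keys, prefix, delimiter = '/')
  prefixes = Set.new
  keys.each do |key|
    next unless key.start_with?(prefix)
    rest = key[prefix.length..-1]
    idx = rest.index(delimiter)
    next if idx.nil? # no delimiter after the prefix: S3 would list it as a plain key
    prefixes << prefix + rest[0, idx + delimiter.length]
  end
  prefixes.to_a.sort
end

# Illustrative sample keys, not real bucket contents.
keys = [
  'l1/a1/file1-1.jpg',
  'l1/a1/file1-2.jpg',
  'l1/a2/file2-1.jpg',
  'l1/readme.txt',
  'other/b1/x.jpg'
]

p simulate_common_prefixes(keys, 'l1/')
# prints ["l1/a1/", "l1/a2/"]
```

The 500 files under each directory collapse into a single entry, which is why this kind of listing is much cheaper than enumerating every key.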
I must say I tested it only with a small number of common prefixes; S3 returns at most 1000 entries per listing request, so you should check that this works with more than 1000 common prefixes.
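The concern about more than 1000 common prefixes comes down to pagination: a caller has to keep requesting pages until the listing is no longer truncated. The sketch below illustrates that loop against an in-memory stand-in. `FakeLister`, `list_page`, `all_common_prefixes`, and the generated prefix names are hypothetical and exist only for this illustration; the code does not call right_aws or S3, so it says nothing about how the gem itself behaves.

```ruby
# Hypothetical in-memory stand-in for a paginated listing service.
# It is NOT the right_aws or S3 API; it only imitates the
# "page + truncated flag + marker" shape of a paged listing.
class FakeLister
  def initialize(prefixes, page_size)
    @prefixes = prefixes.sort
    @page_size = page_size
  end

  # Returns one page of entries that sort strictly after `marker`.
  def list_page(marker = nil)
    remaining = marker ? @prefixes.select { |p| p > marker } : @prefixes
    page = remaining.first(@page_size)
    {
      common_prefixes: page,
      is_truncated: remaining.length > page.length,
      next_marker: page.last
    }
  end
end

# Keeps requesting pages until the listing is no longer truncated.
def all_common_prefixes(lister)
  result = []
  marker = nil
  loop do
    page = lister.list_page(marker)
    result.concat(page[:common_prefixes])
    break unless page[:is_truncated]
    marker = page[:next_marker]
  end
  result
end

# 2500 made-up prefixes, zero-padded so string order matches numeric order.
prefixes = (1..2500).map { |i| format('l1/a%04d/', i) }
lister = FakeLister.new(prefixes, 1000)
puts all_common_prefixes(lister).length
# prints 2500
```

A practical check against the real bucket is to compare the length of the array returned by mybucket.common_prefixes("l1/") with the number of directories you expect.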