Problem description
I'm trying to restrict access to an S3 bucket, allowing only certain domains from a list, based on the referer.
The bucket policy is basically:
{
    "Version": "2012-10-17",
    "Id": "http referer domain lock",
    "Statement": [
        {
            "Sid": "Allow get requests originating from specific domains",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example.com/*",
            "Condition": {
                "StringLike": {
                    "aws:Referer": [
                        "*othersite1.com/*",
                        "*othersite2.com/*",
                        "*othersite3.com/*"
                    ]
                }
            }
        }
    ]
}
The sites othersite1, othersite2, and othersite3 request an object that I have stored in my S3 bucket under the domain example.com. I also have a CloudFront distribution attached to the bucket. I'm using a * wildcard before and after the string in the condition. The referer can be, for example, othersite1.com/folder/another-folder/page.html. The referer may also use http or https.
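To check that the wildcard patterns themselves are not what causes the 403, the matching can be approximated locally. This is only a sketch: `wildcardToRegExp` and `refererAllowed` are hypothetical helpers that imitate the `StringLike` operator (`*` matches any run of characters, `?` matches one character), not AWS's actual evaluator.

```javascript
'use strict';

// Hypothetical helper: convert an IAM-style wildcard pattern into a RegExp.
// '*' matches any run of characters, '?' matches exactly one character.
function wildcardToRegExp(pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
}

// The patterns from the bucket policy above.
const patterns = ['*othersite1.com/*', '*othersite2.com/*', '*othersite3.com/*'];

function refererAllowed(referer) {
  return patterns.some((p) => wildcardToRegExp(p).test(referer));
}

console.log(refererAllowed('https://othersite1.com/folder/another-folder/page.html')); // true
console.log(refererAllowed('http://www.othersite2.com/index.html'));                  // true
console.log(refererAllowed('https://unrelated.example/page.html'));                    // false
// The leading '*' is loose: the domain may appear anywhere, including in a path.
console.log(refererAllowed('https://unrelated.example/othersite1.com/'));              // true
```

The patterns accept both http and https referers with arbitrary paths, so a 403 on a legitimate request suggests the Referer value is not reaching S3 at all. The last case also shows that a leading `*` is more permissive than it may look.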
I don't know why I'm getting a 403 Forbidden error.
I'm doing this basically because I don't want other sites to call that object.
Any help would be greatly appreciated.
Recommended answer
As is necessary for correct caching behavior, CloudFront strips almost all of the request headers off of a request before forwarding it to the origin server.
So, if your bucket is trying to block requests based on the referring page, as is sometimes done to prevent hotlinking, S3 will not -- by default -- be able to see the Referer header, because CloudFront doesn't forward it.
And, this is a very good illustration of why CloudFront doesn't forward it. If CloudFront forwarded the header and then blindly cached the result, whether the bucket policy had the intended effect would depend on whether the first request was from one of the intended sites, or from elsewhere -- and other requesters would get the cached response, which might be the wrong response.
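The failure mode described here can be imitated with a toy model. The sketch below is not CloudFront or S3: `origin` is a hypothetical stand-in for S3 applying a referer check, and `makeNaiveCache` is a hypothetical cache that forwards the Referer but keys only on the path.

```javascript
'use strict';

const allowedHosts = ['othersite1.com'];

// Stand-in for S3 evaluating the bucket policy against the Referer it receives.
function origin(path, referer) {
  const host = (referer || '').split('/')[2] || '';
  return allowedHosts.indexOf(host) !== -1 ? '200 OK' : '403 Forbidden';
}

// A naive cache that forwards Referer to the origin but keys only on the path.
function makeNaiveCache() {
  const store = new Map();
  return function get(path, referer) {
    if (!store.has(path)) {
      store.set(path, origin(path, referer));
    }
    return store.get(path);
  };
}

const a = makeNaiveCache();
console.log(a('/logo.png', 'https://othersite1.com/page.html')); // 200 OK
console.log(a('/logo.png', 'https://hotlinker.example/'));       // 200 OK (cached, policy bypassed)

const b = makeNaiveCache();
console.log(b('/logo.png', 'https://hotlinker.example/'));       // 403 Forbidden
console.log(b('/logo.png', 'https://othersite1.com/page.html')); // 403 Forbidden (cached, legitimate site blocked)
```

Whichever request arrives first decides what every later requester sees, which is why the header has to be part of the cache key if it is forwarded at all.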
(tl;dr) Whitelisting the Referer header for forwarding to the origin (in the CloudFront Cache Behavior settings) solves this issue.
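For reference, this is roughly how that whitelist looks in the legacy `ForwardedValues` part of a cache behavior in a CloudFront distribution config, written here as a JavaScript object. It is a fragment, not a complete behavior; the `QueryString` and `Cookies` values are assumptions for illustration. Distributions that use cache policies and origin request policies express the same thing differently.

```javascript
'use strict';

// Fragment of a cache behavior (legacy ForwardedValues style).
// Only the header whitelist matters here; the other values are illustrative assumptions.
const cacheBehaviorFragment = {
  ForwardedValues: {
    QueryString: false,
    Cookies: { Forward: 'none' },
    Headers: {
      Quantity: 1,        // must equal the number of entries in Items
      Items: ['Referer'], // headers forwarded to the origin and added to the cache key
    },
  },
};

console.log(JSON.stringify(cacheBehaviorFragment, null, 2));
```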
But, there is a catch.
Now that you are forwarding the Referer header to S3, you've extended the cache key -- the list of things against which CloudFront caches responses -- to include the Referer header.
So, now, for each object, CloudFront will not serve a response from cache unless the incoming request's Referer header exactly matches one from an already-cached request; otherwise, the request has to go to S3. And the thing about the Referer header is that it identifies the referring page, not the referring site, so each page from the authorized sites will have its own cached copy of these assets in CloudFront.
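A toy model makes the effect on the cache key concrete. `makeEdgeCache` below is a hypothetical stand-in for a single edge cache that keys on path plus Referer; it is not how CloudFront is implemented, only an illustration of the counting.

```javascript
'use strict';

// Toy edge cache whose key is path + Referer, counting origin fetches.
function makeEdgeCache() {
  const store = new Map();
  let originFetches = 0;
  return {
    get(path, referer) {
      const key = path + '|' + (referer || '');
      if (!store.has(key)) {
        originFetches += 1; // cache miss: the request goes to the origin
        store.set(key, 'body of ' + path);
      }
      return store.get(key);
    },
    fetches() { return originFetches; },
  };
}

const cache = makeEdgeCache();
cache.get('/logo.png', 'https://othersite1.com/a.html');        // miss
cache.get('/logo.png', 'https://othersite1.com/a.html');        // hit: same page
cache.get('/logo.png', 'https://othersite1.com/b.html');        // miss: different page, same site
cache.get('/logo.png', 'https://othersite1.com/folder/c.html'); // miss
console.log(cache.fetches()); // 3
```

One object requested from three pages of the same authorized site costs three origin fetches instead of one.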
This, itself, is not a problem. There is no charge for these extra copies of objects, and this is how CloudFront is designed to work... the problem is, it reduces the likelihood of a given object being in a given edge cache, since each object will necessarily be referenced less. This becomes less significant -- to the point of insignificance -- if you have a large amount of traffic, and more significant if your traffic is smaller. Fewer cache hits means slower page loads and more requests going to S3.
There is not a correct answer to whether or not this is ideal for you, because it is very specific to exactly how you are using CloudFront and S3.
But, here's an alternative:
You can remove the Referer header from the whitelist of headers to forward to S3 and undo that potential for negatively impacting cache hits, by configuring CloudFront to fire a Lambda@Edge Viewer Request trigger that will inspect each request as it comes in the front door, and block those requests that don't come from referring pages that you want to allow.
A Viewer Request trigger fires after the specific Cache Behavior is matched, but before the actual cache is checked, and with most of the incoming headers still intact. You can allow the request to proceed, optionally with modifications, or you can generate a response and cancel the rest of the CloudFront processing. That's what I'm illustrating, below -- if the host part of the Referer header isn't in the array of acceptable values, we generate a 403 response; otherwise, the request continues, the cache is checked, and the origin consulted only as needed.
Firing this trigger adds a small amount of overhead to every request, but that overhead may amortize out to being more desirable than a reduced cache hit rate. So, the following is not a "better" solution -- just an alternate solution.
This is a Lambda function written in Node.js 6.10.
'use strict';

const allow_empty_referer = true;
const allowed_referers = ['example.com', 'example.net'];

exports.handler = (event, context, callback) => {
    // extract the original request, and the headers from the request
    const request = event.Records[0].cf.request;
    const headers = request.headers;

    // find the first referer header if present, and extract its value;
    // the || [])[0]) || {'value' : ''} construct is optimizing away some if(){ if(){ if(){ } } } validation
    const referer_value = (((headers.referer || [])[0]) || {'value' : ''})['value'];

    // then take http[s]://<--this-part-->/only/not/the/path.
    // this is undefined when the value is empty or has no scheme://host/ prefix
    const referer_host = referer_value.split('/')[2];

    // compare to the list, and immediately allow the request to proceed through CloudFront
    // if we find a match
    for(var i = allowed_referers.length; i--;)
    {
        if(referer_host == allowed_referers[i])
        {
            return callback(null,request);
        }
    }

    // also test for no referer header value if we allowed that, above
    // usually, you do want to allow this
    // (test the raw value: the extracted host is undefined, not "", when the header is missing)
    if(allow_empty_referer && referer_value === "")
    {
        return callback(null,request);
    }

    // we did not find a reason to allow the request, so we deny it.
    const response = {
        status: '403',
        statusDescription: 'Forbidden',
        headers: {
            'vary': [{ key: 'Vary', value: '*' }], // hint, but not too obvious
            'cache-control': [{ key: 'Cache-Control', value: 'max-age=60' }], // browser-caching timer
            'content-type': [{ key: 'Content-Type', value: 'text/plain' }], // can't return binary (yet?)
        },
        body: 'Access Denied\n',
    };

    callback(null, response);
};
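Before deploying, the logic can be exercised locally with a fake event. The sketch below restates the handler in compact form so that it is self-contained; `makeHandler`, `fakeEvent`, and `run` are hypothetical helpers, and the fake event carries only the fields the handler reads, not the full Lambda@Edge event.

```javascript
'use strict';

// Compact restatement of the handler logic, wrapped in a hypothetical factory
// so the allow-list can be varied per test.
function makeHandler(allowedReferers, allowEmptyReferer) {
  return (event, context, callback) => {
    const request = event.Records[0].cf.request;
    const value = ((request.headers.referer || [])[0] || { value: '' }).value;
    const host = value.split('/')[2]; // undefined when there is no scheme://host/ prefix
    if (allowedReferers.indexOf(host) !== -1 || (allowEmptyReferer && value === '')) {
      return callback(null, request);
    }
    callback(null, { status: '403', statusDescription: 'Forbidden', body: 'Access Denied\n' });
  };
}

// Minimal fake of a viewer-request event: only the fields the handler reads.
function fakeEvent(referer) {
  const headers = {};
  if (referer !== undefined) {
    headers.referer = [{ key: 'Referer', value: referer }];
  }
  return { Records: [{ cf: { request: { uri: '/logo.png', method: 'GET', headers: headers } } }] };
}

// Run the handler and return whatever it passed to the callback.
function run(handler, referer) {
  let result;
  handler(fakeEvent(referer), {}, (err, res) => { result = res; });
  return result;
}

const handler = makeHandler(['example.com', 'example.net'], true);
console.log(run(handler, 'https://example.com/some/page.html').uri); // /logo.png (request passed through)
console.log(run(handler, 'https://www.example.com/').status);        // 403 (host must match exactly)
console.log(run(handler, undefined).uri);                            // /logo.png (empty referer allowed)
```

Note that the comparison is on the exact host, so `www.example.com` is rejected unless it is listed separately.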