Problem description
So, I found an answer to removing the .html extension on my page, that works fine with this code:
server {
    listen 80;
    server_name _;
    root /var/www/html/;
    index index.html;

    if (!-f "${request_filename}index.html") {
        rewrite ^/(.*)/$ /$1 permanent;
    }
    if ($request_uri ~* "/index.html") {
        rewrite (?i)^(.*)index.html$ $1 permanent;
    }
    if ($request_uri ~* ".html") {
        rewrite (?i)^(.*)/(.*).html $1/$2 permanent;
    }

    location / {
        try_files $uri.html $uri $uri/ /index.html;
    }
}
But if I open mypage.com it redirects me to mypage.com/index
Wouldn't this be fixed by declaring index.html as index? Any help is appreciated.
Recommended answer
The "Holy Grail" Solution for Removing ".html" in NGINX:
UPDATED ANSWER: This question piqued my curiosity, and I went on another, more in-depth search for a "holy grail" solution for .html redirects in NGINX. Here is the link to the answer I found, since I didn't come up with it myself: https://stackoverflow.com/a/32966347/4175718
However, I'll give an example and explain how it works. Here is the code:
location / {
    # Match only the original client request; the dot must be escaped
    if ($request_uri ~ ^/(.*)\.html) {
        return 302 /$1;
    }
    try_files $uri $uri.html $uri/ =404;
}
What's happening here is a pretty ingenious use of the if directive. NGINX runs a regex on the $request_uri portion of incoming requests. The regex checks if the URI has an .html extension and then stores the extension-less portion of the URI in the built-in variable $1.
From the docs, since it took me a while to figure out where the $1 came from:
Regular expressions can contain captures that are made available for later reuse in the $1..$9 variables.
The regex both checks for the existence of unwanted .html requests and effectively sanitizes the URI so that it does not include the extension. Then, using a simple return statement, the request is redirected to the sanitized URI that is now stored in $1.
The best part about this, as original author cnst explains, is that
Because $request_uri is always constant per request and is not affected by other rewrites, it won't, in fact, form any infinite loops.
Unlike the rewrites, which operate on any .html request (including the invisible internal redirect to /index.html), this solution only operates on external URIs that are visible to the user.
You will still need the try_files directive, as otherwise NGINX will have no idea what to do with the newly sanitized extension-less URIs. The try_files directive shown above will first try the new URL by itself, then try it with the ".html" extension, then try it as a directory name.
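Putting the pieces together, a complete server block might look roughly like the sketch below. The listen, server_name, root, and index values are simply copied from the question and are assumptions about your setup; /about is a made-up path used only in the comments.

```nginx
server {
    listen 80;
    server_name _;
    root /var/www/html/;
    index index.html;

    location / {
        # $request_uri holds the original client request and is not
        # changed by internal redirects, so this cannot loop.
        if ($request_uri ~ ^/(.*)\.html) {
            return 302 /$1;
        }

        # For a request to /about: try the file /about, then /about.html,
        # then the directory /about/ (served via index), otherwise 404.
        try_files $uri $uri.html $uri/ =404;
    }
}
```

With this layout a request for the site root should still be served from index.html without a visible redirect, because the internal redirect to /index.html never shows up in $request_uri.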
The NGINX docs also explain how the default try_files directive works. The default try_files directive is ordered differently than the example above, so the explanation below does not perfectly line up:
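NGINX will first append .html to the end of the URI and try to serve it. If it finds an appropriate .html file, it will return that file and keep the extension-less URI. If it cannot find an appropriate .html file, it will try the URI without any extension, then the URI as a directory, and finally return a 404 error.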
UPDATE: What does the regex do?
The above answer touches on the use of regular expressions, but here is a more specific explanation for those who are still curious. The following regular expression (regex) is used:
^/(.*)\.html
This breaks down as:
^: indicates the beginning of the line.
/: match the character "/" literally. Forward slashes do NOT need to be escaped in NGINX.
(.*): capturing group: match any character an unlimited number of times.
\.: match the character "." literally. This must be escaped with a backslash.
html: match the string "html" literally.
The capturing group (.*) is what contains the non-".html" portion of the URL. This can later be referenced with the variable $1. NGINX is then configured to re-try the request (return 302 /$1;) and the try_files directive internally re-appends the ".html" extension so the file can be located.
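To make the capture concrete, here is a small annotated sketch of how the pattern would behave for a few request URIs. The paths are invented for illustration, and the fragment is meant to sit inside a location block.

```nginx
# Pattern: ^/(.*)\.html, tested against $request_uri
#   /about.html       matches,  $1 = about       -> 302 to /about
#   /docs/intro.html  matches,  $1 = docs/intro  -> 302 to /docs/intro
#   /about            no match; the request falls through to try_files
if ($request_uri ~ ^/(.*)\.html) {
    return 302 /$1;
}
```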
To retain query strings and arguments passed to a .html page, the return statement can be changed to:
return 302 /$1$is_args$args;
This should allow requests such as /index.html?test to redirect to /index?test instead of just /index.
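One caveat: $request_uri includes the query string, so the greedy (.*) can run past the path when an argument value itself contains ".html" (for example /page.html?next=a.html). The sketch below is one possible variant, not part of the original answer, that limits the capture to the path by excluding "?" and requires .html to be followed by "?" or the end of the request.

```nginx
location / {
    # [^?]* keeps the capture inside the path; (\?|$) requires that
    # ".html" ends the path rather than appearing in the middle of it.
    if ($request_uri ~ ^/([^?]*)\.html(\?|$)) {
        # $is_args is "?" when a query string is present, else empty.
        return 302 /$1$is_args$args;
    }
    try_files $uri $uri.html $uri/ =404;
}
```

Under these assumptions, /page.html?next=a.html would redirect to /page?next=a.html, and /index.html?test to /index?test.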
From the NGINX page If Is Evil:
The only 100% safe things which may be done inside if in a location context are:
return ...;
rewrite ... last;
Also, note that you may swap out the '302' redirect for a '301'. A 301 redirect is permanent, and is cached by web browsers and search engines. If your goal is to permanently remove the .html extension from pages that are already indexed by a search engine, you will want to use a 301 redirect. However, if you are testing on a live site, it is best practice to start with a 302 and only move to a 301 when you are absolutely confident your configuration is working correctly.
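For reference, the permanent variant differs only in the status code. This is a sketch of the same location block with 301 swapped in, assuming the 302 version has already been tested.

```nginx
location / {
    if ($request_uri ~ ^/(.*)\.html) {
        # 301 is cached by browsers and search engines, so switch to it
        # only after the temporary redirect behaves as expected.
        return 301 /$1;
    }
    try_files $uri $uri.html $uri/ =404;
}
```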