Converting a filename to a file:// URL

Question

In WeasyPrint’s public API I accept filenames (among other types) for the HTML inputs. Any filename that works with the built-in open() should work, but I need to convert it to a URL in the file:// scheme that will later be passed to urllib.urlopen().

(Everything is in URL form internally. I need to have a "base URL" for documents in order to resolve relative URL references with urlparse.urljoin().)

urllib.pathname2url is a start, although its documentation states that it does not produce a complete URL.

The emphasis is mine, but I do need a complete URL. So far this seems to work:

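import os
import urllib  # Python 2: pathname2url lives in urllib, not urlparse


def path2url(path):
    """Return file:// URL from a filename."""
    path = os.path.abspath(path)
    # Python 2 only: encode unicode paths to bytes before quoting.
    if isinstance(path, unicode):
        path = path.encode('utf8')
    return 'file:' + urllib.pathname2url(path)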

UTF-8 seems to be recommended by RFC 3987 (IRI). But in this case (the URL is meant for urllib, eventually) maybe I should use sys.getfilesystemencoding()?
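
To make the encoding question concrete, here is a small sketch. It assumes Python 3, where urllib.parse.quote does the percent-encoding; the sample path is made up for illustration. It shows that the same filename yields different escapes depending on the encoding chosen, which is why the choice matters when the URL is decoded again later.

```python
from urllib.parse import quote

path = "/tmp/café.html"  # sample path, chosen only for illustration

# quote() encodes non-ASCII characters as UTF-8 unless told otherwise.
utf8_quoted = quote(path)
# Passing another encoding changes the percent-escapes for the same name.
latin1_quoted = quote(path, encoding="latin-1")

print(utf8_quoted)
print(latin1_quoted)
```

Whichever encoding is used to build the URL has to be the one used to turn it back into a filename.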

However, based on the literature I should prepend not just file: but file:// ... except when I should not: On Windows the results from nturl2path.pathname2url() already start with three slashes.
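
The slash problem described above can be sketched as follows. The helper name add_file_scheme is hypothetical, and the two input strings are hand-written samples of what a pathname2url-style function might return on POSIX and on Windows; they are assumptions, not captured output.

```python
def add_file_scheme(url_path):
    """Prefix a quoted URL path with the file: scheme (hypothetical helper)."""
    if url_path.startswith("///"):
        # Windows-style result: the empty authority is already present.
        return "file:" + url_path
    # POSIX-style result: a single leading slash, so add the empty authority.
    return "file://" + url_path


posix_sample = "/home/user/doc.html"         # assumed POSIX-style result
windows_sample = "///C:/Users/user/doc.html"  # assumed Windows-style result

print(add_file_scheme(posix_sample))
print(add_file_scheme(windows_sample))
```

Both samples end up with exactly three slashes after file:, which is the form the question is aiming for. This only handles the two shapes shown and is not a general solution.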

So the question is: is there a better way to do this and make it cross-platform?

Recommended answer

For completeness, in Python 3.4+, you should do:

import pathlib

pathlib.Path(absolute_path_string).as_uri()
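
As a slightly fuller sketch of how this could cover the question's use case: as_uri() raises ValueError for relative paths, so the path is made absolute first. The function name path_to_file_url and the filename docs/index.html are made up for illustration, and urljoin is the Python 3 counterpart of the urlparse.urljoin mentioned in the question.

```python
import pathlib
from urllib.parse import urljoin


def path_to_file_url(filename):
    """Return a file:// URL for a filename, relative or absolute."""
    # as_uri() rejects relative paths, so anchor the path at the
    # current working directory first.
    return pathlib.Path(filename).absolute().as_uri()


# Hypothetical input; the file does not need to exist for this to work.
base_url = path_to_file_url("docs/index.html")
stylesheet_url = urljoin(base_url, "style.css")

print(base_url)
print(stylesheet_url)
```

The result can then serve as the base URL for resolving relative references. Non-ASCII characters in the path are percent-encoded by as_uri(), so no manual encoding step is needed here.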
