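What is the PHP equivalent for building a URL from a base URL and a potentially relative path? Python provides urlparse.urljoin, but there does not seem to be any standard implementation in PHP.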

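The closest I have found is people suggesting parse_url and then rebuilding the URL from its parts, but implementations that do this generally get protocol-relative links wrong (for example, //example.com/foo should become http://example.com/foo or https://example.com/foo, inheriting the base URL's protocol), and they also make it hard to handle things like parent-directory links. Here are examples of these cases working correctly in urlparse.urljoin: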

>>> from urlparse import urljoin
>>> urljoin('http://example.com/some/directory/filepart', 'foo.jpg')
'http://example.com/some/directory/foo.jpg'
>>> urljoin('http://example.com/some/directory/', 'foo.jpg')
'http://example.com/some/directory/foo.jpg'
>>> urljoin('http://example.com/some/directory/', '../foo.jpg')
'http://example.com/some/foo.jpg'
>>> urljoin('http://example.com/some/directory/', '/foo.jpg')
'http://example.com/foo.jpg'
>>> urljoin('http://example.com/some/directory/', '//images.example.com/bar.jpg')
'http://images.example.com/bar.jpg'
>>> urljoin('https://example.com/some/directory/', '//images.example.com/bar.jpg')
'https://images.example.com/bar.jpg'
>>> urljoin('ftp://example.com/some/directory/', '//images.example.com/bar.jpg')
'ftp://images.example.com/bar.jpg'
>>> urljoin('http://example.com:8080/some/directory/', '//images.example.com/bar.jpg')
'http://images.example.com/bar.jpg'
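
The session above uses the Python 2 module name; in Python 3 the same function is available as urllib.parse.urljoin. As a rough sketch, the expected results listed above can be written down as a table and checked in one pass. The names CASES and failing_cases are made up for this illustration and are not part of any library.

```python
from urllib.parse import urljoin  # Python 3 location of urlparse.urljoin

# (base, reference, expected) triples copied from the session above
CASES = [
    ('http://example.com/some/directory/filepart', 'foo.jpg',
     'http://example.com/some/directory/foo.jpg'),
    ('http://example.com/some/directory/', 'foo.jpg',
     'http://example.com/some/directory/foo.jpg'),
    ('http://example.com/some/directory/', '../foo.jpg',
     'http://example.com/some/foo.jpg'),
    ('http://example.com/some/directory/', '/foo.jpg',
     'http://example.com/foo.jpg'),
    ('http://example.com/some/directory/', '//images.example.com/bar.jpg',
     'http://images.example.com/bar.jpg'),
    ('https://example.com/some/directory/', '//images.example.com/bar.jpg',
     'https://images.example.com/bar.jpg'),
    ('ftp://example.com/some/directory/', '//images.example.com/bar.jpg',
     'ftp://images.example.com/bar.jpg'),
    ('http://example.com:8080/some/directory/', '//images.example.com/bar.jpg',
     'http://images.example.com/bar.jpg'),
]


def failing_cases(join=urljoin):
    """Return the (base, reference) pairs for which `join` gives an unexpected result."""
    return [(base, ref) for base, ref, expected in CASES
            if join(base, ref) != expected]


print(failing_cases())
```

Because the join function is a parameter, the same table could in principle be reused to check another implementation, for example a thin wrapper that calls a PHP port.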

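Is there an idiomatic way to achieve the same thing in PHP, or a well-regarded simple library or implementation that actually handles all of these cases correctly?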

Best answer

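Because there is clearly a need for this functionality, and none of the scripts floating around cover all the bases, I have started a project on GitHub to try to do it right.
The current implementation of urljoin() is as follows: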

function urljoin($base, $rel) {
    $pbase = parse_url($base);
    $prel = parse_url($rel);

    $merged = array_merge($pbase, $prel);
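    $basePath = isset($pbase['path']) ? $pbase['path'] : '/';
    if (!isset($prel['path']) || $prel['path'] === '') {
        // No path in the reference: a bare host ('//host') gets the root path,
        // anything else ('?q=1', '#top') keeps the base path
        $merged['path'] = isset($prel['host']) ? '/' : $basePath;
    } else if ($prel['path'][0] != '/') {
        // Relative path: resolve it against the directory of the base path
        $dir = preg_replace('@/[^/]*$@', '', $basePath);
        $merged['path'] = $dir . '/' . $prel['path'];
    }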

    // Get the path components, and remove the initial empty one
    $pathParts = explode('/', $merged['path']);
    array_shift($pathParts);

    $path = [];
    $prevPart = '';
    foreach ($pathParts as $part) {
        if ($part == '..' && count($path) > 0) {
            // Cancel out the parent directory (if there's a parent to cancel)
            $parent = array_pop($path);
            // But if it was also a parent directory, leave it in
            if ($parent == '..') {
                array_push($path, $parent);
                array_push($path, $part);
            }
        } else if ($prevPart != '' || ($part != '.' && $part != '')) {
            // Don't include empty or current-directory components
            if ($part == '.') {
                $part = '';
            }
            array_push($path, $part);
        }
        $prevPart = $part;
    }
    $merged['path'] = '/' . implode('/', $path);

    $ret = '';
    if (isset($merged['scheme'])) {
        $ret .= $merged['scheme'] . ':';
    }

    if (isset($merged['scheme']) || isset($merged['host'])) {
        $ret .= '//';
    }

    if (isset($prel['host'])) {
        $hostSource = $prel;
    } else {
        $hostSource = $pbase;
    }

    // username, password, and port are associated with the hostname, not merged
    if (isset($hostSource['host'])) {
        if (isset($hostSource['user'])) {
            $ret .= $hostSource['user'];
            if (isset($hostSource['pass'])) {
                $ret .= ':' . $hostSource['pass'];
            }
            $ret .= '@';
        }
        $ret .= $hostSource['host'];
        if (isset($hostSource['port'])) {
            $ret .= ':' . $hostSource['port'];
        }
    }

    if (isset($merged['path'])) {
        $ret .= $merged['path'];
    }

    if (isset($prel['query'])) {
        $ret .= '?' . $prel['query'];
    }

    if (isset($prel['fragment'])) {
        $ret .= '#' . $prel['fragment'];
    }


    return $ret;
}
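
The middle of the function, the loop over $pathParts, is the part that cancels out '.' and '..' components. As a rough illustration of that idea only, here is a simplified sketch in Python (the language of the question's examples). It is not a line-for-line port of the PHP loop: it drops a '..' that has no parent to cancel instead of keeping it, and it collapses empty segments. The name normalize_path is made up for this sketch.

```python
def normalize_path(path):
    """Collapse '.' and '..' segments of an absolute path (simplified sketch)."""
    segments = path.split('/')
    # A path ending in '/', '/.' or '/..' still names a directory afterwards
    trailing_slash = segments[-1] in ('', '.', '..')

    resolved = []
    for segment in segments:
        if segment in ('', '.'):
            continue              # skip empty and current-directory segments
        if segment == '..':
            if resolved:
                resolved.pop()    # step up one level; ignore '..' at the root
            continue
        resolved.append(segment)

    result = '/' + '/'.join(resolved)
    if trailing_slash and resolved:
        result += '/'
    return result


print(normalize_path('/some/directory/../foo.jpg'))
```

Keeping the segments on a stack means each '..' only has to look at the most recently kept segment, which is the same approach the PHP loop takes with array_pop().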

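The urljoin() function correctly handles usernames, passwords, port numbers, query strings, anchors (fragments), and even file:/// URLs (mishandling these seems to be a common defect in existing functions of this kind).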
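
For comparison, this is how Python's standard library treats the same pieces. The sketch below assumes Python 3 (urllib.parse) and uses made-up example URLs; it only shows the reference behaviour that a PHP port could be checked against, not the output of the PHP function itself.

```python
from urllib.parse import urljoin, urlsplit

# urlsplit exposes roughly the same components that PHP's parse_url() returns
parts = urlsplit('http://user:secret@example.com:8080/some/dir/page?x=1#top')
print(parts.username, parts.password, parts.hostname, parts.port)
print(parts.query, parts.fragment)

# User, password and port stay attached to the base host;
# the query string and fragment come from the reference
joined = urljoin('http://user:secret@example.com:8080/some/dir/page',
                 '../other?y=2#end')
print(joined)

# file:/// URLs have an empty host but still resolve relative paths
file_joined = urljoin('file:///home/user/docs/index.html', '../img/a.png')
print(file_joined)
```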

This question about a PHP equivalent of Python's `urljoin` comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/40053950/

10-12 12:52