问题描述
PostgreSQL 可以处理从任何地方开始的数组下标.
考虑以下示例,该示例创建了一个包含 3 个下标从 5 到 7 的元素的数组:
SELECT ('[5:7]={1,2,3}'::int[]);
返回:
[5:7]={1,2,3}
例如,意思是你得到第一个元素
SELECT ('[5:7]={1,2,3}'::int[])[5];
我想规范化任何给定的一维数组以从数组下标1开始.
我能想到的最好的:
SELECT ('[5:7]={1,2,3}'::int[])[array_lower('[5:7]={1,2,3}'::int[], 1):array_upper('[5:7]={1,2,3}'::int[], 1)]
或者,同样的,更容易阅读:
WITH x(a) AS (选择 '[5:7]={1,2,3}'::int[])SELECT a[array_lower(a, 1):array_upper(a, 1)]发件人
你知道一种更简单/更快或更优雅的方法吗?
基准
为了测试性能,我制定了这个快速基准.
具有 100k 行的表,简单的整数数组,长度在 1 到 11 之间:
CREATE TEMP TABLE t (a int[]);INSERT INTO t -- 现在实际上有不同的下标SELECT ('[' || g%10 || ':' || 2*(g%10) || ']={1'||repeat(','||g::text, g%10) ||'}')::int[]FROM generate_series(1,100000) g;解释分析选择substring(a::text, '{.*$')::int[] -- 总运行时间:949.304 毫秒-- a[-2147483648:2147483647] -- 总运行时间:283.877 毫秒-- a[array_lower(a, 1):array_upper(a, 1)] -- 总运行时间:311.545 ms从T
所以,是的,@Daniel 的想法稍微快了一点.
@Kevin 的文本转换也有效,但没有获得很多积分.
还有其他想法吗?
有一个更简单的方法很丑,但我相信技术上是正确的:从数组中提取最大可能的切片,而不是使用计算的精确切片界限.它避免了两个函数调用.
示例:
select ('[5:7]={1,2,3}'::int[])[-2147483648:2147483647];
结果:
整数4---------{1,2,3}PostgreSQL can work with array subscripts starting anywhere.
Consider this example that creates an array with 3 elements with subscripts from 5 to 7:
SELECT ('[5:7]={1,2,3}'::int[]);
Returns:
[5:7]={1,2,3}
Meaning, for instance, that you get the first element with
SELECT ('[5:7]={1,2,3}'::int[])[5];
I want to normalize any given 1-dimensional array to start with array subscript 1.
The best I could come up with:
SELECT ('[5:7]={1,2,3}'::int[])[array_lower('[5:7]={1,2,3}'::int[], 1):array_upper('[5:7]={1,2,3}'::int[], 1)]
Or, the same, easier the read:
WITH x(a) AS (
SELECT '[5:7]={1,2,3}'::int[]
)
SELECT a[array_lower(a, 1):array_upper(a, 1)]
FROM x
Do you know a simpler / faster or at least more elegant way?
Benchmark
For the purpose of testing performance I whipped up this quick benchmark.
Table with 100k rows, simple integer array of random length between 1 an 11:
CREATE TEMP TABLE t (a int[]);
INSERT INTO t -- now with actually varying subscripts
SELECT ('[' || g%10 || ':' || 2*(g%10) || ']={1'
|| repeat(','||g::text, g%10) || '}')::int[]
FROM generate_series(1,100000) g;
EXPLAIN ANALYZE
SELECT
substring(a::text, '{.*$')::int[] -- Total runtime: 949.304 ms
-- a[-2147483648:2147483647] -- Total runtime: 283.877 ms
-- a[array_lower(a, 1):array_upper(a, 1)] -- Total runtime: 311.545 ms
FROM t
So, yes, @Daniel's idea is slightly faster.
@Kevin's text conversion works, too, but doesn't earn many points.
Any other ideas?
There is a simpler method that is ugly, but I believe technically correct: extract the largest possible slice out of the array, as opposed to the exact slice with computed bounds.It avoids the two function calls.
Example:
select ('[5:7]={1,2,3}'::int[])[-2147483648:2147483647];
results in:
int4 --------- {1,2,3}
这篇关于规范化一维数组的数组下标,使它们从 1 开始的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!