问题描述
有没有一种方法可以在不将位字符串宽度硬编码为0的情况下进行非零位字符串测试?
Is there a way to do a non-zero bit string test without hard-coding the bit string width of 0?
例如,假设我有两个表,用户和功能,每个都带有掩码,我要对此进行测试:
For example, suppose I have two tables, Users and Features, each with masks, I want to test this:
SELECT u.name FROM Users u, Features f
WHERE u.mask & f.mask;
匹配隐式非零结果。但是,SQL要求 WHERE
的显式布尔结果,而不是隐式强制转换,例如:
matching implicit non-zero results. However, SQL requires an explicit boolean result for WHERE
as opposed to an implicit cast, such as this:
SELECT u.name FROM Users u, Features f
WHERE (u.mask & f.mask) != 0::BIT(2048);
我不想硬编码 2048
(或其他任何原因)。
I don't want to hardcode 2048
(or whatever) in this query for a number of reasons.
测试 expr = 0
或 expr> 0
会导致类型错误。奇怪的是,我可以测试 expr = 0 :: BIT(1)
,但这给出了错误的答案,因为Postgres并不认为所有的全零位字符串都相等。 / p>
Testing expr = 0
or expr > 0
results in a type error. Oddly, I can test expr = 0::BIT(1)
, but that gives the wrong answer because Postgres does not consider all all-zero bit strings to be equal.
select 0::BIT(2) > 0::BIT(1);
?column?
----------
t
(1 row)
我可以通过以下操作创建一个计算出的零:
I can create a calculated zero by doing this:
SELECT u.name FROM Users u, Features f
WHERE (u.mask & f.mask) != (u.mask & ~u.mask);
虽然有效,但感觉像是骇人听闻的骇客。
Which works but feels like an awful hack.
有什么建议或见解吗?
结果
我对下面提供的几种选择进行了基准测试。
I benchmarked several options provided below. Thanks for the suggestions, Erwin!
基于非常大的数据集和100,000个查询,我发现以下构造导致每秒的相关查询。希望Postgres小组的人看到了这一点,并提供了通用0以加快处理速度!不幸的是,大多数通用方法似乎都需要进行字符串转换,这是非常昂贵的。
Based on a very large data set and 100,000 queries, I found the following constructs resulted in the associated queries per second. Hopefully someone from the Postgres team sees this and provides a generic 0 to speed things up! Unfortunately most generic approaches seem to incur a string conversion which is quite expensive.
Constructs | Queries / s
----------------------------------------+--------------
(u.mask & f.mask) <> 0::BIT(2048) | 158
(u.mask & f.mask) <> (u.mask # u.mask) | 135
(u.mask & f.mask) <> (u.mask & ~u.mask) | 125
position('1' IN (u.mask & f.mask)) > 0 | 37
(u.mask & f.mask)::TEXT !~ '^0+$' | 27
推荐答案
短位字符串
要排除返回由零组成的位串,但长度可能会改变( B'000 ...'
),则可以使用强制转换为整数
(最多 bit(32)
)或 bigint
(最多 bit(64)
):
Short bitstring
To exclude cases where the bitwise AND (&
) returns a bitstring consisting of nothing but zeros, but the length might change (B'000...'
), you can use a cast to integer
(up to bit(32)
) or bigint
(up to bit(64)
):
SELECT u.name
FROM users u
JOIN features f ON (u.mask & f.mask)::int <> 0;
当转换为整数时,它们全部导致 0
。
这也排除其中任一列为 NULL
的情况。换句话说,结果必须包含至少一个 1
。
When cast to integer, all of them result in 0
.
This also excludes cases where either of the columns is NULL
. In other words, the result has to include at least one 1
.
如果您的值可以大于64位,则可以强制转换为 text
并使用正则表达式进行检查:
If your values can be longer than 64 bit, you could cast to text
and check with a regular expression:
ON (u.mask & f.mask)::text !~ '^0+$'
模式解释:
^
..字符串的开头
0 +
..一个或多个'0'
$
..字符串结尾
^
.. beginning of string0+
.. one or more '0'$
.. end of string
或者,作为:
Ergo:
ON position('1' IN (u.mask & f.mask)) > 0
这篇关于在Postgres中将通用位字符串与零进行比较的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!