问题描述
我真的不明白 CHECKSUM_AGG()
是如何工作的,尽管我知道它是通过使用 XOR 以某种方式构建的.这解释了为什么当您传递相等的整数时它返回 0
.
I don't really get how CHECKSUM_AGG()
works, although I understood that it is somehow built by using XORs. Which explains why it returns 0
when you pass just equal integers.
但是为什么我在以下 SQL 中得到相同的聚合校验和,其中输入值是唯一的?
But why do I get the same aggregated checksum in the following SQL, where the input values are unique?
DECLARE @test1 TABLE (chksum INT)
INSERT INTO @test1 VALUES (2147473855), (2147473343)
SELECT CHECKSUM_AGG(chksum)
FROM @test1
DECLARE @test2 TABLE (chksum INT)
INSERT INTO @test2 VALUES (2147474831), (2147472271)
SELECT CHECKSUM_AGG(chksum)
FROM @test2
一个解释将不胜感激.谢谢!
An explanation would be much appreciated. Thanks!
推荐答案
SQL Server CHECKSUM 和 CHECKSUM_AGG 实现存在已知问题:CHECKSUM 弱点解释
There are known issues with SQL Server CHECKSUM and CHECKSUM_AGG implementations: CHECKSUM weakness explained
改用哈希字节:使用HASHBYTES 比较列
来自 微软:如果表达式列表中的值之一发生变化,则列表的校验和通常也会发生变化.但是,校验和不变的可能性很小.出于这个原因,我们不建议使用 CHECKSUM 来检测值是否已更改,除非您的应用程序可以容忍偶尔丢失更改.考虑改用 HashBytes.指定 MD5 哈希算法时,HashBytes 对两个不同输入返回相同结果的概率远低于 CHECKSUM.
From Microsoft:If one of the values in the expression list changes, the checksum of the list also generally changes. However, there is a small chance that the checksum will not change. For this reason, we do not recommend using CHECKSUM to detect whether values have changed, unless your application can tolerate occasionally missing a change. Consider using HashBytes instead. When an MD5 hash algorithm is specified, the probability of HashBytes returning the same result for two different inputs is much lower than that of CHECKSUM.
您不能直接跨行使用 HASHBYTES - 有一种解决方法此处.
You may not use HASHBYTES across rows directly - there is one workaround here.
以下是使用 HASBYTES 解决方法对较小数字的比较:
Here is comparison for smaller numbers, using HASBYTES workaround:
DECLARE @test1 TABLE (chksum INT)
DECLARE @test2 TABLE (chksum INT)
INSERT INTO @test1 VALUES (50), (3), (26)
INSERT INTO @test2 VALUES (45), (0), (6)
SELECT [Values] = '50, 3, 26',
[Checksum] = CHECKSUM_AGG(chksum),
-- HashBytes is limited to 8000 bytes only
[Hashbytes] = HashBytes('md5',convert(varbinary(max),(SELECT * FROM @test1 FOR XML AUTO)))
FROM @test1
UNION ALL
SELECT [Values] = '45, 0, 6',
[Checksum] = CHECKSUM_AGG(chksum),
-- HashBytes is limited to 8000 bytes only
[Hashbytes] = HashBytes('md5',convert(varbinary(max),(SELECT * FROM @test2 FOR XML AUTO)))
FROM @test2
这篇关于为什么 CHECKSUM_AGG() 为完全不同的输入值返回相同的值?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!