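I am looking at roughly 13,000 rows in a SQL Server table and trying to parse certain values out of a column that is stored as JSON.
The values in the JSON column look like this: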
..."http://www.companyurl.com","FoundedYear":"2007","Status":"Private","CompanySize":"51-200","TagLine":"We build software we believe in","Origi...
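I want to extract the value of "CompanySize", but not every row contains this attribute. Other complications:
I don't know how many possible values the "CompanySize" attribute can take.
"CompanySize" is not always immediately followed by the "TagLine" attribute.
The one rule I know for certain: the CompanySize value is always a string of unknown length that comes after the varchar string "CompanySize":" and terminates before the next "," string.
Ideally we would already be fully upgraded to SQL Server 2016, so that I could take advantage of SQL Server's JSON support, but that is not the case.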
Best answer
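You can do this with CHARINDEX, because it accepts a start position, which lets you locate the closing " of the value. You probably should not search for "," because, if CompanySize is the last attribute, the fragment will not end with ,". Wrapping this in an inline table-valued function (iTVF) will be quite efficient (especially since 13k rows is next to nothing), and you then just need to use it in a CROSS APPLY or OUTER APPLY: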
USE [tempdb];
GO
CREATE FUNCTION dbo.GetCompanySize(@JSON NVARCHAR(MAX))
RETURNS TABLE
AS RETURN
  -- The literal that immediately precedes the value we want.
  WITH SearchStart AS
  (
    SELECT '"CompanySize":"' AS [Fragment]
  ), Search AS
  (
    -- Start is 0 when the attribute is missing, NULL when @JSON is NULL.
    SELECT CHARINDEX(ss.Fragment, @JSON) AS [Start],
           LEN(ss.Fragment) AS [FragmentLength]
    FROM   SearchStart ss
  )
  SELECT CASE Search.Start
           WHEN 0 THEN NULL
           -- Take everything from the end of the fragment up to the next ".
           ELSE SUBSTRING(@JSON,
                          (Search.Start + Search.FragmentLength),
                          CHARINDEX('"',
                                    @JSON,
                                    Search.Start + Search.FragmentLength
                                   ) - (Search.Start + Search.FragmentLength)
                         )
         END AS [CompanySize]
  FROM   Search;
GO
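One edge case the function above does not guard against: if "CompanySize":" is present but no closing " follows it (for example, a truncated value), the inner CHARINDEX returns 0, the computed length becomes negative, and SUBSTRING raises an invalid-length error. The following is only a sketch of a more defensive variant; the name GetCompanySizeSafe and the Bounds CTE are hypothetical and not part of the original answer. It relies on NULLIF turning a 0 position into NULL, and on SUBSTRING returning NULL when any argument is NULL.

```sql
-- Hypothetical hardened variant; the name is an assumption for illustration.
CREATE FUNCTION dbo.GetCompanySizeSafe(@JSON NVARCHAR(MAX))
RETURNS TABLE
AS RETURN
  WITH Search AS
  (
    SELECT CHARINDEX('"CompanySize":"', @JSON) AS [Start],
           LEN('"CompanySize":"') AS [FragmentLength]
  ), Bounds AS
  (
    -- NULLIF maps "not found" (0) to NULL for both the start and the end.
    SELECT NULLIF(s.Start, 0) + s.FragmentLength AS [ValueStart],
           NULLIF(CHARINDEX('"', @JSON, s.Start + s.FragmentLength), 0) AS [ValueEnd]
    FROM   Search s
  )
  -- Any NULL argument makes SUBSTRING return NULL instead of raising an error.
  SELECT SUBSTRING(@JSON, b.ValueStart, b.ValueEnd - b.ValueStart) AS [CompanySize]
  FROM   Bounds b;
GO

-- Truncated input with no closing quote: yields NULL rather than an error.
SELECT s.CompanySize
FROM   dbo.GetCompanySizeSafe(N'"Status":"Private","CompanySize":"51-2') s;
```

Like the original, this sketch does not handle escaped quotes (\") inside the value; it assumes CompanySize values never contain one.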
Set up the test:
CREATE TABLE #tmp (JSON NVARCHAR(MAX));
INSERT INTO #tmp (JSON) VALUES
('"http://www.companyurl.com","FoundedYear":"2007","Status":"Private","CompanySize":"51-200","TagLine":"We build software we believe in","Origi..');
INSERT INTO #tmp (JSON) VALUES
('"http://www.companyurl.com","FoundedYear":"2009","Status":"Public","TagLine":"We build software we believe in","Origi..');
INSERT INTO #tmp (JSON) VALUES (NULL);
Run the test:
SELECT comp.CompanySize
FROM #tmp tmp
CROSS APPLY tempdb.dbo.GetCompanySize(tmp.JSON) comp
Returns:
CompanySize
-----------
51-200
NULL
NULL
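Since the question mentions not knowing how many distinct CompanySize values exist, the same function can feed an aggregate. This is a minimal sketch against the #tmp test table above; the NumRows alias is an assumption, and on the real table you would substitute your own table and column names.

```sql
-- List each distinct CompanySize value and how many rows carry it.
-- Rows where the attribute is missing (or the JSON is NULL) group under NULL.
SELECT   comp.CompanySize,
         COUNT(*) AS [NumRows]
FROM     #tmp tmp
CROSS APPLY tempdb.dbo.GetCompanySize(tmp.JSON) comp
GROUP BY comp.CompanySize
ORDER BY [NumRows] DESC;
```

Because the function always returns exactly one row per input (with NULL when nothing is found), CROSS APPLY and OUTER APPLY give the same result here.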