I have a table with a column named file that contains directory paths.
Sample data:

C:\filedata\6860_f11.xlxb_3.30 test - 0.3 ML

C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML


Note that I only want to get 6860_f11.xlxb for #1 and 1191_f12.xlxb for #2.

For #1 the directory contains only one folder (filedata); for #2 it contains two folders (cloud\files).

Here is my code:

select
    (SUBSTRING((file), 0, CHARINDEX ('.xlxb', (file)) + 4)) as xlsb_file
from
    [Projects].[dbo].[ProjFiles]


Is there a way to get the string that comes after the folders, up to the underscore that follows .xlxb?

Best answer

No CLR needed. No regex needed. The simplest and best-performing way to solve this is with NGrams8K. It's 2 AM where I live, so I'll be quick.

Note this query:

DECLARE @string VARCHAR(150) = 'C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML';

SELECT RetPos = f.p, RetVal = e.s
FROM   (SELECT MAX(position)+1 FROM samd.NGrams8k(@string,1) WHERE token = '\') AS f(p)
CROSS APPLY  (VALUES(SUBSTRING(@string,f.p,CHARINDEX('.',@string,f.p)-f.p+5)))  AS e(s);


Result:

RetPos RetVal
------ ---------------
16     1191_f12.xlxb
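
To see where the +5 comes from, here is a small check that uses only built-in functions. It is a sketch that assumes the same sample string as above and hard-codes the start position 16 (one past the last backslash, which is what the NGrams8K subquery returns); the column aliases are illustrative names, not part of the original answer. The first dot after position 16 is at position 24, so the length is the 8 characters before the dot, the dot itself, and the 4 characters of xlxb: 24 - 16 + 5 = 13.

```sql
DECLARE @string VARCHAR(150) = 'C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML';

SELECT StartPos = 16,                                   -- one past the last '\'
       DotPos   = CHARINDEX('.', @string, 16),          -- first '.' at or after StartPos
       TakeLen  = CHARINDEX('.', @string, 16) - 16 + 5, -- chars before the dot + the dot + 4 more
       RetVal   = SUBSTRING(@string, 16, CHARINDEX('.', @string, 16) - 16 + 5);
```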


Now against a table:

CREATE TABLE #yourtable ([file] VARCHAR(150));
INSERT INTO #yourtable
VALUES ('C:\filedata\6860_f11.xlxb_3.30 test - 0.3 ML'),
       ('C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML');

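-- For each row: find the last backslash, then take everything after it
-- through the first '.' plus the 4 extension characters.
SELECT *
FROM   #yourtable AS t
CROSS APPLY
(
  SELECT newstring = e.s
  -- f.p = position of the last '\' in the path
  FROM   (SELECT MAX(position) FROM samd.NGrams8k(t.[file],1) WHERE token = '\') AS f(p)
  CROSS APPLY (VALUES(SUBSTRING(t.[file],f.p+1,CHARINDEX('.',t.[file],f.p+1)-f.p+4))) AS e(s)
) AS itvf_str_extract;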


Easy peasy. It will also outperform any CLR/regex-based solution, and that's a fact.
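
If NGrams8K is not installed, the same two steps (locate the last backslash, then cut through the extension) can be sketched with built-in functions only. This is a swapped-in technique, not the answer's method: it finds the last backslash with REVERSE and CHARINDEX instead of NGrams8K. The table variable @sample is a hypothetical stand-in for the real table, and the sketch assumes every value contains at least one backslash, has a dot after it, and has no trailing spaces (LEN ignores those).

```sql
-- Hypothetical sample table holding the two rows from the question.
DECLARE @sample TABLE ([file] VARCHAR(150));
INSERT INTO @sample ([file])
VALUES ('C:\filedata\6860_f11.xlxb_3.30 test - 0.3 ML'),
       ('C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML');

SELECT t.[file], newstring = e.s
FROM   @sample AS t
-- f.p = position of the last '\': the first '\' in the reversed string, counted from the end.
CROSS APPLY (VALUES (LEN(t.[file]) - CHARINDEX('\', REVERSE(t.[file])) + 1)) AS f(p)
-- Take from the character after the last '\' through the first '.' plus 4 characters.
CROSS APPLY (VALUES (SUBSTRING(t.[file], f.p + 1,
                               CHARINDEX('.', t.[file], f.p + 1) - f.p + 4))) AS e(s);
```

For the two sample rows this yields 6860_f11.xlxb and 1191_f12.xlxb. No performance comparison with the NGrams8K version is implied here.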

A side note: John Cappelletti's solution is excellent (as always). Under the hood it is very similar to my NGrams solution, but not identical. Compare these two queries:

DECLARE @string VARCHAR(150) = 'C:\cloud\files\1191_f12.xlxb_12.16 test - 0.3 ML';
DECLARE @Delimiter1 varchar(100) = '\', @Delimiter2 varchar(100) = '.';

-- Alan B
SELECT
  RetSeq = 1,
  RetPos = f.p,
  RetVal = e.s
FROM   (SELECT MAX(position)+1 FROM samd.NGrams8k(@string,1) WHERE token = '\') AS f(p)
CROSS APPLY (VALUES(SUBSTRING(@string,f.p,CHARINDEX('.',@string,f.p)-f.p+5)))  AS e(s);

-- John C
with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
       cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
       cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
       cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)

Select RetSeq = Row_Number() over (Order By N)
      ,RetPos = N
      ,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1)
 From  (
        Select *,RetVal = Substring(@String, N, L)
         From  cte4
       ) A
 Where charindex(@Delimiter2,RetVal)>1;


Now the execution plans:

[Image: execution plans of the two queries]
