1. ORACLE 19C LISTAGG函数功能增强


1.1 LISTAGG函数简介

LISTAGG函数是11g R2引入用于合并字符串(列转行),可替换wmsys.wm_concat功能,而且效率更好wm_concat函数是自定义聚集函数,而且是undocument的,从12C开始已经去掉,从11g R2开始,有字符串合并的需求,最好使用LISTAGG替代,以获得更好的性能。

 

但是,LISTAGG函数在11g R2中有个缺点,就是不能直接DISTINCT,在ORACLE 19C中,ORACLE给它增加了DISTINCT功能,这样可以剔除重复的字符串合并19C的完整LISTAGG语法如下:



 从语法图上可以看出,LISTAGG函数可以是普通的组函数,也可以用于分析函数,并且12C开始增加了OVERFLOW语法,用于字符串过长的处理。

Oracle 19c New Features:LISTAGG函数功能增加DISTINCT-LMLPHP 

  

1.2 使用LISTAGG函数实现字符串合并

 

如下例所示:

 

需求:对emp表,按照部门分组,按逗号合并部门员工名。

使用LISTAGG实现如下:

 

 

select deptno, listagg(ename,',') within group(order by deptno)as enames

  from emp

  group by deptno

order by deptno;

 

DEPTNO ENAMES

------------------------------------------------------------------------------

    10 CLARK,KING,MILLER

    20ADAMS,FORD,JONES,SCOTT,SMITH

    30ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

下面给DEPTNO=10的部门插入2个重复名字的员工,如下所示:

 

INSERT INTO empVALUES(8000,'KING','ANALYST',7782,to_date('1983-1-1','yyyy-mm-dd'),2000,NULL,10);

INSERT INTO empVALUES (9000,'KING','MANADER',7782,to_date('1984-5-1','yyyy-mm-dd'),2500,NULL,10);

COMMIT;

 

再次查询,发现有重复的数据:

SQL> selectdeptno, listagg(ename,',') within group(order by deptno) as enames

  2   from emp

  3   group by deptno

  4 order by deptno;

 

DEPTNO ENAMES

--------------------------------------------------------------------------------------

    10 CLARK,KING,KING,KING,MILLER

    20 ADAMS,FORD,JONES,SCOTT,SMITH

    30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

 
 
一般遇到这种有重复数据的情况,需要剔除重复数据,在19C之前,需要先剔除重复数据:

 

--红色部分使用分析函数剔重,当然这里直接distinct也可以,结果与最前面的一致

select deptno, listagg(ename, ',')within group(order by deptno) as enames

 from (select deptno,

               ename,

               row_number() over(partition by deptno, enameorder by empno) rn

          from emp)

 where rn = 1

 group by deptno

 order by deptno;

 

或者直接DISTINCT :

select deptno, listagg(ename, ',') within group(order by deptno)as enames

  from (select distinctdeptno,

               ename

          from emp)

 group by deptno

 order by deptno;

 

DEPTNO ENAMES

-------------------------------------------------------------------------------

10 CLARK,KING,MILLER

20 ADAMS,FORD,JONES,SCOTT,SMITH

30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

 

 

  到了19CLISTAGG提供了直接DISTINCT的功能,可以很简单地实现剔除重复数据,然后合并,如下所示:

 

selectdeptno, listagg(distinct ename,',') within group(order by deptno) as enames

  from emp

 group by deptno

 order by deptno;

 

DEPTNO ENAMES

-------------------------------------------------------------------------------

10 CLARK,KING,MILLER

20 ADAMS,FORD,JONES,SCOTT,SMITH

30 ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

      

  默认情况下包含所有数据(ALL不用写默认),也就是LISTAGG(ALLcolumn),如下所示:

select deptno, listagg(all ename,',') within group(order by deptno) asenames

  from emp

 group by deptno

 order by deptno;

 

1.3 WM_CONCAT函数与LISTAGG函数比较

如果使用wm_concat,则是:

select deptno, wm_concat(distinct ename)as enames

  from emp

 group by deptno

 order by deptno;

 

DEPTNO ENAMES

--------------------------------------------------------------------------------------

    10 CLARK,KING,MILLER

    20 ADAMS,FORD,JONES,SCOTT,SMITH

    30ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD

 

wm_concat这个undocument函数可以直接distinct,但是它与LISTAGG比缺点很明显,一是性能差,而是undocument函数,比较如下:

 

1)wm_concat可以distinct去除重复的(用于分析函数,distinct不能带orderby),19C之前的listagg不可以,可以先剔除重复,然后做listagg,19C之后的listagg可以distinct。

2)wm_concat不能保证排序,listagg可以保证排序.

3.WMSYS.WM_CONCAT是undocument的函数,最好不要使用,要使用也用自定义分析函数

 

1.4  使用ON OVERFLOW处理字符串长度溢出问题

LISTAGG函数在12.2开始,合并长度最多32767字节,依赖于MAX_STRING_SIZE参数,如下所示:

1)  如果MAX_STRING_SIZE=EXTEND,则对于VARCHAR2RAW类型,最多返回32767字节

2)  如果MAX_STRING_SIZE=STANDARD,则对于VARCHAR2最多4000字节,对于RAW类型最多2000字节

 

那么在合并的字符串超过限制,溢出时,默认报错,但是12.2引入了ON OVERFLOW可以截断处理,如下:

 

ON OVERFLOW TRUNCATE默认对溢出数据后面用…(count),如下所示:

 

SELECT deptno,LISTAGG(ename, ','  ON OVERFLOW TRUNCATE) WITHIN GROUP(ORDER BY deptno) AS enames

FROM  (select deptno,ename from emp,(select levelfrom dual connect by level

GROUP BY deptno

ORDER BY deptno;

 

DEPTNO ENAMES

----------------------------------------------------------------------------------------------------------

    10CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLAR

       省略

      ARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,

      CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,CLARK,...(4334)

 

    20ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAM

      S,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,AD

      AMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,

       省略

      ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,ADAMS,...(4334)

 

    30ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLE

      N,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,AL

      LEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,

      省略

      LEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,

      ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,ALLEN,...(5333)

 

其他用法如下:

--1.改变省略的格式,换成~~~

SELECT deptno, LISTAGG(ename, ',' ON OVERFLOWTRUNCATE '~~~') WITHIN GROUP (ORDER BY deptno) AS enames

FROM  (selectdeptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

--2.通过WINTHOUTCOUNT省略掉计数

SELECT deptno, LISTAGG(ename, ',' ON OVERFLOWTRUNCATE WITHOUT COUNT) WITHIN GROUP (ORDER BY deptno) AS enames

FROM  (selectdeptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

--3.默认格式ONOVERFLOW ERROW,溢出则报错ORA-01489: result of string concatenation istoo long

 

SELECT deptno, LISTAGG(ename, ',' ON OVERFLOW ERROR)WITHIN GROUP (ORDER BY deptno) AS enames

FROM  (selectdeptno,ename from emp,(select level from dual connect by level<1000))

GROUP BY deptno

ORDER BY deptno;

 

1.5 使用XMLAGG返回CLOB处理字符串合并超长需求

针对超过字符串长度溢出的问题,如果要完整显示,可以使用如下方式:

1) 自定义wm_concat返回值类型为CLOB

2) 使用XMLAGG函数获取CLOB值

下面使用XMLAGG函数处理超长合并字符串问题,如下所示:

 

SELECT deptno,  RTRIM(xmlagg(xmlelement(c, ename || ',')

ORDER BY deptno).extract('//text()').getclobval(),

             ',') ASenames

FROM (select deptno,ename from emp,(select level from dual connect bylevel<1000))

GROUP BY deptno

ORDER BY deptno;

 

总结:ORACLE19C针对LISTAGG增加了很多改进,在以后有字符串合并需求(列转行)的时候,要优先使用LISTAGG,而不是WM_CONCAT,对于自定义聚集函数,性能较差,而且功能也没有LISTAGG强大。

12-24 01:33