PostgreSQL: using a regular expression to split a CSV line whose fields may be quoted

Problem description

I would like to split a column that holds a CSV line in Postgres. The fields in this line of text are delimited by a pipe; sometimes they are enclosed in quotes and sometimes not. In addition, fields can contain escaped characters.

field1|"field2"|field3|"22 \" lcd \| screen "

Is there a regex that can split this column (i.e. using regexp_split_to_array(....))?
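
To illustrate why a plain split on the pipe character is not enough for the sample line above, here is a minimal Python sketch (the variable names are only for illustration and are not part of the original question). It splits on every pipe, including the escaped one inside the quoted field:

```python
# The sample line from the question; backslashes are doubled only because
# this is a Python string literal.
line = 'field1|"field2"|field3|"22 \\" lcd \\| screen "'

# Naive approach: split on every pipe, ignoring quotes and escapes.
parts = line.split("|")

print(len(parts))
for part in parts:
    print(repr(part))
```

The line has four logical fields, but the naive split produces five pieces, because the escaped pipe inside the last quoted field is treated as a delimiter, and the surrounding quotes are left in place.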

Recommended answer

This does not use a regular expression, but it works: a PL/Python function that hands the line to Python's csv module.

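-- Requires PL/Python. plpythonu is used here; on PostgreSQL versions
-- without Python 2 support, use plpython3u instead.
create or replace function split_csv(
  line text,
  delim_char char(1) = ',',
  quote_char char(1) = '"')
returns setof text[] immutable language plpythonu as $$
  # The csv module handles quoting and backslash escapes;
  # each physical line of the input becomes one text[] row.
  import csv
  return csv.reader(line.splitlines(), quotechar=quote_char, delimiter=delim_char, skipinitialspace=True, escapechar='\\')
$$;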

select *, x[4] from split_csv('field1|"field2"|field3|"22 \" lcd \| screen "'||E'\n'||'a|b', delim_char := '|') as x;




╔══════════════════════════════════════════════╤════════════════════╗
║                      x                       │         x          ║
╠══════════════════════════════════════════════╪════════════════════╣
║ {field1,field2,field3,"22 \" lcd | screen "} │ 22 " lcd | screen  ║
║ {a,b}                                        │ ░░░░               ║
╚══════════════════════════════════════════════╧════════════════════╝
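
The body of the function above can also be tried outside the database. The following is a minimal sketch in plain Python, assuming the same csv.reader arguments as the answer; the standalone split_csv wrapper and the sample variable are only for illustration and are not part of the original answer.

```python
import csv


def split_csv(line, delim_char=",", quote_char='"'):
    """Parse each physical line of `line` into a list of fields."""
    reader = csv.reader(
        line.splitlines(),
        delimiter=delim_char,
        quotechar=quote_char,
        escapechar="\\",        # a backslash escapes the next character
        skipinitialspace=True,  # ignore spaces right after a delimiter
    )
    return list(reader)


# Same input as the SQL query: two lines, pipe-delimited.
sample = 'field1|"field2"|field3|"22 \\" lcd \\| screen "\na|b'
rows = split_csv(sample, delim_char="|")

for row in rows:
    print(row)
```

The fourth field of the first row should come out as 22 " lcd | screen (with a trailing space), matching the x[4] column in the table above. The second row has only two fields, which is why x[4] is NULL for it in the SQL output.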
