问题描述
我正在按计数将一个巨大的表汇总到一个新表中,在这里我想将所有空字符串都更改为NULL
,并且还要对某些列进行类型转换.我通读了一些帖子,但找不到查询,这使我可以在单个查询中跨所有列进行查询,而无需使用多条语句.
I am rolling up a huge table by counts into a new table, where I want to change all the empty strings to NULL
, and typecast some columns as well. I read through some of the posts and I could not find a query, which would let me do it across all the columns in a single query, without using multiple statements.
让我知道是否可以遍历所有列,并用空字符串替换为null来替换单元格.
Let me know if it is possible for me to iterate across all columns and replace cells with empty strings with null.
推荐答案
据我所知,没有内置函数可以替换表中所有列的空字符串.您可以将 plpgsql函数 编写为照顾这个.
To my knowledge there is no built-in function to replace empty strings across all columns of a table. You can write a plpgsql function to take care of that.
以下函数用NULL
替换给定表的所有基本字符类型列中的空字符串 .然后,如果其余字符串是有效的数字文字,则可以强制转换为integer
.
The following function replaces empty strings in all basic character-type columns of a given table with NULL
. You can then cast to integer
if the remaining strings are valid number literals.
CREATE OR REPLACE FUNCTION f_empty2null(_tbl regclass, OUT updated_rows int) AS
$func$
DECLARE
-- basic char types, possibly extend with citext, domains or custom types:
_typ CONSTANT regtype[] := '{text, bpchar, varchar, \"char\"}';
_sql text;
BEGIN
SELECT INTO _sql -- build command
format('UPDATE %s SET %s WHERE %s'
, _tbl
, string_agg(format($$%1$s = NULLIF(%1$s, '')$$, col), ', ')
, string_agg(col || $$ = ''$$, ' OR '))
FROM (
SELECT quote_ident(attname) AS col
FROM pg_attribute
WHERE attrelid = _tbl -- valid, visible, legal table name
AND attnum >= 1 -- exclude tableoid & friends
AND NOT attisdropped -- exclude dropped columns
AND NOT attnotnull -- exclude columns defined NOT NULL!
AND atttypid = ANY(_typ) -- only character types
ORDER BY attnum
) sub;
-- Test
-- RAISE NOTICE '%', _sql;
-- Execute
IF _sql IS NULL THEN
updated_rows := 0; -- nothing to update
ELSE
EXECUTE _sql;
GET DIAGNOSTICS updated_rows = ROW_COUNT; -- Report number of affected rows
END IF;
END
$func$ LANGUAGE plpgsql;
致电:
SELECT f_empty2null('mytable');
SELECT f_empty2null('myschema.mytable');
还要获取列名updated_rows
:
SELECT * FROM f_empty2null('mytable');
-
表名必须有效且可见,并且主叫用户必须具有所有必要的特权.如果不满足这些条件中的任何一个,则该函数将不执行任何操作-即,也不会破坏任何操作.我强制转换为对象标识符类型
regclass
确定.
可以按原样('mytable'
)提供表名,然后默认提供决定.或符合模式资格以选择特定模式('myschema.mytable'
).
Table name has to be valid and visible and the calling user must have all necessary privileges. If any of these conditions are not met, the function will do nothing - i.e. nothing can be destroyed, either. I cast to the object identifier type
regclass
to make sure of it.
The table name can be supplied as is ('mytable'
), then thesearch_path
decides. Or schema-qualified to pick a certain schema ('myschema.mytable'
).
查询系统目录以获取所有内容(字符-类型)列.提供的函数使用以下基本字符类型:text
,bpchar
,varchar
,"char"
.仅处理相关的列.
Query the system catalog to get all (character-type) columns of the table. The provided function uses these basic character types: text
, bpchar
, varchar
, "char"
. Only relevant columns are processed.
使用 quote_ident()
或format()
清除列名和防范SQLi .
更新的版本使用基本的SQL聚合函数string_agg()
来构建命令字符串而不会循环,这更简单,更快捷.而且更优雅. :)
The updated version uses the basic SQL aggregate function string_agg()
to build the command string without looping, which is simpler and faster. And more elegant. :)
必须使用具有EXECUTE
的动态SQL.
Has to use dynamic SQL with EXECUTE
.
更新后的版本不包含定义为NOT NULL
的列,并且仅在单个语句中更新 每行 ,这对于具有多个字符的表来说要快得多,键入列.
The updated version excludes columns defined NOT NULL
and only updates each row once in a single statement, which is much faster for tables with multiple character-type columns.
应与任何现代PostgreSQL版本一起使用.经过Postgres 9.1、9.3和9.5的测试.
Should work with any modern version of PostgreSQL. Tested with Postgres 9.1, 9.3 and 9.5.
这篇关于用空值替换空字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!