Question
I'm working with a CSV of a few hundred columns, many of which are just enumerations, i.e.:
[
['code_1', 'code_2', 'code_3', ..., 'code_50'],
[1, 2, 3, ..., 50],
[2, 3, 4, ..., 51],
...
[400000, 400001, 400002, ..., 400049]
]
I'm importing this data into PostgreSQL and would like to concatenate these columns into a single array column, such as:
[
['codes'],
['{1, 2, 3, ..., 50}']
]
and so on.
I'm aware of roundabout ways to accomplish this, such as
df['codes'] = pd.DataFrame(["{" + df['code_1'] + ", " + df['code_2'] + "}"]).T
but that's a lot of redundant code to write and maintain given the size of this CSV.
What I have to work with is a column list. I've already extracted the names of the enumerated columns, such as:
codes = [
'code_1',
'code_2',
'code_3',
...
]
Before I begin writing my own custom implode_columns(arr) function, is there anything in pandas that already solves this problem, or that accommodates PostgreSQL arrays in a convenient way?
Recommended answer
This assumes that you are already connected to PostgreSQL and that the table already exists there. If not, see https://wiki.postgresql.org/wiki/Psycopg2_Tutorial
import psycopg2

try:
    conn = psycopg2.connect("host='localhost' dbname='template1' user='dbuser' password='dbpass'")
except psycopg2.Error:
    # Catch only psycopg2 errors rather than using a bare except.
    print("I am unable to connect to the database")
First, open the .csv file.
>>> import csv
>>> with open('names.csv') as csvfile:
...     reader = csv.DictReader(csvfile)
...     for row in reader:
...         print(row['first_name'], row['last_name'])
...
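To connect this back to the question, the enumerated columns can be collapsed into one list while each row is read. The following is a minimal sketch, assuming the column names follow the code_N pattern from the question. The helper collapse_codes and the in-memory sample data are hypothetical and only for illustration.

```python
import csv
import io


def collapse_codes(row, columns):
    """Return the values of the given columns as one list of ints."""
    return [int(row[name]) for name in columns]


# A small in-memory stand-in for the real CSV file (made-up sample data).
sample = io.StringIO("code_1,code_2,code_3\n1,2,3\n2,3,4\n")
reader = csv.DictReader(sample)

# Pick out the enumerated columns by their shared prefix.
code_columns = [name for name in reader.fieldnames if name.startswith("code_")]

# One list per CSV row, in the original column order.
rows = [collapse_codes(row, code_columns) for row in reader]
print(rows)
```

Selecting the columns by prefix avoids typing out every column name, which was the maintenance concern raised in the question.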
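(The example above is from the Python csv module documentation.)
Then insert each row with cur.execute. In the psycopg2 usage documentation the example parameters are (100, "abc'def"); replace them with your own values, for example (variable1, variable2). See http://initd.org/psycopg/docs/usage.html
Or, as full sample code: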
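>>> import csv
>>> import psycopg2
>>> conn = psycopg2.connect("host='localhost' dbname='template1' user='dbuser' password='dbpass'")
>>> cur = conn.cursor()
>>> with open('names.csv') as csvfile:
...     reader = csv.DictReader(csvfile)
...     for row in reader:
...         # variable1 and variable2 are placeholders for values taken from row
...         cur.execute("INSERT INTO test (num, data) VALUES (%s, %s)", (variable1, variable2))
...
>>> conn.commit()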
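For the pandas side of the question, one possible approach is to build a single list-valued column directly from the column list. This is only a sketch under assumptions: the DataFrame below is made-up sample data, and codes mirrors the column list from the question, shortened to three entries.

```python
import pandas as pd

# Made-up sample data standing in for the real CSV.
df = pd.DataFrame({
    "code_1": [1, 2],
    "code_2": [2, 3],
    "code_3": [3, 4],
})

# The column list from the question, shortened for illustration.
codes = ["code_1", "code_2", "code_3"]

# One Python list per row, stored in a single 'codes' column.
df["codes"] = df[codes].values.tolist()

# Drop the original enumerated columns once they are merged.
df = df.drop(columns=codes)
print(df)
```

psycopg2 adapts a Python list to a PostgreSQL array when the list is passed as a query parameter. So, assuming a hypothetical table with an integer[] column, each row's list could be passed to cur.execute with a %s placeholder in the same way as the parameters in the sample code.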
Hope this helps.