Problem description
I am trying to insert data from a Pandas DataFrame into an existing Django model, Agency, that uses a SQLite backend. However, following the answers on "How to write a Pandas Dataframe to Django model" and "Saving a Pandas DataFrame to a Django Model" leads to the whole SQLite table being replaced, which breaks the Django code. Specifically, the Django auto-generated id primary key column is replaced by index, which causes errors when rendering templates (no such column: agency.id).
Here is the code and the result of using Pandas to_sql on the SQLite table agency.
In models.py:
from django.db import models

class Agency(models.Model):
    name = models.CharField(max_length=128)
In myapp/management/commands/populate.py:
import pandas as pd
from django.conf import settings
from django.core.management.base import BaseCommand
from sqlalchemy import create_engine

class Command(BaseCommand):
    def handle(self, *args, **options):
        # Build a SQLAlchemy engine pointing at Django's default SQLite database
        database_name = settings.DATABASES['default']['NAME']
        database_url = 'sqlite:///{}'.format(database_name)
        engine = create_engine(database_url, echo=False)

        # Insert the data (if_exists="replace" drops and recreates the table)
        agencies = pd.DataFrame({"name": ["Agency 1", "Agency 2", "Agency 3"]})
        agencies.to_sql("agency", con=engine, if_exists="replace")
Calling 'python manage.py populate' successfully adds the three agencies to the table:
index name
0 Agency 1
1 Agency 2
2 Agency 3
However, doing so has changed the DDL of the table from:
CREATE TABLE "agency" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, "name" varchar(128) NOT NULL)
to:
CREATE TABLE agency (
"index" BIGINT,
name TEXT
);
CREATE INDEX ix_agency_index ON agency ("index")
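The schema change shown above can be reproduced outside Django. The following is a minimal sketch, assuming an in-memory SQLite database and a hand-written copy of the original DDL (the helper table_columns is hypothetical, introduced only for this illustration). It runs to_sql once with if_exists="replace" and once with if_exists="append" and index=False, then inspects the resulting columns.

```python
import sqlite3

import pandas as pd

# Hand-written copy of the DDL that Django generated for the table (see above).
DJANGO_DDL = (
    'CREATE TABLE "agency" ('
    '"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, '
    '"name" varchar(128) NOT NULL)'
)


def table_columns(con, table):
    """Hypothetical helper: list a table's column names via PRAGMA table_info."""
    return [row[1] for row in con.execute(f"PRAGMA table_info({table})")]


agencies = pd.DataFrame({"name": ["Agency 1", "Agency 2", "Agency 3"]})

# Case 1: if_exists="replace" drops the table and recreates it from the DataFrame,
# so the id column disappears and the DataFrame index becomes a column.
con = sqlite3.connect(":memory:")
con.execute(DJANGO_DDL)
agencies.to_sql("agency", con=con, if_exists="replace")
replaced_columns = table_columns(con, "agency")
con.close()

# Case 2: if_exists="append" with index=False inserts rows into the existing table,
# leaving its definition alone; SQLite fills in the id values itself.
con = sqlite3.connect(":memory:")
con.execute(DJANGO_DDL)
agencies.to_sql("agency", con=con, if_exists="append", index=False)
appended_columns = table_columns(con, "agency")
appended_rows = con.execute("SELECT id, name FROM agency ORDER BY id").fetchall()
con.close()

print(replaced_columns)
print(appended_columns)
print(appended_rows)
```

In this sketch the first case loses the id column while the second keeps it, which suggests the breakage comes from the "replace" mode rather than from to_sql as such.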
How can I add the DataFrame to the model managed by Django and keep the Django ORM intact?
Recommended answer
To answer my own question, since I now import data into Django using Pandas quite often: the mistake I was making was using Pandas' built-in SQLAlchemy support (to_sql), which was modifying the underlying database table definition. In the context above, you can simply use the Django ORM to connect and insert the data:
import pandas as pd
from django.core.management.base import BaseCommand

from myapp.models import Agency

class Command(BaseCommand):
    def handle(self, *args, **options):
        # Process data with Pandas
        agencies = pd.DataFrame({"name": ["Agency 1", "Agency 2", "Agency 3"]})

        # Iterate over the DataFrame and create one Agency per row
        for row in agencies.itertuples():
            Agency.objects.create(name=row.name)
However, you may often want to import data from an external script rather than from a management command, as above, or from Django's shell. In this case you must first connect to the Django ORM by calling django.setup():
import os, sys
import django
import pandas as pd
sys.path.append('../..') # add path to project root dir
os.environ["DJANGO_SETTINGS_MODULE"] = "myproject.settings"
# for more sophisticated setups, if you need to change connection settings (e.g. when using django-environ):
#os.environ["DATABASE_URL"] = "postgres://myuser:mypassword@localhost:54324/mydb"
# Connect to Django ORM
django.setup()
# process data
from myapp.models import Agency
Agency.objects.create(name='MyAgency')
- Here I have exported my settings module, myproject.settings, to the DJANGO_SETTINGS_MODULE environment variable so that django.setup() can pick up the project settings.
- Depending on where you run the script from, you may need to add the project root to the system path so Django can find the settings module. In this case, I run my script two directories below my project root.
- You can modify any settings before calling setup. This is useful if your script needs to connect to the DB differently from what is configured in settings, for example when running a script locally against Django/Postgres Docker containers.

Note that the above example uses django-environ to specify the DB settings.
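One caveat about the path handling: a relative entry such as '../..' is resolved against the current working directory, not against the script's location, so the script only finds the settings module when launched from its own directory. A minimal sketch of a location-independent alternative follows; the function name project_root and its levels_up parameter are hypothetical, introduced only for this illustration.

```python
from pathlib import Path


def project_root(script_path, levels_up=2):
    """Return the directory `levels_up` levels above the script's own directory.

    Hypothetical helper: with levels_up=2 it matches a script that lives
    two directories below the project root, as described above.
    """
    script_dir = Path(script_path).resolve().parent
    # parents[0] is one level above script_dir, parents[1] two levels, and so on.
    return script_dir.parents[levels_up - 1]


# Example with a made-up path: the script sits in <root>/tools/importers/.
example = project_root("/nonexistent_root/proj/tools/importers/load.py")
print(example.name)
```

In the external script, sys.path.append('../..') could then be written as sys.path.append(str(project_root(__file__))), which no longer depends on the directory the script is started from.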