Forgive me if this is an obvious question, but I'm new to Pony and to databases in general, and I couldn't find the part of the documentation that answers it.
I'm trying to create a database with companies and the locations where those companies have offices. This is a many-to-many relationship, since each company is in multiple locations and each location can host multiple companies. I'm defining my entities as follows:
from pony import orm

# The entity classes below need a Database object to derive from
db = orm.Database()

class Company(db.Entity):
    '''A company entry in database'''
    name = orm.PrimaryKey(str)
    locations = orm.Set('Location')

class Location(db.Entity):
    '''A location for a company'''
    name = orm.PrimaryKey(str)
    companies = orm.Set('Company')
Ideally, I'd like to write a function that adds a company to the database together with the list of locations where that company exists, creating new Location instances for any locations that don't already exist. I can quickly think of two ways to do so.
The first is to try to fetch each location and handle the exception when it doesn't exist yet:
def add_company(name, locations):
    loc_entities = []
    for l in locations:
        try:
            # Lookup by primary key; raises if the location is missing
            loc = Location[l]
        except orm.core.ObjectNotFound:
            loc = Location(name=l)
        loc_entities.append(loc)
    comp = Company(name=name, locations=loc_entities)
Second would be to query the database and ask whether the locations exist yet:
def add_company2(name, locations):
    # One query for all locations that already exist
    old_loc_entities = orm.select(l for l in Location if l.name in locations)[:]
    old_locations = [l.name for l in old_loc_entities]
    # Names that were not found still have to be created
    new_locations = set(locations) - (set(locations) & set(old_locations))
    loc_entities = [Location(name=l) for l in new_locations] + old_loc_entities
    comp = Company(name=name, locations=loc_entities)
Of these two, I'd guess that the more Pythonic way is to simply handle the exception, but does this run into the N+1 problem? I've noticed that when I use the name as the primary key, a query is made every time I access the entity by index. When I just let Pony pick sequential ids, I don't seem to need a query. I haven't tested this with any large datasets, so I have no benchmarks yet.
Internally, Pony caches sequential primary keys in the same way as string primary keys, so I think there should be no difference. Each db_session has a separate cache (called the "identity map"). After an object is read, any access by primary key (or any other unique key) within the same db_session should return the same object directly from the identity map without issuing a new query. After the db_session is over, another access by the same key will issue a new query, because the object could have been modified in the database by a concurrent transaction.
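To make the caching behaviour described above concrete, here is a toy model of a per-session identity map in plain Python. It is only a sketch of the idea and not Pony's actual implementation; the ToySession class, its attributes, and the sample table are made up for illustration.

```python
class ToySession:
    """Toy model of a per-session identity map; not Pony's real code."""

    def __init__(self, table):
        self.table = table        # dict standing in for a database table
        self.identity_map = {}    # primary key -> object loaded in this session
        self.queries = 0          # number of simulated database round trips

    def get(self, pk):
        if pk in self.identity_map:
            return self.identity_map[pk]   # cache hit, no query issued
        self.queries += 1                  # simulated SELECT ... WHERE pk = ?
        row = self.table.get(pk)
        if row is not None:
            self.identity_map[pk] = row
        return row


# Hypothetical table with one string key and one integer key
table = {"Berlin": {"name": "Berlin"}, 1: {"name": "row with an integer key"}}

session = ToySession(table)
first = session.get("Berlin")    # first access: one simulated query
second = session.get("Berlin")   # same session: served from the identity map
session.get(1)                   # integer keys go through the same path
session.get(1)

new_session = ToySession(table)  # a new session starts with an empty map
new_session.get("Berlin")        # so the same key is queried again
```

In this model the kind of key makes no difference: string and integer keys both cost one simulated query per session, and repeated access within a session costs nothing more.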
Regarding your approaches, I think both of them are valid. If a company has just a few locations (say, around ten), I'd use the first approach, because it feels more Pythonic to me. It does indeed cause N+1 queries, but a query that retrieves an object by primary key is very fast and easy for the server to execute. The code can be expressed a little more compactly by using the get method:
def add_company(name, locations):
    loc_entities = [Location.get(name=l) or Location(name=l)
                    for l in locations]
    comp = Company(name=name, locations=loc_entities)
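If the same "get it, or create it" pattern is needed for several entities, it could be pulled into a small helper. The sketch below is an assumption-laden illustration: get_or_create is a hypothetical name and not part of Pony, and FakeLocation is a stand-in class that only mimics the get classmethod so the example runs without a database.

```python
def get_or_create(entity, **key):
    """Return the instance matching `key`, creating it when none exists.

    Hypothetical helper: works with any class that has a `get(**kwargs)`
    classmethod returning None when nothing matches.
    """
    obj = entity.get(**key)
    if obj is None:
        obj = entity(**key)
    return obj


class FakeLocation:
    """Stand-in for an entity class so the sketch runs without a database."""
    _rows = {}

    def __init__(self, name):
        self.name = name
        FakeLocation._rows[name] = self   # simulate inserting a row

    @classmethod
    def get(cls, name):
        return cls._rows.get(name)        # None when the row is missing


berlin = get_or_create(FakeLocation, name="Berlin")   # created
again = get_or_create(FakeLocation, name="Berlin")    # found, not recreated
```

With real entities the list comprehension above would then read `[get_or_create(Location, name=l) for l in locations]`, still issuing one lookup per location.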
The second approach, retrieving all existing locations with a single query, looks like premature optimization to me, but if you create hundreds of companies per second and each company has hundreds of locations, it may be worth using.
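To see the difference in query counts between the two approaches, the following sketch uses the standard-library sqlite3 module directly instead of Pony and records every SELECT statement through a trace callback. The table layout, sample names, and function names are assumptions chosen for illustration; the SQL that Pony generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location (name TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO location VALUES (?)", [("Berlin",), ("Paris",)])
conn.commit()

selects = []  # every SELECT statement the connection executes

def record_select(sql):
    if sql.lstrip().upper().startswith("SELECT"):
        selects.append(sql)

conn.set_trace_callback(record_select)

def existing_one_by_one(names):
    """One primary-key lookup per name (the N+1 pattern)."""
    found = []
    for name in names:
        row = conn.execute(
            "SELECT name FROM location WHERE name = ?", (name,)
        ).fetchone()
        if row is not None:
            found.append(row[0])
    return found

def existing_in_one_query(names):
    """A single query with an IN clause covering all names."""
    placeholders = ", ".join("?" for _ in names)
    sql = f"SELECT name FROM location WHERE name IN ({placeholders})"
    return [row[0] for row in conn.execute(sql, list(names))]

wanted = ["Berlin", "Paris", "Tokyo"]

found_a = existing_one_by_one(wanted)
count_a = len(selects)        # one SELECT per requested name

selects.clear()
found_b = existing_in_one_query(wanted)
count_b = len(selects)        # a single SELECT for the whole list

missing = set(wanted) - set(found_b)   # names that still need to be created
```

The per-name version issues as many SELECT statements as there are names, while the IN version issues one regardless of list length, which only starts to matter when the lists are long or the function is called very often.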