Problem description
What is the proper way to perform mass updates on entities in a Google App Engine Datastore? Can it be done without having to retrieve the entities?
For example, what would be the GAE equivalent of something like this in SQL:
UPDATE dbo.authors
SET city = replace(city, 'Salt', 'Olympic')
WHERE city LIKE 'Salt%';
There isn't a direct translation. The datastore really has no concept of updates; all you can do is overwrite old entities with a new entity at the same address (key). To change an entity, you must fetch it from the datastore, modify it locally, and then save it back.
There's also no equivalent to the LIKE operator. While prefix matching (a trailing wildcard, as in 'Salt%') is possible with some tricks, if you wanted to match '%Salt%' you'd have to read every single entity into memory and do the string comparison locally.
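The trailing-wildcard trick mentioned above is usually done as a range query: every string that starts with a prefix sorts between the prefix itself and the prefix followed by a very high code point. The sketch below only computes those bounds and checks them against plain strings, so it runs without App Engine. The helper names `prefix_bounds` and `matches_prefix` are made up for illustration, and the `Author` model in the comment is hypothetical.

```python
def prefix_bounds(prefix):
    """Return (low, high) bounds that bracket all strings starting with prefix."""
    # u'\ufffd' is a very high code point, so prefix + u'\ufffd' sorts after
    # practically any real string that begins with the prefix.
    return prefix, prefix + u"\ufffd"


def matches_prefix(value, prefix):
    """Local equivalent of the two inequality filters used in the query."""
    low, high = prefix_bounds(prefix)
    return low <= value < high


# With the old db API the same bounds would be applied roughly like this
# (Author is a hypothetical model, not something from the question):
#   low, high = prefix_bounds('Salt')
#   query = Author.all().filter('city >=', low).filter('city <', high)

if __name__ == "__main__":
    for city in ["Salt Lake City", "Salton", "Salem", "Boise"]:
        print(city, matches_prefix(city, "Salt"))
```

This only works for a leading prefix. A pattern such as '%Salt%' has no contiguous range in the index, which is why it forces a full scan.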
So it's not going to be quite as clean or efficient as SQL. This is a tradeoff with most distributed object stores, and the datastore is no exception.
That said, the mapper library is available to facilitate such batch updates. Follow the example and use something like this for your process function:
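from mapreduce import operation as op

def process(entity):
    # Only rewrite entities whose city starts with 'Salt'.
    if entity.city.startswith('Salt'):
        entity.city = entity.city.replace('Salt', 'Olympic')
        # Yielding the Put lets the mapper batch the writes.
        yield op.db.Put(entity)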
There are other alternatives besides the mapper. The most important optimization tip is to batch your updates; don't save back each updated entity individually. If you use the mapper and yield puts, this is handled automatically.
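If you are not using the mapper, the batching advice above could look roughly like the following sketch. It is not the mapper's implementation: `Author` is a hypothetical stand-in class, `bulk_rename` and `renamed` are illustrative names, and `put_many` is an assumed callable that saves a list of entities in one call (with the old db API, `db.put` accepts a list and could be passed here).

```python
class Author(object):
    """Hypothetical stand-in for a datastore model with a `city` property."""

    def __init__(self, city):
        self.city = city


def renamed(entities):
    """Yield only the entities that were actually modified."""
    for entity in entities:
        if entity.city.startswith('Salt'):
            entity.city = entity.city.replace('Salt', 'Olympic')
            yield entity


def bulk_rename(entities, put_many, batch_size=100):
    """Apply the rename and save changed entities in batches.

    put_many is called once per batch with a list of entities,
    instead of once per entity. Returns the number of entities written.
    """
    written = 0
    batch = []
    for entity in renamed(entities):
        batch.append(entity)
        if len(batch) == batch_size:
            put_many(batch)
            written += len(batch)
            batch = []
    if batch:
        # Flush the final, partially filled batch.
        put_many(batch)
        written += len(batch)
    return written


if __name__ == "__main__":
    saved = []
    authors = [Author("Salt Lake City"), Author("Boise"), Author("Salton")]
    print(bulk_rename(authors, saved.append, batch_size=2))
    print([a.city for a in authors])
```

Unchanged entities are never written back, and the number of save calls grows with the number of batches rather than the number of entities.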