Problem description
I have a large collection of about 1 million documents in MarkLogic (version 9), with the following structure:
<File>
<Id></Id>
<ModifiedAt></ModifiedAt>
<Author></Author>
<Title></Title>
</File>
I need to iterate through the entire collection and replace the space in ModifiedAt with T in every document.
Example document:
<File>
<Id>12121</Id>
<ModifiedAt>2011-06-08 14:29:29.000</ModifiedAt>
<Author>Test</Author>
<Title>Test</Title>
</File>
The ModifiedAt field should become: 2011-06-08T14:29:29.000
My code is as follows:
for $doc in fn:collection('File')
return xdmp:node-replace($doc/File/ModifiedAt,<ModifiedAt>{fn:replace($doc/File/ModifiedAt,' ','T')}</ModifiedAt>)
The issue is that this code times out.
I assume there is a more elegant way to make this modification across the entire collection, and maybe someone has a hint.
Thanks!
Recommended answer
Various external tools, such as Corb2 and MLCP, can be used for this, but you can also do ad hoc (or less ad hoc) work from inside MarkLogic. Essentially, all you need to do is process the documents in batches, so that no single transaction has to update the whole collection. Taskbot is very useful for that:
https://github.com/mblakele/taskbot
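If you would rather not pull in a library, the same batching idea can be sketched with built-in functions only. This is not Taskbot's API, just a rough illustration of the pattern. It assumes the URI lexicon is enabled on the database (needed by cts:uris), that each document has exactly one ModifiedAt element, and that a batch size of 1000 suits your cluster; the batch size and the variable names are my own choices, not something from the question.

```xquery
xquery version "1.0-ml";

(: Assumed batch size; tune it so each task finishes well within the time limit. :)
declare variable $BATCH-SIZE := 1000;

(: Collect only the URIs from the lexicon (assumes the URI lexicon is enabled). :)
let $uris := cts:uris((), (), cts:collection-query("File"))
let $count := fn:count($uris)
(: Start positions 1, 1001, 2001, ... :)
for $start in (1 to $count)[. mod $BATCH-SIZE eq 1]
let $batch := fn:subsequence($uris, $start, $BATCH-SIZE)
return
  (: Each batch runs as its own task, and so as its own transaction, on the Task Server. :)
  xdmp:spawn-function(
    function() {
      for $uri in $batch
      let $node := fn:doc($uri)/File/ModifiedAt
      (: Skip documents that are already converted, so the job is safe to re-run. :)
      where fn:contains($node, " ")
      return
        xdmp:node-replace(
          $node,
          <ModifiedAt>{fn:replace($node, " ", "T")}</ModifiedAt>
        )
    },
    <options xmlns="xdmp:eval">
      <update>true</update>
    </options>
  )
```

The calling request only queues about 1000 tasks for 1 million documents, so it returns quickly and the updates themselves run in the background. Check the Task Server queue size and error log if you raise the document count or shrink the batch size a lot.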
HTH!