
MarkLogic: remove duplicate node/element

Posted on 2020-05-01 07:22:57

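I have a few thousand documents that contain duplicate element nodes. How can I find and remove the duplicate title elements in these XML files?

I tried fn:distinct-values(), but it causes performance problems.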

Example: 01.xml

<doc>
     <pdf>1</pdf>
     <title>Head First JavaScript</title>
     <title>Head First JavaScript</title>
</doc>

02.xml

<doc>
    <pdf>0</pdf>
    <title>Python: Programming Basics for Absolute Beginners </title>
    <title>Python: Programming Basics for Absolute Beginners </title>
</doc>

Expected result: 01.xml

<doc>
     <pdf>1</pdf>
     <title>Head First JavaScript</title>

</doc>

02.xml

<doc>
    <pdf>0</pdf>
    <title>Python: Programming Basics for Absolute Beginners </title>

</doc>


Asked by: thichxai
Viewed: 91 times
Sudeep Rawat 2020-02-13 16:43

Hi, please test the code below.

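(: Sample document with repeated <title> elements :)
let $doc :=
<doc>
    <title>Head First JavaScript</title>
    <title>Head First JavaScript</title>
    <title>hellao</title>
    <title>hello</title>
    <title>hello</title>
    <title>Python: Programming Basics for Absolute Beginners </title>
    <title>ahello</title>
    <title>Python: Programming Basics for Absolute Beginners </title>
</doc>

(: Keep a title only when no earlier sibling <title> has the same text.
   Comparing against preceding-sibling::title rather than node() avoids
   false matches on other siblings such as <pdf> or text nodes. :)
for $data in $doc//title[not(. = preceding-sibling::title)]
return $data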
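
The expression above only returns the distinct titles from an in-memory element; it does not change anything stored in the database. Inverting the predicate selects the repeated titles instead, and those could then be removed in place. What follows is a minimal sketch rather than a tested solution: it assumes the documents live in a collection called "books" (a hypothetical name, not from the question) and that each one has a doc root element as in the examples.

```xquery
xquery version "1.0-ml";

(: Sketch only. "books" is a hypothetical collection name; replace it
   with whatever collection or directory actually holds the documents. :)
for $doc in fn:collection("books")/doc
(: A title is a duplicate when an earlier sibling title has the same text,
   so the first occurrence of each title is left untouched. :)
for $dup in $doc/title[. = preceding-sibling::title]
return xdmp:node-delete($dup)
```

Because the comparison only looks at sibling titles inside one document, it avoids building a distinct-values list across the whole data set. With several thousand documents it may still be worth running the update in smaller batches instead of a single transaction.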