This article is reproduced from serverfault.com.

XSLT copy elements from one file to another matching two conditions

Posted on 2020-11-27 23:26:53

Pretty basic question here. I am trying to write an XSLT stylesheet that copies <w> elements from one file into another when the <w> elements match on two conditions:

  • @lemma must match in both elements, and
  • the text of the first <m> element in the target must match the full text of the <w> in the source (even if that text is spread across several <m> elements).

If either condition fails, the <w> should remain unmodified.

Here is a sample file to be modified.

    <?xml version="1.0" encoding="UTF-8"?>
    <text>
        <w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m></w>
        <w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m><m baseForm="FishCake">FishCake</m></w>
        <w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m><m baseForm="s">s</m></w>
        <w lemma="FishCake" corresp="1"><m baseForm="FishC">FishC</m><m baseForm="ake">ake</m></w>
        <w lemma="cat" corresp="1"><m baseForm="dog">dog</m></w>
        <w lemma="dog" corresp="1"><m baseForm="cat">cat</m></w>
        <w lemma="dog" corresp="1"><m baseForm="dog">dog</m></w>
        <w lemma="dog" corresp="1"><m baseForm="dog">dog</m><m baseForm="cat">cat</m></w>
    </text>

Here is a sample file with the elements to be copied over (source.xml):

    <?xml version="1.0" encoding="UTF-8"?>
    <text>
        <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
        <w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
    </text>

I would expect the code to produce:

    <?xml version="1.0" encoding="UTF-8"?>
    <text>
        <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
        <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
        <w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m><m baseForm="s">s</m></w>
        <w lemma="FishCake" corresp="1"><m baseForm="FishC">FishC</m><m baseForm="ake">ake</m></w>
        <w lemma="cat" corresp="1"><m baseForm="dog">dog</m></w>
        <w lemma="dog" corresp="1"><m baseForm="cat">cat</m></w>
        <w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
        <w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
    </text>

I've tried the following XSLT (a lousy attempt at modifying some code I already had), but I only manage to get it to match on the @lemma:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        xpath-default-namespace="http://www.tei-c.org/ns/1.0"
        exclude-result-prefixes="#all"
        version="3.0">
        <xsl:param name="lookup-doc" select="document('source.xml')"/>
        <xsl:key name="ref" match="*[@lemma|m[1]/text()]" use="@lemma|m[1]/text()"/>
        <xsl:mode on-no-match="shallow-copy"/>
        <xsl:mode name="ref-copy" on-no-match="shallow-copy"/>
        <xsl:template match="*[key('ref', @lemma|m[1]/text(), $lookup-doc)]">
            <xsl:apply-templates select="key('ref', @lemma|m[1]/text(), $lookup-doc)" mode="ref-copy"/>
        </xsl:template>
    </xsl:stylesheet>

What I get is:

    <?xml version="1.0" encoding="UTF-8"?><text>
        <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
        <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
        <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
        <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
        <w lemma="cat" corresp="1"><m baseForm="dog">dog</m></w>
        <w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
        <w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
        <w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
    </text>

Any tips? Cheers!

Questioner: Benjamín Molineaux

Answer by Martin Honnen, 2020-11-28 19:03:37

I haven't been able to fully grasp the rules, and my attempt to write them as a key and use that

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="#all"
        version="3.0">

      <xsl:mode on-no-match="shallow-copy"/>

      <xsl:param name="doc2">
        <text>
          <w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
          <w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
        </text>
      </xsl:param>

      <xsl:key name="ref" match="w" composite="yes" use="@lemma, ."/>

      <xsl:template match="w[key('ref', (@lemma, m[1]), $doc2)]">
        <xsl:copy-of select="key('ref', (@lemma, m[1]), $doc2)"/>
      </xsl:template>

    </xsl:stylesheet>

does not quite give the result you have described (the second document is inline for compactness and completeness of the example but of course it could use <xsl:param name="doc2" select="doc($lookup-doc)"/> instead).
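A minimal sketch of that file-based variant, under a few assumptions: `lookup-doc` is treated here as a string parameter holding the URI of the second file (in the question it holds a document node instead), the default value `source.xml` is resolved relative to the stylesheet, and both files are in no namespace, as in the samples. The key and the template are unchanged from the stylesheet above, so the output should have the same limitations.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- Copy everything that no template below overrides -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: URI of the lookup file; a relative URI is resolved
       against the stylesheet's base URI by doc() -->
  <xsl:param name="lookup-doc" as="xs:string" select="'source.xml'"/>

  <!-- The second document, loaded from the file instead of being inline -->
  <xsl:param name="doc2" select="doc($lookup-doc)"/>

  <!-- Composite key: each w is indexed under the pair (lemma, full text) -->
  <xsl:key name="ref" match="w" composite="yes" use="@lemma, ."/>

  <!-- Replace a w whose (lemma, first m) pair is found in the lookup file -->
  <xsl:template match="w[key('ref', (@lemma, m[1]), $doc2)]">
    <xsl:copy-of select="key('ref', (@lemma, m[1]), $doc2)"/>
  </xsl:template>

</xsl:stylesheet>
```

With an XSLT 3.0 processor the parameter can usually be overridden at invocation time to point at a different lookup file; the exact way of passing it depends on the processor.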

Perhaps you can clarify which document is the "source", which the "target" and explain the rules in a bit more detail and also why the examples given match or don't match.
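As a side note on why the stylesheet in the question does not require both values to match: `@lemma|m[1]/text()` is a union, so an ordinary (non-composite) key indexes each element once per value, and `key()` returns every node that matches any one of the supplied values. That is an OR, not an AND. In addition, if the input is in no namespace, as in the samples shown, the `xpath-default-namespace` declaration makes `m[1]` look for `m` elements in the TEI namespace, so only `@lemma` ends up contributing. The following is a diagnostic sketch rather than a solution; the key names `either` and `both` and the `report` output format are made up for illustration, and `source.xml` is assumed to sit next to the stylesheet.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- Assumption: the lookup file sits next to this stylesheet -->
  <xsl:param name="doc2" select="doc('source.xml')"/>

  <!-- OR semantics: each w is indexed under its lemma and, separately,
       under its full text -->
  <xsl:key name="either" match="w" use="string(@lemma), string(.)"/>

  <!-- AND semantics: each w is indexed under the pair (lemma, full text) -->
  <xsl:key name="both" match="w" composite="yes"
           use="string(@lemma), string(.)"/>

  <!-- For every w in the file to be modified, report which lookups succeed -->
  <xsl:template match="/">
    <report>
      <xsl:for-each select="text/w">
        <w lemma="{@lemma}" first-m="{m[1]}"
           either="{exists(key('either', (string(@lemma), string(m[1])), $doc2))}"
           both="{exists(key('both', (string(@lemma), string(m[1])), $doc2))}"/>
      </xsl:for-each>
    </report>
  </xsl:template>

</xsl:stylesheet>
```

For the sample `w` with `lemma="cat"` and a first `m` of `dog`, the `either` lookup succeeds (the source `dog` entry matches on one value) while the `both` lookup fails, which is the difference between the two approaches.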