Pretty basic question here.
I am trying to write an XSLT to copy <w> elements into a file (from another file) if the <w>s match on two parameters:
@lemma must match in both elements, and
text() must match for the first <m> element of the target and the full <w> in the source (even if it is spread across several <m>s).
If either parameter doesn't match, then the <w> should remain unmodified.
Here is a sample file to be modified.
<?xml version="1.0" encoding="UTF-8"?>
<text>
<w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m></w>
<w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m><m baseForm="FishCake">FishCake</m></w>
<w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m><m baseForm="s">s</m></w>
<w lemma="FishCake" corresp="1"><m baseForm="FishC">FishC</m><m baseForm="ake">ake</m></w>
<w lemma="cat" corresp="1"><m baseForm="dog">dog</m></w>
<w lemma="dog" corresp="1"><m baseForm="cat">cat</m></w>
<w lemma="dog" corresp="1"><m baseForm="dog">dog</m></w>
<w lemma="dog" corresp="1"><m baseForm="dog">dog</m><m baseForm="cat">cat</m></w>
</text>
Here is a sample file with elements to be copied over (source.xml)
<?xml version="1.0" encoding="UTF-8"?>
<text>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
</text>
I would expect the code to produce:
<?xml version="1.0" encoding="UTF-8"?>
<text>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="FishCake" corresp="1"><m baseForm="FishCake">FishCake</m><m baseForm="s">s</m></w>
<w lemma="FishCake" corresp="1"><m baseForm="FishC">FishC</m><m baseForm="ake">ake</m></w>
<w lemma="cat" corresp="1"><m baseForm="dog">dog</m></w>
<w lemma="dog" corresp="1"><m baseForm="cat">cat</m></w>
<w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
<w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
</text>
I've tried the following XSLT (a lousy attempt at modifying some code I had already), but I only manage to get it to match on the @lemma:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xpath-default-namespace="http://www.tei-c.org/ns/1.0"
exclude-result-prefixes="#all"
version="3.0">
<xsl:param name="lookup-doc" select="document('source.xml')"/>
<xsl:key name="ref" match="*[@lemma|m[1]/text()]" use="@lemma|m[1]/text()"/>
<xsl:mode on-no-match="shallow-copy"/>
<xsl:mode name="ref-copy" on-no-match="shallow-copy"/>
<xsl:template match="*[key('ref', @lemma|m[1]/text(), $lookup-doc)]">
<xsl:apply-templates select="key('ref', @lemma|m[1]/text(), $lookup-doc)" mode="ref-copy">
</xsl:apply-templates>
</xsl:template>
</xsl:stylesheet>
What I get is:
<?xml version="1.0" encoding="UTF-8"?><text>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="cat" corresp="1"><m baseForm="dog">dog</m></w>
<w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
<w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
<w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
</text>
Any tips? Cheers!
I haven't been able to fully grasp the rules, and an attempt to write them as a key and use that
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all"
version="3.0">
<xsl:mode on-no-match="shallow-copy"/>
<xsl:param name="doc2">
<text>
<w lemma="FishCake" corresp="2"><m baseForm="Fish">Fish</m><m baseForm="Cake">Cake</m></w>
<w lemma="dog" corresp="2"><m baseForm="dog">dog</m></w>
</text>
</xsl:param>
<xsl:key name="ref" match="w" composite="yes" use="@lemma, ."/>
<xsl:template match="w[key('ref', (@lemma, m[1]), $doc2)]">
<xsl:copy-of select="key('ref', (@lemma, m[1]), $doc2)"/>
</xsl:template>
</xsl:stylesheet>
does not quite give the result you have described (the second document is inline for compactness and completeness of the example, but of course it could use <xsl:param name="doc2" select="doc($lookup-doc)"/> instead).
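For reference, the external-document variant might look like the sketch below. It assumes the lookup file is called source.xml (the name used in the question) and resolves relative to the stylesheet. It also assumes lookup-doc is declared as a URI string, which differs from the question's stylesheet, where that parameter holds a document node. The key and the template are unchanged from the stylesheet above.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">

  <!-- Copy everything that no template matches explicitly -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: lookup-doc is a URI string, not a document node -->
  <xsl:param name="lookup-doc" as="xs:string" select="'source.xml'"/>

  <!-- Load the second document from that URI instead of inlining it -->
  <xsl:param name="doc2" select="doc($lookup-doc)"/>

  <!-- Composite key: a w is indexed under the pair (lemma, full string value) -->
  <xsl:key name="ref" match="w" composite="yes" use="@lemma, ."/>

  <!-- Replace a w whose (lemma, first m) pair is found in the second document -->
  <xsl:template match="w[key('ref', (@lemma, m[1]), $doc2)]">
    <xsl:copy-of select="key('ref', (@lemma, m[1]), $doc2)"/>
  </xsl:template>

</xsl:stylesheet>
```

With composite="yes" the key only finds a node when both values agree, in order. A non-composite key whose use expression is a union such as @lemma|m[1]/text() indexes the node under each value separately, so a lookup succeeds as soon as either value is found.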
Perhaps you can clarify which document is the "source", which the "target" and explain the rules in a bit more detail and also why the examples given match or don't match.
Hi Martin. Thanks so much – this is exactly what I was struggling with – basically the parentheses for the matching and a general lack of XSLT background. Indeed, I can see how I confused you there – I got my target and source muddled up in my response. The fourth example should have been a match, as your code predicts.