Grouping a section of XML

Posted on 2020-11-29 14:59:47

I'm having trouble grouping part of an input tree into a container element while leaving the other parts intact. As an exercise, I am trying to do this with xsl:for-each-group.

Logic:

  1. Process elements with template matches and detect when a step element contains only w elements. If it has other content, continue with "normal" processing; otherwise continue with the next step in this sequence.
  2. Build a container element from the current node's content and pull the following adjacent siblings that do not contain a w element into the container. A step with a w element must stay outside the container, either as a separate element (if it has w and other children) or as a new container (if it has only w children).

Input example (the body element in the example can be seen as a fragment of a larger tree):

<?xml version="1.0" encoding="UTF-8"?>
<body>
    <step>
        <p>step 1</p>
    </step>
    <step>
        <p>step 2</p>
    </step>
    <step>
        <w>Warning A</w>
        <p>step 3</p>
    </step>
    <step>
        <p>step 4</p>
    </step>
    <step>
        <p>step 5</p>
    </step>
    <step>
        <w>Spec Warning X</w>
        <w>Spec Warning Y</w>
    </step>
    <step>
        <p>step 6</p>
    </step>
    <step>
        <p>step 7</p>
    </step>
    <step>
        <p>step 8</p>
    </step>
    <step>
        <p>step 9</p>
    </step>
    <step>
        <p>step 10</p>
    </step>
    <step>
        <p>step 11</p>
    </step>
    <step>
        <w>Warning B</w>
        <p>step 12</p>
    </step>
    <step>
        <p>step 13</p>
    </step>
    <step>
        <p>step 14</p>
    </step>    
</body>

Desired output:

<?xml version="1.0" encoding="UTF-8"?>
<body>
    <step>
        <p>step 1</p>
    </step>
    <step>
        <p>step 2</p>
    </step>
    <step>
        <w>Warning A</w>
        <p>step 3</p>
    </step>
    <step>
        <p>step 4</p>
    </step>
    <step>
        <p>step 5</p>
    </step>
    <container>
        <w>Spec Warning X</w>
        <w>Spec Warning Y</w>
        <step>
            <p>step 6</p>
        </step>
        <step>
            <p>step 7</p>
        </step>
        <step>
            <p>step 8</p>
        </step>
        <step>
            <p>step 9</p>
        </step>
        <step>
            <p>step 10</p>
        </step>
        <step>
            <p>step 11</p>
        </step>
    </container>
    <step>
        <w>Warning B</w>
        <p>step 12</p>
    </step>
    <step>
        <p>step 13</p>
    </step>
    <step>
        <p>step 14</p>
    </step>    
</body>

Initial test:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />

    <xsl:template match="/">
        <xsl:element name="body">
          <xsl:apply-templates select="*"/>  
        </xsl:element>        
    </xsl:template>

    <xsl:template match="step[w and not(p)]">
        <xsl:element name="container">
           <xsl:apply-templates/>
            <xsl:for-each-group select="following-sibling::*" group-adjacent="self::step[not(w)]">
                <xsl:copy-of select="current-group()"/>
            </xsl:for-each-group>
        </xsl:element>
    </xsl:template>    
    
    <xsl:template match="step[p]">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="w">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="step[p and not(w)][preceding-sibling::step[w][1][not(p)]]"/>
</xsl:transform>

Result (http://xsltransform.net/eixk6Sw/2):

<body>
    <step>
        <p>step 1</p>
    </step>
    <step>
        <p>step 2</p>
    </step>
    <step>
        <w>Warning A</w>
        <p>step 3</p>
    </step>
    <step>
        <p>step 4</p>
    </step>
    <step>
        <p>step 5</p>
    </step>
    <container>
        <w>Spec Warning X</w>
        <w>Spec Warning Y</w>
      <step>
        <p>step 6</p>
      </step>
      <step>
        <p>step 7</p>
      </step>
      <step>
        <p>step 8</p>
      </step>
      <step>
        <p>step 9</p>
      </step>
      <step>
        <p>step 10</p>
      </step>
Error on line 14
  XTTE1100: An empty sequence is not allowed as the @group-adjacent attribute of xsl:for-each-group
  in built-in template rule
  at xsl:apply-templates (#7)
     processing /body
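
A likely reading of the XTTE1100 error, offered as an aside rather than as part of the original post: group-adjacent must evaluate to exactly one atomic value per item, and self::step[not(w)] yields an empty sequence as soon as a step contains a w. Wrapping the expression in boolean() gives every item a true/false key. The sketch below only reports the resulting groups; the groups and group elements and their key and size attributes are made-up names for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="/body">
        <groups>
            <!-- boolean() maps "a node or nothing" to exactly one xs:boolean per item,
                 so the grouping key is never an empty sequence. -->
            <xsl:for-each-group select="step" group-adjacent="boolean(self::step[not(w)])">
                <!-- Hypothetical report element: one per group, with its key and size. -->
                <group key="{current-grouping-key()}" size="{count(current-group())}"/>
            </xsl:for-each-group>
        </groups>
    </xsl:template>
</xsl:transform>
```

Run against the input above, this should report seven groups alternating between true and false, with steps 6 to 11 forming a single true group.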

My problem at this point was that I couldn't see how to use a grouping technique and limit the processing to the first group (the one directly following my context node) instead of processing all groups.

Second attempt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />
    
    <xsl:template match="/">
        <body>
            <xsl:apply-templates select="*"/>
        </body>
    </xsl:template>
    
    <xsl:template match="step[w and not(p)]">   <!-- Find a step with w elements only. -->
        <xsl:element name="container">
            <xsl:apply-templates/>  <!-- Get content from current node. -->
            
            <!-- This is where it gets dicey and I'm guessing a lot. -->
            <!-- Get all following siblings in adjacent groups, where the interesting group is
                 the first one containing step elements with no w elements,
                 i.e. group the elements that don't include a w element. -->
            <xsl:for-each-group select="following-sibling::*" group-adjacent="boolean(self::step[not(w)])">
                <!-- Check whether the group actually meets the criteria. Can the group include other nodes as well? -->
                <!-- Also check whether the first preceding step with a w element lacks any p elements.
                     If so, this has to be the first group. -->
                <xsl:if test="current-grouping-key() and preceding-sibling::step[w][1][not(p)]">
                    <xsl:sequence select="current-group()"/>
                </xsl:if>
            </xsl:for-each-group>
        </xsl:element>
    </xsl:template>    
    
    <xsl:template match="step[w and p] | step[p][not(preceding-sibling::step[w][1][not(p)])]">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="w">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="step[p and not(w)][preceding-sibling::step[w][1][not(p)]]"/>
</xsl:transform>

I know that I can get this to work by matching the step that contains only w elements and, from there, applying templates to the next sibling step in a special mode; that template copies the step and then pulls in the next sibling without w elements, and so forth. This works as intended, but I would like to learn other techniques for this:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" omit-xml-declaration="yes" encoding="UTF-8" indent="yes" />

    <xsl:template match="/">
        <xsl:element name="body">
          <xsl:apply-templates select="*"/>  
        </xsl:element>        
    </xsl:template>

    <xsl:template match="step[w and not(p)]">
        <xsl:element name="container">
           <xsl:apply-templates/>
            <xsl:apply-templates select="following-sibling::*[1][self::step[p and not(w)]]" mode="keep"/>
        </xsl:element>
    </xsl:template>    
    
    <xsl:template match="step[p]" mode="keep">
        <xsl:copy-of select="."/>
        <xsl:apply-templates select="following-sibling::*[1][self::step[p and not(w)]]" mode="keep"/>
    </xsl:template>
    
    <xsl:template match="step[p]">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="w">
        <xsl:copy-of select="."/>
    </xsl:template>
    
    <xsl:template match="step[p and not(w)][preceding-sibling::step[w][1][not(p)]]"/>
</xsl:transform>

My second attempt seems to give the desired result, but it came from trial and error and some loose interpretation of the results...

Feel free to comment on my approach and questions.

Questioner: Zug_Bug

Martin Honnen 2020-11-30 05:04:26

When using for-each-group, I tend to put it in a template for the parent (e.g. the body) and use the items (e.g. the steps) as the population. I am not sure I have fully understood the requirements from the single sample, but assuming the second requirement can be reformulated as finding the first following item that has a w, a nested grouping might work:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">
    
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="body">
      <xsl:copy>
          <xsl:for-each-group select="step" group-starting-with="step[w and not(p)]">
              <xsl:choose>
                  <xsl:when test="w and not(p)">
                      <xsl:variable name="wrapper" select="."/>
                      <xsl:for-each-group select="tail(current-group())" group-ending-with="step[w]">
                          <xsl:choose>
                              <xsl:when test="position() = 1">
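                                <container>
                                    <!-- $wrapper/* emits the w elements without their step wrapper;
                                         [not(w)] keeps the terminating step, if any, out of the container. -->
                                    <xsl:apply-templates select="$wrapper/*, current-group()[not(w)]"/>
                                </container>
                                <xsl:apply-templates select="current-group()[w]"/>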
                              </xsl:when>
                              <xsl:otherwise>
                                  <xsl:apply-templates select="current-group()"/>
                              </xsl:otherwise>
                          </xsl:choose>
                      </xsl:for-each-group>
                  </xsl:when>
                  <xsl:otherwise>
                      <xsl:apply-templates select="current-group()"/>
                  </xsl:otherwise>
              </xsl:choose>
          </xsl:for-each-group>
      </xsl:copy>
  </xsl:template>
  
</xsl:stylesheet>

The outer xsl:for-each-group select="step" group-starting-with="step[w and not(p)]" identifies the steps that should become container elements. As always with group-starting-with, the first group may begin with an item that does not match the pattern (here, the steps before the first w-only step). So, to wrap only the wanted groups, the condition is rechecked inside with test="w and not(p)".
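
To see the leading group that does not start with the pattern, a small diagnostic stylesheet can list the groups instead of transforming them. This is only a sketch; the groups and group elements and their attributes are hypothetical names, not part of the answer's stylesheet:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output indent="yes"/>

  <xsl:template match="body">
    <groups>
      <xsl:for-each-group select="step" group-starting-with="step[w and not(p)]">
        <!-- The context item is the first step of the current group, so this
             attribute shows whether the group really starts with the pattern. -->
        <group starts-with-pattern="{exists(w) and empty(p)}"
               size="{count(current-group())}"/>
      </xsl:for-each-group>
    </groups>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input this should list two groups: the first five steps (starts-with-pattern false), then the w-only step together with the nine steps after it (true).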

Inside that branch, a second grouping identifies the "end" of the items to be wrapped: xsl:for-each-group select="tail(current-group())" group-ending-with="step[w]". It splits the remaining steps into groups that each end with a step containing a w, so the first group holds the adjacent steps without a w plus the step with a w that terminates the run. Only that first group should be wrapped, which is why xsl:when test="position() = 1" is used; the terminating step is processed outside the container.
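
The same position() = 1 idea can probably be applied to the template-based approach from the question, where the population is following-sibling::* of the w-only step. A minimal sketch, assuming XSLT 2.0, assuming that only step elements occur as siblings (as in the sample), and reusing the question's suppression pattern for the steps that were pulled into the container:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="body">
        <xsl:copy>
            <xsl:apply-templates select="*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Default: copy a step unchanged. -->
    <xsl:template match="step">
        <xsl:copy-of select="."/>
    </xsl:template>

    <!-- A step holding only w elements opens a container. -->
    <xsl:template match="step[w and not(p)]">
        <container>
            <xsl:copy-of select="w"/>
            <xsl:for-each-group select="following-sibling::*"
                                group-adjacent="boolean(self::step[not(w)])">
                <!-- position() counts groups here, so only the group that directly
                     follows the w-only step is considered, and only if its key is true. -->
                <xsl:if test="position() = 1 and current-grouping-key()">
                    <xsl:copy-of select="current-group()"/>
                </xsl:if>
            </xsl:for-each-group>
        </container>
    </xsl:template>

    <!-- Pattern from the question: drop steps already copied into a container. -->
    <xsl:template match="step[p and not(w)][preceding-sibling::step[w][1][not(p)]]"/>
</xsl:transform>
```

Checking position() replaces the preceding-sibling test inside the loop: the first group is by construction the one adjacent to the context step, so no look-back is needed to identify it.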

All xsl:otherwise branches just push whatever has been collected through the identity transformation declared by xsl:mode on-no-match="shallow-copy".
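
The stylesheet above is declared as XSLT 3.0. On a processor limited to XSLT 2.0, which is the version the question's stylesheets declare, xsl:mode is not available, and the usual substitute is an explicit identity template. A sketch of that substitute, shown as a stand-alone stylesheet:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:strip-space elements="*"/>
  <xsl:output indent="yes"/>

  <!-- Stands in for <xsl:mode on-no-match="shallow-copy"/>:
       copy each node and keep processing its attributes and children. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

tail() is likewise an XPath 3.0 function; under 2.0, current-group()[position() gt 1] or subsequence(current-group(), 2) selects the same items.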