Note: This article is reproduced from serverfault.com.

Coldfusion

Posted on 2020-12-17 02:14:05

I am trying to extract the data between two strings, which are an opening and a closing tag. Normally I would just parse it using XmlParse, but the problem is that the dataset contains a lot of other junk.

Here is an example of the large string:

test of data need to parse:<?xml version="1.0" encoding="UTF-8"?><alert xmlns="urn:oasis:names:tc::cap:1.2"><identifier>_2020-12-16T17:32:5620201116173256</identifier><sender>683</sender><sent>2020-12-16T17:32:56-05:00</sent><status>Test</status><msgType>Alert</msgType><source>test of data need to parse</source><scope>Public</scope><addresses/><code>Test1.0</code><note>WENS IPAWS</note><info><language>en-US</language></info>


<capsig:Signature xmlns:capsig="http://www.w3.org/2000/09/xmldsig">

<capsig:Info>
<capsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n"/>
<capsig:SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-morersa-sha256"/>
<capsig:Referrer URI="">
<capsig:Trans>
<capsig:Trans Algorithm="http://www.w3.org/2000/09/xmldsigenveloped-signature"/>
</capsig:Trans>
<capsig:DMethod Algorithm="http://www.w3.org/2001/04/xmlencsha256"/>
<capsig:DigestValue>wjL4tqltJY7m/4=</capsig:DigestValue>
</capsig:Referrer>
</capsig:Info>


test of data need to parse:<?xml version="1.0" encoding="UTF-8"?><alert xmlns="urn:oasis:names:tc::cap:1.2"><identifier>_2020-12-16T17:32:5620201116173256</identifier><sender>683</sender><sent>2020-12-16T17:32:56-05:00</sent><status>Test</status><msgType>Alert</msgType><source>test of data need to parse</source><scope>Public</scope><addresses/><code>Test1.0</code><note>WENS IPAWS</note><info><language>en-US</language></info>

So what I need to do is just extract the following:

 <capsig:Info>
 <capsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n"/>
 <capsig:SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-morersa-sha256"/>
 <capsig:Referrer URI="">
 <capsig:Trans>
 <capsig:Trans Algorithm="http://www.w3.org/2000/09/xmldsigenveloped-signature"/>
 </capsig:Trans>
 <capsig:DMethod Algorithm="http://www.w3.org/2001/04/xmlencsha256"/>
 <capsig:DigestValue>wjL4tqltJY7m/4=</capsig:DigestValue>
 </capsig:Referrer>
 </capsig:Info>

I have searched everywhere and found approaches based on character positions and counts, but none of them really worked. I tried doing it in SQL, but the string changes constantly, which causes issues. So my plan is to get everything after "<capsig:Info>" and before "</capsig:Info>", then insert it into a table.

Is there a way to do this with ColdFusion?

Any suggestions would be appreciated.

Thanks!

Questioner
Scott
user12031119 2020-12-17 22:29:05

Yes. You can use a regular expression to extract the substring between the <capsig:Info> ... </capsig:Info> tags, with the tags themselves included. The ColdFusion function reMatch() returns an array of all substrings that match the specified pattern, as in the code below.

<!--- Use reMatch to extract all pattern matches into an array --->
<cfset parsedXml = reMatch("<capsig:Info>(.*?)</capsig:Info>", xmlToParse)>

<!--- parsedXml is an array of strings. The first match, if there is one, is in the first array element --->
<cfif arrayLen(parsedXml)>
    <cfdump var="#parsedXml[1]#" label="parsedXml">
</cfif>

You can see this using the demo here.

https://trycf.com/gist/00be732d93ef49b2427768e18e371527/lucee5?theme=monokai
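
If the input sometimes contains no <capsig:Info> block at all, reading parsedXml[1] directly would throw an error. Below is a minimal sketch of one way to guard against that, assuming the same pattern as in the answer above. The function name extractInfoBlock and the sample string are hypothetical and used only for illustration.

```cfml
<cfscript>
// Hypothetical helper (not part of the original answer): returns the first
// <capsig:Info>...</capsig:Info> block found in rawText, tags included,
// or an empty string when no such block is present.
function extractInfoBlock(required string rawText) {
    // reMatch() returns an array of every substring that matches the pattern
    var matches = reMatch("<capsig:Info>.*?</capsig:Info>", arguments.rawText);
    if (arrayLen(matches) gt 0) {
        return matches[1];
    }
    return "";
}

// Made-up sample input: junk text on both sides of the block we want
sample = "junk before<capsig:Info><capsig:DigestValue>abc=</capsig:DigestValue></capsig:Info>junk after";

infoBlock = extractInfoBlock(sample);

if (len(infoBlock)) {
    // Encode the markup so the browser displays it instead of interpreting it
    writeOutput(encodeForHTML(infoBlock));
} else {
    writeOutput("No capsig:Info block found.");
}
</cfscript>
```

Returning an empty string for the no-match case lets the caller decide whether to skip the database insert or log the problem, rather than failing on an out-of-range array index.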