I am trying to get specific data between two strings which are a opening and closing tag. Normally I would just parse it using XmlParse but the problem is it has a lot of other junk in the dataset.
Here is an example of the large string:
test of data need to parse:<?xml version="1.0" encoding="UTF-8"?><alert xmlns="urn:oasis:names:tc::cap:1.2"><identifier>_2020-12-16T17:32:5620201116173256</identifier><sender>683</sender><sent>2020-12-16T17:32:56-05:00</sent><status>Test</status><msgType>Alert</msgType><source>test of data need to parse</source><scope>Public</scope><addresses/><code>Test1.0</code><note>WENS IPAWS</note><info><language>en-US</language></info>
<capsig:Signature xmlns:capsig="http://www.w3.org/2000/09/xmldsig">
<capsig:Info>
<capsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n"/>
<capsig:SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-morersa-sha256"/>
<capsig:Referrer URI="">
<capsig:Trans>
<capsig:Trans Algorithm="http://www.w3.org/2000/09/xmldsigenveloped-signature"/>
</capsig:Trans>
<capsig:DMethod Algorithm="http://www.w3.org/2001/04/xmlencsha256"/>
<capsig:DigestValue>wjL4tqltJY7m/4=</capsig:DigestValue>
</capsig:Referrer>
</capsig:Info>
test of data need to parse:<?xml version="1.0" encoding="UTF-8"?><alert xmlns="urn:oasis:names:tc::cap:1.2"><identifier>_2020-12-16T17:32:5620201116173256</identifier><sender>683</sender><sent>2020-12-16T17:32:56-05:00</sent><status>Test</status><msgType>Alert</msgType><source>test of data need to parse</source><scope>Public</scope><addresses/><code>Test1.0</code><note>WENS IPAWS</note><info><language>en-US</language></info>
So what I need to do is just extract the following:
<capsig:Info>
<capsig:CanonicalizationMethod Algorithm="http://www.w3.org/2001/10/xml-exc-c14n"/>
<capsig:SignatureMethod Algorithm="http://www.w3.org/2001/04/xmldsig-morersa-sha256"/>
<capsig:Referrer URL="">
<capsig:Trans>
<capsig:Trans Algorithm="http://www.w3.org/2000/09/xmldsigenveloped-signature"/>
</capsig:Trans>
<capsig:DMethod Algorithm="http://www.w3.org/2001/04/xmlencsha256"/>
<capsig:DigestValue>wjL4tqltJY7m/4=</capsig:DigestValue>
</capsig:Referrer>
</capsig:Info>
I have searched everywhere and I have found where things can be done with characters and counts but none of them really worked. Tried doing it with SQL but because the constant change in the string it causes issues. So my plan was get everything after "capsig:Info" and before "</capsig:Info>" then insert it into a table.
Is there a way to do this with Coldfusion?
Any suggestions would be appreciated.
Thanks!
Yes, you can use a regular expression match to extract the substring containing the text between the <capsig:Info> ... </capsig:Info>
tags by using the ColdFusion function reMatch()
which will return an array of all substrings that match the specified pattern. This can be done using the line of code below.
<!--- Use reMatch to extract all pattern matches into an array --->
<cfset parsedXml = reMatch("<capsig:Info>(.*?)</capsig:Info>", xmlToParse)>
<!--- parsedXml is an array of strings. The result will be found in the first array element as such --->
<cfdump var="#parsedXml[1]#" label="parsedXml">
You can see this using the demo here.
https://trycf.com/gist/00be732d93ef49b2427768e18e371527/lucee5?theme=monokai
Thanks so much. Seems to be working. But when I try to output the #parsedXml# and I get the following: Complex object types cannot be converted to simple values. The expression has requested a variable or an intermediate expression result as a simple value. However, the result cannot be converted to a simple value. Simple values are strings, numbers, boolean values, and date/time values. Queries, arrays, and COM objects are examples of complex values. <p> The most likely cause of the error is that you tried to use a complex value as a simple one. Any idea what would cause that? Thanks so much!
Scatch that... I should have looked at your example you provided. I was trying to cfset it and then output it.
used your suggestion and it worked perfectly. You're the KING! Thank you very much... truly appreciated!
@Scott Yes, I'll edit my answer to reflect that parsedXml is a complex object. That is, it's an array of strings with the result appearing in the first array element. I'll do this to prevent any future confusion in case the link to the gist ever gets broken. In the meantime, please upvote and mark as complete. Thanks.