Spotfire Ideas Portal
Status Implemented
Categories data sources
Created by Guest
Created on Nov 22, 2018

Configuration option (or patch) to control/remove double parsing of XML in Any-to-Any transforms

Any-Any transforms appear to double-parse the XML data, which causes a problem for data that contains special characters. This does not occur with XML to Tabular Mapping transformations.

Steps to reproduce:

1. Create this XML file:

--
<?xml version="1.0" encoding="utf-8" ?>
<sampleDocs>
<doc>
<id>ID_001</id>
<notes1>This is Notes1</notes1>
<notes2>This is Notes2</notes2>
</doc>
<doc>
<id>ID_002</id>
<notes1>cghchjcrffgxvcjhtgx#$#%%$^&amp;#^%</notes1>
<notes2>cghchjcrffgxvcjhtgx#$#%%$^&amp;#^%</notes2>
</doc>
</sampleDocs>
--
2. Point a File-XML data source at it
3. Create an XML to Tabular Mapping transformation
4. Observe it works as expected
5. Create an Any-Any transformation
6. Observe that, when executed, it fails with an error similar to "String index out of range: 30"

 

I investigated this and determined that it appears to be caused by TDV double-parsing the XML input (i.e., the first pass turns &amp;# back into &#, which then fails the second pass because it is not a valid character reference).
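
To illustrate the suspected mechanism outside of TDV, here is a minimal sketch using Python's standard xml.etree.ElementTree module. It is not TDV's code; the helper names (first_pass, second_pass_ok) and the re-wrapping in a <notes1> element are assumptions made only to simulate a second parse of already-decoded text. The exact error TDV reports ("String index out of range") will differ from the parser error raised here.

```python
import xml.etree.ElementTree as ET

# Same value as <notes1> in the ID_002 record of the sample file.
SAMPLE = "<notes1>cghchjcrffgxvcjhtgx#$#%%$^&amp;#^%</notes1>"


def first_pass(fragment):
    """Parse once: the parser decodes &amp; to a literal ampersand."""
    return ET.fromstring(fragment).text


def second_pass_ok(decoded_text):
    """Simulate a second parse by treating the decoded text as raw XML again.

    Hypothetical helper: returns False if the parser rejects the text.
    """
    try:
        ET.fromstring("<notes1>" + decoded_text + "</notes1>")
        return True
    except ET.ParseError:
        return False


decoded = first_pass(SAMPLE)
# After one pass the text contains "&#^%", which looks like the start of a
# character reference but is not a valid one, so a second parse rejects it.
print(decoded)
print(second_pass_ok(decoded))
```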

The following workarounds resolve the issue:

1. Wrap the input in a CDATA block: <notes1><![CDATA[cghchjcrffgxvcjhtgx#$#%%$^&amp;#^%]]></notes1> 

2. Double escape the ampersands: <notes2>cghchjcrffgxvcjhtgx#$#%%$^&amp;amp;#^%</notes2>
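
Both workarounds can be checked against a simulated double parse. The sketch below again uses Python's xml.etree.ElementTree as a stand-in, not TDV itself; parse_twice and the wrapper element <v> are hypothetical names introduced for the illustration. Under that assumption, both forms survive two passes and yield the intended text.

```python
import xml.etree.ElementTree as ET

# The value the author ultimately wants after all parsing is done.
EXPECTED = "cghchjcrffgxvcjhtgx#$#%%$^&#^%"

# Workaround 1: CDATA keeps "&amp;" literal during the first pass.
CDATA_FORM = "<notes1><![CDATA[cghchjcrffgxvcjhtgx#$#%%$^&amp;#^%]]></notes1>"

# Workaround 2: "&amp;amp;" decodes to "&amp;" on the first pass.
DOUBLE_ESCAPED_FORM = "<notes2>cghchjcrffgxvcjhtgx#$#%%$^&amp;amp;#^%</notes2>"


def parse_twice(fragment):
    """Hypothetical helper: parse, then re-parse the decoded text as XML."""
    once = ET.fromstring(fragment).text
    return ET.fromstring("<v>" + once + "</v>").text


print(parse_twice(CDATA_FORM) == EXPECTED)
print(parse_twice(DOUBLE_ESCAPED_FORM) == EXPECTED)
```

Note that either workaround changes what a single-pass consumer (such as the XML to Tabular Mapping transformation) would read from the same file, which is why a server-side fix or setting is preferable.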

 

Requested Solution:

Remove the double parsing behavior entirely (make Any-to-Any transforms behave identically to XML to Tabular Mapping by default), or make it a configurable setting of the TDV server. 

  • Guest
    Jan 29, 2019

    Hi Jason:

    Great - thanks for confirming.

    Regards,

    -Will

  • Guest
    Jan 29, 2019

    Hi Will, 

    This has been fixed in 7.0.8 HF2, which is available by contacting the support team.

    Regards, 
    Jason