Spotfire Ideas Portal
Status To be Reviewed
Created by Guest
Created on Oct 2, 2019

Allow Python Node to output Text spreadsheet columns

Sometimes you just want your data to be formatted as Text and not as Double. But the Python node for Statistica ALWAYS returns data frames as Doubles with Text Labels. This can cause problems when a field contains both text and numbers, because a text label may accidentally replace certain numeric values with the label string.

  • Guest
    Jan 2, 2020

    This should really be considered a bug. Because of the way the data frame is written back to a Statistica spreadsheet, any number-like string is converted to a number (e.g. the string "01875" becomes 1875), since the functions in PyWrapper.py do not properly handle text fields.
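
The effect described above can be reproduced in plain pandas. This is only a sketch of the symptom, not the actual code path in PyWrapper.py: the sample data is hypothetical, and `pd.to_numeric` stands in for whatever conversion happens when the data frame is written back to the spreadsheet.

```python
import pandas as pd

# Hypothetical sample data: ZIP-like codes that must stay text.
# dtype=object is set explicitly so the column is an "object" column
# regardless of the pandas version's default string handling.
df = pd.DataFrame({"zip": pd.Series(["01875", "10001", "N/A"], dtype=object)})

# In the data frame itself the leading zero is intact.
original_first = df["zip"].iloc[0]

# Stand-in for the write-back: coerce number-like strings to numbers.
# Values that cannot be parsed become NaN.
coerced = pd.to_numeric(df["zip"], errors="coerce")
coerced_first = coerced.iloc[0]

print(repr(original_first))  # the text value, leading zero kept
print(coerced_first)         # the numeric value, leading zero lost
```

Once the value is numeric, there is no way to tell that the original string had a leading zero, which is why the column has to be marked as text before the values are written.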

     

    In pandas, columns of the "object" data type are generally understood to hold text. A simple fix that lets PyWrapper.py output a text variable properly would be to insert the following Python code after line 334 in the pandas_data_frame_as_spreadsheet function:

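    col = df.iloc[:, i]  # select by position, so column labels need not be strings
    if col.dtype == 'object':
        # length of the longest value; astype(str) guards against non-string objects
        maxlength = int(col.astype(str).str.len().max())
        ssObj.VariableSetTextType(i+1, maxlength)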

    This converts the output spreadsheet variable to a text column whenever the data frame column has the object data type. Implementing something similar in a future version would be very helpful.
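
The column check proposed above can be tried outside Statistica. The sketch below is an assumption-laden illustration: `text_column_widths` is a hypothetical helper, the sample data is made up, and the call to `ssObj.VariableSetTextType` from the snippet is left out because it needs a live Statistica spreadsheet object. It only computes which 1-based variable positions would be switched to text, and with what length.

```python
import pandas as pd


def text_column_widths(df):
    """Map 1-based column position -> longest value length, for object columns only."""
    widths = {}
    for i in range(df.shape[1]):
        col = df.iloc[:, i]
        if col.dtype == object:
            # Ignore missing values, then measure each value as text.
            lengths = col.dropna().astype(str).str.len()
            widths[i + 1] = int(lengths.max()) if len(lengths) else 0
    return widths


# Hypothetical data frame: two text columns and one numeric column.
df = pd.DataFrame({
    "zip": pd.Series(["01875", "10001", None], dtype=object),
    "amount": [1.5, 2.0, 3.25],
    "label": pd.Series(["a", "bcd", "ef"], dtype=object),
})

widths = text_column_widths(df)
print(widths)  # {1: 5, 3: 3}
```

Inside PyWrapper.py, each entry of the resulting mapping would correspond to one `ssObj.VariableSetTextType(position, length)` call, as in the snippet above.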