The box plot cannot be currently used for an in-database data connection. The reason as given on the Spotfire documentation is as follows: "The box plot requires both aggregated and non-aggregated data and has not yet been adjusted to handle in-db data."
For our non-technical users we have set up Spotfire as a data exploration tool with an in-db SAP HANA data connection. Our users deal with large number of columns as Patient Age, Weight, Lab Values etc. and a box plot would give out-of-the-box a nice summary of data in these columns.
We are much in need of this feature as currently getting these details is only possible by a cross table which requires not only a lot of manual steps but is also very clumsy to arrive at such a particular analysis.
For Oracle, Microsoft SQL, and SAP HANA (these are the ones I checked) the window function percentile_cont and median exist.
For Spark this exist as well with very little difference in syntax
Since there is a nice, structured way to calculate values for Boxplot via SQL, a boxplot for in-database connections should be possible.
This SQL approach enables box plot for big data, pushing the calculation to the external database and no need for import of data in-memory.
In most of modern databases the SQL function PERCENTILE_CONT is supported and with this the calculation of the median and quartiles should be possible without loading all the data into in-memory. Then only the rows for the outliers needs to be loaded.
I generalized the title of this Idea so that it covers all external data sources so that users also with other databases will be interested in voting for the Idea.