Reading data correctly from a data source is the basis of correct and meaningful visualization. This is true for all visualization tools, and Tableau is no exception.
Issue
A peculiar issue was encountered with Tableau's data reading when it suddenly started dropping rows that it was reading from a CSV file. The environment setup was as follows:
Caret (^) delimited CSV files were generated daily from an Oracle database using PL/SQL code.
The files were fairly large (700 MB).
Each CSV file had more than 100 columns, of which almost 20 were measure columns.
These CSV files were then appended daily to a Tableau Data Extract (TDE).
The issue was caught during the post-production phase, when the sums of the Nominal Flow and Interest Flow columns in the Tableau data extract did not match those in the database.
Careful study of the data revealed a curious pattern: only the rows containing decimal numbers were missing, while the integers showed up in the data extract. The issue persisted even when the data type of the measure column in Tableau was changed to Number (Decimal). The two measure columns are highlighted below:
Further research showed that this was actually an issue with Microsoft's Jet database engine, which Tableau uses to connect to Microsoft-provided data sources such as xlsx, Access, csv, and txt files:
http://kb.tableau.com/articles/Issue/resolving-incorrect-data-issues-related-to-jet
The limitations of Microsoft's Jet database engine are briefly listed in the article below:
http://kb.tableau.com/articles/Issue/limitations-to-data-and-file-sizes-with-jet-based-data-sources
Also, a reply from Tableau support confirmed the same:
“When we connect to a CSV file as a data source with the default connection (open the file directly), the data type of a column is determined by the first 1024 rows. We can see that the column “Nominal Flow” has a decimal number 51.63 at row 331, and the column “Interest Flow” has no decimal numbers within the first 1024 rows. So Tableau Desktop determines the data types for “Nominal Flow” and “Interest Flow” as “Number (Decimal)” and “Number (Whole)” respectively. However, “Interest Flow” actually has both decimal numbers and whole numbers (integers). Those integers are properly displayed, while those decimal numbers are displayed as NULL.”
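To illustrate the behavior the support reply describes, here is a minimal Python sketch (an illustrative stand-in, not Tableau or Jet internals) of inferring a column's type from a fixed-size sample of leading rows and then coercing every value to that type:

```python
# Sketch (not Tableau/Jet code): infer a column's type from the first
# sample_size rows, then parse every value as that type; values that
# cannot be parsed become None, mirroring the NULLs seen in the extract.

def infer_type(values, sample_size=1024):
    """Return float if any sampled value has a fractional part, else int."""
    for v in values[:sample_size]:
        if "." in v:
            return float
    return int

def read_column(values, sample_size=1024):
    col_type = infer_type(values, sample_size)
    out = []
    for v in values:
        try:
            out.append(col_type(v))
        except ValueError:
            out.append(None)  # decimal text cannot be parsed as int -> NULL
    return out

# "Interest Flow": only whole numbers in the sample, a decimal further down.
print(read_column(["100", "250", "75", "51.63"], sample_size=3))
```

Because no decimal appears within the sampled rows, the column is typed as whole numbers, and the later decimal value comes through as None.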
Solution:
The solution was to force Tableau to read the data as a decimal number. One way to do this was to insert a dummy record at the beginning of the dataset while exporting the data, but this would not be a clean solution, especially when trying to join/blend two data sources.
Another solution was to export the measure columns as floats from the database, which required a tweak in the PL/SQL code, along these lines:
CASE WHEN MOD("Nominal Flow", 1) = 0 THEN TRIM(TO_CHAR("Nominal Flow")) || '.00' ELSE TRIM(TO_CHAR("Nominal Flow")) END AS "Nominal Flow",
CASE WHEN MOD("Interest Flow", 1) = 0 THEN TRIM(TO_CHAR("Interest Flow")) || '.00' ELSE TRIM(TO_CHAR("Interest Flow")) END AS "Interest Flow"
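The intent of the CASE expressions above can be sketched in Python (an illustrative stand-in for the PL/SQL, not part of the actual export code): whole numbers get an explicit ".00" suffix, so every exported value looks like a decimal and the type sniffer settles on Number (Decimal).

```python
# Sketch of the padding logic: format a measure so whole numbers carry
# an explicit ".00" suffix, while genuine decimals pass through unchanged.

def export_measure(value):
    if value % 1 == 0:          # whole number -> pad with ".00"
        return f"{int(value)}.00"
    return str(value)           # already decimal -> keep as-is

print(export_measure(100))    # -> 100.00
print(export_measure(51.63))  # -> 51.63
```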
Hence the final output looked like:
When Tableau was connected to this data output and it was converted into a TDE, everything worked like a charm: Tableau identified the columns as float data types, and no rows were dropped.
Get started with a Tableau Desktop Trial today
Some more exciting products that we partner with:
Download a Lavastorm Dataverse Trial or visit Wherescape