Tableau is best known as a tool for quickly visualizing data, but it can also be used to create and verify linear regression models for predictive analytics. Because Tableau integrates with external statistical languages such as R and Python, regression models built in those languages can be used directly in Tableau.
Integration of R and Tableau
- Download and install the software:
To integrate R with Tableau, we need R and RStudio:
R download link: https://cran.r-project.org/bin/windows/base/
RStudio download link: https://www.rstudio.com/products/rstudio/download/
We also need Tableau Desktop: https://www.tableau.com/products/desktop
- Open RStudio and type the following commands at the R console:
install.packages("Rserve")
library(Rserve)
Rserve()
- Open Tableau Desktop and go to Help -> Settings and Performance -> Manage External Service Connection
- Enter localhost as the server and 6311 as the port
- Click Test Connection, then OK.
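If the connection test fails, one thing worth ruling out is that the Rserve package was never installed. The sketch below checks for it using only base R. The helper name `is_installed` is hypothetical (it is not part of R or Rserve); it simply wraps `requireNamespace`.

```r
# Hypothetical helper: TRUE if the named package can be loaded, FALSE otherwise.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (is_installed("Rserve")) {
  message("Rserve is installed; start it with Rserve::Rserve()")
} else {
  message("Rserve is missing; run install.packages(\"Rserve\") first")
}
```

This only confirms that the package is present. It does not confirm that the server is running or that Tableau can reach it.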
Development of Linear Regression Model:
After integrating Rserve with Tableau, we are ready to embed the R code that builds the linear regression model in Tableau calculated fields.
The sample data used here is an open dataset available for download from Duke University’s website: http://www2.stat.duke.edu/~mc301/data/movies.html
The data contains a sample of 651 movies along with their reviews, critics scores, etc. (The data dictionary is also available at the above link.)
Let us develop a regression model to predict the audience score from independent (predictor) variables such as IMDB Rating or Critics Score.
We will first examine the relationships between these variables using scatter plots in Tableau:
The figure above shows two plots:
IMDB Votes vs. Audience Score
Critics Score vs. Audience Score
Critics Score appears to have a stronger linear relationship with Audience Score than IMDB Votes does, which suggests that Critics Score is the better predictor of Audience Score.
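The visual comparison can also be made numerically with a correlation coefficient. The sketch below is only an illustration: it uses synthetic data generated to mimic the pattern described above, not the real movies data, and the variable names `critics_score`, `imdb_num_votes`, and `audience_score` are assumptions.

```r
# Synthetic stand-in for the movies data (NOT the real values).
set.seed(42)
n <- 200
critics_score  <- runif(n, min = 0, max = 100)
imdb_num_votes <- round(rexp(n, rate = 1 / 50000))
# Audience score is built to depend on critics score plus noise, and not on votes.
audience_score <- 30 + 0.5 * critics_score + rnorm(n, sd = 10)

# Pearson correlation measures the strength of each linear relationship.
cor_critics <- cor(critics_score, audience_score)
cor_votes   <- cor(imdb_num_votes, audience_score)

cat("Critics Score vs Audience Score:", round(cor_critics, 2), "\n")
cat("IMDB Votes vs Audience Score:   ", round(cor_votes, 2), "\n")
```

A correlation closer to 1 or -1 indicates a stronger linear relationship, so the variable with the larger absolute value is the more promising single predictor.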
Let us write a calculated field called “Predicted Audience Score”.
Tableau’s SCRIPT_REAL function can be used to embed R or Python code in a Tableau calculation.
Here we have used Critics Score to predict Audience Score.
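The calculated field itself is not reproduced in the text, so the following is a sketch of what such a field might contain, not the original. Tableau passes the expressions given to SCRIPT_REAL to R as `.arg1`, `.arg2`, and so on. The Tableau field names in the comment and the toy values are assumptions for illustration.

```r
# A possible Tableau calculated field (assumed, shown here as a comment):
#   SCRIPT_REAL("fit <- lm(.arg1 ~ .arg2); fit$fitted.values",
#               SUM([Audience Score]), SUM([Critics Score]))

# Toy values standing in for what Tableau would send to R (not real data).
.arg1 <- c(40, 55, 60, 72, 85)  # audience score, the response
.arg2 <- c(20, 45, 50, 70, 90)  # critics score, the predictor

# Fit a simple linear regression and return one prediction per input row.
fit <- lm(.arg1 ~ .arg2)
predicted <- as.numeric(fit$fitted.values)

print(coef(fit))
print(predicted)
```

The R script should return either one value or as many values as it received, which is why the fitted values (one per row) are used as the result.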
Let us plot Predicted Audience Score vs. Audience Score.
As the plot shows, Predicted Audience Score against Audience Score comes out as a perfect straight line.
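To verify the model numerically as well as visually, the fit can be summarized with R-squared and the root mean squared error (RMSE). The sketch below again uses synthetic data rather than the real movies data, and all variable names are assumptions.

```r
# Synthetic stand-in data (NOT the real movie values).
set.seed(1)
critics_score  <- runif(100, min = 0, max = 100)
audience_score <- 30 + 0.5 * critics_score + rnorm(100, sd = 10)

fit <- lm(audience_score ~ critics_score)
predicted <- as.numeric(fitted(fit))

# R-squared: share of the variance in audience score explained by the model.
r_squared <- summary(fit)$r.squared
# RMSE: typical size of a prediction error, in audience-score points.
rmse <- sqrt(mean((audience_score - predicted)^2))

# In a simple linear regression the fitted values are an exact linear
# function of the predictor, so this correlation is 1 (positive slope).
cor_pred_vs_predictor <- cor(predicted, critics_score)

cat("R-squared:", round(r_squared, 3), " RMSE:", round(rmse, 2), "\n")
```

A higher R-squared and a lower RMSE indicate a closer fit between predicted and actual audience scores.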
Conclusion:
The above is a very basic example of building a simple linear regression model using Tableau and R.
A more advanced linear regression model, developed in R, is available at the GitHub URL below:
https://github.com/shashibhushan86/Linear_Regression/blob/master/reg_model_project.Rmd