Have you heard this one before? “Just connect to your data in Tableau and start visualizing. Then you’ll publish and share with your whole organization.” It’s a great line, because it’s true. You CAN get started with analysis on top of just about any data in Tableau. But “can” is not “should” — what is possible may not be the BEST way, particularly if you want to scale up. When dealing with massive amounts of data, a better solution is to have two data sources: (1) A pre-aggregated data set for overviews, which I’ll call the Overview data source (2) The row-level data set, which I’ll call the Granular data source. Tableau’s abilities to filter between two data sources (actions & cross-datasource filters in Tableau 10) make this an excellent strategy, and one that I have seen massively improve performance over and over.
We have a tendency to answer questions of multi-tenancy in Tableau swiftly with switching between Sites, which are the virtual tenements for your Tableau tenants. But Sites are the simple part of the equation; when there is a need for multitenancy in Tableau, there is most likely existing multi-tenancy in the data systems Tableau must connect to. I’m going to dive into the diverse ways that customers corral their data and outline all the tenets of deploying effectively from a single template to all your tenants.
Editor’s Note: The official (and improved) whitepaper version of this is available from Tableau here
The techniques outlined in this post are applicable to Live Connections and Multi-Table Extracts (available in Tableau 2018.3+). If you need to use Extracts and are on a version of Tableau prior to 2018.3, please see Keeping Your Extracts From Blowing Up .
Tableau has supported Stored Procedures in Microsoft SQL Server (and Sybase and Teradata) since version 8.1, and you can connect the SP parameters to Tableau parameters.
However, there are two features that don’t exist as of 9.2:
- Parameters cannot be set to match a function, such as USERNAME()
- Parameters cannot have multiple values (no array concept)
These are both feature requests that you can go vote up on the Community forum, so go there now and then come back and continue reading!
Until these features are implemented, the only way to set these values dynamically is using Tableau Server’s ability to set parameters programmatically.
As the recent post on Vertica brings to light, sometimes really highly performing systems need a little configuration to perform optimally with Tableau. There’s a particular set of systems that require some extra thought and care to use with Tableau, because if you set off without any planning and expect to combine Tableau’s ease of use with the speed of these systems and end up staring at the “query executing” screen for 10 minutes, you may start to doubt everyone’s claims.
The systems I’m talking about are the Massive Parallel Processing (MPP) databases. There’s already a great explanation of them here so I’m not going to go too deeply into how they work, other than what is relevant for Tableau. Which systems that Tableau supports are MPP (don’t get too angry if I get this a bit wrong) (in no particular order):
- Aster (although there is some Hadoop going on in the backend)