When you are trying to maximize performance in Tableau, particularly on a live connection, sometimes the smallest changes can make a big difference. All of your choices in Tableau Desktop eventually end up as a real live SQL query, which the database will have to interpret. The simpler the query, the easier the interpretation, and in most cases the quicker the results.
Tableau’s Dashboard Actions are amazing, and in the newer versions there is a quick little “Use as filter” button on each sheet in a Dashboard. This creates an Action in the Dashboard->Actions menu which is set to “All Fields” down at the bottom. This is incredibly convenient from a creation standpoint; however, it means that the selected values for every single dimension in the Source Sheet will be passed along as filters in the WHERE clause of the eventual SQL query. This includes categorical information which you are displaying: if you are showing Product Category, Product Sub-Category, and Product ID; all three will be sent in the eventual query.
Particularly when you are getting down to granular details, you really only need the most granular piece of information to be passed into the WHERE clause. For optimal performance, you really only want to pass in values for fields that are indexed in the database. In the previous example, presuming that a Product ID can only belong to one Category and Sub-Category, setting the Action to “Selected Fields” and choosing “Product ID” would simplify the query sent; hopefully Product ID is indexed and thus you get an incredibly quick lookup.
I’ve put together a new page in the top menu on Tableau Performance which links out to a lot of brilliant posts from all over (and a few of mine) on all of the topics involved in making things fast in Tableau. I’ll continue to update it over time with more and more resources.
Editor’s Note: The official (and improved) whitepaper version of this is available from Tableau here
The techniques outlined in this post are applicable to Live Connections and Multi-Table Extracts (available in Tableau 2018.3+). If you need to use Extracts and are on a version of Tableau prior to 2018.3, please see Keeping Your Extracts From Blowing Up .
Learning to design so that you limit loading unnecessary rows of granular data is the most important technique you can learn to make Tableau perform well. It reduces the strain on the database in finding and returning data, and it limits the amount that Tableau needs to return. Ask yourself: “Am I getting tired of scrolling through this list trying to find what I need?” If the answer is yes, consider what other views of the data might help you filter down just to the rows you need to see.
Setting Actions to Exclude when nothing is selected
The ideal workflow in Tableau, both from a performance and a visual analysis perspective, is useful visualizations that filter via Dashboard Actions that act on click / select. You can limit what is shown by setting your Actions to Exclude when nothing is selected. After the first time you do a selection and clear it, the affected sheets won’t show anything until something has been selected.
As the recent post on Vertica brings to light, sometimes really highly performing systems need a little configuration to perform optimally with Tableau. There’s a particular set of systems that require some extra thought and care to use with Tableau, because if you set off without any planning and expect to combine Tableau’s ease of use with the speed of these systems and end up staring at the “query executing” screen for 10 minutes, you may start to doubt everyone’s claims.
The systems I’m talking about are the Massive Parallel Processing (MPP) databases. There’s already a great explanation of them here so I’m not going to go too deeply into how they work, other than what is relevant for Tableau. Which systems that Tableau supports are MPP (don’t get too angry if I get this a bit wrong) (in no particular order):
- Aster (although there is some Hadoop going on in the backend)
Really exciting announcement of TabMon for monitoring Tableau Server performance!
Check it out at http://www.tableau.com/about/blog/2015/10/introducing-tabmon-44911
There are some good reasons to look at the actual queries Tableau sends to a live relational database:
- Performance is not good and your DBA needs to know what is happening so they can optimize. This goes for live connections and slow extract generation.
- Your viz still isn’t performing well enough even with extracts, but you don’t understand the TQL language you see in the Performance Recorder (it’s okay, no one does!). Seeing the same logic in SQL can help you understand what exactly is going on
- You just want to marvel at the amazing ability of the VizQL engine to translate your actions into SQL. You should check out what LOD calculations, Sets, or calculated fields look like sometime just to marvel at what is going on
Read more for how to accomplish #1