You’ve come to this page because you want to know how to make Tableau perform as efficiently as possible. Performance always starts in Desktop, where you connect to data and build the worksheets and dashboards that you will then publish to Tableau Server. Tableau Server is running the same VizQL engine as Desktop, so if it’s slow in Desktop, it will be slow on Server.
The most complete guide to everything Tableau Performance is Alan Eldridge’s magnificent Best Practices for Designing Efficient Tableau Workbooks: Version 10 Edition. It’s unlikely that anything said elsewhere is missing from this guide, but it is a hefty PDF tome. If you want to become an expert, it is essential reading.
Database Connections / Join Culling
The most efficient way to set up your data for Tableau is a standard star schema with all INNER JOINs, because it allows you to use Assume Referential Integrity.
This is particularly relevant for MPP databases, which also need their own proper configuration to run efficiently.
If your database has all of its Primary and Foreign Key relationships defined, Tableau can also cull unnecessary tables from its queries (I believe INNER JOINs are still preferable).
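To see why culling is safe when referential integrity holds, here is a minimal sketch using an in-memory SQLite database as a stand-in for your real database. The table and column names are hypothetical, and the SQL is hand-written for illustration, not what Tableau actually generates. When every fact row has exactly one matching dimension row, a query that uses no dimension columns returns the same answer with or without the join.

```python
import sqlite3

# Toy star schema: one fact table and one dimension (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        category   TEXT NOT NULL
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES dim_product(product_id),
        amount     REAL NOT NULL
    );
    INSERT INTO dim_product VALUES (1, 'Chairs'), (2, 'Tables');
    INSERT INTO fact_sales VALUES (1, 1, 10.0), (2, 1, 15.0), (3, 2, 40.0);
""")

# The dimension is joined in even though none of its columns are used.
joined_total = conn.execute("""
    SELECT SUM(f.amount)
    FROM fact_sales AS f
    INNER JOIN dim_product AS d ON d.product_id = f.product_id
""").fetchone()[0]

# The "culled" form: the unused dimension table is dropped from the query.
culled_total = conn.execute(
    "SELECT SUM(amount) FROM fact_sales"
).fetchone()[0]

print(joined_total, culled_total)
```

Both totals match here only because every product_id in the fact table exists in the dimension. If that guarantee does not hold in your data, dropping the join changes the result, which is why the option should only be enabled when the relationships are truly enforced.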
Custom SQL is the slowest way to go, because Tableau includes the entire Custom SQL query in every subsequent query that it writes. If you can find a way to replace Custom SQL, even by moving it into a View in the database, it is worth doing.
If you want to see what queries are happening so you can optimize them at the database level, here is a guide.
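As a rough illustration of the difference, the sketch below compares a query that embeds the Custom SQL text as a subquery with the same query against a database View. It uses in-memory SQLite and hypothetical table names, and the wrapped query is a simplified approximation of the pattern, not Tableau's exact output.

```python
import sqlite3

# Hypothetical Custom SQL that someone pasted into a Tableau connection.
CUSTOM_SQL = "SELECT region, amount FROM orders WHERE status = 'shipped'"

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL, status TEXT);
    INSERT INTO orders VALUES
        (1, 'East', 10.0, 'shipped'),
        (2, 'East',  5.0, 'pending'),
        (3, 'West', 20.0, 'shipped'),
        (4, 'East',  7.0, 'shipped');
""")

# Pattern 1: the whole Custom SQL text is repeated inside every query.
wrapped_query = (
    f"SELECT region, SUM(amount) FROM ({CUSTOM_SQL}) AS custom_sql "
    "GROUP BY region ORDER BY region"
)
subquery_rows = conn.execute(wrapped_query).fetchall()

# Pattern 2: the same logic lives in the database as a View,
# so each query only has to reference the View by name.
conn.execute(f"CREATE VIEW shipped_orders AS {CUSTOM_SQL}")
view_rows = conn.execute(
    "SELECT region, SUM(amount) FROM shipped_orders "
    "GROUP BY region ORDER BY region"
).fetchall()

print(subquery_rows)
print(view_rows)
```

The results are identical. The point of the View is that the logic is defined once in the database, where it can be tuned, instead of being carried inside every query.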
Big Data Systems
If you are using a big data system, make sure to think about how to pre-aggregate overview data, rather than making that system repeatedly process roll-ups live. Both extracts and live connections can be used together for optimal performance.
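A minimal sketch of the pre-aggregation idea, again using in-memory SQLite as a stand-in and hypothetical table names: the roll-up is computed once into a summary table, and overview dashboards read from that small table instead of re-scanning the detail rows on every load.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Detail table: in a real system this is the very large one.
    CREATE TABLE events (event_day TEXT, user_id INTEGER, revenue REAL);
    INSERT INTO events VALUES
        ('2016-01-01', 1, 2.0),
        ('2016-01-01', 2, 3.0),
        ('2016-01-02', 1, 5.0);

    -- Roll-up computed once (for example on a schedule), not per dashboard load.
    CREATE TABLE daily_summary AS
        SELECT event_day,
               COUNT(*)     AS event_count,
               SUM(revenue) AS total_revenue
        FROM events
        GROUP BY event_day;
""")

# The overview dashboard would point at the small summary table.
summary = conn.execute(
    "SELECT event_day, event_count, total_revenue "
    "FROM daily_summary ORDER BY event_day"
).fetchall()
print(summary)
```

In Tableau terms, the summary table is a natural candidate for an extract, while the detail table stays on a live connection for drill-down.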
Slalom has a nice guide to how to optimize Redshift with Tableau.
You might need to put a TDC file in place to get the best performance configuration for your system.
Russell Christopher put together an amazing tool for creating the TDC file correctly.
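If you would rather see what goes into a TDC file before using a tool, the sketch below builds one with Python's standard library. A TDC is a small XML file; the overall element layout follows the format as I understand it, but the class, vendor, and driver names and the capability shown are assumptions for illustration only. Check Tableau's documentation for the values that apply to your data source and version.

```python
import xml.etree.ElementTree as ET


def build_tdc(db_class, customizations, version="10.0"):
    """Return TDC XML text for one connection class (illustrative helper)."""
    root = ET.Element(
        "connection-customization",
        {"class": db_class, "enabled": "true", "version": version},
    )
    # Assumption: vendor and driver names match the connection class.
    ET.SubElement(root, "vendor", {"name": db_class})
    ET.SubElement(root, "driver", {"name": db_class})
    block = ET.SubElement(root, "customizations")
    for name, value in customizations.items():
        ET.SubElement(block, "customization", {"name": name, "value": value})
    return ET.tostring(root, encoding="unicode")


# Assumed example values: verify the class name and capability for your system.
tdc_xml = build_tdc("redshift", {"CAP_CREATE_TEMP_TABLES": "no"})
print(tdc_xml)
```

The output would be saved with a .tdc extension in the location Tableau Desktop or Server reads customizations from. Treat any capability change as something to test, since a wrong setting can make queries slower rather than faster.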
Designing for Performance
You should really read Alan’s guide, but here are my quick tips:
- Watch and understand the Dashboard Best Practices video. Design by the “Guided Analytics” philosophy: Use visualizations at the high level to filter down to the lower levels of detail.
- Hide your lowest detail levels, particularly any long lists, using the Exclude All Values option in actions. This is easier on your eyes, and far easier on your database.
- If you are embedding Tableau, take out all the fancy stuff and branding and put it in your portal. The more stuff you have, the more Tableau has to draw.
- If you find yourself scrolling and scrolling, rethink that visualization. There must be some other way of looking at things to help identify what is important to drill into.
- Minimize your cross-tabs of text. Keep them for the end, once you have filtered.
- Use Actions for drilldown, but make sure to optimize them.
Tableau Server’s performance is determined by two aspects: (1) Did you build things to perform well in Desktop? (2) Is there adequate hardware for Tableau Server to run efficiently? For (1), see above. For (2), see the sections below.
Regarding configuring the processes on Tableau Server, the defaults are the defaults for a reason. In most cases, two of any process per worker node is all you need.
Hardware / Virtualization
General thoughts on Tableau in a virtualized environment
Disk I/O / IOPS
The one thing you’ll notice when reading about AWS or Azure is that disk read and write speed, or IOPS, is an important limiting factor in Tableau Server performance. There is no minimum disk I/O recommendation from Tableau, other than a minimum of 10,000 RPM magnetic drives. However, in many cases we have seen slow IOPS be the hidden culprit when otherwise fast-running workbooks end up dragging when published to Server.
If you are using extracts / TDEs, you need to be especially concerned with disk I/O. Extract creation requires writing to disk quite intensely, and extracts are memory-mapped and loaded into RAM as needed when used as a data source, so read speed matters as well. If you are going for the fastest possible configuration, SSDs definitely win. Otherwise, fast, local storage is the best solution. Things like SANs should be avoided unless you absolutely know they are fast and dedicated to the Tableau Server.
Yes, Tableau Server can run really fast in AWS. It can also run really slowly; you need the right hardware setup. Russell Christopher’s explorations are a guide to finding that optimal setup (and yes you should read them in this order):
- Which EC2 Instance Type Should I run Tableau Server On? Part One
- Which AWS EC2 Instance Type Should I run Tableau Server On? Part Two
- Studying Tableau Performance Characteristics on AWS EC2
- Tableau Server Performance on AWS EC2
- Comparing Tableau Server performance on EC2: The c3 vs. c4 bake-off
- AWS EC2 General Purpose (gp2) disks and Tableau Server: mostly awesome
Choosing the right Azure box may be even more challenging than in AWS. Once again, Russell has explored the options:
- Yes, Tableau can go fast on Windows Azure
- Can Tableau Server go faster on Windows Azure now?
- Tableau Server: I wanna go fast on Windows Azure
Look and Feel / Minimizing Load Times
In Tableau 9.3 and beyond, sheets in a dashboard will load as they come in, rather than waiting for the whole dashboard to render. This improves the user experience and lets users start working as soon as something has loaded.
Server Maintenance Automation
The following are “Unsupported” ways of reducing the time spent on maintenance of, and changes to, Tableau Server.
Load and Performance Testing
TabJolt is a tool built by Tableau (but not supported by Tableau Tech Support) for load testing a Tableau Server. Use it with your own workbooks to get an accurate picture of how many concurrent users will saturate your cluster. It is available on GitHub.
Russell has a great set of blog posts on how to use TabJolt, in the following approximate order:
- The Mondo Tableau Server TabJolt Post Series – Part 1
- The Mondo Tableau Server TabJolt Series – Part 2
- The Mondo Tableau Server TabJolt Series – Part 3
- Load Testing Tableau Server 9 with TabJolt on EC2
- Customizing Tableau Tabjolt Load Tests
- Tableau Server TabJolt Testing – The Light Load
- Tableau Server TabJolt Testing – The Heavy Load
TabMon is a tool built by Tableau (but not supported by Tableau Tech Support) to measure a Tableau Server cluster, capturing both JMX console information and Windows Performance Monitor metrics. It is available on GitHub.
Russell Christopher’s explorations into how to use TabMon