When you use Tableau in an embedded solution, the usual “Tableau method,” in which Desktop users publish free-form to a single site on Tableau Server, often doesn’t meet the control and QA requirements of a software development effort. While the classic “dev-test-qa-prod” cycle is not as necessary with Tableau, any SaaS customer embedding Tableau will need a process for deploying Tableau content to their customers’ sites from templates.
In this post, I’ll go over the deployment methodology my team recommends to customers. We’ll look at two separate phases, which are often split between different teams: (1) Development and (2) Deployment.
Many thanks to Tyler Dugal who helped develop much of this content for our presentation at TC16.
Tableau Content Development
Developing content in Tableau always starts in Tableau Desktop. There are more resources on how to build content in Tableau out on the Internet than I could possibly cover here, so I’m not going to even try. When deploying “production” level content, however, you should definitely optimize for performance in your designs.
What I’m concerned with here is the process of development. In most cases, you will be connecting Tableau Desktop to a “dev” database server to do development. To make the process as smooth as possible, the structure/schema of the “dev” databases should exactly match that of the final “prod” databases. When the table schemas match exactly, the data source in Tableau can be easily moved between the different database servers.
The basic paradigm is that you develop content in Desktop and publish it to a Tableau Server Site set up for dev content. This cycle continues until the content is in good enough shape to promote. You can determine your own method for signaling that content is ready; two ideas are discussed in the Source / Revision Control section below.
Source / Revision Control
Tableau Server Revision History
Tableau Server has built-in revision history, but you must enable it for a given site. Once enabled, it stores each published version of a workbook or data source, and you can download or restore any stored version. Revision history can also be accessed via the Tableau REST API, so this functionality can be integrated with other tools you have.
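As a rough sketch of what that integration might look like, the helpers below build the REST API URLs for listing a workbook’s revisions and downloading one of them. The server address, API version, and IDs are placeholders I made up, and I’m assuming the revision endpoints follow the pattern in the REST API reference, so check the reference for your server version. A real script would also sign in first and send the auth token with each request.

```python
def revisions_url(server, api_version, site_id, workbook_id):
    """Build the URL that lists the stored revisions of one workbook."""
    base = server.rstrip("/")
    return f"{base}/api/{api_version}/sites/{site_id}/workbooks/{workbook_id}/revisions"


def revision_content_url(server, api_version, site_id, workbook_id, revision_number):
    """Build the URL that downloads one specific revision of a workbook."""
    listing = revisions_url(server, api_version, site_id, workbook_id)
    return f"{listing}/{revision_number}/content"


# Placeholder values, for illustration only.
print(revision_content_url("https://tableau.example.com", "2.3", "SITE_ID", "WORKBOOK_ID", 4))
```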
External Source Control (Git, SVN, SharePoint, etc.)
The Tableau content files (TDS and TWB) are XML files and thus can be easily stored in any sort of revision control system. Extracts (TDE) and packaged files (TDSX or TWBX) are binary, so source control is less useful for them. Diffing workbook and data source files is not particularly useful, since there is no easy way to merge changes to Tableau files, but having a check-in process to manage who is working on a workbook or data source can definitely have advantages.
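Because the TDS and TWB files are plain XML, a check-in step can also inspect them with any XML library. Here is a minimal sketch using Python’s standard library. The sample XML is a simplified, hand-written stand-in for a real TDS file (real files carry many more elements and attributes), and the function name is my own.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for a .tds file; NOT a complete real-world example.
SAMPLE_TDS = """<datasource formatted-name='Sales' version='10.0'>
  <connection class='postgres' dbname='sales_dev' port='5432'
              server='dev-db.example.com' username='tableau'>
    <relation name='orders' table='[public].[orders]' type='table' />
  </connection>
</datasource>"""


def summarize_connections(xml_text):
    """Return the class, server, and database name of every connection element."""
    root = ET.fromstring(xml_text)
    return [
        {key: conn.get(key) for key in ("class", "server", "dbname")}
        for conn in root.iter("connection")
    ]


print(summarize_connections(SAMPLE_TDS))
```

A check like this could, for example, flag a file that still points at the wrong database server before it is checked in.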
Tableau Content Deployment / Promotion
Let’s assume a simplified process, where the QA team is looking at the Dev Tableau Server Site and giving feedback to the content developers. Everything looks good, so we’re going to promote to Prod.
Pulling the Content for Promotion
If using Tableau Server Revision History, a REST API script can pull down any content that is “ready to publish.” In most cases, this is signaled by moving content into specified Projects, with the script set to pick up only recently modified content in those Projects. You are welcome to design this step however works best for your team.
For external source control, you will be checking the TWB and TDS files in and out as they are developed. It’s also good practice to publish them to the Dev Tableau Server Site at the same time (this could probably be tied in with your source control system via the REST API if you need to get to that level of sophistication).
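The “specified Projects plus recently modified” rule can be sketched as a simple filter. The dictionary keys and Project names here are hypothetical; they stand in for whatever your script collects when it queries the workbooks on the dev site.

```python
from datetime import datetime, timedelta, timezone


def ready_to_promote(workbooks, promote_projects, since):
    """Keep workbooks that sit in a promotion Project and changed after `since`."""
    return [
        wb for wb in workbooks
        if wb["project"] in promote_projects and wb["updated_at"] >= since
    ]


# Hypothetical listing, shaped like what a REST API query script might collect.
now = datetime(2017, 1, 10, 12, 0, tzinfo=timezone.utc)
listing = [
    {"name": "Sales Overview", "project": "Ready for Prod", "updated_at": now - timedelta(hours=2)},
    {"name": "Sales Overview (draft)", "project": "Sandbox", "updated_at": now - timedelta(hours=1)},
    {"name": "Inventory", "project": "Ready for Prod", "updated_at": now - timedelta(days=30)},
]

picked = ready_to_promote(listing, {"Ready for Prod"}, since=now - timedelta(days=1))
print([wb["name"] for wb in picked])
```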
Whether using Tableau Server Revision History or an external Source Control method, the next steps can all be accomplished programmatically with the REST API and the Document API.
Modifying the Files for the New Environment
If you’ve read The Tenets of Tableau Templates on Multitenants (and you really should, if simply for the awe-inspiring alliteration), you’ll know that Tableau holds some of the information about a data source (server address, port, etc.) in the repository, where the REST API can modify it, and other information, such as the schema/database name and the table names, directly in the Tableau XML files. To modify those attributes, you’ll need to use the Document API (or your own mechanism for modifying the XML directly, such as tableau_tools). This means that to fully switch an environment, you may need to use both the REST API and Document API.
Depending on how complex the environment is, particularly if the final result is a multi-tenanted Tableau Server with many sites, each connected to its own database, you will also want to keep a “Master Database” that all of your scripts can reference when moving from dev to test to each prod tenant. Information you’ll want to keep in the Master Database:
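To make the XML side of this concrete, here is a minimal sketch that swaps the database name and schema directly in the XML using Python’s standard library. This is the “own mechanism” route, not the Document API or tableau_tools, and the sample XML is a simplified stand-in I wrote for illustration. ElementTree may not round-trip every detail of a real Tableau file, so treat it as a demonstration of the idea only.

```python
import xml.etree.ElementTree as ET


def retarget_datasource(xml_text, new_dbname, old_schema, new_schema):
    """Point every connection at new_dbname and move tables to new_schema."""
    root = ET.fromstring(xml_text)
    for conn in root.iter("connection"):
        if conn.get("dbname") is not None:
            conn.set("dbname", new_dbname)
    prefix = f"[{old_schema}]."
    for relation in root.iter("relation"):
        table = relation.get("table")
        if table and table.startswith(prefix):
            relation.set("table", f"[{new_schema}]." + table[len(prefix):])
    return ET.tostring(root, encoding="unicode")


# Simplified stand-in for a .tds file; real files contain much more.
SAMPLE = (
    "<datasource>"
    "<connection class='postgres' dbname='sales_dev' server='dev-db.example.com'>"
    "<relation name='orders' table='[dev_schema].[orders]' type='table' />"
    "</connection>"
    "</datasource>"
)

print(retarget_datasource(SAMPLE, "sales_prod", "dev_schema", "prod_schema"))
```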
- Database Info: database server addresses, database/schema names for each environment, database credentials
- Tableau Server Info: Tableau Server addresses, admin credentials, and Tableau Site IDs for each tenant
If you have a simpler environment, or are deploying on-premise, the “Master Database” could just be a text config file.
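For the simple case, a text config file might look something like the JSON below. The layout and every key name are my own invention for illustration, and credentials are deliberately left out of the sketch.

```python
import json

# Hypothetical config layout; all names and addresses are placeholders.
CONFIG_TEXT = """
{
  "environments": {
    "dev": {
      "db_server": "dev-db.example.com",
      "db_name": "sales_dev",
      "tableau_server": "https://tableau-dev.example.com",
      "site": "dev"
    },
    "acme": {
      "db_server": "prod-db.example.com",
      "db_name": "sales_acme",
      "tableau_server": "https://tableau.example.com",
      "site": "acme"
    }
  }
}
"""


def load_environment(config_text, name):
    """Return the settings for one environment or tenant by name."""
    return json.loads(config_text)["environments"][name]


print(load_environment(CONFIG_TEXT, "acme")["db_name"])
```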
The deployment process involves passing the Template files from Revision Control through a Document API/XML Modification script, which should look to the “Master Database” for any elements that need to be changed and will produce new files with those details substituted in.
The last step is to use the REST API to publish the content to the correct sites. While these are shown as separate steps, in most cases the actual flow would be for the Document API to create the files for a given Tableau Server Site, then for the REST API to publish to that site, and then to move on to the next customer in the Master DB.
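That per-customer flow can be sketched as a loop. The customize and publish steps are passed in as functions, and the ones used here are stand-ins so the example runs without a Tableau Server; in a real script they would wrap your Document API (or XML) code and your REST API publish call. All names are hypothetical.

```python
def deploy_to_all(template_xml, tenants, customize, publish):
    """Customize the template for each tenant, publish it, then move to the next."""
    deployed = []
    for tenant in tenants:
        tenant_xml = customize(template_xml, tenant)  # Document API / XML step
        publish(tenant_xml, tenant)                   # REST API step
        deployed.append(tenant["site"])
    return deployed


# Stand-in for the XML modification step.
def fake_customize(xml_text, tenant):
    return xml_text.replace("sales_dev", tenant["db_name"])


# Stand-in for the publish step: just records what would be sent.
outbox = []


def fake_publish(xml_text, tenant):
    outbox.append((tenant["site"], xml_text))


tenants = [
    {"site": "acme", "db_name": "sales_acme"},
    {"site": "globex", "db_name": "sales_globex"},
]
template = "<datasource><connection dbname='sales_dev' /></datasource>"
print(deploy_to_all(template, tenants, fake_customize, fake_publish))
```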
On-Premise Deployment
For those Tableau customers who embed into a product they distribute to their customers, the same basic process applies, with the obvious additional step that the deployment must happen on the customer’s site. The end result of the development phase would be a deployment package, containing both the Tableau Template files and the scripts for deploying within the customer environment.
You don’t have to use the REST API in the deployment phase here — tabcmd can also do the same publishing actions. It’s also possible that you’d want to pre-populate the templates from the development side, rather than do the XML modification at the customer site. It’s totally up to you.
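If you go the tabcmd route, a deployment script might assemble the publish command along these lines. This is a sketch that only builds the argument list; it assumes a `tabcmd login` has already been run in the same session, and the file and Project names are placeholders.

```python
import shlex


def tabcmd_publish_args(file_path, project, overwrite=True):
    """Assemble a `tabcmd publish` command line as an argument list."""
    args = ["tabcmd", "publish", file_path, "--project", project]
    if overwrite:
        args.append("--overwrite")
    return args


# Print the command instead of running it, so the sketch has no side effects.
print(shlex.join(tabcmd_publish_args("Sales Overview.twbx", "Prod")))
```

To actually run it, you could hand the list to `subprocess.run(args, check=True)` on a machine where tabcmd is installed.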
Dev – Test – Prod
If you have a fully traditional SDLC process that must be adhered to, the process again looks very similar — basically you have one promotion to Test, then one to Prod, without worrying about the multi-tenancy at the end. You still may need the Document API if your database/schema names change though. If the schemas are the same and only the database server addresses vary, you can get away with the REST API Update Datasource/Workbook Connection command.
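For that last case, here is a sketch of the request body an Update Workbook Connection call might carry. I’m assuming the `tsRequest`/`connection` layout and attribute names from the REST API reference, so verify them against the reference for your server version. The body would be sent with a PUT to the URL of the specific workbook connection, along with your auth token.

```python
import xml.etree.ElementTree as ET


def connection_update_body(server_address, server_port):
    """Build the XML body that repoints a connection at another database server."""
    request = ET.Element("tsRequest")
    ET.SubElement(
        request,
        "connection",
        serverAddress=server_address,
        serverPort=str(server_port),
    )
    return ET.tostring(request, encoding="unicode")


# Placeholder address and port, for illustration only.
print(connection_update_body("prod-db.example.com", 5432))
```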