The best analogy I’ve come up with is that making a great Tableau data source is like making a great bronze sculpture. You make the original model, which you keep for yourself in your workshop, and then you cast new statues each time the outside world needs one. This process allows you to make changes to the original model, or even make a new model, without replacing the existing statues (until necessary).
When used correctly, Data Sources, Extracts, and Tableau Data Server facilitate quick access to data with minimal effort. Following the correct process, in the correct order, can also save tremendous amounts of time on initial setup and on any subsequent changes. A fully thought-out process will look something like this:
Making the Model: Creating a Data Source
Every Tableau workbook begins with Connect to Data. When sculpting our data source, we want to create a .TWB file that contains only data sources, without any viz. The TWB can contain multiple data sources if necessary, and it holds all the connection information needed to publish a live connection or create an extract (TDE).
- Open Tableau Desktop
- Connect to your file or database.
- Join your tables
- Leave the connection live – don’t extract yet
- Manage your metadata:
  - Rename fields
  - Create Calculations
  - Create Hierarchies
  - Hide fields – hide anything empty or not used by anyone, for improved extract generation performance
  - Set default properties (number format, comments)
  - Assign Data Types and Geographic Roles
  - Build parameters
- Depending on your version, you have two options:
  - 9.3 and after: Publish the workbook to a Project dedicated to Data Source Workbooks.
  - 9.2 and before: Save the TWB, following your Revision Control Process to check the changes in.
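
Because a TWB is an XML file, it is possible to check what a data source workbook contains before checking it in. The following is a minimal Python sketch, not an official Tableau API. It assumes the workbook XML has a top-level `datasources` element with `datasource` children and a `worksheets` element with `worksheet` children. The sample XML and the `summarize_twb` helper are hypothetical and heavily trimmed.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for the contents of a saved .twb file.
SAMPLE_TWB = """<workbook>
  <datasources>
    <datasource caption='Sales Model' name='sqlserver.41' />
    <datasource name='excel.42' />
  </datasources>
  <worksheets>
    <worksheet name='Placeholder' />
  </worksheets>
</workbook>
"""


def summarize_twb(xml_text):
    """Return (data source labels, worksheet names) found in TWB XML text."""
    root = ET.fromstring(xml_text)
    # Prefer the friendly caption; fall back to the internal name if none is set.
    sources = [
        ds.get("caption") or ds.get("name")
        for ds in root.findall("./datasources/datasource")
    ]
    sheets = [ws.get("name") for ws in root.findall("./worksheets/worksheet")]
    return sources, sheets


sources, sheets = summarize_twb(SAMPLE_TWB)
print("Data sources:", sources)
print("Worksheets:", sheets)
```

For a real file, the text could be read with `open(path, encoding="utf-8").read()` and passed to the same function. A data source workbook should list every data source you expect and few or no worksheets.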
Organizing your workshop: Revision Control Process
The essential aspect of Revision Control is tracking the changes you make so you can go back in case of a mistake, or just see who made a particular change.
9.3 and after
In Tableau Server 9.3 and later, Revision Control can be turned on for Workbooks. Keep separate workbooks for your data sources, then publish them to a Project dedicated to Data Source Revisions. A workbook needs at least one viz or filter before it can be published; make something simple and easy.
9.2 and before
Prior to 9.3, Tableau Server does not track changes to overwritten TWB or TDS files, so you have a choice in your revision process if you are using an earlier version. All you need is a method for distinguishing between versions and a way to “check in” and “check out”.
If your method is a simple shared drive, keep the first part of the file name the same and add _timestamp (and possibly _initials) to the filenames to keep them distinct. It is also suggested that you publish the updated data source to a Tableau Server Project, visible only to Data Stewards, using the same naming conventions.
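
The shared-drive naming convention can be generated rather than typed by hand. This is a minimal Python sketch under stated assumptions: the `YYYYMMDD-HHMMSS` timestamp format, the `versioned_name` helper, and the `SalesModel.twb` file name are all illustrative choices, not something the process prescribes.

```python
from datetime import datetime
from pathlib import Path


def versioned_name(base_file, when, initials=None):
    """Build '<stem>_<timestamp>[_<initials>]<suffix>' for a shared-drive copy."""
    path = Path(base_file)
    # A sortable timestamp keeps revisions in chronological order by file name.
    parts = [path.stem, when.strftime("%Y%m%d-%H%M%S")]
    if initials:
        parts.append(initials)
    return "_".join(parts) + path.suffix


print(versioned_name("SalesModel.twb", datetime(2016, 4, 1, 9, 30, 0), "JS"))
# prints SalesModel_20160401-093000_JS.twb
```

Passing `datetime.now()` as `when` stamps the copy with the current time. Keeping the stem unchanged means the revision ending is easy to remove again later.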
If using SharePoint, checking the file back into SharePoint should archive the previous version and keep the most recent check-in as the current file.
Industry standard revision control platforms like CVS, SVN, etc. can all be used for the same purpose.
Optional: Casting a Test Statue
Many organizations require a QA process before making any change to the verified data sources that end user workbooks depend on. In this case, the data steward will first publish the new data source (creating an extract if that will be the final result) to a QA Project on Tableau Server. Then the QA team will open a copy of existing workbooks that connect to the original data source in Desktop, and use the “Replace data source” functionality to confirm that the updated data source will not cause any issues when it replaces the original data source.
Once QA has confirmed, they report back their approval to the Data Steward. QA does not move the data source into the Verified Data Source Project.
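
Before the Desktop check, QA may find it useful to compare the field names of the original and updated data sources, since a removed or renamed field is a common reason a replacement breaks a workbook. The sketch below is a plain Python set comparison, not a Tableau feature. The field lists and the `compare_fields` helper are hypothetical, and it does not replace the “Replace data source” test itself.

```python
def compare_fields(original_fields, updated_fields):
    """Return (removed, added) field names between two data source versions."""
    original, updated = set(original_fields), set(updated_fields)
    return sorted(original - updated), sorted(updated - original)


# Hypothetical field lists, for example copied from each data source's schema.
original = ["Order ID", "Order Date", "Sales", "Region"]
updated = ["Order ID", "Order Date", "Sales", "Sales Region"]

removed, added = compare_fields(original, updated)
print("Removed (may break existing workbooks):", removed)
print("Added:", added)
```

Here `Region` shows up as removed and `Sales Region` as added, which suggests a rename that existing workbooks would not follow automatically.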
Casting the final statue: Publishing to Verified Data Source Project
- 9.3 and later: Connect to the Published Data Source Workbook.
- 9.2 and before: Open the latest TWB from the Revision Control location, the one that QA approved.
- If the final data source will be a Tableau Data Extract, follow these steps:
  - Add a “Create Empty Extract” parameter and calculation per this article
  - Set the “Create Empty Extract” parameter to “Yes”, then take an extract from the local copy, adding the “Create Empty Extract” filter per the article so that the extract generates with no data very quickly
  - Set the “Create Empty Extract” parameter to “No” before publishing
- Publish the data source (live or extract) to the appropriate Project on the Tableau Server, keeping the same name as was originally published (remove any revision-tracking endings). Overwrite if you are updating an existing data source so existing workbook references continue to work. This Project should have permissions that allow Data Stewards to Publish and Business Users to Connect. In 9.3 and after: make sure that “Update workbook to use the published data source” is NOT SELECTED. You don’t want to connect the Data Source workbook to the version you publish.
- Extract only: To populate the extract with data, go to the Data Sources page on Tableau Server, select the data source, and then choose the “Permissions” menu option. Choose “Scheduled Tasks” and then make the extract refresh “Right Now”.
- Business Users should connect to the published data source via the Tableau Server connection in Tableau Desktop to begin their own explorations of the data.
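
The steps above use Tableau Desktop and the Server web interface. For teams that prefer to script the publish and refresh, a roughly equivalent sequence can be assembled for the `tabcmd` command-line utility. The Python sketch below only builds the argument lists and does not run them. It assumes a session already opened with `tabcmd login`, a data source saved locally as a `.tdsx` file, and a revision ending of the form `_YYYYMMDD-HHMMSS` with optional `_INITIALS`. The file, data source, and project names are hypothetical.

```python
import re

# Assumed revision ending: _YYYYMMDD-HHMMSS plus optional _INITIALS (hypothetical convention).
REVISION_SUFFIX = re.compile(r"_\d{8}-\d{6}(_[A-Za-z]+)?$")


def published_name(file_stem):
    """Strip the revision-tracking ending so the published name stays stable."""
    return REVISION_SUFFIX.sub("", file_stem)


def publish_command(file_path, name, project):
    """Arguments for `tabcmd publish`, overwriting any existing data source."""
    return ["tabcmd", "publish", file_path,
            "--name", name, "--project", project, "--overwrite"]


def refresh_command(name, project):
    """Arguments for `tabcmd refreshextracts` on a published data source."""
    return ["tabcmd", "refreshextracts", "--datasource", name, "--project", project]


name = published_name("SalesModel_20160401-093000_JS")
print(publish_command("SalesModel_20160401-093000_JS.tdsx", name, "Verified Data Sources"))
print(refresh_command(name, "Verified Data Sources"))
```

Each list could be handed to `subprocess.run(..., check=True)` once the names have been reviewed. Building the commands separately from running them makes it easy to confirm that the published name no longer carries a revision ending.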