The Tableau Extract API 2.0 is an amazingly powerful tool for building out Extracts that, for whatever reason, cannot be built or maintained using the standard Tableau Server extract refresh process. The output of the Extract API 2.0 is a Hyper file (just as the older Extract API pushed out TDE files). You can publish a Hyper file directly to a Tableau Server, but there are several drawbacks:
- Tableau Server will build out an automatic TDS file, taking a rough guess at any type of metadata categorization (Measure vs. Dimensions, Hierarchies, Geographic info, etc.)
- The only use for this data source will be creating Ad Hoc reports using Web Edit (or hoping someone in Desktop now knows that it exists). You can’t integrate it easily in an existing Workbook
What is missing is a TDS file to pair up with the Hyper file, describing the exact metadata that you want to go along with the Extracted data. In this article, I’ll describe two workflows that result in a fully controlled TDSX file with a newly generated Hyper file.
As of 2019.2, the Tableau Server REST API only allows logging in for a REST API session using a combination of username and password. This means there is no effective way to directly start a REST API session using a SSO mechanism (SAML, JWT, etc.) Even if you were able to, you might still want to restrict the user to only do certain actions (for example, enabling Querying methods but not Updates or Deletes).
The best practice for working around this is to wrap the Tableau REST API in another REST API service of your own design. Then within that wrapper, use a Server or Site Administrator level account to log in to the Tableau Server REST API. In this article, we’ll discuss how to achieve this using tableau_tools, with both a simple and a more complex but efficient design pattern.
The contents of this post have been merged into a revised version of How to set up your Database for Row Level Security in Tableau, where they rightfully belonged in the first place.
One of the least mentioned, but incredibly useful APIs in Tableau is the Extract API, which allows you to programmatically create an Extract file (Hyper files starting in 10.5, previously TDE files). The main use case is for data sources that require programmatic access (as opposed to using the one of the native connectors in Tableau). Some situations where this would be useful:
- Data coming from a Web Service/ RESTful API with an object response
- ODBC / JDBC drivers that Tableau cannot use
- Additional programmatic modeling / statistical analysis against a whole data set
This post is focused mostly on first use case, where you are trying to make data available from some type of Web Service / RESTful API. In particular, if you need to provide only a subset from a very flexible set of possible fields for “ad hoc” analysis, this technique is the most functional solution to the problem.
When should I build a Flexible Extract Generator?
- Know the structure of your web service responses
- The amount of total fields is reasonably sized
- The web service responses will not change frequently
- Workbooks are fully built out and will not allow web editing
- Data Source structure can be reused across multiple reports (and possibly customers)
then the better solution for Web Service/REST API based data sources is “Live” Web Services Connections in Tableau.
If instead you want to provide a selection screen to generate an Extract that will power a Web Edit session, then it makes sense to build a Flexible Extract Generator process. This is particularly useful when the set of fields could change drastically from extract to extract, or if other processing (such as machine learning) needs to be applied based on differing parameters prior to its use by the end user (that said, if the actual output columns are consistent, the “Live” Web Services solution could still work).
Many organizations have begun standardizing on a “Web Services” layer for access to reporting data, often with a restriction on directly connecting to the underlying data stores that power the Web Service responses. In the majority of cases, the result is a set of RESTful endpoints returning JSON object data, but for the purposes of this article, any variation that involves HTTP requests and responses in a “web-friendly” response format (JSON / XML) will be referred to as “Web Services”.
There are many reasons for adopting this architecture, and I’m here neither to recommend or pass judgement. There is one major implication to this architectural decision though — BI systems that expect a relational model and SQL-compliant querying capabilities do not have a native, natural way to handle these data responses. Tableau falls in this category (I don’t care about any others, but it’s not an issue exclusive to Tableau).
Tableau provides a Web Data Connector technology which helps individual analysts retrieve data from Web Service Data Sources, but current design does not account for data sets to vary depending on the user looking at the workbook, something essential for scalable and secure Tableau Server reports.
However, Tableau’s ability to connect live to a wide range of relational data sources allows us to construct an alternate architecture for accessing Web Services responses “live”:
Tableau’s behavior for saving content when using Web Edit follows these rules:
- If you are the Content Owner, you can Save or Save As
- If you are not the Content Owner, you can Save As
Save As is only allowed to Projects where you (or the groups you belong to) have a Save permission set to “Allow”.
Since a newly Saved Workbook will take the Default Permissions of the Project it saves into, if other people also have permissions for that same Project, they will also be able to access that content. This leads to several different strategies for controlling the privacy of content created through Save As.
- A Project Per Team / Group
- A Project Per User
- A REST API script that “fixes” Permissions
- Publishing a New Copy rather than Save As