Building a Flexible Extract Generator using the Extract API

One of the least mentioned, but incredibly useful APIs in Tableau is the Extract API, which allows you to programmatically create an Extract file (Hyper files starting in 10.5, previously TDE files). The main use case is for data sources that require programmatic access (as opposed to using the one of the native connectors in Tableau). Some situations where this would be useful:

  • Data coming from a Web Service/ RESTful API with an object response
  • ODBC / JDBC drivers that Tableau cannot use
  • Additional programmatic modeling / statistical analysis against a whole data set

This post is focused mostly on first use case, where you are trying to make data available from some type of Web Service / RESTful API. In particular, if you need to provide only a subset from a very flexible set of possible fields for “ad hoc” analysis, this technique is the most functional solution to the problem.

When should I build a Flexible Extract Generator?

If you:

  • Know the structure of your web service responses
  • The amount of total fields is reasonably sized
  • The web service responses will not change frequently
  • Workbooks are fully built out and will not allow web editing
  • Data Source structure can be reused across multiple reports (and possibly customers)

then the better solution for Web Service/REST API based data sources is “Live” Web Services Connections in Tableau.

If instead you want to provide a selection screen to generate an Extract that will power a Web Edit session, then it makes sense to build a Flexible Extract Generator process. This is particularly useful when the set of fields could change drastically from extract to extract, or if other processing (such as machine learning) needs to be applied based on differing parameters prior to its use by the end user (that said, if the actual output columns are consistent, the “Live” Web Services solution could still work).



“Live” Web Services Connections in Tableau

Many organizations have begun standardizing on a “Web Services” layer for access to reporting data, often with a restriction on directly connecting to the underlying data stores that power the Web Service responses. In the majority of cases, the result is a set of RESTful endpoints returning JSON object data, but for the purposes of this article, any variation that involves HTTP requests and responses in a “web-friendly” response format (JSON / XML) will be referred to as “Web Services”.

There are many reasons for adopting this architecture, and I’m here neither to recommend or pass judgement. There is one major implication to this architectural decision though — BI systems that expect a relational model and SQL-compliant querying capabilities do not have a native, natural way to handle these data responses. Tableau falls in this category (I don’t care about any others, but it’s not an issue exclusive to Tableau).

Tableau provides a Web Data Connector technology which helps individual analysts retrieve data from Web Service Data Sources, but current design does not account for data sets to vary depending on the user looking at the workbook, something essential for scalable and secure Tableau Server reports.

However, Tableau’s ability to connect live to a wide range of relational data sources allows us to construct an alternate architecture for accessing Web Services responses “live”:

Full Embedded Web Services Architecture


Keeping Web Edit Content Private

Tableau’s behavior for saving content when using Web Edit follows these rules:

  1. If you are the Content Owner, you can Save or Save As
  2. If you are not the Content Owner, you can Save As

Save As is only allowed to Projects where you (or the groups you belong to) have a Save permission set to “Allow”.

Since a newly Saved Workbook will take the Default Permissions of the Project it saves into, if other people also have permissions for that same Project, they will also be able to access that content. This leads to several different strategies for controlling the privacy of content created through Save As.

Possible solutions:

  • A Project Per Team / Group
  • A Project Per User
  • A REST API script that “fixes” Permissions
  • Publishing a New Copy rather than Save As


Revisiting The Tenets of Multi-Tenancy

Since there’s been so much time and better examples and code, I went back and did a major revision of the The Tenets of Tableau Templates on Multi-tenants which I highly advise everyone reading. It’s the most thorough explanation out there of how to correctly handle SaaS / Multi-Tenancy or Dev->Test->Prod promotion. And no, you do not need Interworks PowerTools to do this process, although they do have some nice features.

Row Level Security using Microsoft Analysis Services Cubes in an External- Facing Environment

Later versions of Microsoft Analysis Services (MSAS) allow you to configure user and role based data security within the cube itself. However, this functionality only works when that particular user is logged in directly to the cube. In Tableau, this can be accomplished via Kerberos.

What about when you are using MSAS cubes in an external facing solution, with users who are not in the local domain? Cube connections in Tableau don’t have the equivalent of a Data Source Filter the way relational database connections do, and there is no way to pass the USERNAME() function into a Calculated Member the way you can in a relational calculated field.

In this case, the manual “User Filter” functionality can achieve a reasonable solution.


Securely Passing Parameters into a Tableau Viz at Load Time

The standard answer for enforcing user-based data entitlements in Tableau is to use Row Level Security, where the user is authenticated in Tableau Server and then tied into an “entitlements view” in the database so that the user only ever sees data they have access rights to.

However, we are very often asked about passing parameters in to the viz to filter down information directly at load time, often driven by an application that Tableau vizes are embedded in. This post is about a few methods of implementing this behavior, and the security implications of each of them.

Basics of Security

Everything must be HTTPS

I’ll start by saying, to do any of this securely, you need EVERY resource you are working with to be using the HTTPS protocol (latest TLS version). If anything is not HTTPS, you could be passing important information in the clear.

Using URL Parameters to set a Filter directly is NOT SECURE

You can use the URL Parameter syntax to directly set the values for a Filter on any field, but this is completely insecure. Why? Because the following two methods will clear any filter and reveal all of the rows of data. Unless you have the JS API turned off, there is no way to prevent this.

Sheet.applyFilterAsync(fieldName, "", tableau.FilterUpdateType.ALL);

Tableau Parameters are the (potentially) secure way to make an adjustable Data Source Filter

The only way to prevent a user from resetting a filter value is by making it a Data Source Filter.  Thankfully, you can use a Calculated Field for the Data Source Filter. If you use Tableau Parameters in the Calculated Field, the Parameter value(s) can be set to change what is filtered, and you will have a Data Source Filter that cannot be altered by the JS API (or the end user).

However, there are quite a few considerations to make this a truly secure method for setting filter values:


Tableau and Write-Back – Together At Last

Editor’s Note: Huge thanks to special contributor Gordon Rose for this blog post.

Tableau helps people see and understand their data – and guarantees that it in the process, it will never make any changes to that data. Tableau is a strictly read-only technology. However, many customers want the ability to modify the data that lies behind a Tableau visualization (Viz), and then, either see those changes immediately reflected in the Viz and/or make other applications aware of those changes. With a small amount of supporting technology, Tableau’s read-only behavior can easily be integrated into so-called “write-back” use cases.

In this blog article, we’ll explore a way to do exactly that – one in which the write-back components are external to the Viz. An alternative approach is one in which those components are more tightly integrated into the Viz itself – that’s for a later blog article to explore. Ideally you will find that you can use one of these two approaches as a launching point for the development of your own write-back use case.