How to set up your Database for Row Level Security in Tableau

How do I show a user only the data I want them to see in Tableau? The answer is not Permissions, which only affect what Tableau Server content a user can see. It is Row Level Security, which requires setup both in your Tableau workbooks and in your database.

To be secure and perform well, you must use a Live Connection to your database. Extracts will be too large if you join a security table that is one-to-many to your data table (we say they “blow up” to dissuade people). While you can simulate row-level security using Data Blending, it is not fully secure and it also can run into performance issues, so it is best to avoid that method.

Since you need a live connection and a particular setup in your database, it’s worth exploring the requirements and the best practices on the database side.

The Basics

Fully programmatic Row Level Security has been possible in Tableau for a long time, using what I will refer to as the “JOIN Method”. It requires you have a security table (or set of tables, or a view) which includes the Tableau Username as one column, and a column with a “security_key” that is also present in a column in the data table. The security table must have one row per “security_key”, so that when JOINed it will not cause any row duplication, which would will cause incorrect aggregations.

I’m generalizing here to the bare minimum. I’ll talk about all of the possibilities of what the “security table” can and should be further down. For our initial discussion, you can think of it as a single table.

There is a rare case where the data is mapped 1 to 1 with individual users. In this case, you don’t need the extra security table and can simply filter the username field in the data table. If you have this situation, lucky you! Most organizations have more complex mappings, where a user belongs to an organization or a multiple groups and so on. When you have a security table involved, there are two methods to achieve the necessary filtering.

The JOIN Method

The basics of the JOIN Method (as laid out in the Tableau KB article, second section) are:

  1. Connect to Data Table “d”
  2. INNER JOIN Security Table “s” ON “d”.security_key = “s”.security_key
  3. Create a Calculated Field in Tableau using [“s”.tableau_username] = USERNAME()
  4. Put the Calculated Field on as a Data Source Filter, set to True
  5. Publish the Data Source to Tableau Server
  6. All user workbooks must connect to the Published Data Source. Now the data source will be filtered to the logged in user, and they cannot get around the data security model, owing to the data source filter.

You can represent this in SQL like the following (although VizQL will produce whatever complex queries it needs, this is the basic form):

FROM data d
INNER JOIN security s ON d.key = s.key
WHERE s.username = USERNAME()

Why does this not cause row duplication?

For those used to Tableau Data Extracts, the obvious issue with this method is that if your security table has multiple entries, you will end up with many duplicate rows. And in truth, if your live database is not optimizing the queries correctly, it might go through that process to get your end result. But in most cases, the database processes in an order that actually makes this more efficient.

Your initial JOINs in Tableau are just telling VizQL about the relationships;the database never actually calculates this combined view:

row level 4

With your row level security calculation in place, there will always be a WHERE clause that filters the Security table. The RDBMS should process this first, which reduces the number down just to the entries for that username.


row level 5

Now that there is one (or a very few) number of keys remaining to JOIN on, the database filters down the Data table based on those remaining keys from the security table.

row level 6

Even if a given user has more than one match in the data table, as long as the relationship remains one-to-one AFTER filtering down on the security side, you still won’t have duplication.

How does it perform?

I tested on my local PostgreSQL instance and this method can produce very fast results. For it to be fast, you must INDEX all of the filtered fields properly and set up the correct Primary Key / Foreign Key relationships where you can. The query optimizer does the following (1) Filters the security table down by the username column, using the INDEX on the username column (2) Uses the remaining key (just 1 row) to quickly JOIN, removing all rows from the data table that don’t belong. Since the key field is indexed in both tables, this also is a very fast process.

Will it perform this well in other RDBMSs? I’m hoping to follow up with a few other systems to make sure there isn’t something being missed, but if the optimizer is decent, it seems like it should work just fine. The database optimizer shouldn’t be trying to JOIN all the data in both tables first and then filtering down; if it is, then you may need to investigate how to force different optimizer behavior (hinting and so forth).

I also tested whether either of the following queries changed the optimizer logic. The answer was no: at least in PostgreSQL, the query optimizer recognized that these were all equivalent, and performed the exact same operations.

FROM data d
INNER JOIN security s ON d.key = s.key AND s.username = USERNAME();

FROM data d
SELECT s.key
FROM security s
WHERE s.username = USERNAME()
) s ON d.key =s.key;

The WHERE Method

The other way of filtering down rows in SQL is through a WHERE clause, rather than a JOIN. You can put a SELECT statement in the WHERE clause (a sub-select) which returns only values necessary to filter the main table. Logically, this is exactly the same as INNER JOINing, however it’s very likely that the database optimizer will do a different set of steps to compute the result set.

There is currently no way to do this method in Tableau’s Data Joins screen. Prior to Tableau 9.3, you can only do it using Custom SQL and JavaScript API parameters, which requires embedding into another page (see the post about Parameters with Stored Procedures; it’s exactly the same idea). With Tableau 9.3 you can use Initial SQL with Parameters.

The basic idea, rendered in SQL, is as follows:

FROM data d
WHERE d.key = (
SELECT s.key
FROM security s
WHERE s.username = USERNAME()

How does it perform?

Using the same tables in PostgreSQL, the WHERE Method does appear to be slightly more efficient. I do mean slightly — when using EXPLAIN, the total cost for the WHERE Method scored about 10 points lower than the JOIN Method (I don’t fully understand PostgreSQL costing, but lower is better). I ran the queries over and over and the lowest speed results were with the WHERE Method, but the average time was about the same for either. My data set was around 3.5 million records, with security table size around 4000. As mentioned earlier, I plan to test on some other RDBMS systems to see if there is more benefit / differences in optimization. Russell’s look at Row Level Security in SQL Server 2016 leads me to believe the WHERE Method might provide more benefit on SQL Server. The optimizer is where RDBMS systems really vary, so YMMV.

Recommendation: Use the JOIN Method unless the WHERE Method is “easy”

The standard recommended JOIN Method for Row Level Security in Tableau works just fine, given you have a “security table” to JOIN to, as described in the next section. While the WHERE method may give slight performance gains (and it is probably more natural to SQL experts), in a properly optimized database there’s no reason not to use the JOIN method.

Exceptions to this recommendation, which take advantage of features in particular RDBMS systems to do the WHERE method internally:

  • In Oracle, if you have VPD set up for the database users, you can use Initial SQL in Tableau 9.3 to take advantage of the existing security filtering
  • In SQL Server 2016, you can set up Row Level Security in the database, based on database user. Tableau’s Impersonate User functionality will set the user correctly for you, and SQL Server 2016 will do the filtering. See Russell’s exploration here.
  • In previous versions of SQL Server, you can create Views with the row level security WHERE clauses built in that reference the current user’s role, then give users only access to those specific Views. In this case, Impersonate User will work. It is a similar concept to how SQL Server 2016 works, but the DBA must make every View they want to expose instead of having the security logic applied automatically to every table or view. You can reference the username of the user who is being impersonated using the SYSTEM_USER constant value, per Stack Overflow
    • Each View would require a WHERE clause includes a reference like [Entitlements Table].[Username] = SYSTEM_USER

Note that these require the usernames to exist on the database as users with permissions / roles in the database. This is more likely in an internal organization use case than in an externally facing Tableau deployment. If the usernames are not database users, but simply exist in the data, then you can implement a solution starting with Tableau 9.3 using Initial SQL with Parameters.

Building the “Entitlements View”

At the beginning, we assumed there was a single “security table” to be JOINed or used in the WHERE clause. In reality, you will probably need to construct a de-normalized “entitlements view” out of many different tables that store different pieces of the user entitlements.

Starting with the end in mind, whatever we do needs to result in a one-to-one relationship between a column in the data table and a column in the “security table”. To represent different hierarchical relationships to the data, one effective technique is to have a row in the Entitlements View representing each of the most granular level of security filtering. What does this mean? Let’s take an example with a two-tiered hierarchy:

  1. Big Boss
  2. Category Manager
  3. Sub-Category Manager

The most granular level we will filter to is Sub-Category. A Category manager can see all of the sub-categories in their Category, and the big boss can see all sub-categories. In a strict hierarchy like this, you really only have to filter on sub-category, but there are situations where you want to filter on more than one attribute.

To represent this in your Entitlements View, the Big Boss will have a row for every sub-category. Each Category manager will have a row for any of the sub-categories that belong to their category, while a Sub-Category manager will have a row for any sub-category they own.

Let’s consider an additional level of hierarchy depth:

  1. Big Boss
  2. Category Manager
  3. Sub-Category Manager
  4. Segment

In this case, segment doesn’t hierarchically belong to Sub-Category; the Segment can exist across different sub-categories and categories. However, some sub-categories have a manager per segment, while others have managers who handle all of the segments.

Now our most granular level is Sub-Category+Segment. This is known as a Composite Key in database terminology, and there are two ways you can handle it:

  • Create an actual Composite Key field in your Entitlements View and in your Data Table. Then JOIN the two on that single Composite Key field
  • JOIN the Entitlements View to the Data Table on each field that makes up the Composite Key.

Which of these is better? It really depends on how many field make up your composite key and how much control you have on your data tables. In all cases, you want to have any of the columns you JOIN against in the Data Table INDEXed; you might also (as noted belong) want to Materialize the Entitlements View into a real table and index its columns, or define a Foreign Key relationship to the data table. Basically, anything that will help make the JOINs and filtering faster should be done. With a Hyper extract (2018.3 and later), there is no optimization necessary, but whether a single composite key performs better than multiple joined fields is still up for debate.

Putting together an Entitlements View in a database

Standard database design practices mean you rarely have a single table that fits the criteria for the entitlements table. Most data isn’t necessarily mapped one-to-one to a single user; it’s very likely the data security is organized either by role, organization name, or both. A normalized set of tables might look like:

  1. Users table
  2. Roles table
  3. Users-Roles mapping table
  4. Organization / Customer table. We’ll assume it has a one-to-one relationship with Users

To get to the “security table” mentioned previously, you’d actually do

FROM roles r
INNER JOIN users_roles_map m ON r.role_id = m.role_id
INNER JOIN users u ON m.user_id = u.user_id
INNER JOIN organization o ON u.org_id = o.org_id

You may even need to do UNIONs to get all of the rows necessary to represent that most granular level of data, and particular calculations to generate a unique Composite Key field.

Certainly, you can do this in Tableau’s JOINs dialog. But it’s a lot to ask of an end user to set up every time, when the logic should always stay the same. I would always recommend creating a View, possibly even a Materialized or Indexed View depending on your RDBMS, so that you only have to JOIN that single ‘security_table” view to the data table in Tableau.

Hierarchical Filters

At some point in life, we all have to face the fact that we answer to someone. Most organizations have a hierarchy, and often there is the desire that those higher in the hierarchy can access the data of all of those who report to them. Recommendations start to vary on how to deal with hierarchies, because your particular needs may be different, and RDBMS systems have often solved the challenges of hierarchies with their own proprietary methods.

Utilizing Tableau Groups and a “Flattened” Security Table

The standard Tableau Row Level Security calculated field looks like

[Username Field] = USERNAME()

But this calculation can include IF/THEN logic, and there is also a function for testing if a user is a member of a Tableau Group called ISMEMBEROF(‘group name’). Using this, you can construct hierarchical filters, like so:

IF ISMEMBEROF(‘System Admins’) THEN 1
IF [Manager Username Field] = USERNAME() THEN 1 ELSE 0 END
IF [Username Field] = USERNAME() THEN 1 ELSE 0 END

You’ll notice that the outputs need to be numeric; this is due to the nature of IF/THEN calculations in Tableau.

The idea here is that a given user in the security table will have additional columns that include the username key of the user above them in the hierarchy. For each level of hierarchy, an additional column is necessary. Obviously there is some limit to the levels of hierarchy you would be able to reasonably define this way — you may have to decide on a mapping process that simplifies down the actual hierarchy into a set number for Tableau to handle. You also need a mechanism for synchronizing Tableau Group membership from the database (using the REST API, but with some type of mapping from the database).

Unlimited levels of hierarchy

The standard pattern from representing hierarchy in a relational database actually is through a long table with only two columns: User ID and User’s Manager’s User ID. You can JOIN this table to itself to create the flattened view of the hierarchy.

Common Table Expressions (CTEs)

Rather than hand write out each self JOIN (one for each level of hierarchy), many systems now allow for Common Table Expressions (CTEs) which can recursively JOIN until the last level has been joined. Tableau cannot deal with CTEs directly; you can’t even put them in as Custom SQL due to how VizQL uses the Custom SQL query. If you need to use a CTE to get to your flattened security table, you need to make it as a View. Following the earlier advice, you should just combine this CTE view with any other details you need to make a single View to join with your data table in Tableau.

Security Functions, Stored Procedures, etc.

As mentioned, different RDBMS systems have other ways of handling hierarchy calculations. SQL Server, for example, has a hierarchyID type and related functions for handling relationships defined this way. You may have defined some functions or stored procedures that do all of the security processing logic. To use them in Tableau, you’ll need to either use the Parameters with Stored Procedures method, or Initial SQL in Tableau 9.3.




Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s