How do I show a user only the data I want them to see in Tableau? The answer is not Permissions, which only affect what Tableau Server content a user can see. It is Row Level Security, which requires setup both in your Tableau workbooks and in your database.
To be secure and perform well, you must use a Live Connection to your database. Extracts will be too large if you join a security table that is one-to-many to your data table (we say they “blow up” to dissuade people). While you can simulate row-level security using Data Blending, it is not fully secure and it also can run into performance issues, so it is best to avoid that method.
Since you need a live connection and a particular setup in your database, it’s worth exploring the requirements and the best practices on the database side.
Fully programmatic Row Level Security has been possible in Tableau for a long time, using what I will refer to as the “JOIN Method”. It requires you have a security table (or set of tables, or a view) which includes the Tableau Username as one column, and a column with a “security_key” that is also present in a column in the data table. The security table must have one row per “security_key”, so that when JOINed it will not cause any row duplication, which would will cause incorrect aggregations.
I’m generalizing here to the bare minimum. I’ll talk about all of the possibilities of what the “security table” can and should be further down. For our initial discussion, you can think of it as a single table.
There is a rare case where the data is mapped 1 to 1 with individual users. In this case, you don’t need the extra security table and can simply filter the username field in the data table. If you have this situation, lucky you! Most organizations have more complex mappings, where a user belongs to an organization or a multiple groups and so on. When you have a security table involved, there are two methods to achieve the necessary filtering.
The JOIN Method
The basics of the JOIN Method (as laid out in the Tableau KB article, second section) are:
- Connect to Data Table “d”
- INNER JOIN Security Table “s” ON “d”.security_key = “s”.security_key
- Create a Calculated Field in Tableau using [“s”.tableau_username] = USERNAME()
- Put the Calculated Field on as a Data Source Filter, set to True
- Publish the Data Source to Tableau Server
- All user workbooks must connect to the Published Data Source. Now the data source will be filtered to the logged in user, and they cannot get around the data security model, owing to the data source filter.
You can represent this in SQL like the following (although VizQL will produce whatever complex queries it needs, this is the basic form):
FROM data d
INNER JOIN security s ON d.key = s.key
WHERE s.username = USERNAME()
Why does this not cause row duplication?
For those used to Tableau Data Extracts, the obvious issue with this method is that if your security table has multiple entries, you will end up with many duplicate rows. And in truth, if your live database is not optimizing the queries correctly, it might go through that process to get your end result. But in most cases, the database processes in an order that actually makes this more efficient.
Your initial JOINs in Tableau are just telling VizQL about the relationships;the database never actually calculates this combined view:
With your row level security calculation in place, there will always be a WHERE clause that filters the Security table. The RDBMS should process this first, which reduces the number down just to the entries for that username.
Now that there is one (or a very few) number of keys remaining to JOIN on, the database filters down the Data table based on those remaining keys from the security table.
Even if a given user has more than one match in the data table, as long as the relationship remains one-to-one AFTER filtering down on the security side, you still won’t have duplication.
How does it perform?
I tested on my local PostgreSQL instance and this method can produce very fast results. For it to be fast, you must INDEX all of the filtered fields properly and set up the correct Primary Key / Foreign Key relationships where you can. The query optimizer does the following (1) Filters the security table down by the username column, using the INDEX on the username column (2) Uses the remaining key (just 1 row) to quickly JOIN, removing all rows from the data table that don’t belong. Since the key field is indexed in both tables, this also is a very fast process.
Will it perform this well in other RDBMSs? I’m hoping to follow up with a few other systems to make sure there isn’t something being missed, but if the optimizer is decent, it seems like it should work just fine. The database optimizer shouldn’t be trying to JOIN all the data in both tables first and then filtering down; if it is, then you may need to investigate how to force different optimizer behavior (hinting and so forth).
I also tested whether either of the following queries changed the optimizer logic. The answer was no: at least in PostgreSQL, the query optimizer recognized that these were all equivalent, and performed the exact same operations.
FROM data d
INNER JOIN security s ON d.key = s.key AND s.username = USERNAME();
FROM data d
INNERT JOIN (
FROM security s
WHERE s.username = USERNAME()
) s ON d.key =s.key;
The WHERE Method
The other way of filtering down rows in SQL is through a WHERE clause, rather than a JOIN. You can put a SELECT statement in the WHERE clause (a sub-select) which returns only values necessary to filter the main table. Logically, this is exactly the same as INNER JOINing, however it’s very likely that the database optimizer will do a different set of steps to compute the result set.
The basic idea, rendered in SQL, is as follows:
FROM data d
WHERE d.key = (
FROM security s
WHERE s.username = USERNAME()
How does it perform?
Using the same tables in PostgreSQL, the WHERE Method does appear to be slightly more efficient. I do mean slightly — when using EXPLAIN, the total cost for the WHERE Method scored about 10 points lower than the JOIN Method (I don’t fully understand PostgreSQL costing, but lower is better). I ran the queries over and over and the lowest speed results were with the WHERE Method, but the average time was about the same for either. My data set was around 3.5 million records, with security table size around 4000. As mentioned earlier, I plan to test on some other RDBMS systems to see if there is more benefit / differences in optimization. Russell’s look at Row Level Security in SQL Server 2016 leads me to believe the WHERE Method might provide more benefit on SQL Server. The optimizer is where RDBMS systems really vary, so YMMV.
Recommendation: Use the JOIN Method unless the WHERE Method is “easy”
The standard recommended JOIN Method for Row Level Security in Tableau works just fine, given you have a “security table” to JOIN to, as described in the next section. While the WHERE method may give slight performance gains (and it is probably more natural to SQL experts), in a properly optimized database there’s no reason not to use the JOIN method.
Exceptions to this recommendation, which take advantage of features in particular RDBMS systems to do the WHERE method internally:
- In Oracle, if you have VPD set up for the database users, you can use Initial SQL in Tableau 9.3 to take advantage of the existing security filtering
- In SQL Server 2016, you can set up Row Level Security in the database, based on database user. Tableau’s Impersonate User functionality will set the user correctly for you, and SQL Server 2016 will do the filtering. See Russell’s exploration here.
- In previous versions of SQL Server, you can create Views with the row level security WHERE clauses built in that reference the current user’s role, then give users only access to those specific Views. In this case, Impersonate User will work. It is a similar concept to how SQL Server 2016 works, but the DBA must make every View they want to expose instead of having the security logic applied automatically to every table or view. You can reference the username of the user who is being impersonated using the SYSTEM_USER constant value, per Stack Overflow
- Each View would require a WHERE clause includes a reference like [Entitlements Table].[Username] = SYSTEM_USER
Note that these require the usernames to exist on the database as users with permissions / roles in the database. This is more likely in an internal organization use case than in an externally facing Tableau deployment. If the usernames are not database users, but simply exist in the data, then you can implement a solution starting with Tableau 9.3 using Initial SQL with Parameters.
Building the “Security Table”
At the beginning, we assumed there was a single “security table” to be JOINed or used in the WHERE clause. Now let’s think about what that “security table” should consist of and how we can get there from whatever we are starting with.
Starting with the end in mind, whatever we do needs to result in a one-to-one relationship between a column in the data table and a column in the “security table”.
Standard database design practices mean you rarely have a single table that fits this criteria. Most data isn’t necessarily mapped one-to-one to a single user; it’s very likely the data security is organized either by role, organization name, or both. A normalized set of tables might look like:
- Users table
- Roles table
- Users-Roles mapping table
- Organization / Customer table. We’ll assume it has a one-to-one relationship with Users
To get to the “security table” mentioned previously, you’d actually do
FROM roles r
INNER JOIN users_roles_map m ON r.role_id = m.role_id
INNER JOIN users u ON m.user_id = u.user_id
INNER JOIN organization o ON u.org_id = o.org_id
Certainly, you can do this in Tableau’s JOINs dialog. But it’s a lot to ask of an end user to set up every time, when the logic should always stay the same. I would always recommend creating a View, possibly even a Materialized or Indexed View depending on your RDBMS, so that you only have to JOIN that single ‘security_table” view to the data table in Tableau.
At some point in life, we all have to face the fact that we answer to someone. Most organizations have a hierarchy, and often there is the desire that those higher in the hierarchy can access the data of all of those who report to them. Recommendations start to vary on how to deal with hierarchies, because your particular needs may be different, and RDBMS systems have often solved the challenges of hierarchies with their own proprietary methods.
Utilizing Tableau Groups and a “Flattened” Security Table
The standard Tableau Row Level Security calculated field looks like
[Username Field] = USERNAME()
But this calculation can include IF/THEN logic, and there is also a function for testing if a user is a member of a Tableau Group called ISMEMBEROF(‘group name’). Using this, you can construct hierarchical filters, like so:
IF ISMEMBEROF(‘System Admins’) THEN 1
ELSEIF ISMEMBEROF(‘Managers’) THEN
IF [Manager Username Field] = USERNAME() THEN 1 ELSE 0 END
IF [Username Field] = USERNAME() THEN 1 ELSE 0 END
You’ll notice that the outputs need to be numeric; this is due to the nature of IF/THEN calculations in Tableau.
The idea here is that a given user in the security table will have additional columns that include the username key of the user above them in the hierarchy. For each level of hierarchy, an additional column is necessary. Obviously there is some limit to the levels of hierarchy you would be able to reasonably define this way — you may have to decide on a mapping process that simplifies down the actual hierarchy into a set number for Tableau to handle. You also need a mechanism for synchronizing Tableau Group membership from the database (using the REST API, but with some type of mapping from the database).
Unlimited levels of hierarchy
The standard pattern from representing hierarchy in a relational database actually is through a long table with only two columns: User ID and User’s Manager’s User ID. You can JOIN this table to itself to create the flattened view of the hierarchy.
Common Table Expressions (CTEs)
Rather than hand write out each self JOIN (one for each level of hierarchy), many systems now allow for Common Table Expressions (CTEs) which can recursively JOIN until the last level has been joined. Tableau cannot deal with CTEs directly; you can’t even put them in as Custom SQL due to how VizQL uses the Custom SQL query. If you need to use a CTE to get to your flattened security table, you need to make it as a View. Following the earlier advice, you should just combine this CTE view with any other details you need to make a single View to join with your data table in Tableau.
Security Functions, Stored Procedures, etc.
As mentioned, different RDBMS systems have other ways of handling hierarchy calculations. SQL Server, for example, has a hierarchyID type and related functions for handling relationships defined this way. You may have defined some functions or stored procedures that do all of the security processing logic. To use them in Tableau, you’ll need to either use the Parameters with Stored Procedures method, or Initial SQL in Tableau 9.3.