Attribute-Based Row Level Security in Tableau, a Working Technique

Tableau has the most flexibility for doing Row Level Security filtering using what is termed the “Standard Method”: a JOIN between a Data Table and an Entitlements table. The basic assumption, as seen in the RLS Whitepaper, is that this Entitlements table exists in the same database at the Data Table. This is a very narrow assumption, that is only true for a small subset of real deployments.

In many environments and applications, there is no “Entitlements” table within the same logical database as the Data table. In fact, there may be no “Entitlements table” that can be reached at all — entitlements can be retrieved somehow, but they may be in any number of forms.

The most common request I see is to pass a set of Attributes (lists or arrays) of values directly into a Tableau report at run-time. The rest of the post will describe a flexible and generic method for making this possible in a secure way. It builds upon the baseline of the PostgreSQL/Python environment in the using a Web Service as a Data Source post, but is even simpler.

Intro to Solution

I already explored decoding a JWT using a function in the post Passing a JWT as a Tableau Parameter , but there we still worked under the assumption that you would be getting back a single value and using it to filter data within the same PostgreSQL environment. However:

If a Function returns back as a TABLE, Tableau can use it as a TABLE through Custom SQL, and thus it can be used in a Cross Database JOIN (Multiple Connections in a Single Data Source)

As long as our function returns back a valid table, we can encapsulate (1) any decryption needed to securely pass the values as a Tableau Parameter (see here for why this is necessary) (2) any transformation necessary to return back a “table of entitlements”.

Put simply:

  1. Send in a JWT through a Tableau Parameter to a PostgreSQL function
  2. Get back a Table
  3. Use that Table against ANY source via Cross-Database JOIN
  4. Perform RLS using the “Standard Method” of JOINS (either Deepest Entitlements or Sparse, both work)

What is PostgreSQL doing, exactly?

Tableau needs to think it has a Live Connection to a real database for a Live Cross-Database JOIN. You could implement the same thing in MySQL or MariaDB or any other free database that Tableau has a native connection to — but example uses the existing extension of the standard Docker PostgreSQL 10 image to quickly spawn a PostgreSQL with Python 3 as a functional language. If you follow the instructions here, you can get a PostgreSQL server with the necessary packages up and going very quickly.

We know Python 3 is (1) fast (2) Unicode compliant (3) has good JWT libraries (4) is reasonably easy to integrate within a PostgreSQL function.

But if you are thinking “PostgreSQL” seems like overkill here, you aren’t “wrong”: we’re just using it to host a data transformation function, and consistently return it back as a table. It’s a workaround, of which you are certainly welcome to describe in other terms if that makes you feel good. It’s fast, free, and scalable, so in the realm of workarounds is about us low cost/effort as possible.

Note: You need to have AT LEAST ONE Table defined in the PostgreSQL database for the Custom SQL option to appear in Tableau Desktop when setting up the data source.

This table does not need to have any data, and can be one single column, but the user you are connecting to in Tableau needs to have QUERY rights on it.

Flexibly Handling a JWT

The example function is based on a very simple idealized JSON object, where the keys at the lowest level contain 1-dimensional arrays of entitlements:

{ 
'iat' : '202006031032',
'segment' : ['Home', 'Business'],
'region' : ['East', 'West'],
'item_id': [12312031203, 198002992, 19302, 19019191]
}

You will know best what you are passing through, so just remember to adjust your logic to pull the set of things you want out of your entitlements object.

Next is going to be the PL/Python Function we define. It is designed to be flexible, so that one function can handle a wide variety of inputs:

CREATE OR REPLACE FUNCTION
jwt_to_table(jwt_token text, attribute_keys text)
RETURNS SETOF record
AS $$

import datetime
import itertools
import jwt

jwt_secret = 'between-you-and-me-keep-this-a-secret'
decoded_value = jwt.decode(jwt=jwt_token, key=jwt_secret, algorithms='HS256')

# Split attribute_keys
split_delimiter = ","
attributes = attribute_keys.split(split_delimiter)

# Or maybe throw an error?
if len(attributes) == 0:
    return None
# Determine what we should spit back

attribute_lists = []
for attribute_key in attributes:
    # Remove any whitespace
    a_key = attribute_key.strip()
    if a_key in decoded_value:
        attribute_lists.append(decoded_value[a_key])
    # Throw error if there is a key that is passed in that doesn't exist in the JWT
    # Check for a valid response, if not return no rows
    # You could instead through an exception here (https://www.postgresql.org/docs/10/plpython-util.html)
    else:
        return []

# Replace any empty attribute list with a  [None, ] to be multiplied in the cross product
# You might instead drop that attribute entirely from the cross product
for att_list in attribute_lists:
    if len(att_list) == 0:
        att_list = [None, ]

cartesian_prod = itertools.product(*attribute_lists)
return list(cartesian_prod)

$$ LANGUAGE plpython3u;

How to use the Function

The jwt_to_table function has two arguments: (1) a string of your JWT (2) a comma-delimited string of JSON key names.

The second attribute lets you say what keys out of the JWT you want to pull back as attributes, and in what order.

If you just want a single column of one of the attributes you would just pass:

'item_id'

Or you could get the full product of all of them:

'segment, region, item_id'

How do we deal with the different potential returns? You’ll notice that the return of the function is:

RETURNS SETOF record

What this means is that the function itself isn’t telling you what exactly it returns. In PostgreSQL, you can use a function with SETOF as a return by declaring how you want it interpreted when you do the query , using an AS clause, then defining the columns like you would in a TABLE definition:

SELECT * FROM jwt_to_table(, 'segment,region') AS (segment_att text, region_att text)

or

SELECT * FROM jwt_to_table(, 'segment, item_id') AS (segment_att text, item_id_att int)

You do have to match the data type you expect back (text or int or possibly some other type), which means you need some knowledge of what your entitlements data types are. These are going to be filter keys, so most likely they are just strings or integers.

Tying Together in Tableau

The JWT itself will be passed through as a Tableau Parameter. In the examples and screenshots, you’ll see that parameter called “JWT Input”, but you would really name it with a single character like l or z , to obfuscate and minimize.

The first connection you will make is to your data – whether that is a text file or a connection to another database. Then you’ll press Add next to Connections, and connect to the PostgreSQL database you set up with the jwt_to_table function. As mentioned before, you need at least one actual table available, and then you can drag the New Custom SQL element from the Left into the JOIN pane.

Now you can put the function call, with whatever values you want to pull out:

RLS Function Custom SQL

Deepest Granularity vs. Sparse Entitlements

There’s a great discussion of this in the Whitepaper, but you have two basic choices with JOIN-based RLS:

  • Supply an entitlement row with every combination of attributes that a user could be entitled to (Deepest Granularity)
  • Allow NULL values to represent an “all”, so that you can define entitlements at higher level of the hierarchy which bring in anything below them as a wildcard (Sparse Entitlements)

The example Python function code above actually creates a Cartesian product of the Lists/Arrays of attributes, which would be considered Deepest Granularity, but with a special case if there is an empty List/Array, which adds None (in Python, which translates to NULL in SQL:

# Replace any empty attribute list with a [None, ] to be multiplied in the cross product 
# You might instead drop that attribute entirely from the cross product 
for att_list in attribute_lists: if len(att_list) == 0: 
    att_list = [None, ] 
cartesian_prod = itertools.product(*attribute_lists) 
return list(cartesian_prod)

Examples and how this plays out

Let’s say our attributes object looks like:

{ 
'iat' : '202006031032', 
'segment' : ['Home', 'Business'], 
'region' : [], 
'item_id': [12312031203, 198002992, 19302, 19019191] 
}

If we use the function like:

SELECT * FROM jwt_to_table('{JWT_Value}', 'segment') AS (segment_att text)

We will get back a single column (no surprises here) with 2 rows.

If instead we ask for

SELECT * FROM jwt_to_table('{JWT_Value}', 'segment, region') AS (segment_att text, region_att text)

The resulting table will still only have 2 rows, but the table will have a ‘Region’ column as well. All the values in that column will be NULL.

If we select instead:

SELECT * FROM jwt_to_table('{JWT_Value}', 'segment, item_id') AS (segment_att text, item_id_att int)

We’ll get back a table with 8 rows, the full Cartesian product with a row with Home for every item_id, and Business for every item_id.

Choosing Your JOIN type

If you have Deepest Granularity entitlements, you can just do an INNER JOIN and that’s that! But you are saying “I cannot possibly have any NULLs”, because an INNER JOIN will result in NOTHING if there is a NULL.

If you are using Sparse Entitlements, you need to instead make a JOIN calculation on both Tables in Tableau. Each calculation simply has the value of ‘1’, but this creates a theoretical Cartesian JOIN in SQL as well. Don’t worry, we’ll limit it down without getting the full cross-product.

Each field you want to limit on should have a calculation that resembles:

IF ISMEMBEROF('Admins') OR ISMEMBEROF('Other Escaped Group') 
    THEN 1
ELSEIF 
[Region] = [region_att] OR ISNULL([region_att])
    THEN 1
ELSE 0
END

The core part is the OR combination, which will pass a row if there is a NULL, or if there is a match.

You’ll then put any of these security calculations on the data source as Data Source Filters, with their value set to 1.

Passing Full Rows (Tuples) of Entitlements

If your entitlement service already passes the equivalent of a table of entitlements (particularly if they resemble Sparse Entitlements), then you can set the function up to just convert those tuples directly back into list object that returns as a table:

{ 
'iat' : '202006031032', 
'entitlements' : [ 
     ['Home', none], 
     ['Business', 'West']
]
}

or

{ 
'iat' : '202006031032', 
'entitlements' : [ 
{'segment': 'Home', 'region': none}, 
{'segment' : 'Business', 'region' : 'West'}
]
}

are both easily processed into a List response (the first one you could just take the ‘entitlements’ key directly and return it, while the second needs a little reprocessing.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s