Paginating via tabcmd for full exported PDFs

Note from 2017-12-27: You are probably better off looking to this newer post, as later versions of Tableau Server allow this from the UI if you set your workbooks up correctly. The information contained about PDFtk is still useful in tabcmd situations.

Tableau was designed to be interacted with on a computer, and the way it handles PDF export by default reflects the belief that long lists should be replaced with visual cues and filters. Sometimes you want a paper trail — not only that, sometimes it is a legal requirement.

There are plenty of documents out there discussing how to use tabcmd to export PDFs , but you’ll find that the PDFs are basically a snapshot of whatever a worksheet or dashboard looks like in its default state (even using –fullpdf flag on the export command won’t create pages for content that is not visible without scrolling). This post describes a solution to “page” through all of the content in a worksheet and one way to put it all together for export.

Paginating in Tableau

It’s fairly easy to make a paginated worksheet in Tableau. By this, I mean a worksheet that will only show N number of rows per “page”, where the page is a parameter you’ve created. To do this you’ll need two parameters, each set to Integer: Page_Number and Page_Size . Set Page_Number to 1 and Page_Size to 30 as a default. What is nice about this setup is that it is designed to be flexible.

Next, create a Ranking calculation. In the example, I call it [Rank] but I use the INDEX() function; you can just as easily use RANK(). All you want is a distinct number for each row you see.

Lastly, it only takes one calculation to make the pagination filter (mine is called Rank Filter

[Rank]>[Page_Length]*([Page_Number]-1)

AND

[Rank]<= ([Page_Length]*[Page_Number])

Now just drag the Rank Filter calculated field onto the Filter shelf and set it to True.

paginated worksheet

I’ve got the Parameter controls visible but when we publish we’ll hide them and change them programmatically via URL parameters. Here you can test and prove that your pagination is working correctly.

Hide the parameter controls and publish your workbook up to Tableau Server.

URL Parameters for Programmatic Filtering

Sometimes language fails us, and it definitely does in the case of Filters, Parameters and URL Parameters. In Tableau, there are two separate features called Filters and Parameters. Filters are linked to a field in a datasource, while Parameters are stored at a workbook level and their values are input independently from the data. There is also a feature of Tableau Server which is the ability to use URL Parameters to set the values of both Tableau Filters and Parameters. The format for setting both Tableau features looks exactly the same in the URL — if I have a Field named [Product Name] and a Parameter named [Product Line], the URL to filter that view will look like:

http://{tableauserver}/#/views/{Workbook}/{ViewName}?Product Name=Tableau Desktop&Product Line=Software

When we say “URL Parameters”, we mean this ability, regardless of whether you are setting a Filter or a Parameter in the workbook.

Since we can set the value of Parameters using URL Parameters, we can easily walk through all of the “pages” in our view programmatically — we just count up with …/ViewName?Page_Number=

Creating PDFs Via TABCMD

A “hidden” feature of Tableau Server is that all you have to do to get a PDF is to add .pdf at the end of the view name to have a PDF of the worksheet or dashboard returned, taking into account any URL Parameters. So …/ViewName.pdf?Page_Number=3 will give back the 3rd “page” in a PDF form. You can do this in your browser, but for real exports, you’ll want to use tabcmd.

There are two tabcmd commands that get an export: get and export. This method uses get , looking something like this:

tabcmd get “views/PaginationExample/Pagination.pdf?Page_Number=2&Page_Length=25” -f “page_2.pdf”

PUTTING the PDFs Together

At this point, you’ve probably figured out where this is going — we’re going to loop through and get each page. Two challenges remain: (1) Knowing when you’ve reached the last page to stop requesting more (2) Putting each page together

There’s no way via tabcmd to ask for a particular value from a workbook, so even if we had a calculation to determine the total number of pages, it won’t do us any good. A different method is necessary. When you go to a Page_Number that is beyond the end of the data, tabcmd will return back a completely blank PDF page. This blank page has a very small file size — in my testing it was consistently 1.4KB, but the page size may vary. In comparison, a sheet with any actual information is upwards of 100KB in size. To find when to stop, all that is necessary is looking at the file size of each returned PDF, and if it is small enough, assume that was the first blank page.

Now all that is left is a method for putting all these PDFs together. Here is where I say “this is not supported or recommended by Tableau” — it’s just something I discovered that might help you make your process work. PDFtk Server (https://www.pdflabs.com/tools/pdftk-server/) is a GNU (free as in freedom) command line tool that can concatenate PDF files. The format for the command is

pdftk page_1.pdf page_2.pdf page_3.pdf cat output multi_page.pdf

You can put as many pages in the argument list in the front, and it puts them all together in that order. There are a lot of other functionality in PDFtk but this is all you need at minimum.

Putting All Together

If you know any type of scripting language, it shouldn’t be too hard to make a script that loops through and does all these steps. Not everyone has access to a scripting language, but everyone with Windows can use a batch script. If you save the following as “tableau_batch_pdf.bat” and put it in your Tableau Server bin directory so it can access tabcmd, you can make this work like

tableau_batch_pdf.bat %viewLocation% %final_filename% %page_size%

ex.

tableau_batch_pdf.bat PaginationExample/Pagination paginated.pdf 25


@echo off
:: initial strings from arguments
set view_url=views/%1%
set page_length=Page_Length=%3%
set filename=%2%
set username=
set password=
tabcmd login -s http://localhost -u %username% -p %password%
:: Make loop counter
set /a i=0
:: Start building the eventual pdftk command string
set pdftk_cmd=pdftk
:loop
set /a i=%i%+1
set page_num=Page_Number=%i%
set temp_file=page_%i%.pdf

set full_cmd=tabcmd get "%view_url%.pdf?%page_num%&%page_length%" -f "%temp_file%"
%full_cmd%
for /f %%A in ("%temp_file%") do set size=%%~zA
:: Check if the size of the file is small enough to be blank, if so break out of the loop and finish
if %size% leq 4000 goto :finish
set pdftk_cmd=%pdftk_cmd% %temp_file%
goto :loop
:finish
echo All done at %i%
set pdftk_cmd=%pdftk_cmd% cat output %filename%
%pdftk_cmd%

:: Cleanup
set /a j=1
:cleanup
set temp_file=page_%j%.pdf
del %temp_file%
set /a j=%j%+1
if %i% gtr %j% goto :cleanup_end
goto :cleanup
:cleanup_end
echo Cleanup finished

Advertisements

3 comments

  1. Hello,
    Great post on export to PDF.
    How can we export dashboard programmatically to multiple *.twbx files (Tableau Packaged Workbook with extracts) based on source dashboard filters.
    Let’s say original dashboard has 20 values in filter, so I need to create 20 (similar as original dashboards) *.twbx files based on filter value and name the target *.twbx files with name of the filter value.
    Thank you

    Like

    1. There’s no real way to do this programmatically, because applying a filter programmatically doesn’t actually filter the TDE itself. You can take a new extract from the original extract in Desktop, applying a filter on each one, which would create the individualized extracts. It might be possible to use the techniques from the post on packaged workbooks and the REST API to do something to create unique extracts based on a parameter, but I haven’t looked into this to test it out.

      Like

  2. Great write-up and nice solution!

    Also, one of the best blog posts I’ve seen in a long time. Easy and clear instructions, explanations, and actually useful code anyone can use. Excellent job!

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s