Vertica Performance in Tableau & ODBC Customization

I’ve been working with several customers lately who are using Tableau with Vertica. The Vertica team was kind enough to provide their guide to configuring for Tableau performance with Vertica, which is essential reading whether you are getting started or trying to troubleshoot. We also modified how Tableau connects to Vertica and which queries it send via a TDC file, which you can read more about…

The initial reason for modifying the connection was that on a large enough data set, some queries were resulting in the error

[Vertica][VerticaDSII] (10) An error occurred during query preparation: Multiple commands cannot be active on the same connection. Consider increasing ResultBufferSize or fetching all results before initiating another command.

The guidance from Vertica was to add an ODBC Connect String parameter with the value “ResultBufferSize=0“. This apparently forces the result buffer to be unlimited, preventing the error. This is simple enough to accomplish when building a connection string manually or working with a DSN, but Vertica is one of Tableau’s native connectors. So how do you tell the native connector to do something else with its connection?

Native Connections in Tableau can be customized using TDC files

“Native connectors” still connect through the vendor’s ODBC drivers, and can be customized just the same as an “Other Databases” / ODBC connection. In the TDC files themselves, “ODBC” connections are referred to as “Generic ODBC”, which is a much more accurate way to think about the difference.

The full guide to TDC customizations, with all of the options, is available here although it is pretty dense reading. One thing that isn’t provided is an example of customizing a “native connector”. The basic structure of a TDC file is this


<?xml version='1.0' encoding='utf-8' ?>
<connection-customization class='genericodbc' enabled='true' version='7.7'>
<vendor name='' />
<driver name='' />
<customizations>

</customizations>
</connection-customization>

When using “Generic ODBC”, the class is “genericodbc” and then the vendor and driver name must be specified so that Tableau can know when the TDC file should be applied. It’s much simpler for a native connector — you just use the native connector name in all three places. The big list of native connector names is at the end of this article. Luckily for us, Vertica is simply referred to as “vertica”. So our Vertica TDC framework will look like:


<?xml version='1.0' encoding='utf-8' ?>
<connection-customization class='vertica' enabled='true' version='7.7'>
<vendor name='vertica' />
<driver name='vertica' />
<customizations>

</customizations>
</connection-customization>

This is a good start, but we need some actual customization tags to cause anything to happen. Per the documentation, to add additional elements to the ODBC connection string, we use a tag named ‘odbc-connect-string-extras‘. This would look like


<customization name='odbc-connect-string-extras' value='ResultBufferSize=0;' />

One important thing we discovered was that all ODBC connection extras need to be in this single tag. Because we wanted to turn on load balancing in the Vertica cluster, there was a second parameter recommended: ConnectionLoadBalance=1. To get both of these parameters in place, the correct way method is


<customization name='odbc-connect-string-extras' value='ResultBufferSize=0;ConnectionLoadBalance=1;' />

There are a whole set of other customizations you can put in to place to see how they affect performance. Make sure you understand the way the customization option is worded — if it starts with ‘SUPRESS’ then giving a ‘yes’ value will¬†turn off the feature; other times you want to set the value to ‘no’ to turn off the feature. Some of the other ones we tried were


<customization name='CAP_SUPPRESS_DISCOVERY_QUERIES' value='yes' />
<customization name='CAP_ODBC_METADATA_SUPPRESS_PREPARED_QUERY' value='yes' />
<customization name='CAP_ODBC_METADATA_SUPPRESS_SELECT_STAR' value='yes' />
<customization name='CAP_ODBC_METADATA_SUPPRESS_EXECUTED_QUERY' value='yes' />
<customization name='CAP_ODBC_METADATA_SUPRESS_SQLSTATISTICS_API' value='yes' />
<customization name= 'CAP_CREATE_TEMP_TABLES' value='no' />
<customization name= 'CAP_SELECT_INTO' value='no' />
<customization name= 'CAP_SELECT_TOP_INTO' value='no' />

The first set were mostly about reducing the number of queries for metadata detection, while the second set tell Tableau not to use TEMP tables.

The best way to see the results of these customizations is to change the TDC file and restart Tableau Desktop Once you are satisfied with the changes, then move the TDC file to your Tableau Server and restart it.

Where to put the TDC files

Per the documentation ”

  • For Tableau Desktop on Windows: Documents\My Tableau Repository\Datasources
  • For Tableau Server: Program Files\Tableau\Tableau Server\<version>\bin

Note: The file must be saved using a .tdc extension, but the name does not matter.”

If you are running a Tableau Server cluster, the .tdc file must be placed on every worker node in the bin folder so that the vizqlserver process can find it. I’ve also highlighted the biggest issue of all — you should edit these using a real text editor like Notepad++ or SublimeText rather than Notepad, because Notepad likes to save things with a hidden .TXT ending, and the TDC file will only be recognized if the ending is really .tdc, not .tdc.txt.

Advertisements

One comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s