Tableau has had UNION for Excel and text files for several versions now, but Tableau 10.1 delivers a huge expansion of the feature with Wildcard Pattern-Based UNION. Now, you can set up a match pattern to bring in all the files you have into the UNION automatically.
To see how it works, check out the ever excellent Tableau Online Help.
Not only is this an immensely helpful feature for saving time the first time you are doing analysis, it also can save a ton of development time and effort when you are dealing with data sources that aren’t already in a database.
Effective Alternative to Web Data Connector (WDC)
When you have data that is not already in a database, that you’d like to continually update, you have a few options for getting it in to Tableau:
- Build an import process into a database, then connect Tableau to the database.
- Build a Web Data Connector that can pull the data into a Tableau Extract
- Use the Tableau Extract SDK to convert the source into a Tableau Extract
- Export to CSV and connect Tableau to the CSV
The first three options present an major issue, even if you know how to do them technically. They are all fixed and hard-coded, requiring you to know the structure of the data ahead of time. For what I’ll call “arbitrary” data sources, where the structure isn’t known ahead of time, the vast majority of your time and effort for any of the first three options will be on classifying the data types and figuring out how to get that to load in correctly to the database or the Tableau Extract.
Part of the magic of Tableau Desktop and Server is that they can read a CSV or Excel file and determine what type of data is in each column, and handle new columns, and do some cleanup on the front end through the drag-and-drop interface. If you can get a data source into a tabular format and store it in CSV, Tableau can read it and create an Extract in an efficient manner.
With Wildcard Pattern Based Union, you can set up a process where any number of similar CSVs can be UNIONed automatically.
Here’s what you need:
- A network-available folder to put the files. This can be a local location on your Tableau Server machine, but more likely would be on network storage somewhere. You want both Tableau Desktop on your local machine, and Tableau Server to be able to access this folder.
- This means don’t use your local drive letter, like H:\ or Z:\, but instead use a network file path like \\fileserver01\tableau_files\ . This will let you publish to Tableau Server in a way that it can connect correctly.
- Connect via the Text File connection. This is where you need to be sure to access using the network file location I just mentioned.
- You have to choose at least one file. Choose the very first one.
- Now drag the New Union option over the table
4. Now you’ll see the new UNION menu. Choose “Wildcard (automatic)”. Then we’ll set up our pattern. Remember, you do need to use the * wildcard symbol where you want to make your matches. There are also options for expanding the search to sub-folders.
5. After this, I go into Sheet 1 and make an Extract. I like forcing text files to make an Extract, so that Tableau doesn’t recreate the shadow extract for the live text connection at random times. There’s nothing really wrong with leaving it a live connection either.
6. Now publish the data source or the workbook to your Tableau Server.
7. Add some more CSV files that follow your naming pattern to the network location.
8. Go to Tableau Server, find your published data source or workbook and refresh the extract.
9. Open up a workbook that connects to the data source. You may need to hit the Refresh option in the toolbar. See the new data from your text file appear.
How does this help with Web APIs?
If you can get information out of any system in a tabular format (rows and columns) and write it to a text file, you can use this technique to create ever-updating extracts. Many systems already have a CSV export option — if you can schedule this download, you are done.Even if you have to do that manually, it’s easily doable. If a file is available at a web address, you can set up an automatic process to download it (here’s a nice set of examples).
Other systems may still require a little bit work to bring the data down into a relational format. If you have an API that returns object-oriented JSON, you’ll have to transform it into rows and columns regardless of which method you use. Writing that output directly to CSV files which are picked up by Tableau Server automatically is far simpler than mastering the web data connector or the Tableau SDK, and you can code your transformation script in any language you prefer.
Tableau 10.1 also adds a text JSON connector. While the JSON connector does not have wildcard UNION yet, there’s a reason Tableau allows customers to submit and vote on Ideas on the Community Forum…