Why Vizioneer?

My Photo
Atlanta, Georgia, United States
The "Vizioneer" comes from mashing two words that have shaped my world for most of my adult life - Engineer and [data] Visualizations (or Vizes to those who know what's up). Graduating from first from Oglethorpe University in Atlanta, followed by Georgia Tech with my Bachelors and Masters in Civil Engineering, all of which taught me to think through anything and everything - problem solving, "engineering" solutions, teaching to the "ah ha" moments - is what I love to do. In 2010 that investigative, engineering mindset intersected a job change and a plunge into the world of Data Analysis. In the search for the next great thing I stumbled on to a data visualization and dashboarding product called Tableau software and things just took off. So now I guess you could call me that engineer with the sweet data visualizations - or just "The Vizioneer" :)

In 2013, I joined the incredible team at Slalom, focusing on Tableau and it's been an amazing experience. Recently in 2014, I was honored and humbled to receive Tableau's highest recognition of being named a Tableau Zen Master. Follow along to see what happens next :)

Tuesday, June 16, 2015

Clickable Drillable Cascading Dynamic Parameters in Tableau

Clickable, Drillable Cascading Dynamic Parameters in Tableau




Dynamic parameters are a well-documented demand in the Tableau community. The reasons for this are numerous but most of all simply goes back to better enabling advanced analysis with a better, more intuitive experience for end users. My experience is no different.  It’s not a problem that I run into often, but when I run into it, it is a seemingly endless brick wall between the analysis that I can do and the analysis that I want to do.

Now the solution that follows solves a fairly specific problem (which we’ll talk about) and is not intended to be an end-all be-all solve for dynamic parameters (we still need Tableau on that one).  But what I am going to show you is extremely powerful in the right circumstance - and the funny thing is that it’s been right in front of our faces all this time. 

The solution comes from an actual client problem that was a bit of a brainteaser for a few hours. Before I go any further, a big thanks to both Chris Toomey and Allan Walker of Slalom for allowing me to bounce some ideas off of them, even though I went in a different direction with this solution. 

Here’s the situation that were solving for in the analysis that we want to create:

Working with a large organization with a five or six step hierarchy, some steps have many hundreds of members, and the hierarchy cascades very well (ie walking down a step drastically reduces the pertinent data).  The analysis I want to show is a year-over-year percent change metric (or maybe a few metrics – sales, profit, cost, etc…), where you can see both the parent and its relevant children over time AND where the user can either select a child to drill into (making it a new parent and displaying its children, one level down in the hierarchy) or clicking and giving the ability to walk one step back up the hierarchy. Oh, and the client JUST upgraded to Tableau Server 8.2 (gasp) - so a level of detail calculation is out of scope on this one.

Did I lose you?  They say a picture is worth 1000 words, so in terms of superstore data here’s an idea what I’m talking about:



As you can see on the left we’re showing year-over-year change in sales by two different levels of detail: the thick gray line represents all of superstore, while the three thinner trend lines represent each of the different departments that make up superstore (Technology, Furniture, and Office Supplies). Here’s why this is visually so important - the parent is giving context to the children, so if you can understand that a parent is down 10% year-over-year, you’ll actually have admiration for a child that was only off 5% over the same period of time. Similarly a child that is up 20% year-over-year doesn’t look quite so good when you realize the parent was up 40% year-over-year.  Seeing both levels of detail is extraordinarily powerful visual analytics especially in very hierarchical organizations.

Now creating these views in Tableau at each level of the hierarchy is not especially difficult - what is extraordinarily difficult is creating the interactive flow and filtering between these different levels of detail in an intuitive and excellent user experience.  So the challenge became how can we dive in and out of a piece of the hierarchy without robbing users of a great experience? I’m glad you asked…

Let’s get a couple questions out of the way. 

Is publishing to Tableau Server or Tableau Public required for this solution?
Yes, actually it is.  If your organization doesn’t have Tableau Server and you can’t publish your data to Tableau Public, this isn’t going to work for you.  This hack takes advantage of a trick with URL parameters, which means it will only work after the workbook has been published.  We’ll talk about this more in a bit.

Did you say something about dynamic parameters?
I did and here’s why - we need to use parameters because we need to know what level of the hierarchy we need to be on (are we drilling in or going back out?).  We also need to use parameters because we need to hold each of the values that we’ve drilled into the hierarchy (we’re basically creating a breadcrumb).  The easiest way to understand what I’m talking about is by drilling into the viz below.  The simple viz shows how you can leverage logic based off of parameters to select the correct sheet/level of the hierarchy as you drill in.



Note: I’m going to show two different solutions to solve the same problem.  This first solution (above) is the lesser of the two as it has the most ‘gotchas’ and by far the worst user experience - parameters don’t cascade (or ’show relevant values’ as quick filters can) so by the time you make your second selection only 1/3 of your choices will actually produce another visualization. Imagine if each step had dozens of options, they would be beyond frustrating to get three levels into your hierarchy. Also from a user experience, if I want to drill into something that interests me I want to do it by clicking on or hovering over that piece of data.  Having to select where to drill via a drop-down takes me out of the analytic ‘flow’ that Tableau wants me to be in.
But it totally works - if you have the patience for it. But I’m here to save your patience by giving you another solution via…

DYNAMIC PARAMETERS!!!

This is actually shockingly simple, so don’t blink as you might miss it.  It actually comes from knowing a little bit about how URL parameters work.  You may have used these before to create some customize links, or if you have ever written a blog or embedded a dashboard inside of a webpage you tend to learn a thing or two on this.  And Tableau’s documentation is very clear on how these work:

Here we learn first about how we can set parameters in our workbook by putting that value into the URL, like so (parameters and parameter values in purple):

http://servername/views/workbook/sheet?param1=value1,value2&param2=value

and just a little bit farther down from there we come to find out that we can also set field values from our data or calculated fields in a similar way (calc fields and field values in orange):

http://<servername>/views/CoffeeSales/SalesPerformance?Product%20Type=Espresso&Market=East

These are both very interesting pieces of information but neither is extraordinarily groundbreaking. The question that I’ve never seen anyone ask is what happens if I take a parameter value and set equal to a field value - from the examples above something like this:

http://<servername>/views/CoffeeSales/SalesPerformance?param1=Espresso&param2=East

By Joe that’s crazy! If only there was an easy way to do that :) Well of course there is - it’s our dear friend the URL action. You see making a URL action with a link that looks like the following would actually create the URL you saw above:

http://<servername>/views/CoffeeSales/SalesPerformance?param1=<Product Type>&param2=<Market>

By the way - if you’ve always wondered what the checkbox on URL actions that says ‘URL Encode Data Values’ was all about, now is your time to find out, because you definitely want to check the box when you go with this solution.  It’s what handles funny characters like spaces when you have to put them into a URL.  Here’s some more details if you’re curious:

So what we’re saying is you can dynamically set the value of the parameter by clicking on point in the data and using a URL action to set the parameter equal to the point in the data.  They are dynamic because you set their values when you click on them.  They are cascading because of the filter logic you’re using on each sheet that shows the relevant data for the current level of the hierarchy.  And they’re drillable both in and out (we’ll show you how in a bit).  Thus, ladies and gentlemen – I give you:

CLICKABLE, DRILLABLE CASCADING DYNAMIC PARAMETERS IN TABLEAU!!!


I know - where has this been all my life?

I’m now going to walk you through how I built a second, more elegant solution which leverages this URL parameter hack using super store sales data. One big thing to note before we get started – In the first solution we created two parameters that were both LISTS of strings based on the values of Department and Category from the data.  In the solution that follows (and in your solution) I recommend blank string parameters that can be set to ANY value to accomplish the functionality we’re going to discuss here. The advantage is that, of course as your data changes there is no need to update some list of parameters in your workbook.  Also note that you don’t need to create a parameter for the lowest level of the hierarchy, which in my example is Item (Remember SuperStore is Department>>Category>>Item).

The How-To

To begin we need to create three sheets, one at each of the hierarchy used that we care about. Remember the goal of the view the hierarchy is to show two levels at a time - a single parent and all of its associated children. Also note that in each example were looking at a year-over-year change (by quarters if you’re following along with me). I’ll show you how to make the first one, and the other two you can make following the exact same methodology.

First we’ll create our year-over-year metric - it’s a Table Calc using the lookup function. We’re going to compare the current value to the one from four quarters prior and take the percent difference:


Once that calc field is created it’s time to make the first view – this will be Y/Y change in Sales (by quarter and Table Calc using the default of Table Across (Quarter[Order Date])) at the first two levels of detail:  Total SuperStore (one thick grey line) and by Department (three colorful thin lines).  Leveraging a dual axis (don’t forget to sync them) at those two different levels of detail and it should look something like this:


Note the two different levels of detail in each of the dual axis:



Got it? Perfect.  You’re done creating the first level.  Add in any additional formatting/tool tip cleanup and then duplicate the sheet. I’ve called this first sheet ‘1LevelBlank’ so that I know it’s the first level in my drill down.

In the duplicated sheet (‘2LevelBlank’) drill down one level of detail on both of the dual axis – meaning the axis that showed Total Superstore now shows the level of Department, and the axis that originally Department now goes to Category.  Now given that we haven’t talked about filtering yet, you’re looking at about 20 lines (3 Departments and 17 Categories).   It’s a mess.  Don’t worry, we’ll fix it.  Go ahead and duplicate it one more time (‘3LevelBlank’), this time taking the two different levels of detail down to Category and Item (the lowest level for our example).  If you thought the 3 Departments and 17 Categories were a mess, then this is an absolute disaster.  Again, don’t worry, we’re about to fix it. Now that our three sheets are created, let’s go ahead and create our parameters and filters.

Parameters and Filtering

As noted earlier, the easiest thing to do is to create two blank string parameters and set their default values to be blank.  In this example, I’ve named the two parameters: 1DeptBlankParam and 2CategoryBlankParam.  Here’s an example:



Creating the filtering is really simple as well. We’re simply going to match the parameters we created above up with the fields from the data.  Here’s an example:




Now, the first sheet, ‘1LevelBlank’, doesn’t require any filtering, so let’s head to the sheet on the second level, ‘2LevelBlank’.  On the filter shelf add the filter we just created where [1DeptBlankParam]=[Department].  Because the parameter is blank, set the filter to exclude False (ends up doing the same as keeping only True).



Go to the last sheet, ‘3LevelBlank’, and add the filter for [2CategoryBlankParam]=[Category].  Filter it the same as before, excluding False.  When adding the filter on sheets 2 and 3, they should disappear – if they don’t it’s because your blank parameters aren’t really blank.  To test if you’ve done everything right, show the ‘1DeptBlankParam’ parameter and type in ‘Technology’ – you should see some lines show up on the 2LevelBlank Sheet.  If that worked, type in ‘Copiers and Fax’ in to the ‘2CategoryBlankParam’ parameter – and you should see some lines show up on the 3LevelBlank Sheet.  Once you’ve both successfully, clear both parameters and move on to the hack that brings it all together.

The Hack

Now, you’re all probably very familiar with the fact that you can create actions on dashboards.  Should this be news to you, please go and check out Peter Gilk’s great post on Dashboard Actions http://paintbynumbersblog.blogspot.com/2014/10/a-rough-guide-to-tableau-dashboard.html - well done and rather comprehensive.  The interesting thing that most people don’t know is that basically all of these same actions dashboard actions can be done in the worksheet view.  Since we’re all about trying new things, we’ll take this road less traveled and add some worksheet actions.

We’ve already discussed that URL Actions are going to play a key role, so on the first sheet, 1LevelBlank, go up to the top and select Worksheet>>Actions



You’ll be greeted by a very similar action menu that should look very familiar.  Click Add Action>>URL.  This is where we get fancy.  Since we’re going to create a ‘run on menu’ action, the name of the action is what will show up at the bottom of our tooltip.  In the name field, type: ‘Drill in to ’ and then click on the arrow at the end.  Here, select ‘Department’ which will show the value of the Department you’re hovered over.   

Select only the current sheet in the check boxes, 1LevelBlank.

Head to the URL tab and paste in the following (we’ll dissect it next):

http://public.tableau.com/views/DynamicParameterURL/2LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=<Department>&2CategoryBlankParam=

There’s a lot going on here so let’s talk about each piece so that you understand it.  Understanding all of this is very important because this is how you make it both clickable and dynamic so that everything works.

Let’s start at the beginning:
http://public.tableau.com/views/DynamicParameterURL/2LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=<Department>&2CategoryBlankParam=

This is nothing more than the URL address of sheet ‘2LevelBlank’ on the server.  Remember, you’ll be on the highest level of the hierarchy on sheet ‘1LevelBlank’, so this is going to move you to a different sheet one level down in the hierarchy.  The easiest way to get this URL right is to publish the workbook (either to Tableau Server or to Tableau Public, like I’m doing here).  The format is clear:

http://<server>/views/<workbook>/<sheet>

So you should be able to plug in your values and go.

Next up are some very important URL parameters:

http://public.tableau.com/views/DynamicParameterURL/2LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=<Department>&2CategoryBlankParam=

They’re all specified here: http://onlinehelp.tableau.com/current/server/en-us/help.htm#embed_list.htm

The ? means we’re going to add some details (parameters) to the URL. The : that follows it means that the detail is for how to render on the server (without the :, shows that we’re setting a value in the workbook either as a filter or parameter – which we’ll do at the end of this).

:showVizHome=no

This gets rid of the header at the top of the viz (that is now rather large in v9).  Here’s an explination by the famous Ben Jones of Tableau Public. https://public.tableau.com/s/blog/2014/03/how-fix-your-iframe

&

It means there something else I want to add to the URL.  Separate each ‘detail’ parameter in the URL with this ‘&’.

 :embed=y

‘Required for URL parameter. Hides the top navigation area, making the view blend into your web page better.’  When they say required, they really mean some of the other URL parameters won’t even work if this isn’t included. Obvious, but worth mentioning.

:tabs=no

Hides tabs, even if you tick the box in the publishing dialog to show the tabs (which you should do in this case).

:linktarget=_self

Specifies he target window name for external hyperlinks.  There’s two options here – ‘_self’ which we use here means open in the same window/tab; the other option ‘_blank’ would open the URL in a new window/tab. Using this will re-spin the viz in the same window/tab, which is the user experience we are looking for.

Now for setting the parameters in the filters:

http://public.tableau.com/views/DynamicParameterURL/2LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=<Department>&2CategoryBlankParam=

As we addressed back in the beginning, the trick that we are doing here is setting the parameter ‘1DeptBlankParam’ to be equal to the department that we click on in the viz.  Here’s what that looks like when we add in purple for the name of the parameter and orange for the field value from the data:

1DeptBlankParam=<Department>

In the last piece we are simply reminding Tableau that we want the second parameter to remain blank. You should now understand all of the pieces of the URL that we’ve created.

Lastly, to the box for ‘URL Encode Data Values’ - this should properly encode any spaces, commas, and other special characters that are in the data.  It is very important that you check this box.

Here’s what everything should look like:



Now, here’s what the drill in URL menu action for the next level down on sheet ‘2LevelBlank’ looks like – the only changes are in bold at the end:

http://public.tableau.com/views/DynamicParameterURL/3LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=<Parameters.1DeptBlankParam>&2CategoryBlankParam=<Category>



What we are doing here is going to a different sheet one step down in the hierarchy (‘3the Department parameter is holding the value that we gave it in the first drill down (we’re reminding it of the value we set it to in the first drill down), while we are setting the value for the category parameter based off the category that we’re clicking on in the viz (just as we did before with Department, one level up).  If you have additional levels in your hierarchy, you would simply continue creating URLs where the previous parameters would hold their values, the current level would set the value of the parameter, and lower levels would remain blank.

Great – now how do I go back up the hierarchy?

If you’re at the bottom level of the hierarch on sheet ‘3LevelBlank’ (which is where you’d be after the step above), then you’d want to do two things: 1) Go to the sheet one step above in the hierarchy (in this case ‘2LevelBlank’ and 2) Clear the parameter ‘2CategoryBlankParam’ while holding the value of the first parameter ‘1DeptBlankParam’.  All you have to do is create a URL that does exactly that:

http://public.tableau.com/views/DynamicParameterURL/2LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=<Parameters.1DeptBlankParam>&2CategoryBlankParam=

Here’s what mine looks like:



If once you’re there and you want to return all the way to the top, do the same process.  You’ll end up with the following:

http://public.tableau.com/views/DynamicParameterURL/1LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=&2CategoryBlankParam=



We’re almost home.

Putting it all together

Now we could simple publish the tabs, and the interactions would pretty much work, but the user would be heading to a different URL with every click.  This isn’t a winning formula.  Instead, we’re going to create a dashboard and put the whole flow inside of it.  This way it’s easily accessible from Tableau Server.

It’s easier than you think – Create a new dashboard and name it whatever you want (mine is called DynamicBlankParamDrill).

Add a web page object and let it take up the entire dashboard. When prompted for the URL, put in the URL for the highest level, with blank parameters:

http://public.tableau.com/views/DynamicParameterURL/1LevelBlank?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=&2CategoryBlankParam=

Now, it’s time to publish and enjoy:




Things to note

First off, you’ve just created a dashboard with a dynamic parameter!  Wahoo!

Now we should talk.  If you’ve made this far and you understand everything we did then you’ve probably already realized some things:

1)      Using this method as outlined requires you to create a different sheet for every level of the hierarchy – This is less than optimal if there are many levels/layers in your organization’s hierarchy.
2)     You can only dynamically drill in and out of one sheet at a time.  This is because you can only send a URL to one web page object on a Dashboard. 
3)     If you’ve played around with the dashboard, you may have figured out that the dynamic setting of a parameter in the web page object doesn’t extend to other parts of the dashboard.  What I mean is that if you have the parameters on the dashboard and the web page embedded – the parameters are only being updated in the web page object, not outside of it.  Think of movies like Inception or The Matrix – The experience was inside, while the person existed outside.  This is no different – Even though I can show the parameters existing on the dashboard, the experience is inside the web object – and they don’t really tie together at all.




Now those seem like big limitations – and they are.  But you can actually overcome them all.  Here’s how:

Advanced Jedi Tricks

At the client where I put this together, I needed to not only be able to go up and down the hierarchy in a sheet, I needed to create a complex dashboard with multiple sheets.  I also had six levels to go up and down – so there was no way I was going to create six highly complicated dashboards, with all the actions and sheet placement, and then reference each one in the URL.  I would have gone insane.  Now that you understand the basic framework from above – and I walk through that basic example in detail so that this advanced use makes sense – I’ll talk about how I make this work on a complex Dashboard.

Parent and Child Dimensions, Filtering, and URL Fields

What you need to be able to do on the fly is to know – based on the parameters I have values for – what level of the hierarchy am I on?  If you know what level you should be on then you can set four very important things that drive everything:

1) The Parent Level
2) The Children Level
3) The Filter
4) The URL Fields

If you go this route and use logic based on the parameters, then you can create a single sheet that can set itself to the correct level of the hierarchy.

Basically we test the parameters to see if they’re blank.  To do this we use the LEN() function (returns a number as the length of a string based of the number of characters) and it looks like this:



Len([1DeptBlankParam])=0 and
Len([2CategoryBlankParam])=0
Now if this is true, we’re at the highest level.  Because there’s three levels to our hierarchy, we’d do this twice more:

Len([1DeptBlankParam])>0 and
Len([2CategoryBlankParam])=0

This puts us one level down.  And….

Len([1DeptBlankParam])>0 and
Len([2CategoryBlankParam])>0

Puts us at the bottom level.  Create each as its own calculated field – it will save you some time later.  If your hierarchy has more levels then build as many of these as you need.

Now we’re going to create Children Dimension – We’re basically going to say ‘when we’re at x-level, that means y-field is the correct spot in the hierarchy’. It’s actually very simple:



if [ShowOnlyBlankLevel1] then [Department]
elseif [ShowOnlyBlankLevel2] then [Category]
elseif [ShowOnlyBlankLevel3] then [Item]
end

The Parent Dim is just like it, except one level up on the hierarchy:



The last thing we need to do is build the filtering.  This is slightly tricky, but you’ll get it quickly:



Now, here’s the awesome thing – You can build as many sheets as you want!  Just make sure you use the Parent and Children Dimensions for your slicing and filter everything on the filter we just created.

When you bring everything in to a dashboard everything will be based on the parameters as they are set.  You’ll create dashboard actions (so that you can apply them to multiple sheets, rather than just one at a time) for the URL links. Note: Instead of referencing different sheets as you did before, you will reference this same dashboard view in the URL – only the Parameters in the URL will change.

This brings up to the last bit of trick-er-a-tion:  You’re URL has to be dynamically created, which means for every click you’re going to set each value, just as we did before. But it was easier before because we manually made the URL based on the sheet we were on, but we don’t have that luxury now that everything comes from the same sheet.  So – We have to create fields that do this for us and then call those fields on any sheet that needs the ability to drill.

I’m sure that was confusing. It will make sense in a minute.

The goal here is to tell the parameters what value they should be set to on a click.  So on a drill in click going from Level 1 to Level 2, the Dept param should be changed to the Dept that was clicked.  When going from Level 2 to Level 3, the Dept param should hold it’s value.  Here’s how to write that:



At the same time, on a drill in click going from Level 1 to Level 2, the Category param should remain blank.  But when going from Level 2 to Level 3, the Category param should should be changed to the Category that was clicked.  And here’s how to write that:



Now, you’re going to need to call those values in the URL, so take both fields and add then to the Tooltip on the ‘All’ marks card:



I’ve also added the way to drill back out (download to see it).

Almost done! Dashboard time – Create a new dashboard (Mine is called ‘FinalDynamicDrillDash’).  Set the size of the dashboard to Automatic and bring in the sheet you just created.  Click on Dashboard actions and select URL Actions.

We’re going to do just as we did before with some moderate tweaks.  The URL is going to be self-referencing and the parameter values are coming from the fields you just created.  It looks like this (note the bold parts:

http://public.tableau.com/views/DynamicParameterURL/FinalDynamicDrillDash?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=<ATTR(URLDeptParam)>&2CategoryBlankParam=<ATTR(URLCategoryParam)>



You’re done with that dashboard for the moment.  Just as we did before, create one last dashboard with just a single web object.  When asked for the starting URL, use the one from above, but remove the parameter fields (we want them to be blank when it loads).  It should look like this:

http://public.tableau.com/views/DynamicParameterURL/FinalDynamicDrillDash?:showVizHome=no&:embed=y&:tabs=no&:linktarget=_self&1DeptBlankParam=&2CategoryBlankParam=

Now publish the whole thing!

If you see any errors just reload the page.

You’re done! Yay!

To prove out the dashboard concept I added a Tree Map sheet as well and allow the user to select either sheet to drill in.

Here’s the final product:



I can’t believe you made it all the way through this post.  Thanks for hanging out.  Hopefully you’ve learned a couple things.  The Analysis that you can do with this is incredibly powerful!  So go out and change the world with your dynamic parameters in Tableau!


Nelson

Sunday, March 22, 2015

Friday, March 20, 2015

Wheels within Wheels: The History of Marvel Comics (Guest Post)

Hello everyone, I’m Chris Toomey.  I work with Nelson at Slalom and he has kindly let me borrow his blog for the day.  If you haven’t guessed already, this post is my entry into the Wiki Data Contest


For those of you who don’t know (or maybe forgot) – every year Tableau hosts a series of themed competitions to select competitors to the Iron Viz Championship.  The first is all about finding data on Wikipedia and make a viz out of it.  I’ve been reading comics since I was 10 and thought it would be cool to chart the History of Marvel Comics.  

Sources and Inspiration

In developing this viz, I came to appreciate how just how much Tableau’s technical depth supports the creative process.  There were many sources of inspiration, both technical and creative, and their work deserves mention. 

Jonathan Hickman – Writer for the Avengers and mastermind behind Marvel’s upcoming “Secret Wars” event.  His visual style is fantastic, and his storytelling is even better.  A primary theme in his work is the notion of circularity – think concentric circles and time travel.  If you decide to open up his Avengers #1, this is one of the first things you will see:



Jer Thorp – Co-founder of the Office for Creative Research, former NYU professor, and from 2010-2012, the Data Artist in Residence at the New York Times.  In 2012, he published a visual history of the Avengers, a selection of which can be seen below.  If you want to see master work in data visualization, read his stuff.



Bora Beran – Program Manager at Tableau, where he focuses on statistics and calculation features, query generation and technical partnerships.  Bora’s post on radial treemaps was what gave me the confidence that I could take what I’d seen in the work of Hickman and Thorp and translate it to Tableau.

There is also much more Marvel-related background, but you’ll need to explore the viz to find it! 

Getting the Data

While the specifics of the competition make it appear that you can simply CTRL-C + CTRL-V your way to a fabulous Wikipedia data set, the reality is much more cumbersome.  After working the design basics out in my handy notebook (paper still has its uses!), I knew exactly what kind of data I wanted.  This meant that I had to manually cut, paste, clean, and format the data.  It may not have been the most elegant solution, but it gave me the exact data structure that I needed to start. 

Once I had all my data, it still needed some polish so I turned to Alteryx.  In Alteryx I was able to shape, sort, aggregate, join, and union my way to a Tableau-usable dataset, without the pain of doing it all in Excel.  In a few instances I needed subsets of the data to serve as inputs or lookups.  Instead of writing that data out once, and bringing it back in as a full input, I kept it all inside the tool (which is why you see two sets of text inputs).  


Here’s how that trick works: insert a Browse tool into the workflow where you want to pull out data.  Once the workflow has run, open the Browse window and click the
icon in the top right and select “All Data with Headers.”  Then right-click anywhere in the workflow and select “Paste”.  The data will now be a Text Input and keeps everything self-contained. 

Design Notes and Community Musings

I knew that circles were going to be the dominant part of my viz, so I needed to brush up on my trigonometry.  To do so I spent some time dissecting Steven Carter’s wonderful Adoption Gap visualization, and playing around on the Desmos Graphing Calculator.  Desmos is a wonderful tool for writing and comparing any type of function.  Simply start typing and it will do the equation formatting for you, hit Enter and you’ve got a graph.  Very handy for benchmarking equations, particularly SIN and COS.

Once I had my basic math down, I visited Bora Beran’s radial treemap post.  The third of his three examples shows a radial bar chart - which I knew was the basic model I wanted for my viz. 
 I first replicated his design using my data – his methods require using a data densification trick that relies on binning and unions.  It’s handy and got me most of the way there.  I just needed each bar (representing a character/team pair) to be offset by the number of months from time zero that they joined their respective team.  I’d like to say that I came up with a magic trig method to do this – but it simply came down to changing or adding one variable at a time and seeing what happened.  Desmos was particular helpful here as I could make multiple equations and compare then quickly. 

Here’s the solution: Inside of the SIN/COS functions, I added the following: 

RUNNING_AVG(SUM([DiffFromStart])) * WINDOW_max(1.85*PI()))  

DiffFromStart is a field I calculated in Alteryx that does a DateDiff from each characters JoinDate and the first date in the dataset.  I’m calculating it at the team+character level, SUM(DiffFromStart) is the same as the unaggregated value.  I wrapped it in a RUNNING_AVG to account for densification – I didn’t want that value to change just because Tableau created new points. Finally, the WINDOW_MAX represents the amount of the circle I’m actually playing with.  The result is that I can push the polygon around the circle and keep it aligned properly. 

It is important to note that with the introduction of the “Spin the Wheel” filter, which operates before all the Table Calculations, this introduces some odd variations in arc length and position.  The values don’t change, but they move around in unexpected ways – which makes all the interactions all the more fun, you simply don’t know what’s coming!

Final Polish

The last thing I needed to do was finalize the formatting. A black background gives the colors the most pop, and I used Tableau 9’s handy hex color values to create consistent palettes.  The final piece was fonts – I spent so much time on the viz that I didn’t want to be undone by unreadable text.  Here’s the links to the best Tableau font resources I could find:

There are a number of other tricks to get the tooltips to function the way I wanted, but I’ll spare you the details and let you all pull things apart.

That’s it! Thanks to Nelson for loaning me his space, and thanks to you all for taking the time to read and interact with my work – I hope you enjoy using it as much as I enjoyed building it.
I’m active on Twitter, but the link to my LinkedIn page is on my Tableau Public profile – so feel free to reach out with comments, questions, or if you just want to talk comics.  I’m also at all the Seattle Tableau User Group events, and will be at TC15 in October.  

Monday, March 2, 2015

Demystifying Nested Table Calcs in Tableau

The farther you go with this thing we love called Tableau, the cooler and more advanced analysis you want the ability to do. Over the last few months I’ve had an opportunity to work on a number of really interesting projects. The one that’s been taking my free time has been for awesome cause: creating dashboards to help track and stop the Ebola spread in West Africa. It’s a part of the work that the Tableau Foundation is focused on with an organization called Dimagi, an award-winning global health software company, and the Columbia University Earth Institute. Fellow Zen Master Peter Gilks, of Slalom New York, has been the lead on this project and I would guess has spent well over 100 hours of his free time working to make it successful. If you know Peter, please thank him for the time he’s invested.  It’s certainly been a labor of love for all of us.  The other interesting work has been focused on my client in the travel industry, slicing and dicing data in ways that I honestly have never been able to do before - and what has been a big factor in the success of these projects? My head-long dive into Nested Table Calcs!

Even as a Zen Master, I’m always learning - and I have A LOT to learn. Different projects, different clients all require new things, and the depth of the analysis that is possible inside of Tableau becomes even more apparent the farther down the rabbit hole we go.  I would argue that beginning to understand today’s topic of Nested Table Calcs is the best way to take your Tableau game to the next level, and provide your users with even more in depth analysis.

We have to start with some fundamentals, but we’re also going to make a couple assumptions.  We’re going to assume you know at least a little bit about table calculations, though we’re certainly going to discuss how those are constructed as well. We also need to assume you want to perform some sort of analysis that at least a little bit complex and is going to require at least two or more steps of transformation/calculation.

So let’s start with the basics - what exactly is a Nested Table Calc? I’m so glad you asked! While there are some things in the world of technology that have names or acronyms that defy logic, this is not one of them. These things act just as you would think of Russian nested dolls: something small and simple is used to create something slightly larger and slightly more complex, which is then used to create something a little bit more advanced, and so on and so forth. 



The calculations build one on top of the other, allowing you to produce super complex analysis like:

Rank the top five states in each region by year-over-year change in average monthly profit margin

See? Pretty cool stuff, huh?  The orange parts are the table calculations. Stacking (or nesting) table calcs in each part of that analysis in just the right way allows us to do something super complex, and should begin to give you a little bit of insight into the power of what were actually talking about here.

You should also know there is a tremendous amount of awesome work from the likes of Joe Mako, Jonathan Drummey, Noah Salvaterra,  Keith Helfrich (not a Zen, but sure could be) and many others on this subject, and I would certainly encourage you to dig deeper by going through their work on this.  Without them, I’d still probably be lost in the dark when it comes to all this stuff.

So why don’t you say we take that little example we discussed before (which I came up with on-the-fly) and walk you right through it?  Sounds like fun! Here we go….

Just as we seemingly always do, we are start off with popping open Superstore Sales data (the one with English in the title). We’ll start by dragging region to the filter shelf and excluding International (I mean, we’re looking for states here). Alright - let’s begin by remembering the goal of the analysis:

Rank the top five states in each region by year-over-year change in average monthly profit margin

Got it? Because we’re doing nested table calcs we’re actually going to begin at the end.  Let’s get set up to show monthly profit margin.  Start with Order Date and throw both year and month on the column shelf. Also, go ahead and throw region on the row’s shelf (we will add states later on). At this point you should be looking something like this:


Now let’s quickly throw together the formula for profit margin: the sum of profit / the sum of sales:



We will simply drag it out onto the worksheet and we will quickly have a monthly profit margin:



OK, if we look back at what we are putting together you’ll see that we actually want to take an average of the monthly profit margin - but averages in Tableau can be kinda confusing.  If you simply use the average function (AVG) Tableau’s going to calculate the average at whatever the level of granularity in your data actually is at the row level. The problem for us is that our data is much more granular than a monthly look, so instead were going to have to use a window average table calculation (Window_Avg).  Using the window average allows us to set the granularity based on what is in the view.  Level of detail calculations in v9.0 (probably the next blog post you’ll see from me) go one step further and allow you to set the level of detail without concern for what is actually in your view (it’s going to be awesome). Now, before we actually create the table calculation let’s look and see what were actually hoping to get. By clicking on Worksheet>> Show Summary you will get this lovely little card that allows you to see interesting details about the data in your view or what you have selected.  Once you see it on your sheet, click and select the 12 months of 2010 for the Central region. Your summary card should update and now show that you have selected 12 marks with an average profit margin of 13.86%:



Note - this is a monthly average. What I mean by that is that we are adding up the percentages from the 12 months and dividing by 12 - these months are not weighted and this number is not a yearly sum of profit divided by a yearly sum of sales - that would be a yearly average, and, to clarify, that’s not what were interested in. Just thought you should know.

So now we know we’re looking for 13.86% in the Central region for the year 2010. Let’s write our Table Calc.  We’re going to leverage the Window_avg function and we’ll call it “Profit Margin - Win Avg”:



I like to try to keep the base name of the function first and then add in whatever aspect of the nest that calculated field is. Thus, we called it “Profit Margin - Win Avg” which should help us group the functions together and keep things straight going forward.  We’ll drag that pill out onto the view of the sheet, on top of the numbers that are already there, and we should now see two sets of numbers stacked on top of each other, like this:



The problem we should notice is that this average is the same for all months and all years - and it’s not the 13.86% we were expecting. What we wanted to see was an average that was the same for every month but restarted every year. In order to get there we click on the pill that is now located in the measure values and has the triangle on it – the triangle icon is letting you know that this is a table calculation. We click on it and go down the menu and hit “Edit Table Calculation” and it brings up the dialog box that causes trembling and fear in the hearts of even the most advanced Tableau user.

Have no fear amigos, we’ll walk through this together. 

Where it currently says “Table (Across)” (which is the typical default – and rarely what you want to use) click on it and go all the way down to the last option where it says Advanced.  Should now see this:



If you select Month of Order Date and click the arrow to send it to the right what you’re telling Tableau is “the thing I want you to average is the Month of Order Date - do this for each year and for each region (IE when region and year change I want you to restart your average)”.  Make sense? Even though there are easier ways to do it, I always go to the advanced tab on table calculations because I need to say out loud to myself how I’m planning for Tableau to do this calculation.  I would recommend you do the same until you become pretty good at it.

Once you apply those changes your sheet should now look like this, and you will notice we now see correctly that 13.86% is the monthly average in the Central region for the year 2010.  Perfect!



Now let’s head back to the original statement of analysis we’re working on:

Rank the top five states in each region by year-over-year change in average monthly profit margin

Well, average monthly profit margin is done so let’s continue to work backwards - and go after the year-over-year change.  This is another table calculation, and another simple one.  But at the same time it’s also the first one were going to nest the average that we took. So here’s the tricky part and how we do this: right-click on that first table calculation we created called “Profit Margin - Win Avg” and select Create Calculated Field. The new calculated field that we want to create is going to take the current year average profit margin and subtract it from the prior year average profit margin. Therefore the table calculation in Tableau we want to use is the lookup function that looks like this:



Again, we’re basically telling Tableau to take the difference between the current Year and the previous year and return the difference (for multiplying by 100 because these are percentages and we want to return the difference in the percentages in points - we do this because it’s easy to see when we’re right). Once you’ve created that calculated field, right-click on it and open up the Default Properties>> Number Format and force it to be a two decimal number with the string “ pts” on the end, like this:



Now, drag that out and placed on top of the other numbers in the table just like we did before.  You end up seeing a third row of numbers which sort of looks right but mostly have zeros. Again, we’re going to have to tell Tableau how we want this table calculation to be calculated. Go back to measure values and click on the third pill, “Profit Margin – Y/Y Diff…” and hit Edit Table Calculation again. You should notice a couple things: the default of Table (Across) is back again for the field we just brought in, but you should also notice we now have the ability to toggle between two different table calculations at the top. So long as the first table calculation we did is still in the view, you should see that if you switch to it it’s already preconfigured using what we gave it in the steps above.  You have the ability to switch between the two calculated fields because you are now leveraging a nested table calculation - this is because we leveraged the calculated field of a table calculation in another calculated field with another table calculation.  The power here is that we have the ability to set the configuration of each table calculation nesting separately, and you will see that this is very important for what we want to do now.



This time we bring over both Year of Order Date and Month of Order Date, making sure to put Year on top of Month, because once we set that part of the dialog box will tell the first dialog box we want to do this at the level of year and we do not want to restart the table calculation. This has the effect of forcing Tableau to ignore month completely as we do this table calculation because it is lower than the level of granularity that we are doing the table calculation.  Make sense?

Once we go back to the table, the effect is pretty obvious.  Everything from 2010 is now gone (which is logical because you’d be subtracting 2010 from 2009 but 2009 doesn’t exist - so it shows up as null). And we now see the same number show up for all months but different in each of the years 2011, 2012, in 2013 - just as we expected.  AND you’ve now written your first nested table calc. – you’re the bee’s knee’s!

Alright, let’s take another look at our statement of analysis we’re working on:

Rank the top five states in each region by year-over-year change in average monthly profit margin

Before we get to ranking the states, let’s limit the information that we’re looking at.  You should notice that the data for each year is different, but it’s the same for every month.  Therefore let’s limit our sheet to looking at only the first month of every year.  The easiest way to do this is to create a test - a true or false Boolean statement – to figure out if we’re looking at the first month of the year. Of course we can simply write something like Month(Date)=1, but that’s not a Table Calc., so what fun would that be?  Instead we’re going to use a function called “First” and it goes like this:



We create a calculated field called “First Month?” and is going to run a table calculation to figure out if what we’re looking at is the “first” based on the way we define it.  Notice that once you’ve created this calculated field it’s going to show up in Measures, not Dimensions even though it is a Boolean and returns true or false.  The reason for this is that table calculations are always aggregations, and because they’re aggregations they have to show up in Tableau as Measures. We are going to drag this new field onto the end of the column shelf, and Tableau is now going to label each column either true or false based on whether or not it’s the first in the column.  Therefore, very logically, the first column appears as true and all the others appear as false. Our goal however is to make the first month of each year show up as true and the other month as false. 

To get there, click on the triangle of the field you just drag it onto the column shelf, and go down and hit Edit Table Calculation. This time were going to put year and month to the right side (in that order).  After we hit OK, we’re going to configure the first dialog box to calculate the table calculation at the level of Month of Order Date, and restarting every Year of Order Date.  It should look like this:



Once that is done you should see a few more “trues” showing up every January. Now, simply right-click on one of the “falses” and select Hide.  Hide is important feature of Tableau when it comes to table calculations.  If we simply excluded, or filtere out all the other months, it would change our average to simply be whatever the profit margin was in January.  By using hide, the data is not excluded, it just doesn’t show up, and our averages remain correct. Your sheet should now look something like this:



We’re getting really close now. Looking back once again at our original statement of analysis:

Rank the top five states in each region by year-over-year change in average monthly profit margin

It's time to bring states. In the mapping items hierarchy click and drag States and drop it onto rows in between Region and Measure Names. You now have a bunch of rows, one for each of the three measures for each state that’s in the data.  The very last piece of this analysis is the ability to rank each of those states based on that year-over-year change so that we can identify the top five, in order, for each year, for each region. This is where we write our last part of our nested Table Calc.

Just as we did before, we’re going to right-click on the previous calculated field we created and select “Create Calculated Field”.  Here we leverage one more table calculation that we will use in order to rank our year-over-year differences:



Using rank unique guarantees that we’re going to get a different ranking for every row, even in the event that the numbers were exactly the same. Before we bring this calculated field into the view, right-click on it, and about two thirds of the way down, select “Convert to Discrete” - this is going to force Tableau to write the actual number rank rather than trying to create a visualization – which it would do if we left it as a continuous measure. Once you’ve done that, now click and drag that pill and place it in between Region and State. Again, it's time to open up the trusty Edit Table Calculation dialog boxes to get this last table calculation set up for us. At this point you should be a pro at getting these things set up, but think logically about what we’re looking for here: we want to rank the states (so address that), we wanted to restart for every year and for every region (so partition those), and we want to ignore the months (because they are all the same) - so place Month underneath State in the addressing box.  Once that’s all taking care of tell Tableau that were going to do this At the Level of State.  Everything should look like this:



Once this is done you’re going to see some funkiness for second.  The reason for this is that the ranking of the states that are not in the region that is specified are going to show up as null, and Tableau puts these at the top.  Simply right-click on the word “Null”, and click hide. Also, hide the year 2010 – we’re not going to care about it anymore going forward.  Now, press Control and left click and drag the rank function from the rows shelf over to the filter shelf. By clicking Control before you do this, it has the effect of copying and pasting the function with all the editions that we have made. When the dialog box pops up start by clicking “None” and then ticking the boxes for 1, 2, 3, 4 and 5 (because we want to see the top five States for each Region - that’s what we’re selecting here). Also, you no longer need the first two fields that are inside the measured values – “Profit Margin” and “Profit Margin – Win Avg”. Click and remove those two fields and the field “Profit Margin - Y/Y Diff” will shift up and now reside on the label shelf by itself. At this point are really close to our final result and you should be seeing something like this:



Because we have the State on the Rows Shelf, it's creating a new row for each of the states that ranks first for each of the years.  Therefore, in most cases, were seeing three rows for every ranking (one for each of the states). To fix this we’re simply going to take State and move it over to the Label Shelf. Doing that gets is 98% of the way there:



At this point all we’re going to do is add a little bit of visualization - this wouldn’t be Tableau without it!  Change the mark type to squares and Control, click and drag the “Profit Margin – Y/Y Diff…” pill from the label shelf to the color shelf as well. I've use the colorblind diverging color palette and set the start/center/end at 20/50/120 pts, to get this final view of a heat map highlighting the best and worst of our top five states for each year.  The final product looks like this:





That’s it! You did it! You've created your (possibly first) nested table calculation in Tableau! They are all really easy just like this :) Well, maybe not, but now you can begin to see the powerful analysis that nested table calculations give you the ability to understand. It's worth noting one last thing - the reason we left each step of the calculation in the view until the very end was that Tableau uses the configuration for each of the nested pieces that we had already set up. We could have gone straight to the last calculated field and brought that in, but we would have had to set each of the other nested pieces up from scratch, which can be tricky if you can’t see them. Removing them at the end is how I like to keep things straight.

Congratulations! You’ve made it all the way through this very long post. I hope you learn something about nesting table calculations. Thanks as always for stopping by!

Nelson