Why Vizioneer?

Atlanta, Georgia, United States
The "Vizioneer" comes from mashing two words that have shaped my world for most of my adult life - Engineer and [data] Visualizations (or Vizs to those who know what's up). Graduating from Georgia Tech with my Bachelor's and Master's in Civil Engineering has taught me to think through anything and everything - problem solving, "engineering" solutions, teaching to the "ah ha" moments - it's what I love to do. In 2010 that investigative, engineering mindset intersected with a job change and a plunge into the world of Data Analysis. In the search for the next great thing I stumbled onto a data visualization and dashboarding product called Tableau Software and things just took off. So now I guess you could call me that engineer with the sweet data visualizations - or just "The Vizioneer" :)

Wednesday, June 18, 2014

Dynamic Parameters - a sorta hack

...And we're back.

It's been a little more than six weeks since I've saddled up and hopped on the blog.  The #Tableau30for30 took a lot out of me and I've needed some time to recover :).

So dynamic parameters.  It's the number one most requested feature idea in the Tableau Community.  Everyone seems to have come across a time where this would come in handy.  It's a great idea and given the demand it should be implemented sooner rather than later.  

Unfortunately, it's not coming in the very soon to be released v8.2.  You'll hear a lot about some very slick new features - Story Points and the Visual Data Window to name a few.  But dynamic parameters ain't on the docket.  So it's time we take matters into our own hands.

Let's start at the beginning.  What's the point of dynamic parameters?  Here's the text from the idea page (currently with 870 votes - the only idea with over 320 votes):


Parameters are really useful when you need to do something too complex to be handled by quick filters or action filters. However parameters are currently hobbled by the fact they have to be static lists.

It would be really, really useful and solve a lot of Tableau gotchas, if you could define the options available for a parameter dynamically, from the result of a datasource, and preferably with the option to apply filters also.

That's a bit about what dynamic parameters could do.  Let's talk about what parameters are in general.  Here's what we're NOT talking about - The type of parameter that is a numerical value that can be changed via a slider or typing something in that will affect calculations.  Those parameters work great - no complaints there.  Rather, what the "Idea" is typically talking about is a parameter based on a STATIC list (think one time snapshot) of values of a dimension in the data.  Here's how it typically goes:
Create a new parameter>>Change the data type to "String">>Click the button for a "List" of Allowable values>>Select "Add from Field" and select the dimension to get your list of values (I'm using Superstore data and going with "Category")

This will create a list of values based on the dimension selected.

This static list now lives apart from your data set.  If the data set changes (the company adds more product categories) this list will NOT update.  This is the gripe.  The thinking is that if the data changes, so should the parameter.

So here we go.  I'm going to propose an idea.  Follow me for a minute.

When we create a list of values in a parameter, it's as if we're informally creating a new table of values, based on a snapshot of the data, that now lives outside the data.

What if we just more formally create a table of values that remains based on the underlying data, yet lives outside of the full datasource (as a parameter would)?  What if, instead of calling this new table of a single dimension of data a "parameter", we called it what it really is - a small data source?

Here's what I'm proposing - Based on the same example from above (Creating a String Parameter from a List of Values based on the Category field), we head back to the data window and head for the Custom SQL.  What? You don't like SQL?  This is going to be the easiest three lines of code you've ever written.  It goes like this:

In general terms it's:
Select [Your Dimension]
From [Your Datasource]
Group by [Your Dimension]

And using the SQL from above, what do we get?

A single value list with all of the Product Categories - this looks strikingly similar to the parameter we created, except for one thing.  This is dynamic.  This list will change whenever the underlying data changes.
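
To make the "dynamic" part concrete, here's a minimal sketch of the same three-line query run outside of Tableau, using Python's built-in sqlite3 module.  The table name `orders`, its columns and the sample rows are all made up for illustration (they stand in for Superstore, they aren't copied from it), and I've added an ORDER BY only so the printed list comes back in a predictable order.

```python
import sqlite3

# Hypothetical stand-in for the Superstore data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (category TEXT, sales REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("Furniture", 120.0),
        ("Technology", 300.0),
        ("Furniture", 80.0),
        ("Office Supplies", 45.0),
    ],
)


def category_list(connection):
    """Run the Select / From / Group by query and return the one-column list."""
    rows = connection.execute(
        "SELECT category FROM orders GROUP BY category ORDER BY category"
    )
    return [row[0] for row in rows]


before = category_list(conn)

# The company adds a new product category...
conn.execute("INSERT INTO orders VALUES ('Appliances', 210.0)")

# ...and the list picks it up the next time the query runs.
after = category_list(conn)

print(before)
print(after)
```

A parameter built with "Add from Field" would still be stuck on the first list; the query result is rebuilt from the data every time it runs.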

So we've created this dynamic list - what can we do with it?  Basically anything we want - we can do most of the same things we could do with parameters - use it in formulas, use it in quick filters (single select or multi-select - the second option is not currently possible with a parameter), etc....  The sky is the limit.  Here's an example:

Notice the two data sources.  Superstore is our primary data source and the single value list dynamic table based on Categories is the secondary source.  Creating a multi-select quick filter works like a champ (make sure the two data sources have the correct links and relationships - which Tableau will make sure of if the headers have the same names).

Interesting stuff, I say.

Is it a perfect solution?  Nope - it's a hack.  Are there a dozen reasons why it's not a best practice? I'm sure there are, and I'm sure the comments section below will let that play out.  But with all that said - could it help solve a problem you're having? It just might.  

It's a crazy simple solution and my hope is that it sparks a conversation.  I by no means think this is anywhere near the last word on this topic, but with v8.2 walking out the door, it's time we begin to amass ideas for v9.0.

I look forward to your feedback.  Many thanks- 

Nelson

Wednesday, April 30, 2014

Day 30: Cross Stitch Viz of Thank You

30.  

We made it.  We're here.  This is probably the closest I'll ever come to feeling like I'm on top of Mount Everest.  I've thought a lot about what to say at this point.  I'll level with you - this was a lot harder than I thought it would be.  The content was interesting and I learned a ton, but man was this time consuming.  On average I'm guessing I spent 2.5-3 hrs a night writing each one of these, cradle to grave.  It made for some looooong days, many of which lasted into tomorrow.  But I think I'll soon look back and be happy I took this on AND completed it.  I certainly wasn't alone in this quest and I'd like to take a few moments and say thank you to a number of people: 

  • First off to my wife, Ms. The Vizioneer, to whom I owe about a month's worth of washing dishes, cleaning up, and paying attention - tomorrow's May baby :) Wahoo!
  • To Dan Montgomery who contributed in a number of ways along this journey and stepped up with a guest post at the end.  I believe Dan favorited every single one of my #Tableau30for30 Tweets, so if you follow Dan you'll be happy May's coming as well.
  • To John Mathis, Steven Carter, and Peter Gilks (the boys of Slalom), who stepped up when the towel was this close to being thrown, and provided some excellent posts that really enhanced the journey toward the end.
  • To the great folks at Tableau - Ben Jones and Tara Walker - I can't believe that we were just randomly on the same page with April as Tricks and Tips month, but I'm grateful for the blog feature and Twitter hype! It was really encouraging to know that people all over the place were interested.
  • And to the many of you who reached out in one way or another with many kind words of appreciation, I am honored to know that this made an impact for you.
If you've become an avid reader of The Vizioneer, you probably shouldn't count on much new stuff coming out in May... but what am I talking about?  We still have one more trick to cover for the 30th, so buckle up!

Before we get to what I ultimately did for Day 30, you should know that I've been pondering how to wrap this up for over a week.  I thought about writing on the "ultimate trick", the trick to end all tricks, but it was a moving target depending on who you ask - Andy Kriebel's a pretty big fan of what he calls The Greatest Tableau Tip EVER: Exporting CSV made simple! which was worth considering.  But then Andy Cotgreave and Matt Francis were big fans of putting bar charts in the tooltip which I have to admit was a pretty awesome idea:

I'd hate to pick "the greatest trick ever" only to find out I was wrong :)

So I decided to go a different route.  Inspired by Ms. Jewel Loree, creator of many beautiful Viz of the Day wins and a Tableau Public goddess,  I give you Day 30 of #Tableau30for30 - Cross Stitch Viz of Thank You.


This is one of those we certainly have to show the end result before we dive in.  So this is where we're headed:


Pretty cool huh?  How did we do it?  Well Jewel gives a pretty awesome step by step on her blog, but I'll give you the dime tour.  We actually start this shindig in Excel (because data doesn't just grow on trees or fall from the sky! It must be created!).  Here's what I did - I started by making each cell in Excel a 10x10 pixel square and numbered across the top and down the left side.  Looks something like this:

Then I inserted some word art, found a font I liked, and wrote out a message in the word art. I then took the word art and made the fill transparent, added a solid outline and it looked something like this:

Next came the fun part.  I manually went through and filled in each of the squares that had over 50% of its area covered by the letters.  This took a bit of time (as does anything worth doing - according to my Dad).  But it soon looked like this:

You then do something that I previously didn't know was possible - Find and Replace based on formatting.  This allows you to write in text where you have cells of a particular color.  Once you go through and do everything, you have something that looks like this:

Now, some additional magic.  If you haven't already downloaded the Tableau Excel Data Reshaper Add-in, then where have you been?  Mother's been worried sick about you!  You better go inside and eat your dinner!

We're going to start in cell B2 and select enough of the worksheet to get every bit of the design.  We'll then fire up the Data Reshaper Add-in and pivot this to make it usable in Tableau.  After it completes and we rename the columns we get a simple three-column table: X-Axis, Y-Axis and Color.
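
If you don't have the add-in handy, the same wide-to-long reshape is easy to sketch in plain Python.  This is only an illustration of the idea, not what the add-in does under the hood: the tiny grid and the "Red" label are made up, and skipping the empty cells is my own assumption.

```python
# Hypothetical 3x4 slice of the worksheet: "" is an empty cell, any other
# text is the color label typed in by the find-and-replace step.
grid = [
    ["",    "Red", "Red", ""],
    ["Red", "",    "",    "Red"],
    ["",    "Red", "Red", ""],
]


def reshape(cells):
    """Turn a wide grid into long rows of X-Axis, Y-Axis and Color."""
    rows = []
    for y, line in enumerate(cells, start=1):       # numbers down the left side
        for x, color in enumerate(line, start=1):   # numbers across the top
            if color:                               # assumption: drop empty cells
                rows.append({"X-Axis": x, "Y-Axis": y, "Color": color})
    return rows


long_table = reshape(grid)
for row in long_table:
    print(row)
```

Each filled-in square becomes one row, which is exactly the shape Tableau wants: one mark per row, positioned by its X and Y numbers.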

It's pretty easy from here.  Bring the sheet into Tableau, Y-Axis to rows and X-Axis to columns (both should be Dimensions rather than Measures), and Color on to the color shelf.  Also worth noting, I had to reverse the direction of the Y-Axis (putting 1 on top, same as it was in Excel, rather than on the bottom as Tableau is used to putting it there). Selecting shape mark type and going with the "X" looking one, and.....
Voila! 


We have a cross stitch you'd be proud to put up on the wall.

And with that, the sound of Boyz II Men tells me that we've come to the End of the Road


It's been fun.  Thanks for hanging out and following along.  If you've enjoyed this, learned something new, or have a favorite trick I missed, hit me up on Twitter @Nelsondavis as I'd love to hear about it.  I love to meet new people and TCC14 is but a short 4.5 months away - if you see me, please stop and say hey!

As always, many thanks - 
Nelson


Tuesday, April 29, 2014

Day 29: Designing for Performance - Tiled vs Floating

As the 30th rolls oh so very close, I'm doing some reflection on how we got here.  The reality is that I've learned a ton.  And when I think about where I learned the most over the last year, I always come back to my TCC13 experience in Washington DC.  I learned so many mind blowing things in the span of four days that have greatly impacted and improved the work I've done since then.  One of the sessions that had a dramatic impact on me was "Designing Dashboards for Performance", where the Tableau gurus hit on a number of great points on how to make a slow dashboard sing with speed.  My good buddy (from Day 27) Dan Montgomery wrote a great blog post on many key elements for creating a highly performant dashboard.  Today we'll prove out one aspect of that post, as we look at the classic question of dashboard design in Tableau - Tiled vs Floating (a battle royal) for Day 29 of #Tableau30for30 - Designing for Performance.



So let me quickly reference Dan's blog post and list off the main categories of where performance gains can be found:

  1. Using extracts over live connections - As obvious as it is, this was something I didn't realize until I'd been using Tableau for over a year.  I didn't come from a data background, so I didn't think much about the difference until I was shown the light.  Now, I almost always extract - and you should too.
  2. Add data source filters - this limits the amount of data Tableau has to sift through in order to create your visualization.  If you're going to focus on one area of the business, rendering the rest of your data set superfluous, then just keep Tableau from bringing it in in the first place.  It will decrease the size of your data and reduce the number of dimensions on the filters shelf, which is also a good thing.
  3. Use aggregated results - I know, I know - you should let Tableau use its fast data engine to roll up the most granular level of data so that you have the most flexibility in exploring your data.  You know what I say to that? Poppycock!  You know what's faster than Tableau's data engine? Not using the data engine - that's what.  Here's something to keep in mind - You'll see better performance taking 10 different extracts of different rolled up aggregations of the same 10M+ row data set, than you would taking 1 extract of the granular 10M+ set of data.  It's true basically every time.
  4. Floating your worksheets - This is a great debate. I've bought into this hook, line and sinker, but I know many (Tableau Zen Master Mark Jackson for one), who don't see the need to use garlic cloves to ward off tiled dashboards, as I do.  Today we're going to camp out here and look at a simple dashboard built in two different ways and see how they perform.

So the dashboards themselves are not super important, and frankly I wish I was looking at a larger data set, but we're looking at the 14,000 rows of Superstore data.  Here's where the difference comes in: The first dashboard is made by dragging sheets into a tiled dashboard; the second is made bringing each of the four sheets in as floating sheets on the dashboard. 

Basically everything else is the same.  I clicked on the caret of each sheet and selected "Use as a Filter".  I then went and found a magical little problem solving tool Tableau introduced in Version 8.0 called the Performance Recorder (Help>>Settings and Performance>>Start Performance Recording):

Next I selected and unselected one element in each sheet which then pushed that filter to the other sheets.  I went in the same order (Top left, bottom left, bottom right, top right) and selected a different element each time so that I knew Tableau wouldn't cache the visualization.  I did this twice, once for each of the types of dashboards.  After completing the four selections, I went back up and through the menu, and stopped the Performance Recorder.  After doing that, Tableau immediately opens a workbook and shows you every little thing it thought about to render your visualization (it's really helpful for problem dashboards).  

In summary here are the results:
Tiled Dashboard: 1.249 sec
Floating Dashboard: 1.038 sec (17% decrease)
Not earth shattering, but certainly meaningful.  Remember, this is a small data set, and we're only looking at four sheets.  
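
For anyone who wants to check my math, the 17% is just the relative drop between the two totals.  A quick sketch (the variable and function names are mine, the two timings are the ones listed above):

```python
tiled_sec = 1.249      # total for the tiled dashboard
floating_sec = 1.038   # total for the floating dashboard


def percent_decrease(old, new):
    """Relative drop from old to new, expressed as a percentage."""
    return (old - new) / old * 100


saving = percent_decrease(tiled_sec, floating_sec)
print(f"{saving:.1f}% decrease")
```

That works out to a hair under 17%, which I've rounded up.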

Below are the Performance recording workbooks:





And here's the workbook that I created that has both the tiled and floating example (same as above):



So with that we come to this conclusion - floating is actually better (as we've often heard), but I wouldn't call it a silver bullet.  The best practice is to implement as many of the steps that Dan discusses as possible, and you should be creating some highly performant dashboards.  

Hope everyone enjoyed! We'll see you back here one more time tomorrow before I ride off into the sunset.

Nelson 

Monday, April 28, 2014

Day 28: Playing Hide and Set with Table Calcs

I love this community.  It's full of smart people trying to learn more and figuring out different ways to do things.  If you've been following the #Tableau30for30 all along (I'd like to thank all 4 of you :)), you know that we've invited three guest bloggers to this point.  Today, given the fact that we are coming to the end of our time together, I was considering adding one last one, this interesting tip from the man, the myth, the legend Peter Gilks of Paint by Numbers (and Slalom New York) fame.  We may have to get the official word, but I think he has more Viz of the Day wins than I have fingers.  He's pretty awesome.  So I got approval to share this blog post tip of his on "The End of Time.... series based calculations" but as I was reading it, I started to think I'd solve the same problem differently - AND - since he closed with the following statement: "If anyone has an alternative approach to this I would love to hear what it is, as with so many things in Tableau there are probably multiple ways to achieve the same goal" I thought it would be fun to share an additional way to do it.  Neither way is wrong but I like variety, and it's why God makes colors.  So today I give you Day 28: the Peter Gilks inspired "Playing Hide and Set with Table Calcs".




I'm going to let Peter provide the setup here (his words in italics, and we're both using Superstore data):
The situation is this: You've got a time series based table calculation going on that you are interested in showing, something like a running total, YTD total or difference from last month, however you don't want to show all the data - instead you only want to show the latest month.


Your starting position might look something like this below, with year, month, sum of sales, monthly difference in sales and running total.

Now lets say you only want to show the latest month, in this case December 2013. Well the first thing that springs to mind might be to filter, but that messes with the calculations PLUS its not going to automatically update to the latest month when the data updates:

Now the next idea you have might be to HIDE the 'non latest month' data, and that will solve the issue of the filters messing up the table calculations, but its still going to leave you with a problem when you get a new months worth of data you want to automatically show. So this is what you can do....Click to see Peter's solution/keep reading to see mine...

I'm going to right click on Order Date and find something you may not have noticed before - Create Custom Date:

And we'll get this interesting dialog box.  Here we have the ability to tell Tableau to create a calculated field that is the same as Order Date, but at whatever slice of the Dimension we want.  Never used this before?  It really comes in handy in a pinch (which is what we find ourselves in) and saves some coding of calculated fields.  I'm going to use the drop down to select Month/Year, like so:

Tableau now creates a brand new field for us called "Order Date (Month / Year)".

Here comes the trick (in a series of moves).  I'm going to right click on this new field, and select "Create Set".  What I want to do is create a set that only contains the most recent month.  Another way of saying that is the Top 1 Month by Maximum Value.  So we'll head to the last tab and fill it out like so:

Once we click OK, we now have a set called "Last Month".  If we click and drag the set "Last Month" up to the first position on the rows shelf, we get the table below and instantly realize that the sort of the In/Out in the table actually makes a difference:

Goes to this when you put In on the bottom (giving us what we want):

Awesome! Now we simply right click on the word "Out" and select "Hide" and right click on the In/Out pill and uncheck "Show Header" and voila:

The cool part is that with both of these solutions (Peter's and mine), as the data gets updated, so will the "Last Month".  The takeaway is that sets can be really helpful when dealing with Table Calcs.  
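
If it helps to see the logic outside of Tableau, here's a rough sketch of why hiding works where filtering doesn't.  The monthly numbers are made up (they are not real Superstore figures) and the variable names are mine.  The point is the order of operations: the table calcs run over every month first, the "Last Month" set is worked out from the data, and only then do the "Out" rows get hidden.

```python
from itertools import accumulate

# Made-up monthly sales keyed by (year, month).
monthly_sales = {
    (2013, 10): 100.0,
    (2013, 11): 150.0,
    (2013, 12): 120.0,
}

months = sorted(monthly_sales)
sales = [monthly_sales[m] for m in months]

# The "table calcs", computed across every month before anything is hidden.
running_total = list(accumulate(sales))
difference = [None] + [b - a for a, b in zip(sales, sales[1:])]

# The "Last Month" set: Top 1 month by maximum date, recomputed from the data.
last_month = max(months)

# Hiding the "Out" rows only changes what is shown, not what was calculated.
visible = [
    {"month": m, "sales": s, "difference": d, "running_total": r}
    for m, s, d, r in zip(months, sales, difference, running_total)
    if m == last_month
]
print(visible)
```

Add a January 2014 row to the dictionary and `last_month` moves on its own, which is the same behavior we get from the set when the data updates.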



Hope you enjoyed!  TWO DAYS TO GO!  Many thanks!

Nelson