12.11.12

Comment on Dashboard design philosophies (vis-a-vis Tableau).

Motivation

This post is motivated by a discussion Steve Wexler started on LinkedIn regarding Stephen Few's recent dashboard design contest. See the discussion here.

Stephen Few's dashboard design contest.

The announcement is here.
The winner is announced here.

Steve Wexler

Steve Wexler is a prominent member of the Tableau community: a strong advocate for Tableau and high-quality dashboard design, and an innovator who's been instrumental in popularizing techniques that enhance Tableau dashboards' communicative and interactive abilities.
Steve's blog is here.
His article on the winner of Few's dashboard competition is here. In it, he wrote:
I’ve been troubled by both the premise and results of Stephen Few’s dashboard design competition as I think it celebrates the number of disparate elements that can be crammed onto a single dashboard rather than champion the display and positioning of the elements that really need to be present (see http://www.perceptualedge.com/blog/?p=1374).

I see it somewhat differently: the either-or choice implicit in Steve's framing is a false dichotomy, and it's possible, even desirable, to build dashboards that are both information-dense and highly effective at communicating the most salient information. I'll go further: my personal belief is that any dashboard that doesn't do both is excused only on the grounds that the medium it's presented in compromises one or both principles.

Dashboard design schools.

There are a couple of schools of thought in dashboard design, that split along the dimensions of size, scale and density of dashboard information presentation.
Please note that as far as I know there aren't any official schools—this is only how I see things, and I don't speak for Edward Tufte or Stephen Few.

The traditional school.

Stephen Few is, I believe, a member of the analytical information design school that holds that effective designs can, even should, be sized large enough to present all the information relevant to understanding the particulars and context of the situation under consideration. Edward Tufte is the most famous member of this school.

Members of this school advocate the design of rich, comprehensive, highly detailed dashboards that convey the full span of information someone needs to have simultaneously visible in order to fully comprehend the situation.

This design sensibility assumes that the dashboard media are capable of such presentations. In his workshops Tufte has talked about the roughly half-million data points available in simple technologies like 11"x17" paper, which can be printed on any standard business printer or presented on a monitor of equivalent size and pixel density.

The new school.

On the other side is the small-format school, which emphasizes the need to design dashboards that can be rendered effectively in small formats (laptops, iPads, mobile devices, and the like) via Web browsers, blogs, and native apps. This school emphasizes a few clear visualizations coupled with rich and effective interactivity, enabling the user to easily and effectively browse through an information space exceeding the device's display dimensions.

The divide between the schools seems largely a consequence of the timing of entry into the marketplace of the small-format school. People whose first or highly predominant exposure to analytical information design came from building dashboards with one, two, maybe three or four charts for delivery in blogs or on laptops or iPads, quite naturally assume that their design paradigm is the right and natural one. Those who came to dashboard design in an earlier time sometimes look askance at designs that needlessly waste or misuse space that could be profitably used to communicate information.

The design challenges of effective dashboards have substantive commonalities and differences across the two camps. Each has its place, and its strenuous advocates. In many cases the believers in one form aren't aware of, or don't believe in, the strengths and advantages of the other. For example, large-format designs provide the opportunity to keep related information in sight, but they're largely unworkable on small-format electronic media, where the consumer must view the information through a small viewport, however easily scrollable. Small-format designs cannot take advantage of the full range of the human visual cognitive field: our binocular field of vision can take in more information than can be presented in small formats.

I believe that the large-format camp is more suitable for the Teacher-in-class scenario in Few's contest. Being able to print out a double-sided 11"x17" page with information front and back would provide the information the teacher needs in a form that s/he could keep readily at hand, and put into a logbook at day-end that would provide the historical record over time. Tufte talks about this scenario—when I took his workshop in D.C. he called it the daily briefing sheet for the Admiral; there was a large military contingent in the audience and I suspect he tailored the example to fit.

About Tableau's Dashboard Design Philosophy

–or– to which school does Tableau most naturally align?

Tableau's design strength lies in the small-format world: creating dashboards with a limited number of charts and tables, with interactive features dominating over the expansive use of space for information presentation. I'm not privy to Tableau's design philosophy in this realm, but I think it's a natural outcome of Tableau's inception and evolution in (relatively) small media; with the drive to incorporate social media and mobile devices, the pressure will be to push improvements out into this space.

I keep hoping that Tableau will incorporate the features that directly address the needs of information-dense designs.

Chief among these is a sophisticated dashboard layout manager capable of control and precision in the placement and behavior of the components, both individually and collaboratively. Tableau has come a long way in its dashboard design abilities, but it still has far to go. There are plenty of excellent examples of layout managers in the world to use as examples.

Another big ask is for better bullet graphs and true sparklines. Tableau's bullet graphs are overweight and bulky; they need a slimming and toning program. On the positive side, any such program will likely also improve Tableau's mark sizing and control abilities, which will upgrade the visual effectiveness of other presentations, particularly bar charts. Simulating sparklines with Tableau line graphs yields only a coarse approximation of the clarity and crispness of Tufte's original sparkline design and specifications. Again, improving Tableau in this area should provide ancillary benefits in other visualization types.

First thoughts post-TCC 2012

The annual Tableau Customer Conference—TCC 2012—was a lot busier and more jam-packed than I thought it was going to be. Last year in Las Vegas there were 1,400 attendees; this year 2,000 people came. I was on the go from early until quite late, and barely managed to squeeze in my birthday dinner, and that a day late.

Tableau people are great ambassadors.

As always, everyone from Tableau was engaged, helpful, and committed to helping the attendees have a successful experience. Having worked for a BI software vendor, and having experience with many others, I find the enthusiasm, passion, and sheer verve with which the Tableau people approach their work, and their appreciation for their customers, really good to see.

The game-changing announcement.

This year the big news was transformational:

In-browser, on-Server editing of published Workbooks.

Dashboards and Worksheets can now be edited without downloading them.

This is going to shake up the entire Tableau world once it starts bubbling out. There was a demonstration of it during the main keynote, and it looks great. It opens up the door to all sorts of new possibilities. It also has the potential to dramatically alter the licensing and governance landscapes. I didn't see any information on how they will be affected beyond some recognition that the permissions system will need to accommodate the new Use Cases.

Other stuff.

New chart types

Several new chart types were demoed:

  • treemaps are really going to be handy and useful;
  • treemaps doubling as bar charts open up exciting new opportunities;
  • bubble charts are exciting, particularly if they work with pages and have trails a la Hans Rosling (http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html);
  • word clouds are interesting and certainly crowd-pleasing, particularly for the social mediaistas

Forecasting

Tableau now offers forecasting: given an existing time series data set, Tableau will extrapolate it into the future. The demonstration was very cool: a ragged, highly variable series projected forward, with the projected data a good visual match. We were told that there's a lot of modeling going on behind the scenes. I'm going to be very interested in seeing how this plays out: how today's projected data can be captured and used as history as time passes, so that in real-world scenarios Tableau's forecasts can be compared to historical fact. Will there be a feedback opportunity to improve forecasting over time?

Desktop

Tableau Desktop has a new Marks card that appears to clarify the visualization characteristics for improved interaction and configuration, making it easier to get what you want.

The People

I didn't get to meet everyone I wanted to, or spend enough time with the people I did manage to get together with. But I did get to spend some really good and fruitful time with some of the people I have tremendous respect for. The Tableau community is one of its great strengths, and I'm frequently humbled by the range and depth of the people who make it up.

That's about it for now. More will bubble up once I get ruminating on it.

5.11.12

Unhide that Worksheet!

Cross-posted from Tableau Friction. Please head over there for a look at how easy it is to unhide the Worksheets in your Workbooks.

4.11.12

Mining the Web: Ruby + Tableau = Data Gems from Raw Content

Mountains of information.

The web is chock full of information. Completely, utterly stuffed full of interesting facts about pretty much everything and anything. Most of the information is not in the traditional record-set form of data.

Tableau and the Web

Tableau is a superb tool for exploring record-set data, and for communicating the interesting insights that can be refined from it.

Happily, Tableau is able to recognize some of the Web's information as data and make it available for examination, inspection and exploration.

Sadly, Tableau's abilities in this area are pretty limited, e.g. copying and pasting the contents of HTML tables—and not any old HTML tables, only those that are "well-formed" in a narrow sense particular to Tableau (see here).

Mining the Web

Fortunately, pretty much anything on the Web that's regularly formed can be mined and refined into data that Tableau can understand. Tunneling through the Web and extracting the information gems and processing them into data is much easier than is commonly thought, if the right tools for the job are used.

Enter Ruby

Ruby is a terrific tool, as proficient in its domain as Tableau is in its. Ruby was designed to remove the barriers between programmers and their ideas and goals in very much the same way Tableau was designed to remove the barriers to data visualization.

Accessing Web content with Ruby is simple and easy, and transforming the content into data is similarly straightforward, dependent of course upon the complexity and regularity of the content.
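To give a flavor of how straightforward that transformation can be, here's a minimal sketch in plain Ruby. The HTML fragment below is a made-up stand-in for a fetched page (the ranks and songs are illustrative, not taken from the real list); in real use you'd fetch the page with open-uri and query it with Nokogiri's XPath, as the full script later in this post does.

```ruby
# A made-up fragment standing in for regularly formed web content.
html = <<-HTML
  <ul class="songs">
    <li><span class="rank">500</span> Smokey Robinson and the Miracles, 'Shop Around'</li>
    <li><span class="rank">499</span> The Temptations, 'My Girl'</li>
  </ul>
HTML

# Mine the fragment into record-set form: one hash per song.
songs = html.scan(%r{<span class="rank">(\d+)</span>\s*(.+?), '(.+?)'</li>}).map do |rank, artist, title|
  { rank: rank.to_i, artist: artist, title: title }
end

# Emit CSV-ish records that Tableau could consume.
songs.each { |s| puts "#{s[:rank]},#{s[:artist]},#{s[:title]}" }
```

Regular expressions are enough for content this regular; anything messier is where a real HTML parser like Nokogiri earns its keep.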

"Show Me," you say?

You want proof. Fair enough.

A while back I was listening to some music and browsing through Rolling Stone's 500 GREATEST SONGS OF ALL TIME for inspiration and reminiscing. Each song has its own page, with lots of interesting information and buttons to the previous and next songs in the list.

Here's song #500: link, and below.

As I was browsing through the songs I started keeping a list of them, copying and pasting their names, rank, and artist into a spreadsheet so that I could use Tableau to organize them and answer questions like: which artists had the most songs in the list; how many songs did the Stones have in it; and so on and so forth, and such like.

I wanted to be able to use Tableau to help me answer these questions, along the lines of:

It didn't take long before I was annoyed with all the copying and pasting. It was a huge pain—boring, mechanical, slow; error prone, and unrewarding. Worse, it detracted from my listening enjoyment.

I needed a better way.

What if, I thought to myself, the information in the individual pages is coded consistently across the pages? If so, it should be possible to dig into them and extract the songs' information and capture it as data in a form that Tableau can recognize.

As it turns out, the individual songs' pages are coded consistently and it was pretty easy to write a Ruby program to do exactly what I needed. Here it is:


RSSongs.rb

require 'rubygems'
require 'nokogiri'   # Nokogiri is an add-on, installed with: gem install nokogiri
require 'open-uri'

$root = 'http://www.rollingstone.com'
csvHeader = 'Record #,Reference,Rank,Artist,Song,Link'
$recordNum = 0
HEADERS_HASH = {"User-Agent" => "Ruby/1.9.3"}

topOfList = 'http://www.rollingstone.com/music/lists'\
            '/the-500-greatest-songs-of-all-time-20110407'\
            '/smokey-robinson-and-the-miracles-shop-around-19691231'

def pull_song songPage
  doc = Nokogiri::HTML(open(songPage, HEADERS_HASH))
  reference = doc.xpath("//div[@class=\"listItemDescriptonDiv\"]/h3").text
  artist, song = split_title reference
  place = doc.xpath("//span[@class=\"ListItemNumber\"]").text
  csvRecord = "\"#{$recordNum += 1}\","\
              "\"#{reference}\","\
              "#{place},"\
              "\"#{artist}\","\
              "\"#{song}\","\
              "#{songPage}"
  $f.puts csvRecord unless $f.nil?
  get_next doc
end

def split_title reference
  parts = reference.split(/, [\']?/,2)
  artist = parts[0]
  if parts[1] =~ /.*'$/
    song = parts[1].chop
  else
    song = parts[1]
  end
  return artist, song
end

def get_next doc
  nextNode = doc.xpath("//a[@class=\"listPaginationControls next\"]")
  nextHref = nextNode.xpath("@href")
  nextItem = $root + nextHref.text
  if nextItem == $root || $recordNum > 501
    nextItem = nil
  end
  return nextItem
end

puts "Starting with: #{topOfList}"
$f = File.open('RSSongs.csv','w')
$f.puts csvHeader unless $f.nil?
nextItem = pull_song topOfList
while nextItem
  nextItem = pull_song nextItem
end
$f.close unless $f.nil?
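As a quick sanity check (not part of the script above), the split_title convention can be exercised standalone on a reference string in the "Artist, 'Song'" form the song pages use:

```ruby
# Standalone copy of the script's split_title, for illustration.
def split_title reference
  parts = reference.split(/, [\']?/, 2)
  artist = parts[0]
  # The song title may carry a trailing quote from the page; chop it off.
  song = parts[1] =~ /.*'$/ ? parts[1].chop : parts[1]
  return artist, song
end

artist, song = split_title "Smokey Robinson and the Miracles, 'Shop Around'"
puts "#{artist} / #{song}"
# prints "Smokey Robinson and the Miracles / Shop Around"
```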

The Ruby - Tableau Partnership

OK, it's not really a partnership. Neither of the actors knows of the other, so it's more like a happy (from my perspective) convergence of functionality, but that makes a lousy section heading.

RSSongs.csv

The common element in this data mining system is the CSV file "RSSongs.csv". The Ruby script creates it; Tableau accesses it and makes its contents available for analysis. There's nothing magical about the name "RSSongs.csv" either; any other would do, but I had a bigger plan, and sensible naming is always a good policy.
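For reference, here's the shape of the file the script writes, and how Ruby's standard CSV library reads it back. The record shown is a hypothetical sample in the script's output format, with the link elided:

```ruby
require 'csv'

# The header the script writes, plus one sample record in its output format.
sample = <<-CSV
Record #,Reference,Rank,Artist,Song,Link
"1","Smokey Robinson and the Miracles, 'Shop Around'",500,"Smokey Robinson and the Miracles","Shop Around",http://www.rollingstone.com/...
CSV

# headers: true makes each row addressable by column name.
rows = CSV.parse(sample, headers: true)
rows.each { |r| puts "#{r['Rank']}: #{r['Artist']} - #{r['Song']}" }
```

Note that the quoting matters: the Reference field contains a comma, so the script wraps it (and the other text fields) in double quotes, which both CSV and Tableau's text-file connector handle correctly.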

Go ahead – give it a go.

I hereby release the Ruby code into the wild. It's free, as in speech and beer. Take it. Run it and get your own data set of the Rolling Stone 500 GREATEST SONGS OF ALL TIME.

The usual caveats.

RSSongs.rb works, but it's definitely not bulletproof and hardened. Run from my home it purrs along perfectly. Run from my hotel before TCC 2012, it grabbed a bunch of the songs and then failed, with an error that I think has to do with proxies or the specification of a User Agent in the URL open (see here), but I haven't had the time to dig into it.

I hope it works for you, but I make no guarantees. If you use it and make improvements, I hope you'll post them back here as comments so I can learn from them, and hopefully other people can benefit from them too.

The Rest Of The Story

Mining these songs was a really interesting exercise, one that freed me from the shackles of a too-rigid conception of what Tableau-analyzable data is. Once I got this little project working I found myself wanting more and richer information to explore and investigate.

There will be more articles in this vein, starting with an examination of this Ruby code and a description of the process of writing it.

31.8.12

Discussing Deciphering Tables #1

This post began as a comment to the previous post, Tao of Tableau - Deciphering Tables #1 - Simple Features, that got too long (Blogger only accepts comments of up to 4,096 characters of HTML). It started off continuing a thread Joe Mako began with his insightful comment on that post, and then it grew its own legs.

One of the things that I loved about Tableau right from the start is that it made analytical data access and visualization really easy and straightforward. And it got so many things so very right: drag-and-drop and double-click data organizing; using the right type of chart; highly effective color schemes; good data-ink ratio; and so on.

But Tableau, as good as it is, as much of a leap forward as it was, isn't perfect. There are some things that could, that should, be better. For example, I've always thought that "Columns" and "Rows" aren't the best terms for the shelves: "Across" would be better than "Columns", but there's no equally good simple replacement for "Rows" ("Down" is close) as long as the shelf itself is horizontal, and re-orienting the shelf imposes its own UI space and usability problems. But I digress. A bit.

It almost feels like trivial and senseless grumpiness to point out what are in one sense minor blemishes on such a lovely complexion. But then again, they stand out because their surroundings are so clean and clear and good.

It's sort of like warts on a toad. Nobody cares about a particular wart on a toad because it's already full of them. Any particular wart is pretty much indistinguishable from its neighbors. (I know. They're not really warts. It's poetic license.)

But Tableau was hatched pretty much wart-free. So those it does have look and feel, well, warty when one comes across them.

The examples "Naked viz" (the worksheet framing lines vanishing when displayed in a dashboard) and "2H Cell Table" (when the Column Dividers are set to "None" in a single-Dimension organization) are like this. In isolation not too bad, maybe even an OK choice for the particular case, but as exceptions to Tableau's otherwise clear complexion they stand out as oddities, anomalies. Warts.

About "Naked viz".

From a larger design point of view, the presence of the "Drop field here" flags and framing lines (hints) in an empty Worksheet are enormously helpful in alerting the User to actions they can take to achieve something. I very much like it.

The design breaks down in the larger context of the same worksheet's presentation in a dashboard. Worksheets and dashboards are different environments, and need different presentations. In one sense this discussion is about what the differences should be, and why they should be that way.

Tableau's Design Paradigm

Tableau Software has been very clear in their desire to provide a single environment for creating and consuming Tableau analytics. There are very strong and good reasons for this philosophy.

However, in not displaying the worksheet hints in a dashboard presentation Tableau is violating their basic premise of not separating development/authoring and consuming into distinct operational modes. Unless of course one considers that worksheets and dashboards are different environments, each with its own separate mode, in which case we're discussing what the different presentations of the same viz in the different environments should be.

Fair enough. (But it feels like a thin argument, and once there's a crack in the ice...)

Dropping the "Drop field here" hints in dashboard worksheet presentations makes good sense because there's no dashboard mechanism to add content to worksheets. If the functionality isn't there it makes little sense to hint for it.

I don't like the dropping of the framing lines because, to me at least, it says "there's nothing here", which is a very different message than "this is a place for a viz to be but there's no data being presented". (I wanted to say vizzed but that's pushing it).

There is a great deal of value in providing a consistent presentation for the same thing, to the limit of it making sense, across a product. One of Tableau's great attractions when it came out was that this pretty much held true. It continues to be pretty much the case for Tableau Desktop—in this discussion we're contemplating what differences make sense for a couple of specific cases. On the other hand, Tableau Server has far too many different presentation idioms for similar things; this makes it much less approachable, much more difficult to become proficient with than it should be.

I believe that there is a legitimate need for separating authoring and consuming activities, e.g. the authoring features of a dashboard are intrusive and annoying for someone who's only consuming it. And that opens an entirely different can of worms of another color.

About Smart Defaults.

I'm in full agreement with Joe about Tableau's double-edged nature. When it works, and the great majority of the time it works beautifully, it's a marvel and a delight. It's pruning off the warts that gets a bit vexing.

It really would be useful to be able to configure the smart defaults. It would complicate things, but if it were done smartly the burden wouldn't be too great. I can do a lot of this stuff today by hacking the XML, but that is risky and grows wearisome.

Why care about these things?

As much as I love Tableau, and I really do, making my living as a BI/Tableau consultant, I'm aware that its biggest danger is becoming one of the products it disrupted. Its singular grace (approachability, usability, low friction) is fragile and easily neglected. If Tableau doesn't take care to continue expunging its warts, or if you prefer, the sand in its gears, it will inevitably lose ground to competitors who embody what made Tableau successful: being faster, easier, less expensive, more effective tools for accessing, organizing, and visualizing data.

Note: this post has expanded into territory better suited to the Tableau Friction post, so I'm moving it over there.

24.8.12

Tao of Tableau - Deciphering Tables #1 - Simple Features

Tableau is extremely powerful, flexible, and easy to use for creating tables. Drag a measure into the pane, a dimension or two into the Rows and/or Columns, maybe double-click another measure, two, or three, and presto! voila! you have a nice table containing the measures' values sorted and organized according to the dimensions' members' default sorting.

Easy, nice, neat, and tidy.

Except that it's not quite so simple. Tableau has a particular philosophy about how tables should be decorated, and this isn't always exactly in tune with what many people expect. This post is the first in a series that examines Tableau's table layout schemes and provides some (hopefully) useful clarity to the whole endeavor, with guidelines for achieving the table presentations you need.

First up, the embedded Tableau Public-published workbook below shows a simple series of tables with the default Tableau presentation along with some notes on interesting aspects. As is pretty clear, for a simple set of very closely related tables there's quite a variety in their presentation.

18.8.12

On Cognitive Fluency and BI

This article from The Atlantic considers the causes of popular misquotations. It contains the following observation:

"Our brains really like fluency, or the experience of cognitive ease (as opposed to cognitive strain) in taking in and retrieving information. The more fluent the experience of reading a quote—or the easier it is to grasp, the smoother it sounds, the more readily it comes to mind—the less likely we are to question the actual quotation."

Cognitive fluency is the hallmark of good BI. Information presented well is easily observed and understood; the barriers to comprehension are low. The same information presented poorly, or even less well, is not as readily absorbed.

Cognitive fluency in BI tools has two primary facets:

  • In the tool itself — the tool must be as simple and obvious to operate as possible in order to minimize the cognitive load on someone seeking to analyze data;
  • In the analytics created with the tool — they must conform to the well-established standards for analytical information design. The defaults must embody AID best practices and the ability to create suboptimal or ineffective analytics must be, if present at all, subordinate and require more effort.