Better BI: November 2012

12.11.12

Comment on Dashboard design philosophies (vis-a-vis Tableau).

Motivation

This post is motivated by a discussion Steve Wexler started on LinkedIn regarding Stephen Few's recent dashboard design contest. See the discussion here.

Stephen Few's dashboard design contest.

The announcement is here.
The winner is announced here.

Steve Wexler

Steve Wexler is a prominent member of the Tableau community who's been a strong advocate for Tableau and high quality dashboard design, and an innovator who's been instrumental in the popularization and adoption of techniques that enhance and improve Tableau dashboards' communicative and interactive abilities.
Steve's blog is here.
His article on the winner of Few's dashboard competition is here. In it, he wrote:
I’ve been troubled by both the premise and results of Stephen Few’s dashboard design competition as I think it celebrates the number of disparate elements that can be crammed onto a single dashboard rather than champion the display and positioning of the elements that really need to be present (see http://www.perceptualedge.com/blog/?p=1374).

I see it somewhat differently: the either-or choice implicit in Steve's framing is a false dichotomy. It's possible, even desirable, to build dashboards that are both information-dense and highly effective at communicating the most salient information. I'll go further: my personal belief is that any dashboard that doesn't do both is excused only on the grounds that the medium it's presented in compromises one or both principles.

Dashboard design schools.

There are a couple of schools of thought in dashboard design that split along the dimensions of size, scale, and density of information presentation.
Please note that as far as I know there aren't any official schools—this is only how I see things, and I don't speak for Edward Tufte or Stephen Few.

The traditional school.

Stephen Few is, I believe, a member of the analytical information design school that holds that effective designs can, even should, be large enough to present all the information relevant to understanding the particulars and context of the situation under consideration. Edward Tufte is the most famous member of this school.

Members of this school advocate the design of rich, comprehensive, highly detailed dashboards that convey the full span of information someone needs to have simultaneously visible in order to fully comprehend the situation.

This design sensibility assumes that the dashboard media are capable of such presentations. In his workshops Tufte has talked about the roughly half-million data points available in simple technologies like 11"x17" paper that can be printed on any standard business printer, or presented on monitors with equivalent size and pixel density.

The new school.

On the other side is the small-format school, which emphasizes designing dashboards that render effectively in small formats, e.g. laptops, iPads, and other mobile devices, whether delivered through Web browsers, blogs, or native apps. This school favors a few clear visualizations coupled with rich and effective interactivity, so that the user can easily browse through an information space exceeding the device's display dimensions.

The divide between the schools seems largely a consequence of the timing of entry into the marketplace of the small-format school. People whose first or highly predominant exposure to analytical information design came from building dashboards with one, two, maybe three or four charts for delivery in blogs or on laptops or iPads, quite naturally assume that their design paradigm is the right and natural one. Those who came to dashboard design in an earlier time sometimes look askance at designs that needlessly waste or misuse space that could be profitably used to communicate information.

Effective dashboard design challenges have substantive commonalities and differences in both camps. Each has its place, and its strenuous advocates. In many cases the believers in one form aren't aware of, or don't believe in, the strengths and advantages of the other. For example, large-format designs provide the opportunity to keep related information in sight, but they're largely unworkable on small-format electronic media where the consumer needs to look at the information through a small viewport, however easily scrollable. Small-format designs cannot take advantage of the full range of the human visual field: our binocular field of vision can take in more information than can be presented in small formats.

I believe that the large-format camp is more suitable for the Teacher-in-class scenario in Few's contest. Being able to print out a double-sided 11"x17" page with information front and back would provide the information the teacher needs in a form that s/he could keep readily at hand, and put into a logbook at day-end that would provide the historical record over time. Tufte talks about this scenario—when I took his workshop in D.C. he called it the daily briefing sheet for the Admiral; there was a large military contingent in the audience and I suspect he tailored the example to fit.

About Tableau's Dashboard Design Philosophy

–or– to which school does Tableau most naturally align?

Tableau's design strength lies in the small-format world: dashboards with a limited number of charts and tables, where interactive features take precedence over the expansive use of space for information presentation. I'm not privy to Tableau's design philosophy in this realm, but I think it's a natural outcome of its inception and evolution in (relatively) small media, and with the drive to incorporate social media and mobile devices the pressure will be to push improvements out into this space.

I keep hoping that Tableau will incorporate the features that directly address the needs of information-dense designs.

Chief among these is a sophisticated dashboard layout manager capable of control and precision in the placement and behavior of the components, both individually and collaboratively. Tableau has come a long way in its dashboard design abilities, but it still has far to go. There are plenty of excellent layout managers in the world to use as models.

Another big ask is for better bullet graphs and true sparklines. Tableau's bullet graphs are overweight and bulky; they need a slimming and toning program. On the positive side, any such program will also likely improve Tableau's mark sizing and control abilities, which will upgrade the visual effectiveness of other presentations, particularly bar charts. Simulating sparklines with Tableau line graphs yields only a coarse approximation of the clarity and crispness of Tufte's original sparkline design and specifications. Again, improving Tableau in this area should provide ancillary benefits in other visualization types.

First thoughts post-TCC 2012

The annual Tableau Customer Conference, TCC 2012, was a lot busier and more jam-packed than I thought it was going to be. Last year in Las Vegas there were 1,400 attendees; this year 2,000 people came. I was on the go from early on until quite late, and barely managed to squeeze in my birthday dinner, and that a day late.

Tableau people are great ambassadors.

As always, everyone from Tableau was engaged, helpful, and committed to helping the attendees have a successful experience. Having worked for a BI software vendor, and with experience of many others, I find the enthusiasm and passion, the sheer verve, with which the Tableau people approach their work, and their appreciation for their customers, really good to see.

The game-changing announcement.

This year the big news was transformational:

In-browser, on-Server editing of published Workbooks.

Dashboards and Worksheets can now be edited without downloading them.

This is going to shake up the entire Tableau world once it starts bubbling out. There was a demonstration of it during the main keynote, and it looks great. It opens up the door to all sorts of new possibilities. It also has the potential to dramatically alter the licensing and governance landscapes. I didn't see any information on how they will be affected beyond some recognition that the permissions system will need to accommodate the new Use Cases.

Other stuff.

New chart types

Several new chart types were demoed:

treemaps are really going to be handy and useful;
treemaps doubling as bar charts open up exciting new opportunities;
bubble charts are exciting, particularly if they work with pages and have trails a la Hans Rosling (http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html);
word clouds are interesting and certainly crowd-pleasing, particularly for the social mediaistas.

Forecasting

Tableau now offers forecasting: given an existing time series data set, Tableau will extrapolate into the future. The presentation was very cool: a ragged, highly variable series was projected forward, and the projected data was a good visual match. They told us that there's a lot of modeling going on behind the scenes. I'm going to be very interested in seeing how this plays out in real-world scenarios, where today's projected data can be captured and, as time passes, compared to historical fact. Will there be a feedback opportunity to improve forecasting over time?

Desktop

Tableau Desktop has a new Marks card that appears to clarify the visualization characteristics for improved interaction and configuration, making it easier to get what you want.

The People

I didn't get to meet everyone I wanted to, or spend enough time with the people I did manage to get together with. But I did get to spend some really good and fruitful time with some of the people I have tremendous respect for. The Tableau community is one of its great strengths, and I'm frequently humbled by the range and depth of the people who make it up.

That's about it for now. More will bubble up once I get ruminating on it.

5.11.12

Unhide that Worksheet!

Cross-posted from Tableau Friction. Please head over there for a look at how easy it is to unhide the Worksheets in your Workbooks.

4.11.12

Mining the Web: Ruby + Tableau = Data Gems from Raw Content

Mountains of information.

The web is chock full of information. Completely, utterly stuffed full of interesting facts about pretty much everything and anything. Most of the information is not in the traditional record-set form of data.

Tableau and the Web

Tableau is a superb tool for exploring record-set data, and for communicating the interesting insights that can be refined from it.

Happily, Tableau is able to recognize some of the Web's information as data and make it available for examination, inspection and exploration.

Sadly, Tableau's abilities in this area are pretty limited, e.g. copying and pasting the contents of HTML tables, and not any old HTML tables, only those that are "well-formed" in a narrow sense particular to Tableau (see here).

Mining the Web

Fortunately, pretty much anything on the Web that's regularly formed can be mined and refined into data that Tableau can understand. Tunneling through the Web and extracting the information gems and processing them into data is much easier than is commonly thought, if the right tools for the job are used.

Enter Ruby

Ruby is a terrific tool, as proficient in its domain as Tableau is in its. Ruby was designed to remove the barriers between programmers and their ideas and goals in very much the same way Tableau was designed to remove the barriers to data visualization.

Accessing Web content with Ruby is simple and easy, and transforming the content into data is similarly straightforward, dependent of course upon the complexity and regularity of the content.

"Show Me," you say?

You want proof. Fair enough.

A while back I was listening to some music and browsing through Rolling Stone's 500 GREATEST SONGS OF ALL TIME for inspiration and reminiscing. Each song has its own page, with lots of interesting information and buttons to the previous and next songs in the list.

Here's song #500: link.

As I was browsing through the songs I started keeping a list of them, copying and pasting their names, rank, and artist into a spreadsheet so that I could use Tableau to organize them and answer questions like: which artists had the most songs in the list; how many songs did the Stones have in it; and so on and so forth, and such like.

I wanted to be able to use Tableau to help me answer these questions, along the lines of:

It didn't take long before I was annoyed with all the copying and pasting. It was a huge pain: boring, mechanical, slow, error-prone, and unrewarding. Worse, it detracted from my listening enjoyment.

I needed a better way.

What if, I thought to myself, the information in the individual pages is coded consistently across the pages? If so, it should be possible to dig into them and extract the songs' information and capture it as data in a form that Tableau can recognize.

As it turns out, the individual songs' pages are coded consistently and it was pretty easy to write a Ruby program to do exactly what I needed. Here it is:

RSSongs.rb


require 'rubygems'
require 'nokogiri'   # Nokogiri is an add-on, installed with: gem install nokogiri
require 'open-uri'

$root      = 'http://www.rollingstone.com'
csvHeader  = 'Record #,Reference,Rank,Artist,Song,Link'
$recordNum = 0

HEADERS_HASH = {"User-Agent" => "Ruby/1.9.3"}

topOfList = 'http://www.rollingstone.com/music/lists'\
            '/the-500-greatest-songs-of-all-time-20110407'\
            '/smokey-robinson-and-the-miracles-shop-around-19691231'

def pull_song songPage
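   # URI.open is needed on Ruby 3.0 and later; the original 1.9.3-era call was open(songPage, HEADERS_HASH)
   doc          = Nokogiri::HTML(URI.open(songPage, HEADERS_HASH))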
   reference    = doc.xpath("//div[@class=\"listItemDescriptonDiv\"]/h3").text
   artist, song = split_title reference
   place        = doc.xpath("//span[@class=\"ListItemNumber\"]").text
   csvRecord    = "\"#{$recordNum += 1}\","\
                  "\"#{reference}\","\
                  "#{place},"\
                  "\"#{artist}\","\
                  "\"#{song}\","\
                  "#{songPage}"
   $f.puts csvRecord unless $f.nil?
   get_next doc
end

def split_title reference
    parts  = reference.split(/, [\']?/,2)
    artist = parts[0]
    if parts[1] =~ /.*'$/
       song = parts[1].chop
    else
       song = parts[1]
    end
    return artist, song
end

def get_next doc
   nextNode  = doc.xpath("//a[@class=\"listPaginationControls next\"]")
   nextHref  = nextNode.xpath("@href")
   nextItem  = $root + nextHref.text
   if nextItem == $root || $recordNum > 501
       nextItem = nil
   end
   return nextItem
end

puts "Starting with: #{topOfList}"

$f = File.open('RSSongs.csv','w')
$f.puts csvHeader unless $f.nil?

nextItem = pull_song topOfList
while nextItem
     nextItem = pull_song nextItem
end

$f.close unless $f.nil?
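
To see what split_title does in isolation, here is a small sketch that runs the same logic on two sample reference strings. The strings are assumptions about the pages' "Artist, 'Song'" heading format, typed by hand for illustration, not captured output. The second one hints at how much the approach depends on the regularity of the content.

```ruby
# Same logic as split_title in RSSongs.rb, repeated here so the sketch runs on its own.
def split_title(reference)
  # Split once, on the first ", " plus an optional opening single quote.
  parts  = reference.split(/, [']?/, 2)
  artist = parts[0]
  song   = parts[1]
  # Drop the closing single quote, if there is one.
  song = song.chop if song =~ /'$/
  return artist, song
end

# Sample headings: assumed format, not copied from the site.
regular = "Smokey Robinson and the Miracles, 'Shop Around'"
tricky  = "Crosby, Stills, Nash and Young, 'Ohio'"

p split_title(regular)
p split_title(tricky)   # splits at the first comma, so the artist is cut short
```

An artist name that contains a comma is split in the wrong place, which is the kind of irregularity worth checking for once the data is in Tableau.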

The Ruby - Tableau Partnership

OK, it's not really a partnership. Neither of the actors knows of the other, so it's more like a happy (from my perspective) convergence of functionality, but that makes a lousy section heading.

RSSongs.csv

The common element in this data mining system is the CSV file "RSSongs.csv". The Ruby script creates it. Tableau accesses it and makes its contents available for analysis. There's nothing magical about the name "RSSongs.csv" either; any other would do, but I had a bigger plan and sensible naming is always a good policy.
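
One fragile spot in hand-built CSV lines is a song title or artist that itself contains a double quote or a comma. The following is a minimal sketch, assuming Ruby's standard csv library is an acceptable substitute for string concatenation. The sample rows and the example.com links are made up for illustration.

```ruby
require 'csv'

HEADER = ['Record #', 'Reference', 'Rank', 'Artist', 'Song', 'Link']

# Hypothetical rows, including one with an embedded double quote and comma.
rows = [
  [1, "Smokey Robinson and the Miracles, 'Shop Around'", 500,
   'Smokey Robinson and the Miracles', 'Shop Around', 'http://example.com/500'],
  [2, %q(Some Artist, 'The "Big", Hit'), 499,
   'Some Artist', 'The "Big", Hit', 'http://example.com/499']
]

# CSV.generate quotes fields and doubles embedded quotes where needed.
csv_text = CSV.generate do |csv|
  csv << HEADER
  rows.each { |row| csv << row }
end
puts csv_text

# Reading the text back recovers the original field values.
parsed = CSV.parse(csv_text, headers: true)
puts parsed[1]['Song']
```

Writing to a file instead of a string works the same way with CSV.open('RSSongs.csv', 'w').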

Go ahead – give it a go.

I hereby release the Ruby code into the wild. It's free, as in speech and beer. Take it. Run it and get your own data set of the Rolling Stone 500 GREATEST SONGS OF ALL TIME.

The usual caveats.

RSSongs.rb works but it's definitely not bulletproof or hardened. Running from my home it purrs along perfectly. Running it from my hotel the day before TCC 2012, it grabbed a bunch of the songs and then failed with an error that I think has to do with proxies or the specification of a User Agent in the URL open (see here), but I don't have the time to dig into it.
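
Since the cause of the hotel failure was never pinned down, one cautious option is to wrap each page fetch in a retry with a short pause, so that a transient network error does not end the whole run. This is a minimal sketch, not a diagnosis: with_retries is a hypothetical helper that is not part of RSSongs.rb, and the flaky block below stands in for the page fetch in pull_song.

```ruby
# Hypothetical helper: run the block, retrying on error up to `attempts` times.
def with_retries(attempts: 3, pause: 0.1)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError => e
    raise if tries >= attempts          # out of attempts: re-raise the last error
    warn "attempt #{tries} failed (#{e.message}), retrying"
    sleep pause
    retry
  end
end

# Stand-in for a page fetch that fails twice before succeeding.
calls = 0
page = with_retries(attempts: 3, pause: 0) do
  calls += 1
  raise IOError, 'connection reset' if calls < 3
  '<html>song page</html>'
end

puts "fetched after #{calls} calls: #{page}"
```

If the underlying problem is a proxy or a rejected User Agent, retrying will not help, but the final error is still re-raised so the cause stays visible.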

I hope it works for you, but I make no guarantees. If you do use it and make improvements I hope that you'll post them back here as comments so I can learn from them, and hopefully other people can benefit from them too.

The Rest Of The Story

Mining these songs was a really interesting exercise, one that freed me from the shackles of a too-rigid conception of what Tableau-analyzable data is. Once I got this little project working I found myself wanting more and richer information to explore and investigate.

There will be more articles in this vein, starting with an examination of this Ruby code and a description of the process of writing it.