Chapter 5. Putting the Dash into Dashboards
In this chapter, we will cover the following topics:
- Choosing your visualization
- Using parameters in dashboards
- Using custom geocoding in Tableau
- Profiting from Big Data to rev your visualization
- Filtering your data for focus
- Creating choices in dashboards using conditional logic
Introduction
Dashboards are more than visual tools to display data; they are tools that can help to move business forward. Dashboards are used as decision-making tools to obtain results quickly.
You can help your business users make quick decisions by producing dashboards that are in line with the current research and thinking of dashboard structure. You can also get results from dashboards by performing the following actions:
- Improving the availability of data
- Facilitating the user's understanding of the data quickly
- Sharing information with team members and beyond
- Opening accessibility to users via adaptability
- Allowing flexibility for users to add notes to their dashboard
Choosing your visualization
Visualization is about democratizing the data and making it accessible to the people who need to know.
Today, there are many hot trends in both consumer and enterprise technology that increase accessibility by being highly visual. Think of the popularity of iPads, Surfaces, eReaders, and large screens. Everyone wants their data in the best resolution possible, with crisp graphics and colors.
Executives are engaging with the charm of visualization and putting it firmly onto enterprise business intelligence roadmaps. According to a survey by Howard Dresner, the extensive use of color, size, shape, and motion were more appealing than other buzzwords such as Big Data and the cloud. A study by Dresner Advisory Services found that advanced visualization and dashboards ranked high in terms of importance.
Why is visualization so useful? It's more than pretty pictures. Data visualization helps us to understand meaning in data via a communication medium that we are "geared" towards every day—our vision. Through discrimination and effectiveness of data visualizations, we reach insights and decisions. Technology allows us to create magic with our data, which engages us towards better decision making.
Given its power, how do we choose which is the right visualization? Throughout this book, we will talk about different visualization choices as we proceed. Because we are looking at dashboarding, we will look at dashboarding features such as KPIs.
In this recipe, we will look at different considerations when choosing your visualization. We will look at some of the default settings of Tableau and how they are affected by color blindness. We will also look at sparklines, which aim to provide as much context in as small a space as possible. This will be very useful in creating dashboards.
Getting ready
For the exercises in this chapter, take a copy of the Chapter 4
workbook and name it Chapter 5
. This workbook already has the data for the exercises in this chapter, so we do not need to make any changes to the data. We will delete all of the sheets except the KPI by Q
sheet. The following is an example of how the workbook will look:
How to do it...
- To start, let's rename the worksheet to
KPI Summary
by right-clicking on the tab at the bottom of the worksheet and selecting Rename Sheet. - Drag SalesTerritoryCountry to the Rows shelf.
- Drag Difference between Actual and Target to the canvas area.
- The boxes will appear colored, but you won't see any text. Drag Difference between Actual and Target to the Label button.
- Drag SalesTerritoryCountry to the Label button.
- Click on the Label button and then click on the button with ellipses next to Text.
- In the Edit Label textbox, use the Insert drop-down box to choose the field names that should be displayed. This will read as follows:
<SalesTerritoryCountry> <AGG(Difference between Actual and Target)>
- Click on OK.
- Let's duplicate the worksheet by right-clicking on the tab at the bottom and selecting Rename Sheet.
- Rename the duplicated worksheet to
KPI by Q
. - Drag Year(FullDateAlternateKey) to the Columns worksheet.
- We will start by creating a very simple KPI at first using the
KPI by Q
worksheet. To do this, we will create a new calculated field by going to the Analysis menu item and selecting the Create Calculated Field… option. - In the Calculated Field textbox, enter the following formula which you can see here:
IF [Difference between Actual and Target] > 0 THEN "Above" ELSEIF [Difference between Actual and Target] <= 0 THEN "Below" ELSE "Not Known" END
- The following screenshot shows you an example:
- Once you've created your calculation and returned to the main Tableau canvas, you'll see that there are five buttons in the Marks shelf. We can add a sixth button, Shape. To do that, go to the drop-down list below Marks and select Shape. You can see an example in the following screenshot:
- When you select Shape, you can see that we now have a sixth button called Shape. Your screen should now show the Shape button, as you can see in the following screenshot:
- You can then click on the Shape button to edit the shape's style and appearance. Drag the Difference between Actual and Target calculation onto the Shape button.
- When you click on the Shape button, you get a dialog box that offers you a number of options. Let's choose the Edit Shape option, which brings up the following dialog box:
- Here, we have selected the default KPI palette. A more traditional KPI selection selects the green tick for the Above option, which identifies the metrics that exceed the KPI success criteria. It follows that the Below option is indicated by a red cross. Finally, the Not Known criteria identifies the areas where there is no value present; in other words, it is a
NULL
value. - Remove the color marks from the Marks shelf. At this point, you will probably have AGG(Difference between Actual and Target) as a color mark, but this will prevent the KPI shape from working properly.
- Remove the Label marks from the Marks shelf.
- Drag Action(SalesTerritoryCountry) from the Sets pane in the Data sidebar to the Filters shelf. On the downward arrow at the right-hand side of the Action(SalesTerritoryCountry) pill, select Show In/Out of Set.
- On the dialog box, select In and then select OK.
- Once you have selected this option, click on OK. Once you have done this, the visualization will appear as in the following screenshot:
- Although we have created a KPI image, we would like to change it in order to cater to people who have color blindness. How can we show a KPI so that color blindness is taken into account?
- Let's take a look and see what Tableau does when we try to create a sparkline. To do this, duplicate the existing visualization so that we have a "point in time" of our work to date.
- Make sure that the SalesTerritoryCountry attribute is in the Rows shelf. Then, drag the Difference Between Actual and Target calculation to the Rows shelf.
- Select a line graph visualization from the Show Me panel. Make sure that the SalesTerritoryCountry attribute is in the Rows shelf.
- Drag the Difference Between Actual and Target calculation to the Color button.
- The Tableau visualization appears as follows:
- This still leaves us with the problem of showing data that isn't there—in other words, the
NULL
data. While we are building up to a sparkline, it can look misleading because the data looks as if it starts from 0 and curves upwards; in fact, there isn't any data there for the highlighted marks, which you can see in the following screenshot. We can opt to Hide the Not Known data. - First things first, however, let's fix the colors so that they are more appealing. To do this, click on the AGG (Difference between Actual and Target) metric and select the Edit Colors… option, as shown in the following screenshot:
- Since we have hidden the Not Known data, we only have the Above and Below data to worry about. For now, let's choose blue for the Above value and gray for the Below value. You can see a sample selection in the following screenshot:
- Now let's see how the completed visualization looks once we have resized it, as shown in the following screenshot:
To summarize, in this recipe, we have started to look at some dashboard visualizations. We will progress to look at others throughout the rest of the chapter.
How it works...
In this recipe, we have looked at some of the default settings in Tableau for creating KPIs along with some options for configuring them. It is clear that we can drastically change the appearance of the visualization by simply making a few changes to the default settings.
When creating data visualizations, it is vital to remember the audience. This may seem a simple, obvious statement. However, how many charts have you seen that mix red and green together? A lot, probably! In this recipe, we looked at alternatives to using red and green to convey a message. Here, we have made a color choice based on the business question. We went from a straight red-and-green visualization to one which used blue to represent the data which was the focus of the business question. There are a few issues with our initial KPI though. First, a NULL
value is represented by an icon when there is no data. It would be better to simply not show anything at all; this would reduce unnecessary "ink" on the page and would require less effort to assimilate.
Another issue is that the image shows red and green on the same visualization, which isn't good for color-blind viewers. Although additional information is provided by the shape of the indicator, it's probably best to simply avoid it if possible. At that point, we could have created only a simple, red KPI to denote the data points where the target was not met and then moved on to a different topic.
Instead, we then changed from using a shape to represent the data to a sparkline. Sparklines were devised by Edward Tufte in order to express a "small, intense, simple, word-sized graphic" to represent data. Sparklines are supposed to be able to be embedded into sentences. They are used to express data in a very compressed, at-a-glance way, which means that they are perfect for dashboards.
There's more...
Red and green KPIs are popular, but they are not always the best solution. Sparklines can give us more context by providing more information over time, but this can be misleading in certain cases. In the next recipe, we will look at ways in which we can improve the sparkline to manage this scenario.
Using parameters in dashboards
Parameters are dynamic "placeholders" that can help to control the dashboard appearance by storing information to help drive the flexibility of the dashboard.
We can use parameters to make our visualizations more accurate by using them to control which data is to be displayed and which is to be hidden in response to user input. In this recipe, we will look at using parameters in order to drive the display and enhancing the accuracy of the presentation by controlling the inclusion of NULL
values in the dashboard.
If NULL
values are present in the data, then this might cause a misleading representation of the data. On the other hand, NULL
values might be useful because they could tell us about the quality of the data itself. Sometimes, the data visualization project turns into a data quality project. Often, when data is visualized, it is the first time that business users have seen their data. This means that they can sometimes get a nasty surprise! Data can be missing, incomplete, or perhaps plain wrong. Therefore, it is important that dashboard consumers understand about data quality issues rather than being distracted by the shininess of technology.
Ultimately, the effectiveness of the data visualization rests on the accuracy of the data. In other words, even if the dashboard is perfectly formed, it will be no good if it shows inaccurate data.
In this recipe, we will use a parameter to give the user the opportunity to reveal or hide some data, dependent on the user's choice. This recipe is quite involved, because it covers parameters, dual axes, calculated fields, and some aspects of data visualization. So let's get started!
Getting ready
For the exercises in this recipe, let's continue with our Chapter 5
workbook so that we can see the progression from the initial KPIs to the end of the chapter. So, let's take the KPI by Q
worksheet and make a copy of it; we will rename it KPI Shapes
.
How to do it...
- Once you've made a copy of the
KPI by Q
worksheet and renamed itKPI Shapes
, let's create our first parameter. This is very easy to do; simply go to the Measures box at the sidebar of the Tableau workbook. Right-click anywhere inside this box and you will get a pop-up menu. You can see an example of this pop-up menu next. When it appears, simply click on Create Parameter..., which is highlighted in the following screenshot. - Once you have clicked on the Create Parameter option, you will get a dialog box named Create Parameter. This allows you to configure the parameter. We will set up a parameter that will allow the users to choose whether or not they want to display Null values.
- First, let's give our parameter a name so that its purpose is explained precisely. We will call it
Show Nulls
. This parameter is very simple. It is set to1
ifNULL
values are to be shown and set to0
if theNULL
values are to be hidden. Since we are using integers as a setting, we should keep the Data Type setting as Integer. The Current Value option gives the parameter a value as a starting point. Once you have completed these fields, your Create Parameter dialog box should appear exactly as shown in the following screenshot: - We need to set up a calculation to control the parameter. To do this, create a new calculated field by right-clicking in the Parameter box again and choosing the Create Calculated Field option. We will use the calculation to make a rule that will drive the parameter. Our rule will specify that if the parameter setting is to show the
NULL
values, the parameter value is set to1
. This will display the second copy of the difference between the Actual and Target measure. If it is set to0
, only then will the line graph show on its own, which will show theNULL
values as well as the actual data. - You can see our calculated field in the next screenshot. The calculated field is called Difference (Show or Hide). The calculated rule incorporates a rule that says that if the parameter Show Nulls is equal to
1
, then show the Difference Between Actual and Target metric. If Show Nulls is not equal to1
, then the rule fails, and it does not show anything at all. The following is an example of the calculated field: - Once you have created the calculated field, click on OK, and this will take you back to the Tableau workbook. Then, drag the Difference between Actual and Target measure and Difference (Show or Hide) to the Rows shelf, making sure that SalesTerritoryCountry is also on the rows. Then, choose the Dual Line Axis option from the Show Me panel. Your Tableau worksheet will look similar to the following screenshot:
How can we clearly differentiate where the actual data points are? The problem with the dual axis, as it stands, is that it will start at 0 if there is no data. This is because the axis is aligned to the year and the country, and if there is no data, it will simply map the data as having a value of 0. This can be misleading, however. For example, the data for Germany shows that there is a line showing data for 2005 and 2006, which ends up at a value of £1 million for the year 2008. However, this is a bit misleading; in fact, there was no data for the years 2005 or 2006; there was only data for 2007 and 2008. It would be better if this was clearer to the user.
Let's make the story of the data clearer to the business user by setting the colors and the line chart. This KPI panel is illustrating the data to illustrate an answer to a business question: which countries failed to meet their targets and when? This means that we are interested in emphasizing the losses made. We can do this by coloring these data points in red, which is a color normally used to denote a warning or a loss. Since we are not so interested in data where the countries made their targets, we will use the color gray so that this data takes a "background". Let's do this first for the Difference between Actual and Target data by dragging this measure onto the Color button. This will give us the following Edit Colors dialog box. Although we choose the option for Red-Blue Diverging, if we select the Stepped Color option and set it to 3 steps, then, we can get two different shades of red and one gray color. You can see an example of this setting in the following screenshot:
- Once you have configured the color for the Difference between Actual and Target metric, let's move forward to set the color for the Difference (Show or Hide) metric. Drag the Difference (Show or Hide) metric onto the Color button and you will get the Edit Colors dialog box. This time, we will select the same Red-Blue Diverging option, but we will choose 2 steps in the Stepped Color box, as shown in the following screenshot:
- Now let's make sure that the Difference between Actual and Target metric is set to a line. We can see this because the mark for the Difference between Actual and Target metric has a small line next to it, denoting that it is set to a line chart, as shown in the following screenshot:
- Now, click on the AGG(Difference(Show or Hide)) metric under the buttons, and this will reveal the buttons for editing this metric. In the drop-down list, choose Circle for the AGG(Difference(Show or Hide)) metric. This will distinguish it from the Difference between Actual and Target metric, which is denoted by a line graph, as shown in the following screenshot:
- We can set a border around the circles so that they are defined. At the same time, we can make the color transparent so that we get a layered effect. To do this, click on the Color button and you will get a pop-up menu. Set the transparency to 50 percent and choose a light purple for the border. You can see the options in the following screenshot:
- The last thing we need to do is to show the parameter control, which will give the user the option of explicitly showing the data points that are not
NULL
or leaving the chart as is. To show the parameter control, right-click on the Parameters section and select the Show Parameter Control option from the pop-up list. - The data visualization has now been completed, so let's test it out to see how it looks. In the following screenshot, you can see the parameter control at the top-right corner:
- If you choose the Show Nulls option, then you will get the data points appearing. This allows you to see which points are actual data and which points are
NULL
. - If we choose the Hide Nulls option, then we can see that the United States has engendered a loss which was unacceptable, but the other countries have not. However, this shows the
NULL
values, which assumes that all of the countries have commenced at the same starting point, as shown in the following screenshot: - To summarize, using parameters to drive the data visualization, we can make our dashboards interactive and more sensitive to data quality.
How it works...
To sum up, in this recipe, we have looked at data quality, calculations, parameters, and data visualizations. These are all interesting topics in their own right, and the objective of this recipe was to show that we can put them together in interesting ways in order to produce a dashboard. Tableau allows us to be very creative with our data in order to satisfy user requirements.
How did we use parameters in Tableau? To set up this visualization, we set up a dual line axis which has two measures on it: one is Difference between Actual and Target and the other is a calculated field that has a rule in it, which shows or hides a copy of the Difference between Actual and Target measure. Yes, in other words, we show this measure twice on the dual axis, or show the measure only once, depending on the choice of the user. The difference is in the way in which we represent each copy of the measure. One copy of the measure is a line graph, which is always shown, and the second copy is a dot plot, which only shows the data that is present. The parameter shows, or hides, the second version of the measure in order to show which data points actually exist.
Using custom geocoding in Tableau
Organizations often have their own definitions of geographic data. Although country names stay relatively static, their classification can change as the organization emerges from one level of maturity to another. Sometimes, for example, an organization can start with a very simple division: North America and EMEA. However, as the organization grows, it might split off into North America, Europe, Asia Pacific, and Rest of the World, for example. This can mean that the geography has a business context and meaning as well as describes a physical location.
Since some geography is fairly standard, Tableau offers a default interpretation of certain geographic data to help you automatically create maps from your data. The default interpretation includes countries, states, and area codes, for example. However, Tableau's default interpretation can be tailored to align with the business interpretation of geographic data.
In this recipe, we will add in some customized geographic data by importing a custom file and then using the customization to create a data visualization. The data is taken from the Human Development Index research, which is part of the United Nations Development Programme, which in turn is an organization that has the goal of "advocating for change and connecting countries to knowledge, experience, and resources to help people build a better life." The Human Development Index (HDI) is a new way of measuring development by using metrics such as life expectancy, educational success, and income and combining them into one measure. You can find more information about the HDI metric at http://hdr.undp.org/en/statistics/hdi/.
In this recipe, we will look at importing custom geocoding. One interesting feature of this exercise is how we go about using color to indicate rank.
Getting ready
Let's continue to use the Chapter 5
workbook. We have an amended DimSalesTerritory
to reimport, which contains the HDI rank of each country in the AdventureWorks
database. To do this, replace the existing DimSalesTerritory.csv
file with the DimSalesTerritory.csv
file of the Chapter 5
workbook. If you open the new file, you will see that it contains an additional column: HDIRank
. To refresh the data, simply go to the Data menu option and select Refresh All Extracts. You should see a new column called HDIRank
in the DImSalesTerritory
dimension.
For the purposes of creating data for customizing geocoding in Tableau, you need data that follows a number of rules:
- The filename must be called the same as the key of the data
- The file must be in CSV format
For the purposes of this example, there is a small file that you can download at http://bit.ly/TableauBookCh5HDIRank. The data file contains three columns: a nominal latitude and longitude of the countries contained in the AdventureWorks
database along with their HDI ranks according to the HDI 2013 report.
How to do it…
- Once you have made a copy of the data file, let's open our
Chapter 5
workbook in Tableau and proceed to import the custom data file. This is very simple. Simply go to the Map file menu item, then go to the Geocoding file menu item, and then select the Import Custom Geocoding option. You can see an example of this in the following screenshot: - Once you have selected this option, you will get a small dialog box that asks you for the location of the file. The following is an example of the Import Custom Geocoding textbox:
- When you have navigated to the file, select the Import button and your import is complete. Once you have imported the customized geography, you should be able to see it as part of the geographical role options in the Tableau drop-down list. You can see this in the next screenshot.
- If you go to the DimSalesTerritory measures pane in the sidebar, you will see the new HDIRank column. Drag it to the Dimensions pane to make it a dimension.
- Now, if you right-click on it and look under Geographic Role, you will see we have a new option called HDIRank, as shown in the following screenshot:
- The next step is to assign the HDIRank geographical role to the HDIRank dimension attribute. To do this, right-click on HDIRank, navigate to Geographical Role, and then select HDIRank under Geographical Role. Once you have done this, we will be able to use our HDIRank in our data visualization. You can see that the HDIRank symbol changes to show that this has a custom geographical data role. The following is an example of the symbol:
- For example, we can use HDIRank to drive the color that denotes each country. To do this, drag it to the Color button. Since we are using a rank to distinguish the countries, we can use a sequential palette to show that the data is on a continuum rather than in separate categories.
- In this example, we will select the orange color.
- We can change the colors quite easily using the Edit Color dialog box. You will need to change each color so that the lower values have brighter and darker colors than the lower ranks. The following screenshot is an example:
- Once you have edited the colors, let's use Sales Amount to denote the size. Then, choose the Heatmap option from the Show Me panel and our visualization now looks like the following screenshot:
To summarize, we have used our custom geography to help identify the rank of each country. Since the new role appears in the drop-down list as part of the Tableau interface, it is very easy for report developers to use it as part of their dashboards.
How it works...
Once again, we see that color plays a vital role in conveying the message of the data. In line with research on how to visualize data, Tableau will assume that the lower values should be assigned a less intense color, and higher values should be assigned a darker, brighter, or more intense color. If our data was rational or interval in nature, this would be correct. However, we are looking at ranking data, so the situation is reversed. In other words, the lower the number, the higher the rank.
See also
- If you'd like to see the full HDI 2013 report, you can find it at http://hdr.undp.org/en/media/HDR2013_EN_Summary.pdf
Profiting from Big Data to rev your visualization
We live in a world where everything is Big Data. Many organizations are burdened with too much data, and it is a common problem. The problem is made worse by the fact that many people aren't sure what to do with the data due to its size and complexity. In today's enterprises, data is often in disparate locations as well as growing in size. This situation is reflected in this recipe since it requires a lot of "moving parts" to be put together, such as the downloaded data, the Hortonworks Sandbox, and Tableau.
For the purposes of simplicity and clarity, we will simply use a small amount of data rather than a Big Data source. This will help you to manipulate the data more easily since it is in an accessible format. Often, the key factor in the importance of data is how often it is used and how many business processes depend on the data, rather than its size. So, don't ignore the little data!
If you don't have access to Big Data technologies, don't feel excluded from the party. There is no need for you to skip over this chapter. We will base our example on the Hortonworks Sandbox, which is freely available over the Internet for you to use. Further more, it is already preconfigured for you, so it is the easiest way possible to ramp up towards Big Data for free.
Getting ready
To use Windows Azure DataMarket, you will need a Microsoft account, such as a live account, Hotmail, an MSN account, or others of the kind. This is free to set up if you don't already have one. To do this, visit https://login.live.com/ and look for the Sign Up Now link to follow the wizard through the process.
To learn more about Big Data solutions, a great place to start is the Hadoop Sandbox, generously provided by Hortonworks for free and preconfigured for you to get started straightaway. To get started, you need to download the Hortonworks Sandbox from the Hortonworks website at VMware Player, which is very easy to use and will work with the Hortonworks Sandbox. You can find it at https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/6_0 .
Once you have configured the Sandbox, you need to download some sample code to put onto the Sandbox. To enrich the data, we will use a country code set of data, which will give us a lot of information about individual countries. For this example, we will reuse the country file that we downloaded from Azure in using a Windows Live ID. To do this, please visit https://datamarket.azure.com/.
The data can be found at https://datamarket.azure.com/dataset/oh22is/countrycodes#schemaAbout and, halfway down the page, look for the link to Explore this Dataset. On the right-hand side, you will see the option Download Options. Select the option to download as CSV.
You will see that there are a lot of columns, and we won't need them for this example. In order to make the example simple, let's keep only the following columns:
Area
Capital
Continent
Countrycallingcode
Countryid
Countryname
Currency
Fips
IOC
ISO3
Isonumeric
Nato2
Nato3
Population
Now we are ready to use the Big Data technology in order to enrich our data in Tableau.
How to do it…
- Let's upload our CSV file into the Hortonworks Sandbox. This is straightforward. Once you have the Hortonworks Sandbox open in your browser, create a directory called
GoldenRecord
. To proceed, go to the File Explorer option and select Upload File to theGoldenRecord
directory. - Once you have uploaded the file, let's create the table. To do this, we will run a query in the Query Editor interface. You can see an example of this interface in the following screenshot:
- Copy the following script and paste it into the query editor as shown in the preceding screenshot:
CREATE TABLE IF NOT EXISTS CountryInformation ( countryID INT COMMENT 'Country ID', CountryName STRING COMMENT 'Country Name', ISO3 STRING COMMENT 'ISO3 Column', ISONumeric INT COMMENT 'ISO Numeric', FIPS STRING COMMENT 'FIPS', Continent STRING COMMENT 'Continent', Currency STRING COMMENT 'Currency', CountryCallingCode INT COMMENT 'Country Calling Code', IOC STRING COMMENT 'IOC', NATO2 STRING COMMENT 'NATO2 Classification', NATO3 STRING COMMENT 'NATO3 Classification', Capital STRING COMMENT 'Capital City', Area STRING COMMENT 'Country Area in SQ M', Population STRING COMMENT 'Population' ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/user/hue/GoldenRecord'
- Once you have done this, click on the Execute button and the script will create a table called
CountryInformation
. When you click on the Tables option, you can see theCountryInformation
table. You can see this in the following screenshot: - If you click on the CountryInformation link, you will see the columns that you created. The following is an example of the columns as seen in the Hue browser:
- You can also see a sample of the data by clicking on the Sample link. The following is an example of the data:
We are now finished working with the Hortonworks Sandbox. Next, we need to use Tableau to connect to the Hortonworks store. There are two ways to do this. If you have Tableau Professional edition, then you have enabled connectivity to Hortonworks. You can see this if you go to Connect to Data on the Tableau workbook. If you have Tableau Desktop edition, you will need to use Excel to connect to the Hortonworks Sandbox and download the data from there. Once the data is downloaded to Excel, you can store it and connect to it easily. If this was a real-life scenario, this wouldn't be satisfactory because the data might go out of date very quickly.
- However, for the purposes of our simple example, the data could be loaded from Excel into Tableau Desktop edition. However, we'll connect to the data source, and let's call it
GoldCountryCodes
. - Once the data is in the Tableau workbook, we can join it to the other tables using the country name. You can see an example of the joining in the following screenshot:
- Once the data is joined together, we can visualize it in Tableau. We will have a mashup of the CSV files and Big Data technology—all in the same Tableau workbook.
- For example, you could use the SalesAmount measure and put it next to the Population data from the external file. As a starting point, you could make a table and then see where the data takes you! The following screenshot shows your starting point:
Why not try some of the visualizations in Tableau based on this data to see how it looks?
How it works...
Note that this recipe only uses small data files as an example, and it is not intended to be a real-world Big Data exercise where we are transferring petabytes of data. Excel is used as an accessible example of a data source for training purposes.
The beauty of mixing Big Data sources with Little Data is that the user is insulated from the size of the data. Instead, they can visualize their data from different stores and different formats.
Fortunately, Tableau offers us a royal road to understanding the data by helping us to visualize it quickly and easily. It also allows us the ability to explore the data so that it starts to make sense regardless of whether it is Big Data or the important Little Data that makes up the data currency of the enterprise. We can enrich our existing data stores by using Big Data technologies, and this is the theme that we explored in this recipe.
In this recipe, we also made changes to the data by enriching it with a Big Data source. Big Data solutions are becoming more prevalent, but there is still a need for simplicity in accessing data regardless of its size. In this example, Tableau used a simple ODBC connector to access the data held in the Hortonworks Sandbox. A common experience among data analysts is not being able to get access to the data that they want. Therefore, the simplicity of accessing the data is vital, and ODBC is a common way of accessing data that is familiar to IT professionals.
Once we have access to the data via ODBC, there is no stopping us! Tableau then sees the data as another data source, in the same way as it sees data from Excel or OData, for example. In other words, this mechanism is a great "leveler" of data access since the data is accessible regardless of its size.
Tableau obviously cannot suck in petabytes of data (yet!), and this is one scenario where Big Data will need to stay outside of Tableau as an external data source. On the other hand, as we saw earlier, Business Intelligence requirements often involve summarizing data for averages, counts, and so on. It can be useful to crunch the data down into manageable summaries, and Tableau could access the summary data rather than the full Big Data itself. These issues are architectural questions, but a summary is a good place to start before moving forward to bigger questions.
There's more...
If you are interested in learning more about Big Data, you will find that the Hortonworks Sandbox already has a number of preconfigured tutorials. This is a great resource to get you started looking at Hadoop.
Filtering your data for focus
Dashboards are very compact ways of communicating data because they are constrained by space. One way in which we can make more of the "real estate" on the dashboard is to use filters.
Tableau has three ways to filter dashboards. Global filters apply to every part of the workbook that uses the same data source. This might be a problem, however. What happens if you want the filter to apply in some cases, but not others?
Local filters are specific to only one region of the dashboard. However, this may make them too restrictive. Tableau 8.1 now has a new filter feature, which allows you to stipulate a selection of worksheets for the filter rather than being specific to a data source. We can apply the filter to all worksheets that use the data source, and to do this, you can choose the All Using This Data Source option. It is also possible to let the filter apply to only the current worksheet, and you would select the Only This Worksheet option for this.
In this recipe, we will look at the new filter advancements of Tableau 8.1. We will work towards changing a chart into a filled map to show the sales amount, filtering by year. We will add in some new dashboard elements and get them to "talk" to one another by the use of filters. We will apply our filter to selected worksheets rather than the previous editions of Tableau, where it was more "all or nothing" in terms of filtering the data visualization.
For the purposes of this recipe, we will want to select only some of the worksheets, so we'll select the Selected Worksheets option.
Getting ready
In this recipe, we will continue to use the workbook we created for previous recipes of this chapter. There is no need to add more data sources.
How to do it…
- Create a new dashboard called KPI Analysis.
- Take the KPI Summary and KPI Sparkline worksheets and put them in the dashboard by dragging-and-dropping them into place.
- Next, add a title at the top of the dashboard, asking the question
How well did the countries perform?
- Place a Blank object underneath the two worksheet objects so that the countries are aligned and read left to right. Your dashboard should now look like the following screenshot:
- When we go to the original KPI Sparkline sheet, we will add the SalesTerritoryCountry filter so that we can filter by country. We do this by dragging the SalesTerritoryCountry attribute to the Filters shelf. The following screenshot is an example:
- This option is fulfilled by selecting the Selected Worksheets option as follows:
- For the purposes of this recipe, we will select the KPI Sparkline worksheets. We do this by simply checking the boxes next to these names. The following screenshot shows an example of the dialog box:
- Once you have clicked on the checkboxes next to the worksheets, click on OK. Once the filters are created, you will need to click on them to display the filters. To do this, click on the filter and choose the Show Filter option.
- Now, go to our Golden Record worksheet.
- To change the Golden Record visualization to a map, simply select the Filled Map option from the Show Me pane.
- Remove Population if it is present, and use SalesAmount to illustrate the sales amount value. In the following example, red has been used because it represents the sales amount for the other components of the dashboard:
- Let's filter the filled map to only show the latest year's data by simply taking FullDateAlternateKey and dragging it to the Filters shelf. When you do this, a Filter Field dialog box will be initiated. We will select the Years filter and will restrict the data so that only the data from the latest year is shown, which in this case is the year 2008. You can see an example of this in the following screenshot:
- The next step, as shown in the next screenshot, is to select the year 2008 from the Filter dialog box; this will filter the map. However, we are not applying this filter to the rest of the dashboard. This means that the map will stay static even though we have filters shared across the other components of the dashboard, KPI Summary and KPI Sparkline.
- Once you have selected the year 2008, click on OK, and we are returned to the map.
- Finally, let's put the Golden Record worksheet with the filled map into the KPI Analysis dashboard. It is static data, and we will place it down the bottom so that it acts as an anchor for the rest of the more detailed data.
The Tableau dashboard will now look as follows:
How it works...
Filtering data is a key part of the Visual Information-Seeking Mantra article by Professor Ben Shneiderman, and users expect to be able to filter and interact with their data. Although the principles will not hold if people only want very detailed row-level data, the mantra is good to keep in your head when you are thinking about designing your dashboard. Dashboards are about actionable overviews rather than the detail about one row. Hence, the summary is an essential part of providing the overview.
Filtering is a good way to promote engagement with your dashboards. In marketing, stickiness refers to anything that encourages readers to stay on your website. In dashboard creation and reporting, stickiness can refer to features that increase the likelihood that users will stay on your dashboard and use it.
We can use filters in order to make our dashboards more flexible in response to user input, which may help to keep the dashboard engaging and interesting for data consumers. A key aspect to dashboarding is that we need to make the most of the space while engaging the user in the key facts of the data. Filters can help us to do that easily in Tableau.
Once again, color is key to conveying the message of the data. In this example, red is used in both the KPI Summary and KPI Sparkline worksheets, and the color is split into three steps in order to simplify the classification of the sales amount. People don't always distinguish fine-grained nuances of color, and using the Stepped Color feature of the Edit Color panel makes the data simpler to understand.
There's more...
For the purposes of this recipe, we have only selected some of the worksheets, and the Selected Worksheets option serves our purpose.
See also
Creating choices in dashboards using conditional logic
Logical calculations can make your analyses richer. They can also make things easier for a dashboard consumer. For example, logical calculations can help you to funnel the analysis to specific-dimension members, combine members to follow a business rule, or even remove values that are irrelevant to your investigation.
Normally, when we use filters, we select the attributes within a dimension. In this recipe, we will implement logical calculations so that users can choose different dimensions to describe the data. We will place a small control on the dashboard so that users can simply click to choose the dimension that they would like to see, which describe the sales amount data. Users can simply click on which dimension they would like to see, for example, by color, country, or product line.
In this recipe, we will need to make a calculated field using a logical calculation and a number of parameters, amend the colors, and so on. Our sequence is to set up some parameters, a calculated field, and then some filters. So, let's get started!
Getting ready
For the exercises in this recipe, we will continue to use the Chapter
5
workbook. There is no need to add any more data. Let's make a copy of the KPI Sparkline worksheet and call it KPI Dimensions
. We will also make a copy of the dashboard and call it KPI Dimension Analysis
.
How to do it...
- Firstly, let's set up a parameter so that the user can choose a metric. To set up a parameter, right-click on the Measure box and select the Create Parameter option. We will call our parameter
Choose Characteristic
and will set up a list for each metric. Make sure that you set up List as an allowable value and type each metric name in the list of values. - The following is an example of the resulting parameter:
- Next, we need to add a calculation that will help us choose different dimensions. We will use a
CASE
statement, which is simply like lots ofIF…THEN…ELSE
statements strung together. The calculation will execute the first statement that it finds to be true. - We will set up a
CASE
statement that chooses between dimensions, dependent on the user selection. We will offer the following choice of different measures so the dashboard consumer can select the measure they would like to see on the dashboard:- Sales amount
- Sales amount quote
- To do this, right-click on the Dimension part of the Tableau workbook and select the Create Calculated Field option. Our logical calculation is written as follows:
CASE [Choose Characteristic] WHEN "All" THEN "All" WHEN "Color" THEN [Color] WHEN "Country" THEN [SalesTerritoryCountry] WHEN "Product Line" THEN [ProductLine] END
- The
CASE
calculation allows us to simply show all of the data not described by any dimension, or show by color, country, or product line. - The logical calculation can be seen in the calculation editor, as shown in the following screenshot. Simply copy and paste the mentioned calculation and put it into the textbox. We will call the calculation
Calc_ChooseCharacteristic
. - Once you have clicked on OK, you will see it in the Dimensions pane in the sidebar in the Tableau workbook.
- Now, we need to set up some pills so that our workbook visualization is filtered according to the user-selected dimension. Now, take the Calc_ChooseCharacteristic filter and drag it to the Marks shelf so that the detail is retained.
- Next, we need to drag Calc_ChooseCharacteristic and put it to the Color button. This is a key part because it tells Tableau that it needs to change the display dependent on the selected dimension. The calculated fields implement the logic to denote which dimension should be displayed.
- This selected Dimension value is held in the Choose Characteristic parameter. The color is dependent on the result of the
CASE
statement evaluation in the Calc_ChooseCharacteristic calculation, and this is how Tableau differentiates in the display. We will need to make sure that users can select their preferred dimension, and to do this, they will need to see the parameter control. And you will need to see it in order to test it out! Simply go to the parameter called Choose Characteristic, right-click on it, and select the Show Parameter Control option. - Since we are using Color as a potential dimension, we can make the palette-defined colors match the actual colors of the merchandise. To do this, click on the Color button and select the Edit Colors option. We can then set the color attribute Red to be red, the color attribute Blue to be set to the color blue, and so on. The following screenshot is an example:
- To finalize the visualization, you need to make sure that Year(FullDateAlternateKey) is on the Columns shelf and that Sum(SalesAmount) is on the Rows shelf.
- Let's set Marks to be Circle rather than Automatic.
- We can also add SUM(SalesAmount) so that the diameter of each circle becomes a representative of the SUM(SalesAmount) value.
- Let's go to our dashboard called KPI Dimension Analysis, which we copied earlier. Let's remove the KPI Sparkline workbook and insert the new KPI Dimensions workbook instead. We will need to show our parameter control; to do this, click on the KPI Dimensions area of the dimension, select Parameters, and select the Choose Characteristic option. We should also add in the Calc_ChooseCharacteristic filter, which is under Quick Filters in the same menu. You can see an example in the following screenshot:
- You can also remove the title and any headings so that you are making the most of the space. Overall in the dashboard, you will need to change the business question that has been posed. Here, we have changed it to
How well did sales perform, by different characteristics?
. - You can see our final dashboard and the culmination of this chapter's work in the following screenshot:
- Our dashboard is now complete, and you can have fun changing the parameters and switching dimensions. This will create interesting patterns in the data for your users.
How it works...
In this recipe, we implemented logical calculations that allow users to change the dimension that appears on the dashboard. We are starting to make our dashboards look more interesting and more interactive.
Showing different dimensions helps the dashboard designer to make the most of the space, while maximizing the choices available to the dashboard user. Setting up all of the moving parts is not as quick as other visualizations, but it is worth the effort to help the users.
For this recipe, we will use the KPI Dimension worksheet and see the result in our dashboard.
There's more…
In this chapter, we have looked at many different ways to help users to engage with dashboards, such as choosing visualizations and making use of items to help with analysis such as parameters, Big Data technologies, and conditional logic. These tools will allow us to create dashboards that are powerful and rich, and who knows, maybe even fun!