Sunday, October 14, 2012

Excel Pivot Tables - 1 - Summariztion

I'm planning to do an intro tutorial and talk on Excel Power Pivots and was thinking many people are still unfamiliar with regular, plain old pivot tables. They may have used them on a project, but just kind of fiddled with them until they looked right rather than understanding the concepts.

Data analysis is basically summarization.  It's impossible for someone to look at a list of a million individual sales and get any kind of feel for the way a business is going. But they can look at total sales dollars and get some of the feeling. We made the raw data more valuable just by summarizing it up into a grand total.

Now if we took that data and summarized it by and intermediate level like US State, we improve its usefulness. We still have the grand total that tells us how well the company is doing overall. But now we can also tell that Arizona has more total sales than Florida.

Let say we summarized the raw data yet again, by month. We still have the grand total. We still have the total by state. But now we can also tell that Florida sales are low because they've been dropping for the last 5 month. And while we still have the totals by state, we can now also tell that sales climb as summer approaches and drop as we head into fall.

In the cases above, the sales dollars are a measure, or what we are summarizing. The months and the states are the dimensions, or what we are summarizing by.

The human mind can picture one or two dimensions easily as a line or an XY chart. We can even visualize with some difficulty three dimensions as a 3D chart. But really data analysis can use as many dimensions as are useful to apply to the data. Don't get caught up on trying to visualize it. Remember dimensions are just what we're summarizing the data by. So our sales data could be summarized by state, month, product family, product, salesperson, etc. even though we can't picture a shape to represent it.

So on to the fun stuff. How do we do this with pivot tables?

(1) Let's start off by going to AdvertureWorks and pulling off a list of products.


(2) Now, to create our pivot table, go to the Insert tab and click on PivotTable.


(3) You'll get the dialog box for creating a pivot table

* Note the dashed line around your data. As long as you are clicked on some cell inside your table, and a single cell, Excel will correctly select the area. But it's worth a quick look to verify.

I normally leave the default checkbox to insert the pivot table as a new worksheet.


(4) Click ok, and you'll get the pivot table control screen.

There are three major areas here.

Area A is the actual pivot table. This is where the pivot results are displayed.

Area B is the list of fields which you can use both for summarizing and summarizing by.

Area C is where you drag the fields to to build the pivot table.


(5) For our first sample, we're going to do a summary of sales amount summarized by porduct. So take product name (Name) and drag it down to Row Labels, and take LineTotal, the sales dollars of each order line, and drag it down to Values.

* Note that LineTotal becomes Sum of LineTotal.



(6) Let's stop here and take a breath. You've created your first pivot table!

So the pivot table control control screen give us the ability to move the field we want to summarize to the Values section and the fields we want to summarize by to the Row Labels or Column Labels depending on whether you want the labels down the side or across the top. Normally you will want to start with the labels down the side.

(7) Let's change to summarize by color

First, get rid of the summarization by product by dragging product name (Name) from Row Values and "dropping" it off the side of the screen. Then drag Color down to the empty Row Values. The pivot table reconfigures itself to display the new layout.


(8) If we wanted to have the labels for the products go across the top, we can just drag the Color field from the Row Labels.


Let's stop here for now. Even though the tutorial is short, what you've learned is real-life useful. You now have the tools to easily summarize a larger set of figures into a small summary to better understand the data.

 --Al-

No comments:

Post a Comment