Working with PivotTables.
A PivotTable is a ‘mechanism’ within Excel that
allows large quantities of like records, for
Excel or external data stores, to be summarised
into a concise, easily updateable report. The
simple ‘drag and drop’ interface allows for ease
of design and simple changing of the ‘view’ of
the underlying data. Despite its current
restriction of 65536 rows, Excel can summarise
hundreds of thousands of records from external
data sources via a PivotTable.
Most of all I would say they are one of the
least used, but most effective tools in Excel,
particularly as the quantity and diversity of
data grows ever greater.
Before invoking the construction of a
PivotTable, spend a small amount of time
thinking where you will get your data from and
whether that data is suitable. Spend particular
time looking at the ‘field’ names you have.
This is because Excel will prefix your field
names with the summary option you choose, for
example ‘SumOfSales’ or ‘CountOfProducts’. You
will not be able to change these back in the
PivotTable to Sales or Products, as Excel has to
have unique names in a PivotTable. Try also to
think where the data will go and how that will
look, will you use page fields, etc.
Here too are some limits that Excel sets for
pivot tables (from Excel help)
|
Feature |
Maximum limit |
|
PivotTables on a sheet |
Limited by available memory |
|
Unique items per field |
32,500 |
|
Rows in a PivotTable report |
Limited by available memory |
|
Page fields in a PivotTable report |
256 (may be limited by available memory) |
|
Data fields in a PivotTable report |
256 |
|
Calculated item formulas |
Limited by available memory |
Notice one other advantage of PivotTables is the
ability to have more than one on a worksheet, so
it is possible to have multiple views of the
same data at the same time, additionally it
should be borne in mind that you can, use an
existing PivotTable as the data source for a new
one. In fact, if you try to bring the same
external data into a workbook with a PivotTable
already in Excel it will ask if you would like
to save space in the file and use the existing
table as the data source.
Let’s get
started!
For the purposes of this demonstration I will
simply be using data from within the same Excel
workbook although the data source choice is
virtually limitless. In the case of Excel data
it is always worth naming your data range via
edit>name>define… mostly I use the name Database
as Excel seems to like this but any valid name
would be suitable.
Raw
Data
Below is an example of the raw data used in our
demonstration. As you can see this is ideal
data of order headers which clearly have
individual fields (columns) and repeating type
records (rows).

The building of a PivotTable is started from the
Data>PivotTable and PivotChart report… menu
option as below (Differs slightly in earlier
versions than that used (2003)).

This will present the PivotTable wizard which
will guide you through the options to build the
PivotTable.

As we are using an Excel list we have selected
that option it is possible to use Excel data
from another workbook and as you can see we
could select external data or multiple
consolidation ranges. If Excel senses an
existing PivotTable within the workbook the
dimmed option in the picture above (another
PivotTable report or PivotChart report) would be
selectable. At this stage we can also select to
produce a PivotTable or a PivotChart which will
include the PivotTable data. We have taken the
selection shown press ‘Next’ to move onto step 2
of the wizard below.

Because we had pre-named our data range
‘Database’ Excel senses this and offers it as
the possible data range for the PivotTable. In
our instance this is correct. You will notice
that this step also allows you to ‘Browse…’ for
another Excel workbook outside of the one in
which you are building the PivotTable. As Excel
has guessed correctly we will simply press
‘Next’ to move on to the last step of the wizard
as below.

This wizard asks for the position of the
PivotTable in this case we will take the ‘New
worksheet’ option as selected but we could
select to place the PivotTable on the ‘Existing
worksheet’ which would then allow us to enter a
cell reference in the ‘refedit’ box which is
dimmed in the dialog above.
Whilst we will move on and simply finish this
dialog it is also possible at this stage to set
the layout of the table and the table options
from this third wizard step.
We are now presented with the empty layout grid
on the worksheet clearly showing the different
parts that make up a PivotTable. These are row
fields, column fields, page fields and the data
area. The image below clearly shows this
together with the PivotTable Field List and the
PivotTable Toolbar, both of which we will be
using later (these are shown in the image
‘floating’ but can easily be dragged to the left
or right of the window and docked in the case of
the PivotTable Field List or docked on any edge
of the worksheet in the case of the PivotTable
Toolbar).

Fields are dropped on the PivotTable by simply
dragging items from the PivotTable Field List
and dropping them in their related zones. The
image below shows the ‘ShippedDate’ field
dropped into a ‘Row Field’ area with the
‘LinePrice’ field being dropped in the ‘Data
Items’ area.

Here we are also dragging the ‘OrderID’ into the
‘Data Items’ area from the PivotTable field list
as shown below. You will notice as you drag the
cursor arrow carries both a grey block which
represents the data field and a small image
which colours blue the area where you will drop
this data if your released your mouse button
now. In our image you can clearly see this will
be in the ‘Data’ area.

Excel, as this field is technically numeric, has
presumed that it should ‘Sum’ this field, hence
the title given of ‘Sum of OrderID’. This is
obviously wrong as our intention was to count
the number of orders, so we should set about
changing this error.

You will notice when we had added only one value
field that the ‘LinePrice’ had its own header
named ‘Sum of LinePrice’. However, as we add a
second data value, (‘OrderID’) the header
changes to simply ‘Data’. Therefore, to change
the individual headers which now sit below this
‘Data’ header, we will need to re-invoke the
PivotTable wizard. This is actioned via the
PivotTable toolbar as shown below.

Taking this option automatically starts the
wizard in the third step as shown below, as it
is presuming that you now wish to make a
‘Layout…’ or ‘Options…’ change. If, however,
you wish to make changes to either the data
range within Excel or the data source entirely
you should press the ‘Back’ button to return to
steps 2 or 1 respectively. We will be selecting
‘Layout…’ so as to make several changes to the
PivotTable shown above.

In the ‘Layout…’ dialog you will see all the
available fields stacked on the right together
with an image of the data areas (row, column,
page, data) and your current selections. Any of
the fields can be dragged into the grid or any
already in the grid dragged out to be removed.
Field labels already assigned to the grid can be
‘double-clicked’ to present the ‘PivotTable
Field’ dialog shown below.

This dialog allows the changing of all aspects
of an individual field, in this case we will be
changing the long name associated by Excel (‘Sum
of LinePrice’) to read ‘Sales’. We will be
formatting the cell as ‘Accounting’ using the
‘Number…’ button (which presents the regular
formatting dialog shown below).

The PivotTable field dialog is also where we
will change our ‘OrderID’ sub-total from Excel’s
guess of ‘Sum’ to the correct ‘Count’. (You can
change this on the worksheet if you desire and
the change will reflect in the dialog).
You should also note at this stage why it was so
important to think about the headers in your
source data as for example it would be
impossible to name this field ‘LinePrice’ as
this would therefore not be unique between your
Pivot and the source data.

After making these changes the resultant
PivotTable has more recognisable field headers
(Sales, Orders) and the Orders now show a
‘Count’ rather than a ‘Sum’.

We may, for example, want to show a margin
earned field in our PivotTable but you may have
noticed that this is not a field available in
our ‘PivotTable Field List’ or shown in our
‘PivotTable Wizard Layout Dialog’ above.
Therefore we need to add a ‘Calculated Field’.
Once again this is an option under the
‘PivotTable Toolbar’ as shown below.

This menu option results in the ‘Calculated
Field’ dialog shown below which lists all of the
available fields, a ‘Name’ box which will become
the field name in your Pivot for this
‘Calculated Field’ and must therefore have
normal naming conventions, e.g. no spaces etc.
and be unique. The ‘Formula’ box allows you to
enter almost any Excel function and when
completed the ‘Add’ button allows you to add the
‘Calculated Field’ to the collection. If you
returned this dialog selecting an existing
‘Calculated Field’ it would cause the ‘Add’
button to change to ‘Modify’. ‘Calculated
Fields’ can also be deleted from this dialog.
To insert fields they can either be selected and
the ‘Insert Field’ button pressed, or they can
simply be double-clicked.

The dialog image below shows the ‘Calculated
Field’ named to ‘Margin%’ (this will allow you
to re-name the field to ‘Margin’ in the Pivot as
each name must be unique). This is set by
pressing the ‘Add’ button as shown. This
‘Calculated Field’ will now become a selectable
field in the ‘PivotTable Field List’ or the
‘PivotTable Wizard Layout Dialog’.

Here we show the ‘Calculated Field’ of ‘Margin’
added to the PivotTable Data Area and formatted
as ‘Percent’ with two decimal places (ANY
‘Calculated Field’ can now be treated
identically to any field derived from your
source data).

Sometimes you would wish to filter your data by
another field, this is achieved using ‘Page
Fields’. Using the same process as detailed
above through the PivotTable wizard and layout
dialog we can drag the ‘ShipVia’ field to the
‘Page Field’ area as shown below (Layout dialog
trimmed for readability purposes, note the
‘Margin% Calculated Field’ we added above).

The result of the insertion of this ‘Page Field’
in the PivotTable is shown below.

When a ‘Page Field’ is added Excel will always
default the selection to ‘(All)’. In our
example the ‘ShipVia’ field contains shipper
names as shown below.

If we make a selection from this ‘Page Field’
drop down the data within our main PivotTable
will reflect this selection as shown below.

As we have seen above we can easily filter our
data on the individual shippers and this is
instantly reflected in the underlying Pivot. We
can however take a menu option to ‘Show Pages…’
from the data in our ‘Page’ field. This will
set up a new PivotTable on a new worksheet for
each selection in the ‘Page’ field. The menu
option is available on the ‘PivotTable Toolbar’
as shown below.

As we may have more than one ‘Page’ field Excel
presents the ‘Show Pages’ dialog as below so
that the user can make their selection as to
which ‘Page’ field they wish to show pages of.

In our case there is only one option and after
selecting this you can see that Excel adds
several sheet tabs each with a new PivotTable
showing the data for the relevant shipper (note
the sheet tab is named automatically and we show
the sheet tabs before and after selecting this
option).
Before

After

You will notice that our current PivotTable
layout shows order totals for each day. It will
be probable that at some point you may wish to
show this data in weeks, months, quarters or
years. This is easily achieved using the
PivotTables ‘Group and Show Detail’ feature.
This can be accessed either from the PivotTable
Toolbar or as shown right-clicking on the
‘ShippedDate’ column.

NOTE:- When you take this option Excel
will endeavour to parse all your source data
into a single data type, in this case ‘Dates’.
If your source data contains non-date, blank, or
text data Excel will produce the following
dialog.

If your data, in this instance, contains only
dates with no blanks, you will get the following
dialog:-

This dialog allows the selection of all
date/time values from seconds through to years.
Certain values can also be multiple-selected,
for example, you may select ‘Months’ and ‘Years’
or if your data spans more than one year
selecting only ‘Months’ will group for example,
all January sales whether they are in 1996, 1997
or 1998.
By default the date range selected is that
detected in your ‘Source Data’ by Excel. In our
example the 10th July 1996 through to
18th May 1998. If you require your
PivotTable to display data through a shorter
range than this you should type values in the
‘Starting at’ and ‘Ending at’ boxes, Excel will
automatically remove the check marks from the
‘Auto’ boxes.
We will be moving forward with our example using
‘Month’ and ‘Year’ data, but one selection that
often confuses users is that of selection a
‘Week’ as it does not appear in the grouping
drop down. This is achieved simply by selecting
‘Days’ which will all a selection in the ‘Number
of days:’ spinner box which should be set to 7
as below. You can then ensure your week starts
on a certain day by once again altering the
‘Starting at’ value in the grouping dialog.

The selections made in the above dialog resulted
in the PivotTable extract below.

As mentioned above, however, we will continue
with a selection of ‘Months’ and ‘Years’
resulting in the grid below.

It may now become necessary to only show the
sales etc. for a certain year. To do this we
use the drop down filter next to the field
against which we wish to restrict the data. The
year filter drop down is shown below.

Here we are selecting purely the year 1997 and
you will notice that Excel gives an option for
<10/07/1996 and >18/05/1998, although this is
outside of the date range in the source data.
If you are only looking to select one item as
above, it is quicker to click ‘(Show All)’ and
click it again which will un-select all options
and then you can simply select the one you
require. Clicking ‘OK’ applies the filter as
below.

Occasionally the selections you have made still
show too much or not enough data. You can
correct this by either double-clicking on a cell
where you would expect to be able to hide
detail, for example double-clicking on a year
would hide the detail of the months associated
with it, or you can right-click on a row header
or select from the ‘PivotTable Toolbar’ the
‘Group and Show Detail’ option that we used
above to group the ‘Date’ data (inserted again
for clarity).

If you select ‘Show Detail’ against say a
‘Month’ field Excel would have no idea what
detail you wanted to show as in our current
example ‘Month’ is the greatest detail. You
would therefore be presented with a ‘Show
Detail’ dialog from which to make your selection
and this is shown below.

So for example making the selection as shown
above of ‘Country’ would show the detail by
month for each country as shown in the grid
below.

Now we have our Pivot built and displayed to our
liking, we can continue to sort our data. Data
can be sorted by any row or data field and can
be achieved either by double-clicking a row or
column header or via the ‘Sort and Top 10…’
option from the PivotTable toolbar as shown
below.

If you are accessing the menu option via the
double-click method you will get the PivotTable
Field dialog below from which you should select
the ‘Advanced…’ option.

Selecting the ‘PivotTable Sort and Top 10…’
option above presents a slightly shortened
version of the dialog achieved when you select
‘Advanced…’ from the PivotTable field dialog.
For consistency, both are shown below:-

By default this dialog shows ‘Manual’ selected,
for our purposes we are going to select
‘Descending’ and in the ‘Using field’ drop down
we will select ‘Margin%’. The result of this
action is shown in the PivotTable grid below.

Occasionally you may want to see a
representation of your data in relation to other
data rather than simply a sum of sales or a
count of orders etc. This can be achieved by
selecting a value from the ‘Data’ area of the
PivotTable and via the ‘PivotTable Toolbar’
selecting ‘Field Settings’. This presents a
slightly different ‘PivotTable Field’ dialog
which has an ‘Options>>’ button.

When this button is selected it expands the
dialog as shown. From here we can select
several options to show our data in relation to
other data in the set. We are selecting to show
‘% of column’.

This will show our sales data for each month
with its related % contribution towards the
years total. Below we have a before and after
view of the PivotTable during this action.
Before

After

Note: If you want to see both the actual
data and the relational data, just add the field
to the PivotTable twice
As the name suggests data in a PivotTable should
be easily pivoted. We have shown that this can
be achieved in the ‘PivotTable Wizard Layout
View’, it can also be simply and quickly
achieved by dragging column, row, page or value
fields from their current location to a new
location. Here we can show the sequence of
moving the current ‘Sales’, ‘Orders’ and
‘Margin’ fields from their current column
orientation to row orientation.
Before Pivot

During Pivot

After Pivot

A pivot correctly only shows a summary of the
data. Once this summary is viewed it may
highlight key areas requiring a more detailed
examination. This can be easily achieved using
the ‘Drill-Down’ feature of PivotTables which is
simply accessed by double-clicking any ‘Data’
field which will insert a new sheet showing the
‘Detail’ records making up the summary of the
cell double-clicked. Some example data is shown
below after clicking in the January 1997 Sales
cell from an earlier example. As yu can see it
has inserted sheet 8 and all the orders shown
are from January 1997, but the data is now no
longer a PivotTable in this sheet but a ‘Flat’
data file with the detailed data.

One of the true advantages of PivotTables is
that they do not need significant re-design and
re-keying as the underlying data changes. If
the data is stored within Excel and your source
data set to a range name, as suggested at the
start of this tutorial, if you add to this you
should simply need to extend the range attached
to this name and refresh your data. If your
data is stored within a table, query, view etc.
in a normalised relational database then this
should simply be a matter of refreshing the
data. This can be actioned by right-clicking
anywhere within the PivotTable or selecting the
‘Refresh Data’ option from the ‘PivotTable
Toolbar’ drop down shown below or the
exclamation mark on the ‘PivotTable Toolbar’.

For example, below, we change the 10th
July 1996 order from £167.40 to £267.40 and then
simply invoke ‘Refresh Data’ and that dates
orders in the PivotTable increase by £100.00.
Line Price Before

Line Price After

July 96 Pivot Before

July 96 Pivot After

Either by right-clicking anywhere in the
PivotTable or from the ‘PivotTable Toolbar’ you
can select ‘Table Options…’, this allows you to
set global settings which will apply to the
entire table. The dialog that is presented is
shown below.

Here we are able to re-name the PivotTable which
could be useful if we were accessing these via
VBA for example as it is often easier to refer
to descriptively named tables. We can select
several simple options to show totals for rows
or columns, auto-format the table, preserve
formatting between refreshes and set default
values where a PivotTable shows error values or
empty cells. There are also several ‘Data
options’ where you can save a data cache with
the table layout, this speeds refreshes but
increases file size. You can enable drill-downs
as outlined above and set the file to refresh
when opened or at certain time intervals. If
the data is from an external data source you
have the ability to save the password to the
database with the file, carry out background
queries and optimise Excel’s memory.
Top
Copyright Nick Hodge 2008. All Rights Reserved. |