Import External Data

Home Examples

Up
Consolidate...
Filter
Form...
Group and Outline
Import External Data
List
Pivot Table Report...
Refresh Data
Sort...
Subtotals...
Table...
Text to Columns...
Validation...
XML

New! Excel User Group Now Open @ www.excelusergroup.org,

Menu command Type What happens next?
Import External Data  

Here is the menu...

 

 

 

Data Types Available

 

Excel can receive data from a myriad of external data sources, ranging from simple text files (*.csv, *.txt) to specialist data deriving from Oracle, IBM iSeries, Sharepoint etc.  Below is a list of currently available formats in Excel 2003.

 

Text Files

XML Files

CSV Files

Lotus1-2-3 Files

Access Databases

Quattro Pro Databases

DBase Files

Web Pages

Sql Databases

Other Excel Files

Paradox Files

OLAP Cubes

Oracle Databases

IBM iSeries

Sharepoint

Most ODBC Data Sources

Most OLE DB Data Sources

 

How Excel imports this data and it is handled once it arrives within Excel differs from case to case but we will intend covering the most commonly used file formats in this presentation.

 

ODBC Data Sources

 

Currently, by far the most common way of importing data into Excel is by means of Open DataBase Connectivity (ODBC).  To import ODBC from virtually any source Excel uses an intermediate tool called MSQuery which is automatically installed with a full install of Excel.  Under normal circumstances the user is hidden from the MSQuery product but we will be exploring this later in the presentation when we look at parameter queries within Excel.  If you are going to be using ODBC you will need to be aware of DSN’s (Data Source Names).  These in Windows XP can be set up via ‘Start>Control Panel>Administrative Tools>Data Sources (ODBC)’.  The dialog this presents is shown below and is in fact the same dialog you will face through any menu selection in Excel which offers you an ODBC Data Source.

 

 

 

ODBC connections are organised into three areas shown in tabs on the dialog above.  These are:-

 

User DSN

A user DSN is a Data Source Name which is set up and available purely under that user’s profile, that is if I sign on as NickH these Data Sources will only be available to me.

 

System DSN

A system DSN is a Data Source Name which is set up and available to any users logging on to a particular machine.

 

File DSN

This stores the Data Source information in a file which is made available in the assigned ‘Data Sources Folder’ on a particular machine.  These are particularly handy if you wish to set up your Data Connection on another machine as the file can simply be copied and posted into the Data Sources Directory on the new machine.  We will be seeing the list of File DSN’s later. 

 

The remaining tabs show all currently available ODBC drivers installed on the local machine, the ability to trace for error checking and de-bugging and connection pooling which allows Data Applications to re-use currently open connections without retrieving them each time from a server which can result in data delays and timeouts.  To give some idea of the drivers available, here is an image of the Drivers Tab on this dialog which shows in view less than half of the available ODBC drivers currently on my machine (your list may be much shorter or much longer depending on the installed ODBC drivers).

 

 

 

As mentioned earlier there is no need for you to set these up in advance as they can be added during any external data import operation but if you are planning carefully your data import it is handy to know how these are set up in advance, what they do and where they are stored.

 

Importing External Database Data

We have seen above an import of text data which is actually more of an open file process than an import using a data technology (ODBC, OLE DB, etc.).  We have already outlined the data name structure for ODBC connections and we will move onto getting data via these technologies from external databases.  To import external data we use the Data>Import External Data, followed by either of the three new data options (‘Import Data…’, ‘New Web Query…’, ‘New Database Query…’).  The first option allows you to select more data sources than the other two options, both ODBC and OLE DB.  The ‘New Web Query…’ is relatively self-explanatory and will be covered later and the ‘New Database Query…’ uses an ODBC connection via MSQuery and we will be demonstrating this in the ‘Using Parameters in External ODBC Data Queries’ section.

 

ODBC and OLE DB Data Sources

 

These data sources use DSN’s explained earlier, or connection strings to connect to external data.  They are accessed from the ‘Data>Import External Data>Import Data…’ menu option shown below. 

 

 

This presents the ‘Select Data Source’ dialog in which are listed all of your ODBC File DSN’s in your default ‘Data Sources’ folder as shown below.

 

 

The default is to show ‘All Data Sources’ but selecting the drop down will allow you to filter this by specific types of ODBC File DSN if the list is long.

 

If you do not have an ODBC File DSN for the connection you require you can select ‘New Source…’ from this dialog which presents the ‘Data Connection’ wizard shown below.

 

 

This offers the standard connections (including the common ODBC DSN), but if your connection type is not offered you should select the ‘Other/advanced’ option and when selecting ‘Next>’ you will be presented with the OLE DB ‘Data Link Properties’ dialog shown below listing all the OLE DB data providers.  You should select from here the relevant one (say for SQL Server).

 

 

Selecting ‘Next>>’ will move you to the ‘Connection’ tab in the same dialog.

 

 

This allows you to select a server, what security type you wish to use and allows you to select a database, attach a database, enter a connection string (certain data providers) and finally test the connection.  Providing this succeeds you can press ‘OK’ and you should be presented with a view of all the Tables, Queries, Views and Cubes for your selected database in the ‘Data Connection’ wizard shown below.

 

 

You can select your specific Table, Query, View or Cube from the list or simply set a connection to the entire database by removing the check mark from the ‘Connect to a specific table’ check box.  Pressing ‘Next>’ moves you forward in the wizard to the step shown below.

 

 

This allows you to change the file name, location of file and to save the log on password in the file.  You can set a description for the data connection and any search keywords.  Pressing ‘Finish’ moves you back to the ‘Select Data Source’ dialog with the newly created data connection selected.  Clicking ‘Open’ will connect through this file to the database returning its data to Excel.

 

 

Once Excel has connected to the database and collected the data you will be presented with the familiar ‘Import Data’ dialog (shown below) allowing you to position the upper left cell of the data, place it on a new sheet, or create a PivotTable report from it.  You can equally at this stage select the ‘Properties…’ button to set the ‘External Data Range Properties’ dialog (outlined later).  ‘Edit Query…’ will open either MSQuery (ODBC) or the ‘Edit OLE DB Query’ dialog shown below or the ‘Edit Web Query’ dialog shown in the section on web queries.  If the Data Source file uses parameters you will be able to select this button to edit those.  This is discussed later in this presentation.

 

 

 

 

Once you select ‘OK’ in the above dialog the data will be imported as shown below.

 

 

As related earlier pressing the ‘Properties…’ button in the ‘Import Data’ dialog presents the External Data Range Properties dialog below (default selections shown).

 

 

This allows you to set the ‘Name’ which can be useful when referring to it in VBA.  Whether the query definition and password should be saved with the file and at what point the data should be refreshed whether by manually pressing the ‘Refresh’ button or by setting a time interval or an ‘Auto Refresh on Open’.  Selecting this last option allows selection of the check box to remove the connection to the external data before the worksheet is saved.  The options in the lower half of this dialog refer to formatting settings which are useful if you have set certain formats or formulas on the imported data as unless you check these carefully when refreshing your data column widths may shrink, filters may disappear, formats may change and any formulas placed alongside the external data will not expand as data is added. 

 

Web Queries

Excel can collect data from tables on the internet, this is particularly useful for financial data, sports scores, etc.  To invoke a Web Query go to the ‘Data>Import External Data>New Web Query…’ menu option as shown below.

 

 

Taking this option will invoke the ‘New Web Query’ dialog into which a URL can be pasted or typed and the ‘Go’ button pressed.  The related page on the web site will then appear in the browser window below with small yellow square tags with arrows indicating where Excel can ‘see’ tables within the HTML. 

 

 

Once the web page has appeared you are able to hover over the table tags which turns them green and highlights the table to which they are related with a light grey line as shown below.

 

 

Once you are happy that you are selecting the correct table you should click the now green table tag which will insert a tick mark as shown below.

 

 

Pressing the ‘Import’ button will now bring this data into Excel via a version of the ‘Import Data’ dialog which we have seen before but is now not allowing creation of a PivotTable, ‘Parameters…’ or ‘Edit Query…’. 

 

 

Pressing ‘OK’ in this dialog results in the data being imported from the internet onto the worksheet as shown below.

 

 

You will notice that this data has nowhere near the formatted attraction of the origin web page, this is because by default Excel imports only the data without any HTML formatting.  This can be corrected at an earlier stage when creating your ‘New Web Query’ or after importing your data via the ‘Data>Import External Data>Edit Query…’ option as shown below.

 

 

This fires the ‘Edit Web Query’ dialog (which is identical to the ‘New Web Query’ dialog as shown below.

 

 

If we select the ‘Options…’ button in the header of this dialog we are presented with the ‘Web Query Options’ dialog which is shown below and allows retention of RTF formatting or full fidelity HTML formatting.  Additionally you can switch off Excel’s automatic conversion of dates and web query redirections. 

 

 

We have selected ‘Full HTML formatting’ resulting in the grid image shown below.

 

 

Examples

Top

Copyright Nick Hodge 2009. All Rights Reserved.