Oracle SQL Loader Example – NFL 2016 Games

Oracle SQL Loader (sqlldr.exe) utility provides an efficient way to perform a bulk data load into an Oracle table.  Sqlldr.exe doesn’t record inserts into a transaction log, so performance is improved.

The following example was set up to walk through a scenario to load data from a text file into an Oracle database.  The steps in the example’s procedure were performed on a Microsoft Windows 10 workstation, but other platforms should work as well.  Some additional commentary is provided after the procedure instructions.

Prerequisites

  1. Oracle database – A simple Oracle XE instance on a local workstation is fine.

Note:  For this example, I am using a “datascienceuser” username and “xe” instance name.

  1. Oracle SQL Client installed on workstation.
  2. Oracle sqlldr.exe installed on workstation – this utility is installed with Oracle XE and is also available with Oracle SQL client. With Oracle SQL client – sqlldr.exe is not installed by default.  You may need to run Oracle SQL client install again to add sqlldr.exe if not already installed.
  3. zip – zip file with all data and scripts for 2016 NFL games Oracle examples. Zip file can be obtain from the following Github repository:

https://github.com/mndatascienceexamples/datascienceexamples

  1. zip – extract zip file to c:\ drive. For consistency, I will place the extracted files in c:\Oracle_2016NFLSeason directory.  C:\Oracle_2016NFLSeason will be the working directory for the Oracle SQL Loader Example.

Procedure

  1. Open Windows command prompt and change directory to C:\Oracle_2016NFLSeason
  2. Run Oracle SQL client and login.
  3. Create Oracle NFL Game tables by running the following script

@createNFLOraTables.sql

  1. Create Oracle stored procedures for validating and loading NFL Game data from staging table to transaction table.

@nflDataLoad.pks
@nflDataLoad.pkg

  1. Exit out of Oracle SQL client utility
  2. From Windows command prompt, execute the loadNFLGameDataScript.bat batch file with username, password and Oracle instance id as parameters. Example:

loadNFLGameDataScript.bat %ora_user_id% %password% %ora_instance%

  1. Run Oracle SQL client and login.
  2. Turn on Oracle server output by executing following command

set serveroutput on;

  1. Verify data was loaded into NFL_GAME_STAGE table

select count(*) from NFL_GAME_STAGE;

*Should have 261 records at this point

  1. Move NFL Games data from staging table into transaction table by executing following stored procedure

exec nfldataload.NFL_GAME_DATA_LOAD;

  1. Verify NFL Games data was moved appropriately.

select count(*) from NFL_GAME_STAGE;
*Should have 0 records at this point

select count(*) from NFL_GAME;
*Should have 256 records at this point

select count(*) from NFL_GAME_ERROR;
*Should have 5 records at this point

Additional Notes

Oracle SQL Loader Control Files

There are two control files which were intended to show differences in how data files can be loaded into Oracle.  Notes regarding nflGameStaging.ctl and nflGameStagingBadData.ctl control files.

  • Both data input files contain a field header line, which should not be loaded into Oracle. Line one (OPTIONS (SKIP=1)) in the control file will do that.
  • Both control files use LOAD DATA INFILE to direct sqlldr.exe to data input file.
  • Both control files load data into same Oracle table (NFL_GAME_STAGE)
  • ctl control file uses “INSERT INTO TABLE” option. “INSERT INTO TABLE” requires the Oracle table to be empty in order to load data.  If there are any rows in the Oracle table, sqlldr.exe will error with “INSERT INTO TABLE” option.
  • ctl control file uses “APPEND INTO TABLE” option. “APPEND INTO TABLE” does not require the Oracle table to be empty in order to load data.  If there are any rows in the Oracle table, sqlldr.exe will continue to load data with “APPEND INTO TABLE” option.

loadNFLGameDataScript.bat

Commands to run sqlldr.exe can be directly typed at command prompt.  A batch program is helpful if there are multiple tables to load data into, which would require multiple control files.

Data Validation – Using a Staging Table

A staging table provides a way to load data into Oracle without affecting the application’s transaction table.  If there data errors, you may want to exclude that data from an application’s transaction table.  If the load fails due to a data error – application is not affected.

Using sqlldr.exe utility will check the structure of the data (Integers, Characters, length of data, etc).  Logical data errors need to be checked separately.  If there is a process is to validate the data before using sqlldr.exe, then a staging table isn’t necessary.  If there is not a process to check the data before loading data into Oracle, then a staging table is recommended.  An Oracle stored procedure could be used to validate the data prior to moving the data from a staging table into a transaction table.

Oracle PL/SQL MERGE INTO

The NFL Game data load stored procedure uses a MERGE INTO command.  The MERGE INTO can be used to prevent duplicate records.  If there is a match on the defined criteria, then PL/SQL will just perform an update.  If there is not a match on the defined criteria, then the PL/SQL will insert the data into the transaction table.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

Up ↑

%d bloggers like this: