Sunday, December 30, 2012

SQL Loader in Oracle

basic sql loader comments----
load data
 infile 'c:\data\mydata.csv'
 into table emp
 fields terminated by "," optionally enclosed by '"'    
 ( empno, empname, sal, deptno )

Loading delimited (variable length) data
In the first example we will show how delimited (variable length) data can be loaded into Oracle:
LOAD DATA
 INFILE *
 INTO TABLE load_delimited_data
 FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
 TRAILING NULLCOLS
 (  data1,
    data2
 )
BEGINDATA
11111,AAAAAAAAAA
22222,"A,B,C,D,"
NOTE: The default data type in SQL*Loader is CHAR(255). To load character fields longer than 255 characters, code the type and length in your control file. By doing this, Oracle will allocate a big enough buffer to hold the entire column, thus eliminating potential "Field in data file exceeds maximum length" errors. Example:
...
resume char(4000),
...
Loading positional (fixed length) data
If you need to load positional data (fixed length), look at the following control file example:
LOAD DATA
  INFILE *
  INTO TABLE load_positional_data
  (  data1 POSITION(1:5),
     data2 POSITION(6:15)
  )
BEGINDATA
11111AAAAAAAAAA
22222BBBBBBBBBB
For example, position(01:05) will give the 1st to the 5th character (11111 and 22222).

Can one modify data as the database gets loaded?

Data can be modified as it loads into the Oracle Database. One can also populate columns with static or derived values. However, this only applies for the conventional load path (and not for direct path loads). Here are some examples:
LOAD DATA
  INFILE *
  INTO TABLE modified_data
  (  rec_no                      "my_db_sequence.nextval",
     region                      CONSTANT '31',
     time_loaded                 "to_char(SYSDATE, 'HH24:MI')",
     data1        POSITION(1:5)  ":data1/100",
     data2        POSITION(6:15) "upper(:data2)",
     data3        POSITION(16:22)"to_date(:data3, 'YYMMDD')"
  )
BEGINDATA
11111AAAAAAAAAA991201
22222BBBBBBBBBB990112
LOAD DATA
  INFILE 'mail_orders.txt'
  BADFILE 'bad_orders.txt'
  APPEND
  INTO TABLE mailing_list
  FIELDS TERMINATED BY ","
  (  addr,
     city,
     state,
     zipcode,
     mailing_addr   "decode(:mailing_addr, null, :addr, :mailing_addr)",
     mailing_city   "decode(:mailing_city, null, :city, :mailing_city)",
     mailing_state,
     move_date      "substr(:move_date, 3, 2) || substr(:move_date, 7, 2)"
  )

[edit]Can one load data from multiple files/ into multiple tables at once?

Loading from multiple input files
One can load from multiple input files provided they use the same record format by repeating the INFILE clause. Here is an example:
LOAD DATA
  INFILE file1.dat
  INFILE file2.dat
  INFILE file3.dat
  APPEND
  INTO TABLE emp
  ( empno  POSITION(1:4)   INTEGER EXTERNAL,
    ename  POSITION(6:15)  CHAR,
    deptno POSITION(17:18) CHAR,
    mgr    POSITION(20:23) INTEGER EXTERNAL
  )
Loading into multiple tables
One can also specify multiple "INTO TABLE" clauses in the SQL*Loader control file to load into multiple tables. Look at the following example:
LOAD DATA
 INFILE *
 INTO TABLE tab1 WHEN tab = 'tab1'
   ( tab  FILLER CHAR(4),
     col1 INTEGER
   )
 INTO TABLE tab2 WHEN tab = 'tab2'
   ( tab  FILLER POSITION(1:4),
     col1 INTEGER
   )
BEGINDATA
tab1|1
tab1|2
tab2|2
tab3|3
The "tab" field is marked as a FILLER as we don't want to load it.
Note the use of "POSITION" on the second routing value (tab = 'tab2'). By default field scanning doesn't start over from the beginning of the record for new INTO TABLE clauses. Instead, scanning continues where it left off. POSITION is needed to reset the pointer to the beginning of the record again. In delimited formats, use "POSITION(1)" after the first column to reset the pointer.
Another example:
LOAD DATA
 INFILE 'mydata.dat'
 REPLACE
 INTO TABLE emp
      WHEN empno != ' '
 ( empno  POSITION(1:4)   INTEGER EXTERNAL,
   ename  POSITION(6:15)  CHAR,
   deptno POSITION(17:18) CHAR,
   mgr    POSITION(20:23) INTEGER EXTERNAL
 )
 INTO TABLE proj
      WHEN projno != ' '
 (  projno POSITION(25:27) INTEGER EXTERNAL,
    empno  POSITION(1:4)   INTEGER EXTERNAL
 )

[edit]Can one selectively load only the records that one needs?

Look at this example, (01) is the first character, (30:37) are characters 30 to 37:
LOAD DATA
  INFILE  'mydata.dat' BADFILE  'mydata.bad' DISCARDFILE 'mydata.dis'
  APPEND
  INTO TABLE my_selective_table
  WHEN (01) <> 'H' and (01) <> 'T' and (30:37) = '20031217'
  (
     region              CONSTANT '31',
     service_key         POSITION(01:11)   INTEGER EXTERNAL,
     call_b_no           POSITION(12:29)   CHAR
  )
NOTE: SQL*Loader does not allow the use of OR in the WHEN clause. You can only use AND as in the example above! To workaround this problem, code multiple "INTO TABLE ... WHEN" clauses. Here is an example:
LOAD DATA
  INFILE  'mydata.dat' BADFILE  'mydata.bad' DISCARDFILE 'mydata.dis'
  APPEND
  INTO TABLE my_selective_table
  WHEN (01) <> 'H' and (01) <> 'T'
  (
     region              CONSTANT '31',
     service_key         POSITION(01:11)   INTEGER EXTERNAL,
     call_b_no           POSITION(12:29)   CHAR
  )
  INTO TABLE my_selective_table
  WHEN (30:37) = '20031217'
  (
     region              CONSTANT '31',
     service_key         POSITION(01:11)   INTEGER EXTERNAL,
     call_b_no           POSITION(12:29)   CHAR
  )

[edit]Can one skip certain columns while loading data?

One cannot use POSITION(x:y) with delimited data. Luckily, from Oracle 8i one can specify FILLER columns. FILLER columns are used to skip columns/fields in the load file, ignoring fields that one does not want. Look at this example:
LOAD DATA
TRUNCATE INTO TABLE T1
FIELDS TERMINATED BY ','
( field1,
  field2 FILLER,
  field3
)
BOUNDFILLER (available with Oracle 9i and above) can be used if the skipped column's value will be required later again. Here is an example:
LOAD DATA
INFILE *
TRUNCATE INTO TABLE sometable
FIELDS TERMINATED BY ","  trailing nullcols
(
 c1,
 field2 BOUNDFILLER,
 field3 BOUNDFILLER,
 field4 BOUNDFILLER,
 field5 BOUNDFILLER,
 c2     ":field2 || :field3",
 c3     ":field4 + :field5"
)

[edit]How does one load multi-line records?

One can create one logical record from multiple physical records using one of the following two clauses:
  • CONCATENATE - use when SQL*Loader should combine the same number of physical recordstogether to form one logical record.
  • CONTINUEIF - use if a condition indicates that multiple records should be treated as one. Eg. by having a '#' character in column 1.

[edit]How does one load records with multi-line fields?

Using Stream Record format, you can define a record delimiter, so that you're allowed to have the default delimiter ('\n') in the field's content.
After the INFILE clause set the delimiter:
 load data
 infile "test.dat" "str '|\n'"
 into test_table
 fields terminated by ';' TRAILING NULLCOLS
 (
   desc,
   txt
 )
test.dat:
 one line;hello dear world;|
 two lines;Dear world,
 hello!;|
Note that this doesn't seem to work with inline data (INFILE * and BEGINDATA).

No comments: