Character Set in SQL*Loader

Sometimes we face a problem of SQL*Loader loading data into a wrong column OR only partial data gets loaded into a column. These issues might occur because of the characters in the data being loaded. By default SQL*Loader recognizes data of UTF-8 character set only. If unrecognized characters are found, SQL*Loader might treat it as a delimiter and populate data into the next column. To avoid this, correct character set has to be mentioned in the SQL*Loader control file properties based on the data being loaded.

Below are the commonly used character sets in SQL*Loader:

  1. ASCII (American Standard Code for Information Interchange) for the United States
  2. WE8ISO8859P1 for Western European Languages
  3. KOI8-R for Russian
  4. GB18030 and BIG5 for Chinese
  5. BIG5-HKSCS for HongKong

Syntax to Set Character Set property:

OPTIONS
LOAD DATA
CHARACTERSET WE8ISO8859P1
INFILE ‘$FILE’
TRUNCATE
INTO TABLE XX_XYZ
.
.

For complete supported list of character sets, have a look this doc from Oracle.