A recent forum question I answered prompted me to post this note - the requirement was that the external table needed to have a number type column but the underlying data had commas in which meant it would not work. I initially thought some kind of number format could be applied in the external table definition but that did not work and the answer was to use a preprocessor - so here are the steps:
The csv file contains this data:
fielda|fieldb
somedata|1,000
somedata|1,000
We need to create a simple script that will strip out commas - for this we can use sed
I created a script (test.sh) with the following content
/bin/sed 's/,//g' $1
Note full path to sed is important as is the $1 and the file must have execute permissions set
So to now use this shell script we refer to it in the external table defintion as a preprocessor - so before the csv file is read the shell script runs against it - this does not alter the original file but the result of the shell script on the file is what is fed into oracle.
So to create the external table we now run:
DROP TABLE XT_TEST;
CREATE TABLE XT_test
( fielda varchar2(10),
fieldb number
)
ORGANIZATION EXTERNAL
( TYPE ORACLE_LOADER
DEFAULT DIRECTORY DATA_PUMP_DIR
ACCESS PARAMETERS
( records delimited by newline
PREPROCESSOR DATA_PUMP_DIR:'test.sh' badfile DATA_PUMP_DIR:'testcsv.bad'
logfile DATA_PUMP_DIR:'testcsv.log'
discardfile DATA_PUMP_DIR:'testcsv.dsc'
skip 1
fields terminated by '|'
lrtrim
missing field values are null
( FIELDA CHAR(20) ,
FIELDB
)
)
LOCATION (DATA_PUMP_DIR:'test.csv')
)
REJECT LIMIT UNLIMITED NOMONITORING
/
Now when we fetch from the table the data is there (with no commas) and the table has a column of type number
SYS@EMZA3>select * from xt_test;
FIELDA FIELDB
---------- ----------
somedata 1000
somedata 1000
SYS@EMZA3>desc xt_test
Name Null? Type
----------------------------------------- -------- ----------------------------
FIELDA VARCHAR2(10)
FIELDB NUMBER
Comments
Post a Comment