12c has introduced some new compression options for datapump and i wanted to do a quick (fairly scientific) comparison of the various options to see how they compared.
I did some tests with a single table from the HPQC application whose structure is this
SQL> desc QUALITY_R000000006_DB.AUDIT_LOG
Name Null? Type
----------------------------------------- -------- ----------------------------
AU_ACTION_ID NOT NULL NUMBER(10)
AU_FATHER_ID NOT NULL NUMBER(10)
AU_USER NOT NULL VARCHAR2(2000)
AU_SESSION_ID NOT NULL NUMBER(10)
AU_TIME NOT NULL DATE
AU_ACTION NOT NULL VARCHAR2(200)
AU_ENTITY_TYPE NOT NULL VARCHAR2(200)
AU_ENTITY_ID VARCHAR2(200)
AU_DESCRIPTION CLOB
The table contains 1.2 million rows and is about 400MB (so pretty small).
I then did a series of tests with the various compression options to check the speed and the file size generated. I then just out of interest repeated this with exp and manual compression (both using a pipe and zipping after the process finished.
The results are summarized here
Command
|
Elapsed
|
File Size
|
|
datapump no compression
|
40s
|
353M
|
|
datapump basic compression
|
37s
|
32.23M
|
|
|
75s | 46.24M | |
datapump medium compression | 34s | 25.63M | |
datapump high compression | 42s | 18.93M | |
gzip dumpfile from datapump no compression | 20s | 21.6 | |
standard exp | 120s | 798M | |
exp with gzip | 120s + 10s for zip | 36M | |
exp with gzip (using inline pipe) |
180
|
39M
|
So what can even this most basic of tests tell us?
Well with the small data set some of the results may be a little unrepresentative but we can make the following sweeping statements based on this simple test....
1) expdp is much faster and create smaller files than exp (speed is no real surprise but the dumpfile being half the size is interesting)
2) datapump compression did not seem to make much difference to the overall speed
3) 'LOW' compression seems really bad (slow)for some reason, even when the test was re-run
4) 'HIGH' creates the smallest file (even better than the best gzip compression)
The amount and speed of compression is very variable depending on the datatypes and data that is being worked with - so the results above will likely not translate to 'your' environment but at least it gives a general feel for how relatively 'good' things are.
And remember datapump compression is not free - you need an advanced compression licence......
The result for standard exp doesn't look quite right :-)
ReplyDeleteThanks - the perils of cut and paste - now corrected to 798M!
Delete