The datapump detective...



As some of you may have read we're migrating an old 9i system to 12c and are having to make use of streams to make up for the fact that CDC has been removed in 12c. This had all been going well and we were quite happily replicating from 12.1.0.1 to 10.2.0.4 (where we built a downstream CDC - it's not the cleanest solution but it gets the job done and it should be short lived).

You'll notice i said 'had' been going well, we hit a bug whereby we had spilled data to disk from the capture queue and the database crashed at this exact point - this left us in a state where capture would not restart due to a bug.

Now the bug was only fixed in 12.1.0.2 and the vendor of the 3rd party app wasn't keen on us moving to this version and would not offer 'full' support, we requested a one off backport (OOB) from oracle support for 12.1.0.1 but it wasn't clear how long this would take so we had to go ahead and at least try to upgrade to 12.1.0.2.

Which we did and it indeed fixed the issue - but created a new one - again with streams. Now i had written a draft blog on all of this but the xray scanner at Munich airport seemed to decide to destroy my hard drive ( it was working before it went in and wasn't soon afterwards....) - anyway that story may never see the light of day now.

However we then had an even more interesting tale of what happened next....

Oracle support backported the fix to 12.1.0.1 and made it available for linux x64 - great - so lets downgrade.

Which i duly did and it seemed to work very smoothly - nice to know that this process works well.

What happened next they was more than a little confusing...

So after the downgrade we had to drop and rebuild streams as it was totally stuck because of the issue we hit at 12.1.0.2. We ran the same scripts as before but something very odd was happening - we were defining 4 tables to be extracted in the plsql block calling datapump - but only one was being extracted......

I won't paste the whole block here but the key points of interest were these

  object_name(1) := 'ZCOTRDDAY';
  object_name(2) := 'TPOW';
  object_name(3) := 'BOOK';
  object_name(4) := 'STREAM_HEARTBEAT';

  h1 := dbms_datapump.open(operation=>'EXPORT',job_mode=>'TABLE',
    remote_link=>'',
    job_name=>NULL, version=> min_compat);

The code was basically looping through the small array of values and then building up a filter string to pass to dbms_datapump. dbms_datapump was opened using the call above.

What was actually happening is that STREAM_HEARTBEAT was exported but nothing else was.

First i thought must be some bad plsql and went down that route but after debugging and checking it the plsql looked fine - so it must be datapump?

So i then assumed that maybe the downgrade hadn't done something properly (though there were no errors) so i'd rerun catalog and catproc - which i did.

And nothing changed

Then i thought i'd try command line datapump and see what happened

expdp / tables=zainet.tpow, zainet.book  content=metadata_only reuse_dumpfiles=y

This worked fine - so the problem must be dbms_datapump - right? Well as they both do essentially the same thing under the covers this seemed unlikely. Then i realised that actually the 'call' to datapump was different - in the plsql version we were passing in the version parameter explicitly (and this was being fetched from the 10.2 database) - so i then tried that from the command line

expdp / tables=zainet.tpow, zainet.book version=10.2 content=metadata_only reuse_dumpfiles=y

And guess what......

no errors but the tables did not export....

must be a downgrade bug right?

So i ran catalog and catproc again.....

and nothing changed...

OK time to get serious lets trace this thing and turn it up to 11 (anything less than that for this kind of issue probably wouldn't be worth it)

expdp / tables=zainet.tpow, zainet.book version=10.2 content=metadata_only reuse_dumpfiles=y trace=1FF0300

heading to the trace directory and finding the log for the datapump worker process a quick scan of the file reveals this

META:11:29:36.325:  get_xml_inputs TABLE_EXPORT/TABLE/TABLE_OBJNUM:
SELECT /*+all_rows*/ KU$.OBJ_NUM, KU$.TABLE_TYPE FROM SYS.KU$_10_1_TABLE_OBJNUM_VIEW KU$ WHERE NOT (BITAND (KU$.SCHEMA_OBJ.FLAGS,16)=16) AND   KU$.BASE_OBJ.OWNER_NAME IN (SELECT UNIQUE object_schema FROM "OPS$ORACLE"."SYS_EXPORT_TABLE_01" WHERE process_order = -55 AND duplicate BETWEEN 1 AND 2) AND  BITAND(KU$.SCHEMA_OBJ.FLAGS,4194304)=0 AND NOT EXISTS (SELECT 1 FROM  SYS.KU$NOEXP_TAB A WHERE A.OBJ_TYPE='TABLE' AND A.NAME=KU$.BASE_OBJ.NAME AND A.SCHEMA=KU$.BASE_OBJ.OWNER_NAME) AND NOT EXISTS (SELECT 1 FROM  SYS.KU$NOEXP_TAB A WHERE A.OBJ_TYPE='SCHEMA' AND A.NAME=KU$.BASE_OBJ.OWNER_NAME)  AND  (((ku$.schema_obj.owner_name,ku$.schema_obj.name) IN (SELECT object_schema, object_name FROM "OPS$ORACLE"."SYS_EXPORT_TABLE_01" WHERE process_order = -55 AND duplicate BETWEEN 1 AND 2)))
rowtag: ROW objnum_count: 0 callout: 8 Bind count: 0 Bind values:
META:11:29:36.325:  Begin statement open
META:11:29:36.391:  End statement open
META:11:29:36.391: metatrace-OPEN    41376.325    41376.391         .066 0 TABLE_EXPORT/TABLE/TABLE_OBJNUM
META:11:29:36.739: metatrace-FETCH    41376.391    41376.739         .348 1 TABLE_EXPORT/TABLE/TABLE_OBJNUM
META:11:29:36.739: SET_OBJECTS_FETCHED(2) called for TABLE_OBJNUM objects_fetched = 1
META:11:29:36.739: OBJECTS_FETCHED for TABLE_OBJNUM = 1

Now this is intriguing - there seems to be a version specific view that determines what datapump can extract - in this case SYS.KU$_10_1_TABLE_OBJNUM_VIEW

Lets look at the source of that to look for any clues...


  select t.* from ku$_11_2_table_objnum_view t
      where
         NOT EXISTS (
           select property from col$ c /* exclude tabs with virtual cols */
           where c.obj# = t.obj_num
               and bitand(c.property, 65536) >= 65536            /* virtual cols */
               and bitand(c.property, 256) = 0               /* not a sysgen col */
              and bitand(c.property, 32768) = 0)                  /* not unused */

Now this looks like a major clue - we are basically extracting anything in the 11.2 view apart from stuff which matches the criteria in the not exists clause.....

A quick query reveals


select c.name from col$ c, obj$ t /* exclude tabs with virtual cols */
           where c.obj# = t.obj#
               and bitand(c.property, 65536) >= 65536            /* virtual cols */
and t.name='TPOW'
/

NAME
--------------------------------------------------------------------------------
SYS_STSM_MC8HFVOZZTCAY2QTKG95P
SYS_NC00215$
SYS_NC00216$
SYS_STS4W6SM11$N7M7BC13DVHUTUU
SYS_STSC#WMWQ2GLEP04WAS49R4ZM3


Right...... what are those doing - there seem to be 2 distinct types of virtual column created - but i haven't added those.

However i know from reading various things that extra stuff is going on in v12 that might be causing this and i immediately do this

  1  select index_name,FUNCIDX_STATUS from dba_indexes where table_name='TPOW' and owner='ZAINET'
  2* and funcidx_status='ENABLED'
SQL> /

INDEX_NAME
--------------------------------------------------------------------------------
FUNCIDX_
--------
TPOW_TFIRM
ENABLED

TPOW_TRADTNUM
ENABLED


SQL> drop index zainet.TPOW_TFIRM;

Index dropped.

SQL> drop index zainet.TPOW_TRADTNUM;

Index dropped.

rerunning the query then reveals this

select c.name from col$ c, obj$ t /* exclude tabs with virtual cols */
           where c.obj# = t.obj#
               and bitand(c.property, 65536) >= 65536            /* virtual cols */
and t.name='TPOW'
/

NAME
--------------------------------------------------------------------------------
SYS_STSM_MC8HFVOZZTCAY2QTKG95P
SYS_STSQDSVGIY7_3QMVFD3Z#WODV2
SYS_STSH#5$FM9XDKSA1Z1GY468$F_
SYS_STS4W6SM11$N7M7BC13DVHUTUU
SYS_STSC#WMWQ2GLEP04WAS49R4ZM3

So - that's kind of progress - we've deleted the FBI's (only temporarily to allow is to move past this issue) - but what are the other virtual columns (and actually why have i got 2 extra ones now)?

Well - turns out they are extended statistics - so lets get rid of those - with a quick bit of sqlplus trickery

select 'exec dbms_stats.drop_extended_stats('||''''||owner||''''||','||''''||table_name||''''||','||''''||EXTENSION||''''||');'
from dba_stat_extensions
where owner='ZAINET' and table_name='TPOW'

This returns a list of sql statements i can run to drop all the extended stats - which i do and it all works apart from this one


SQL> exec dbms_stats.drop_extended_stats('ZAINET','TPOW','("AUDIT_ALF","AUDIT_AOHM","TRADE_TNUM","TRADE_TDATE")');
*
ERROR at line 1:
ORA-00001: unique constraint (SYS.I_WRI$_OPTSTAT_HH_OBJ_ICOL_ST) violated
ORA-06512: at "SYS.DBMS_STATS", line 13020
ORA-06512: at "SYS.DBMS_STATS", line 13074
ORA-06512: at "SYS.DBMS_STATS", line 45105
ORA-06512: at line 1

The only real hit i could find for this was an unanswered forum question - but i knew a way to get past it - just delete all of the optimizer stats history.....


SQL> exec DBMS_STATS.PURGE_STATS(DBMS_STATS.PURGE_ALL);

PL/SQL procedure successfully completed.

Now i try again and all is well.......

SQL> exec dbms_stats.drop_extended_stats('ZAINET','TPOW','("AUDIT_ALF","AUDIT_AOHM","TRADE_TNUM","TRADE_TDATE")');

PL/SQL procedure successfully completed.

And now when we do a datapump extract with version=10.2 it all works! SO the problem was 12c added some virtual columns which then precluded that table from being extracted in 10.2 format - i would imagine this was really intended to deal with customer created virtual columns and not hidden internal columns - perhaps the SYS.KU$_10_1_TABLE_OBJNUM_VIEW view should be slightly amended to deal with these special cases.

Anyway quite an interesting issue that was nicely explained by doing a bit of tracing and detective work.

1 comments:

  1. Excellent detective work -- made short work of my same problem!

    ReplyDelete