Article-ID: Circulation: PUBLISHED (EXTERNAL) Folder: server.Rdbms.DBA.Corruption Topic: Alerts Title: Alert: Database may crash after upgrade to 8.1.6 (Oracle8i Release2) Document-Type: ALERT Impact: HIGH Skill-Level: ACCOMPLISHED Server-Version: 08.01.06 Updated-Date: 31-JAN-2000 13:50:59 References: Shared-Refs: Authors: JBARLOW.UK RPOWELL.UK Attachments: NONE Content-Type: TEXT/PLAIN Products: 5/RDBMS; Platforms: GENERIC; Alert: Database may crash after upgrade to 8.1.6 (Oracle8i Release2) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This alert describes some essential checks to perform if upgrading to Oracle 8.1.6 from an earlier release. Oracle 8.1.6 is the first server release to perform block checking for dictionary objects by default. This block checking can detect problems which went un-noticed in earlier releases of Oracle. In particular , which affects Oracle releases 7.3 to 8.1.5 inclusive, may have silently introduced a minor corruption in the data dictionary. If a database with this corruption is upgraded to 8.1.6 or higher then the new block checking code can notice the corruption causing the instance to be aborted or the dictionary block to be marked as permanently corrupt. Versions of Oracle Affected ~~~~~~~~~~~~~~~~~~~~~~~~~~~ This problem potentially affects all databases upgraded as follows: Upgrade path Potentially affected? -------------- --------------------- 8.1.5 to 8.1.6 YES 8.0.X to 8.1.6 YES 7.3.4 to 8.1.6 NO Note, if a database is upgraded from 7.3.4 to 8.1.6 via 8.0.X (i.e. it has been run under 8.0.X) it is potentially affected by this problem. Platforms Affected ~~~~~~~~~~~~~~~~~~ GENERIC - affects ALL platforms. Description ~~~~~~~~~~~ Due to the SYS.C_TS# data dictionary cluster can become silently corrupted in Oracle release 7.3.4 to 8.1.5 inclusive. This corruption manifests itself as a single byte space 'leak' from a cluster block. Notably, but not exclusively, this behaviour has been observed on some databases after the CATREP.SQL script has been run. @ ( Note, that this bug is UNRELATED to the SYS.C_TS# problem described in @ which was alerted in ) @ The space leak is benign per se, and on Oracle versions up to (and including) 8.1.5 is likely to go unnoticed as long as the database instance is running WITHOUT any special block checking events. Under 8.1.6, block checking is automatically enabled for data objects in the SYSTEM tablespace. If a block in the SYS.C_TS# cluster was previously corrupted then the database may crash after the block is modified in 8.1.6. @ Internal Note: @ ~~~~~~~~~~~~~~ @ System tablespace block checking is enabled under 8.1.6 by the hidden @ parameter _DB_ALWAYS_CHECK_SYSTEM_TS, which is TRUE by default. Setting @ _DB_ALWAYS_CHECK_SYSTEM_TS=FALSE will disable dictionary object checking. @ has been fixed in 8.1.6. However, the fix only prevents the initial corruption. It is NOT remedial, and DOES NOT FIX existing corruptions. Likelihood of Occurrence ~~~~~~~~~~~~~~~~~~~~~~~~ The problem described here does NOT affect databases created using Oracle8i Release 2 (8.1.6) or higher. It can affect databases upgraded from earlier 8.0 or 8.1 releases. The likelihood of the database crashing as described here is high if the dictionary contains any logically corrupt data blocks. This can be checked prior to upgrade as described in the section "Checking for Corruptions" below. Databases migrated DIRECTLY from 7.3.4 to 8.1.6 will not be affected by this problem as the data dictionary objects (including SYS.C_TS#) are rebuilt during migration. Possible Symptoms ~~~~~~~~~~~~~~~~~ Possible symptoms of this problem are: (1) Recursive DML that modifies the datablock raises: ORA-00604: error occurred at recursive SQL level 1 ORA-00607: Internal error occurred while making a change to a data block ORA-00600: internal error code, arguments: [kcoapl_blkchk] [afn] [block] (2) Database writer process may terminate with: ORA-00600: internal error code, arguments: [kcbzpb_2] [afn] [block] Where 'afn' is the absolute datafile number, and 'block' is the datablock in that datafile. @ @ The reason that db writer may fail is because of the hidden parameter @ _CHECK_BLOCK_AFTER_CHECKSUM. If this parameter is enabled, block checking @ is performed after the block checksum is calculated. @ _CHECK_BLOCK_AFTER_CHECKSUM is enabled by default. @ @ Fortunately, both of the above internal errors produce trace files which @ dump out the details of the space check failure, for example: @ @ *** SESSION ID:(7.3) 2000-01-13 15:37:45.359 @ Block Checking: DBA = 4194427, Block Type = KTB-managed data block @ @ kdbchk: the amount of space used is not equal to block size @ used=310 fsc=11 avsp=1630 dtl=1952 @ If you see either of these symptoms contact your local support centre for advice on what steps to take. Checking for corruptions ~~~~~~~~~~~~~~~~~~~~~~~~ BEFORE upgrading to 8.1.6, Oracle Support strongly recommends that you check the data dictionary objects for possible corruptions. A SQL script to do this can be generated as follows from SQL*PLUS (connected as SYS): (Spool to an appropriately named file for your platform. This example uses Unix file naming conventions) set head off feedback off pagesize 500 echo off spool /tmp/analyze_objects.sql select 'ANALYZE TABLE "'||table_name||'" VALIDATE STRUCTURE CASCADE;' from dba_tables where tablespace_name='SYSTEM' and owner='SYS'; select 'ANALYZE CLUSTER "'||cluster_name||'" VALIDATE STRUCTURE CASCADE;' from dba_clusters where tablespace_name='SYSTEM' and owner='SYS'; spool off Because 8.1.6 enables block checking for all SYSTEM tablespace objects, it would be worthwhile running a script that analyzes ALL (SYS and non-SYS owned) objects in this tablespace. Although it is bad practice to use the SYSTEM tablespace for non-SYS objects, it is not an uncommon one. To generate a script that analyzes all SYSTEM tablespace objects, execute the following SQL: set head off feedback off pagesize 500 echo off spool /tmp/analyze_objects.sql select 'ANALYZE TABLE "'||owner||'"."'||table_name|| '" VALIDATE STRUCTURE CASCADE;' from dba_tables where tablespace_name='SYSTEM'; select 'ANALYZE CLUSTER "'||owner||'"."'||cluster_name|| '" VALIDATE STRUCTURE CASCADE;' from dba_clusters where tablespace_name='SYSTEM'; spool off Edit /tmp/analyze_objects.sql to remove everything but the ANALYZE commands, and run the script, connected as a privileged user. If ANALYZE fails for any of the tables or clusters, RAISE A SUPPORT CALL WITH ORACLE SUPPORT BEFORE PROCEEDING WITH THE UPGRADE. If ANALYZE detects a leaked byte in a cluster, it will create a trace file (which should be saved) and fail with the following error: ORA-1498: block check failure - see trace file *** IMPORTANT *** AT THE VERY LEAST, THE SYS.C_TS# CLUSTER MUST BE ANALYZED PRIOR TO UPGRADING TO 8.1.6 using the following command: ANALYZE CLUSTER SYS.C_TS# VALIDATE STRUCTURE CASCADE; What should I do I an "ANALYZE .. VALIDATE.." reports an error ? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If any of the ANALYZE commands reports an error then you should do the following: a) Check the meaning of the error reported. Eg: "ORA-942 table or view does not exist" may mean a table was dropped between generating the script and running the ANALYZE. b) Re-run the ANALYZE for the affected table / cluster to see if the error is repeatable. c) For ORA-1499 or ORA-1498 errors there should be a trace file produced in your USER_DUMP_DEST directory. Collect all such trace files together, along with details of the ANALYZE command which fails and the exact error reported. d) Contact your local support center with all the details and trace files collected. Workaround / Patches ~~~~~~~~~~~~~~~~~~~~ is fixed in 8.1.6. Any database with the corruption introduced by this bug should be looked at by Oracle Support prior to the database being upgraded to 8.1.6. @ INTERNAL NOTE: @ ~~~~~~~~~~~~~~ @ NB: It is essential to validate any corruption is due to Bug:839342 @ before advising the customer on any options. @ @ One option to allow a system to run under 8.1.6 with this corruption @ is to set the following parameters: @ DB_BLOCK_CHECKING=FALSE @ _DB_ALWAYS_CHECK_SYSTEM_TS=FALSE @ _CHECK_BLOCK_AFTER_CHECKSUM=FALSE @ This will revert to the default behaviour for earlier versions and allow @ the database to run. This should only be used as an interim workaround. @ @ See for instructions on how to patch the 'space leak' @ corruption. Articles ~~~~~~~~ Corruption detection features in Oracle Parameter reference: DB_BLOCK_CHECKING @ _CHECK_BLOCK_AFTER_CHECKSUM @ _DB_ALWAYS_CHECK_SYSTEM_TS @ Verification and patching instructions @ Block check code not soft corrupting in 8.1.6