发布新日志

  • 无法删除用户的处理(ora-01940)

    2009-02-25 13:14:36


    oracle 9208 for solaris 9

     


    故障描述:

    使用drop user test,进行用户删除时,报错或者hang,并且错误一直可以重现,测试如下:

    SQL> drop user test1 cascade;
    drop user test1 cascade
    *
    ERROR at line 1:
    ORA-01940: cannot drop a user that is currently connected

       并通过clone数据库也发现存在同样的问题,并且还会强行终止processs,以及报出ora-03114的
       错误。alert日志中无任何异常情况。

     

    原以为可能是数据字典损坏,通过10046进行追踪,
    =====================
    PARSING IN CURSOR #4 len=652 dep=1 uid=38 ct=47 lid=38 tim=1206301870620732 hv=3055982128 ad='247f3138'
    BEGIN
      BEGIN
        IF (xdb.DBMS_XDBZ0.is_hierarchy_enabled_internal(sys.dictionary_obj_owner, sys.dictionary_obj_name, sys.dictionary_obj_owner)) THEN
          xdb.XDB_PITRIG_PKG.pitrig_truncate(sys.dictionary_obj_owner, sys.dictionary_obj_name);
        END IF;
      EXCEPTION
        WHEN OTHERS THEN
         null;
      END;
      BEGIN
        IF (xdb.DBMS_XDBZ0.is_hierarchy_enabled_internal(sys.dictionary_obj_owner, sys.dictionary_obj_name, sys.dictionary_obj_owner, xdb.DBMS_XDBZ.IS_ENABLED_RESMETADATA)) THEN
          xdb.XDB_PITRIG_PKG.pitrig_dropmetadata(sys.dictionary_obj_owner, sys.dictionary_obj_name);
        END IF;
      EXCEPTION
        WHEN OTHERS THEN
         null;
      END;
    END;
    END OF STMT
    PARSE #4:c=1000,e=1261,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1206301870620711
    BINDS #4:


    发现每一次进行删除时都是到这一步,而正常的话,应该是后面还有很多过程,


    XCTEND rlbk=0, rd_only=1
    =====================
    PARSING IN CURSOR #5 len=14 dep=0 uid=0 ct=53 lid=0 tim=1206301870603227 hv=0 ad='b727fc04'
    drop user test
    END OF STMT
    PARSE #5:c=7998,e=32601,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1206301870603207
    BINDS #5:
    =====================
    PARSING IN CURSOR #4 len=652 dep=1 uid=38 ct=47 lid=38 tim=1206301870620732 hv=3055982128 ad='247f3138'
    =====================
    PARSING IN CURSOR #4 len=652 dep=1 uid=38 ct=47 lid=38 tim=1206301870620732 hv=3055982128 ad='247f3138'
    BEGIN
      BEGIN
        IF (xdb.DBMS_XDBZ0.is_hierarchy_enabled_internal(sys.dictionary_obj_owner, sys.dictionary_obj_name, sys.dictionary_obj_owner)) THEN
          xdb.XDB_PITRIG_PKG.pitrig_truncate(sys.dictionary_obj_owner, sys.dictionary_obj_name);
        END IF;
      EXCEPTION
        WHEN OTHERS THEN
         null;
      END;
      BEGIN
        IF (xdb.DBMS_XDBZ0.is_hierarchy_enabled_internal(sys.dictionary_obj_owner, sys.dictionary_obj_name, sys.dictionary_obj_owner, xdb.DBMS_XDBZ.IS_ENABLED_RESMETADATA)) THEN
          xdb.XDB_PITRIG_PKG.pitrig_dropmetadata(sys.dictionary_obj_owner, sys.dictionary_obj_name);
        END IF;
      EXCEPTION
        WHEN OTHERS THEN
         null;
      END;
    END;
    END OF STMT
    PARSE #4:c=1000,e=1261,p=0,cr=0,cu=0,mis=0,r=0,dep=1,og=1,tim=1206301870620711
    BINDS #4:
    =====================
    PARSING IN CURSOR #3 len=51 dep=2 uid=38 ct=3 lid=38 tim=1206301870636943 hv=397653107 ad='247ef48c'
    SELECT USER_ID FROM ALL_USERS WHERE USERNAME = :B1
    END OF STMT
    PARSE #3:c=2999,e=2890,p=0,cr=0,cu=0,mis=1,r=0,dep=2,og=1,tim=1206301870636918
    =====================
    PARSING IN CURSOR #2 len=37 dep=3 uid=0 ct=3 lid=0 tim=1206301870644807 hv=1398610540 ad='29af6be8'
    select text from view$ where rowid=:1
    END OF STMT
    PARSE #2:c=4999,e=4580,p=0,cr=0,cu=0,mis=1,r=0,dep=3,og=4,tim=1206301870644788
    =====================
    PARSING IN CURSOR #1 len=210 dep=4 uid=0 ct=3 lid=0 tim=1206301870654916 hv=864012087 ad='29a79348'
    select /*+ rule */ bucket_cnt, row_cnt, cache_cnt, null_cnt, timestamp#, sample_size, minimum, maximum, distcnt, lowval, hival, density, col#, spare1, spare2, avgcln from hist_head$ where obj#=:1 and intcol#=:2
    END OF STMT
    PARSE #1:c=6999,e=6808,p=0,cr=0,cu=0,mis=1,r=0,dep=4,og=3,tim=1206301870654895
    BINDS #1:
    kkscoacd
     Bind#0
      acdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
      acflg=08 fl2=0001 frm=00 csi=00 siz=24 ff=0
      kxsbbbfp=b71ea204  bln=22  avl=02  flg=05


      。。。。。

      。。。。。

     


    新建一个用户也无法删除,还是会报同样的错误。那么从这里基本可以确定出XDB这一块出现了损坏。

    恰好这里有一个经过clone过的环境,  通过testcase问题重现。

    唯一的解决办法就是reinstall xdb环境。过程如下:


    SQL> conn / as sysdba

    SQL> Shutdown Immediate;

    SQL> Startup migrate;

    SQL> @?/rdbms/admin/xdbrelod.sql 

    SQL> Startup

    SQL> @?/rdbms/admin/utlrp.sql


    通过重新安装xdb后问题解决,删除用户恢复正常。

    通过这个问题,并不是说这个case有多么的复杂,而是通过正常的跟踪手段来找出问题,判断问题是非常重要的。


     

  • 使用dbms_backup_restore.zerodbid(0)后的问题

    2008-09-06 02:34:46

    这是我自己的TESTDB
    之前使用dbms_backup_restore.zerodbid(0)之后没有重启DB
    因此造成在写system时实例down.

    alert日志中出现如下信息:
      Fri Sep  5 14:20:07 2008
    Errors in file /oracle/admin/orcl/bdump/orcl_ckpt_4414.trc:
    ORA-01242: data file suffered media failure: database in NOARCHIVELOG mode
    ORA-01122: database file 1 failed verification check
    ORA-01110: data file 1: '/oradata/system01.dbf'
    ORA-01206: file is not part of this database - wrong database id
    Fri Sep  5 14:20:07 2008
    CKPT: terminating instance due to error 1242
    Instance terminated by CKPT, pid = 4414

    处理方法如下:
    [oracle@asm2 ~]$ sqlplus /nolog

    SQL*Plus: Release 10.2.0.1.0 - Production on Fri Sep 5 14:27:58 2008

    Copyright (c) 1982, 2005, Oracle.  All rights reserved.

    SQL> conn / as sysdba
    Connected to an idle instance.
    SQL> startup
    ORACLE instance started.

    Total System Global Area  264241152 bytes
    Fixed Size                  1218892 bytes
    Variable Size              67110580 bytes
    Database Buffers          192937984 bytes
    Redo Buffers                2973696 bytes
    Database mounted.
    ORA-01221: data file 1 is not the same file to a background process


    SQL> shutdown abort;
    ORACLE instance shut down.
    SQL> startup mount;
    ORACLE instance started.

    Total System Global Area  264241152 bytes
    Fixed Size                  1218892 bytes
    Variable Size              67110580 bytes
    Database Buffers          192937984 bytes
    Redo Buffers                2973696 bytes
    Database mounted.
    SQL> recover database;
    Media recovery complete.
    SQL> alter database open;
    alter database open
    *
    ERROR at line 1:
    ORA-01221: data file 1 is not the same file to a background process


    SQL> alter database open resetlogs;
    alter database open resetlogs
    *
    ERROR at line 1:
    ORA-01139: RESETLOGS option only valid after an incomplete database recovery


    SQL> recover database using backup controlfile ;
    ORA-00279: change 1445010 generated at 09/05/2008 14:18:57 needed for thread 1
    ORA-00289: suggestion : /oradata/arch/1_816_663837713.dbf
    ORA-00280: change 1445010 for thread 1 is in sequence #816


    Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
    auto
    ORA-00308: cannot open archived log '/oradata/arch/1_816_663837713.dbf'
    ORA-27037: unable to obtain file status
    Linux Error: 2: No such file or directory
    Additional information: 3


    ORA-00308: cannot open archived log '/oradata/arch/1_816_663837713.dbf'
    ORA-27037: unable to obtain file status
    Linux Error: 2: No such file or directory
    Additional information: 3


    SQL> alter database until cancel;
    alter database until cancel
                         *
    ERROR at line 1:
    ORA-02231: missing or invalid option to ALTER DATABASE


    SQL> recover database until cancel;
    ORA-00283: recovery session canceled due to errors
    ORA-01610: recovery using the BACKUP CONTROLFILE option must be done


    SQL> recover database using backup controlfile ;
    ORA-00279: change 1445010 generated at 09/05/2008 14:18:57 needed for thread 1
    ORA-00289: suggestion : /oradata/arch/1_816_663837713.dbf
    ORA-00280: change 1445010 for thread 1 is in sequence #816


    Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
    cancel;
    ORA-00308: cannot open archived log 'cancel;'
    ORA-27037: unable to obtain file status
    Linux Error: 2: No such file or directory
    Additional information: 3


    Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
    cancel
    Media recovery cancelled.
    SQL> alter database open;
    alter database open
    *
    ERROR at line 1:
    ORA-01589: must use RESETLOGS or NORESETLOGS option for database open


    SQL> alter database resetlogs;
    alter database resetlogs
                           *
    ERROR at line 1:
    ORA-02231: missing or invalid option to ALTER DATABASE


    SQL> alter database open resetlogs;
    alter database open resetlogs
    *
    ERROR at line 1:
    ORA-01113: file 1 needs media recovery
    ORA-01110: data file 1: '/oradata/system01.dbf'


    SQL> recover datafile 1;
    ORA-00283: recovery session canceled due to errors
    ORA-01610: recovery using the BACKUP CONTROLFILE option must be done


    SQL> recover database using backup controlfile until cancel;
    ORA-00279: change 1445010 generated at 09/05/2008 14:18:57 needed for thread 1
    ORA-00289: suggestion : /oradata/arch/1_816_663837713.dbf
    ORA-00280: change 1445010 for thread 1 is in sequence #816


    Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
    cancel
    Media recovery cancelled.
    SQL> alter database open resetlogs;

    Database altered.

     

  • rman备份失败的处理

    2008-08-17 01:39:19

    rman备份失败的处理

     

    晚上刚跑步回来,就接到用户电话说备份失败,数据库已经有两天没有备份成功。错误信息如下:

    Starting backup at 16-AUG-08

    released channel: t1

    RMAN-00571: ===========================================================

    RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

    RMAN-00571: ===========================================================

    RMAN-03002: failure of backup command at 08/16/2008 22:12:04

    RMAN-06004: ORACLE error from recovery catalog database: RMAN-20011: target database incarnation is not current in recovery catalog


    配好VPN马上拨上去,检查incarnation,发现两边一致。

    crosscheck等命令仍然是报这个错误。

    reset database和resync database等所有操作全部都报错。

    作出最后决定,重新注册catalog信息。

    rman target / catalog rman/XXX@rman;

    偿试
    unregister database ;


    居然发现DB不认这个命令。

    晕倒,

    赶快登陆至rman数据库。

    查到dbid和db_key

    使用这以下这个包没有问题。
    dbms_rcvcat.unregisterdatabase(db_key,dbid);


    rman target / catalog rman/XXX@rman;

    register database;

    重新注册后数据库备份成功。

    处理好之后,才想起来原来之前有作过全库的恢复,并且在直接在rman中使用了resetlogs

    打开了DB。这样就会造成target和catalog的不一致。如果在sqlplus中resetlogs是不会

    出现此类问题。


     

  • WARNING:Oracle instance running on a system with low open file descriptor

    2008-08-03 18:56:32


    WARNING:Oracle instance running on a system with low open file descriptor
            limit. Tune your system to increase this limit to avoid


    客户的一个生产库在把SGA调至3.4GB并升级至9208时,发现alert中出现以下错误
    WARNING:Oracle instance running on a system with low open file descriptor
            limit. Tune your system to increase this limit to avoid


    此错误主要由于limits方面的限制,检查os,发现limits.conf未设置以下参数;
    vi /etc/security/limits.conf

    oracle soft nproc 2047
    oracle hard nproc 16384
    oracle soft nofile 1024
    oracle hard nofile 65536

    加上后问题解决。

    因此在安装oracle的时候,一定要仔细的弄清楚每个参数的意义。

    这样在troubleshooting的时候才能比较快速进行定位。

     

  • 设置大sort_area_size时发生的ora-00600[925]错误

    2008-08-03 18:38:45

    设置大sort_area_size时发生的ora-00600[925]错误

    今天rebuild一个超大的index,在设置手工sort_area_size到达2GB时
    触发ora-00600[925],修改不成功。

    SQL> alter session set sort_area_size=2097152000;
    ERROR:
    ORA-02097: parameter cannot be modified because specified value is invalid
    ORA-00600: internal error code, arguments: [925], [hash_area_size], [-100663296], [], [], [], [], []


    查询hash_area_size

    SQL> show parameter hash_area_size

    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    hash_area_size                       integer     1048576

    发现为1M,

    而sort_area_size的限制是<2*31方,并且不能比hash_area_size大这么多,

    考虑到这个因素,我们把hash_area_size改大


    SQL> alter session set hash_area_size=10485760;

    Session altered.

    SQL>  alter session set sort_area_size=2097152000;

    Session altered.

    SQL> show parameter sort

    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    nls_sort                             string
    sort_area_retained_size              integer     0
    sort_area_size                       integer     2097152000


    可以看到设置成功。

    其实这个也算是oracle的一个bug了。

    见Bug 2205276 and 4011860

  • 升级过程执行catpatch时报04021错误的处理办法

    2008-08-03 11:55:12

    升级过程执行catpatch时报04021错误的处理办法


    有时候在升级9.2X的时候会发生类似于以下的错误:
    create type XMLTypePI OID '0000000000000000000000000002014F' as
    *
    ERROR at line 1:
    ORA-04021: timeout occurred while waiting to lock object SYS.XMLTYPEPI

     

    Type created.

    grant execute on XMLTypePI to public with grant option
                     *
    ERROR at line 1:
    ORA-04021: timeout occurred while waiting to lock object SYS.XMLTYPEPI

     

    发生这个错误的同时,脚本是在这儿等待很长的时间,v$session_wait中会显示

    library cache pin这一等待。发生这个错误的主要原因在于在更新xmldb的时候,

    oracle需要一个版本一个版本往上提,如9204->9205->9206,如中间缺少,那么就

    出现这种报出错误的情况,解决这个错误的方法就是手工执行xdbpatch脚本。

    如:

    startup migrate

    spool xmlpatch.log
    @?/rdbms/admin/xdbpatch
    spool off;

    然后可以重新跑一遍catpatch执行一次字典的更新,
    可以发现该错误被解决。

    当然,这个实际上也是一个bug,oracle说在9207中已经fix掉,但实际上在9208上面还会
    有这个错误发生。

     

  • redo 相关的events及涉及到结构与参数

    2008-06-03 10:02:39

     <WaitEvent:log_file_sequential_read>
     <WaitEvent:log_file_single_write>
     <WaitEvent:log_file_parallel_write>
     <WaitEvent:log_file_space_switch>
      <WaitEvent:log_file_switch_checkpoint_incomplete>
      <WaitEvent:log_file_switch_archiving_needed>
      <WaitEvent:log_file_switch_clearing_log_file>
     <WaitEvent:log_file_switch_switch_logfile_command>
     <WaitEvent:log_file_switch_full_log_file>
     <WaitEvent:log_file_sync>
     <Statistic:redo_entries>
     <Statistic:redo_size>
     <Statistic:redo_entries_linearized>
     <Statistic:redo_buffer_allocation_retries>
     <Statistic:redo_small_copies>
     <Statistic:redo_wastage>
     <Statistic:redo_writer_latching_time>
     <Statistic:redo_writes>
     <Statistic:redo_blocks_written>
     <Statistic:redo_write_time>
     <Statistic:redo_log_space_requests>
     <Statistic:redo_log_space_wait_time>
     <Statistic:redo_log_switch_interrupts>
     <Statistic:redo_ordering_marks>
     <Statistic:background_checkpoints_started>
     <Statistic:background_checkpoints_completed>
     <Parameter:THREAD>
     <Parameter:LOG_BUFFER>
     <Parameter:LOG_IO_SIZE>
     <Parameter:LOG_CHECKPOINT_INTERVAL>
     <Parameter:LOG_CHECKPOINT_TIMEOUT>
  • ora-00600[2103]错误分析

    2008-04-19 01:18:53

    ora-00600[2103]错误分析


    环境是两台HP 8400, hpux11.11i ,oracle9208 做data guard.

    MAXIMUM AVAILABILITY模式下. 今天中午主库出现出现hang住的情况.

    查看主库alert日志,
    Fri Apr 18 11:00:44 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/bdump/c3cdb615_arc1_29593.trc:
    ORA-16099: internal error ORA-00600 occurred at standby database
    Fri Apr 18 11:18:49 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/bdump/c3cdb615_arc1_29593.trc:
    ORA-16099: internal error ORA-00600 occurred at standby database
    Fri Apr 18 11:39:53 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/bdump/c3cdb615_arc1_29593.trc:
    ORA-16099: internal error ORA-00600 occurred at standby database
    Fri Apr 18 12:00:58 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/bdump/c3cdb615_arc1_29593.trc:
    ORA-16099: internal error ORA-00600 occurred at standby database
    Fri Apr 18 12:22:03 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/bdump/c3cdb615_arc1_29593.trc:
    ORA-16099: internal error ORA-00600 occurred at standby database
    Fri Apr 18 12:40:09 2008

    备库alert日志:
    Fri Apr 18 11:40:05 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/udump/c3cdb615_rfs_25083.trc:
    ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900], [], [], []
    Fri Apr 18 11:40:05 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/udump/c3cdb615_rfs_25083.trc:
    ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900], [], [], []
    Fri Apr 18 11:46:05 2008
    RFS: Possible network disconnect with primary database
    Fri Apr 18 12:01:08 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/udump/c3cdb615_rfs_25339.trc:
    ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900], [], [], []
    Fri Apr 18 12:01:09 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/udump/c3cdb615_rfs_25339.trc:
    ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900], [], [], []
    Fri Apr 18 12:01:09 2008
    Errors in file /oracle/app/oracle/admin/c3cdb615/udump/c3cdb615_rfs_25339.trc:
    ORA-00600: internal error code, arguments: [2103], [0], [0], [1], [900], [], [], []
    Fri Apr 18 12:07:10 2008
    RFS: Possible network disconnect with primary database


    一眼就看出现主库的rfs这里出了问题,那么我们查看备库IO情况,iostat发现, 每秒为324K,这
    个速度明显是极为不正常的.怎么说也是va呀. 使用ioscan,发现已经无法找到盘.这里已经确信为
    IO子系统出现了故障.联系硬件工程师,更换存储系统.=

    这里我们顺便来看一下,关于ora-00600[2103]的一些处理办法:

    其中有一种是会造成instance crash的,如下:
    ORA-00600: internal error code, arguments: [2103], [1], [0], [1], [900], [], [], []
       LGWR: Detected ARCH process failure
       LGWR: Detected ARCH process failure
       LGWR: STARTING ARCH PROCESSES

    这里是由于bug6520821(unpublished)所引起的错误,主要是在做alter database backup controlfile ...时
    所引起的bug, 针对这个bug,oracle目前没有提供opatch,但是10202的patchset修复了这个bug.
    因此在9208上遇到这个bug,可以不做alter database backup ....这样的操作,因此要注意你
    的备份策略的设置.


    第二种情况是:

    oracle进程会由于获取CF(controlfile enqueue)时间太长而导致超时造成的.
    主要原因有:
    由于IO子系统速率过于低下,当然通常这种情况都是由于存储故障所导致,如控制器链路故障
    以及接交换机链路出现问题都会导致IO出现问题,这时候
    os故障.
    AIO多个写进程导致 ,这里可以考虑设置max_io_server和min_io_server至一个合适的值.
    日志切换过于频繁,或者日志文件过小以及日志组太少.
    使用隐含参数来控制超时时间,如:
    _CONTROLFILE_ENQUEUE_TIMEOUT = 10800

    当然,可能会有一些情况这里没有一一列出.这里只是作为2103错误的一个参考和解决的思路.

  • 关于对象的依赖性

    2008-03-23 02:26:35

    在对于一个表作相关的结构变化后,所依赖她的对象都将失效,
    这里包括视图,过程,函数或包处于无效状态,但其定义仍被保存于数据字典内,
    相关的权限信息,引用此视图的同义词,对象,及其他视图也 都会被保留。
    当重新访问这些对象的时候,oracle会尝试自动编译这些对象。注意,表,序列,和同义词总是处于有效状态的。
    Oracle 能够自动地跟踪数据库中发生的特定变化,并在数据字典中记录相关的方案对象的最新状态。
    状态记录是一个递归的过程。引用对象的状态变化不仅会导致其直接依赖对象的状态变化,
    同时会影响其间接依赖对象的状态。

    下面来看一个简单的例子:

     


    SQL> alter table test drop column z;

    Table altered.


    SQL> select status,object_name from user_objects;

     

    STATUS
    -------
    OBJECT_NAME
    --------------------------------------------------------------------------------
    VALID
    TMP_SES

    VALID
    BIN$SCF8qbynKMfgQAB/AQAMxw==$0

    VALID
    TEST


    STATUS
    -------
    OBJECT_NAME
    --------------------------------------------------------------------------------
    INVALID
    V_TEST

    VALID
    I_X


    14 rows selected.


    SQL> alter session set events '10046 trace name context forever,level 8';

    Session altered.

    SQL> select * from v_test;

         COUNT
    ----------
        100000

     

    ********************************************************************************

    ALTER VIEW "TEST"."V_TEST" COMPILE


    call     count       cpu    elapsed       disk      query    current        rows
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    Parse        1      0.01       0.01          0          0          0           0
    Execute      1      0.00       0.01          0          0          0           0
    Fetch        0      0.00       0.00          0          0          0           0
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    total        2      0.01       0.03          0          0          0           0

    Misses in library cache during parse: 1
    Optimizer mode: ALL_ROWS
    Parsing user id: 56     (recursive depth: 1)
    ********************************************************************************

  • 恢复XP中"快速启动"中的"显示桌面"图标

    2008-03-09 03:02:08

    前两天一不小心把"显示桌面"的图标给删除掉了,结果这两天电脑用的一直不舒服,特别是打开多个窗口的情况下,是非常不方便的. 经网上各种资料搜集,方法如下:新建一个文本文件:

    [Shell]
    Command=2
    IconFile=explorer.exe,3
    [Taskbar]
    Command=ToggleDesktop

    保存为文件名"Show Desktop.scf"

    点击“文件”菜单中的“保存”,将文件保存在以下位置:

    C:\Documents and settings\登陆的用户名\Application data\Microsoft\Internet Explore\Quick launch\
    注意,这里的"Application data"文件夹为隐含.

    这样即可恢复.

  • oracle9.2/Linux rac 动态加入新节点(4)

    2008-03-09 00:06:12

    添加实例

     

    l         rac3上修改listener.ora

    [oracle@rac3 ~]$ vi /u01/oracle/product/9.2/network/admin/listener.ora

    # LISTENER.ORA.RAC1 Network Configuration File: /u01/oracle/product/9.2/network/admin/listener.ora.rac1

    # Generated by Oracle configuration tools.

     

    LISTENER =

      (DESCRIPTION_LIST =

        (DESCRIPTION =

          (ADDRESS_LIST =

            (ADDRESS = (PROTOCOL = TCP)(HOST = rac3)(PORT = 1521))

          )

        )

      )

     

    SID_LIST_LISTENER =

      (SID_LIST =

        (SID_DESC =

          (SID_NAME = PLSExtProc)

          (ORACLE_HOME = /u01/oracle/product/9.2)

          (PROGRAM = extproc)

        )

        (SID_DESC =

          (ORACLE_HOME = /u01/oracle/product/9.2)

          (SID_NAME = rac1)

        )

        (SID_DESC =

          (ORACLE_HOME = /u01/oracle/product/9.2)

          (SID_NAME = rac2)

        )

        (SID_DESC =

          (ORACLE_HOME = /u01/oracle/product/9.2)

          (SID_NAME = rac3)

        )

      )

      注:要加入红色部分,否则

    [oracle@rac1 dbs]$ srvctl start instance -d test -i rac3

    PRKP-1011 : Failed to get all the listeners associated with instance rac3 on noderac3

    l         rac3上修改tnsnames.ora

    加入下面两行

    LISTENER_RAC3 =

    (ADDRESS = (PROTOCOL = TCP)(HOST = rac3)(PORT = 1521))

     

    l         3.5修改初始化参数文件

     在rac3上修改初始化参数文件加入以下内容

    rac3.instance_name=RAC3

    rac3.instance_number=3

    rac3.local_listener=LISTENER_RAC3

    rac3.thread=3

    rac3.undo_tablespace=UNDOTBS3

     

    l         rac1上设置redo线程

    SQL> alter database add logfile thread 3 group 5 ('/u01/oracle/oradata/test/redolog3_1') size 10485760 ;

    SQL> alter database add logfile thread 3 group 6 ('/u01/oracle/oradata/test/redolog3_2') size 10485760 ;

    SQL> alter database enable thread 3;

    l         rac1上创建undo 表空间

    SQL> create undo tablespace undotbs3 datafile '/u01/oracle/oradata/test/undotbs03' size 40m;

    l         添加新的节点并启动服务

    [oracle@rac3 ~]$ srvctl add instance -d test -i rac3 -n rac3

    [oracle@rac3 ~]$ srvctl start instance -d test -i rac3

    注意事项:

    在用DBCA添加实例时,在如下界面时不能点下一步,未找到原因,建议用命令方式添加实例。

    删除实例

    删除实例时,可按如下步骤操作

    SQL> drop tablespace undotbs3;

    SQL> alter database disable thread 3;

    SQL> alter database drop logfile group 6;

    SQL> alter database drop logfile group 5;

    [oracle@rac3 ~]$ srvctl remove instance -d test -i rac3

    如果需要添加多个实例,按如上步骤,逐个添加。

  • oracle9.2/Linux rac 动态加入新节点(3)

    2008-03-08 23:50:57

    配置Oracle Cluster Manger

     

    重新配置已存在的节点

    l         修改rac1的Cluster Manger文件

    (红色为加入的内容)

    [oracle@rac1 admin]$ vi cmcfg.ora

    ClusterName=Oracle Cluster Manager, version 9i

    MissCount=210

    PrivateNodeNames=rac2-priv rac1-priv rac3-priv

    PublicNodeNames=rac2 rac1 rac3

    ServicePort=9998

    CmDiskFile=/dev/raw/raw1

    KernelModuleName=hangcheck-timer

    HostName=rac1-priv

    l         修改rac2的Cluster Manger文件

    [oracle@rac2 admin]$ vi cmcfg.ora

    ClusterName=Oracle Cluster Manager, version 9i

    MissCount=210

    PrivateNodeNames=rac2-priv rac1-priv rac3-priv

    PublicNodeNames=rac2 rac1 rac3

    ServicePort=9998

    CmDiskFile=/dev/raw/raw1

    KernelModuleName=hangcheck-timer

    HostName=rac2-priv

    注:上述文件只要修改即可,的第三个节点相关启动时,它会自动重新配置。

    3.2配置新节点

    l         修改rac3的Cluster Manger文件

    ClusterName=Oracle Cluster Manager, version 9i

    MissCount=210

    PrivateNodeNames=rac2-priv rac1-priv rac3-priv

    PublicNodeNames=rac2 rac1 rac3

    ServicePort=9998

    CmDiskFile=/dev/raw/raw1

    KernelModuleName=hangcheck-timer

    HostName=rac3-priv

    l         在rac3上启动Cluster Manger

    [root@rac2 9.2]$ mkdir /var/opt/oracle

    [root@rac2 9.2]$ vi /var/opt/oracle/srvConfig.loc

    srvconfig_loc=/dev/raw/raw14

    [root@rac2 9.2]$ chown -R oracle:dba /var/opt/oracle

    [root@rac3 oracle]# chmod -R 755 /var/opt/oracle/srvConfig.loc

    [root@rac3 ~]# . /home/oracle/.bash_profile

    [root@rac3 ~]# /u01/oracle/product/9.2/oracm/bin/ocmstart.sh

    [oracle@rac3 ~]$ ps -ef|grep ora

    root      4892     1  0 Mar07 pts/1    00:00:08 oracm

    root      4894  4892  0 Mar07 pts/1    00:00:05 oracm

    root      4895  4894  0 Mar07 pts/1    00:04:04 oracm

    root      4896  4894  0 Mar07 pts/1    00:00:00 oracm

    root      4897  4894  0 Mar07 pts/1    00:01:04 oracm

    root      4898  4894  0 Mar07 pts/1    00:00:09 oracm

    root      4899  4894  0 Mar07 pts/1    00:00:08 oracm

    root      4900  4894  0 Mar07 pts/1    00:01:03 oracm

    root      4901  4894  0 Mar07 pts/1    00:00:30 oracm

    root      4952  4894  0 Mar07 pts/1    00:00:00 oracm

    root      4994  4894  0 Mar07 pts/1    00:00:05 oracm

    root      6467  4894  0 01:09 pts/1    00:00:52 oracm

    检查ocm日志

    [oracle@rac2 log]$ tail –f /u01/oracle/product/9.2/oracm/log/cm.log

    HandleUpdate(): SYNC(6) from node(0) completed {Mon Mar  3 18:12:28 2008 }^M

    HandleUpdate(): NODE(0) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(4) {Mon Mar  3 18:12:28 2008 }^M

    HandleUpdate(): NODE(1) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(6) {Mon Mar  3 18:12:28 2008 }^M

    HandleUpdate(): NODE(2) IS ACTIVE MEMBER OF CLUSTER, INCARNATION(7) {Mon Mar  3 18:12:28 2008 }^M

    NMEVENT_RECONFIG [00][00][00][00][00][00][00][07] {Mon Mar  3 18:12:28 2008 }^M

    Debug Hang : CMNodeListener(PID=15316) Registered with ORACM. {Mon Mar  3 18:12:28 2008 }^M

    <11557:7152><58441:2567>Debug Hang : CMNodeListener(PID=15317) Registered with ORACM. {Mon Mar  3 18:12:29 2008 }^M

    Successful reconfiguration,  3 active node(s) node 0 is the master, my node num is 0 (reconfig 7) {Mon Mar  3 18:12:29 2008 }^M

     

       这时可以启动gsd服务

    [oracle@rac3 ~]$ gsdctl start

    [oracle@rac3 ~]$ ps -ef|grep gsd

     

  • oracle9.2/Linux rac 动态加入新节点(2)

    2008-03-08 23:48:57

     确认网络及心跳都设置正确

    [root@rac3 ~]# ping rac1

    [root@rac3 ~]# ping rac2

    [root@rac3 ~]# ping rac1-priv

    [root@rac3 ~]# ping rac2-priv

    [root@rac3 ~]# rlogin rac1

    [root@rac3 ~]# rlogin rac2

    [root@rac3 ~]# rlogin rac1-priv

    [root@rac3 ~]# rlogin rac1-priv

    l         Oracle软件相关配置

    在操作系统装好以后,需要启动rshrloginrexec服务,打上安装ORACLE需要的补丁,修改内核参数,并正确定设置环境变量,目录要求和其它节点一样。

    export LD_ASSUME_KERNEL=2.4.1

    export ORACLE_BASE=/u01/oracle

    export ORACLE_HOME=/u01/oracle/product/9.2

    export PATH=$ORACLE_HOME/bin:$ORACLE_HOME/Apache/Apache/bin:$PATH

    export ORACLE_OWNER=oracle

    export ORACLE_SID=rac3

    export LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH

     

    export DISPLAY=192.168.88.10:0.0;

    export DBCA_RAW_CONFIG=/home/oracle/filename

       复制rac2点的软件目录至rac3

    [oracle@rac3 ~]$ cd /u01/oracle/product/9.2/

    [oracle@rac3 9.2]$ rcp -rp rac1:/u01/oracle/product/9.2/* .

    [oracle@rac3 9.2]$ cd /u01/oracle/product/9.2/oracm/admin/

    l         识别共享磁盘

    rac3上识别共享磁盘并挂载裸设备。

    raw /dev/raw/raw1 /dev/sdc1

    raw /dev/raw/raw2 /dev/sdc2

    raw /dev/raw/raw3 /dev/sdc3

    raw /dev/raw/raw4 /dev/sdc5

    raw /dev/raw/raw5 /dev/sdc6

    raw /dev/raw/raw6 /dev/sdc7

    raw /dev/raw/raw7 /dev/sdc8

    raw /dev/raw/raw8 /dev/sdc9

    raw /dev/raw/raw9 /dev/sdc10

    raw /dev/raw/raw10 /dev/sdc11

    raw /dev/raw/raw11 /dev/sdc12

    raw /dev/raw/raw12 /dev/sdc13

    raw /dev/raw/raw13 /dev/sdc14

    raw /dev/raw/raw14 /dev/sdc15

     

    raw /dev/raw/raw15 /dev/sdd1

    raw /dev/raw/raw16 /dev/sdd2

    raw /dev/raw/raw17 /dev/sdd3

    raw /dev/raw/raw18 /dev/sdd5

    raw /dev/raw/raw19 /dev/sdd6

    raw /dev/raw/raw20 /dev/sdd7

    raw /dev/raw/raw21 /dev/sdd8

    raw /dev/raw/raw22 /dev/sdd9

    raw /dev/raw/raw23 /dev/sdd10

    raw /dev/raw/raw24 /dev/sdd11

    raw /dev/raw/raw25 /dev/sdd12

     

    for i in `seq 1 25`

    do

        chmod 660 /dev/raw/raw$i

        chown oracle.dba /dev/raw/raw$i

    done

  • oracle9.2/Linux rac 动态加入新节点(1)

    2008-03-08 23:38:29

    本文档介绍了9.2RAC数据库如何动态加入第三个节点。

    操作系统:Linux 2.6.9-34.EL CST 2006 i686 i686 i386 GNU/Linux

    数据库: Oracle9i Enterprise Edition Release 9.2.0.8.0

    ORACLE_BASE=/u01/oracle

    ORACLE_HOME=/u01/oracle/product/9.2

     

     

    主机名

    网卡

    IP地址

    rac1

    ifcfg-eth0

    192.168.88.240

    ifcfg-eth0:0

    172.30.1.240

    rac2

    ifcfg-eth0

    192.168.88.241

    ifcfg-eth0:0

    172.30.1.241

    rac3

    ifcfg-eth0

    192.168.88.243

    ifcfg-eth0:0

    172.30.1.243

     

    分区

    裸设备文件

    用途

    /dev/sdc1

    /dev/raw/raw1

    Cluster Manager Quorum

    /dev/sdc2

    /dev/raw/raw2

    Undo tbs1

    /dev/sdc3

    /dev/raw/raw3

    Undo tbs2

    /dev/sdc5

    /dev/raw/raw4

    System

    /dev/sdc6

    /dev/raw/raw5

    spfile

    /dev/sdc7

    /dev/raw/raw6

    temp

    /dev/sdc8

    /dev/raw/raw7

    users

    /dev/sdc9

    /dev/raw/raw8

    Control file1

    /dev/sdc10

    /dev/raw/raw9

    Control file1

    /dev/sdc11

    /dev/raw/raw10

    Redo thread 1_1

    /dev/sdc12

    /dev/raw/raw11

    Redo thread 1_2

    /dev/sdc13

    /dev/raw/raw12

    Redo thread 2_1

    /dev/sdc14

    /dev/raw/raw13

    Redo thread 2_2

    /dev/sdc15

    /dev/raw/raw14

    Shared Configuration

    /dev/sdd1

    /dev/raw/raw15

    undotbs03

    /dev/sdc2

    /dev/raw/raw16

    redolog3_1

    /dev/sdc3

    /dev/raw/raw17

    redolog3_2

     

     

    注:rac3为新加节点

     

    l        网络配置

    在所有节点修改/etc/hosts

    192.168.88.240  rac1

    192.168.88.241  rac2

    192.168.88.243  rac3

    172.30.1.240    rac1-priv

    172.30.1.241    rac2-priv

    172.30.1.243    rac3-priv

       在所有节点修改/etc/hosts.equiv

    rac1

    rac2

    rac3

    rac1-priv

    rac2-priv

    rac3-priv

       在所有节点修改/root/.rhosts

    rac1 root

    rac2 root

    rac3 root

    rac1-priv root

    rac2-priv root

    rac3-priv root

  • oracle中的临时表

    2008-03-04 01:16:40

    oracle中的临时表

    临时表通常用来保存一个事务或者会话期间的数据.
    临时表中保存的数据是具有独立性的,只对各自会话可见,并且每个会话
    都只能查询和修改属于此会话的数据,在对temporary table作dml操作时,
    不需要申请锁资源,因此lock语句对于临时表来说是没有作用的.

    在空间方面,在创建永久性表时通常是需要为表分配initial extent,但是对于
    临时表是不需要的,临时表只是在使用的时候,根据数据来分配创建临时段.


    对临时表的 DML 操作不会产生数据修改的重做日志,但是将产生被修改数据的撤销记录,

    及撤销记录的重做日志.

    我们来看一下临时表所产生的redo size情况.

    SQL> select * from v$sesstat where sid=159 and statistic#=134;

           SID STATISTIC#      VALUE
    ---------- ---------- ----------
           159        134     929956

    SQL> create global temporary table temp_ses on     
      2  commit preserve rows
      3  as
      4  select * from dba_objects;

    Table created.

    SQL> select * from v$sesstat where sid=159 and statistic#=134;

           SID STATISTIC#      VALUE
    ---------- ---------- ----------
           159        134     948420


    再来看一下创建一个同样大小数据量的永久性表:
    SQL> select * from v$sesstat where sid=142 and statistic#=134;

           SID STATISTIC#      VALUE
    ---------- ---------- ----------
           142        134       1432

    SQL> create table pert as select * from dba_objects;

    Table created.

    SQL> select * from v$sesstat where sid=142 and statistic#=134;

           SID STATISTIC#      VALUE
    ---------- ---------- ----------
           142        134    5724360

    SQL> select 5724360-1432 from dual;

    5724360-1432
    ------------
         5722928

    SQL>


    可以看到两者的差距已经不仅仅是一个数据级了.

     

    临时表一共有两种:
    会话级别和事务级别的.

    先来看一个会话级别的.

    SQL> create global temporary table tmp_ses on commit preserve rows
      2  as  select * from dba_objects;

    Table created.

    SQL>

    on commit preserve rows表明这一个基于会话的临时表,在会话断开以后.
    所有数据都将被抹去.


    SQL> create global temporary table tmp_trans on commit delete rows
      2  as select * from dba_objects;

    Table created.

    SQL>

    on commit delete rows表示这是一个基于事务的临时表,在会话提交的时候,数据

    就会被自动清除掉。

    SQL> insert into tmp_trans select * from dba_objects;

    49772 rows created.

    SQL> select count(*) from tmp_trans;

      COUNT(*)
    ----------
         49772

    SQL> commit;

    Commit complete.

    SQL> select count(*) from tmp_trans;

      COUNT(*)
    ----------
             0

    这里可以看到在事务commit以后,临时表中的数据被全部清空。而这个清空的过程

    是几乎不存在开销,oracle完成的仅仅是把临时段回收的一个动作。


    临时段的分配

    临时表使用临时段来分配数据,因此在创建临时表的时候,oracle并不会为其分配段,

    而是在使用的时候才分配。我们可以使用v$sort_usage来观察某个临时表所占用的

    空间大小。

    SQL> select * from v$sort_usage;

    no rows selected

    SQL> insert into tmp_ses select * from dba_objects;

    49772 rows created.

    SQL> commit;

    Commit complete.

    SQL> select count(*) from tmp_ses;

      COUNT(*)
    ----------
         49772

    SQL> select distinct sid from v$mystat;

           SID
    ----------
           142

    SQL> select sid,serial# from v$session where sid=142;

           SID    SERIAL#
    ---------- ----------
           142        168

    SQL>  select username,user,session_num,
      2  tablespace segtype,segfile#,segblk# ,blocks from v$sort_usage;

    USERNAME   USER       SESSION_NUM SEGTYPE      SEGFILE#    SEGBLK#     BLOCKS
    ---------- ---------- ----------- ---------- ---------- ---------- ----------
    TEST       TEST               168 TEMP              201       2313        768


    这里可以看到,通过与v$session视图中的serial#相关联,可以得出某个会话的

    所拥有的临时表中的数据的大小,当然,这里与永久表一样,在delete的时候

    是不会释放出空间的:

    SQL> select username,user,session_num,tablespace segtype,segfile#,segblk# ,blocks from v$sort_usage;

    USERNAME   USER       SESSION_NUM SEGTYPE      SEGFILE#    SEGBLK#     BLOCKS
    ---------- ---------- ----------- ---------- ---------- ---------- ----------
    TEST       TEST               168 TEMP              201       2313        768

    SQL> delete tmp_ses;

    19772 rows deleted.

    SQL> commit;

    Commit complete.

    SQL> select count(*) from tmp_ses;

      COUNT(*)
    ----------
             0

    SQL> select username,user,session_num,tablespace segtype,segfile#,segblk# ,blocks from v$sort_usage;

    USERNAME   USER       SESSION_NUM SEGTYPE      SEGFILE#    SEGBLK#     BLOCKS
    ---------- ---------- ----------- ---------- ---------- ---------- ----------
    TEST       TEST               168 TEMP              201       2313        768

    SQL>


    SQL> truncate table tmp_ses;

    Table truncated.

    SQL>  select username,user,session_num,tablespace segtype,segfile#,segblk# ,blocks from v$sort_usage;

    no rows selected


    可以看到,在删除数据时,oracle采用了节省成本的方式,减少了不必要的开销。


    关于临时表的事务,与事务相关的临时表中的数据可以被用户的事务及子事务访问。

    但是这些数据不能被同一会话里的两个事务同时访问。不同会话中的事务可以同时

    使用同一个事务相关的临时表。如果用户事务对临时表执行了 INSERT 操作,

    在此之后此事务的子事务将不能使用这个临时表。

    如果在子事务中对临时表执行了 INSERT 操作,临时表中已有的数据将被清除。

    子事务结束后,父事务及其他子事务对此临时表访问权利将被恢复。
     

  • ora-00600[4400]错误的解决

    2008-02-06 03:16:33

    ora-00600[4400]错误的解决

    4400错误通常在分布式事务中出现,如到这个错误不会对数据库产生重大的影响,

    以下是一个生产库的实际解决的例子。

    故障描述:
    Errors in file .../udump/ora_15899_$ORACLESID.trc:
    ORA-00600: 内部错误代码,自变量: [4400], [48], [], [], [], [], [], []
    Tue Feb 5 17:07:18 2008
    Errors in file .../bdump/reco_1297_$ORACLESIDl.trc:
    ORA-02062: distributed recovery received DBID e23ac21, expected 232acd6
    ARC1: media recovery disabled

     

    原因及解决方法:
    由于db_link断开,两阶段事务不能正常处理,查询dba_2pc_pending
    LOCAL_TRAN_ID          GLOBAL_TRAN_ID                 STATE            MIX A TRAN_COMMENT         FAIL_TIME FORCE_TIM RETRY_TIM OS_USER
    ---------------------- ------------------------------ ---------------- --- - -------------------- --------- --------- --------- ----------------------------------------------------------------
    OS_TERMINAL
    --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    HOST                                                                                                                             DB_USERCOMMIT#
    -------------------------------------------------------------------------------------------------------------------------------- ------------------------------ ----------------
    192.168.1.1              jpb.6c5f8f5f.192.168.1.1     collecting       no                         05-FEB-08           05-FEB-08 Administrator
    DGDX-WR
    WORKGROUP\DGDX-WR                                                                                                                IBSS7740378455873

    192.168.1.2             jpb.6c5f8f5f.192.168.1.2    collecting       no                         05-FEB-08           05-FEB-08 Administrator
    DGDX-WR
    WORKGROUP\DGDX-WR                                                                                                                IBSS7740389820541
    192.168.1.112             jpb.6c5f8f5f.192.168.1.112    collecting       no                         05-FEB-08           05-FEB-08 Administrator
    DGDX-WR
    WORKGROUP\DGDX-WR                                                                                                                IBSS7740970651708


    通过以下方法来清除这些事务:

    SQL> execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('192.168.1.2');
    PL/SQL procedure successfully completed.
    SQL> commit;
    Commit complete.
    SQL> execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('192.168.1.1');
    PL/SQL procedure successfully completed.
    SQL> commit;
    Commit complete.
    SQL> execute DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('192.168.1.112');
    PL/SQL procedure successfully completed.
    SQL> commit;


     

  • Tuning RAC and using parallel features(6)

    2008-02-06 01:34:11

    RAC waits events ad interconnect statistics

    The RAC events are listed next in the report if you are running RAC (multiple instances). As stated earlier, you need to run STATSPACK or AWR Report for each instance that you have.For statspack, you run the statspck.snap procedure and the spreport.sql script. on each node you want to monitor to compare to instances.One of the best methods to see if a node is operating efficiently is to compare the report from that node to one from another node that accesses the same database. It’s very important to remember that single-instance tuning should be performed before attempting to tune the processes that communicate via the cluster interconnect. In other words,tune the system in single instance before you move it to RAC.

    Some of the top waits events that you may encounter are listed briefly next.

    The top global cache(gc) waits to look out for include :

     

    Gc current block busy: happens when an instance requests a CURR data block(wants to do some DML) and the block to be transferred is in use.

     

    Gc buffer busy: A wait event that occurs whenever a session has to waiwt for an ongoing operation on the resource to complete because the block is in use. The process has to wait for a block to become available because another process is obtaining a resource for this block.

     

    Gc cr request: This happens when one instance is waiting for blocks from another instance’s cache(sent via the interconnect). This wait says that the current instance can’t find a consistent read(CR) version of a block in the local cache. If the block is not in the remote cache, then a db file  sequential read wait will also follow this one. Tune the SQL that is causing large amounts of reads that get moved from node to node. Try to put users that are using the same blocks on the same instance so that blocks are not moved from instance to instance. Some non-Oracle application servers will move the same process from node to node looking for the fastest node(unaware that they are moving the same blocks from node to node). Pin these long processes to the same node. Potentially increase the size of the local cache if slow I/O combined with a small cache is the problem. Monitor v$cr_block_server to see if there is an issue like reading UNDO segments.  Correlated to the waits the values for p1, p2, p3=file,block,lenum(look in v$lock_element for row where lock_element_addr has the same value as lenum). Happens when an instance  requests a CR data block and the block to be transferred hasn’t arrived at the requesting instance.  This is one I see the most, and it’s usually because the SQL is poorly tuned and many index blocks are being moved back and forth between instance.

     

    Figure shows the AWR Report RAC section. You can see that there are two instances in this cluster. You can also see things like the number of blocks send and received as well as how many of the blocks are being accessed in the local cache(99.1 percent) versus the disk or another instance. As you would guess , it is faster to access block in the local cache, but accessing one of the remote caches on one of the other nodes is almost always faster(given a fast enouth interconnect and no saturation of the interconnect) than going to disk.

         The following is another valuable query to derive session wait information. The instance_id lists the instance where the waiting session resides. The sid is te unique identifier for the waiting session(gv$session). The p1,p2 ,and p3 columns list event-specific information that may be useful for debugging. LAST_SQL lists

    The last SQL executed by the waiting session.

     

    Set numwidth 10

    Col state format a7 tru

    Col event format a25 tru

    Col last_sql format a40 tru

    Select sw.inst_id instance_id,

           sw.sid sid,

           sw.state state,

           sw.event event,

           Sw.seconds_in_wait seconds_wating,

           sw.p1,

           sw.p2,

           sw.p3,

           Sa.sql_text last_sql

      From gv$session_wait sw, gv$session s, gv$sqlarea sa

     Where sw.event not in

           ('rdbms ipc message', 'smon timer', 'pmon timer',

            'SQL*Net message from client',

            'lock manager wait for remote message', 'ges remote message',

            'gcs remote message', 'gcs for action', 'client message', 'pipe get',

            'PX Idle Wait', 'single-task message', 'listen endpoint status',

            'slave wait', 'wakeup time manager')

       And sw.SECONDS_IN_WAIT > 0

       And (sw.INST_ID = s.inst_id and sw.sid = s.sid)

       And (s.inst_id = sa.inst_id and s.sql_address = sa.address)

     Order by SECONDS_IN_WAIT desc;

    Seconds_in_wait:  if wait_time=0, then seconds_in_wait is the seconds spend in the current wait condition. If wait_time>0, then seconds_in_wait is the seconds since the start of the last wait, and seconds_in_wait-wait_time/100 is the active seconds since the last wait ended.

     

    RAC Statistics

     

    Begin

    End

    Number of Instances:

    2

    2

    Global Cache Load Profile

     

    Per Second

    Per Transaction

    Global Cache blocks received:

    73.42

    10.16

    Global Cache blocks served:

    81.75

    11.31

    GCS/GES messages received:

    110.26

    15.26

    GCS/GES messages sent:

    112.45

    15.56

    DBWR Fusion writes:

    0.07

    0.01

    Estd Interconnect traffic (KB)

    1,284.90

     

    Global Cache Efficiency Percentages (Target local+remote 100%)

    Buffer access - local cache %:

    99.31

    Buffer access - remote cache %:

    0.60

    Buffer access - disk %:

    0.08

     

     

  • Tuning RAC and using parallel features(5)

    2008-02-06 00:23:45

     

    RAC Cluster interconnect performance

     

    The most complex aspect of RAC tuning involves monitoring and the subsequent tuning of process associated with the Global Services Directory(GSD) . The group of processes associated with the GSD is the Global Enqueue Service(GES) and the Global Cache Service(GCS). The GSD processes communicated throught the cluster interconnects . if the cluster interconnects are not configured to process data packets efficiently, then the entire RAC implementation will perform. poorly. This is true regardless of performance-related tuning and configuration efforts in other areas.

     

    Interconnect Traffic-sessions waiting

    Sessions that wait on non-idle wait wvents that impact interconnect traffic can be monitored by a query that lists GCS waits using the global dynamic performance view gv$session_wait. You may also see these waits in a STATSPACK or AWR Repot. The major waits that are being monitored are as follows:

     

    Global cache busy: A wait event that occurs whenever a session has to wait for an ongoing operation on the resource to complete.

     

    Gc buffer busy : A wait event that is signaled when a process has to wait for a block to become available because another process is obtaining a resource for this block.

     

    Buffer busy global CR: waits on a consistent read(block needed for reading) via the global cache.

     

    To identify the sessins experiencing  waits on the system, perform. the following tasks.query v$session_wait to determine whether or no any sessions are experiencing RAC-related waits (at the current time)Identify the objects that are causing contention for these sessions. Try to modify the object or query to reduce contention.For example, query v$session_wait to determine whether or not any session are experiencing RAC cache-related waits. Note that the GV$ views are used much more to show statistics for the entire cluster, whereas the V$ views still show statistics from a single node. If you plan to use RAC, you must extend the v$ views an queries to the GV$ views for multiple nodes. Thie section is only an initial guide to help you see all of the components. This scope of this book does not cover RAC specifically,but some things that will help you tune RAC.

     

    Select inst_id, event, p1 file, p2 block, wait_time

      From gv$session_wait

     Where event in (‘buffer busy global CR’, ‘global cache busy’, ‘buffer busy global cache’);

     

    The output from this query should look something like this:

     

    Inst_id  event                                 file   block wait_time

    ---------------------------------------------------------------

    1        global cache busy                      9      150   15

    2        global cache busy                      9      150   10

     

    Run this query to identify objects that are causing contention for these sessions and identifying the object that corresponds to file and block for each file/block combination returned:

    Select owner, segment_name, segment_type

      From dba_extents

     Where file_id=9

       And 150 between block_id and block_id+blocks-1

     

    Modify the object to reduce the chances for application contention by doing the following:

    Reduct the number of rows per block.

    Adjust the block size to a smaller block size.

    Modify initrans and freelists.

  • 关于tom的unload脚本

    2008-02-05 01:42:01

    tom写的一个用来生成为sqlldr准备的格式.

    其中有unix和windows两个版本.

    在itpub中有人已经上传.

    具体可以到这里下载:

    http://www.itpub.net/thread-927857-1-1.html

    关于这个脚本,需要注意的是,由于tom只是输出一个例子.因此并未把数据写入至文件,

    其实稍把sqlldr_exp.sql脚本作一修改即可实现, 如下:

     

    set wrap off
    set linesize 100
    set feedback off
    set pagesize 0
    set verify off
    set termout off

    spool ytmpy.sql

    prompt set head off;
    prompt set feedback off;
    prompt spool aa.txt
    prompt prompt LOAD DATA
    prompt prompt INFILE *
    prompt prompt INTO TABLE &1
    prompt prompt REPLACE
    prompt prompt FIELDS TERMINATED BY '|'
    prompt prompt (
    select  'prompt ' || decode(column_id,1,'',',') || lower(column_name)
    from    user_tab_columns
    where   table_name = upper('&1')
    order by column_id
    /
    prompt prompt )
    prompt prompt BEGINDATA

    prompt  select
    select  lower(column_name)||'||chr(124)||'
    from    user_tab_columns
    where   table_name = upper('&1') and
        column_id != (select max(column_id) from user_tab_columns where
                 table_name = upper('&1'))
        order by column_id
    /
    select  lower(column_name)
    from    user_tab_columns
    where   table_name = upper('&1') and
        column_id = (select max(column_id) from user_tab_columns where
                 table_name = upper('&1'))
        order by column_id
    /
    prompt  from    &1
    prompt  /
    prompt spool off
    prompt set feedback on
    prompt set head on
    spool off
    set termout on
    @ytmpy.sql
    exit

     

     

  • Tuning RAC and using parallel features(4)

    2008-02-03 02:26:23

    Cache Fusion and Resource Coordination

      Since each node in Real  Application  Cluster has its own memory(cache) that is not shared with other nodes, RAC must coordination the buffer caches of different nodes while minimizing additional disk I/O that could reduct performance. Cache Fusion is the technology that uses high-speed interconnects to provide cache-to-cache transfers of data blocks between instances in a cluster. Cache Fusion functionality allows direct memory writes of dirty blocks to alleviate the need to force a disk write and re-read(or ping) the commited blocks. However, this is not to say that disk writes do not occur. Disk writes and still required for cache replacement and when a checkpoint occurs. Cache Fusion addresses the issues involved in concurrency between instances: concurrent reads on multiple nodes, concurrent reads and writes on different nodes,  and concurrent writes on different nodes.

       Oracle only reads data blocks from disk if they are not already present in the buffer caches of any instance. Because data block writes are deferred, they often contain modifications from multiple transactions. The modified data blocks are written to disk only when a checkpoint occurs. Before we go further, we need to be familiar with a couple of concepts introduced in Oracle 9i RAC:resource modes and resource roles. Because the same data blocks can concurrently exist in multiple instances, there are two identifiers that help to coordinate these blocks:

     

    l       Resource mode: The modes are null, shared, and exclusive. The block can be held in different modes, depending on whether a resource holder intends to modify data or merely read them.

    l       Resource role:The roles are locally managed and globally managed.

     

    Global Resource Directory(GRD) is not a database. It is a collection of internal structures and is used to find the current status of the data blocks. Whenever a block is transferred out of a local cache to another instance’s  cache, GRD is updated. The following information about a resource is available in GRD:

    l       Data Block identifiers(DBA)

    l       Location of most current versions

    l       Modes of the data blocks(N,S,X)

    l       The roles of the blocks (local or global)

     

    Past Image

    To maintain the data integrity, a new concept of past image was introduced in 9i Version of RAC. A past image(PI) of a block is kept in memory before the block is sent and serves as an indication of whether or not it is a dirty block. In the event of failure, GCS can reconstruct the current version of the block by reading PIs. This PI is different from a CR block, which is needed to reconstruct read-consistent images. The CR version of a block represents a consistent snapshot of the data at a point in time.

       As an example, Transaction-A f instance-A has updated row-2 on block-5, and later another Transaction-B of Inst-B has updated row-6 on same block-5. Block-5 has been transferred from Inst-A to B. At this time, Past Image(PI) for block-5 is created on Insta-A.

     

    SCN Processing

    System change numbers(SCNs) uniquely identify a commited transaction and the changes it makes. An SCN is a logical time stamp that defines a committed version of a database at one point in time. Oracle assigns every committed transaction a unique SCN.

       Within RAC, since you have multiple instances that perform. commits, the SCN changes need to be maintained within an instance, but at the same time, they must also be synchronized across all instances with a cluster. Therefore, SCN is handled by the Global Cache Service using the Lamport SCN generation scheme, or by using a hardware block or dedicated SCN server. SCNs are recorded in the redo log so that recovery operations can be synchronized in Oracle 9i Real Application Cluster.

1241/71234567>

数据统计

  • 访问量: 20629
  • 日志数: 650
  • 建立时间: 2007-12-21
  • 更新时间: 2009-02-25

RSS订阅

Open Toolbar