Friday, June 30, 2017

Disk Scrubbing Feature – Oracle Exadata Database Machine



Introduction:

Disk scrubbing is a feature introduced with Oracle 11.2.0.4 and Oracle Exadata storage software version 11.2.3.3.0. 
It periodically validates the integrity of the mirrored ASM extents and thereby eliminates latent corruption. 
Because disk scrubbing can cause spikes in disk utilization and latency and adversely affect database performance, 
it is designed to be scheduled on production servers when average I/O utilization is minimal. 
By default, the hard disk scrub runs every two weeks.

The following parameters control disk scrubbing:

• hardDiskScrubInterval: 
Sets the interval for proactive resilvering of latent bad sectors. Valid options are daily, weekly, biweekly and none.
• hardDiskScrubStartTime:
Sets the start time for proactive resilvering of latent bad sectors. Valid options are a date/time 
combination or now.

Schedules available to enable hard disk scrubbing:

• hardDiskScrubInterval=daily
• hardDiskScrubInterval=weekly
• hardDiskScrubInterval=biweekly
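
The interval values listed above can be checked before a command is sent to a cell. The following is a minimal Python sketch, not part of CellCLI: the helper name `scrub_interval_command` is hypothetical, and it only builds the `cellcli -e alter cell` command string used later in this post without running it.

```python
# Hypothetical helper (not an Oracle tool): validates an interval value
# and builds the cellcli command string. Nothing is executed on a cell.
VALID_INTERVALS = ("daily", "weekly", "biweekly", "none")

def scrub_interval_command(interval):
    """Return the cellcli command that sets hardDiskScrubInterval."""
    value = interval.strip().lower()
    if value not in VALID_INTERVALS:
        raise ValueError(
            "hardDiskScrubInterval must be one of: " + ", ".join(VALID_INTERVALS)
        )
    return "cellcli -e alter cell hardDiskScrubInterval=" + value

print(scrub_interval_command("weekly"))
```

Rejecting an unknown value locally avoids sending a mistyped interval to every storage server.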

Ways to check the alert log on an Oracle Exadata Storage Server

1. ADRCI
2. CELLTRACE

Where to look in Oracle Exadata Storage Servers:

Exadata Storage Server-1: CellServer01

[celladmin@CellServer01 ~]$ cd $CELLTRACE
[celladmin@CellServer01 trace]$ pwd
/opt/oracle/cell12.1.2.3.3_LINUX.X64_161109/log/diag/asm/cell/CellServer01/trace
[celladmin@CellServer01 trace]$ ls -l alert*
-rw-rw---- 1 root celladmin 254890 Mar 11 05:03 alert.log
[celladmin@CellServer01 trace]$

(OR)

[celladmin@CellServer01 ~]$ adrci
ADRCI: Release 12.1.0.2.0 - Production on Mon Mar 13 12:49:12 2017
Copyright (c) 1982, 2016, Oracle and/or its affiliates.  All rights reserved.
ADR base = "/opt/oracle/cell12.1.2.3.3_LINUX.X64_161109/log"

adrci> show alert

Choose the home from which to view the alert log:

1: diag/asm/user_root/host_136421473_80
2: diag/asm/user_root/host_136421473_82
3: diag/asm/cell/CellServer01
4: diag/asm/cell/SYS_121233_161109
5: diag/asm/cell/SYS_112331_151006
Q: to quit

Please select option: 3
Output the results to file: /tmp/alert_35417_1399_CellServer01_1.ado

Begin scrubbing CellDisk:CD_03_CellServer01.
Begin scrubbing CellDisk:CD_04_CellServer01.
Begin scrubbing CellDisk:CD_07_CellServer01.
Begin scrubbing CellDisk:CD_06_CellServer01.
Begin scrubbing CellDisk:CD_10_CellServer01.
Begin scrubbing CellDisk:CD_05_CellServer01.
Begin scrubbing CellDisk:CD_01_CellServer01.
Begin scrubbing CellDisk:CD_09_CellServer01.
Begin scrubbing CellDisk:CD_11_CellServer01.
Begin scrubbing CellDisk:CD_08_CellServer01.
Begin scrubbing CellDisk:CD_00_CellServer01.
Begin scrubbing CellDisk:CD_02_CellServer01.

2017-02-24 10:55:08.976000 -05:00
Finished scrubbing CellDisk:CD_01_CellServer01, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-24 12:19:33.389000 -05:00
Finished scrubbing CellDisk:CD_00_CellServer01, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-24 17:40:33.013000 -05:00
Finished scrubbing CellDisk:CD_05_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 17:44:36.352000 -05:00
Finished scrubbing CellDisk:CD_08_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 17:50:16.765000 -05:00
Finished scrubbing CellDisk:CD_10_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 17:50:20.052000 -05:00
Finished scrubbing CellDisk:CD_07_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 17:53:45.900000 -05:00
Finished scrubbing CellDisk:CD_06_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 17:57:31.965000 -05:00
Finished scrubbing CellDisk:CD_04_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:23:17.292000 -05:00
Finished scrubbing CellDisk:CD_11_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:47:43.248000 -05:00
Finished scrubbing CellDisk:CD_09_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 19:12:58.308000 -05:00
Finished scrubbing CellDisk:CD_02_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 22:25:42.408000 -05:00
Finished scrubbing CellDisk:CD_03_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
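
Reading twelve "Finished scrubbing" lines per cell by eye is error-prone. As a rough sketch, the Python below extracts the cell disk name, the scrubbed block count and the bad block count from alert log text shaped like the output above. The function names and the regular expression are my own assumptions based on the lines shown in this post, not an Oracle-provided parser.

```python
import re

# Pattern modelled on the "Finished scrubbing" lines shown in this post.
FINISHED = re.compile(
    r"Finished scrubbing CellDisk:([^,\s]+), "
    r"scrubbed blocks \(1MB\):(\d+), found bad blocks:(\d+)"
)

def scrub_results(log_text):
    """Return a list of (cell_disk, scrubbed_blocks, bad_blocks) tuples."""
    results = []
    for line in log_text.splitlines():
        match = FINISHED.search(line)
        if match:
            disk, blocks, bad = match.groups()
            results.append((disk, int(blocks), int(bad)))
    return results

def disks_with_bad_blocks(results):
    """Return the names of cell disks that reported at least one bad block."""
    return [disk for disk, _, bad in results if bad > 0]

# Two lines copied from the CellServer01 alert log above.
sample = """\
2017-02-24 10:55:08.976000 -05:00
Finished scrubbing CellDisk:CD_01_CellServer01, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-24 22:25:42.408000 -05:00
Finished scrubbing CellDisk:CD_03_CellServer01, scrubbed blocks (1MB):7499632, found bad blocks:0
"""
results = scrub_results(sample)
print(results)
print(disks_with_bad_blocks(results))
```

In practice the text would come from the alert.log file under $CELLTRACE rather than from an inline sample.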

Exadata Storage Server-2: CellServer02


[celladmin@CellServer02 ~]$ cd $CELLTRACE
[celladmin@CellServer02 trace]$ pwd
/opt/oracle/cell12.1.2.3.3_LINUX.X64_161109/log/diag/asm/cell/CellServer02/trace
[celladmin@CellServer02 trace]$ ls -lrth alert*
-rw-rw---- 1 root celladmin 4.2M Mar 11 02:57 alert.log
[celladmin@CellServer02 trace]$

(OR)

[celladmin@CellServer02 ~]$ adrci
ADRCI: Release 12.1.0.2.0 - Production on Mon Mar 13 13:32:42 2017
Copyright (c) 1982, 2016, Oracle and/or its affiliates.  All rights reserved.

ADR base = "/opt/oracle/cell12.1.2.3.3_LINUX.X64_161109/log"
adrci> show alert

Choose the home from which to view the alert log:

1: diag/asm/user_root/host_1634209856_80
2: diag/asm/user_root/host_1634209856_82
3: diag/asm/cell/CellServer02
4: diag/asm/cell/SYS_112331_151006
5: diag/asm/cell/SYS_121233_161109
Q: to quit

Please select option: 3
Output the results to file: /tmp/alert_5413_14027_CellServer02_1.ado

Begin scrubbing CellDisk:CD_02_CellServer02.
Begin scrubbing CellDisk:CD_00_CellServer02.
Begin scrubbing CellDisk:CD_11_CellServer02.
Begin scrubbing CellDisk:CD_10_CellServer02.
Begin scrubbing CellDisk:CD_09_CellServer02.
Begin scrubbing CellDisk:CD_06_CellServer02.
Begin scrubbing CellDisk:CD_01_CellServer02.
Begin scrubbing CellDisk:CD_04_CellServer02.
Begin scrubbing CellDisk:CD_05_CellServer02.
Begin scrubbing CellDisk:CD_03_CellServer02.
Begin scrubbing CellDisk:CD_07_CellServer02.
Begin scrubbing CellDisk:CD_08_CellServer02.
2017-02-24 11:32:04.092000 -05:00
Finished scrubbing CellDisk:CD_01_CellServer02, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-24 12:47:37.032000 -05:00
Finished scrubbing CellDisk:CD_00_CellServer02, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-24 18:33:47.058000 -05:00
Finished scrubbing CellDisk:CD_06_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:34:45.791000 -05:00
Finished scrubbing CellDisk:CD_02_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:39:05.954000 -05:00
Finished scrubbing CellDisk:CD_03_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:45:55.826000 -05:00
Finished scrubbing CellDisk:CD_08_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:51:21.961000 -05:00
Finished scrubbing CellDisk:CD_05_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:56:44.404000 -05:00
Finished scrubbing CellDisk:CD_07_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 19:07:25.359000 -05:00
Finished scrubbing CellDisk:CD_10_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 19:09:20.616000 -05:00
Finished scrubbing CellDisk:CD_04_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 19:10:06.256000 -05:00
Finished scrubbing CellDisk:CD_09_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 22:56:26.134000 -05:00
Finished scrubbing CellDisk:CD_11_CellServer02, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-03-08 19:00:06.986000 -05:00

Exadata Storage Server-3: CellServer03


[celladmin@CellServer03 ~]$ cd $CELLTRACE
[celladmin@CellServer03 trace]$ pwd
/opt/oracle/cell12.1.2.3.3_LINUX.X64_161109/log/diag/asm/cell/CellServer03/trace
[celladmin@CellServer03 trace]$ ls -l alert*
-rw-rw---- 1 root celladmin 254890 Mar 11 05:03 alert.log
[celladmin@CellServer03 trace]$

(OR)

[celladmin@CellServer03 ~]$ adrci
ADRCI: Release 12.1.0.2.0 - Production on Tue Mar 14 14:48:47 2017
Copyright (c) 1982, 2016, Oracle and/or its affiliates.  All rights reserved.
ADR base = "/opt/oracle/cell12.1.2.3.3_LINUX.X64_161109/log"

adrci> show alert

Choose the home from which to view the alert log:

1: diag/asm/user_root/host_4214962514_80
2: diag/asm/user_root/host_4214962514_82
3: diag/asm/cell/SYS_112331_151006
4: diag/asm/cell/SYS_121233_161109
5: diag/asm/cell/CellServer03
Q: to quit

Please select option: 5
Output the results to file: /tmp/alert_37829_1402_CellServer03_1.ado

Begin scrubbing CellDisk:CD_02_CellServer03.
Begin scrubbing CellDisk:CD_07_CellServer03.
Begin scrubbing CellDisk:CD_10_CellServer03.
Begin scrubbing CellDisk:CD_11_CellServer03.
Begin scrubbing CellDisk:CD_01_CellServer03.
Begin scrubbing CellDisk:CD_05_CellServer03.
Begin scrubbing CellDisk:CD_06_CellServer03.
Begin scrubbing CellDisk:CD_08_CellServer03.
Begin scrubbing CellDisk:CD_09_CellServer03.
Begin scrubbing CellDisk:CD_04_CellServer03.
Begin scrubbing CellDisk:CD_03_CellServer03.
Begin scrubbing CellDisk:CD_00_CellServer03.
2017-02-24 12:26:46.102000 -05:00
Finished scrubbing CellDisk:CD_00_CellServer03, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-24 13:31:16.168000 -05:00
Finished scrubbing CellDisk:CD_01_CellServer03, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-24 18:02:35.900000 -05:00
Finished scrubbing CellDisk:CD_03_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 18:35:41.075000 -05:00
Finished scrubbing CellDisk:CD_04_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 19:36:04.680000 -05:00
Finished scrubbing CellDisk:CD_10_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 20:12:15.913000 -05:00
Finished scrubbing CellDisk:CD_11_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 20:12:55.832000 -05:00
Finished scrubbing CellDisk:CD_09_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 20:36:54.813000 -05:00
Finished scrubbing CellDisk:CD_06_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 20:42:25.369000 -05:00
Finished scrubbing CellDisk:CD_05_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-24 20:58:07.648000 -05:00
Finished scrubbing CellDisk:CD_07_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-25 04:16:47.907000 -05:00
Finished scrubbing CellDisk:CD_08_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
2017-02-25 09:07:33.619000 -05:00
Finished scrubbing CellDisk:CD_02_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
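
In the output above each "Finished scrubbing" entry follows a timestamp line, so the two can be paired to see how long the disks of one cell take to complete relative to each other. The sketch below is an illustration under that assumption; `finish_times` is a hypothetical helper, and because the "Begin scrubbing" lines in this excerpt carry no timestamp, it measures only the gap between the first and last disk finishing, not the full scrub duration.

```python
from datetime import datetime

# Timestamp layout as it appears in the alert log excerpt.
# Parsing "-05:00" with %z requires Python 3.7 or later.
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"

def finish_times(log_text):
    """Map each cell disk to the timestamp line preceding its 'Finished' entry."""
    times = {}
    last_stamp = None
    for line in log_text.splitlines():
        line = line.strip()
        try:
            last_stamp = datetime.strptime(line, STAMP_FORMAT)
            continue
        except ValueError:
            pass  # not a timestamp line
        if line.startswith("Finished scrubbing CellDisk:") and last_stamp is not None:
            disk = line.split("CellDisk:", 1)[1].split(",", 1)[0]
            times[disk] = last_stamp
    return times

# First and last completion entries copied from the CellServer03 log above.
sample = """\
2017-02-24 12:26:46.102000 -05:00
Finished scrubbing CellDisk:CD_00_CellServer03, scrubbed blocks (1MB):7465024, found bad blocks:0
2017-02-25 09:07:33.619000 -05:00
Finished scrubbing CellDisk:CD_02_CellServer03, scrubbed blocks (1MB):7499632, found bad blocks:0
"""
times = finish_times(sample)
spread = max(times.values()) - min(times.values())
print(spread)  # gap between the first and the last disk finishing
```

For CellServer03 the gap is a little over 20 hours, which is worth knowing when choosing an idle window.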

Note: Repeat the same process on each storage server, according to the Oracle Exadata Database Machine model 
(1/8th Rack, Quarter Rack, Half Rack or Full Rack).


Command to verify that hard disk scrubbing is enabled on Oracle Exadata:


[celladmin@CellServer01 ~]$ cellcli -e list cell attributes name,hardDiskScrubInterval
         CellServer01    biweekly

[celladmin@CellServer02 ~]$ cellcli -e list cell attributes name,hardDiskScrubInterval
         CellServer02    biweekly

[celladmin@CellServer03 ~]$ cellcli -e list cell attributes name,hardDiskScrubInterval
         CellServer03    biweekly
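
When there are many cells, the per-cell output can be collected and compared in one pass. The following Python sketch assumes the two-column output shown above (cell name, then interval); `parse_scrub_intervals` is a hypothetical name, and the sample text is simply the three output lines from this post joined together.

```python
def parse_scrub_intervals(output):
    """Map cell name to hardDiskScrubInterval from 'list cell attributes' output."""
    intervals = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 2:  # skip blank lines and anything unexpected
            name, interval = fields
            intervals[name] = interval
    return intervals

# Output lines copied from the three cells above.
sample = """\
         CellServer01    biweekly
         CellServer02    biweekly
         CellServer03    biweekly
"""
intervals = parse_scrub_intervals(sample)
print(intervals)

# A single distinct value means every cell uses the same interval.
consistent = len(set(intervals.values())) == 1
print(consistent)
```

A mismatch between cells usually means a change was applied to only some of the storage servers.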

Command to stop hard disk scrubbing on Oracle Exadata:


[celladmin@CellServer01 ~]$ cellcli -e alter cell hardDiskScrubInterval=none
[celladmin@CellServer02 ~]$ cellcli -e alter cell hardDiskScrubInterval=none
[celladmin@CellServer03 ~]$ cellcli -e alter cell hardDiskScrubInterval=none

When to schedule:


Disk scrubbing consumes I/O while it runs on the storage servers, so it places a small additional load 
on the Oracle database. Before scheduling disk scrubbing, identify the idle window for mission-critical production databases. 
Follow the steps below to schedule it at a planned time.

Stop disk scrubbing so that it can be rescheduled for non-peak hours.
CellCLI> ALTER CELL hardDiskScrubInterval=none

Decide on a hardDiskScrubStartTime that falls on a weekend or in non-peak hours and set it accordingly. 
CellCLI> ALTER CELL hardDiskScrubStartTime='' 

Set the interval back to biweekly if disk scrubbing was stopped in the first step.
CellCLI> ALTER CELL hardDiskScrubInterval=biweekly
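
The order of the three statements matters: stop, set the start time, then re-enable. As a small sketch, the Python below returns them in that order so they can be reviewed or pasted into CellCLI. The function name `reschedule_commands` is hypothetical, and the start time is passed through unchanged because this post does not show a concrete date/time format; the example uses `now`, which is listed above as a valid option.

```python
def reschedule_commands(start_time, interval="biweekly"):
    """Return the CellCLI statements for rescheduling, in the order to run them.

    start_time is inserted verbatim (an assumption of this sketch); supply it
    in whatever form your storage software version expects, or use "now".
    """
    return [
        "ALTER CELL hardDiskScrubInterval=none",
        "ALTER CELL hardDiskScrubStartTime=" + start_time,
        "ALTER CELL hardDiskScrubInterval=" + interval,
    ]

for statement in reschedule_commands("now"):
    print(statement)
```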

Summary:
Disk scrubbing is used to periodically validate the integrity of the mirrored ASM extents across the 
Oracle Exadata storage servers and thus eliminate latent corruption. 
