[Veritas-vx] Solaris-SFS / MPxIO / VxVM failover issue

Discussion:

Sebastien DAUBIGNE

2010-09-16 10:03:17 UTC

Dear Vx-addicts,

We encountered a failover issue on this configuration :

- Solaris 9 HW 9/05
- SUN SAN (SFS) 4.4.15
- Emulex with SUN generic driver (emlx)
- VxVM 5.0-2006-05-11a

- storage on HP SAN (XP 24K).

Multipathing is managed by MPxIO (not VxDMP) because the SAN team and HP
support imposed the Solaris native solution for multipathing :

VxVM ==> VxDMP ==> MPxIO ==> FCP ...

We have 2 paths to the switch, linked to 2 paths to the storage, so the
LUNs have 4 paths, with active/active support.
Failover operation has been tested successfully by offlining each port
successively on the SAN.

We regularly have transient I/O errors (SCSI timeouts, I/O error retries
with "Unit attention"), due to SAN-side issues. Usually these errors are
transparently managed by MPxIO/VxVM without impact on the applications.

Now for the incident we encountered :

One of the SAN ports was reset, so there were some transient
I/O errors.
The other SAN port was OK, so the MPxIO multipathing layer should have
failed the I/O over to the other path, without transmitting the error to the
VxDMP layer.
For some reason, it did not fail over the I/O before VxVM caught it as an
unrecoverable I/O error, disabling the subdisk and consequently the
filesystem.

Note the "giving up" message from scsi layer at 06:23:03 :

Sep 1 06:18:54 myserver vxdmp: [ID 917986 kern.notice] NOTICE: VxVM
vxdmp V-5-0-112 disabled path 118/0x558 belonging to the dmpnode 288/0x60
Sep 1 06:18:54 myserver vxdmp: [ID 824220 kern.notice] NOTICE: VxVM
vxdmp V-5-0-111 disabled dmpnode 288/0x60
Sep 1 06:18:54 myserver vxdmp: [ID 917986 kern.notice] NOTICE: VxVM
vxdmp V-5-0-112 disabled path 118/0x538 belonging to the dmpnode 288/0x20
Sep 1 06:18:54 myserver vxdmp: [ID 917986 kern.notice] NOTICE: VxVM
vxdmp V-5-0-112 disabled path 118/0x550 belonging to the dmpnode 288/0x18
Sep 1 06:18:54 myserver vxdmp: [ID 824220 kern.notice] NOTICE: VxVM
vxdmp V-5-0-111 disabled dmpnode 288/0x20
Sep 1 06:18:54 myserver vxdmp: [ID 824220 kern.notice] NOTICE: VxVM
vxdmp V-5-0-111 disabled dmpnode 288/0x18
Sep 1 06:18:54 myserver scsi: [ID 107833 kern.warning] WARNING:
/scsi_vhci/***@g60060e80152777000001277700003794 (ssd165):
Sep 1 06:18:54 myserver SCSI transport failed: reason
'tran_err': retrying command
Sep 1 06:19:05 myserver scsi: [ID 107833 kern.warning] WARNING:
/scsi_vhci/***@g60060e80152777000001277700003794 (ssd165):
Sep 1 06:19:05 myserver SCSI transport failed: reason 'timeout':
retrying command
Sep 1 06:21:57 myserver scsi: [ID 107833 kern.warning] WARNING:
/scsi_vhci/***@g60060e8015277700000127770000376d (ssd168):
Sep 1 06:21:57 myserver SCSI transport failed: reason
'tran_err': retrying command
Sep 1 06:22:45 myserver scsi: [ID 107833 kern.warning] WARNING:
/scsi_vhci/***@g60060e8015277700000127770000376d (ssd168):
Sep 1 06:22:45 myserver SCSI transport failed: reason 'timeout':
retrying command
Sep 1 06:23:03 myserver scsi: [ID 107833 kern.warning] WARNING:
/scsi_vhci/***@g60060e80152777000001277700003787 (ssd166):
Sep 1 06:23:03 myserver SCSI transport failed: reason 'timeout':
giving up
Sep 1 06:23:03 myserver vxio: [ID 539309 kern.warning] WARNING: VxVM
vxio V-5-3-0 voldmp_errbuf_sio_start: Failed to flush the error buffer
300ce41c340 on device 0x1200000003a to DMP
Sep 1 06:23:03 myserver vxio: [ID 771159 kern.warning] WARNING: VxVM
vxio V-5-0-2 Subdisk mydisk_2-02 block 5935: Uncorrectable write error
Sep 1 06:23:03 myserver vxfs: [ID 702911 kern.warning] WARNING: msgcnt
1 mesg 037: V-2-37: vx_metaioerr - vx_logbuf_clean -
/dev/vx/dsk/mydg/vol1 file system meta data write error in dev/block 0/5935
Sep 1 06:23:03 myserver vxfs: [ID 702911 kern.warning] WARNING: msgcnt
2 mesg 031: V-2-31: vx_disable - /dev/vx/dsk/mydg/vol1 file system disabled
Sep 1 06:23:03 myserver vxfs: [ID 702911 kern.warning] WARNING: msgcnt
3 mesg 037: V-2-37: vx_metaioerr - vx_inode_iodone -
/dev/vx/dsk/mydg/vol1 file system meta data write error in dev/block
0/265984
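
To make the timing in this excerpt easier to see, here is a minimal Python sketch that measures the gap between the first "disabled dmpnode" notice and the "giving up" message. The four log lines are copied from the excerpt with wrapped lines rejoined. The year 2010 is an assumption taken from the posting date (syslog does not record it), and the helper names `timestamp` and `first_match` are hypothetical.

```python
from datetime import datetime

# Lines copied from the excerpt above, with mail-wrapped lines rejoined.
LOG = """\
Sep 1 06:18:54 myserver vxdmp: [ID 917986 kern.notice] NOTICE: VxVM vxdmp V-5-0-112 disabled path 118/0x558 belonging to the dmpnode 288/0x60
Sep 1 06:18:54 myserver vxdmp: [ID 824220 kern.notice] NOTICE: VxVM vxdmp V-5-0-111 disabled dmpnode 288/0x60
Sep 1 06:23:03 myserver SCSI transport failed: reason 'timeout': giving up
Sep 1 06:23:03 myserver vxio: [ID 771159 kern.warning] WARNING: VxVM vxio V-5-0-2 Subdisk mydisk_2-02 block 5935: Uncorrectable write error
"""


def timestamp(line, year=2010):
    """Parse the leading 'Mon D HH:MM:SS' of a syslog line.

    The year is an assumption: syslog omits it, 2010 is the posting year.
    """
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def first_match(lines, needle):
    """Return the timestamp of the first line containing needle, or None."""
    for line in lines:
        if needle in line:
            return timestamp(line)
    return None


lines = LOG.splitlines()
dmp_disabled = first_match(lines, "disabled dmpnode")
scsi_gave_up = first_match(lines, "giving up")
gap_seconds = (scsi_gave_up - dmp_disabled).total_seconds()
print(f"dmpnode disabled {gap_seconds:.0f} s before the SCSI layer gave up")
```

On these lines the gap is 249 seconds: the vxdmp notices are logged roughly four minutes before the scsi layer reports "giving up" and vxio reports the uncorrectable write error.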

It seems VxDMP gets the I/O error at the same time as MPxIO: I thought
MPxIO would conceal the I/O error until failover had occurred, which
is not the case.

As a workaround, I increased the VxDMP
recoveryoption/fixedretry/retrycount tunable from 5 to 20 to give MPxIO a
chance to fail over before VxDMP fails the path, but I still don't understand why
VxVM catches the SCSI errors.

Any advice ?

thanks.

--
Sebastien DAUBIGNE
***@atosorigin.com - +33(0)5.57.89.31.09
AtosOrigin Infogerance - AIS/D1/SudOuest/Bordeaux/IS-Unix

Victor Engle

2010-09-16 10:15:09 UTC

Which version of veritas? Version 4/2MP2 and version 5.x introduced a
feature called DMP fast recovery. It was probably supposed to be
called DMP fast fail but "recovery" sounds better. It is supposed to
fail suspect paths more aggressively to speed up failover. But when
you only have one vxvm DMP path, as is the case with MPxIO, and
fast-recovery fails that path, then you're in trouble. In version 5.x,
it is possible to disable this feature.

Google DMP fast recovery.

http://seer.entsupport.symantec.com/docs/307959.htm

I can imagine there must have been some internal fights at symantec
between product management and QA to get that feature released.

Vic

Sebastien DAUBIGNE

2010-09-16 13:40:44 UTC

Thank you Victor and William, it seems to be a very good lead.

Unfortunately, this tunable seems not to be supported in the VxVM

Post by William Havey
vxdmpadm gettune dmp_fast_recovery

VxVM vxdmpadm ERROR V-5-1-12015 Incorrect tunable
vxdmpadm gettune [tunable name]
Note - Tunable name can be dmp_failed_io_threshold, dmp_retry_count,
dmp_pathswitch_blks_shift, dmp_queue_depth, dmp_cache_open,
dmp_daemon_count, dmp_scsi_timeout, dmp_delayq_interval, dmp_path_age,
or dmp_stat_interval

This is odd, because my version is 5.0 MP3 Solaris SPARC and, according
to http://seer.entsupport.symantec.com/docs/316981.htm, this tunable
should be available.

Post by William Havey
modinfo | grep -i vx

38 7846a000 3800e 288 1 vxdmp (VxVM 5.0-2006-05-11a: DMP Drive)
40 784a4000 334c40 289 1 vxio (VxVM 5.0-2006-05-11a I/O driver)
42 783ec71d df8 290 1 vxspec (VxVM 5.0-2006-05-11a control/st)
296 78cfb0a2 c6b 291 1 vxportal (VxFS 5.0_REV-5.0A55_sol portal )
297 78d6c000 1b9d4f 8 1 vxfs (VxFS 5.0_REV-5.0A55_sol SunOS 5)
298 78f18000 a270 292 1 fdd (VxQIO 5.0_REV-5.0A55_sol Quick )
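
For comparing the loaded driver versions against what is supposed to be installed, a small Python sketch like the following pulls the product and version token out of each modinfo line. The sample lines are copied from the output above. The regular expression and the function name `parse_modinfo` are assumptions made for illustration, and the sketch makes no claim about what the version string of any other release looks like.

```python
import re

# modinfo lines copied from the output above.
MODINFO = """\
38 7846a000 3800e 288 1 vxdmp (VxVM 5.0-2006-05-11a: DMP Drive)
40 784a4000 334c40 289 1 vxio (VxVM 5.0-2006-05-11a I/O driver)
42 783ec71d df8 290 1 vxspec (VxVM 5.0-2006-05-11a control/st)
296 78cfb0a2 c6b 291 1 vxportal (VxFS 5.0_REV-5.0A55_sol portal )
297 78d6c000 1b9d4f 8 1 vxfs (VxFS 5.0_REV-5.0A55_sol SunOS 5)
298 78f18000 a270 292 1 fdd (VxQIO 5.0_REV-5.0A55_sol Quick )
"""

# Assumed layout: id, load address, size, info, rev, module name, (description).
LINE_RE = re.compile(r"^\s*\d+\s+\S+\s+\S+\s+\S+\s+\d+\s+(\S+)\s+\((.*)\)\s*$")


def parse_modinfo(text):
    """Map module name -> (product, version token) from modinfo output."""
    versions = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        module, description = match.groups()
        words = description.split()
        if len(words) >= 2:
            # The version is the second word; vxdmp's has a trailing colon.
            versions[module] = (words[0], words[1].rstrip(":"))
    return versions


versions = parse_modinfo(MODINFO)
for module, (product, version) in versions.items():
    print(f"{module:10s} {product:6s} {version}")
```

Here every VxVM module reports the same token, 5.0-2006-05-11a, which is the string to check against the release notes of the version believed to be installed.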

--
Sebastien DAUBIGNE
***@atosorigin.com - +33(0)5.57.89.31.09
AtosOrigin Infogerance - AIS/D1/SudOuest/Bordeaux/IS-Unix

Joshua Fielden

2010-09-16 14:50:34 UTC

dmp_fast_recovery is a mechanism by which we bypass the sd/scsi stack and send path inquiry/status CDBs directly from the HBA in order to bypass long SCSI queues and recover paths faster. With a TPD (third-party driver) such as MPxIO, bypassing the stack means we bypass the TPD completely, and interactions such as this can happen. The vxesd (event-source daemon) is another 5.0/MP2 backport addition that's moot in the presence of a TPD.

From your modinfo, you're not actually running MP3. This technote (http://seer.entsupport.symantec.com/docs/327057.htm) isn't exactly your scenario, but looking for partially-installed pkgs is a good start to getting your server correctly installed, then the tuneable should work -- very early 5.0 versions had a differently-named tuneable I can't find in my mail archive ATM.

Cheers,

Jf

Sebastien DAUBIGNE

2010-10-06 16:31:50 UTC

Hi,

I am coming back to my dmp_fast_recovery issue (VxDMP fails the path before
MPxIO gets a chance to fail over to an alternate path).
As stated previously, I am running 5.0GA, and this tunable is not
supported in this release. However, I still don't know whether VxVM 5.0GA
silently bypasses the MPxIO stack for error recovery.

I am now trying to determine whether upgrading to MP3 will resolve this issue
(which has occurred only rarely).

Could anyone (maybe Joshua?) explain whether the behaviour of 5.0GA without
the tunable is functionally identical to dmp_fast_recovery=0 or
dmp_fast_recovery=1? Maybe the mechanism was implemented in 5.0
without the option to disable it (this could explain my issue)?

Joshua, you mentioned another tuneable for 5.0 but looking at the list I

vxdmpadm gettune all

Tunable                        Current Value Default Value
------------------------------ ------------- -------------
dmp_failed_io_threshold        57600         57600
dmp_retry_count                5             5
dmp_pathswitch_blks_shift      11            11
dmp_queue_depth                32            32
dmp_cache_open                 on            on
dmp_daemon_count               10            10
dmp_scsi_timeout               30            30
dmp_delayq_interval            15            15
dmp_path_age                   0             300
dmp_stat_interval              1             1
dmp_health_time                0             60
dmp_probe_idle_lun             on            on
dmp_log_level                  4             1
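
As a quick way to read this list, the following Python sketch parses the gettune output and reports which tunables differ from their defaults and whether dmp_fast_recovery is listed at all. The table text is copied from the output above; the function name `parse_gettune` is hypothetical, and the parsing assumes each data row is a `dmp_` name followed by exactly two values.

```python
# Output copied from the 'vxdmpadm gettune all' listing above.
GETTUNE_OUTPUT = """\
Tunable Current Value Default Value
------------------------------ ------------- -------------
dmp_failed_io_threshold 57600 57600
dmp_retry_count 5 5
dmp_pathswitch_blks_shift 11 11
dmp_queue_depth 32 32
dmp_cache_open on on
dmp_daemon_count 10 10
dmp_scsi_timeout 30 30
dmp_delayq_interval 15 15
dmp_path_age 0 300
dmp_stat_interval 1 1
dmp_health_time 0 60
dmp_probe_idle_lun on on
dmp_log_level 4 1
"""


def parse_gettune(text):
    """Map tunable name -> (current, default), skipping header and rule lines."""
    tunables = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumption: data rows are 'dmp_<name> <current> <default>'.
        if len(fields) == 3 and fields[0].startswith("dmp_"):
            name, current, default = fields
            tunables[name] = (current, default)
    return tunables


tunables = parse_gettune(GETTUNE_OUTPUT)
changed = [name for name, (cur, default) in tunables.items() if cur != default]
has_fast_recovery = "dmp_fast_recovery" in tunables

print("non-default tunables:", changed)
print("dmp_fast_recovery listed:", has_fast_recovery)
```

On this listing the non-default tunables are dmp_path_age, dmp_health_time and dmp_log_level, and dmp_fast_recovery does not appear among the 13 tunables.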

Cheers.

Venkata Sreenivasa Rao Nagineni

2010-10-06 17:08:08 UTC

Hi Sebastien,

In the first mail you mentioned that you are using mpxio to control the XP24K array. Why are you using mpxio here?

Thanks,
Venkata Sreenivasarao Nagineni,
Symantec
