Discussion:
[Veritas-vx] Replaced disk is failing too? Volumes detached...what's going on here?
Tony T.
2006-03-20 21:17:46 UTC
Hi all,

I had a disk failure in an 880 today that was part of a 4 disk RAID 5
group. For some reason, this one failure brought down all the volumes.
Sun support was not able to explain why RAID 5 did nothing to protect us,
but that's the least of my worries right now:

An engineer replaced the disk and we did the standard drill (vxdiskadm, #5,
etc.) and everything looked OK. vxdisk list went from "failed" to
"online" and vxtask showed me the volumes were syncing back up.

Well, after about 1 1/2 hours of syncing, the brand new disk is now listed
as "online failing".

And all of the volumes are DETACHED.

If I try to start any of the volumes, I get either this message:

vxvm:vxvol: ERROR: Volume backup has no CLEAN or non-volatile ACTIVE plexes

or this one:

vxvm:vxvol: ERROR: Volume ORACLE_EXPORT is not startable; Raid5 plex does not map the entire volume length

Any idea what the heck is going on here?? I have been on hold for 1/2 hour
with Sun and people are starting to get worried (including me!)

EEK!






--
)
(
)
[_])
Doug Hughes
2006-03-20 21:37:44 UTC
Post by Tony T.
Hi all,
I had a disk failure in an 880 today that was part of a 4 disk RAID 5
group. For some reason, this one failure brought down all the volumes.
Sun support was not able to explain why RAID 5 did nothing to protect us,
Well, they really ought to know. If an internal V880 disk goes out, it
can take out the entire FC bus, since it's a single loop (it can happen).
However, once you pull the disk, you should be able to run in degraded
mode.
Post by Tony T.
An engineer replaced the disk and we did the standard drill (vxdiskadm, #5,
etc) and everything looked ok. vxdisk list went from "failed" to
online" and vxtask showed me the volumes were syncing back up.
Well, after about 1 1/2 hours of syncing, now the brand new disk is listed
as "online failing"
And all of the volumes are DETACHED.
vxvm:vxvol: ERROR: Volume backup has no CLEAN or non-volatile ACTIVE plexes
vxvm:vxvol: ERROR: Volume ORACLE_EXPORT is not startable; Raid5 plex does
not map the entire volume length
Any idea what the heck is going on here?? I have been on hold for 1/2 hour
with Sun and people are starting to get worried (including me!)
I'd run a parity check:

/etc/vx/vxr5check -i -v -g <diskgroup> <volume_name>

Is anything still showing in vxtask list?

vxrecover <volume_name> may also help.
Darren Dunham
2006-03-20 21:58:51 UTC
Post by Tony T.
Well, after about 1 1/2 hours of syncing, now the brand new disk is listed
as "online failing"
At the same time, do you have any indications in the system log
(/var/adm/messages) that this is a real disk failure? It certainly happens.
Post by Tony T.
And all of the volumes are DETACHED.
vxvm:vxvol: ERROR: Volume backup has no CLEAN or non-volatile ACTIVE plexes
vxvm:vxvol: ERROR: Volume ORACLE_EXPORT is not startable; Raid5 plex does
not map the entire volume length
What's the 'vxprint -ht' output for that volume?
Post by Tony T.
Any idea what the heck is going on here?? I have been on hold for 1/2 hour
with Sun and people are starting to get worried (including me!)
You might have done some of this already, but here's a valuable technote
for Raid5 volume recoveries.

http://seer.support.veritas.com/docs/251793.htm
--
Darren Dunham ***@taos.com
Senior Technical Consultant TAOS http://www.taos.com/
Got some Dr Pepper? San Francisco, CA bay area
< This line left intentionally blank to confuse you. >
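
The read-only checks asked for in the replies above (vxdisk list, vxprint -ht, vxtask list) could be collected in one pass before anything is changed. This is a minimal sketch, assuming a POSIX shell; the names run, snapshot_state and DRY_RUN are illustrative, and the disk group name datadg is a placeholder, since the thread never gives the name of Tony's disk group. By default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch only: print (or run) the read-only VxVM status commands mentioned
# in this thread. Nothing here changes disk or volume state.
# DRY_RUN, run and snapshot_state are illustrative names, not VxVM features.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_state() {
    dg=$1    # disk group name (placeholder in the usage below: datadg)
    vol=$2   # volume name, e.g. ORACLE_EXPORT
    run vxdisk list                    # disk status, e.g. "online failing"
    run vxprint -g "$dg" -ht "$vol"    # volume, plex and subdisk records
    run vxtask list                    # resync tasks that are still running
}
```

For example, snapshot_state datadg ORACLE_EXPORT prints the three commands for that volume; redirecting the real output to a file (with DRY_RUN=0) gives support something concrete to look at.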
Tony T.
2006-03-20 22:09:40 UTC
OK, here is the deal:

Sun had me run these commands on each volume:

vxrecover -b <volume>
vxvol -f start <volume>

I have done 2 volumes so far, and neither one will mount before running
fsck. After running fsck (clearing gobs of partially allocated inodes),
nothing is in any of the mounted filesystems. Zip. Nada. So, I don't know
what destroyed all the data: the disk failure, the fiber connections, fsck,
vxrecover or RAID 5.

The only thing I know for sure is that RAID 5 did not do a thing to protect
the data and I am not too happy about it :(

BTW, luxadm probe doesn't return any errors.

This is a stretch, but did you use luxadm to remove and replace the
disk?
The V880's internal disks are Fibre Channel, not SCSI. The device paths for
the disks include the WWN of the disk, so when you replace a failed disk with
a new one, the WWN doesn't match and neither the OS nor Veritas knows
what it is.
Jerry Vochteloo
2006-03-20 22:26:36 UTC
Generally a disk will get marked as failing if I/O errors have been
detected on it. Maybe it is/was not the disk at fault.

--
Jerry Vochteloo
w: +61-2-8220-7043, m: +61 408 206 748

The opinions stated here are mine and do not necessarily represent those of
Symantec Corp





robertinoau
2006-03-21 01:59:53 UTC
I would suggest you get Symantec Tech support on the
phone pronto. Any mistakes now and you can kiss your
data good-bye.

Hope it is not too late :-(
robertinoau
2006-03-21 02:51:44 UTC
Oops, too late.

Well, this is what lost your data:

vxvol -f start <volume>

You should never, ever force-start a RAID 5 volume
unless you know what you are doing, and it looks like Sun
had no idea what they were doing.

Recover time :-(
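
One way to make an accidental forced start harder is to route every start through a small wrapper that refuses -f unless someone has explicitly opted in. This is only a sketch, assuming a POSIX shell: start_volume, run, DRY_RUN and ALLOW_FORCE_START are made-up names, datadg is a placeholder disk group, and the wrapper does nothing to judge whether a forced start is actually safe. By default it prints the vxvol command rather than running it.

```shell
#!/bin/sh
# Sketch only: a guard around "vxvol start" so that -f needs an explicit opt-in.
# start_volume, run, DRY_RUN and ALLOW_FORCE_START are illustrative names.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = execute it

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_volume() {
    dg=$1              # disk group name
    vol=$2             # volume name
    mode=${3:-normal}  # "force" requests vxvol -f start
    if [ "$mode" = "force" ]; then
        if [ "${ALLOW_FORCE_START:-no}" != "yes" ]; then
            echo "refusing to force-start $vol: set ALLOW_FORCE_START=yes to override" >&2
            return 1
        fi
        run vxvol -g "$dg" -f start "$vol"
    else
        run vxvol -g "$dg" start "$vol"
    fi
}
```

With this in place, start_volume datadg backup force fails with a message on stderr until ALLOW_FORCE_START=yes is set, which leaves room to call the vendor first.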
Kumar, Narender
2006-03-21 04:58:46 UTC
Hi,

I have HP-UX 11.11 running with VxVM 3.5 (full product). Can somebody help
me with these queries:

To resize a volume and its underlying file system online, do I need to
have Online JFS?

I am not sure whether vxresize takes care of resizing the underlying
file system (VxFS) as well, or whether I will have to resize the file
system using fsadm (the Online JFS utility).

Regards

Narender
Puskur, Naveen
2006-03-21 11:47:14 UTC
Narender,

vxresize will take care of resizing the filesystem as well (I can only speak for VxFS, the Veritas filesystem). You don't need to issue any extra commands at the filesystem level to resize it. df -k will show you the new extra space on the filesystem you resized.
#vxresize -g datadg oravol1 200GB (will increase volume to 200GB)
NOTE: Make sure you have enough space on the disks in the disk group before you issue the vxresize command. Be careful, as it will shrink the volume if you enter the wrong number.

Regards,
Naveen
***@citigroup.com


_______________________________________________
Veritas-vx maillist - Veritas-***@mailman.eng.auburn.edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-vx
robertinoau
2006-03-22 00:08:46 UTC
As per the man page:

vxresize - change the length of a volume containing a file system

SYNOPSIS
/etc/vx/bin/vxresize [-bfnsx] [-F fstype] [-g diskgroup] [-t tasktag]
volume new_length [medianame...]

DESCRIPTION
The vxresize command either grows or shrinks both the file system and
its underlying volume to match the specified new volume length. The
ability to grow or shrink is file system dependent. Some file system
types may require that the file system be unmounted for the operation
Scott Kaiser
2006-03-22 21:09:46 UTC
If you use the -x option, it will ensure that only a grow operation is performed.

Man page:
-x Requires that the operation represent an increase in the volume length. Fail the operation otherwise.


Regards,
Scott
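
Combining the -x option from the man page with the earlier vxresize example gives a grow-only call. This is a minimal sketch, assuming a POSIX shell; grow_volume, run and DRY_RUN are illustrative names, datadg and oravol1 are taken from the earlier example, and the 200g length suffix is an assumption (the earlier post wrote 200GB). By default it prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: build a grow-only vxresize call using the -x option.
# grow_volume, run and DRY_RUN are illustrative names, not VxVM features.

DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = execute it

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

grow_volume() {
    dg=$1        # disk group, e.g. datadg
    vol=$2       # volume, e.g. oravol1
    new_len=$3   # new total length, passed through to vxresize unchanged
    # -x: vxresize fails unless the operation increases the volume length
    run /etc/vx/bin/vxresize -x -g "$dg" "$vol" "$new_len"
}
```

For example, grow_volume datadg oravol1 200g prints the vxresize command; if 200g turned out to be smaller than the current length, the real command would fail instead of shrinking the volume.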
Chris Naddeo
2006-03-23 00:56:43 UTC
Additionally, you do need Online JFS for this operation.

Chris


