December 14th, 2009, 12:32 PM | #1
Inner Circle
Join Date: Jan 2007
Location: Woodinville, WA USA
Posts: 3,467
Different Brands of HDD on RAID controller?
As many of you may know, in the ongoing soap opera that is my PC life, I've had multiple failures among the seven Seagate ES.2 1TB drives in my RAID array. I'm getting tired of swapping out the failed drives for new ones of the same make and model, only to have them fail again. So I'm considering shifting to a different make and model.
But my PC builder fears that the Areca 1231-ML controller I have may have issues with drives of different brands in the RAID array. Obviously they all have to be the same capacity, but if they are, will the brand make a difference? In other words, if I switch, do I have to replace all the drives at once with the same brand? Whether or not this is an issue, what's everyone's experience with other brands? Samsung Spinpoints? WD RE3s? Hitachi Deskstars/Ultrastars? Harm, I'm counting on you to have an opinion on this...
__________________
"It can only be attributable to human error... This sort of thing has cropped up before, and it has always been due to human error." |
December 14th, 2009, 01:39 PM | #2 |
Major Player
Join Date: Jul 2007
Location: Los Angeles, CA
Posts: 628
If you haven't filled the entire 6TB+ with data, I'd migrate to a different array if you want to change brands. For example, I'd use three of the newest WD Black Edition 2TB drives. They have been getting rave reviews lately and are quoted as the fastest and most secure drives on the market. I'd also recommend a RocketRaid card. Never had a problem using WD or RR.
As to your question, I've never heard of anyone successfully changing out their RAID HD brands short of buying a completely new set (i.e., you'd need to get 7 and somehow migrate your data). There's a product called "Drobo," but I wouldn't expect the same write speeds. -C
December 14th, 2009, 01:55 PM | #3
Inner Circle
Join Date: Jan 2007
Location: Woodinville, WA USA
Posts: 3,467
Good info, thanks. Note that WD says you can't use a WD Caviar Black in a RAID -- some sort of deep recovery cycle it goes into periodically causes it to drop out of the array. They say you must use the RE3 or 4 (at a 50% price premium). Could be nonsense.
__________________
"It can only be attributable to human error... This sort of thing has cropped up before, and it has always been due to human error." |
December 14th, 2009, 02:23 PM | #4 |
Trustee
Join Date: Aug 2006
Location: Rotterdam, Netherlands
Posts: 1,832
Adam,
I just don't know the answer. I have all my RAIDs with the same disks: a NAS with Seagates, a server with Maxtors, my video machine with Samsungs, another server with HP SCSIs. In short, a whole bundle of different disks, but each and every RAID has only one kind of disk.

While I agree with Christopher that WD RE4 disks have a great reputation, I disagree on the RocketRaid. He may not be aware that you are using the Rolls-Royce among RAID controllers; RocketRaid is just a mediocre controller, and nobody in his right mind would consider exchanging his Rolls-Royce (or Bentley) for a Skoda or Kia. Not saying that Skoda, Kia or RocketRaid are unreliable, far from it, but they can't compete with Areca, or with Rolls-Royce or Bentley.

I believe you have the 1231ML-8 card? That does not leave you with a lot of options. Had it been the 16-port version, I would have suggested either migrating to the ultimate disks, the Seagate Cheetah 15K.7 600 GB in a RAID 3 with hot spare (8 disks), or, for reasonable performance, to the WD RE4, and then slowly dismantling your current disk array and keeping the reliable disks for other purposes.

Sorry I can't give you more substantial help.

Last edited by Harm Millaard; December 14th, 2009 at 03:32 PM.
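For scale, a quick back-of-the-envelope on that 8-disk RAID 3 suggestion; a sketch only, assuming a layout of 1 hot spare, 1 dedicated parity disk, and 6 data disks (my assumption about the intended split, not Harm's stated one):

```python
# Rough usable-capacity estimate for the suggested 8-disk setup.
# Assumed layout: 1 hot spare + 7 active disks in RAID 3 (6 data + 1 parity).
# The numbers are illustrative, not a vendor specification.

total_disks = 8
hot_spares = 1
parity_disks = 1                     # RAID 3 uses one dedicated parity disk
disk_size_gb = 600                   # Seagate Cheetah 15K.7 600 GB

data_disks = total_disks - hot_spares - parity_disks
usable_gb = data_disks * disk_size_gb
print(f"{data_disks} data disks -> about {usable_gb} GB usable")  # ~3600 GB
```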
December 14th, 2009, 04:15 PM | #5
Inner Circle
Join Date: Jan 2007
Location: Woodinville, WA USA
Posts: 3,467
No, it's a great help -- I always appreciate your take on things. I've actually got the 1231-ML12 card, so I could run 12 disks, but I only have 7 available slots in the box for the RAID.
Looks like I'm sticking with the Seagates for now, I guess.
__________________
"It can only be attributable to human error... This sort of thing has cropped up before, and it has always been due to human error." |
December 19th, 2009, 04:30 AM | #6 |
Trustee
Join Date: Aug 2009
Location: Chicago, IL
Posts: 1,554
Adam, where have you been getting your ES.2s? Some places have horrible inventory practices and some have bad shipping practices (like Newegg). I have been fortunate with 18 Seagates over the last three years (6 7200.10s, 4 7200.11s, 8 7200.12s), but 2 of my 8 WD Raptors have died.

About WD and RAID: they call it "Time Limited Error Recovery" (TLER). When a drive doesn't respond quickly enough, some RAID controllers 'drop' the drive from the array because the controller thinks the drive is bad. I have experienced this with the built-in Intel RAID and lost data on a RAID 0 array. However, when I was reading the error logs for my 3ware RAID controller, it logged a drive not responding in time, but nothing bad happened.

WD had TLER enabled on their Black drives but disabled it somewhat recently. I have read numerous reports of the WD Blacks having issues with RAID arrays even on good hardware controllers, which probably has something to do with the deep recovery cycle. I built a PC a few months ago for someone and used two 1TB WD Blacks. I noticed that the drives would spin down on their own all the time, which would cause a several-second delay when trying to open folders on those drives.

The new Samsung F3s have amazingly positive reviews on Newegg and are the fastest 7200rpm 1TB drives. I have been a loyal Seagate user, but I would have bought six F3s instead of the 7200.12s a couple of months ago had I known about them.

With 7 drives failing, I would seriously look into both the power supply and the Areca as possible culprits.
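To make the timing issue concrete, here is a minimal sketch in Python. All the numbers are illustrative assumptions for the sketch (the ~7-second TLER cap and the 8-second controller timeout are commonly cited ballpark figures, not vendor specs):

```python
# Illustrative only: why deep recovery gets a drive dropped from a RAID array.
# All timings below are assumptions for the sketch, not vendor specifications.

CONTROLLER_TIMEOUT_S = 8    # assumed: controller drops a silent drive after 8 s
TLER_LIMIT_S = 7            # assumed: TLER caps error recovery at about 7 s
DEEP_RECOVERY_S = 60        # assumed: a desktop drive may retry for a minute+

def drive_answers_in_time(tler_enabled: bool) -> bool:
    """True if the drive replies (even with a read error) before the controller gives up."""
    recovery_time = TLER_LIMIT_S if tler_enabled else DEEP_RECOVERY_S
    return recovery_time <= CONTROLLER_TIMEOUT_S

for tler in (True, False):
    if drive_answers_in_time(tler):
        print("TLER on : drive reports the error in time; controller rebuilds the sector from parity")
    else:
        print("TLER off: drive goes silent too long; controller drops it from the array")
```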
December 19th, 2009, 03:55 PM | #7
Inner Circle
Join Date: Jan 2007
Location: Woodinville, WA USA
Posts: 3,467
As it turns out, we *think* the problem may have been the backplane or connectors. All the failures were on the same channel, so I made them replace the backplane and connectors, and since then things have been pretty stable. On this most recent "failure," I took the "failed" drive, put it in an old external USB enclosure, and it spun up and tested fine. I formatted it and it seems to be working okay as a spare drive. So obviously something is wonky in the system.
Before I do my impending OS and CS4 upgrade, I'll go into the box and make sure all the wires and connectors are solid. I've been getting my ES.2s from my local system builder, but I just ordered two online (not from Newegg, although that's where I usually go), so we'll see how these new ones turn out.
__________________
"It can only be attributable to human error... This sort of thing has cropped up before, and it has always been due to human error." |
December 19th, 2009, 04:00 PM | #8 |
Trustee
Join Date: Aug 2006
Location: Rotterdam, Netherlands
Posts: 1,832
Adam,
If I remember correctly from previous discussions, you had problems with the backplane earlier. Did they replace it then, or only test it and reinsert it?
December 19th, 2009, 04:08 PM | #9
Inner Circle
Join Date: Jan 2007
Location: Woodinville, WA USA
Posts: 3,467
This was actually about a year ago and they *said* they replaced it. Who knows... Nothing done to the system since then, except for this most recent drive "failure" last week. First hardware issue in about ten months.
But we'll still never know whether all the previously-reported failed drives really were bad drives or just not being seen due to a failed connection.
__________________
"It can only be attributable to human error... This sort of thing has cropped up before, and it has always been due to human error." |
December 19th, 2009, 04:30 PM | #10 |
Inner Circle
Join Date: Feb 2007
Location: Tucson AZ
Posts: 2,211
Aha - the infamous "long retry" problem strikes again!
To follow up on what Steve said a couple of posts ago: the basic issue is that when an HDD (or tape drive, for that matter) detects an error and goes into retry, it falls into a limbo state where it isn't good and isn't bad. As makers have improved their error recovery algorithms, they tend to do a better job of recovery, but they also take longer. In extreme cases (particularly in high-end systems) the operating system will continue to queue I/O for the "limbo" drive, but that I/O can sit there for a long time, and eventually, in a heavily loaded system, the whole system can crash because of the stalled condition. I used to do performance analysis for the US airlines' reservation systems, and when you have 20 thousand I/Os or more per second, things can turn to Sh-- pretty quickly indeed!

In a lot of cases it's better to just declare an error and let the system or RAID controller or whatever do its thing, and this is the concept behind things like Time Limited Error Recovery.

Really well thought out storage systems actually take a somewhat different tack. Because the success rate on some kinds of error recovery is pretty good, they want to let the drive keep trying to recover. So what they do is mark the drive dead after X seconds and then start logging writes to that drive into memory, so that to the total system it appears that everything is still functioning correctly. When error recovery eventually completes, if recovery was successful, the controller applies the logged writes to the drive, then marks it "good" and brings it back into the array.
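A minimal sketch of that log-and-replay idea, in Python with a simplified in-memory model (the class and its methods are invented for illustration, not any vendor's API; real controllers do this in firmware):

```python
# Sketch of the "log writes while the drive recovers" approach described above.
# Names and structure are assumptions for illustration only.

class JournalingMember:
    """One array member whose writes are journaled while it is in error recovery."""

    def __init__(self):
        self.in_recovery = False
        self.journal = []      # (block, data) pairs logged while the drive is in limbo
        self.blocks = {}       # stand-in for the physical platters

    def write(self, block: int, data: bytes) -> None:
        if self.in_recovery:
            # Drive is marked "dead after X seconds": log the write in memory so
            # the rest of the system sees normal behavior instead of a stall.
            self.journal.append((block, data))
        else:
            self.blocks[block] = data

    def enter_recovery(self) -> None:
        self.in_recovery = True

    def finish_recovery(self, recovered: bool) -> None:
        self.in_recovery = False
        if recovered:
            # Recovery succeeded: replay the journaled writes, then the member
            # is marked "good" and rejoins the array.
            for block, data in self.journal:
                self.blocks[block] = data
        # If recovery failed, the controller would instead fail the member and
        # rebuild onto a spare; either way the journal is no longer needed.
        self.journal.clear()

# Tiny usage example:
m = JournalingMember()
m.write(0, b"before")            # goes straight to the platters
m.enter_recovery()
m.write(1, b"during")            # journaled while the drive is in limbo
m.finish_recovery(recovered=True)
print(m.blocks)                  # {0: b'before', 1: b'during'}
```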