Just looking to see who else here runs TrueNAS Core (previously FreeNAS).
Migrated/switched over from xPenology to TrueNAS over the last couple of days. Had major problems, nearly losing all our personal photos/documents etc. on the xPenology box. Got everything copied off, rebuilt the system as TrueNAS, and copied it back… interestingly enough, during the copy back I picked up that I had 2 bad SATA controllers, but TrueNAS handled that easily enough: I moved the drive to a different controller and it resilvered the drive, all while I was copying more data… (things the xPenology environment did not allow me to do).
That's what I was also running. Then a drive started dying on me; I pulled it, replaced it, and as the rebuild started it failed. Zeroed the drive, tried again, and it did the same thing, but this time it caused a problem on a 2nd HDD, leading to the volume crashing.
Rebooted the server and it came back, and it's been in ICU status ever since.
The way xPenology handled all of this, plus the cast-in-stone restriction on where a drive sits (and has to sit), was the real problem.
Well, I rebuilt the server as a TrueNAS box, and it seems the problem was not the HDDs but rather 2 SATA ports. I had to replace the PSU 2 weeks ago, and this is where everything started: the server shut down at random, I started it up and it was good for a couple of days, and then it was dead again. I figured the PSU was bad, so I replaced it, but it looks like the PSU damaged the MB/SATA ports along the way.
I did like xPenology… the DSM interface was slicker, and SHR def saved my @ss, but I'm getting to grips with TrueNAS Core quickly: rebuilt my UniFi Controller in a Jail in no time, and the same for my Plex Media Server.
I picked up the bad SATA ports (and figured out that this was what had been causing xPenology to crash the volume) while copying the data from backups back onto the TrueNAS. Got to say, all I had to do was shut down the NAS, move the HDD to another port and bring it back up. For safety's sake I also swapped in another spare HDD. TrueNAS did not skip a beat: it came back, reported a degraded pool, I clicked the wheel, chose repair, and it did the resilver while I continued to copy data. xPenology could learn a thing or 2 here.
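For anyone curious, that GUI "repair" maps to roughly the following on the command line (pool and device names here are made up for illustration):

```shell
zpool status tank            # pool shows DEGRADED with the failed member listed
zpool replace tank da2 da5   # resilver onto the spare while the pool stays online
zpool status tank            # resilver progress shows up here
```

The nice part is exactly what's described above: the pool stays imported and writable for the whole resilver.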
This is probably not at all the same thing. But. I’ve used Linux’s soft mirroring for years to build RAID1 devices. And then occasionally one of the drives would fail, monitoring would pick it up, and we’d just soft-remove (mdadm) the drive and add it back in, forcing a rebuild. Quite frequently the issue was a bad read, and the moment you rebuild the array, a write operation happens to the same sector and the drive firmware swaps out the bad sector.
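The remove-and-re-add cycle described above looks roughly like this with mdadm (array and partition names are hypothetical; this needs a real md array and root, so it's a sketch, not something to paste blindly):

```shell
# Mark the flaky member failed, pull it from the array, then re-add it;
# the kernel kicks off a full rebuild onto the re-added disk, and every
# sector gets rewritten -- which is what lets the firmware remap bad ones.
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1
mdadm /dev/md0 --add /dev/sdb1
cat /proc/mdstat    # watch the resync progress
```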
Bad sectors are only swapped out on write; they cannot be swapped on read. The drive maintains this block mapping internally and you know nothing about it. Every hard drive is made with a certain number of spare sectors, and the firmware swaps them in automatically over the lifetime of the drive, on a write operation.
But then one day I had a massive problem. I could not rebuild the array, because there was a second read error on the “healthy” drive, in another part of the disk. It took me the better part of a day, but I eventually found the block number of the bad sector (after jumping through several hoops, including the logical volume mapping). Then did more magic (all with the help of Google; I hope I never have to do it again) to find out what file was stored on that block.
There was no data on that block. My raid-rebuild was failing because of a bad read operation on an unused block.
So I used the dd tool and wrote a single 512-byte sector to that block. The write caused the firmware to swap out the bad sector. Then I rebuilt the array. It succeeded.
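For the record, the fix boiled down to something like the following. The sector number is made up, and I'm demonstrating on a scratch image file rather than a real disk; on real hardware the target would be the raw device (e.g. /dev/sdX, hypothetical name), you'd add oflag=direct, and the write destroys whatever was in that sector (fine in my case, since the block was unused):

```shell
truncate -s 1M disk.img        # scratch file standing in for the drive
BAD_SECTOR=1024                # hypothetical sector number found earlier
# Overwrite exactly one 512-byte sector in place with zeros;
# conv=notrunc leaves the rest of the target untouched.
dd if=/dev/zero of=disk.img bs=512 count=1 \
   seek="$BAD_SECTOR" conv=notrunc status=none
```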
In my case, I rotated drives over those 2 bad ports and kept having the problem pop up. Eventually I moved the vdev's drives to other ports and got no more errors on the drives, leading to my final assumption: the ports are “sick”.
I ordered a SATA expander card, the Vantec, 6 port, 4 channel, but after talking to the shop I'm going to skip that and rather replace the MB; it costs the same, and you never know what else is damaged on that board.