Melange history
ANOTHER FUCKING SAMSUNG 8TB 870 QVO BITES THE DUST
- 2024/10/15 REPLACE THE NEXT BAD MINE DRIVE
which one is it... i THINK it was da14... but now hive doesn't say it's this drive: /dev/gptid/6660b7e0-a05b-11ee-898d-ddd4a1cedd1f
here it is in a notification:
  CRITICAL Pool mine state is DEGRADED: One or more devices could not be opened. Sufficient replicas exist for the pool to continue functioning in a degraded state. The following devices are not healthy: Disk ATA Samsung SSD 870 S5VUNJ0W900939R is UNAVAIL
okay let's offline it and swap it out ffs... with: S5VUNJ0W706311K
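For the record, this is roughly the CLI equivalent of what the TrueNAS UI does (the new disk's gptid is a placeholder; TrueNAS assigns it when it formats the replacement):
# offline the dead member first (gptid from the notification hunt above)
zpool offline mine gptid/6660b7e0-a05b-11ee-898d-ddd4a1cedd1f
# after physically swapping in S5VUNJ0W706311K, kick off the replace
# (NEW_GPTID is a placeholder for whatever TrueNAS creates on the new disk)
zpool replace mine gptid/6660b7e0-a05b-11ee-898d-ddd4a1cedd1f gptid/NEW_GPTID
# then watch the resilver
zpool status -v mine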
- HELL it is MELTING... the sixth disk da13 has now FAULTED... DID THE FAT LADY JUST START SINGING?? The resilver is stuck at 01.52%... eta keeps going up... at 6 DAYS NOW LOL... and we are supposed to watch Pachinko S2E8 dammmmmit! lolsszz
I hear i can possibly run `zpool clear` on the drive to pretend it is okay again. Might work for an SSD? No clue yet... I hate waiting... do i just reboot this fucking VM... I did both. Fucking hell... so utterly stressful...
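For reference, the clear is just this; i don't have da13's gptid handy so that part is a placeholder:
zpool clear mine                     # clear error counts / FAULTED state pool-wide
zpool clear mine gptid/DA13_GPTID    # or just the one faulted device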
I WAITED overnight... and it was... okay-ish the next day?? All disks good!! WTF!
But then it flipped to ONLINE (Unhealthy) based on error count i think? Doing a scrub... FINGERS FIRMLY CROSSED...
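The scrub bits:
zpool scrub mine
zpool status mine    # the 'scan:' line shows scrub progress and any new errors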
- NOW GO GET ANOTHER REPLACEMENT!
GRIM DEATH
2024/03/23 We are harvesting the hive grim drives for use as melange drives for VM storage.
I changed the location of the hive System Dataset Pool from grim to safe.
I copied all grim data to safe (oops, i forgot SharedDownloads... it's empty now...)
I removed the 'grim' pool from FreeNAS.
Now I need to move the drives! I want to keep the PCI card as pass-thru, but the two grim drives are on it.
- make note of all hive drive assignments
- open melange
- remove both grim drives from the PCI passthru
- move the one mine drive that is on SATA from SATA to one of the PCI passthroughs
- move one safe drive from SATA to the other of the PCI passthroughs
- add both grim drives to SATA
- close and restart melange and see if you can reconnect everything
- the grim drives should now show up on melange, not passed through
- the safe and mine drives should show up passed through, but perhaps hive cannot associate them; if not, try to fix
- if not, RESET/KEEP GOING brother!
Let's go...
First, capture everything...
SATA DRIVES:
๐ m@melange [~] sudo lsblk |awk 'NR==1{print $0" DEVICE-ID(S)"}NR>1{dev=$1;printf $0" ";system("find /dev/disk/by-id -lname \"*"dev"\" -printf \" %p\"");print "";}'|grep -v -E 'part|lvm'
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT DEVICE-ID(S)
sda       8:0    0 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A2FB01 /dev/disk/by-id/wwn-0x500a0751e5a2fb01
sdb       8:16   0 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A2FD56 /dev/disk/by-id/wwn-0x500a0751e5a2fd56
sdc       8:32   1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A313D2 /dev/disk/by-id/wwn-0x500a0751e5a313d2
sdd       8:48   1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A313D6 /dev/disk/by-id/wwn-0x500a0751e5a313d6
sde       8:64   1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2117E59AAE1B /dev/disk/by-id/wwn-0x500a0751e59aae1b
sdf       8:80   1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2121E5A2E131 /dev/disk/by-id/wwn-0x500a0751e5a2e131
sdg       8:96   1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A3009A /dev/disk/by-id/wwn-0x500a0751e5a3009a
sdh       8:112  1   7.3T  0 disk  /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5VUNJ0W706320H /dev/disk/by-id/wwn-0x5002538f33710e1d
nvme0n1 259:0    0 931.5G  0 disk  /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_1TB_S4EWNJ0N107994E /dev/disk/by-id/nvme-eui.002538510141169d
104 PASSTHRUS:
๐ m@melange [~] sudo cat /etc/pve/qemu-server/104.conf
boot: order=scsi0;net0
cores: 4
hostpci0: 0a:00.0,rombar=0
memory: 24576
name: hive
net0: virtio=DA:E8:DA:81:EC:64,bridge=vmbr0,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm:vm-104-disk-0,size=25G
scsi11: /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A2FD56,size=976762584K,backup=no,serial=2122E5A2FD56
scsi12: /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A2FB01,size=976762584K,backup=no,serial=2122E5A2FB01
scsi13: /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A313D2,size=976762584K,backup=no,serial=2122E5A313D2
scsi14: /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A313D6,size=976762584K,backup=no,serial=2122E5A313D6
scsi15: /dev/disk/by-id/ata-CT1000MX500SSD1_2117E59AAE1B,size=976762584K,backup=no,serial=2117E59AAE1B
scsi16: /dev/disk/by-id/ata-CT1000MX500SSD1_2121E5A2E131,size=976762584K,backup=no,serial=2121E5A2E131
scsi17: /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A3009A,size=976762584K,backup=no,serial=2122E5A3009A
scsi18: /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5VUNJ0W706320H,backup=0,serial=S5VUNJ0W706320H,size=7814026584K
scsihw: virtio-scsi-pci
smbios1: uuid=dc3d077c-0063-41fe-abde-a97674d14dc8
sockets: 1
startup: order=1,up=90
vmgenid: 1dcb048d-122f-4343-ad87-1a01ee8284a6
PRE-MOVE DRIVES:
- 7 1tb safe drives on SATA
- 1 8tb mine drive on SATA
- 2 4tb grim drives on PCI
- 6 8tb mine drives on PCI
PRE-MOVE hive:
da7   2122E5A3009A     931.51 GiB  safe
da8   S5VUNJ0W706320H  7.28 TiB    mine
da11  S5B0NW0NB01796J  3.64 TiB    N/A
da12  S4CXNF0M307721X  3.64 TiB    N/A
POST-MOVE DRIVES:
- 6 1tb safe drives on SATA
- 2 4tb grim drives on SATA
- 1 1tb safe drive on PCI
- 7 8tb mine drives on PCI
STEPS THAT MAY NEED REVERSAL
Do i need to adjust hive or melange before opening the case? I guess i could remove the mine SATA passthru... AND the SAFE drive i'm going to move, too; it will no longer pass through individually (we will be using the PCI card passthru for it).
- shut down VMS, then hive (but not melange)
- remove two drives from SATA passthru, first is 1 from SAFE (moving to PCI card) and second is 1 from MINE (see the qm sketch after this list)
scsi16: /dev/disk/by-id/ata-CT1000MX500SSD1_2121E5A2E131,size=976762584K,backup=no,serial=2121E5A2E131
  ^^^ SWITCHING TO SWAPPING THIS ONE, it is easier to access for swapping
scsi17: /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A3009A,size=976762584K,backup=no,serial=2122E5A3009A
  ^^^ ACTUALLY, NOT THIS ONE, leave it
scsi18: /dev/disk/by-id/ata-Samsung_SSD_870_QVO_8TB_S5VUNJ0W706320H,backup=0,serial=S5VUNJ0W706320H,size=7814026584K
- shut down melange
- remove two 4TB drives from PCI: S5B0NW0NB01796J, S4CXNF0M307721X
- move the only 8TB from SATA to PCI: S5VUNJ0W706320H
- move 1 1TB from SATA to PCI: 2121E5A2E131 (not 2122E5A3009A)
- connect two 4TB drives to SATA: S5B0NW0NB01796J, S4CXNF0M307721X
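If editing 104.conf in emacs feels risky, qm can drop the SATA passthru entries instead; a minimal sketch, using the same scsi slots as the conf above:
sudo qm set 104 --delete scsi16   # the safe drive headed to the PCI card
sudo qm set 104 --delete scsi18   # the mine 8TB headed to the PCI card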
WENT GREAT, hive recognized the move without me doing ANYTHING!!
Now we can set up the ex-grim 4TBs for VM usage, yay.
2023 July full upgrade
I need to install a Windows 11 VM. Also have Ubuntu 20.04 machines that should be moved to 22.04. Figured as good a reason as any for a full upgrade of everything.
- Upgrade bitpost first. Upon reboot, my $*(@ IP changed again. Fuck you google. Spent a while resetting that. Here are the notes (also in my Red Dead RP journal, keepin it real (real accessible when everything's down), lol!):
cast$ ssh bitpost   # not bp or bitpost.com, so we get to the LAN resource
sudo su -
stronger_firewall_and_save
# internet should now work
# get new IP from whatsmyip
# fix bitpost.com DNS at domains.google.com
# WAIT for propagation.... might as well fix the other DNS records...
sudo service dnsmasq restart
ping bitpost.com    # EVENTUALLY this will work! May need to repeat this AND previous step.
- Ask Tom to update E-S DNS to use new IP
- Upgrade abtdev1, then all Ubuntu boxes (glam is toughest), then positronic last, with this pattern:
mh-update-ubuntu          # and reboot
sudo do-release-upgrade   # best to connect directly, but ssh worked fine too
sudo shutdown -h now      # to prep for melange reboot
- Upgrade hive's TrueNAS install, via https://hive CHECK FOR UPDATES, then shut it down
- Update and reboot melange PROXMOX install, via https://melange:8006 Datacenter > melange > Updates
- CHECK EVERYTHING
- proxmox samba share for backups
- samba shares
- at ptl to ensure it can get to positronic
- shitcutter and blogs and wiki and...
- I had a terrible time getting GLAM apache + PHP working again now that Ubuntu uses PHP 8.1; just needed to ENABLE THE MODULE, ffs:
a2enmod php8.1
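The full sequence, in case the module package isn't even installed yet (package name assumed from the standard Ubuntu repos; it was already present for me):
sudo apt install libapache2-mod-php8.1   # assumed; skip if already installed
sudo a2enmod php8.1
sudo systemctl restart apache2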
6.3 > 7.0
Proxmox uses apt for upgrades. I followed this, for the most part.
- Update all VMS
- Shut down all VMS
- Fully update current version's apt packages - this took me from 6.3 to 6.4, a necessary first step.
sudo apt update
sudo apt dist-upgrade
- Upgrade basic apt sources list from buster to bullseye
sudo sed -i 's/buster\/updates/bullseye-security/g;s/buster/bullseye/g' /etc/apt/sources.list
# instructions discuss pve-enterprise but i needed to change pve-no-subscription instead - same exact steps, otherwise
# ie, leave this commented out, but might as well set to bullseye:
#   /etc/apt/sources.list.d/pve-enterprise.list
# and update this to bullseye:
#   /etc/apt/sources.list.d/pve-no-subscription.list
- Perform the full upgrade to bullseye / pm 7
sudo apt update
sudo apt dist-upgrade
- Reboot
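Post-reboot sanity check:
pveversion   # should now report pve-manager/7.x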
Manual restart notes
NOTE: This shouldn't be a problem anymore with the newer staged-order restart.
One time the bandit samba shares didn't mount (it comes up too fast, perhaps?). So remount them, then restart qbittorrent-nox:
mh-setup-samba-shares
sudo service qbittorrent-nox restart
I did another round of `apt update && apt dist-upgrade` without stopping containers and it went fine (with bandit fixup still needed after reboot, tho).
sudo apt update
sudo apt dist-upgrade
ssh bandit
mh-setup-samba-shares
sudo service qbittorrent-nox restart
Add 7 1TB raidz
After adding 7 new 1 TB ssds:
๐ m@melange [~] sudo lsblk |awk 'NR==1{print $0" DEVICE-ID(S)"}NR>1{dev=$1;printf $0" ";system("find /dev/disk/by-id -lname \"*"dev"\" -printf \" %p\"");print "";}'|grep -v -E 'part|lvm'
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT DEVICE-ID(S)
sdh       8:112  0 931.5G  0 disk  /dev/disk/by-id/wwn-0x500a0751e5a2fb01 /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A2FB01
sdi       8:128  0 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A2FD56 /dev/disk/by-id/wwn-0x500a0751e5a2fd56
sdj       8:144  1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A313D2 /dev/disk/by-id/wwn-0x500a0751e5a313d2
sdk       8:160  1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A313D6 /dev/disk/by-id/wwn-0x500a0751e5a313d6
sdl       8:176  1 931.5G  0 disk  /dev/disk/by-id/ata-CT1000MX500SSD1_2117E59AAE1B /dev/disk/by-id/wwn-0x500a0751e59aae1b
sdm       8:192  1 931.5G  0 disk  /dev/disk/by-id/wwn-0x500a0751e5a2e131 /dev/disk/by-id/ata-CT1000MX500SSD1_2121E5A2E131
sdn       8:208  1 931.5G  0 disk  /dev/disk/by-id/wwn-0x500a0751e5a3009a /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A3009A
nvme0n1 259:0    0 931.5G  0 disk  /dev/disk/by-id/nvme-eui.002538510141169d /dev/disk/by-id/nvme-Samsung_SSD_970_EVO_Plus_1TB_S4EWNJ0N107994E
Before adding 7 new 1 TB ssds:
๐ m@melange [~] ls /dev/
autofs           dm-8       i2c-7         net        stdin  tty28  tty5       ttyS12  ttyS6    vcsu1
block            dm-9       i2c-8         null       stdout tty29  tty50      ttyS13  ttyS7    vcsu2
btrfs-control    dri        i2c-9         nvme0      tty    tty3   tty51      ttyS14  ttyS8    vcsu3
bus              ecryptfs   initctl       nvme0n1    tty0   tty30  tty52      ttyS15  ttyS9    vcsu4
char             fb0        input         nvme0n1p1  tty1   tty31  tty53      ttyS16  udmabuf  vcsu5
console          fd         kmsg          nvme0n1p2  tty10  tty32  tty54      ttyS17  uhid     vcsu6
core             full       kvm           nvme0n1p3  tty11  tty33  tty55      ttyS18  uinput   vfio
cpu              fuse       lightnvm      nvram      tty12  tty34  tty56      ttyS19  urandom  vga_arbiter
cpu_dma_latency  gpiochip0  log           port       tty13  tty35  tty57      ttyS2   userio   vhci
cuse             hpet       loop0         ppp        tty14  tty36  tty58      ttyS20  vcs      vhost-net
disk             hugepages  loop1         pps0       tty15  tty37  tty59      ttyS21  vcs1     vhost-vsock
dm-0             hwrng      loop2         psaux      tty16  tty38  tty6       ttyS22  vcs2     watchdog
dm-1             i2c-0      loop3         ptmx       tty17  tty39  tty60      ttyS23  vcs3     watchdog0
dm-10            i2c-1      loop4         ptp0       tty18  tty4   tty61      ttyS24  vcs4     zero
dm-11            i2c-10     loop5         pts        tty19  tty40  tty62      ttyS25  vcs5     zfs
dm-12            i2c-11     loop6         pve        tty2   tty41  tty63      ttyS26  vcs6
dm-13            i2c-12     loop7         random     tty20  tty42  tty7       ttyS27  vcsa
dm-14            i2c-13     loop-control  rfkill     tty21  tty43  tty8       ttyS28  vcsa1
dm-2             i2c-14     mapper        rtc        tty22  tty44  tty9       ttyS29  vcsa2
dm-3             i2c-2      mcelog        rtc0       tty23  tty45  ttyprintk  ttyS3   vcsa3
dm-4             i2c-3      mem           shm        tty24  tty46  ttyS0      ttyS30  vcsa4
dm-5             i2c-4      mpt2ctl       snapshot   tty25  tty47  ttyS1      ttyS31  vcsa5
dm-6             i2c-5      mpt3ctl       snd        tty26  tty48  ttyS10     ttyS4   vcsa6
dm-7             i2c-6      mqueue        stderr     tty27  tty49  ttyS11     ttyS5   vcsu
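Each of these then gets handed to hive as a raw disk passthru, one qm set per drive; whether done via the UI or the CLI, this is what lands in 104.conf (it matches the scsi11 entry above):
sudo qm set 104 -scsi11 /dev/disk/by-id/ata-CT1000MX500SSD1_2122E5A2FD56,backup=no,serial=2122E5A2FD56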
macOS USB passthru failed attempt
USB passthru from the Proxmox UI doesn't work on macOS. Tried setting usb mapping via console, following this:
sudo qm monitor 111
qm> info usbhost
qm> quit
sudo qm set 111 -usb1 host=05ac:12a8
No luck, same result. Reading his remarks on USB forwarding, try resetting machine type:
machine: pc-q35-6.0   (instead of latest, which was 6.2 at time of writing)

remove this from /etc/pve/qemu-server/111.conf:
-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
Hmm.. perhaps it is a conflict between Nick's usb keyboard config and my usb port selection... try plugging usb into another port and remapping...
No luck. FFS. Reset to 6.2 and see if we have any luck with hotplug line removed from config... Nope.
Keep trying permutations... nothing from googling indicates that this shouldn't just FUCKING WORK...
Remove this and re-add the hotplug line, on the off chance it shouldn't be used with q35 v6.2:
-global nec-usb-xhci.msi=off
Nope, that just caused a problem with "Springboard", not working on this Mac, or some shit. Re-adding the line...
Well what now? Google more?
Update and reboot proxmox and retry... no luck.
Try changing from blue to light-blue port... the device is mapped so it should be passed through... nope.
Try this guy's approach to mount an EFI Disk
lsusb
  Bus 004 Device 009: ID 05ac:12a8 Apple, Inc. iPhone 5/5C/5S/6/SE
ls -al /dev/bus/usb/004/009
  crw-rw-r-- 1 root root 189, 392 Jul 22 16:10 /dev/bus/usb/004/009
sudo emacs /etc/pve/qemu-server/111.conf
  lxc.cgroup.devices.allow: c 189:* rwm
  lxc.mount.entry: /dev/bus/usb/004 dev/bus/usb/004 none bind,optional,create=dir
Nope.
Try mapping the port instead of device ID, from the Proxmox UI... Nope.
How can i check the apple side for any issues? straight up google for that, macOS not seeing a USB device.
System Information > USB > nada
hrmphhhh. Never got it working. RE-google next month maybe...
During original configuration, I added samba shares manually.
sudo emacs /etc/fstab   # and paste samba stanza from another machine
sudo emacs /root/samba_credentials
sudo mkdir /spiceflow && sudo chmod 777 /spiceflow
๐ m@melange [~] mkdir /spiceflow/bitpost
๐ m@melange [~] mkdir /spiceflow/grim
๐ m@melange [~] mkdir /spiceflow/mack
๐ m@melange [~] mkdir /spiceflow/reservoir
๐ m@melange [~] mkdir /spiceflow/sassy
๐ m@melange [~] mkdir /spiceflow/safe
Now you can mount em up and hang em high!
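For reference, each /etc/fstab stanza looks something like this (the share name and mount options here are hypothetical; copy the real stanza from another machine as noted above):
# hypothetical example stanza; one line per share
//sassy/sassy  /spiceflow/sassy  cifs  credentials=/root/samba_credentials,uid=m,gid=m,iocharset=utf8  0  0

# then mount everything listed in fstab
sudo mount -a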