Cheat Sheet for mdadm

# mdadm --detail /dev/md127
Displays the status & health of the specified array.

# mdadm --create /dev/md127 --level=10 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
Creates a RAID10 array from the four listed drives (--raid-devices must match the number of devices given)

# mdadm --add /dev/md127 /dev/sdf
Adds the specified device to the specified array.

# mdadm /dev/md127 --fail /dev/sdd --remove /dev/sdd
Fails and removes the specified device from the array; to do only one or the other, omit the unwanted option.

Repairing with mdadm

Time :: <5 minutes

When adding a new drive, or replacing an old one, sometimes you need to verify which device node Linux has assigned it.

# lsblk
Gives you a breakdown of how each block device is used (or not).
In this example, /dev/sdg is the empty drive. Before we go about adding it, let's double-check our array.
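This check can also be scripted. The sketch below runs the same logic against a hypothetical lsblk sample (the device names are made up); on a live system you would pipe lsblk itself instead of echoing a sample:

```shell
# Sample rows in the style of `lsblk -rn -o NAME,TYPE,MOUNTPOINT`
# (hypothetical devices, not a real system).
lsblk_sample='sdb disk
sdb1 part /data
sdg disk'

# A disk with no partitions beneath it and no mountpoint is a
# candidate empty drive.
empty_disks=$(echo "$lsblk_sample" | awk '
  $2 == "disk" { disks[$1] = 1 }
  $2 == "part" { sub(/[0-9]+$/, "", $1); delete disks[$1] }
  END { for (d in disks) print "/dev/" d }')
echo "$empty_disks"
```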

# mdadm --detail /dev/md127
Gives a detailed view of the specified array
In this example, the array is running on 9 of its drives and is in a degraded state.
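If you want to script this health check, one approach is to pull the State line out of the --detail output. The sketch below works on a hypothetical snippet of that output rather than a live array:

```shell
# Lines in the style of `mdadm --detail /dev/md127` output
# (hypothetical values for illustration).
detail_sample='          State : clean, degraded
 Active Devices : 9
Working Devices : 9
 Failed Devices : 1'

# Extract the State field and flag degraded arrays.
state=$(echo "$detail_sample" | awk -F' : ' '/State :/ { print $2 }')
case "$state" in
  *degraded*) echo "array is degraded" ;;
  *)          echo "array is healthy" ;;
esac
```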

# mdadm --add /dev/md127 /dev/sdg
Adds the specified drive to the specified array

This will add /dev/sdg to the array /dev/md127. You should see a response that the drive was added:
mdadm: added /dev/sdg

After this point, the array should start rebuilding by itself; you can verify with another:
# mdadm --detail /dev/md127
The State should include "recovering", and you should see the device listed as something akin to:
spare rebuilding /dev/sdg

If, for whatever reason, it doesn't, you can manually grow the array.
# mdadm --grow --raid-devices=10 /dev/md127
As long as the array isn't in a recovery/resync state, this should work. However, it can take a long time (days or more), so it is advisable to specify a backup file in case of, say, a power failure:
# mdadm --grow --raid-devices=10 --backup-file=/root/md127_date_grow.bak /dev/md127

# watch cat /proc/mdstat
Displays mdadm status on a refresh interval; helpful for keeping an eye on the array's recovery progress.
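To pull just the completion percentage out of /proc/mdstat (say, for logging or alerting), a sed one-liner works. The sample below is a made-up mdstat snippet; on a real system you would `cat /proc/mdstat` instead of echoing it:

```shell
# Snippet in the style of /proc/mdstat during a rebuild
# (hypothetical numbers).
mdstat_sample='md127 : active raid10 sdg[10] sdf[9] sde[3] sdd[2] sdc[1] sdb[0]
      [=====>...............]  recovery = 28.7% (7012345/24414272) finish=12.3min speed=23456K/sec'

# Extract the recovery percentage from the progress line.
pct=$(echo "$mdstat_sample" | sed -n 's/.*recovery = \([0-9.]*\)%.*/\1/p')
echo "rebuild at ${pct}%"
```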

Misc Linux Commands

Misc

# blkid
List connected partitions/devices and their UUID/PARTUUID

# lsblk
List connected block devices, and how they are used

# cat -n /etc/fstab
Cat file w/ line numbers

# sed -n '22,24p;30p' /etc/fstab >> /boot/loader/entries/arch.ini
Extract selected lines of text from a file and append them to another file
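To see the line-range syntax in action without touching real config files, here is a sketch against a throwaway file (lines 2-4 plus line 6 get extracted):

```shell
# Build a seven-line scratch file and extract lines 2-4 plus line 6.
tmp=$(mktemp)
printf '%s\n' one two three four five six seven > "$tmp"
picked=$(sed -n '2,4p;6p' "$tmp")
echo "$picked"
rm -f "$tmp"
```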

# setenforce 0
Disables SELinux FOR THIS SESSION ONLY; to keep it off, edit /etc/selinux/config and set SELINUX=disabled
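The config edit itself can be done with sed. The sketch below works on a temporary stand-in for /etc/selinux/config, not the real file:

```shell
# Stand-in for /etc/selinux/config (temporary copy, not the real file).
cfg=$(mktemp)
printf 'SELINUXTYPE=targeted\nSELINUX=enforcing\n' > "$cfg"

# Flip the mode to disabled; anchoring on ^SELINUX= avoids
# accidentally matching the SELINUXTYPE= line.
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
result=$(grep '^SELINUX=' "$cfg")
echo "$result"
rm -f "$cfg"
```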

smartctl

smartctl --all /dev/sdx
Check full SMART status of a specific drive

smartctl -c /dev/sdx
Show drive capabilities, including estimated self-test durations

smartctl -t <short|long|conveyance|select> /dev/sdx
Start test (background)

smartctl -t <short|long|conveyance|select> -C /dev/sdx
Start test (foreground)

smartctl -a /dev/sdx
View results (all)

smartctl -l selftest /dev/sdx
Report only test results
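A quick way to flag failed self-tests in scripts is to grep the log. The sample below imitates `smartctl -l selftest` output with made-up entries; on a live system you would pipe smartctl itself:

```shell
# Lines in the style of `smartctl -l selftest` output
# (hypothetical entries).
selftest_sample='# 1  Short offline       Completed without error       00%      8125   -
# 2  Extended offline    Completed: read failure       90%      8020   123456789'

# Flag any logged failure.
if echo "$selftest_sample" | grep -q 'failure'; then
  verdict="at least one self-test reported a failure"
else
  verdict="no failures logged"
fi
echo "$verdict"
```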

smartctl -o off|on /dev/sdx
Enable/disable SMART automatic offline testing for the drive (requires a device target)

mdadm

For setting up an array, see Using mdadm to create and manage an array.
mdadm --detail /dev/mdX
Check full details of a specific md/array
NOTE: May show up as /dev/md/raidX

Disable Auto Updates

SCONFIG

The simplest and quickest way is to use sconfig.

Open an admin command prompt, type "sconfig", and hit Enter.

On the 'Sconfig Base' screen, choose option "5", "Windows Update Settings".

It will show the current setting, along with options to choose Automatic, DownloadOnly, or Manual updates. Manual stops the system from even checking for updates (and with it, the annoying pop-up when you log in).

Once set, you will see a pop-up window stating the results of your pick. Upon returning to the main Sconfig screen, you will see the option for 5 has changed to reflect your changes.

Benefits of Manual Only:

  • Stops annoying pop-up on log-in
  • Helps lower disk queue on a VM/cluster by preventing updates from constantly downloading in the background (helpful on a tight cluster)

Failover Disaster Scenarios

Easy Does It

Improperly power off one of the hosts.
Pull the plug, or Cold Boot in iLO

I believe the disk witness is supposed to allow the VM to fail over to a working host with a Saved-Resume instead of an 'improper shutdown' reboot of the VM.

I Unplugged My Switch By Accident

Power off/reboot the switch which connects the hosts to storage. If they are directly connected, completely unplug all cables. Do the same with your disk witness if both aren’t on the same system.
After a couple minutes, reconnect.
Successfully resume VMs from a saved state.

Changing Failover Cluster Subnet Mask

NO DOWNTIME :: This walks through a "hot" change
Time to make changes :: Less than 5 minutes

FOREWORD:

  • It is best to perform this task from the node hosting the Cluster at this time.
  • While not necessary to do it in this order, doing the Cluster’s subnet first doesn’t hurt.
  • If you have multiple DCs, make sure they are split across the cluster nodes. If you only have one, migrate it to the cluster node currently hosting the Cluster object.


Misc Hyper-V Monitoring

Monitoring VM Disk Queues

Gives a detailed look at each individual VM, regardless of what other VMs are doing

perfmon > Hyper-V Virtual Storage Device > Queue Length

Add instances separately. You can connect to a remote host and add its counters as well.

Cluster Disk Counters > Read Queue Length / Write Queue Length

Gives good total look at Cluster Storage

Cluster CSVFS > Current Read Queue Length / Current Write Queue Length / Volume Pause Counter – Network

Hyper-V Virtual Storage Device(*) / Queue Length


Monitoring Active Migration Jobs (VM Storage)

Get-WmiObject -Namespace root\virtualization\v2 -Class Msvm_MigrationJob | Format-Table Name,JobStatus,PercentComplete,StatusDescriptions
