Showing posts with label esxcli corestorage. Show all posts

Friday, 20 May 2011

Unclaiming a device from ESX.


The need to unclaim an ESX device usually arises when you want to change the plugin that claims the device or the paths to the device. For example, if you want to mask a device, you first add the new claimrules and then unclaim the device from the plugin that currently owns it.

Note that path, adapter, plugin, and similar unclaims succeed only when the device is free. In other words, the device should not be actively servicing I/O. If VMs are powered on, or I/O is being issued to an RDM disk, the command is bound to fail. Unclaim often fails on local disks, because the scratch partition and dump partition may be configured on them.

There are different ways to unclaim a device.
You can unclaim on a per-device basis as follows:
~ # esxcli corestorage claiming unclaim -t device --device naa.6009999999999284000064c349cc3cd9

You can also unclaim devices based on the device vendor name.
~ # esxcli corestorage claiming unclaim -t vendor --vendor IBM

In ESX you can unclaim on a per-path basis too.
~ # esxcli corestorage claiming unclaim -t path --path vmhba2:C0:T0:L111
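
The path name passed to --path packs the adapter, channel, target, and LUN into one string. As a rough illustration of how that name breaks down, here is a small sketch; parse_path is a hypothetical helper written for this post, not an esxcli feature, and it assumes the adapter:Cn:Tn:Ln form shown above.

```shell
# Split an ESX path name of the form adapter:Cn:Tn:Ln into its parts.
# parse_path is a hypothetical helper, not part of esxcli.
parse_path() {
    path=$1
    old_ifs=$IFS
    IFS=:
    set -- $path          # word-split the name on ":" into $1..$4
    IFS=$old_ifs
    if [ $# -ne 4 ]; then
        echo "unexpected path format: $path" >&2
        return 1
    fi
    # Strip the C/T/L prefixes to leave the bare numbers.
    echo "adapter=$1 channel=${2#C} target=${3#T} lun=${4#L}"
}

parse_path vmhba2:C0:T0:L111
```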

Less commonly used variants are:
Driver-based unclaim.
~ # esxcli corestorage claiming unclaim -t driver --driver qla2xxx

Plugin-based unclaim.
You can also unclaim devices based on the name of the plugin that claims them.
~ # esxcli corestorage claiming unclaim -t plugin --plugin MASK_PATH
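
The five variants above share one pattern: the value given to -t is also the name of the option that carries the argument. A minimal sketch of that pattern follows. unclaim_cmd is a hypothetical helper that only prints the command line, so you can review it before running anything on a host.

```shell
# Print the unclaim command for one of the types shown above.
# unclaim_cmd is a hypothetical helper; it does not call esxcli itself.
unclaim_cmd() {
    kind=$1
    value=$2
    case $kind in
        device|vendor|path|driver|plugin)
            echo "esxcli corestorage claiming unclaim -t $kind --$kind $value"
            ;;
        *)
            echo "unsupported unclaim type: $kind" >&2
            return 1
            ;;
    esac
}

unclaim_cmd device naa.6009999999999284000064c349cc3cd9
unclaim_cmd plugin MASK_PATH
```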

If the individual unclaim types are hard to remember, you can try to unclaim all the devices in ESX.
ESX will try to unclaim every device that is not busy. Please note that this command returns a device busy message in most cases, because it also tries to unclaim the local disk, where the swap, dump, and scratch partitions may be configured.
~ # esxcli corestorage claiming unclaim  -t location
Errors:
Unable to perform unclaim.  Error message was : Unable to unclaim paths.  Busy or in use devices detected.  See VMkernel logs for more information.
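
If you script the unclaim, it may help to tell this expected "busy" failure apart from other errors. The sketch below matches on the error text quoted above. is_busy_error is a hypothetical helper, and the exact wording may differ between ESX releases, so treat the match string as an assumption.

```shell
# Return success if the message looks like the "busy device" unclaim error.
# is_busy_error is a hypothetical helper; the matched text is copied from
# the error output shown above and may vary by release.
is_busy_error() {
    case $1 in
        *"Busy or in use devices detected"*) return 0 ;;
        *) return 1 ;;
    esac
}

msg="Unable to perform unclaim.  Error message was : Unable to unclaim paths.  Busy or in use devices detected.  See VMkernel logs for more information."
if is_busy_error "$msg"; then
    echo "some devices were busy; check the VMkernel logs"
fi
```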

After unclaiming, do not forget to load and run the new claimrules. The load operation reads the claimrules from the /etc/vmware/esx.conf file, and the run operation applies them to the unclaimed devices.
~ # esxcli corestorage claimrule load
~ # esxcli corestorage claimrule run
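
The unclaim, load, and run steps can be chained so that a failed unclaim stops the sequence. This is only a sketch: reclaim_device and the ESXCLI variable are names made up for this example. ESXCLI defaults to "echo esxcli", so by default the script prints the commands instead of running them.

```shell
# ESXCLI is an assumed override hook: by default the commands are only
# printed. Set ESXCLI=esxcli on a host to actually run them.
ESXCLI=${ESXCLI:-echo esxcli}

# Unclaim one device, then load and run the claimrules.
# reclaim_device is a hypothetical helper name.
reclaim_device() {
    device=$1
    # Stop early if the unclaim fails (for example, the device is busy).
    $ESXCLI corestorage claiming unclaim -t device --device "$device" || return 1
    $ESXCLI corestorage claimrule load || return 1
    $ESXCLI corestorage claimrule run
}

reclaim_device naa.6009999999999284000064c349cc3cd9
```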

Saturday, 16 April 2011

Using autoclaim while connecting ESX host to existing fabric.


I came across an interesting scenario that a partner faced while testing ESX in their setup. The partner had a multi-node ESX 4.0 setup connected to an active-passive array. The customer isolated one ESX 4.0 server from the SAN, installed ESX 4.1 on it, and plugged the FC cables back into the HBA ports on the host. Suddenly, all the shared LUNs on the other ESX hosts trespassed to the other storage processor (the ESX 4.1 host shared these LUNs too).

The reason for this trespass was that the two HBAs on each host were connected to two separate fabrics, and the storage processors were connected to the fabrics as shown below:
HBA1 --------> Fabric1 ------------> SP1
HBA2 --------> Fabric2 ------------> SP2

When the storage admin connected the HBA corresponding to the standby SP to the SAN fabric first, ESX trespassed the shared LUNs to the standby SP (activating it), because that was the only available path to the shared LUNs as seen from the ESX 4.1 host.

ESX uses VMW_PSP_MRU for active-passive arrays to avoid path thrashing. As a result, all the ESX hosts send I/O through the same SP for the shared LUNs. To avoid an unwanted LUN trespass while connecting a live ESX host to an existing fabric, you can run the command below on the isolated ESX host before connecting it to the SAN. Disabling autoclaim prevents ESX from claiming any new devices.
  esxcli corestorage claiming autoclaim --enabled false
After you restore SAN connectivity, you have to re-enable autoclaim by executing:
  esxcli corestorage claiming autoclaim --enabled true

Do not forget to re-enable autoclaim. Otherwise ESX will never claim any new devices, even if you initiate multiple rescans manually. The values the autoclaim command accepts are true, false, 1, 0, yes, no, y, and n.
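
Since the command accepts several spellings, a wrapper script may want to reduce them to true or false and reject anything else before calling esxcli. Here is a minimal sketch. normalize_bool, set_autoclaim, and the ESXCLI variable are hypothetical names for this example. ESXCLI defaults to "echo esxcli", so nothing is executed unless you override it.

```shell
# ESXCLI is an assumed override hook: by default the command is only printed.
ESXCLI=${ESXCLI:-echo esxcli}

# Map the accepted spellings listed above onto true/false.
# normalize_bool is a hypothetical helper name.
normalize_bool() {
    case $1 in
        true|1|yes|y) echo true ;;
        false|0|no|n) echo false ;;
        *) echo "invalid autoclaim value: $1" >&2; return 1 ;;
    esac
}

# Validate the value first, then build the autoclaim command.
set_autoclaim() {
    value=$(normalize_bool "$1") || return 1
    $ESXCLI corestorage claiming autoclaim --enabled "$value"
}

set_autoclaim no     # before plugging the host back into the fabric
set_autoclaim yes    # after SAN connectivity is restored
```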

Disclaimer: Storage admins follow a different approach when connecting an ESX host to an existing fabric. This experiment was done in a customer's QA environment. Please try this out at your own risk :).