Sunday, 4 March 2018

Unable to delete the virtual machine snapshots (2017072)

Symptoms
  • Snapshots cannot be committed.
  • The snapshot commits without errors and Snapshot Manager no longer lists it, but the snapshot disk is still present in the virtual machine directory.
  • On ESXi 5.x hosts, the virtual machine Summary tab shows the message:

    snapshot consolidation required 

    Note: For additional symptoms and log entries, see the Related Information section.

Purpose
If you are unable to delete the virtual machine snapshots, either consolidate them after powering off the backup appliance (or removing the locked disk from it), or clone the latest snapshot disk to a new disk.

Cause
This issue occurs if a backup server, backup appliance, or another virtual machine holds a lock on the underlying base disk or a previous snapshot file, which prevents snapshot consolidation. It can also be caused by stale processes left over from snapshot creation.

To access the underlying disk to back up the contents, backup servers and appliances use snapshots through the vStorage API.
After the backup completes, the backup server requests the snapshot taken for the backup to be deleted and the data merged into the previous disk. In some cases, the backup server may request the snapshot be deleted too early, while the base disk is still locked by the backup server.

Resolution
To work around this issue, use one of these options:
  • Consolidate the snapshot after removing the .vmdk from the backup appliance.
  • Clone the latest snapshot disk to a new disk.

Consolidate the snapshot

  1. Power off the backup server, or remove the .vmdk from the backup virtual appliance.

    To remove the .vmdk file from the backup appliance:
    1. Ensure that it is safe to remove the disk from the backup application's perspective.
    2. Right-click the backup appliance virtual machine and click Edit Settings.
    3. Check if the affected virtual machine's hard disk is mounted on the backup appliance virtual machine.
    4. If the hard disk is mounted, select the hard disk and select the Remove from virtual machine option. 

      Note: Do not select the Delete option as it will result in data loss.
    5. Click OK to exit.
  2. Create a new snapshot of the affected virtual machine, and then click Delete All in Snapshot Manager to consolidate all snapshots.

    Note: In ESXi 5.x, you can consolidate the snapshot without creating a new snapshot by right-clicking the virtual machine and clicking Snapshot > Consolidate.

    • If the delete succeeds:

      Check the folder of the virtual machine to ensure that all the snapshots are consolidated. 
    • If the delete fails again with lock messages:

      Determine the host that still holds a lock on the name-flat.vmdk or name-delta.vmdk file.
      For more information, see Finding the lock owners of a VMDK or file on a VMFS datastore in VMware ESXi 5.5 P05 (2110152).
    • If the delete continues to fail with lock messages:

      Restart the management agents on the ESX/ESXi host where the virtual machine is running.
    • If you are unable to determine the process or virtual machine holding the lock on the file named in the error, consider migrating the running virtual machines off the ESX host that holds the lock and then rebooting that host.

      Note: If you are unable to reboot the ESX host, consider cloning the disk in question to a new disk.
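
As a rough sketch of the lock check above, assuming shell access to the ESXi host: vmkfstools -D prints the lock metadata for a file, including an owner field like the one in the vmkernel log entries under Related Information. By VMFS convention, the last 12 hex digits of that owner are the MAC address of the host holding the lock. The helper name owner_to_mac and the paths in the comments are hypothetical and not part of the original article.

```shell
# On the ESXi host (not run here), dump the lock metadata for the disk named
# in the error; the path is a placeholder:
#   vmkfstools -D /vmfs/volumes/<datastore>/<vm>/name-flat.vmdk
# The output includes an "owner" field such as
#   owner 4b94bb81-0dd2dd58-3bd1-002219927983

# Hypothetical helper: turn an owner UUID into the lock holder's MAC address.
owner_to_mac() {
    # Keep the text after the last '-' and insert ':' between byte pairs.
    printf '%s\n' "${1##*-}" | sed 's/../&:/g; s/:$//'
}

owner_to_mac 4b94bb81-0dd2dd58-3bd1-002219927983   # prints 00:22:19:92:79:83

# If the lock persists, the management agents on ESXi can be restarted with:
#   /etc/init.d/hostd restart
#   /etc/init.d/vpxa restart
```

Comparing the resulting MAC address with the network adapters of each host in the cluster should identify the host to investigate.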

Clone the latest snapshot disk to a new disk

To clone the latest snapshot disk and attach the new disk:
  1. Power off the virtual machine with the locked disk.
  2. Clone the latest snapshot disk from the command line. 

    For example, the latest snapshot disk might be named name-000003.vmdk.
  3. Select the Remove from virtual machine option to remove the locked disk from the virtual machine.
  4. Attach the cloned .vmdk file to the virtual machine.
  5. Power on the virtual machine.
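
The clone in step 2 can be sketched with vmkfstools -i, which clones a virtual disk on the ESXi command line. This is a minimal sketch, not a definitive procedure: the helper name clone_cmd, the datastore path, and the destination file name are all hypothetical. The helper only prints the command so that it can be reviewed before it is run on the host.

```shell
# Hypothetical helper: print the vmkfstools clone command for review.
#   $1: descriptor of the latest snapshot disk (for example name-000003.vmdk)
#   $2: descriptor for the new disk (hypothetical name)
clone_cmd() {
    src="$1"
    dst="$2"
    printf 'vmkfstools -i "%s" "%s"\n' "$src" "$dst"
}

# Example with placeholder paths. Run the printed command on the ESXi host
# only after the virtual machine has been powered off.
clone_cmd /vmfs/volumes/datastore1/name/name-000003.vmdk \
          /vmfs/volumes/datastore1/name/name-clone.vmdk
```

Pointing the clone at the latest snapshot descriptor, rather than at the base disk, is what carries the snapshot data into the new disk.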

Related Information
You may also experience these additional symptoms:
  • In the vmware.log file, you see errors similar to:

    vmx| ConsolidateOnlineCB: nextState = 2 uid 3
    vmx| Foundry operation failed with system error: Device or resource busy (16), translated to 5
    vmx| ConsolidateOnlineCB: Done with consolidate
     

  • When you attempt to remove the datastore, you see this error:

    The resource '<VMFS-UUID>' is in use
  • If you are running third-party backup software, consolidation might fail with the following errors in the vmware.log file:

    vcpu-0| Vix: [8803 mainDispatch.c:4084]: VMAutomation_ReportPowerOpFinished: statevar=3, newAppState=1881, success=1 additionalError=0
    vcpu-0| Vix: [8803 vigorCommands.c:577]: VigorSnapshotManagerConsolidateCallback: snapshotErr = Failed to lock the file (5:4008)
    vcpu-0| SnapshotVMXConsolidateOnlineCB: Destroying thread 6
    vcpu-0| Turning off snapshot info cache.
    vcpu-0| Turning off snapshot disk cache.
    vcpu-0| SnapshotVMXConsolidateOnlineCB: Done with consolidate
  • In the vmkernel or messages log files, you see entries similar to:

    vmkernel: gen 2141, mode 1, owner 4b94bb81-0dd2dd58-3bd1-002219927983 mtime 244622]on volume 'LUN03'.
    vmkernel: 12:14:41:53.816 cpu2:4109)FS3: 2890: [Requested mode: 1] Lock [type 10c00001 offset 7505920 v 920, hb offset 3510272
    vmkernel: gen 2141, mode 1, owner 4b94bb81-0dd2dd58-3bd1-002219927983 mtime 244622] is not free on volume 'LUN03'
    vmkernel: 12:14:41:53.832 cpu2:4111)FS3: 2798: [Requested mode: 1] Checking liveness of lock holders [type 10c00001 offset 7313408 v 796, hb offset 3510272

  • In the hostd.log file, you see entries similar to these during the snapshot delete process:

    DISKLIB-LIB : Failed to delete disk '/vmfs/volumes/4c5f4b7a-43b90a52-32ad-00237d5b917e/TESTVM/TESTVM_1-000001.vmdk' or one of its components: Device or resource busy
  • When you attempt to consolidate by right-clicking the virtual machine and clicking Snapshot > Consolidate, you see errors similar to:

    Consolidate virtual machine disk files <hostname> Unable to access file <unspecified filename> since it is locked
    Consolidation failed for disk node 'scsi0:8': msg.fileio.lock.
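
To check a log for the signatures listed above, a simple grep is usually enough. The sketch below assumes a POSIX shell with grep -E. The function name find_lock_errors and the log path in the comment are hypothetical, and the patterns are taken from the sample entries in this section.

```shell
# Hypothetical helper: print (with line numbers) log lines that match the
# lock-related messages quoted in this article. Reads stdin if no file is given.
find_lock_errors() {
    grep -nE 'Failed to lock the file|Device or resource busy|msg\.fileio\.lock' "$@"
}

# Example on the host, with a placeholder path:
#   find_lock_errors /vmfs/volumes/<datastore>/<vm>/vmware.log

# Self-contained demonstration using two lines quoted above:
printf '%s\n' \
    'vmx| ConsolidateOnlineCB: nextState = 2 uid 3' \
    'vmx| Foundry operation failed with system error: Device or resource busy (16), translated to 5' |
    find_lock_errors
```

A match only indicates that a lock error was logged. Identifying the lock holder still requires the steps in the Resolution section.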
