VirtualBacon

Reading S.M.A.R.T. disk data in ESXi

Posted on March 10, 2014

Beginning with vSphere 5.1, VMware provides the ability to read S.M.A.R.T. data for HDDs and SSDs. As described in the VMware documentation, an esxcli command retrieves S.M.A.R.T. data from the disks. This is useful for tracking the health of your drives, including estimating how much life is left in your flash drives, something I am occasionally asked about.

Here is an example from a host in my lab.

First, retrieve the list of devices. I have four different SSD models installed, so I am showing the output for each of them so that you can see how it differs.

esxcli storage core device list

SSD1:

t10.ATA_____KINGSTON_SV300S37A120G__________________50026B773B02B8BF____
   Display Name: Local ATA Disk (t10.ATA_____KINGSTON_SV300S37A120G__________________50026B773B02B8BF____)
   Has Settable Display Name: true
   Size: 114473
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/t10.ATA_____KINGSTON_SV300S37A120G__________________50026B773B02B8BF____
   Vendor: ATA
   Model: KINGSTON SV300S3
   Revision: 506A
   SCSI Level: 5
   Is Pseudo: false
   Status: on
   Is RDM Capable: false
   Is Local: true
   Is Removable: false
   Is SSD: true
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.010000000035303032364237373342303242384246202020204b494e475354
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32

SSD2:

t10.ATA_____Samsung_SSD_840_EVO_120GB_______________S1D5NSBDB76769V_____
   Display Name: Local ATA Disk (t10.ATA_____Samsung_SSD_840_EVO_120GB_______________S1D5NSBDB76769V_____)
   Has Settable Display Name: true
   Size: 114473
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/t10.ATA_____Samsung_SSD_840_EVO_120GB_______________S1D5NSBDB76769V_____
   Vendor: ATA
   Model: Samsung SSD 840
   Revision: EXT0
   SCSI Level: 5
   Is Pseudo: false
   Status: on
   Is RDM Capable: false
   Is Local: true
   Is Removable: false
   Is SSD: true
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.0100000000533144354e53424442373637363956202020202053616d73756e
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32

SSD3:

t10.ATA_____M42DCT512M4SSD2__________________________000000001249091F58C6
   Display Name: Local ATA Disk (t10.ATA_____M42DCT512M4SSD2__________________________000000001249091F58C6)
   Has Settable Display Name: true
   Size: 488386
   Device Type: Direct-Access
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/t10.ATA_____M42DCT512M4SSD2__________________________000000001249091F58C6
   Vendor: ATA
   Model: M4-CT512M4SSD2
   Revision: 040H
   SCSI Level: 5
   Is Pseudo: false
   Status: on
   Is RDM Capable: false
   Is Local: true
   Is Removable: false
   Is SSD: true
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters:
   VAAI Status: unknown
   Other UIDs: vml.010000000030303030303030303132343930393146353843364d342d435435
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32

SSD4:

t10.ATA_____INTEL_SSDSC2BA100G3_____________________BTTV347502FS100FGN__
   Display Name: Local ATA Disk (t10.ATA_____INTEL_SSDSC2BA100G3_____________________BTTV347502FS100FGN__)
   Has Settable Display Name: true
   Size: 95396
   Device Type: Direct-Access 
   Multipath Plugin: NMP
   Devfs Path: /vmfs/devices/disks/t10.ATA_____INTEL_SSDSC2BA100G3_____________________BTTV347502FS100FGN__
   Vendor: ATA     
   Model: INTEL SSDSC2BA10
   Revision: 5DV1
   SCSI Level: 5
   Is Pseudo: false
   Status: on
   Is RDM Capable: false
   Is Local: true
   Is Removable: false
   Is SSD: true
   Is Offline: false
   Is Perennially Reserved: false
   Queue Full Sample Size: 0
   Queue Full Threshold: 0
   Thin Provisioning Status: yes
   Attached Filters: 
   VAAI Status: unknown
   Other UIDs: vml.010000000042545456333437353032465331303046474e2020494e54454c20
   Is Local SAS Device: false
   Is Boot USB Device: false
   No of outstanding IOs with competing worlds: 32
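
On a host with many devices, it can help to reduce the listing above to just the identifiers of the SSDs. The following is a minimal sketch, assuming the layout shown above (an unindented identifier line followed by indented fields) and assuming awk is available in the shell you run it from. `list_ssd_devices` is a hypothetical helper name, not part of esxcli.

```shell
# Print the identifier of every device whose "Is SSD" field is true.
# Reads the output of `esxcli storage core device list` on standard input.
list_ssd_devices() {
    awk '
        # An unindented, non-empty line starts a new device block.
        /^[^ ]/           { id = $0; sub(/ +$/, "", id) }
        # An indented field belongs to the most recent device block.
        /^ +Is SSD: true/ { print id }
    '
}
```

It would be used as `esxcli storage core device list | list_ssd_devices`, and each identifier it prints can then be passed to the S.M.A.R.T. command.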

Next, retrieve the S.M.A.R.T. information for each SSD.

SSD1:

~ # esxcli storage core device smart get -d=t10.ATA_____KINGSTON_SV300S37A120G__________________50026B773B02B8BF____

Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       0      0          0
Write Error Count             N/A    N/A        N/A
Read Error Count              120    50         120
Power-on Hours                99     0          99
Power Cycle Count             100    0          100
Reallocated Sector Count      100    3          100
Raw Read Error Rate           120    50         120
Drive Temperature             27     0          72
Driver Rated Max Temperature  N/A    N/A        N/A
Write Sectors TOT Count       N/A    N/A        N/A
Read Sectors TOT Count        N/A    N/A        N/A
Initial Bad Block Count       N/A    N/A        N/A
~ #

SSD2:

~ # esxcli storage core device smart get -d=t10.ATA_____Samsung_SSD_840_EVO_120GB_______________S1D5NSBDB76769V_____

Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       N/A    N/A        N/A
Write Error Count             N/A    N/A        N/A
Read Error Count              N/A    N/A        N/A
Power-on Hours                99     0          99
Power Cycle Count             99     0          99
Reallocated Sector Count      100    10         100
Raw Read Error Rate           N/A    N/A        N/A
Drive Temperature             N/A    N/A        N/A
Driver Rated Max Temperature  73     0          70
Write Sectors TOT Count       100    0          100
Read Sectors TOT Count        N/A    N/A        N/A
Initial Bad Block Count       N/A    N/A        N/A
~ #


SSD3:

~ # esxcli storage core device smart get -d=t10.ATA_____M42DCT512M4SSD2__________________________000000001249091F58C6

Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       N/A    N/A        N/A
Write Error Count             N/A    N/A        N/A
Read Error Count              100    50         100
Power-on Hours                100    1          100
Power Cycle Count             100    1          100
Reallocated Sector Count      100    10         100
Raw Read Error Rate           100    50         100
Drive Temperature             100    0          100
Driver Rated Max Temperature  N/A    N/A        N/A
Write Sectors TOT Count       100    1          100
Read Sectors TOT Count        N/A    N/A        N/A
Initial Bad Block Count       100    50         100
~ #

SSD4:

~ # esxcli storage core device smart get -d=t10.ATA_____INTEL_SSDSC2BA100G3_____________________BTTV347502FS100FGN__

Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A  
Media Wearout Indicator       100    0          100  
Write Error Count             N/A    N/A        N/A  
Read Error Count              N/A    N/A        N/A  
Power-on Hours                100    0          100  
Power Cycle Count             100    0          100  
Reallocated Sector Count      100    0          100  
Raw Read Error Rate           N/A    N/A        N/A  
Drive Temperature             100    0          100  
Driver Rated Max Temperature  77     0          77   
Write Sectors TOT Count       100    0          100  
Read Sectors TOT Count        N/A    N/A        N/A  
Initial Bad Block Count       100    90         100  
~ # 
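
To follow a single figure over time, such as Media Wearout Indicator, the table can be cut down to one row. This is a rough sketch, assuming the column layout shown above, where parameter names contain single spaces and columns are separated by two or more spaces. `smart_row` is a hypothetical helper name, not something esxcli provides.

```shell
# Print "value threshold worst" for one named parameter.
# Reads the output of `esxcli storage core device smart get` on standard input.
# $1: the exact parameter name, for example "Media Wearout Indicator".
smart_row() {
    # Two or more spaces separate columns, so names with single spaces stay intact.
    awk -F '  +' -v name="$1" '$1 == name { print $2, $3, $4 }'
}
```

For example, piping the SSD4 output above through `smart_row "Media Wearout Indicator"` would leave only that row's three values. A parameter the drive does not report still prints, with N/A in each column.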

I show the output from four different SSD models to illustrate that monitoring flash media using S.M.A.R.T. information is not straightforward. Manufacturers expose different attributes depending on the controller firmware, so in my case all four SSDs populate a different set of fields. Sometimes there simply isn't enough information to determine how much life may be left in a drive: only the Kingston and Intel drives report a Media Wearout Indicator at all, and they show 0 and 100 respectively. This can pose a challenge to those of you hoping to keep an eye on this.
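
One way to see the differences between models at a glance is to list only the parameters for which a drive reports a value. This is a sketch under the same layout assumptions as the tables above (header row, dashed separator, columns split by two or more spaces). `reported_parameters` is a hypothetical name chosen for illustration.

```shell
# Print the name of every parameter whose Value column is not "N/A".
# Reads the output of `esxcli storage core device smart get` on standard input.
reported_parameters() {
    awk -F '  +' '
        # Skip the column header and the dashed separator line.
        $1 == "Parameter" || $1 ~ /^-+$/ { next }
        # Keep only full table rows that carry a value.
        NF >= 4 && $2 != "N/A" { print $1 }
    '
}
```

Running this against each drive and comparing the lists shows which fields a given model actually populates before you decide what to monitor.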

We should hope that S.M.A.R.T. output for flash devices will standardize over time so that all drives report the same information. I suspect that output for HDDs is more standardized than for newer SSDs, though I have not validated this (there are no HDDs in my hosts). It is probably a good idea to periodically check for firmware updates for your drives, as they could add functionality and expose additional information.

 

Refer to the VMware KB for additional information.

Posted by Peter


