NSClient++ Help (#1) - feeding smart status back to nagios (#328) - Message List
I'd like to get the SMART status of the disk. I think I'm making some progress, i.e.
C:\Program Files\NSClient++>"nsclient++.exe" CheckWMI Select Status from Win32_DiskDrive
l \NSClient++.cpp(370) Attempting to start NSCLient++ - 0.3.5.2 2008-09-24
l \NSClient++.cpp(476) NSCLient++ - 0.3.5.2 2008-09-24 Started!
l \CheckWMI.cpp(307) | Status |
l \CheckWMI.cpp(317) | OK |
l \NSClient++.cpp(530) Attempting to stop NSCLient++ - 0.3.5.2 2008-09-24
l \NSClient++.cpp(589) NSCLient++ - 0.3.5.2 2008-09-24 Stopped succcessfully
But I don't see how to turn this output into something useful for Nagios, i.e. "OK", "WARNING", or "CRITICAL". It appears that the possible return values for "Status" are one of the following: OK, Error, Degraded, Unknown, Pred Fail, Starting, Stopping, Service, Stressed, NonRecover, No Contact, or Lost Comm. I would be happy with "OK" resulting in a Nagios "OK" and anything else being a "WARNING"; ideally, "WARNING" followed by the "Status" from WMI. Is there a way to do this using the "filter" and Max/Min syntax?
Bonus question: And what do you do if you have multiple drives? I don't see any obvious way to specify a specific drive to check (?)
Thanks -e
Message #1006
I've hacked something together that seems to work using WSH and WMI (no smartmontools). It displays the following for the "Service State Information":
Current Status: OK (for 0d 0h 14m 52s)
Status Information: SMART Status is OK
Performance Data: WDC WD1500HLFS-01G6U0 139 GB
In "nsc.ini" on the client, I've made the following changes:
uncommented NRPEListener.dll
added to [External Scripts]:
check_smart_disk0=cscript.exe //T:30 //NoLogo "C:\Program Files\NSClient++\scripts\smart.vbs" 0
check_smart_disk1=cscript.exe //T:30 //NoLogo "C:\Program Files\NSClient++\scripts\smart.vbs" 1
The file "smart.vbs" contains:
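' The index of the disk to check is passed as the first argument (0, 1, ...)
Set args = WScript.Arguments
drive = CInt(args(0))

strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
    & "{impersonationLevel=impersonate}!\\" & strComputer _
    & "\root\cimv2")

Set diskset = objWMIService.ExecQuery _
    ("Select * from Win32_DiskDrive")

For Each disk In diskset
    If disk.Index = drive Then
        Select Case disk.Status
            Case "OK"
                ' Exit code 0 = Nagios OK; the text after "|" is shown as performance data
                WScript.Echo "SMART Status is OK| " & disk.Model & " " & Int(disk.Size/1073741824) & " GB"
                WScript.Quit(0)
            Case Else
                ' Exit code 1 = Nagios WARNING, with the WMI status in the message
                WScript.Echo "SMART Status is " & disk.Status
                WScript.Quit(1)
        End Select
    End If
Next

' No disk with the requested index: report UNKNOWN instead of exiting silently with 0
WScript.Echo "SMART Status is UNKNOWN (disk " & drive & " not found)"
WScript.Quit(3)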
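The mapping this script applies (OK becomes exit code 0, anything else becomes a WARNING with the WMI status in the message) can be tried out without a Windows box. Below is a minimal Python sketch of just that decision. The function name nagios_result is hypothetical, and treating an empty status as UNKNOWN is an assumption, not something the script above does.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_result(status):
    """Map a Win32_DiskDrive Status string to (exit_code, plugin_output).

    Hypothetical helper: it only mirrors the decision made in smart.vbs
    and does not query WMI itself.
    """
    status = (status or "").strip()
    if not status:
        # Assumption: no status at all is reported as UNKNOWN.
        return UNKNOWN, "SMART Status is UNKNOWN (no status returned)"
    if status == "OK":
        return OK, "SMART Status is OK"
    # Any other value ("Pred Fail", "Degraded", ...) becomes a WARNING
    # that carries the original WMI status text.
    return WARNING, "SMART Status is " + status
```

Promoting a particular status such as "Pred Fail" to CRITICAL would only take one more branch before the final return.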
There are actually a bunch of different error states, but I figure I want to know about anything other than "OK". I kept the level as "WARNING", as I don't know how useful it's going to be until I get more experience with real-life disk error messages. I'm aware that disks sometimes die with no warning from SMART. I've never used Visual Basic before, so feel free to improve on this; I just cobbled together little snippets of code I found via Google.
On the Nagios server side, the service and hostgroup definitions look like the following:
define service{
    use                 generic-service
    hostgroup_name      check_smart_disk0
    service_description SMART Disk 0
    check_command       check_nrpe!check_smart_disk0
    check_interval      720
}
define service{
    use                 generic-service
    hostgroup_name      check_smart_disk1
    service_description SMART Disk 1
    check_command       check_nrpe!check_smart_disk1
    check_interval      720
}
define hostgroup{
    hostgroup_name      check_smart_disk0
    alias               Windows SMART Disk0 status
    members             host1, host2, host3
}
define hostgroup{
    hostgroup_name      check_smart_disk1
    alias               Windows SMART Disk1 status
    members             host2
}
I do have to know ahead of time the number of disks to check on the client. Seems to be working so far. -e
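Since the number of disks has to be known ahead of time, the per-disk [External Scripts] entries can be generated rather than typed by hand. A minimal Python sketch, assuming the command naming and script path used in this thread; the function name external_script_lines is hypothetical, and the cscript options are written with the // prefix that cscript expects.

```python
def external_script_lines(
    disk_count,
    script_path=r"C:\Program Files\NSClient++\scripts\smart.vbs",
):
    """Build one [External Scripts] line per disk index (0 .. disk_count - 1).

    Hypothetical helper: it only formats text for nsc.ini and does not
    touch NSClient++ or detect how many disks a host really has.
    """
    lines = []
    for index in range(disk_count):
        lines.append(
            'check_smart_disk{0}=cscript.exe //T:30 //NoLogo "{1}" {0}'.format(
                index, script_path
            )
        )
    return lines
```

Printing the result of external_script_lines(2) gives the two lines to paste under [External Scripts] for a two-disk host.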
eap 01/22/09 23:35:39 (3 years ago)
Message #1131
Be cautious; you may not be monitoring your actual SMART status. Based on the following resources, I wrote my own script, which monitors all available drives.
Using WMIC to check for predicted disk failure (S.M.A.R.T. analysis) (Links to the Microsoft standard are in here.)
' Adam Buchbinder
' 2009-03-20
' Counts the number of disks reporting a SMART failure
' prediction. If it triggers, follow up with a test with
' smartmontools or the like.
' References (obfuscated to get by the antispambot):
' h**p://www.microsoft.com/whdc/archive/smartdrv.mspx
' h**p://blogs.msdn.com/jamesfi/archive/2007/01/15/using-wmic-to-check-for-predicted-disk-failure-s-m-a-r-t-analysis.aspx
strComputer = "."
Set objWMIService = GetObject("winmgmts:" _
    & "{impersonationLevel=impersonate}!\\" & strComputer _
    & "\root\WMI")
Set disks = objWMIService.ExecQuery _
    ("Select * from MSStorageDriver_FailurePredictStatus")
totalcount = disks.Count
If totalcount = 0 Then
    Wscript.echo "SMART status UNKNOWN (no disks found)"
    Wscript.quit(3)
End If
failcount = 0
For Each disk in disks
    If disk.PredictFailure Then
        failcount = failcount + 1
    End If
Next
nonfailcount = totalcount - failcount
okstr = "(" & nonfailcount & "/" & totalcount & " OK)"
If failcount > 0 Then
    Wscript.echo "SMART status CRITICAL " & okstr
    Wscript.quit(2)
End If
Wscript.echo "SMART status OK " & okstr
Wscript.quit(0)

Note that we're using MSStorageDriver_FailurePredictStatus rather than Win32_DiskDrive; when I checked the latter on my Windows 2003 Server box, it returned empty for the Capabilities field, which describes whether or not SMART is available. Feeling sketchy about this, I wrote the above method. Hope it's helpful to you. (It still needs to be called as an external command, since WQL doesn't include anything like SELECT COUNT(*).)
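The counting logic of the script can be checked separately from WMI. Here is a minimal Python sketch that takes the PredictFailure flags as a plain list and returns the same exit codes and messages; the function name summarize is hypothetical, and how the flags are obtained is left out on purpose.

```python
def summarize(predict_failure_flags):
    """Return (exit_code, plugin_output) for a list of PredictFailure flags.

    Hypothetical helper: it mirrors the counting in the VBScript above
    and does not query MSStorageDriver_FailurePredictStatus itself.
    """
    flags = list(predict_failure_flags)
    total = len(flags)
    if total == 0:
        # No disks reported at all: Nagios UNKNOWN (exit code 3).
        return 3, "SMART status UNKNOWN (no disks found)"
    failed = sum(1 for flag in flags if flag)
    ok_str = "({0}/{1} OK)".format(total - failed, total)
    if failed > 0:
        # At least one disk predicts failure: Nagios CRITICAL (exit code 2).
        return 2, "SMART status CRITICAL " + ok_str
    return 0, "SMART status OK " + ok_str
```

For two healthy disks this yields "SMART status OK (2/2 OK)"; with one of them predicting failure it yields "SMART status CRITICAL (1/2 OK)".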
Also, this is absolutely my first time writing anything in VBScript or WQL, so please let me know if I've done anything silly, or if this is useful.
adam.buchbinder@… 03/20/09 21:02:39 (3 years ago)