HDD Smart Tools for LINUX

SmartCTL

yum install smartmontools
Dependencies Resolved

Version Repository Size
==============================================================================
smartmontools x86_64 1:7.0-2.el7 base 546 k
Installing for dependencies:
mailx x86_64 12.5-19.el7 base 245 k

Transaction Summary
===============================================================================

Total download size: 791 k
Installed size: 2.4 M
smartctl --scan

/dev/sda -d scsi # /dev/sda, SCSI device
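On a machine with several drives, the scan output can feed the later checks. The following is a minimal sketch, assuming the scan output keeps the format shown above (device path in the first column); the function name scan_devices is made up for this example.

```shell
# Print the device path of every drive listed by `smartctl --scan`.
# The scan output is read from stdin, so the function can be tried
# on saved output without real hardware.
scan_devices() {
    awk '$1 ~ /^\/dev\// { print $1 }'
}

# Possible use (needs root and smartmontools):
#   smartctl --scan | scan_devices | while read -r dev; do
#       smartctl -H "$dev"
#   done
```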

Check SMART Support Enabled

smartctl /dev/sda -i

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Hitachi Deskstar 7K3000
Device Model: Hitachi HDS723020BLA642
Serial Number: MN1270FA069J4D
LU WWN Device Id: 5 000cca 36ac2ddf7
Firmware Version: MN6OA800
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Oct 13 08:54:04 2020 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
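In a script, the last line of this output is the one that matters. A small sketch, assuming the "SMART support is:" wording shown above; smart_enabled is a hypothetical helper name. If SMART turns out to be disabled, smartctl -s on /dev/sda should switch it on.

```shell
# Succeeds (exit status 0) if the `smartctl -i` text on stdin
# contains the line "SMART support is: Enabled".
smart_enabled() {
    grep -Eq '^SMART support is:[[:space:]]+Enabled'
}

# Possible use (needs root):
#   smartctl -i /dev/sda | smart_enabled || smartctl -s on /dev/sda
```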

Check Supported Tests

smartctl /dev/sda -c

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (19092) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 319) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

Supported tests, according to the capabilities above:

  • Offline surface scan
  • Self-test (short and extended)
  • Selective self-test

The conveyance self-test is not supported on this drive.

Self Test

Check Logs:

smartctl -l selftest /dev/sda

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 16237 -
# 2 Extended offline Completed without error 00% 16219 -
# 3 Extended offline Completed without error 00% 16199 -
# 4 Extended offline Completed without error 00% 16193 -
# 5 Extended offline Completed without error 00% 16176 -
# 6 Short offline Completed without error 00% 11338 -
# 7 Short offline Completed without error 00% 11315 -
# 8 Short offline Completed without error 00% 11290 -
# 9 Short offline Completed without error 00% 11266 -
#10 Short offline Completed without error 00% 11242 -
#11 Short offline Completed without error 00% 11218 -
#12 Short offline Completed without error 00% 11194 -
#13 Short offline Completed without error 00% 11170 -
#14 Short offline Completed without error 00% 11146 -
#15 Short offline Completed without error 00% 11122 -
#16 Short offline Completed without error 00% 11098 -
#17 Short offline Completed without error 00% 11074 -
#18 Short offline Completed without error 00% 11050 -
#19 Short offline Completed without error 00% 11026 -
#20 Short offline Completed without error 00% 11002 -
#21 Short offline Completed without error 00% 10978 -
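With a long log it is easy to overlook a single bad entry. The sketch below prints every log entry whose status is anything other than "Completed without error". That is a deliberately broad filter: it also catches aborted or still running tests, which are not failures. The function name selftest_suspicious is an assumption for this example.

```shell
# Read `smartctl -l selftest` output on stdin and print every log
# entry that is not "Completed without error".
# Exit status: 0 if all entries are clean, 1 if at least one was printed.
selftest_suspicious() {
    awk '
        /^# *[0-9]+ / && !/Completed without error/ { print; n++ }
        END { exit (n > 0) }
    '
}

# Possible use (needs root):
#   smartctl -l selftest /dev/sda | selftest_suspicious
```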

Short Test

smartctl -t short /dev/sda

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Drive command "Execute SMART Short self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 1 minutes for test to complete.
Test will complete after Tue Oct 13 09:13:41 2020

Use smartctl -X to abort test.

Once the test has finished, re-check the log with smartctl -l selftest /dev/sda:

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 51689 -

smartctl -H /dev/sda gives a quick overall health assessment, in this case PASSED:

smartctl -H /dev/sda

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
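For scripting, the exit status of smartctl is often easier to work with than the text. The smartctl man page documents it as a bit mask, and the sketch below decodes that mask into short messages. The messages are paraphrased from the man page, and the function name smartctl_status_flags is made up for this example.

```shell
# Decode the smartctl exit status (a bit mask, bits 0 to 7) into
# one line per set bit. Prints nothing for status 0.
smartctl_status_flags() {
    status=$1
    bit=0
    for msg in \
        "command line did not parse" \
        "device open failed" \
        "a SMART command failed or a checksum error occurred" \
        "SMART status check returned DISK FAILING" \
        "prefail attributes at or below threshold" \
        "attributes were at or below threshold in the past" \
        "device error log contains errors" \
        "self-test log contains errors"
    do
        if [ $(( (status >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $msg"
        fi
        bit=$(( bit + 1 ))
    done
}

# Possible use (needs root):
#   smartctl -H /dev/sda >/dev/null; smartctl_status_flags $?
```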

Check Thresholds

smartctl -a /dev/sda | less

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 65536
2 Throughput_Performance 0x0005 135 135 054 Pre-fail Offline - 86
3 Spin_Up_Time 0x0007 225 225 024 Pre-fail Always - 255 (Average 255)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 24
5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0
8 Seek_Time_Performance 0x0005 138 138 020 Pre-fail Offline - 25
9 Power_On_Hours 0x0012 093 093 000 Old_age Always - 51690
10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 24
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 622
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 622
194 Temperature_Celsius 0x0002 146 146 000 Old_age Always - 41 (Min/Max 25/54)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 2

THRESH is the manufacturer-defined threshold. In most cases, if VALUE drops to or below THRESH, the disk is toast and should be replaced immediately. See Failure Trends in a Large Disk Drive Population.
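The comparison of VALUE against THRESH can be automated. The sketch below assumes the ten-column table layout shown above, best fed from smartctl -A, which prints only the attribute table. It reports FAIL when VALUE is at or below a non-zero THRESH. As an extra heuristic that is not part of smartctl itself, it reports WARN when the raw counters for reallocated, pending or uncorrectable sectors (IDs 5, 197, 198) are above zero. The function name check_attributes is hypothetical.

```shell
# Read a SMART attribute table on stdin and flag suspicious rows.
# Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
check_attributes() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            value = $4 + 0; thresh = $6 + 0; raw = $10 + 0
            if (thresh > 0 && value <= thresh)
                printf "FAIL %s VALUE=%d THRESH=%d\n", $2, value, thresh
            else if (($1 == 5 || $1 == 197 || $1 == 198) && raw > 0)
                printf "WARN %s RAW=%d\n", $2, raw
        }
    '
}

# Possible use (needs root):
#   smartctl -A /dev/sda | check_attributes
```

Fed with the table above, the function should print nothing, since every VALUE is above its THRESH and the three sector counters are zero.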

Long Test

smartctl -t long /dev/sda

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Extended self-test routine immediately in off-line mode".
Drive command "Execute SMART Extended self-test routine immediately in off-line mode" successful.
Testing has begun.
Please wait 319 minutes for test to complete.
Test will complete after Tue Oct 13 15:02:09 2020

Use smartctl -X to abort test.
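Instead of watching the clock, a script can poll the drive. This sketch assumes that smartctl -c reports "Self-test routine in progress" in its "Self-test execution status" field while a test is running; selftest_running is a made-up helper name and the 300 second interval is arbitrary.

```shell
# Succeeds (exit status 0) while the `smartctl -c` text on stdin
# reports a self-test that is still running.
selftest_running() {
    grep -q 'Self-test routine in progress'
}

# Hypothetical wait loop (needs root and a real drive):
#   while smartctl -c /dev/sda | selftest_running; do sleep 300; done
#   smartctl -l selftest /dev/sda
```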

Once the 319 minutes have passed, check the log again:

smartctl -l selftest /dev/sda

smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-1127.19.1.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 51696 -
# 2 Short offline Completed without error 00% 51690 -

Both the short and the extended test completed without error, so the drive itself reports no problems.

Dead Letter

ls -la
-rw-r--r-- 1 root mail 6.7K Sep 10 18:22 dead.letter
cat dead.letter

This is an automatically generated mail message from mdadm
running on CentOS-72-64-minimal

A DegradedArray event had been detected on md device /dev/md/2.

Faithfully yours, etc.

P.S. The /proc/mdstat file currently contains the following:

Personalities : [raid1]
md2 : active raid1 sda3[0]
1451898240 blocks super 1.2 [2/1] [U_]
bitmap: 11/11 pages [44KB], 65536KB chunk

md1 : active raid1 sda2[0]
523712 blocks super 1.2 [2/1] [U_]

md0 : active raid1 sda1[0]
12574720 blocks super 1.2 [2/1] [U_]

unused devices: <none>

Every array shows [2/1] [U_]: only one of the two mirrored drives is still active, so the missing drive has to be replaced.
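The same check can be run directly against /proc/mdstat instead of waiting for a mail from mdadm. A minimal sketch, assuming the mdstat layout shown above, with array names of the form mdN and a status field such as [U_] on a following line. The function name degraded_arrays is hypothetical.

```shell
# Read /proc/mdstat-style text on stdin and print the name of every
# array whose member status field (e.g. [U_]) contains an underscore,
# which marks a missing member.
degraded_arrays() {
    awk '
        /^md[0-9]+ *:/ { name = $1 }
        match($0, /\[[U_]+\]/) {
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    '
}

# Possible use:
#   degraded_arrays < /proc/mdstat
```

For the mdstat contents quoted in the mail, this should list md2, md1 and md0.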