smartd is a daemon that monitors the Self-Monitoring, Analysis and Reporting Technology (SMART) system built into many ATA-3 and later ATA, IDE, and SCSI-3 hard drives. The purpose of SMART is to monitor the reliability of the hard drive, predict drive failures, and carry out different types of drive self-tests. This version of smartd is compatible with ATA/ATAPI-7 and earlier standards.
smartd attempts to enable SMART monitoring on ATA devices (equivalent to smartctl -s on) and polls these and SCSI devices every 30 minutes (configurable), logging SMART errors and changes of SMART Attributes via the SYSLOG interface. The default location for these SYSLOG notifications and warnings is /var/log/messages. To change this default location, use the '-l' command-line option described in the smartd man page.
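To review what smartd has logged, the syslog file can be filtered for its entries. The following is a minimal sketch, assuming the default /var/log/messages location and that entries carry the usual `smartd[PID]` syslog tag; the function name `smartd_log` is made up for illustration.

```shell
#!/bin/bash
# Print smartd entries from a syslog file.
# Assumption: entries are tagged "smartd[PID]:", as is usual for syslog daemons.
# smartd_log is a hypothetical helper name, not part of smartmontools.
smartd_log() {
    local logfile="${1:-/var/log/messages}"   # default location named in the text
    grep -F 'smartd[' "$logfile"
}
```

For example, `smartd_log | tail -n 20` would show the most recent entries, and `smartd_log /path/to/other.log` reads a different file.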
In addition to logging to a file, smartd can also be configured to send email warnings if problems are detected. Depending on the type of problem, you may want to run self-tests on the disk, back up the disk, replace the disk, or use a manufacturer's utility to force reallocation of bad or unreadable disk sectors. If disk problems are detected, consult the smartctl manual page and the smartmontools web page/FAQ for further guidance.
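When a warning arrives, a first manual check is usually to ask the disk for its overall health verdict with smartctl. The sketch below assumes smartctl is installed and is run as root; `check_disk` is a hypothetical helper name, and any device passed to it is only an example.

```shell
#!/bin/bash
# Ask a disk for its overall SMART health verdict.
# Assumptions: smartctl is installed and the caller is root.
# check_disk is a hypothetical helper name, not part of smartmontools.
check_disk() {
    local dev="$1"
    # smartctl exits non-zero when the health check fails or the device
    # cannot be queried.
    if smartctl -H "$dev"; then
        echo "$dev: health check passed"
    else
        echo "$dev: smartctl reported a problem; consider backing up and running a self-test" >&2
        return 1
    fi
}
```

A call might look like `check_disk /dev/sda`. A short self-test can then be started with `smartctl -t short /dev/sda`, and its result read later with `smartctl -l selftest /dev/sda`.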
Service Management
Init.d script location:
/etc/init.d/smartd
Example of "chkconfig --list smartd"
# chkconfig --list smartd
smartd 0:off 1:off 2:on 3:on 4:on 5:on 6:off
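The output lists each runlevel with an on/off flag. As a small illustration of how to read it in a script, the sketch below extracts the runlevels in which the service is enabled; `enabled_runlevels` is a hypothetical helper name.

```shell
#!/bin/bash
# Print the runlevels in which a service is enabled, given one line of
# "chkconfig --list" output such as the smartd line shown above.
# enabled_runlevels is a hypothetical helper name.
enabled_runlevels() {
    local line="$1" field levels=""
    for field in $line; do
        case "$field" in
            # Keep only fields of the form N:on and strip the ":on" suffix.
            [0-6]:on) levels="$levels ${field%%:*}" ;;
        esac
    done
    echo "${levels# }"
}
```

On a live system this could be called as `enabled_runlevels "$(chkconfig --list smartd)"`. If the service is off everywhere, `chkconfig smartd on` enables it for the default runlevels.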
Available service usage options
# service smartd
Usage: /etc/init.d/smartd {start|stop|reload|report|restart|status}
# service smartd start
Starting smartd: [ OK ]
# service smartd stop
Shutting down smartd: [ OK ]
# service smartd status
smartd (pid 4061 2857) is running...
# service smartd restart
Shutting down smartd: [ OK ]
Starting smartd: [ OK ]
# service smartd reload
Reloading smartd daemon configuration: [ OK ]
# service smartd report
Checking SMART devices now: [ OK ]
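These subcommands can be combined in a small wrapper, for instance to start the daemon only when it is not already running. This is a sketch under one assumption: that `service smartd status` exits with status 0 when the daemon is running, which is the usual init-script convention. `ensure_smartd` is a hypothetical name.

```shell
#!/bin/bash
# Start smartd only if the init script reports that it is not running.
# Assumption: "service smartd status" exits 0 when the daemon is running.
# ensure_smartd is a hypothetical helper name.
ensure_smartd() {
    if service smartd status >/dev/null 2>&1; then
        echo "smartd already running"
    else
        service smartd start
    fi
}
```

Calling `ensure_smartd` from a cron job or a provisioning script avoids restarting a daemon that is already up.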
Daemon it starts:
/usr/sbin/smartd
Configuration
RPM packages:
smartmontools-[version]-[release]
Configuration file
/etc/smartd.conf                  ### For CentOS/RHEL 5,6
/etc/smartmontools/smartd.conf    ### For CentOS/RHEL 7
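Because the path differs between releases, a script that needs the file can simply probe both locations. The sketch below does that; the optional root-prefix argument is a hypothetical convenience (for example for inspecting a mounted system image), and `smartd_conf_path` is a made-up name.

```shell
#!/bin/bash
# Print the path of the smartd configuration file that exists on this system.
# The two candidate paths are the ones listed above; the optional ROOT prefix
# argument is a hypothetical convenience, not something smartd itself uses.
smartd_conf_path() {
    local root="${1:-}" candidate
    for candidate in /etc/smartmontools/smartd.conf /etc/smartd.conf; do
        if [ -f "$root$candidate" ]; then
            echo "$root$candidate"
            return 0
        fi
    done
    echo "no smartd.conf found" >&2
    return 1
}
```

Typical use would be `conf=$(smartd_conf_path) && less "$conf"`.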
Example configuration file /etc/smartmontools/smartd.conf
# cat /etc/smartmontools/smartd.conf
# Sample configuration file for smartd. See man smartd.conf.

# Home page is: http://smartmontools.sourceforge.net

# $Id: smartd.conf 3651 2012-10-18 15:11:36Z samm2 $

# smartd will re-read the configuration file if it receives a HUP
# signal

# The file gives a list of devices to monitor using smartd, with one
# device per line. Text after a hash (#) is ignored, and you may use
# spaces and tabs for white space. You may use '\' to continue lines.

# You can usually identify which hard disks are on your system by
# looking in /proc/ide and in /proc/scsi.

# The word DEVICESCAN will cause any remaining lines in this
# configuration file to be ignored: it tells smartd to scan for all
# ATA and SCSI devices. DEVICESCAN may be followed by any of the
# Directives listed below, which will be applied to all devices that
# are found. Most users should comment out DEVICESCAN and explicitly
# list the devices that they wish to monitor.
DEVICESCAN -H -m root -M exec /usr/libexec/smartmontools/smartdnotify -n standby,10,q

# Alternative setting to ignore temperature and power-on hours reports
# in syslog.
#DEVICESCAN -I 194 -I 231 -I 9

# Alternative setting to report more useful raw temperature in syslog.
#DEVICESCAN -R 194 -R 231 -I 9

# Alternative setting to report raw temperature changes >= 5 Celsius
# and min/max temperatures.
#DEVICESCAN -I 194 -I 231 -I 9 -W 5

# First (primary) ATA/IDE hard disk. Monitor all attributes, enable
# automatic online data collection, automatic Attribute autosave, and
# start a short self-test every day between 2-3am, and a long self test
# Saturdays between 3-4am.
#/dev/hda -a -o on -S on -s (S/../.././02|L/../../6/03)

# Monitor SMART status, ATA Error Log, Self-test log, and track
# changes in all attributes except for attribute 194
#/dev/hdb -H -l error -l selftest -t -I 194

# Monitor all attributes except normalized Temperature (usually 194),
# but track Temperature changes >= 4 Celsius, report Temperatures
# >= 45 Celsius and changes in Raw value of Reallocated_Sector_Ct (5).
# Send mail on SMART failures or when Temperature is >= 55 Celsius.
#/dev/hdc -a -I 194 -W 4,45,55 -R 5 -m admin@example.com

# An ATA disk may appear as a SCSI device to the OS. If a SCSI to
# ATA Translation (SAT) layer is between the OS and the device then
# this can be flagged with the '-d sat' option. This situation may
# become common with SATA disks in SAS and FC environments.
# /dev/sda -a -d sat

# A very silent check. Only report SMART health status if it fails
# But send an email in this case
#/dev/hdc -H -C 0 -U 0 -m admin@example.com

# First two SCSI disks. This will monitor everything that smartd can
# monitor. Start extended self-tests Wednesdays between 6-7pm and
# Sundays between 1-2 am
#/dev/sda -d scsi -s L/../../3/18
#/dev/sdb -d scsi -s L/../../7/01

# Monitor 4 ATA disks connected to a 3ware 6/7/8000 controller which uses
# the 3w-xxxx driver. Start long self-tests Sundays between 1-2, 2-3, 3-4,
# and 4-5 am.
# NOTE: starting with the Linux 2.6 kernel series, the /dev/sdX interface
# is DEPRECATED. Use the /dev/tweN character device interface instead.
# For example /dev/twe0, /dev/twe1, and so on.
#/dev/sdc -d 3ware,0 -a -s L/../../7/01
#/dev/sdc -d 3ware,1 -a -s L/../../7/02
#/dev/sdc -d 3ware,2 -a -s L/../../7/03
#/dev/sdc -d 3ware,3 -a -s L/../../7/04

# Monitor 2 ATA disks connected to a 3ware 9000 controller which
# uses the 3w-9xxx driver (Linux, FreeBSD). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
#/dev/twa0 -d 3ware,0 -a -s L/../../2/01
#/dev/twa0 -d 3ware,1 -a -s L/../../2/03

# Monitor 2 SATA (not SAS) disks connected to a 3ware 9000 controller which
# uses the 3w-sas driver (Linux). Start long self-tests Tuesdays
# between 1-2 and 3-4 am.
# On FreeBSD /dev/tws0 should be used instead
#/dev/twl0 -d 3ware,0 -a -s L/../../2/01
#/dev/twl0 -d 3ware,1 -a -s L/../../2/03

# Same as above for Windows. Option '-d 3ware,N' is not necessary,
# disk (port) number is specified in device name.
# NOTE: On Windows, DEVICESCAN works also for 3ware controllers.
#/dev/hdc,0 -a -s L/../../2/01
#/dev/hdc,1 -a -s L/../../2/03

# Monitor 3 ATA disks directly connected to a HighPoint RocketRAID. Start long
# self-tests Sundays between 1-2, 2-3, and 3-4 am.
#/dev/sdd -d hpt,1/1 -a -s L/../../7/01
#/dev/sdd -d hpt,1/2 -a -s L/../../7/02
#/dev/sdd -d hpt,1/3 -a -s L/../../7/03

# Monitor 2 ATA disks connected to the same PMPort which connected to the
# HighPoint RocketRAID. Start long self-tests Tuesdays between 1-2 and 3-4 am
#/dev/sdd -d hpt,1/4/1 -a -s L/../../2/01
#/dev/sdd -d hpt,1/4/2 -a -s L/../../2/03

# HERE IS A LIST OF DIRECTIVES FOR THIS CONFIGURATION FILE.
# PLEASE SEE THE smartd.conf MAN PAGE FOR DETAILS
#
#   -d TYPE Set the device type: ata, scsi, marvell, removable, 3ware,N, hpt,L/M/N
#   -T TYPE set the tolerance to one of: normal, permissive
#   -o VAL  Enable/disable automatic offline tests (on/off)
#   -S VAL  Enable/disable attribute autosave (on/off)
#   -n MODE No check. MODE is one of: never, sleep, standby, idle
#   -H      Monitor SMART Health Status, report if failed
#   -l TYPE Monitor SMART log. Type is one of: error, selftest
#   -f      Monitor for failure of any 'Usage' Attributes
#   -m ADD  Send warning email to ADD for -H, -l error, -l selftest, and -f
#   -M TYPE Modify email warning behavior (see man page)
#   -s REGE Start self-test when type/date matches regular expression (see man page)
#   -p      Report changes in 'Prefailure' Normalized Attributes
#   -u      Report changes in 'Usage' Normalized Attributes
#   -t      Equivalent to -p and -u Directives
#   -r ID   Also report Raw values of Attribute ID with -p, -u or -t
#   -R ID   Track changes in Attribute ID Raw value with -p, -u or -t
#   -i ID   Ignore Attribute ID for -f Directive
#   -I ID   Ignore Attribute ID for -p, -u or -t Directive
#   -C ID   Report if Current Pending Sector count non-zero
#   -U ID   Report if Offline Uncorrectable count non-zero
#   -W D,I,C Monitor Temperature D)ifference, I)nformal limit, C)ritical limit
#   -v N,ST Modifies labeling of Attribute N (see man page)
#   -a      Default: equivalent to -H -f -t -l error -l selftest -C 197 -U 198
#   -F TYPE Use firmware bug workaround. Type is one of: none, samsung
#   -P TYPE Drive-specific presets: use, ignore, show, showall
#    #      Comment: text after a hash sign is ignored
#    \      Line continuation character
# Attribute ID is a decimal integer 1 <= ID <= 255
# except for -C and -U, where ID = 0 turns them off.
# All but -d, -m and -M Directives are only implemented for ATA devices
#
# If the test string DEVICESCAN is the first uncommented text
# then smartd will scan for devices /dev/hd[a-l] and /dev/sd[a-z]
# DEVICESCAN may be followed by any desired Directives.
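Since most of the sample file is commentary, it can help to see only the lines that smartd will actually act on. The following sketch filters out blank lines and full-line comments; it does not strip trailing comments or join lines continued with a backslash, and `active_directives` is a hypothetical helper name.

```shell
#!/bin/bash
# Print only the active lines of a smartd.conf-style file: drop blank lines
# and lines whose first non-blank character is a hash.
# Limitation: trailing comments and backslash continuations are left as-is.
# active_directives is a hypothetical helper name.
active_directives() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

Run against the sample file above, `active_directives /etc/smartmontools/smartd.conf` would print just the DEVICESCAN line. After editing the file, `service smartd reload` makes the daemon re-read it.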
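The -s expressions in the sample file are matched against a string of the form T/MM/DD/d/HH (test type, month, day of month, weekday 1-7 with Monday as 1, hour). To check what a given expression would select before putting it into smartd.conf, it can be tried out with grep. This is a sketch only: it assumes the expression is a POSIX extended regular expression that must match the whole string, and `selftest_due` is a hypothetical helper name.

```shell
#!/bin/bash
# Check whether a -s schedule expression selects a self-test at a given time.
# Arguments: REGEX, test type letter (e.g. S or L), and optionally a time
# stamp in MM/DD/d/HH form (defaults to the current time).
# Assumptions: extended regex syntax and a whole-string match.
# selftest_due is a hypothetical helper name, not part of smartmontools.
selftest_due() {
    local regex="$1" type="$2" stamp="${3:-$(date +%m/%d/%u/%H)}"
    printf '%s\n' "$type/$stamp" | grep -Eqx -- "$regex"
}
```

For example, `selftest_due '(S/../.././02|L/../../6/03)' L && echo "long test due now"` tests the first sample expression against the current hour.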