Disk SMART

Lord Maverik
Member since 15.04.2003

Here is the info I got for the disk with this command:

smartctl --all /dev/sda

As I understand it, the disk is practically done for?


smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: ST3000DM001-9YN166
Serial Number: Z1F10PYX
LU WWN Device Id: 5 000c50 04e11a801
Firmware Version: CC4H
User Capacity: 3,000,592,982,016 bytes [3.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 4
Local Time is: Wed Feb 4 16:39:56 2015 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 584) seconds.
Offline data collection
capabilities: (0x73) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x3085) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 117 096 006 Pre-fail Always - 154029120
3 Spin_Up_Time 0x0003 094 093 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 13
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 082 060 030 Pre-fail Always - 183908304
9 Power_On_Hours 0x0032 083 083 000 Old_age Always - 15712
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 13
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 185
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x003a 096 096 000 Old_age Always - 4
190 Airflow_Temperature_Cel 0x0022 074 056 045 Old_age Always - 26 (Min/Max 19/44)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 12
193 Load_Cycle_Count 0x0032 073 073 000 Old_age Always - 55130
194 Temperature_Celsius 0x0022 026 044 000 Old_age Always - 26 (0 18 0 0)
197 Current_Pending_Sector 0x0012 100 099 000 Old_age Always - 32
198 Offline_Uncorrectable 0x0010 100 099 000 Old_age Offline - 32
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 30885109837612
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 2838387830817
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 255555644315680


Lord Maverik
Member since 15.04.2003
#1

And the continuation:


SMART Error Log Version: 1
ATA Error Count: 177 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 177 occurred at disk power-on lifetime: 15654 hours (652 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 80 ff ff ff 4f 00 14d+22:12:57.874 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 14d+22:12:57.874 SET FEATURES [Reserved for Serial ATA]
27 00 00 00 00 00 e0 00 14d+22:12:57.874 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 14d+22:12:57.874 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 14d+22:12:57.874 SET FEATURES [Set transfer mode]

Error 176 occurred at disk power-on lifetime: 15654 hours (652 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 80 ff ff ff 4f 00 14d+22:12:55.024 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 14d+22:12:55.024 SET FEATURES [Reserved for Serial ATA]
27 00 00 00 00 00 e0 00 14d+22:12:55.024 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 14d+22:12:55.023 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 14d+22:12:55.023 SET FEATURES [Set transfer mode]

Error 175 occurred at disk power-on lifetime: 15654 hours (652 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 80 ff ff ff 4f 00 14d+22:12:52.182 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 14d+22:12:52.181 SET FEATURES [Reserved for Serial ATA]
27 00 00 00 00 00 e0 00 14d+22:12:52.181 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 14d+22:12:52.181 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 14d+22:12:52.181 SET FEATURES [Set transfer mode]

Error 174 occurred at disk power-on lifetime: 15654 hours (652 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 80 ff ff ff 4f 00 14d+22:12:49.331 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 14d+22:12:49.331 SET FEATURES [Reserved for Serial ATA]
27 00 00 00 00 00 e0 00 14d+22:12:49.331 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 14d+22:12:49.330 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 14d+22:12:49.330 SET FEATURES [Set transfer mode]

Error 173 occurred at disk power-on lifetime: 15654 hours (652 days + 6 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 80 ff ff ff 4f 00 14d+22:12:46.445 READ FPDMA QUEUED
ef 10 02 00 00 00 a0 00 14d+22:12:46.445 SET FEATURES [Reserved for Serial ATA]
27 00 00 00 00 00 e0 00 14d+22:12:46.445 READ NATIVE MAX ADDRESS EXT
ec 00 00 00 00 00 a0 00 14d+22:12:46.445 IDENTIFY DEVICE
ef 03 46 00 00 00 a0 00 14d+22:12:46.445 SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Lord Maverik
Member since 15.04.2003
#2

And here is the information from another host:

root@213 ~ # smartctl --all /dev/sda

smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.2.0-4-amd64] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: WDC WD2000FYYZ-01UL1B1
Serial Number: WD-WCC1P1043885
LU WWN Device Id: 5 0014ee 25ed1fdb8
Firmware Version: 01.01K02
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Size: 512 bytes logical/physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Wed Feb 4 21:24:04 2015 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (24960) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 255) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x70bd) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0
3 Spin_Up_Time 0x0027 100 253 021 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 1
5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 087 087 000 Old_age Always - 9928
10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 1
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 0
193 Load_Cycle_Count 0x0032 200 200 000 Old_age Always - 5
194 Temperature_Celsius 0x0022 100 096 000 Old_age Always - 50
196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 292 -
# 2 Extended offline Completed without error 00% 4 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

What actually bothers me is this, everywhere:

Pre-fail and Old_age

Is this disk about to die too? Or am I misunderstanding everything?

Andron_buton
Member since 19.07.2007
#3

Lord Maverik, these are the critical parameters:

5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
197 Current_Pending_Sector 0x0012 100 099 000 Old_age Always - 32
198 Offline_Uncorrectable 0x0010 100 099 000 Old_age Offline - 32

32 sectors is not fatal, of course, but they have not been reallocated, which means whatever is written on them is unreadable, and whenever the head lands on one of those sectors everything will stall. The first drive should preferably be replaced; you should be able to copy practically everything off it.

The second drive is as good as new and intact.
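To pull out the three counters named above without reading the whole dump by eye, something like the following could work. It is only a sketch: the function name and the "raw value above zero deserves a look" rule are assumptions made for illustration, and the sample rows are copied from the first drive's output in this thread.

```python
import re

# Attribute IDs singled out in this thread as critical.
# Assumption for illustration: any raw value above zero deserves a look.
WATCHED_IDS = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}

# ID, NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)

def watched_raw_values(smartctl_output):
    """Return {attribute_id: raw_value} for the watched attributes."""
    found = {}
    for line in smartctl_output.splitlines():
        match = ROW.match(line)
        if match and int(match.group(1)) in WATCHED_IDS:
            found[int(match.group(1))] = int(match.group(9))
    return found

sample = """\
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
197 Current_Pending_Sector 0x0012 100 099 000 Old_age Always - 32
198 Offline_Uncorrectable 0x0010 100 099 000 Old_age Offline - 32
"""

for attr_id, raw in watched_raw_values(sample).items():
    note = "check this" if raw > 0 else "ok"
    print(attr_id, WATCHED_IDS[attr_id], raw, note)
```

In practice the text would come from `smartctl --all /dev/sda` rather than a pasted sample.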

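Lord Maverik:
What actually bothers me is this, everywhere:
Pre-fail and Old_age
TYPE is the attribute type. It can be critical (Pre-fail), meaning that a failed attribute points to imminent disk failure due to errors, or non-critical (Old_age), meaning that a failed attribute points to the disk reaching the end of its life cycle. The label is a fixed category of the attribute, not its current state: an attribute counts as failed only when VALUE drops to THRESH or below, and that is what the WHEN_FAILED column reports.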
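The difference between an attribute's type and its state can be shown with a short sketch. It assumes the single-space row layout pasted in this thread and treats a threshold of 0 as "no threshold defined"; the function name is made up for the example.

```python
def attribute_state(row):
    """Return (type, failing_now) for one attribute row.

    TYPE is only the category of the attribute (Pre-fail or Old_age).
    Whether it is failing is decided by comparing VALUE with THRESH.
    Assumption: THRESH == 0 means no threshold, so it cannot fail.
    """
    fields = row.split()
    value = int(fields[3])      # normalized VALUE column
    thresh = int(fields[5])     # THRESH column
    attr_type = fields[6]       # "Pre-fail" or "Old_age"
    failing_now = thresh > 0 and value <= thresh
    return attr_type, failing_now

rows = [
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0",
    "10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0",
    "187 Reported_Uncorrect 0x0032 001 001 000 Old_age Always - 185",
]

for row in rows:
    print(row.split()[1], attribute_state(row))
```

Every row here is labelled Pre-fail or Old_age, yet none is failing by this rule, which matches the "-" in the WHEN_FAILED column of both dumps.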

You could also test the first drive some more:

smartctl -t long /dev/sda
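Once the long test finishes, its result lands in the self-test log (the table that the second drive's dump shows under "SMART Self-test log"). A rough sketch for reading that table follows; the regular expression assumes the single-space layout pasted in this thread, and the function name is hypothetical.

```python
import re

# "# <num> <description and status> <remaining>% <lifetime hours> <LBA or ->"
ENTRY = re.compile(r"^#\s*(\d+)\s+(.*?)\s+(\d+)%\s+(\d+)\s+(\S+)\s*$")

def parse_selftest_log(text):
    """Extract the self-test entries from smartctl output."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line)
        if not match:
            continue
        num, desc_status, remaining, hours, lba = match.groups()
        entries.append({
            "num": int(num),
            "passed": "Completed without error" in desc_status,
            "remaining_percent": int(remaining),
            "lifetime_hours": int(hours),
            "first_error_lba": None if lba == "-" else int(lba),
        })
    return entries

# Rows copied from the second drive's dump in this thread.
sample = """\
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 292 -
# 2 Extended offline Completed without error 00% 4 -
"""

for entry in parse_selftest_log(sample):
    print(entry)
```

If a test stops on a read failure, the last column gives the LBA of the first bad sector.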
Lord Maverik
Member since 15.04.2003
#4

Andron_buton, thanks a lot :)

pupseg
Member since 14.05.2010
#5

Guys, when are we going to start charging for "having a look at SMART"? I don't see that the OP has even glanced here: http://ru.wikipedia.org/wiki/S.M.A.R.T.

:)

SMART assessment: $10

logging in to the server and taking a look in person: $20

and so on :)

PS: I'm already tired of posts in this section (which is why I hardly visit it) along the lines of: here are my configs and command output, point to where and what I should write to make it right. Not a single sentence starting with "I read this and that", "having analyzed the logs here and there, I came to the conclusion" or "the situation is unclear to me"; instead, professionals are asked to "look at my configs" and so on.

:)

Andreyka
Member since 19.02.2005
#6

It's more likely the cable/controller than the disk

Entities should not be multiplied without necessity
Andron_buton
Member since 19.07.2007
#7
Andreyka:
It's more likely the cable/controller than the disk

When it's the cable/controller, this counter grows:

199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
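The rule of thumb in this exchange can be written down as a small sketch. It is a deliberately crude heuristic rather than a diagnosis, because a cable or controller fault does not always show up in the CRC counter; the function name and the returned labels are invented for the example.

```python
def likely_culprit(raw):
    """Guess where to look first from raw SMART counters.

    `raw` maps attribute IDs to raw values. Heuristic only:
    199 (UDMA_CRC_Error_Count) growing hints at cable/controller,
    5 / 197 / 198 above zero hint at the disk surface itself.
    """
    crc_errors = raw.get(199, 0)
    media_errors = raw.get(5, 0) + raw.get(197, 0) + raw.get(198, 0)
    if crc_errors and media_errors:
        return "both"
    if crc_errors:
        return "cable/controller"
    if media_errors:
        return "disk surface"
    return "no indication"

# Raw values taken from the first drive's dump in this thread.
first_drive = {5: 0, 197: 32, 198: 32, 199: 0}
print(likely_culprit(first_drive))
```

For the first drive the CRC counter is zero while the pending and uncorrectable counters are not, so by this rule the surface is the first suspect.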
Andreyka
Member since 19.02.2005
#8

Usually yes, but not necessarily
