Do SSDs in a RAID die at the same time?

Win33
Member since 03.10.2009

Hello. About 3 years ago I rented a server at online.net with 3 SSDs on board. The SMART readings are not encouraging: is the endurance running out, or is it something else?

These disks are in RAID 1.

smartctl --all /dev/sda
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-514.16.1.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: Crucial/Micron BX/MX1/2/3/500, M5/600, 1100 SSDs
Device Model: Micron_1100_MTFDDAK512TBN
Serial Number: 170315772A2E
LU WWN Device Id: 5 00a075 115772a2e
Firmware Version: M0MU020
User Capacity: 512,110,190,592 bytes [512 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-3 (unknown minor revision code: 0x006d)
SATA Version is: SATA >3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Fri Nov 1 12:44:33 2019 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x04) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 433) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 7) minutes.
Conveyance self-test routine
recommended polling time: ( 3) minutes.
SCT capabilities: (0x0035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 100 100 000 Pre-fail Always - 0
5 Reallocate_NAND_Blk_Cnt 0x0032 100 100 010 Old_age Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 22664
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 21
171 Program_Fail_Count 0x0032 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
173 Ave_Block-Erase_Count 0x0032 068 068 000 Old_age Always - 489
174 Unexpect_Power_Loss_Ct 0x0032 100 100 000 Old_age Always - 19
183 SATA_Interfac_Downshift 0x0032 100 100 000 Old_age Always - 0
184 Error_Correction_Count 0x0032 100 100 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
194 Temperature_Celsius 0x0022 059 053 000 Old_age Always - 41 (Min/Max 24/47)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0032 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x0032 100 100 000 Old_age Always - 0
202 Percent_Lifetime_Remain 0x0030 068 068 001 Old_age Offline - 32
206 Write_Error_Rate 0x000e 100 100 000 Old_age Always - 0
246 Total_LBAs_Written 0x0032 100 100 000 Old_age Always - 159554855861
247 Host_Program_Page_Count 0x0032 100 100 000 Old_age Always - 5036176897
248 FTL_Program_Page_Count 0x0032 100 100 000 Old_age Always - 7965603143
180 Unused_Reserve_NAND_Blk 0x0033 000 000 000 Pre-fail Always - 2469
210 Success_RAIN_Recov_Cnt 0x0032 100 100 000 Old_age Always - 0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Vendor (0xff) Completed without error 00% 22663 -
# 2 Vendor (0xff) Completed without error 00% 22094 -
# 3 Vendor (0xff) Completed without error 00% 21455 -
# 4 Vendor (0xff) Completed without error 00% 20724 -
# 5 Vendor (0xff) Completed without error 00% 19973 -
# 6 Vendor (0xff) Completed without error 00% 19451 -
# 7 Vendor (0xff) Completed without error 00% 18964 -
# 8 Vendor (0xff) Completed without error 00% 18395 -
# 9 Vendor (0xff) Completed without error 00% 17851 -
#10 Vendor (0xff) Completed without error 00% 17068 -
#11 Vendor (0xff) Completed without error 00% 16597 -
#12 Vendor (0xff) Completed without error 00% 16181 -
#13 Vendor (0xff) Completed without error 00% 15722 -
#14 Vendor (0xff) Completed without error 00% 15106 -
#15 Vendor (0xff) Completed without error 00% 14395 -
#16 Vendor (0xff) Completed without error 00% 13652 -
#17 Vendor (0xff) Completed without error 00% 13061 -
#18 Vendor (0xff) Completed without error 00% 12605 -
#19 Vendor (0xff) Completed without error 00% 12060 -
#20 Vendor (0xff) Completed without error 00% 11297 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The second disk has roughly the same readings.

I'm interested in the LifeTime(hours) fields: does that roughly show how many hours are left?

Mik Foxi
Member since 02.03.2011
#1
Win33:

I'm interested in the LifeTime(hours) fields: does that roughly show how many hours are left?

That's how many hours it has already lived. Otherwise the disk looks fine.

Win33
Member since 03.10.2009
#2
foxi:
That's how many hours it has already lived. Otherwise the disk looks fine.

But the hours it has lived are in Power_On_Hours: 22664.

And a year ago there was no such error: SMART overall-health self-assessment test result: PASSED

lonelywoolf
Member since 23.12.2013
#3
Win33:
And a year ago there was no such error: SMART overall-health self-assessment test result: PASSED

Do you need that phrase translated, or can you manage it yourself?

LEOnidUKG
Member since 25.11.2006
#4

SSDs don't die the way HDDs do.

They either drop into read-only mode or simply shut off for good.

S2
Member since 30.12.2015
#5

The disk is fine, it will last another 3 years.
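
For a rough sanity check of an estimate like this, here is a minimal sketch that extrapolates wear linearly from the first smartctl output. It assumes the raw value 32 of attribute 202 means "percent of lifetime used" (the normalized value 068 being what remains) and that wear keeps accumulating at the same average rate. The function name is made up for illustration, and the result covers rated flash wear only.

```python
HOURS_PER_YEAR = 24 * 365


def projected_hours_left(power_on_hours: int, percent_used: float) -> float:
    """Linear extrapolation: hours until 100% wear at the average rate so far."""
    if percent_used <= 0:
        raise ValueError("no wear recorded yet, nothing to extrapolate from")
    hours_per_percent = power_on_hours / percent_used
    return hours_per_percent * (100 - percent_used)


# Values from the first smartctl output: 9 Power_On_Hours and the raw value of
# 202 Percent_Lifetime_Remain (read here as "percent used", an assumption).
age_years = 22664 / HOURS_PER_YEAR
left_years = projected_hours_left(22664, 32) / HOURS_PER_YEAR
print(f"age {age_years:.1f} years, projected {left_years:.1f} more years")
```

Under these assumptions the drive is a little over two and a half years old and the projection comes out at more than five further years, so "another 3 years" is on the cautious side as far as rated wear goes.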

VO
Member since 27.07.2008
#6
Sector Size: 512 bytes
Total_LBAs_Written 159554855861

Multiply the first by the second and you get the number of bytes written.

Then divide by 1024 four times and you get ~74 TB.

The rated endurance for your model is 240 TB, and about 74 TB has been written so far.

So roughly a third of the endurance has been used up.
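
The arithmetic above can be checked with a short Python sketch. The 512-byte sector size and the LBA counter come from the SMART output; the 240 TB rating is the figure quoted in this post rather than something read from the drive, and the function name is made up for illustration.

```python
def bytes_written(total_lbas_written: int, sector_size: int = 512) -> int:
    """Host bytes written, from SMART attribute 246 and the logical sector size."""
    return total_lbas_written * sector_size


written = bytes_written(159_554_855_861)  # raw value of 246 Total_LBAs_Written
written_tib = written / 1024**4           # "divide by 1024 four times" (binary units)
written_tb = written / 1000**4            # decimal terabytes, as endurance is usually rated

rated_tbw = 240                           # TB, figure quoted in the thread (assumption)
print(f"{written_tib:.1f} TiB = {written_tb:.1f} TB written")
print(f"{written_tb / rated_tbw:.0%} of the rated {rated_tbw} TB")
```

Dividing by 1024 gives binary units (about 74 TiB); in decimal terabytes the same amount is a bit under 82 TB. Either way it works out to about a third of 240 TB, in line with the raw value 32 that attribute 202 reports.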

Win33
Member since 03.10.2009
#7

Whoa, thanks, Uncle Vova.

globalmoney
Member since 09.12.2005
#8
LEOnidUKG:
They either drop into read-only mode or simply shut off for good.

The latter happens more often, even when more than 50% of the endurance is still left. :(

baas
Member since 17.09.2012
#9
V(o)ViK:
Multiply the first by the second and you get the number of bytes written.
Then divide by 1024 four times and you get ~74 TB.
The rated endurance for your model is 240 TB, and about 74 TB has been written so far.
So roughly a third of the endurance has been used up.

And what if SMART doesn't show this counter, 246 Total_LBAs_Written?

smartctl --all /dev/sda
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-4.9.192-gentoo] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family: SandForce Driven SSDs
Device Model: KINGSTON SKC300S37A120G
Serial Number: 50026B723C0C8A9D
LU WWN Device Id: 5 0026b7 23c0c8a9d
Firmware Version: 605ABBF0
User Capacity: 120 034 123 776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: In smartctl database [for details use: -P show]
ATA Version is: ATA8-ACS, ACS-2 T13/2015-D revision 3
SATA Version is: SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Thu Nov 7 08:39:30 2019 +03
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7d) SMART execute Offline immediate.
No Auto Offline data collection support.
Abort Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 48) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x0025) SCT Status supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x0033 120 120 050 Pre-fail Always - 0/0
5 Retired_Block_Count 0x0033 100 100 003 Pre-fail Always - 0
9 Power_On_Hours_and_Msec 0x0032 085 085 000 Old_age Always - 13950h+08m+38.540s
12 Power_Cycle_Count 0x0032 095 095 000 Old_age Always - 5754
13 Soft_Read_Error_Rate 0x0032 120 120 000 Old_age Always - 0/0
100 Gigabytes_Erased 0x0032 000 000 000 Old_age Always - 14560
170 Reserve_Block_Count 0x0058 000 000 000 Old_age Offline - 3424
171 Program_Fail_Count 0x000a 100 100 000 Old_age Always - 0
172 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
174 Unexpect_Power_Loss_Ct 0x0030 000 000 000 Old_age Offline - 1278
177 Wear_Range_Delta 0x0000 000 000 000 Old_age Offline - 1
181 Program_Fail_Count 0x000a 100 100 000 Old_age Always - 0
182 Erase_Fail_Count 0x0032 100 100 000 Old_age Always - 0
184 IO_Error_Detect_Code_Ct 0x005c 100 100 090 Old_age Offline - 0
187 Reported_Uncorrect 0x0012 100 100 000 Old_age Always - 0
189 Airflow_Temperature_Cel 0x0000 027 037 000 Old_age Offline - 27 (Min/Max -21/37)
194 Temperature_Celsius 0x0022 027 037 000 Old_age Always - 27 (Min/Max -21/37)
195 ECC_Uncorr_Error_Count 0x001c 120 120 000 Old_age Offline - 0/0
196 Reallocated_Event_Count 0x0033 100 100 003 Pre-fail Always - 0
198 Uncorrectable_Sector_Ct 0x0010 120 120 000 Old_age Offline - 0/0
199 SATA_CRC_Error_Count 0x00b0 200 200 000 Old_age Offline - 0
201 Unc_Soft_Read_Err_Rate 0x001c 120 120 000 Old_age Offline - 0/0
204 Soft_ECC_Correct_Rate 0x001c 120 120 000 Old_age Offline - 0/0
230 Life_Curve_Status 0x0013 100 100 000 Pre-fail Always - 100
231 SSD_Life_Left 0x0013 097 097 010 Pre-fail Always - 1
232 Available_Reservd_Space 0x0032 000 000 000 Old_age Always - 13
233 SandForce_Internal 0x0032 000 000 000 Old_age Always - 11864
234 SandForce_Internal 0x0032 000 000 000 Old_age Always - 6508
241 Lifetime_Writes_GiB 0x0032 000 000 000 Old_age Always - 6508
242 Lifetime_Reads_GiB 0x0032 000 000 000 Old_age Always - 5323

SMART Error Log not supported

SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


---------- Added 07.11.2019 at 08:53 ----------

Looks like I found it.
100 Gigabytes_Erased 0x0032 000 000 000 Old_age Always - 14560

14.5 TB erased.

The rated endurance of this disk is 290 TB.

What I don't understand is where so many terabytes came from: it's a home system.

It runs Gentoo Linux, and I rebuild the system fairly often.

Package rebuilds happen in RAM and on a regular HDD.

Large files are also downloaded to a separate HDD.

Windows lives on the SSD, but I hardly ever use it, only occasionally to check sites for viruses.

The SSD has served faithfully since 2014, so it will soon have been in service for 5 years.

lonelywoolf
Member since 23.12.2013
#10
baas:
What I don't understand is where so many terabytes came from: it's a home system.

Write amplification. You need to look at TBW, and that counts writes without amplification. Lifetime, on the other hand, is usually calculated with amplification taken into account.
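
To put rough numbers on write amplification for the two SATA drives in this thread, here is a minimal sketch. For the Micron, the formula (host pages + FTL pages) / host pages over attributes 247 and 248 is the one commonly cited from Micron's documentation; treat it as an assumption here. For the SandForce-based Kingston, dividing Gigabytes_Erased by Lifetime_Writes_GiB is only a rough proxy, since flash erased and host data written are not the same quantity. The function name is made up for illustration.

```python
def waf_from_page_counts(host_pages: int, ftl_pages: int) -> float:
    """Write amplification as (host + FTL program pages) / host program pages."""
    return (host_pages + ftl_pages) / host_pages


# Micron 1100: raw values of 247 Host_Program_Page_Count and 248 FTL_Program_Page_Count.
micron_waf = waf_from_page_counts(5_036_176_897, 7_965_603_143)

# Kingston KC300 (SandForce): 100 Gigabytes_Erased over 241 Lifetime_Writes_GiB.
# Rough proxy only: it compares flash erased with host data written.
kingston_ratio = 14_560 / 6_508

print(f"Micron WAF ~ {micron_waf:.2f}, Kingston erase/write ratio ~ {kingston_ratio:.2f}")
```

Both figures land between 2 and 3, which would explain why the erased total on the home system is more than twice what the host actually wrote.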

In other words, once the rated endurance is used up, most likely nothing will happen to the SSD. Here is the SMART output from one of mine:

[root@geri ~]# smartctl -a /dev/nvme1
smartctl 7.0 2018-12-30 r4883 [x86_64-linux-3.10.0-957.21.3.el7.x86_64] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number: KXG50ZNV512G TOSHIBA
Serial Number: 48TS100JTYST
Firmware Version: AAGA4102
PCI Vendor/Subsystem ID: 0x1179
IEEE OUI Identifier: 0x00080d
Total NVM Capacity: 512,110,190,592 [512 GB]
Unallocated NVM Capacity: 0
Controller ID: 0
Number of Namespaces: 1
Namespace 1 Size/Capacity: 512,110,190,592 [512 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 00080d 0200460fc5
Local Time is: Thu Nov 7 11:45:07 2019 MSK
Firmware Updates (0x14): 2 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 78 Celsius
Critical Comp. Temp. Threshold: 82 Celsius
Namespace 1 Features (0x02): NA_Fields

Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 6.00W - - 0 0 0 0 0 0
1 + 2.40W - - 1 1 1 1 0 0
2 + 1.90W - - 2 2 2 2 0 0
3 - 0.0500W - - 3 3 3 3 1500 1500
4 - 0.0050W - - 4 4 4 4 6000 14000
5 - 0.0030W - - 5 5 5 5 50000 80000

Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 2
1 - 4096 0 1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 44 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 237%
Data Units Read: 362,009,808 [185 TB]
Data Units Written: 798,214,872 [408 TB]
Host Read Commands: 4,508,078,096
Host Write Commands: 4,878,999,702
Controller Busy Time: 94,391
Power Cycles: 8
Power On Hours: 8,743
Unsafe Shutdowns: 4
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 44 Celsius

Error Information (NVMe Log 0x01, max 128 entries)
No Errors Logged

[root@geri ~]#
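
The NVMe counters above can be converted by hand as well. This is a minimal sketch: the NVMe SMART log counts data in units of 1000 x 512 bytes, and the "implied rating" at the end simply assumes that Percentage Used scales with host writes. That is an assumption, since the drive's own estimate may be based on erase cycles instead. The function name is made up for illustration.

```python
DATA_UNIT_BYTES = 512 * 1000  # NVMe SMART log counts in units of 1000 x 512 bytes


def data_units_to_tb(units: int) -> float:
    """Convert an NVMe 'Data Units' counter to decimal terabytes."""
    return units * DATA_UNIT_BYTES / 1000**4


written_tb = data_units_to_tb(798_214_872)  # Data Units Written
read_tb = data_units_to_tb(362_009_808)     # Data Units Read
percentage_used = 237                       # Percentage Used, as reported

# If 237% corresponds to the data written so far, 100% would be roughly:
implied_rating_tb = written_tb / (percentage_used / 100)
print(f"written {written_tb:.0f} TB, read {read_tb:.0f} TB")
print(f"implied endurance rating ~ {implied_rating_tb:.0f} TB")
```

The converted values match the bracketed figures smartctl prints. The drive is well past 100% of its estimated endurance and still reports Available Spare at 100% with no media errors.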