Try upgrading to 8.2-RC2 - Server Administration

ZFS cache usage (ARC)

Mage1 · 2011-01-07T17:20:45.0000000Z

Hello. I have a server with 12 GB of RAM. Page generation reads text files from disk, about 11 GB in total. Profiling shows that the scripts spend most of their time reading those text files (the disks are 3x SATA in a ZFS raidz array). I want to give 6 GB of RAM to the ZFS cache, expecting the most frequently used files to end up in RAM and the average time to improve. The filesystem holding MySQL has primarycache=metadata. I set vfs.zfs.arc_max to 6G, but only about 1400 MB is used. Meanwhile, reading 20 files of 50 KB each takes up to 2 seconds at peak hours; re-reading the same 20 files is no faster, so evidently they are not getting into the cache. Can anyone suggest how to make ZFS use the cache more fully? Here is the output of the arc_summary.pl script:

------------------------------------------------------------------------
System Summary                          Fri Jan 7 20:20:39 2011
FreeBSD 8.0-STABLE #1: Tue Mar 9 12:36:22 UTC 2010 root

Kernel Version: 800504 (osreldate)
Hardware Platform: amd64
Processor Architecture: amd64

8:20PM up 10 days, 2:35, 2 users, load averages: 0.22, 0.20, 0.17
------------------------------------------------------------------------
Physical Memory: 12263.99M
Page Size: 4096
Kernel Memory
    TOTAL: 1135.77M
    DATA: 98.69% 1120.88M
    TEXT: 1.31% 14.89M

ARC Summary
    Storage pool Version: 14 (spa)
    Filesystem Version: 3 (zpl)
    Memory Throttle Count: 1084

ARC Misc:
    Deleted: 46591582
    Recycle Misses: 53124329
    Mutex Misses: 346671
    Evict Skips: 346671

ARC Size:
    Current Size: 22.69% 1393.97M (arcsize)
    Target Size: (Adaptive) 16.67% 1024.00M (c)
    Min Size (Hard Limit): 16.67% 1024.00M (c_min)
    Max Size (High Water): ~6:1 6144.00M (c_max)

ARC Size Breakdown:
    Recently Used Cache Size: 73.46% 1024.00M (p)
    Frequently Used Cache Size: 26.54% 369.97M (arcsize-p)

ARC Hash Breakdown:
    Elements Max: 444308
    Elements Current: 15.23% 67687
    Collisions: 17802432
    Chain Max: 11
    Chains: 7356

ARC Efficiency:
    Cache Access Total: 564059878
    Cache Hit Ratio: 86.23% 486362988
    Cache Miss Ratio: 13.77% 77696890
    Actual Hit Ratio: 84.51% 476669702

    Data Demand Efficiency: 90.47%
    Data Prefetch Efficiency: 21.45%

    CACHE HITS BY CACHE LIST:
        Anonymous: --% Counter Rolled.
        Most Recently Used: 23.26% 113124076 (mru)
        Most Frequently Used: 74.75% 363545626 (mfu)
        Most Recently Used Ghost: 2.40% 11652972 (mru_ghost)
        Most Frequently Used Ghost: 6.62% 32174703 (mfu_ghost)

    CACHE HITS BY DATA TYPE:
        Demand Data: 82.31% 400348953
        Prefetch Data: 0.70% 3423529
        Demand Metadata: 15.48% 75282378
        Prefetch Metadata: 1.50% 7308128

    CACHE MISSES BY DATA TYPE:
        Demand Data: 54.28% 42172596
        Prefetch Data: 16.14% 12537851
        Demand Metadata: 22.62% 17574445
        Prefetch Metadata: 6.97% 5411998

L2 ARC Stats: (enabled with access > 0) 0

VDEV Cache Summary
    Access Total: 42402903
    Hits Ratio: 33.53% 14219105
    Miss Ratio: 66.47% 28183798
    Delegations: 1071278

ZFS Tunable (sysctl):
    kern.maxusers=384
    vfs.zfs.arc_meta_limit=1610612736
    vfs.zfs.arc_meta_used=1456834896
    vfs.zfs.mdcomp_disable=0
    vfs.zfs.arc_min=1073741824
    vfs.zfs.arc_max=6442450944
    vfs.zfs.zfetch.array_rd_sz=1048576
    vfs.zfs.zfetch.block_cap=256
    vfs.zfs.zfetch.min_sec_reap=2
    vfs.zfs.zfetch.max_streams=8
    vfs.zfs.prefetch_disable=0
    vfs.zfs.recover=0
    vfs.zfs.txg.synctime=5
    vfs.zfs.txg.timeout=30
    vfs.zfs.scrub_limit=10
    vfs.zfs.vdev.cache.bshift=16
    vfs.zfs.vdev.cache.size=10485760
    vfs.zfs.vdev.cache.max=16384
    vfs.zfs.vdev.aggregation_limit=131072
    vfs.zfs.vdev.ramp_rate=2
    vfs.zfs.vdev.time_shift=6
    vfs.zfs.vdev.min_pending=4
    vfs.zfs.vdev.max_pending=35
    vfs.zfs.cache_flush_disable=1
    vfs.zfs.zil_disable=0
    vfs.zfs.version.zpl=3
    vfs.zfs.version.vdev_boot=1
    vfs.zfs.version.spa=14
    vfs.zfs.version.dmu_backup_stream=1
    vfs.zfs.version.dmu_backup_header=2
    vfs.zfs.version.acl=1
    vfs.zfs.debug=0
    vfs.zfs.super_owner=0
    vm.kmem_size=8589934592
    vm.kmem_size_scale=3
    vm.kmem_size_min=0
    vm.kmem_size_max=8589934592

One option is to cache the files myself, for example with memcached, but it seems that if the filesystem already has this capability, it should work faster and without kludges.
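For orientation, the byte values in the tunables above can be converted to MiB and set against the sizes the report prints. The following is only an arithmetic sketch in Python over numbers copied from the post; the helper names `to_mib` and `percent` are made up for illustration, and the result is not a diagnosis.

```python
# Values copied from the arc_summary.pl output in the post above.
MIB = 1024 * 1024

tunables = {
    "vfs.zfs.arc_min": 1073741824,
    "vfs.zfs.arc_max": 6442450944,
    "vfs.zfs.arc_meta_limit": 1610612736,
    "vfs.zfs.arc_meta_used": 1456834896,
    "vm.kmem_size": 8589934592,
}
arc_size_mib = 1393.97  # "Current Size" (arcsize) as printed in the report


def to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB (hypothetical helper)."""
    return n_bytes / MIB


def percent(part: float, whole: float) -> float:
    """Return part as a percentage of whole (hypothetical helper)."""
    return 100.0 * part / whole


for name, value in tunables.items():
    print(f"{name:24s} {to_mib(value):9.2f} MiB")

arc_max_mib = to_mib(tunables["vfs.zfs.arc_max"])
meta_used_mib = to_mib(tunables["vfs.zfs.arc_meta_used"])
meta_limit_mib = to_mib(tunables["vfs.zfs.arc_meta_limit"])

# How full is the ARC relative to its configured maximum?
print(f"arcsize / arc_max:              {percent(arc_size_mib, arc_max_mib):6.2f} %")
# How much of the current ARC is accounted for by arc_meta_used?
print(f"arc_meta_used / arcsize:        {percent(meta_used_mib, arc_size_mib):6.2f} %")
# How close is arc_meta_used to its limit?
print(f"arc_meta_used / arc_meta_limit: {percent(meta_used_mib, meta_limit_mib):6.2f} %")
```

With the posted numbers, arc_meta_used comes out close to the whole current ARC size, and the adaptive target (c) sits at the configured minimum. Whether that explains the missing data caching is not established anywhere in the thread.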

Andreyka

17 January 2011, 08:30

#11

ZFS is crap?

Entities should not be multiplied without necessity.

Boris A Dolgov

17 January 2011, 14:31

#12

Don't feed, don't feed :)

Mage1

17 January 2011, 16:28

#13

Andreyka:
ZFS is crap?

Storage pool Version: 14 (spa)

vlad11

19 January 2011, 11:33

#14

Mage1:
Storage pool Version: 14 (spa)

Don't feed the troll.

rtyug

20 January 2011, 00:17

#15

Andreyka:
ZFS is crap?

The OP has a great many files, really a lot, and as I understand it they all need to be served at once, in one piece...

That puts a heavy load on the kernel and the filesystem (I'm not sure how to put it exactly; it all comes down to disk I/O).

An RDBMS would emulate the storage medium; network storage (with replication) is another option.

Listing everything would take a long time; there is a lot one could write :)

http://www.khmere.com/freebsd_book/html/ch06.html

6.1 Advanced I/O and Process Resources

As we have seen from the previous chapter, programs can have multiple file descriptors open at the same time. These file descriptors aren't necessarily files, but can be fifos, pipes, or sockets. As such, the ability to multiplex these open descriptors becomes important. For example, consider a simple mail reader program, like pine. It should of course allow the user to not only read and write email, but also simultaneously check for new email. That means the program receives input from at least two sources at any given point: one source is the user, the other is the descriptor checking for new mail. Handling multiplexing descriptors is a complex issue. One method is to mark all open descriptors non-blocking (O_NONBLOCK), and then loop through them until one is found that will allow an I/O operation. The problem with this approach is that the program constantly loops, and if no I/O is available for a long time, the process will tie up the CPU. Your CPU load only worsens when multiple processes are looping on a small set of descriptors.

Another approach is to set signal handlers to catch when I/O is available, and then put the process to sleep. This sounds good in theory, if you only have a few open descriptors and infrequent I/O requests. Because the process is sleeping, it will not tie up the CPU, and it will then only execute when I/O is available. However, the problem with this approach is that signal handling is somewhat expensive. For example, a web server receiving 100 requests per minute would need to catch signals almost constantly. The overhead from catching hundreds of signals per minute would be significant, not only for the process but for the kernel to send these signals as well.

So far, both options are limited and ineffective, with the common problem being that a process needs to know when I/O is available. However, this information is actually only known in advance by the kernel, because the kernel ultimately handles all the open descriptors on the system. For example, when a process sends data over a fifo to another, the sending process calls write, which is a system call and thus involves the kernel. The receiver will not know until the write system call is executed by the sender. So, a better way to multiplex file descriptors suggests itself: have the kernel manage it for the process. In other words, send the kernel a list of open descriptors and then wait until the kernel has one or more ready, or until a time-out has expired.

This is the approach taken by the select(), poll() and kqueue() interfaces. Through these, the kernel will manage the descriptors and awake the process when I/O is available. These interfaces elegantly handle the problems mentioned above. The process doesn't need to loop through the open descriptors, nor does it need to handle signals. The process will still incur a slight overhead, however, when using these functions. This is because the I/O operations are executed after the return from these interfaces. Thus it takes at least two system calls to perform any operation. For example, say your program has two descriptors used for reading. You use select and wait for them to have data to read. This requires the process to first call select, and then when select returns, to call read on the descriptor. More ideally, you could do a large blocking read against all of the open descriptors. Once one is ready to read, the read will return with the data inside the buffer and an indication of which descriptor the data was read from.
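The two-system-call pattern described above (wait in select, then read) can be sketched in Python on a Unix-like system. The pipes stand in for the "two descriptors used for reading", and `wait_and_read` is a hypothetical helper name; this is an illustration, not code from the quoted book.

```python
import os
import select

# Two pipes stand in for the two descriptors the program reads from.
r1, w1 = os.pipe()
r2, w2 = os.pipe()


def wait_and_read(read_fds, timeout=1.0):
    """Block in select() until a descriptor is readable, then read() it.

    Returns a list of (fd, data) pairs; the list is empty on timeout.
    """
    # System call 1: ask the kernel which descriptors are ready.
    readable, _, _ = select.select(read_fds, [], [], timeout)
    # System call 2 (one per ready descriptor): actually fetch the data.
    return [(fd, os.read(fd, 4096)) for fd in readable]


# Make only the second pipe readable, then multiplex over both.
os.write(w2, b"new mail")
results = wait_and_read([r1, r2])
print(results)

for fd in (r1, w1, r2, w2):
    os.close(fd)
```

Only the descriptor that received data is returned, but the full list still has to be handed to the kernel on every call.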

6.4 kqueue

So far, poll and select seem like elegant ways to multiplex file descriptors. To use either of those two functions, however, you need to create the list of descriptors, then send them to the kernel, and then upon return, look through the list again. That seems a bit inefficient. A better model would be to give the descriptor list to the kernel and then wait. Once one or more events happen, the kernel can notify the process with a list of only the file descriptors which had events, avoiding loops through the entire list every time a function returns. Although this small gain is not noticeable if the process only has a few descriptors open, for programs with thousands of open file descriptors, the performance gains are significant. This was one of the main goals behind the creation of kqueue. Also, the designers wanted a process to be able to detect a wider variety of events, such as file modification, file deletion, signals delivered, or a child process exit, with a flexible function call that subsumed other tasks. Handling signals, multiplexing file descriptors, and waiting for child processes can all be wrapped into this single kqueue interface because they are all waiting for an event to occur.
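The register-once, get-back-only-ready-descriptors model can be sketched with Python's standard `selectors` module. This swaps in a portable wrapper rather than calling kqueue directly: `selectors.DefaultSelector` is typically kqueue-based on FreeBSD and macOS and epoll-based on Linux. The pipe count and payload are arbitrary choices for the demo.

```python
import os
import selectors

# Picks the most efficient mechanism the platform offers
# (typically kqueue on FreeBSD/macOS, epoll on Linux).
sel = selectors.DefaultSelector()

# Register many descriptors once; the kernel keeps the interest list.
pipes = [os.pipe() for _ in range(100)]
for index, (read_fd, _write_fd) in enumerate(pipes):
    sel.register(read_fd, selectors.EVENT_READ, data=index)

# Make exactly one of the hundred descriptors readable.
os.write(pipes[42][1], b"ping")

# select() returns only the descriptors that have events,
# so there is no scan over the other 99.
ready = sel.select(timeout=1.0)
events = [(key.data, os.read(key.fd, 4096)) for key, _mask in ready]
print(type(sel).__name__, events)

sel.close()
for read_fd, write_fd in pipes:
    os.close(read_fd)
    os.close(write_fd)
```

The file-modification, signal, and child-exit events mentioned in the excerpt are kqueue-specific and are not covered by this portable wrapper.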

Andreyka

20 January 2011, 11:30

#16

Since when is 220k files a lot?

Unless, of course, you keep them all in one directory.

Or is that just a lot for ZFS?

vlad11

20 January 2011, 18:12

#17

rtyug:
The OP has a great many files, really a lot, and as I understand it they all need to be served at once, in one piece...

That puts a heavy load on the kernel and the filesystem (I'm not sure how to put it exactly; it all comes down to disk I/O).
An RDBMS would emulate the storage medium; network storage (with replication) is another option.
Listing everything would take a long time; there is a lot one could write :)

http://www.khmere.com/freebsd_book/html/ch06.html

Don't mislead people.

In production I have seen peaks of up to 10 MBps; in synthetic tests, throughput stays within the disks' maximum I/O.

rtyug

20 January 2011, 19:59

#18

Andreyka:
Since when is 220k files a lot?
Unless, of course, you keep them all in one directory.

Or is that just a lot for ZFS?

No, the problem may lie in how exactly these files are opened (since a home-grown script is used).

At least, that is one possible guess...? :)

The same thing can happen if you push 10-50k sockets per second into multithreading incorrectly (unoptimized): the OS can eat up memory and fall over... (both on the server and on the client; on the client, in the case of initialization)

Mage1

20 January 2011, 21:21

#19

rtyug:
No, the problem may lie in how exactly these files are opened (since a home-grown script is used).
At least, that is one possible guess...? :)

PHP file_get_contents
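Since the files are read whole, the OP's earlier observation that a second read of the same files is no faster can be checked with a small timing harness. Below is a Python sketch (not the OP's PHP script); `read_all` and `time_passes` are hypothetical helpers, and the 20 temporary 50 KB files only mimic the sizes mentioned in the thread.

```python
import os
import tempfile
import time


def read_all(paths):
    """Read each file fully, much like file_get_contents; return total bytes."""
    total = 0
    for path in paths:
        with open(path, "rb") as handle:
            total += len(handle.read())
    return total


def time_passes(paths, passes=2):
    """Return wall-clock seconds for each consecutive pass over the same files."""
    timings = []
    for _ in range(passes):
        start = time.perf_counter()
        read_all(paths)
        timings.append(time.perf_counter() - start)
    return timings


# Demo data: 20 made-up files of 50 KB each in a temporary directory.
# On a real server, `paths` would be the files one page actually reads.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for i in range(20):
        path = os.path.join(tmp, f"sample_{i}.txt")
        with open(path, "wb") as handle:
            handle.write(b"x" * 50 * 1024)
        paths.append(path)
    total_bytes = read_all(paths)
    timings = time_passes(paths)

print(total_bytes, [f"{t:.4f}s" for t in timings])
```

Freshly written demo files are usually already cached, so the demo timings say nothing about the server; the comparison is only meaningful against files that have not been touched recently.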

Mage1

29 January 2011, 09:33

#20

I decided to try upgrading to 8.2-RC2, but ran into the following problem. Before upgrading I want to make a backup and download it to another machine. Previously the backup started on Friday at 23:00 and was ready by Saturday morning (Friday evening and Saturday have the lowest traffic). Now the number (and total size) of files has grown, and even by Saturday afternoon the backup process (bsdtar) is still in full swing (judging by the archive size, it is about 30-40% done). It keeps the disk busy (LA 0.1-0.2), pages are served more and more slowly, and at some point on Saturday they stop being served at all :) Can anyone suggest how to make a backup under these conditions without shutting down the web server?

UPD: it looks like lowering the priority with renice worked; the server is running and the backup is being archived.
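The OP lowered the priority of an already running process with renice. A related approach is to start the backup at a lower priority from the outset; the Python sketch below does that with `os.nice` in the child process. `run_niced` is a hypothetical helper, and the child command is a stand-in that merely reports its own niceness, since the actual bsdtar arguments are not given in the thread.

```python
import os
import subprocess
import sys


def run_niced(argv, increment=10):
    """Run argv as a child process whose niceness is raised by `increment`."""
    return subprocess.run(
        argv,
        preexec_fn=lambda: os.nice(increment),  # runs in the child before exec
        capture_output=True,
        text=True,
        check=True,
    )


# os.nice(0) changes nothing and returns the current niceness.
parent_nice = os.nice(0)

# Stand-in command: a child Python that prints its own niceness.
# On a real server argv would be the backup command instead.
child = run_niced([sys.executable, "-c", "import os; print(os.nice(0))"])
child_nice = int(child.stdout.strip())
print(parent_nice, child_nice)
```

Niceness primarily affects CPU scheduling, so how much it helps a disk-bound backup depends on the system; the OP reports that it was enough in this case.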
