What is a good mass storage strategy for millions of small files (about 50 KB on average) with automatic pruning of files older than 20 minutes? I need to write and read them from a web server.
I am currently using ext4, and during deletion (scheduled in cron) disk utilization reaches 100%, with [flush-8:0] showing up as the process that creates the load. This load interferes with other applications on the server. When no deletions are running, disk utilization peaks at 0-5%. The situation is the same with nested and non-nested directory structures. Worst of all, mass deletion during peak load seems to be slower than the rate of insertion, so the number of files waiting to be deleted keeps growing.
I tried changing the I/O scheduler (deadline, cfq, noop), but this did not help. I also tried setting ionice on the removal script, but that didn't help either.
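To make the workload concrete, here is a minimal sketch of the kind of pruning job described above. It is not the actual cron script: the function name `prune_old_files`, the `batch_size` and `pause` parameters, and the use of file modification time to decide age are all assumptions made for illustration.

```python
import os
import time

MAX_AGE_SECONDS = 20 * 60  # files older than 20 minutes are pruned


def prune_old_files(root, max_age=MAX_AGE_SECONDS, batch_size=500, pause=0.0):
    """Delete regular files under `root` whose mtime is older than `max_age` seconds.

    Hypothetical helper: optionally sleeps `pause` seconds after every
    `batch_size` deletions to spread the unlink load over time.
    Returns the number of files removed.
    """
    cutoff = time.time() - max_age
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are judged by their own age
                if os.lstat(path).st_mtime < cutoff:
                    os.unlink(path)
                    removed += 1
                    if pause and removed % batch_size == 0:
                        time.sleep(pause)
            except FileNotFoundError:
                # the file vanished between listing and deletion; skip it
                continue
    return removed
```

A script like this would be started from cron, optionally under `ionice -c3` to request the idle I/O class, which is the kind of setup that did not reduce the load in my case.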
I tried GridFS with MongoDB 2.4.3, and it performs beautifully, except that it performs horribly during the mass removal of old files. I tried running MongoDB with journaling disabled (nojournal) and with unacknowledged writes for both deletes and inserts (w=0), and this did not help. It only works quickly and smoothly when no deletion is taking place.
I also tried storing the data in MySQL 5.5, in a BLOB column of an InnoDB table, with InnoDB configured with innodb_buffer_pool_size=2GB, innodb_log_file_size=1GB and innodb_flush_log_at_trx_commit=2, but the performance was worse: the disk load was always 80%-100% (expected, but I had to try). The table had only a BLOB column, a DATETIME column and a CHAR(32) latin1_bin UUID column, with indexes on the UUID and DATETIME columns, so there was no room for optimization, and all queries used indexes.
pdflush ( Linux, ), , .
, script, 1 , 1 , 5 , 30 , .
inode, , inode, .
The server runs CentOS 6. The storage is SSD in RAID 1.
, ?