author Jiang Yutang <yutang2.jiang@hxt-semitech.com> 2018-09-07 05:09:24 +0200
committer Jiang Yutang <yutang2.jiang@hxt-semitech.com> 2018-09-07 05:19:45 +0200
commit cc59da9785730a4247a24f2af17401124c506293
tree 49fae8e10c10d479cbcb93d5fdbf01decefb67b8 /src/os/filestore/FileJournal.cc
parent Merge pull request #23900 from libingyang-zte/master
common/buffer.cc: add create_small_page_aligned to avoid memory waste when allocating small buffers on an OS with a big page size (e.g. 64k)

On my arm64 dev board (CentOS 7.4, default OS page size 64k, one SSD, ceph version 13.2.1), a fio randread test with bs=4k makes the ceph-osd process use a large amount of memory (more than 20G), while with bs=64k it uses just over 2G. Tracing the memory allocation path showed that this is related to page size alignment: requesting a small buffer (4k) but aligning it to the big page size (64k) wastes memory.

Modeled on the existing create_page_aligned, this patch adds a new interface, create_small_page_aligned, which uses 4k alignment. All callers of create_page_aligned were reviewed and split between big and small page alignment according to how the requested size relates to the current page size. Individual callers that have their own context logic are left unchanged.

With the patch, the fio randread test (bs=4k) on a 64k page size OS shows the memory used by the ceph-osd process dropping from more than 20G to about 3G. For the bs=16k case the memory used is also significantly reduced, and read performance is not reduced.

I also ported the patch to the latest ceph tree (version 14.0.0-xxx) and ran the same comparison. For the fio bs=4k test, the current 14.0.0-x version already uses less memory than 13.2.1, but the patch still reduces its memory usage significantly.

The following is a partial comparison of the validation data. Different software and hardware environments may give different values; the better the SSD performs, the more memory is used.

ceph version      bs    VIRT      RES
13.2.1            64k   3600896   2.7g
13.2.1            64k   3610112   2.7g
13.2.1            64k   3614208   2.7g
13.2.1            16k   7485184   6.4g
13.2.1            16k   7486208   6.4g
13.2.1            16k   7486208   6.4g
13.2.1            4k    23.7g     22.9g   <--A lot of waste
13.2.1            4k    23.7g     22.9g
13.2.1            4k    23.7g     22.9g
13.2.1+patch      64k   3632384   2.7g
13.2.1+patch      64k   3636480   2.7g
13.2.1+patch      64k   3640576   2.7g
13.2.1+patch      16k   3175296   2.2g
13.2.1+patch      16k   3175296   2.2g
13.2.1+patch      16k   3176320   2.2g
13.2.1+patch      4k    4265920   3.3g    <--Reasonable usage quantity
13.2.1+patch      4k    4265920   3.3g
13.2.1+patch      4k    4265920   3.3g
14.0.0-x          64k   6230784   4.4g
14.0.0-x          64k   5731840   4.1g
14.0.0-x          64k   4547072   3.5g
14.0.0-x          64k   4544000   3.6g
14.0.0-x          16k   6272192   5.2g
14.0.0-x          16k   6343168   5.3g
14.0.0-x          16k   6357696   5.3g
14.0.0-x          4k    10.1g     9.3g    <--A lot of waste
14.0.0-x          4k    10.3g     9.6g
14.0.0-x          4k    10.3g     9.4g
14.0.0-x+patch    64k   5974464   4.6g
14.0.0-x+patch    64k   4547008   3.5g
14.0.0-x+patch    64k   4556288   3.6g
14.0.0-x+patch    16k   4058560   3.1g
14.0.0-x+patch    16k   4053504   3.1g
14.0.0-x+patch    16k   4062720   3.1g
14.0.0-x+patch    4k    5283264   4.3g    <--Reasonable usage quantity
14.0.0-x+patch    4k    5324224   4.3g
14.0.0-x+patch    4k    5297600   4.3g

Signed-off-by: Jiang Yutang <yutang2.jiang@hxt-semitech.com>
Diffstat (limited to 'src/os/filestore/FileJournal.cc')
-rw-r--r-- src/os/filestore/FileJournal.cc 4
1 files changed, 2 insertions, 2 deletions
diff --git a/src/os/filestore/FileJournal.cc b/src/os/filestore/FileJournal.cc
index cfb1692cf35..98bed0dc298 100644
--- a/src/os/filestore/FileJournal.cc
+++ b/src/os/filestore/FileJournal.cc
@@ -672,7 +672,7 @@ int FileJournal::read_header(header_t *hdr) const
dout(10) << "read_header" << dendl;
bufferlist bl;
- buffer::ptr bp = buffer::create_page_aligned(block_size);
+ buffer::ptr bp = buffer::create_small_page_aligned(block_size);
char* bpdata = bp.c_str();
int r = ::pread(fd, bpdata, bp.length(), 0);
@@ -727,7 +727,7 @@ bufferptr FileJournal::prepare_header()
header.committed_up_to = journaled_seq;
}
encode(header, bl);
- bufferptr bp = buffer::create_page_aligned(get_top());
+ bufferptr bp = buffer::create_small_page_aligned(get_top());
// don't use bp.zero() here, because it also invalidates
// crc cache (which is not yet populated anyway)
char* data = bp.c_str();