path: root/drivers
Commit message    Author    Age    Files    Lines
* Merge branch 'akpm' (patches from Andrew)    Linus Torvalds    2017-11-16    29    -50/+45
|\
    Merge updates from Andrew Morton:

     - a few misc bits
     - ocfs2 updates
     - almost all of MM

    * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits)
      memory hotplug: fix comments when adding section
      mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP
      mm: simplify nodemask printing
      mm,oom_reaper: remove pointless kthread_run() error check
      mm/page_ext.c: check if page_ext is not prepared
      writeback: remove unused function parameter
      mm: do not rely on preempt_count in print_vma_addr
      mm, sparse: do not swamp log with huge vmemmap allocation failures
      mm/hmm: remove redundant variable align_end
      mm/list_lru.c: mark expected switch fall-through
      mm/shmem.c: mark expected switch fall-through
      mm/page_alloc.c: broken deferred calculation
      mm: don't warn about allocations which stall for too long
      fs: fuse: account fuse_inode slab memory as reclaimable
      mm, page_alloc: fix potential false positive in __zone_watermark_ok
      mm: mlock: remove lru_add_drain_all()
      mm, sysctl: make NUMA stats configurable
      shmem: convert shmem_init_inodecache() to void
      Unify migrate_pages and move_pages access checks
      mm, pagevec: rename pagevec drained field
      ...
| * mm: remove __GFP_COLD    Mel Gorman    2017-11-16    13    -18/+14
    As the page free path makes no distinction between cache hot and cold pages, there is no real useful ordering of pages in the free list that allocation requests can take advantage of. Judging from the users of __GFP_COLD, it is likely that a number of them are the result of copying other sites instead of actually measuring the impact. Remove the __GFP_COLD parameter which simplifies a number of paths in the page allocator.

    This is potentially controversial but bear in mind that the size of the per-cpu pagelists versus modern cache sizes means that the whole per-cpu list can often fit in the L3 cache. Hence, there is only a potential benefit for microbenchmarks that alloc/free pages in a tight loop. It's even worse when THP is taken into account which has little or no chance of getting a cache-hot page as the per-cpu list is bypassed and the zeroing of multiple pages will thrash the cache anyway.

    The truncate microbenchmarks are not shown as this patch affects the allocation path and not the free path. A page fault microbenchmark was tested but it showed no significant difference which is not surprising given that the __GFP_COLD branches are a minuscule percentage of the fault path.

    Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * mm: remove cold parameter for release_pages    Mel Gorman    2017-11-16    6    -12/+10
    All callers of release_pages claim the pages being released are cache hot. As no one cares about the hotness of pages being released to the allocator, just ditch the parameter.

    No performance impact is expected as the overhead is marginal. The parameter is removed simply because it is a bit stupid to have a useless parameter copied everywhere.

    Link: http://lkml.kernel.org/r/20171018075952.10627-7-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * mm, pagevec: remove cold parameter for pagevecs    Mel Gorman    2017-11-16    1    -1/+1
    Every pagevec_init user claims the pages being released are hot even in cases where it is unlikely the pages are hot. As no one cares about the hotness of pages being released to the allocator, just ditch the parameter.

    No performance impact is expected as the overhead is marginal. The parameter is removed simply because it is a bit stupid to have a useless parameter copied everywhere.

    Link: http://lkml.kernel.org/r/20171018075952.10627-6-mgorman@techsingularity.net
    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Dave Hansen <dave.hansen@intel.com>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * drivers/block/zram/zram_drv.c: make zram_page_end_io() static    Colin Ian King    2017-11-16    1    -1/+1
    zram_page_end_io() is local to the source and does not need to be in global scope, so make it static.

    Cleans up sparse warning:

      symbol 'zram_page_end_io' was not declared. Should it be static?

    Link: http://lkml.kernel.org/r/20171016173336.20320-1-colin.king@canonical.com
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: Minchan Kim <minchan@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * kmemcheck: remove annotations    Levin, Alexander (Sasha Levin)    2017-11-16    2    -3/+0
    Patch series "kmemcheck: kill kmemcheck", v2.

    As discussed at LSF/MM, kill kmemcheck.

    KASan is a replacement that is able to work without the limitation of kmemcheck (single CPU, slow). KASan is already upstream.

    We are also not aware of any users of kmemcheck (or users who don't consider KASan as a suitable replacement).

    The only objection was that since KASAN wasn't supported by all GCC versions provided by distros at that time we should hold off for 2 years, and try again.

    Now that 2 years have passed, and all distros provide gcc that supports KASAN, kill kmemcheck again for the very same reasons.

    This patch (of 4):

    Remove kmemcheck annotations, and calls to kmemcheck from the kernel.

    [alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
      Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
    Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
    Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
    Cc: Alexander Potapenko <glider@google.com>
    Cc: Eric W. Biederman <ebiederm@xmission.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Cc: Tim Hansen <devtimhansen@gmail.com>
    Cc: Vegard Nossum <vegardno@ifi.uio.no>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * zram: remove zlib from the list of recommended algorithms    Sergey Senozhatsky    2017-11-16    1    -3/+0
    ZSTD tends to outperform deflate/inflate, thus we remove zlib from the list of recommended algorithms and recommend zstd instead.

    Link: http://lkml.kernel.org/r/20170912050005.3247-2-sergey.senozhatsky@gmail.com
    Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Suggested-by: Minchan Kim <minchan@kernel.org>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * zram: add zstd to the supported algorithms list    Sergey Senozhatsky    2017-11-16    1    -0/+3
    Add ZSTD to the list of supported compression algorithms.

    ZRAM fio perf test:

                LZO           DEFLATE       ZSTD

    #jobs1
    WRITE:      (2180MB/s)    (77.2MB/s)    (1429MB/s)
    WRITE:      (1617MB/s)    (77.7MB/s)    (1202MB/s)
    READ:       (426MB/s)     (595MB/s)     (1181MB/s)
    READ:       (422MB/s)     (572MB/s)     (1020MB/s)
    READ:       (318MB/s)     (67.8MB/s)    (563MB/s)
    WRITE:      (318MB/s)     (67.9MB/s)    (564MB/s)
    READ:       (336MB/s)     (68.3MB/s)    (583MB/s)
    WRITE:      (335MB/s)     (68.2MB/s)    (582MB/s)

    #jobs2
    WRITE:      (3441MB/s)    (152MB/s)     (2141MB/s)
    WRITE:      (2507MB/s)    (147MB/s)     (1888MB/s)
    READ:       (801MB/s)     (1146MB/s)    (1890MB/s)
    READ:       (767MB/s)     (1096MB/s)    (2073MB/s)
    READ:       (621MB/s)     (126MB/s)     (1009MB/s)
    WRITE:      (621MB/s)     (126MB/s)     (1009MB/s)
    READ:       (656MB/s)     (125MB/s)     (1075MB/s)
    WRITE:      (657MB/s)     (126MB/s)     (1077MB/s)

    #jobs3
    WRITE:      (4772MB/s)    (225MB/s)     (3394MB/s)
    WRITE:      (3905MB/s)    (211MB/s)     (2939MB/s)
    READ:       (1216MB/s)    (1608MB/s)    (3218MB/s)
    READ:       (1159MB/s)    (1431MB/s)    (2981MB/s)
    READ:       (906MB/s)     (156MB/s)     (1457MB/s)
    WRITE:      (907MB/s)     (156MB/s)     (1458MB/s)
    READ:       (953MB/s)     (158MB/s)     (1595MB/s)
    WRITE:      (952MB/s)     (157MB/s)     (1593MB/s)

    #jobs4
    WRITE:      (6036MB/s)    (265MB/s)     (4469MB/s)
    WRITE:      (5059MB/s)    (263MB/s)     (3951MB/s)
    READ:       (1618MB/s)    (2066MB/s)    (4276MB/s)
    READ:       (1573MB/s)    (1942MB/s)    (3830MB/s)
    READ:       (1202MB/s)    (227MB/s)     (1971MB/s)
    WRITE:      (1200MB/s)    (227MB/s)     (1968MB/s)
    READ:       (1265MB/s)    (226MB/s)     (2116MB/s)
    WRITE:      (1264MB/s)    (226MB/s)     (2114MB/s)

    #jobs5
    WRITE:      (5339MB/s)    (233MB/s)     (3781MB/s)
    WRITE:      (4298MB/s)    (234MB/s)     (3276MB/s)
    READ:       (1626MB/s)    (2048MB/s)    (4081MB/s)
    READ:       (1567MB/s)    (1929MB/s)    (3758MB/s)
    READ:       (1174MB/s)    (205MB/s)     (1747MB/s)
    WRITE:      (1173MB/s)    (204MB/s)     (1746MB/s)
    READ:       (1214MB/s)    (208MB/s)     (1890MB/s)
    WRITE:      (1215MB/s)    (208MB/s)     (1892MB/s)

    #jobs6
    WRITE:      (5666MB/s)    (270MB/s)     (4338MB/s)
    WRITE:      (4828MB/s)    (267MB/s)     (3772MB/s)
    READ:       (1803MB/s)    (2058MB/s)    (4946MB/s)
    READ:       (1805MB/s)    (2156MB/s)    (4711MB/s)
    READ:       (1334MB/s)    (235MB/s)     (2135MB/s)
    WRITE:      (1335MB/s)    (235MB/s)     (2137MB/s)
    READ:       (1364MB/s)    (236MB/s)     (2268MB/s)
    WRITE:      (1365MB/s)    (237MB/s)     (2270MB/s)

    #jobs7
    WRITE:      (5474MB/s)    (270MB/s)     (4300MB/s)
    WRITE:      (4666MB/s)    (266MB/s)     (3817MB/s)
    READ:       (2022MB/s)    (2319MB/s)    (5472MB/s)
    READ:       (1924MB/s)    (2260MB/s)    (5031MB/s)
    READ:       (1369MB/s)    (242MB/s)     (2153MB/s)
    WRITE:      (1370MB/s)    (242MB/s)     (2155MB/s)
    READ:       (1499MB/s)    (246MB/s)     (2310MB/s)
    WRITE:      (1497MB/s)    (246MB/s)     (2307MB/s)

    #jobs8
    WRITE:      (5558MB/s)    (273MB/s)     (4439MB/s)
    WRITE:      (4763MB/s)    (271MB/s)     (3918MB/s)
    READ:       (2201MB/s)    (2599MB/s)    (6062MB/s)
    READ:       (2105MB/s)    (2463MB/s)    (5413MB/s)
    READ:       (1490MB/s)    (252MB/s)     (2238MB/s)
    WRITE:      (1488MB/s)    (252MB/s)     (2236MB/s)
    READ:       (1566MB/s)    (254MB/s)     (2434MB/s)
    WRITE:      (1568MB/s)    (254MB/s)     (2437MB/s)

    #jobs9
    WRITE:      (5120MB/s)    (264MB/s)     (4035MB/s)
    WRITE:      (4531MB/s)    (267MB/s)     (3740MB/s)
    READ:       (1940MB/s)    (2258MB/s)    (4986MB/s)
    READ:       (2024MB/s)    (2387MB/s)    (4871MB/s)
    READ:       (1343MB/s)    (246MB/s)     (2038MB/s)
    WRITE:      (1342MB/s)    (246MB/s)     (2037MB/s)
    READ:       (1553MB/s)    (238MB/s)     (2243MB/s)
    WRITE:      (1552MB/s)    (238MB/s)     (2242MB/s)

    #jobs10
    WRITE:      (5345MB/s)    (271MB/s)     (3988MB/s)
    WRITE:      (4750MB/s)    (254MB/s)     (3668MB/s)
    READ:       (1876MB/s)    (2363MB/s)    (5150MB/s)
    READ:       (1990MB/s)    (2256MB/s)    (5080MB/s)
    READ:       (1355MB/s)    (250MB/s)     (2019MB/s)
    WRITE:      (1356MB/s)    (251MB/s)     (2020MB/s)
    READ:       (1490MB/s)    (252MB/s)     (2202MB/s)
    WRITE:      (1488MB/s)    (252MB/s)     (2199MB/s)

    perf stat (columns in the same order: LZO, DEFLATE, ZSTD):

    jobs1 perfstat
    instructions     52,065,555,710 (0.79)        855,731,114,587 (2.64)          54,280,709,944 (1.40)
    branches         14,020,427,116 (725.847)     101,733,449,582 (1074.521)      11,170,591,067 (992.869)
    branch-misses    22,626,174 (0.16%)           274,197,885 (0.27%)             25,915,805 (0.23%)

    jobs2 perfstat
    instructions     103,633,110,402 (0.75)       1,710,822,100,914 (2.59)        107,879,874,104 (1.28)
    branches         27,931,237,282 (679.203)     203,298,267,479 (1037.326)      22,185,350,842 (884.427)
    branch-misses    46,103,811 (0.17%)           533,747,204 (0.26%)             49,682,483 (0.22%)

    jobs3 perfstat
    instructions     154,857,283,657 (0.76)       2,565,748,974,197 (2.57)        161,515,435,813 (1.31)
    branches         41,759,490,355 (670.529)     304,905,605,277 (978.765)       33,215,805,907 (888.003)
    branch-misses    74,263,293 (0.18%)           759,746,240 (0.25%)             76,841,196 (0.23%)

    jobs4 perfstat
    instructions     206,215,849,076 (0.75)       3,420,169,460,897 (2.60)        215,003,061,664 (1.31)
    branches         55,632,141,739 (666.501)     406,394,977,433 (927.241)       44,214,322,251 (883.532)
    branch-misses    102,287,788 (0.18%)          1,098,617,314 (0.27%)           103,891,040 (0.23%)

    jobs5 perfstat
    instructions     258,711,315,588 (0.67)       4,275,657,533,244 (2.23)        269,332,235,685 (1.08)
    branches         69,802,821,166 (588.823)     507,996,211,252 (797.036)       55,450,846,129 (735.095)
    branch-misses    129,217,214 (0.19%)          1,243,284,991 (0.24%)           173,512,278 (0.31%)

    jobs6 perfstat
    instructions     312,796,166,008 (0.61)       5,133,896,344,660 (2.02)        323,658,769,588 (1.04)
    branches         84,372,488,583 (520.541)     610,310,494,402 (697.642)       66,683,292,992 (693.939)
    branch-misses    159,438,978 (0.19%)          1,396,368,563 (0.23%)           174,406,934 (0.26%)

    jobs7 perfstat
    instructions     363,211,372,930 (0.56)       5,988,205,600,879 (1.75)        377,824,674,156 (0.93)
    branches         98,057,013,765 (463.117)     711,841,255,974 (598.762)       77,879,009,954 (600.443)
    branch-misses    199,513,153 (0.20%)          1,507,651,077 (0.21%)           248,203,369 (0.32%)

    jobs8 perfstat
    instructions     413,960,354,615 (0.52)       6,842,918,558,378 (1.45)        431,938,486,581 (0.83)
    branches         111,812,574,884 (414.224)    813,299,084,518 (491.173)       89,062,699,827 (517.795)
    branch-misses    233,584,845 (0.21%)          1,531,593,921 (0.19%)           286,818,489 (0.32%)

    jobs9 perfstat
    instructions     465,976,220,300 (0.53)       7,698,467,237,372 (1.47)        486,352,600,321 (0.84)
    branches         125,931,456,162 (424.063)    915,207,005,715 (498.192)       100,370,404,090 (517.439)
    branch-misses    256,992,445 (0.20%)          1,782,809,816 (0.19%)           345,239,380 (0.34%)

    jobs10 perfstat
    instructions     517,406,372,715 (0.53)       8,553,527,312,900 (1.48)        540,732,653,094 (0.84)
    branches         139,839,780,676 (427.732)    1,016,737,699,389 (503.172)     111,696,557,638 (516.750)
    branch-misses    259,595,561 (0.19%)          1,952,570,279 (0.19%)           357,818,661 (0.32%)

    seconds elapsed    20.630411534     96.084546565     12.743373571
    seconds elapsed    22.292627625    100.984155001     14.407413560
    seconds elapsed    22.396016966    110.344880848     14.032201392
    seconds elapsed    22.517330949    113.351459170     14.243074935
    seconds elapsed    28.548305104    156.515193765     19.159286861
    seconds elapsed    30.453538116    164.559937678     19.362492717
    seconds elapsed    33.467108086    188.486827481     21.492612173
    seconds elapsed    35.617727591    209.602677783     23.256422492
    seconds elapsed    42.584239509    243.959902566     28.458540338
    seconds elapsed    47.683632526    269.635248851     31.542404137

    Overall, ZSTD has slower WRITE, but much faster READ (perhaps a static compression buffer used during the test helped ZSTD a lot), which results in faster test results.

    Memory consumption (zram mm_stat file):

    zram LZO mm_stat
    mm_stat (jobs1):  2147483648 23068672 33558528 0 33558528 0 0
    mm_stat (jobs2):  2147483648 23068672 33558528 0 33558528 0 0
    mm_stat (jobs3):  2147483648 23068672 33558528 0 33562624 0 0
    mm_stat (jobs4):  2147483648 23068672 33558528 0 33558528 0 0
    mm_stat (jobs5):  2147483648 23068672 33558528 0 33558528 0 0
    mm_stat (jobs6):  2147483648 23068672 33558528 0 33562624 0 0
    mm_stat (jobs7):  2147483648 23068672 33558528 0 33566720 0 0
    mm_stat (jobs8):  2147483648 23068672 33558528 0 33558528 0 0
    mm_stat (jobs9):  2147483648 23068672 33558528 0 33558528 0 0
    mm_stat (jobs10): 2147483648 23068672 33558528 0 33562624 0 0

    zram DEFLATE mm_stat
    mm_stat (jobs1):  2147483648 16252928 25178112 0 25178112 0 0
    mm_stat (jobs2):  2147483648 16252928 25178112 0 25178112 0 0
    mm_stat (jobs3):  2147483648 16252928 25178112 0 25178112 0 0
    mm_stat (jobs4):  2147483648 16252928 25178112 0 25178112 0 0
    mm_stat (jobs5):  2147483648 16252928 25178112 0 25178112 0 0
    mm_stat (jobs6):  2147483648 16252928 25178112 0 25178112 0 0
    mm_stat (jobs7):  2147483648 16252928 25178112 0 25190400 0 0
    mm_stat (jobs8):  2147483648 16252928 25178112 0 25190400 0 0
    mm_stat (jobs9):  2147483648 16252928 25178112 0 25178112 0 0
    mm_stat (jobs10): 2147483648 16252928 25178112 0 25178112 0 0

    zram ZSTD mm_stat
    mm_stat (jobs1):  2147483648 11010048 16781312 0 16781312 0 0
    mm_stat (jobs2):  2147483648 11010048 16781312 0 16781312 0 0
    mm_stat (jobs3):  2147483648 11010048 16781312 0 16785408 0 0
    mm_stat (jobs4):  2147483648 11010048 16781312 0 16781312 0 0
    mm_stat (jobs5):  2147483648 11010048 16781312 0 16781312 0 0
    mm_stat (jobs6):  2147483648 11010048 16781312 0 16781312 0 0
    mm_stat (jobs7):  2147483648 11010048 16781312 0 16781312 0 0
    mm_stat (jobs8):  2147483648 11010048 16781312 0 16781312 0 0
    mm_stat (jobs9):  2147483648 11010048 16781312 0 16785408 0 0
    mm_stat (jobs10): 2147483648 11010048 16781312 0 16781312 0 0

    ==================================================================================

    Official benchmarks [1]:

    Compressor name      Ratio    Compression    Decompress.
    zstd 1.1.3 -1        2.877    430 MB/s       1110 MB/s
    zlib 1.2.8 -1        2.743    110 MB/s       400 MB/s
    brotli 0.5.2 -0      2.708    400 MB/s       430 MB/s
    quicklz 1.5.0 -1     2.238    550 MB/s       710 MB/s
    lzo1x 2.09 -1        2.108    650 MB/s       830 MB/s
    lz4 1.7.5            2.101    720 MB/s       3600 MB/s
    snappy 1.1.3         2.091    500 MB/s       1650 MB/s
    lzf 3.6 -1           2.077    400 MB/s       860 MB/s

    Minchan said:

    : I did test with my sample data and compared zstd with deflate. zstd's
    : compress ratio is lower a little bit but compression speed is much faster
    : 3 times more and decompress speed is too 2 times more. With different
    : data, it is different but overall, zstd would be better for speed at the
    : cost of a little lower compress ratio(about 5%) so I believe it's worth to
    : replace deflate.

    [1] https://github.com/facebook/zstd

    Link: http://lkml.kernel.org/r/20170912050005.3247-1-sergey.senozhatsky@gmail.com
    Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Acked-by: Minchan Kim <minchan@kernel.org>
    Tested-by: Minchan Kim <minchan@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
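    The mm_stat rows in the zstd commit can be turned into compression ratios. The following is a minimal userspace sketch, not code from the commit: it assumes the first three mm_stat fields are orig_data_size, compr_data_size and mem_used_total (in bytes), as described in the zram documentation, and the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Leading fields of a zram mm_stat line (assumed order, all in bytes). */
struct mm_stat_head {
    uint64_t orig_data_size;   /* uncompressed size of stored data */
    uint64_t compr_data_size;  /* compressed size of stored data */
    uint64_t mem_used_total;   /* memory used, including allocator overhead */
};

/* Parse the first three fields; returns 0 on success, -1 on malformed input. */
int parse_mm_stat(const char *line, struct mm_stat_head *out)
{
    int n;

    assert(line != NULL && out != NULL);
    n = sscanf(line, "%" SCNu64 " %" SCNu64 " %" SCNu64,
               &out->orig_data_size, &out->compr_data_size,
               &out->mem_used_total);
    return n == 3 ? 0 : -1;
}

/* Original size divided by compressed size; 0.0 when nothing is stored. */
double compression_ratio(const struct mm_stat_head *s)
{
    if (s->compr_data_size == 0)
        return 0.0;
    return (double)s->orig_data_size / (double)s->compr_data_size;
}
```

    Applied to the jobs1 rows above, this gives roughly 93 for LZO, 132 for DEFLATE and 195 for ZSTD. Ratios that high suggest the fio data was very compressible, so they are useful for comparing the three algorithms against each other rather than as absolute figures.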
| * bdi: introduce BDI_CAP_SYNCHRONOUS_IO    Minchan Kim    2017-11-16    4    -1/+8
    As discussed at

      https://lkml.kernel.org/r/<20170728165604.10455-1-ross.zwisler@linux.intel.com>

    someday we will remove rw_page(). If so, we need something to detect such super-fast storage on which synchronous IO operations like the current rw_page are always a win.

    Introduces BDI_CAP_SYNCHRONOUS_IO to indicate such devices. With it, we could use various optimization techniques.

    Link: http://lkml.kernel.org/r/1505886205-9671-3-git-send-email-minchan@kernel.org
    Signed-off-by: Minchan Kim <minchan@kernel.org>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Ilya Dryomov <idryomov@gmail.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: Huang Ying <ying.huang@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
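    A capability flag of this kind is typically a single bit that callers test before choosing an I/O path. The sketch below is a userspace illustration only: the DEMO_ names, the bit values and the struct are hypothetical and do not reproduce the kernel's backing-dev definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability bits for this sketch; not the kernel's values. */
#define DEMO_CAP_STABLE_WRITES   (1u << 0)
#define DEMO_CAP_SYNCHRONOUS_IO  (1u << 1)

/* Hypothetical stand-in for a backing device description. */
struct demo_bdi {
    unsigned int capabilities;
};

/* True when the device advertises that synchronous I/O is always a win. */
bool demo_cap_synchronous_io(const struct demo_bdi *bdi)
{
    assert(bdi != NULL);
    return (bdi->capabilities & DEMO_CAP_SYNCHRONOUS_IO) != 0;
}

enum demo_path { DEMO_PATH_ASYNC, DEMO_PATH_SYNC };

/* Pick the synchronous fast path only for devices that set the flag. */
enum demo_path demo_choose_path(const struct demo_bdi *bdi)
{
    return demo_cap_synchronous_io(bdi) ? DEMO_PATH_SYNC : DEMO_PATH_ASYNC;
}
```

    A device such as zram would set the bit once at creation time, and generic code would then branch on it without knowing which driver it is talking to.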
| * zram: set BDI_CAP_STABLE_WRITES once    Minchan Kim    2017-11-16    1    -10/+6
    With fast swap storage, the platform wants to use swap more aggressively and swap-in is crucial to application latency.

    The rw_page() based synchronous devices like zram, pmem and btt are such fast storage. When I profile swapin performance with zram lz4 decompress test, S/W overhead is more than 70%. Maybe, it would be bigger in nvdimm.

    This patchset reduces swap-in latency by skipping swapcache if the swap device is a synchronous device like a rw_page() based device. It enhances by 45% my swapin test (5G sequential swapin, no readahead) from 2.41sec to 1.64sec.

    This patch (of 4):

    Commit 19b7ccf8651d ("block: get rid of blk_integrity_revalidate()") fixed a weird thing (i.e., reset BDI_CAP_STABLE_WRITES flag unconditionally whenever revalidate_disk is called) so zram doesn't need to reset the flag any more when revalidating the bdev.

    Instead, set the flag just once when the zram device is created.

    It shouldn't change any behavior.

    Link: http://lkml.kernel.org/r/1505886205-9671-2-git-send-email-minchan@kernel.org
    Signed-off-by: Minchan Kim <minchan@kernel.org>
    Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
    Cc: Ilya Dryomov <idryomov@gmail.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Dan Williams <dan.j.williams@intel.com>
    Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Hugh Dickins <hughd@google.com>
    Cc: Huang Ying <ying.huang@intel.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * drivers/infiniband/sw/rdmavt/qp.c: use kmalloc_array_node()    Johannes Thumshirn    2017-11-16    1    -1/+1
    Now that we have a NUMA-aware version of kmalloc_array() we can use it instead of kmalloc_node() without an overflow check in the size calculation.

    Link: http://lkml.kernel.org/r/20170927082038.3782-5-jthumshirn@suse.de
    Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
    Reviewed-by: Christoph Lameter <cl@linux.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Damien Le Moal <damien.lemoal@wdc.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Doug Ledford <dledford@redhat.com>
    Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Mike Marciniszyn <infinipath@intel.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
    Cc: Sean Hefty <sean.hefty@intel.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * drivers/infiniband/hw/qib/qib_init.c: use kmalloc_array_node()    Johannes Thumshirn    2017-11-16    1    -2/+3
    Now that we have a NUMA-aware version of kmalloc_array() we can use it instead of kmalloc_node() without an overflow check in the size calculation.

    Link: http://lkml.kernel.org/r/20170927082038.3782-4-jthumshirn@suse.de
    Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
    Reviewed-by: Christoph Lameter <cl@linux.com>
    Cc: Mike Marciniszyn <infinipath@intel.com>
    Cc: Doug Ledford <dledford@redhat.com>
    Cc: Sean Hefty <sean.hefty@intel.com>
    Cc: Hal Rosenstock <hal.rosenstock@gmail.com>
    Cc: Christoph Hellwig <hch@lst.de>
    Cc: Damien Le Moal <damien.lemoal@wdc.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
    Cc: Pekka Enberg <penberg@kernel.org>
    Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
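    The point of an array-style allocator is that the count-times-size multiplication is checked before memory is requested. The following userspace analogy shows that check; it is a sketch built on malloc(), not the kernel's kmalloc_array_node() implementation, and demo_alloc_array() is a hypothetical name.

```c
#include <stdint.h>
#include <stdlib.h>

/* Userspace analogy only: allocate n elements of `size` bytes each, and
 * refuse the request when n * size would overflow size_t. */
void *demo_alloc_array(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;        /* the product would wrap around */
    return malloc(n * size);
}
```

    Without the check, a wrapped product would make the allocator return a buffer far smaller than the caller expects, which is why an open-coded n * size passed to a plain allocator is the pattern these two commits replace.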
* | Merge tag 'ipmi-for-4.15' of git://github.com/cminyard/linux-ipmi    Linus Torvalds    2017-11-16    19    -2201/+2987
|\ \
    Pull IPMI updates from Corey Minyard:
     "This is a fairly large rework of the IPMI code, along with a bunch of smaller fixes. The major changes have been in the next tree for a couple of months, so they should be good to go in.

       - Some users had IPMI systems where the GUID of the IPMI controller could change. So rescanning of the GUID was added. The naming of some sysfs things was dependent on the GUID, however, so this resulted in the sysfs interface code in IPMI changing to remove that dependency and name the IPMI BMCs like other sysfs devices.

       - The ipmi_si_intf.c code was fairly bloated with all the different discovery methods (PCI, ACPI, SMBIOS, OF, platform, module parameters, hot add). The structure of how the interfaces were added was redone to make them more modular, then the individual methods were pulled out into their own files"

    * tag 'ipmi-for-4.15' of git://github.com/cminyard/linux-ipmi: (48 commits)
      ipmi_si: Delete an error message for a failed memory allocation in try_smi_init()
      ipmi_si: fix memory leak on new_smi
      ipmi: remove redundant initialization of bmc
      ipmi: pr_err() strings should end with newlines
      ipmi: Clean up some print operations
      ipmi: Make the DMI probe into a generic platform probe
      ipmi: Make the IPMI proc interface configurable
      ipmi_ssif: Add device attrs for the things in proc
      ipmi_si: Add device attrs for the things in proc
      ipmi_si: remove ipmi_smi_alloc() function
      ipmi_si: Move port and mem I/O handling to their own files
      ipmi_si: Get rid of unused spacing and port fields
      ipmi_si: Move PARISC handling to another file
      ipmi_si: Move PCI setup to another file
      ipmi_si: Move platform device handling to another file
      ipmi_si: Move hardcode handling to a separate file.
      ipmi_si: Move the hotmod handling to another file.
      ipmi_si: Change ipmi_si_add_smi() to take just I/O info
      ipmi_si: Move io setup into io structure
      ipmi_si: Move irq setup handling into the io struct
      ...
| * \ Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux into for-next    Corey Minyard    2017-11-02    347    -1578/+2978
| |\ \
    The IPMI SI driver was split into different pieces, merge the module tree to account for that.

    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Delete an error message for a failed memory allocation in try_smi_init()    Markus Elfring    2017-10-17    1    -1/+0
    Omit an extra message for a memory allocation failure in this function.

    This issue was detected by using the Coccinelle software.

    Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: fix memory leak on new_smi    Colin Ian King    2017-10-17    1    -0/+1
    The error exit path omits kfree'ing the allocated new_smi, causing a memory leak. Fix this by kfree'ing new_smi.

    Detected by CoverityScan, CID#14582571 ("Resource Leak")

    Fixes: 7e030d6dff71 ("ipmi: Prefer ACPI system interfaces over SMBIOS ones")
    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
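    The class of bug fixed by the new_smi commit is an error exit that returns without releasing what the function allocated. The sketch below reproduces the pattern in userspace with an allocation counter so the fix is observable; every demo_ name is hypothetical and none of this is the driver's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-interface state. */
struct demo_smi {
    int irq;
};

/* Number of demo_smi objects currently allocated by this sketch. */
static int demo_live_allocs;

static struct demo_smi *demo_smi_alloc(void)
{
    struct demo_smi *smi = calloc(1, sizeof(*smi));

    if (smi)
        demo_live_allocs++;
    return smi;
}

static void demo_smi_free(struct demo_smi *smi)
{
    if (smi) {
        demo_live_allocs--;
        free(smi);
    }
}

int demo_live_allocations(void)
{
    return demo_live_allocs;
}

/* Returns 0 and hands ownership to *out on success. On failure returns -1
 * and releases the allocation on the error path instead of leaking it. */
int demo_try_init(int setup_fails, struct demo_smi **out)
{
    struct demo_smi *new_smi;

    assert(out != NULL);
    new_smi = demo_smi_alloc();
    if (!new_smi)
        return -1;

    if (setup_fails)
        goto out_err;

    *out = new_smi;
    return 0;

out_err:
    demo_smi_free(new_smi);    /* the step a leaking error path omits */
    return -1;
}
```

    Routing every failure through a single label that undoes the allocation keeps the cleanup in one place, so a later early return is less likely to skip it.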
| * | | ipmi: remove redundant initialization of bmc    Colin Ian King    2017-09-28    1    -1/+1
    The pointer bmc is being initialized and this initialized value is never being read, so this assignment is redundant and can be removed.

    Cleans up clang warning:

      warning: Value stored to 'bmc' during its initialization is never read

    Signed-off-by: Colin Ian King <colin.king@canonical.com>
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: pr_err() strings should end with newlines    Arvind Yadav    2017-09-28    1    -3/+3
    pr_err() messages should be terminated with a new-line to avoid other messages being concatenated onto the end.

    Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Clean up some print operations    Corey Minyard    2017-09-28    1    -56/+45
    Get rid of all printfs, using dev_xxx() if a device is available, pr_xxx() otherwise, and format long strings properly.

    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Make the DMI probe into a generic platform probe    Corey Minyard    2017-09-28    7    -94/+103
    Rework the DMI probe function to be a generic platform probe, and then rework the DMI code (and a few other things) to use the more generic information. This is so other things can declare platform IPMI devices.

    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Make the IPMI proc interface configurable    Corey Minyard    2017-09-28    4    -21/+40
    So we can remove it later.

    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_ssif: Add device attrs for the things in proc    Corey Minyard    2017-09-28    1    -2/+74
    Create a device attribute for everything we show in proc, getting ready for removing the proc stuff.

    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Add device attrs for the things in proc    Corey Minyard    2017-09-28    1    -1/+103
    Create a device attribute for everything we show in proc, getting ready for removing the proc stuff.

    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: remove ipmi_smi_alloc() function    Corey Minyard    2017-09-28    1    -10/+2
    It's only used in one place now, so it's overkill.

    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Move port and mem I/O handling to their own files    Corey Minyard    2017-09-28    5    -255/+263
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Get rid of unused spacing and port fields    Corey Minyard    2017-09-28    1    -10/+0
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Move PARISC handling to another file    Corey Minyard    2017-09-28    4    -57/+71
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Move PCI setup to another file    Corey Minyard    2017-09-28    4    -161/+179
    Signed-off-by: Corey Minyard <cminyard@mvista.com>

    Stephen Rothwell <sfr@canb.auug.org.au> fixed an issue with the include files
| * | | ipmi_si: Move platform device handling to another file    Corey Minyard    2017-09-28    4    -589/+613
    Signed-off-by: Corey Minyard <cminyard@mvista.com>

    Stephen Rothwell <sfr@canb.auug.org.au> fixed an issue with the include files
| * | | ipmi_si: Move hardcode handling to a separate file.    Corey Minyard    2017-09-27    4    -147/+154
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Move the hotmod handling to another file.    Corey Minyard    2017-09-27    4    -244/+264
    Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Change ipmi_si_add_smi() to take just I/O infoCorey Minyard2017-09-273-309/+229
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of allocating the smi_info structure, filling in the I/O info, and passing it to ipmi_si_add_smi(), just pass the I/O info in the io structure and let ipmi_si_add_smi() allocate the smi_info structure. This required redoing the way the remove functions for some device interfaces worked; a new function named ipmi_si_remove_by_dev() allows the device to be passed in and detected instead of using driver data, which couldn't be filled out easily otherwise. After this the platform handling should be decoupled from the smi_info structure and that handling can be pulled out to its own files. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Move io setup into io structureCorey Minyard2017-09-272-92/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Where it belongs, and getting ready for pulling the platform handling into its own file. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Move irq setup handling into the io structCorey Minyard2017-09-273-81/+82
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So the platform code can do it without having to access the smi info, getting ready for pulling the platform handling section to their own files. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Move some platform data into the io structureCorey Minyard2017-09-272-216/+213
| | | | | | | | | | | | | | | | | | | | | | | | | | | | That's where it belongs, and we are getting ready for moving the platform handling out of the main ipmi_si_intf.c file. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi_si: Rename function to add smi, make it globalCorey Minyard2017-09-272-16/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Getting ready for moving the platform-specific stuff into their own files. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Convert IPMI GUID over to Linux guid_tCorey Minyard2017-09-271-27/+23
| | | | | | | | | | | | | | | | Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Rescan channel list on BMC changesCorey Minyard2017-09-271-58/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | If the BMC changes versions or a change is otherwise detected, rescan the channels on the BMC. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Move lun and address out of channel structCorey Minyard2017-09-271-22/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Put it in its own struct, getting ready for channel information being dynamically changed. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Retry BMC registration on a failureCorey Minyard2017-09-271-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | If the BMC fails to register, just set up to retry periodically. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Rework device id and guid handling to catch changing BMCsCorey Minyard2017-09-271-73/+167
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A BMC's guid or device id info may change dynamically, this could result in a different configuration that needs to be done. Adjust the BMCs dynamically. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Use a temporary BMC for an interfaceCorey Minyard2017-09-271-9/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is getting ready for the ability to redo the BMC if its information changes; we need a fallback mechanism. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Dynamically fetch GUID periodicallyCorey Minyard2017-09-271-19/+42
| | | | | | | | | | | | | | | | | | | | | | | | This will catch if the GUID changes. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Always fetch the guid through ipmi_get_device_id()Corey Minyard2017-09-271-28/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This is in preparation for making ipmi_get_device_id() dynamically return the guid and device id. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Remove the device id from ipmi_register_smi()Corey Minyard2017-09-274-24/+1
| | | | | | | | | | | | | | | | | | | | | | | | It's no longer used, dynamic device id handling is in place now. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: allow dynamic BMC version informationJeremy Kerr2017-09-271-15/+190
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, it's up to the IPMI SMIs to provide the product & version details of BMCs behind registered IPMI SMI interfaces. This device ID is provided on SMI registration, and kept around for all future queries. However, this version information isn't always static. For example, a BMC may be upgraded at runtime, making the old version information stale. This change allows querying the BMC device ID & version information dynamically. If no static device_id argument is provided to ipmi_register_smi, then the IPMI core code will perform a Get Device ID IPMI command to query the version information when needed. We keep a short-term cache of this information so we don't need to re-query for every attribute access. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> I basically rewrote this, I fixed some locking issues and simplified things. Same functional change, though. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Don't use BMC product/dev ids in the BMC nameCorey Minyard2017-09-271-35/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are a lot of bad things that a set of BMCs could do that would really confuse the IPMI driver; it's possible for BMCs with different GUIDs to have the same product/devid (though that's not technically legal), which would result in platform device namespace collisions. Fixing it would involve either using the GUID in the BMC name, which resulted in huge names, or just using an ida for numbering the BMCs. The latter approach was chosen to avoid the huge names. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Make ipmi_demangle_device_id more genericJeremy Kerr2017-09-272-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, ipmi_demangle_device_id requires a full response buffer in its data argument. This means we can't use it to parse a response in a struct ipmi_recv_msg, which has the netfn and cmd as separate bytes. This change alters the definition and users of ipmi_demangle_device_id to use a split netfn, cmd and data buffer, so it can be used with non-sequential responses. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Fixed the ipmi_ssif.c and ipmi_si_intf.c changes to use data from the response, not the data from the message, when passing info to the ipmi_demangle_device_id() function. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Add a reference from BMC devices to their interfacesJeremy Kerr2017-09-271-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In an upcoming change, we'll want to grab a reference to the ipmi_smi_t from a struct bmc_device. This change adds a pointer to allow this. Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Reworked to support multiple interfaces on a BMC. Signed-off-by: Corey Minyard <cminyard@mvista.com>
| * | | ipmi: Get the device id through a functionCorey Minyard2017-09-272-39/+127
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes getting the device id consistent, and makes it possible to add a function to fetch it dynamically later. Signed-off-by: Corey Minyard <cminyard@mvista.com>