author | WANG Xuerui <git@xen0n.name> | 2023-09-06 16:53:55 +0200 |
---|---|---|
committer | Huacai Chen <chenhuacai@loongson.cn> | 2023-09-06 16:53:55 +0200 |
commit | f2091321044d9fbcadb93dfc1c9cf23e563ea40c (patch) | |
tree | a5c64676c92a84d27b56a7a9a5cec44a89f3d3f9 /lib/raid6/algos.c | |
parent | raid6: Add LoongArch SIMD syndrome calculation (diff) | |
raid6: Add LoongArch SIMD recovery implementation
Similar to the syndrome calculation, the recovery algorithms also work
on 64 bytes at a time to align with the L1 cache line size of current
and future LoongArch cores (that we care about), which means
unrolled-by-4 LSX and unrolled-by-2 LASX code.
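The unroll factors follow from the vector widths: LSX registers are 128 bits (16 bytes) wide and LASX registers are 256 bits (32 bytes) wide, so covering one 64-byte block takes four and two registers respectively. A minimal sketch of that arithmetic, where the macro names and the `unroll_factor` helper are illustrative and do not come from the patch:

```c
#include <assert.h>

/* Bytes handled per loop iteration, as stated in the commit message. */
#define BLOCK_BYTES    64u

/* Vector register widths in bytes (LSX: 128-bit, LASX: 256-bit). */
#define LSX_VEC_BYTES  16u
#define LASX_VEC_BYTES 32u

/*
 * Hypothetical helper: how many vector registers of the given width
 * are needed to cover one 64-byte block.
 */
static unsigned int unroll_factor(unsigned int vec_bytes)
{
	/* The width must divide the block size evenly. */
	assert(vec_bytes != 0 && BLOCK_BYTES % vec_bytes == 0);
	return BLOCK_BYTES / vec_bytes;
}
```

Calling `unroll_factor(LSX_VEC_BYTES)` gives 4 and `unroll_factor(LASX_VEC_BYTES)` gives 2, matching the unrolled-by-4 and unrolled-by-2 loops described above.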
The assembly is originally based on the x86 SSSE3/AVX2 ports, but
register allocation has been redone to take advantage of LSX/LASX's 32
vector registers, and the instruction sequence has been optimized to suit
(e.g. LoongArch can perform per-byte srl and andi on vectors, but x86
cannot).
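To illustrate why per-byte shifts matter, the scalar sketch below models splitting each byte of a 16-byte vector into its low and high nibbles, which is presumably what the recovery code needs for its nibble-indexed table lookups. The first function models a per-byte shift and mask; the second models the workaround assumed for x86, where the shift operates on 16-bit lanes and a mask must then clear the bits that leaked in from the neighbouring byte. All function names are hypothetical and this is plain C, not the actual LSX/LASX or SSSE3 code:

```c
#include <stdint.h>
#include <string.h>

#define VEC_BYTES 16

/* Per-byte form: every byte lane is shifted and masked on its own. */
static void split_nibbles_bytewise(const uint8_t *in, uint8_t *lo, uint8_t *hi)
{
	for (int i = 0; i < VEC_BYTES; i++) {
		lo[i] = in[i] & 0x0f;
		hi[i] = in[i] >> 4;	/* nothing leaks in from other lanes */
	}
}

/*
 * 16-bit-lane form: the shift moves bits across the byte boundary
 * inside each lane, so the result must be masked with 0x0f per byte.
 */
static void split_nibbles_wordwise(const uint8_t *in, uint8_t *lo, uint8_t *hi)
{
	for (int i = 0; i < VEC_BYTES; i += 2) {
		/* Assemble one little-endian 16-bit lane from two bytes. */
		uint16_t w = (uint16_t)(in[i] | (in[i + 1] << 8));
		uint16_t l = w & 0x0f0f;
		uint16_t h = (uint16_t)((w >> 4) & 0x0f0f);

		lo[i] = (uint8_t)(l & 0xff);
		lo[i + 1] = (uint8_t)(l >> 8);
		hi[i] = (uint8_t)(h & 0xff);
		hi[i + 1] = (uint8_t)(h >> 8);
	}
}

/* Returns 1 when both forms produce identical nibble vectors. */
static int nibble_splits_agree(const uint8_t *in)
{
	uint8_t lo_b[VEC_BYTES], hi_b[VEC_BYTES];
	uint8_t lo_w[VEC_BYTES], hi_w[VEC_BYTES];

	split_nibbles_bytewise(in, lo_b, hi_b);
	split_nibbles_wordwise(in, lo_w, hi_w);
	return memcmp(lo_b, lo_w, VEC_BYTES) == 0 &&
	       memcmp(hi_b, hi_w, VEC_BYTES) == 0;
}
```

Both forms give the same result; the per-byte form simply needs no extra mask after the shift, which is the kind of saving the commit message refers to.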
Performance numbers measured by instrumenting the raid6test code, on a
3A5000 system clocked at 2.5GHz:
> lasx 2data: 354.987 MiB/s
> lasx datap: 350.430 MiB/s
> lsx 2data: 340.026 MiB/s
> lsx datap: 337.318 MiB/s
> intx1 2data: 164.280 MiB/s
> intx1 datap: 187.966 MiB/s
Because recovery algorithms are chosen solely based on priority and
availability, lasx is marked as priority 2 and lsx priority 1. At least
for the current generation of LoongArch micro-architectures, LASX should
always be faster than LSX whenever supported, and have similar power
consumption characteristics (because the only known LASX-capable uarch,
the LA464, always computes the full 256-bit result for vector ops).
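The selection rule described above (highest priority among the entries that are usable on the running CPU) can be sketched as follows. This is a simplified, self-contained model: `struct recov_algo` is a hypothetical stand-in for `struct raid6_recov_calls`, and `choose_recov` only approximates what the kernel's selection code does; the real structure carries more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct raid6_recov_calls. */
struct recov_algo {
	const char *name;
	int priority;		/* higher wins */
	int (*valid)(void);	/* NULL means "always usable" */
};

/*
 * Walk a NULL-terminated table and return the usable entry with the
 * highest priority, or NULL if no entry is usable.
 */
static const struct recov_algo *
choose_recov(const struct recov_algo *const *algos)
{
	const struct recov_algo *best = NULL;

	assert(algos != NULL);
	for (; *algos; algos++) {
		const struct recov_algo *a = *algos;

		/* Skip entries whose availability check fails. */
		if (a->valid && !a->valid())
			continue;
		if (!best || a->priority > best->priority)
			best = a;
	}
	return best;
}

/* Availability stubs, for illustration only. */
static int cpu_has_feature(void) { return 1; }
static int cpu_lacks_feature(void) { return 0; }
```

With lasx at priority 2 and lsx at priority 1, a CPU that supports both gets lasx; if the lasx availability check fails, lsx is picked instead, and the integer fallback is used only when neither vector extension is available. No benchmarking is involved in this model, which is why the priorities have to encode the expected speed ordering.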
Acked-by: Song Liu <song@kernel.org>
Signed-off-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Diffstat (limited to 'lib/raid6/algos.c')
-rw-r--r-- | lib/raid6/algos.c | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/lib/raid6/algos.c b/lib/raid6/algos.c
index 739c7ebcae1a..0ec534faf019 100644
--- a/lib/raid6/algos.c
+++ b/lib/raid6/algos.c
@@ -112,6 +112,14 @@ const struct raid6_recov_calls *const raid6_recov_algos[] = {
 #if defined(CONFIG_KERNEL_MODE_NEON)
 	&raid6_recov_neon,
 #endif
+#ifdef CONFIG_LOONGARCH
+#ifdef CONFIG_CPU_HAS_LASX
+	&raid6_recov_lasx,
+#endif
+#ifdef CONFIG_CPU_HAS_LSX
+	&raid6_recov_lsx,
+#endif
+#endif
 	&raid6_recov_intx1,
 	NULL
 };