| Commit message (Collapse) | Author | Files | Lines |
|
software_key_query() returns the maximum signature and digest size for a
given key to user space. When it only supported RSA keys, calculating
those sizes was trivial as they were always equivalent to the key size.
However when ECDSA was added, the function grew somewhat complicated
calculations which take the ASN.1 encoding and curve into account.
This doesn't scale well and adjusting the calculations is easily
forgotten when adding support for new encodings or curves. In fact,
when NIST P521 support was recently added, the function was initially
not amended:
https://lore.kernel.org/all/b749d5ee-c3b8-4cbd-b252-7773e4536e07@linux.ibm.com/
Introduce a ->max_size() callback to struct sig_alg and take advantage
of it to move the signature size calculations to ecdsa-x962.c.
Introduce a ->digest_size() callback to struct sig_alg and move the
maximum ECDSA digest size to ecdsa.c. It is common across ecdsa-x962.c
and the upcoming ecdsa-p1363.c and thus inherited by both of them.
For all other algorithms, continue using the key size as maximum
signature and digest size.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
crypto_sig_maxsize() is a bit of a misnomer as it doesn't return the
maximum signature size, but rather the key size.
Rename it as well as all implementations of the ->max_size callback.
A subsequent commit introduces a crypto_sig_maxsize() function which
returns the actual maximum signature size.
While at it, change the return type of crypto_sig_keysize() from int to
unsigned int for consistency with crypto_akcipher_maxsize(). None of
the callers checks for a negative return value and an error condition
can always be indicated by returning zero.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Unlike the rsa driver, which separates signature decoding and
signature verification into two steps, the ecdsa driver does both in one.
This restricts users to the one signature format currently supported
(X9.62) and prevents addition of others such as P1363, which is needed
by the forthcoming SPDM library (Security Protocol and Data Model) for
PCI device authentication.
Per Herbert's suggestion, change ecdsa to use a "raw" signature encoding
and then implement X9.62 and P1363 as templates which convert their
respective encodings to the raw one. One may then specify
"x962(ecdsa-nist-XXX)" or "p1363(ecdsa-nist-XXX)" to pick the encoding.
The present commit moves X9.62 decoding to a template. A separate
commit is going to introduce another template for P1363 decoding.
The ecdsa driver internally represents a signature as two u64 arrays of
size ECC_MAX_BYTES. This appears to be the most natural choice for the
raw format as it can directly be used for verification without having to
further decode signature data or copy it around.
Repurpose all the existing test vectors for "x962(ecdsa-nist-XXX)" and
create a duplicate of them to test the raw encoding.
Link: https://lore.kernel.org/all/ZoHXyGwRzVvYkcTP@gondor.apana.org.au/
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Tested-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When extracting a signature component r or s from an ASN.1-encoded
integer, ecdsa_get_signature_rs() subtracts the expected length
"bufsize" from the ASN.1 length "vlen" (both of unsigned type size_t)
and stores the result in "diff" (of signed type ssize_t).
This results in a signed integer overflow if vlen > SSIZE_MAX + bufsize.
The kernel is compiled with -fno-strict-overflow, which implies -fwrapv,
meaning signed integer overflow is not undefined behavior. And the
function does check for overflow:
if (-diff >= bufsize)
return -EINVAL;
So the code is fine in principle but not very obvious. In the future it
might trigger a false-positive with CONFIG_UBSAN_SIGNED_WRAP=y.
Avoid by comparing the two unsigned variables directly and erroring out
if "vlen" is too large.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
If <linux/asn1_decoder.h> is the first header included from a .c file
(due to headers being sorted alphabetically), the compiler complains:
include/linux/asn1_decoder.h:18:29: error: unknown type name 'size_t'
Avoid by including <linux/types.h>.
Jonathan notes that the counterpart <linux/asn1_encoder.h> already
includes <linux/types.h>, but additionally includes the unnecessary
<linux/bug.h>. Drop it.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The crypto_sig_*() API calls lived in sig.c so far because they needed
access to struct crypto_sig_type: This was necessary to differentiate
between signature algorithms that had already been migrated from
crypto_akcipher to crypto_sig and those that hadn't yet.
Now that all algorithms have been migrated, the API calls can become
static inlines in <crypto/sig.h> to mimic what <crypto/akcipher.h> is
doing.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A sig_alg backend has just been introduced and all asymmetric
sign/verify algorithms have been migrated to it.
The sign/verify operations can thus be dropped from akcipher_alg.
It is now purely for asymmetric encrypt/decrypt.
Move struct crypto_akcipher_sync_data from internal.h to akcipher.c and
unexport crypto_akcipher_sync_{prep,post}(): They're no longer used by
sig.c but only locally in akcipher.c.
In crypto_akcipher_sync_{prep,post}(), drop various NULL pointer checks
for data->dst as they were only necessary for the verify operation.
In the crypto_sig_*() API calls, remove the forks that were necessary
while algorithms were converted from crypto_akcipher to crypto_sig
one by one.
In struct akcipher_testvec, remove the "params", "param_len" and "algo"
elements as they were only needed for the ecrdsa verify operation.
Remove corresponding dead code from test_akcipher_one() as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The drivers aspeed-acry.c, hpre_crypto.c and jh7110-rsa.c purport to
implement sign/verify operations for raw (unpadded) "rsa".
But there is no such thing as message digests generally need to be
padded according to a predefined scheme (such as PSS or PKCS#1) to
match the size of the usually much larger RSA keys.
The bogus sign/verify operations defined by these drivers are never
called but block removal of sign/verify from akcipher_alg. Drop them.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The virtio crypto driver exposes akcipher sign/verify operations in a
user space ABI. This blocks removal of sign/verify from akcipher_alg.
Herbert opines:
"I would say that this is something that we can break. Breaking it
is no different to running virtio on a host that does not support
these algorithms. After all, a software implementation must always
be present.
I deliberately left akcipher out of crypto_user because the API
is still in flux. We should not let virtio constrain ourselves."
https://lore.kernel.org/all/ZtqoNAgcnXnrYhZZ@gondor.apana.org.au/
"I would remove virtio akcipher support in its entirety. This API
was never meant to be exposed outside of the kernel."
https://lore.kernel.org/all/Ztqql_gqgZiMW8zz@gondor.apana.org.au/
Drop sign/verify support from virtio crypto. There's no strong reason
to also remove encrypt/decrypt support, so keep it.
A key selling point of virtio crypto is to allow guest access to crypto
accelerators on the host. So far the only akcipher algorithm supported
by virtio crypto is RSA. Dropping sign/verify merely means that the
PKCS#1 padding is now always generated or verified inside the guest,
but the actual signature generation/verification (which is an RSA
decrypt/encrypt operation) may still use an accelerator on the host.
Generating or verifying the PKCS#1 padding is cheap, so a hardware
accelerator won't be of much help there. Which begs the question
whether virtio crypto support for sign/verify makes sense at all.
It would make sense for the sign operation if the host has a security
chip to store asymmetric private keys. But the kernel doesn't even
have an asymmetric_key_subtype yet for hardware-based private keys.
There's at least one rudimentary driver for such chips (atmel-ecc.c for
ATECC508A), but it doesn't implement the sign operation. The kernel
would first have to grow support for a hardware asymmetric_key_subtype
and at least one driver implementing the sign operation before exposure
to guests via virtio makes sense.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
When constructing the EMSA-PKCS1-v1_5 padding for the sign operation,
a buffer for the padding is allocated and the Full Hash Prefix is copied
into it. The padding is then passed to the RSA decrypt operation as an
sglist entry which is succeeded by a second sglist entry for the hash.
Actually copying the hash prefix around is completely unnecessary.
It can simply be referenced from a third sglist entry which sits
in-between the padding and the digest.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The RSASSA-PKCS1-v1_5 sign operation currently only checks that the
digest length is less than "key_size - hash_prefix->size - 11".
The verify operation merely checks that it's more than zero.
Actually the precise digest length is known because the hash algorithm
is specified upon instance creation and the digest length is encoded
into the final byte of the hash algorithm's Full Hash Prefix.
So check for the exact digest length rather than solely relying on
imprecise maximum/minimum checks.
Keep the maximum length check for the sign operation as a safety net,
but drop the now unnecessary minimum check for the verify operation.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A sig_alg backend has just been introduced with the intent of moving all
asymmetric sign/verify algorithms to it one by one.
Migrate the sign/verify operations from rsa-pkcs1pad.c to a separate
rsassa-pkcs1.c which uses the new backend.
Consequently there are now two templates which build on the "rsa"
akcipher_alg:
* The existing "pkcs1pad" template, which is instantiated as an
akcipher_instance and retains the encrypt/decrypt operations of
RSAES-PKCS1-v1_5 (RFC 8017 sec 7.2).
* The new "pkcs1" template, which is instantiated as a sig_instance
and contains the sign/verify operations of RSASSA-PKCS1-v1_5
(RFC 8017 sec 8.2).
In a separate step, rsa-pkcs1pad.c could optionally be renamed to
rsaes-pkcs1.c for clarity. Additional "oaep" and "pss" templates
could be added for RSAES-OAEP and RSASSA-PSS.
Note that it's currently allowed to allocate a "pkcs1pad(rsa)" transform
without specifying a hash algorithm. That makes sense if the transform
is only used for encrypt/decrypt and continues to be supported. But for
sign/verify, such transforms previously did not insert the Full Hash
Prefix into the padding. The resulting message encoding was incompliant
with EMSA-PKCS1-v1_5 (RFC 8017 sec 9.2) and therefore nonsensical.
From here on in, it is no longer allowed to allocate a transform without
specifying a hash algorithm if the transform is used for sign/verify
operations. This simplifies the code because the insertion of the Full
Hash Prefix is no longer optional, so various "if (digest_info)" clauses
can be removed.
There has been a previous attempt to forbid transform allocation without
specifying a hash algorithm, namely by commit c0d20d22e0ad ("crypto:
rsa-pkcs1pad - Require hash to be present"). It had to be rolled back
with commit b3a8c8a5ebb5 ("crypto: rsa-pkcs1pad: Allow hash to be
optional [ver #2]"), presumably because it broke allocation of a
transform which was solely used for encrypt/decrypt, not sign/verify.
Avoid such breakage by allowing transform allocation for encrypt/decrypt
with and without specifying a hash algorithm (and simply ignoring the
hash algorithm in the former case).
So again, specifying a hash algorithm is now mandatory for sign/verify,
but optional and ignored for encrypt/decrypt.
The new sig_alg API uses kernel buffers instead of sglists, which
avoids the overhead of copying signature and digest from sglists back
into kernel buffers. rsassa-pkcs1.c is thus simplified quite a bit.
sig_alg is always synchronous, whereas the underlying "rsa" akcipher_alg
may be asynchronous. So await the result of the akcipher_alg, similar
to crypto_akcipher_sync_{en,de}crypt().
As part of the migration, rename "rsa_digest_info" to "hash_prefix" to
adhere to the spec language in RFC 9580. Otherwise keep the code
unmodified wherever possible to ease reviewing and bisecting. Leave
several simplification and hardening opportunities to separate commits.
rsassa-pkcs1.c uses modern __free() syntax for allocation of buffers
which need to be freed by kfree_sensitive(), hence a DEFINE_FREE()
clause for kfree_sensitive() is introduced herein as a byproduct.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
pkcs1pad_set_pub_key() and pkcs1pad_set_priv_key() are almost identical.
The upcoming migration of sign/verify operations from rsa-pkcs1pad.c
into a separate crypto_template will require another copy of the exact
same functions. When RSASSA-PSS and RSAES-OAEP are introduced, each
will need yet another copy.
Deduplicate the functions into a single one which lives in a common
header file for reuse by RSASSA-PKCS1-v1_5, RSASSA-PSS and RSAES-OAEP.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A sig_alg backend has just been introduced with the intent of moving all
asymmetric sign/verify algorithms to it one by one.
Migrate ecrdsa.c to the new backend.
One benefit of the new API is the use of kernel buffers instead of
sglists, which avoids the overhead of copying signature and digest
sglists back into kernel buffers. ecrdsa.c is thus simplified quite
a bit.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
A sig_alg backend has just been introduced with the intent of moving all
asymmetric sign/verify algorithms to it one by one.
Migrate ecdsa.c to the new backend.
One benefit of the new API is the use of kernel buffers instead of
sglists, which avoids the overhead of copying signature and digest
sglists back into kernel buffers. ecdsa.c is thus simplified quite
a bit.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Commit 6cb8815f41a9 ("crypto: sig - Add interface for sign/verify")
began a transition of asymmetric sign/verify operations from
crypto_akcipher to a new crypto_sig frontend.
Internally, the crypto_sig frontend still uses akcipher_alg as backend,
however:
"The link between sig and akcipher is meant to be temporary. The
plan is to create a new low-level API for sig and then migrate
the signature code over to that from akcipher."
https://lore.kernel.org/r/ZrG6w9wsb-iiLZIF@gondor.apana.org.au/
"having a separate alg for sig is definitely where we want to
be since there is very little that the two types actually share."
https://lore.kernel.org/r/ZrHlpz4qnre0zWJO@gondor.apana.org.au/
Take the next step of that migration and augment the crypto_sig frontend
with a sig_alg backend to which all algorithms can be moved.
During the migration, there will briefly be signature algorithms that
are still based on crypto_akcipher, whilst others are already based on
crypto_sig. Allow for that by building a fork into crypto_sig_*() API
calls (i.e. crypto_sig_maxsize() and friends) such that one of the two
backends is selected based on the transform's cra_type.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
The ECDSA test vectors contain "params", "param_len" and "algo" elements
even though ecdsa.c doesn't make any use of them. The only algorithm
implementation using those elements is ecrdsa.c.
Drop the unused test vector elements.
For the curious, "params" is an ASN.1 SEQUENCE of OID_id_ecPublicKey
and a second OID identifying the curve. For example:
"\x30\x13\x06\x07\x2a\x86\x48\xce\x3d\x02\x01\x06\x08\x2a\x86\x48"
"\xce\x3d\x03\x01\x01"
... decodes to:
SEQUENCE (OID_id_ecPublicKey, OID_id_prime192v1)
The curve OIDs used in those "params" elements are unsurprisingly:
OID_id_prime192v1 (2a8648ce3d030101)
OID_id_prime256v1 (2a8648ce3d030107)
OID_id_ansip384r1 (2b81040022)
OID_id_ansip521r1 (2b81040023)
Those are just different names for secp192r1, secp256r1, secp384r1 and
secp521r1, respectively, per RFC 8422 appendix A:
https://www.rfc-editor.org/rfc/rfc8422#appendix-A
The entries for secp384r1 and secp521r1 curves contain a useful code
comment calling out the curve and hash. Add analogous code comments
to secp192r1 and secp256r1 curve entries.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
|
|
The cpu_emergency_register_virt_callback() function is used
unconditionally by the x86 kvm code, but it is declared (and defined)
conditionally:
#if IS_ENABLED(CONFIG_KVM_INTEL) || IS_ENABLED(CONFIG_KVM_AMD)
void cpu_emergency_register_virt_callback(cpu_emergency_virt_cb *callback);
...
leading to a build error when neither KVM_INTEL nor KVM_AMD support is
enabled:
arch/x86/kvm/x86.c: In function ‘kvm_arch_enable_virtualization’:
arch/x86/kvm/x86.c:12517:9: error: implicit declaration of function ‘cpu_emergency_register_virt_callback’ [-Wimplicit-function-declaration]
12517 | cpu_emergency_register_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/kvm/x86.c: In function ‘kvm_arch_disable_virtualization’:
arch/x86/kvm/x86.c:12522:9: error: implicit declaration of function ‘cpu_emergency_unregister_virt_callback’ [-Wimplicit-function-declaration]
12522 | cpu_emergency_unregister_virt_callback(kvm_x86_ops.emergency_disable_virtualization_cpu);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fix the build by defining empty helper functions the same way the old
cpu_emergency_disable_virtualization() function was dealt with for the
same situation.
Maybe we could instead have made the call sites conditional, since the
callers (kvm_arch_{en,dis}able_virtualization()) have an empty weak
fallback. I'll leave that to the kvm people to argue about, this at
least gets the build going for that particular config.
Fixes: 590b09b1d88e ("KVM: x86: Register "emergency disable" callbacks when virt is enabled")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Chao Gao <chao.gao@intel.com>
Cc: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The isomorphism neg_if_exp negates the test of a ?: conditional,
making it unnecessary to have an explicit case for a negated test
with the branches inverted.
At the same time, we can disable neg_if_exp in cases where a
different API function may be more suitable for a negated test.
Finally, in the non-patch cases, E matches an expression with
parentheses around it, so there is no need to mention ()
explicitly in the pattern. The () are still needed in the patch
cases, because we want to drop them, if they are present.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
The parentheses are only needed if there is a disjunction, ie a
set of possible changes. If there is only one pattern, we can
remove these parentheses. Just like the format:
- x
+ y
not:
(
- x
+ y
)
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As other rules done, we add rules for str_yes_no()
to check the relative opportunities.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As other rules done, we add rules for str_on_off()
to check the relative opportunities.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As other rules done, we add rules for str_write_read()
to check the relative opportunities.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As other rules done, we add rules for str_read_write()
to check the relative opportunities.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As other rules done, we add rules for str_enable{d}_
disable{d}() to check the relative opportunities.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As other rules done, we add rules for str_lo{w}_hi{gh}()
to check the relative opportunities.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As other rules done, we add rules for str_hi{gh}_lo{w}()
to check the relative opportunities.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
As done with str_true_false(), add checks for str_false_true()
opportunities. A simple test can find over 9 cases currently
exist in the tree.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
After str_true_false() has been introduced in the tree,
we can add rules for finding places where str_true_false()
can be used. A simple test can find over 10 locations.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
if an inode backpointer points to a dirent that doesn't point back,
that's an error we should warn about.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If the reader acquires the read lock and then the writer enters the slow
path, while the reader proceeds to the unlock path, the following scenario
can occur without the change:
writer: pcpu_read_count(lock) return 1 (so __do_six_trylock will return 0)
reader: this_cpu_dec(*lock->readers)
reader: smp_mb()
reader: state = atomic_read(&lock->state) (there is no waiting flag set)
writer: six_set_bitmask()
then the writer will sleep forever.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we shut down successfully, there shouldn't be any logged ops to
resume.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a filesystem flag to indicate whether we did a clean recovery -
using c->sb.clean after we've got rw is incorrect, since c->sb is
updated whenever we write the superblock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We had a bug where disk accounting keys didn't always have their version
field set in journal replay; change the BUG_ON() to a WARN(), and
exclude this case since it's now checked for elsewhere (in the bkey
validate function).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This was added to avoid double-counting accounting keys in journal
replay. But applied incorrectly (easily done since it applies to the
transaction commit, not a particular update), it leads to skipping
in-mem accounting for real accounting updates, and failure to give them
a version number - which leads to journal replay becoming very confused
the next time around.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
give bversions a more distinct name, to aid in grepping
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Previously, check_inode() would delete unlinked inodes if they weren't
on the deleted list - this code dating from before there was a deleted
list.
But, if we crash during a logged op (truncate or finsert/fcollapse) of
an unlinked file, logged op resume will get confused if the inode has
already been deleted - instead, just add it to the deleted list if it
needs to be there; delete_dead_inodes runs after logged op resume.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
BCH_SB_ERRS() has a field for the actual enum val so that we can reorder
to reorganize, but the way BCH_SB_ERR_MAX was defined didn't allow for
this.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
__bch2_fsck_err() warns if the current task has a btree_trans object and
it wasn't passed in, because if it has to prompt for user input it has
to be able to unlock it.
But plumbing the btree_trans through bkey_validate(), as well as
transaction restarts, is problematic - so instead make bkey fsck errors
FSCK_AUTOFIX, which doesn't need to warn.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In order to check for accounting keys with version=0, we need to run
validation after they've been assigned version numbers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes the following bug, where a disk accounting key has an invalid
replicas entry, and we attempt to add it to the superblock:
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): starting version 1.12: rebalance_work_acct_fix opts=metadata_replicas=2,data_replicas=2,foreground_target=ssd,background_target=hdd,nopromote_whole_extents,verbose,fsck,fix_errors=yes
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): recovering from clean shutdown, journal seq 15211644
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): accounting_read...
accounting not marked in superblock replicas
replicas cached: 1/1 [0], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 0 in entry cached: 1/1 [0]
replicas_v0 (size 88):
user: 2 [3 5] user: 2 [1 4] cached: 1 [2] btree: 2 [1 2] user: 2 [2 5] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 2 [1 2] user: 2 [2 3] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 3] user: 2 [1 5] user: 2 [2 4]
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): inconsistency detected - emergency read only at journal seq 15211644
accounting not marked in superblock replicas
replicas user: 1/1 [3], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 0 in entry cached: 1/1 [0]
replicas_v0 (size 96):
user: 2 [3 5] user: 2 [1 3] cached: 1 [2] btree: 2 [1 2] user: 2 [2 4] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 1 [3] user: 2 [1 5] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 2] user: 2 [1 4] user: 2 [2 3] user: 2 [2 5]
accounting not marked in superblock replicas
replicas user: 1/2 [3 7], fixing
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): sb invalid before write: Invalid superblock section replicas_v0: invalid device 7 in entry user: 1/2 [3 7]
replicas_v0 (size 96):
user: 2 [3 7] user: 2 [1 3] cached: 1 [2] btree: 2 [1 2] user: 2 [2 4] cached: 1 [0] cached: 1 [4] journal: 2 [1 5] user: 1 [3] user: 2 [1 5] user: 2 [3 4] user: 2 [4 5] cached: 1 [1] cached: 1 [3] cached: 1 [5] journal: 2 [1 2] journal: 2 [2 5] btree: 2 [2 5] user: 2 [1 2] user: 2 [1 4] user: 2 [2 3] user: 2 [2 5] user: 2 [3 5]
done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): alloc_read... done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): stripes_read... done
bcachefs (3c0860e8-07ca-4276-8954-11c1774be868): snapshots_read... done
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
accounting read was checking if accounting replicas entries were marked
in the superblock prior to applying accounting from the journal,
which meant that a recently removed device could spuriously trigger a
"not marked in superblocked" error (when journal entries zero out the
offending counter).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Minor refactoring - replace multiple bool arguments with an enum; prep
work for fixing a bug in accounting read.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Dealing with outside state within a btree transaction is always tricky.
check_extents() and check_dirents() have to accumulate counters for
i_sectors and i_nlink (for subdirectories). There were two bugs:
- transaction commit may return a restart; therefore we have to commit
before accumulating to those counters
- get_inode_all_snapshots() may return a transaction restart, before
updating w->last_pos; then, on the restart,
check_i_sectors()/check_subdir_count() would see inodes that were not
for w->last_pos
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
dead code
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|