author | Radosław Zarzyński <rzarzyns@redhat.com> | 2024-05-22 15:33:23 +0200 |
---|---|---|
committer | Radoslaw Zarzynski <rzarzyns@redhat.com> | 2024-06-20 22:37:57 +0200 |
commit | 1308da3a8800430e77b64ce33e9223115388e92f (patch) | |
tree | 22d3eb2718653f0e522daff4fe44d52b1451e72b /qa/standalone | |
parent | osd: fix trimming of multistripe partial reads (diff) | |
qa: test-erasure-eio.sh honors the EC partial read support
This is supposed to fix:
```
2024-05-15T01:19:55.945 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:243: rados_get_data_bad_size: rados_get td/test-erasure-eio pool-jerasure obj-size-81362-1-10 fail
2024-05-15T01:19:55.946 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:104: rados_get: local dir=td/test-erasure-eio
2024-05-15T01:19:55.946 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:105: rados_get: local poolname=pool-jerasure
2024-05-15T01:19:55.946 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:106: rados_get: local objname=obj-size-81362-1-10
2024-05-15T01:19:55.946 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:107: rados_get: local expect=fail
2024-05-15T01:19:55.946 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:112: rados_get: '[' fail = fail ']'
2024-05-15T01:19:55.946 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:114: rados_get: rados --pool pool-jerasure get obj-size-81362-1-10 td/test-erasure-eio/COPY
2024-05-15T01:19:56.175 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:115: rados_get: return
2024-05-15T01:19:56.175 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:243: rados_get_data_bad_size: return 1
2024-05-15T01:19:56.175 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:323: TEST_rados_get_bad_size_shard_1: return 1
2024-05-15T01:19:56.175 INFO:tasks.workunit.client.0.smithi190.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/erasure-code/test-erasure-eio.sh:41: run: return 1
```
(https://pulpito.ceph.com/rzarzynski-2024-05-14_22:09:16-rados-wip-osd-ec-partial-reads-distro-default-smithi/7706517/)
The failing scenario exercised behavior that genuinely changed
with the introduction of partial reads. Previously, regardless
of the read size, the OSD always read the entire stripe and
checked all of it for errors.
In this test, the first 4 KB is read from an EC pool with
k=2 m=1 while errors are injected into shards 1 and 2.
Serving the first 4 KB does not require the damaged shards
but, because reads used to be aligned to the full stripe, EIO
was returned. This is no longer the case.
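The reasoning above can be illustrated with a small sketch. It is not Ceph code: `shards_for_read`, `K` and `CHUNK` are hypothetical names, and the layout (2 data shards, 4 KB chunks, hence the 8 KB stripe mentioned in the diff comments) is an assumption. The function only lists which data shards a byte range touches; the real OSD read path is considerably more involved.

```shell
#!/usr/bin/env bash
# Hypothetical illustration, not part of Ceph or of the test script.
# Assumed layout: K data shards, CHUNK bytes per chunk, chunks striped
# round-robin across the data shards.
K=2          # data shards (assumption: 8 KB stripe / 4 KB chunk)
CHUNK=4096   # bytes per chunk (assumption)

# Print the data shard ids touched by reading LENGTH bytes at OFFSET.
shards_for_read() {
    local offset=$1 length=$2
    # An empty read touches no shard.
    [ "$length" -gt 0 ] || return 0
    local first=$(( offset / CHUNK ))
    local last=$(( (offset + length - 1) / CHUNK ))
    local seen="" c shard
    for (( c = first; c <= last; c++ )); do
        shard=$(( c % K ))
        # Record each shard only once, in order of first use.
        case " $seen " in
            *" $shard "*) ;;
            *) seen="$seen $shard" ;;
        esac
    done
    # Unquoted on purpose: word splitting drops the leading space.
    echo $seen
}
```

Under these assumptions `shards_for_read 0 4096` prints `0`, so a 4 KB read at offset 0 can be served without shards 1 and 2, while `shards_for_read 0 8192` prints `0 1` and would hit the damaged shard 1.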
Signed-off-by: Radosław Zarzyński <rzarzyns@redhat.com>
Diffstat (limited to 'qa/standalone')
-rwxr-xr-x | qa/standalone/erasure-code/test-erasure-eio.sh | 32 |
1 files changed, 26 insertions, 6 deletions
diff --git a/qa/standalone/erasure-code/test-erasure-eio.sh b/qa/standalone/erasure-code/test-erasure-eio.sh
index 42c538eb918..4c23b4b4488 100755
--- a/qa/standalone/erasure-code/test-erasure-eio.sh
+++ b/qa/standalone/erasure-code/test-erasure-eio.sh
@@ -178,9 +178,19 @@ function rados_put_get_data() {
         wait_for_clean || return 1
         # Won't check for eio on get here -- recovery above might have fixed it
     else
-        shard_id=$(expr $shard_id + 1)
-        inject_$inject ec data $poolname $objname $dir $shard_id || return 1
-        rados_get $dir $poolname $objname fail || return 1
+        local another_shard_id=$(expr $shard_id + 1)
+        inject_$inject ec data $poolname $objname $dir $another_shard_id || return 1
+        if [ $shard_id -eq 1 -a $another_shard_id -eq 2 ];
+        then
+            # we're reading 4 kb long object while the stripe size is 8 kb.
+            # as we do partial reads and this request can be satisfied
+            # from the undamaged shard 0, we expect a success.
+            rados_get $dir $poolname $objname || return 1
+        else
+            # both shards 0 and 1 are demaged. there is no way no serve
+            # the requests, regardless of partial reads
+            rados_get $dir $poolname $objname fail || return 1
+        fi
         rm $dir/ORIGINAL
     fi

@@ -238,9 +248,19 @@ function rados_get_data_bad_size() {
     rados_get $dir $poolname $objname || return 1

     # Leave objname and modify another shard
-    shard_id=$(expr $shard_id + 1)
-    set_size $objname $dir $shard_id $bytes $mode || return 1
-    rados_get $dir $poolname $objname fail || return 1
+    local another_shard_id=$(expr $shard_id + 1)
+    set_size $objname $dir $another_shard_id $bytes $mode || return 1
+    if [ $shard_id -eq 1 -a $another_shard_id -eq 2 ];
+    then
+        # we're reading 4 kb long object while the stripe size is 8 kb.
+        # as we do partial reads and this request can be satisfied
+        # from the undamaged shard 0, we expect a success.
+        rados_get $dir $poolname $objname || return 1
+    else
+        # both shards 0 and 1 are demaged. there is no way no serve
+        # the requests, regardless of partial reads
+        rados_get $dir $poolname $objname fail || return 1
+    fi
     rm $dir/ORIGINAL
 }
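Both hunks add the same shard-pair check. As a sketch only, that decision could be expressed once in a helper; `expected_get_arg` is a hypothetical name that does not exist in test-erasure-eio.sh, and the sketch relies on `rados_get` accepting an optional fourth argument `fail`, as the calls in the diff show.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the commit.
# Prints "fail" when a read is expected to return an error, and nothing
# when the read of the first 4 KB can be served from undamaged shard 0.
expected_get_arg() {
    local shard_id=$1 another_shard_id=$2
    if [ "$shard_id" -eq 1 ] && [ "$another_shard_id" -eq 2 ]; then
        # Shards 1 and 2 are damaged; shard 0 alone satisfies the read.
        return 0
    fi
    # Shard 0 is among the damaged ones; the read cannot succeed.
    echo fail
}
```

A caller could then write `rados_get $dir $poolname $objname $(expected_get_arg $shard_id $another_shard_id) || return 1`, leaving the substitution unquoted so that an empty result adds no argument.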