author | Andrea Righi <righi.andrea@gmail.com> | 2017-08-08 19:48:07 +0200 |
---|---|---|
committer | Jes Sorensen <jsorensen@fb.com> | 2018-01-21 22:36:43 +0100 |
commit | 31b6f0cdc1f36d92bb233344bbe4a4a01739d37a (patch) | |
tree | a6d7655b6fc4cb2eb8ce30aa413b12fb257d3d21 | |
parent | mdadm/clustermd_tests: add test case to test grow_resize cluster-raid10 (diff) | |
download | mdadm-31b6f0cdc1f36d92bb233344bbe4a4a01739d37a.tar.xz mdadm-31b6f0cdc1f36d92bb233344bbe4a4a01739d37a.zip |
Assemble: prevent segfault with faulty "best" devices
I was able to trigger this curious problem, which seems to happen on only
one of our servers:
Segmentation fault
This md volume is a raid1 volume made of 2 device mapper (dm-multipath)
devices and the underlying LUNs are imported via iSCSI.
Applying the following patch (see below) seems to fix the problem:
mdadm: /dev/md/10.4.237.12-volume has been started with 2 drives.
But I'm not sure whether it's the right fix or whether there are other
problems that I'm missing.
More details about the md superblocks that might help to better
understand the nature of the problem:
dev: 36001405a04ed0c104881100000000000p2
/dev/mapper/36001405a04ed0c104881100000000000p2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 5f3e8283:7f831b85:bc1958b9:6f2787a4
Name : 10.4.237.12-volume
Creation Time : Thu Jul 27 14:43:16 2017
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 1073729503 (511.99 GiB 549.75 GB)
Array Size : 536864704 (511.99 GiB 549.75 GB)
Used Dev Size : 1073729408 (511.99 GiB 549.75 GB)
Data Offset : 8192 sectors
Super Offset : 8 sectors
Unused Space : before=8104 sectors, after=95 sectors
State : clean
Device UUID : 16dae7e3:42f3487f:fbeac43a:71cf1f63
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Aug 8 11:12:22 2017
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : 518c443e - correct
Events : 167
Device Role : Active device 0
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
dev: 36001405a04ed0c104881200000000000p2
/dev/mapper/36001405a04ed0c104881200000000000p2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 5f3e8283:7f831b85:bc1958b9:6f2787a4
Name : 10.4.237.12-volume
Creation Time : Thu Jul 27 14:43:16 2017
Raid Level : raid1
Raid Devices : 2
Avail Dev Size : 1073729503 (511.99 GiB 549.75 GB)
Array Size : 536864704 (511.99 GiB 549.75 GB)
Used Dev Size : 1073729408 (511.99 GiB 549.75 GB)
Data Offset : 8192 sectors
Super Offset : 8 sectors
Unused Space : before=8104 sectors, after=95 sectors
State : clean
Device UUID : ef612bdd:e475fe02:5d3fc55e:53612f34
Internal Bitmap : 8 sectors from superblock
Update Time : Tue Aug 8 11:12:22 2017
Bad Block Log : 512 entries available at offset 72 sectors
Checksum : c39534fd - correct
Events : 167
Device Role : Active device 1
Array State : AA ('A' == active, '.' == missing, 'R' == replacing)
dev: 36001405a04ed0c104881100000000000p2
00001000 fc 4e 2b a9 01 00 00 00 01 00 00 00 00 00 00 00 |.N+.............|
00001010 5f 3e 82 83 7f 83 1b 85 bc 19 58 b9 6f 27 87 a4 |_>........X.o'..|
00001020 31 30 2e 34 2e 32 33 37 2e 31 32 2d 76 6f 6c 75 |10.4.237.12-volu|
00001030 6d 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |me..............|
00001040 64 50 7a 59 00 00 00 00 01 00 00 00 00 00 00 00 |dPzY............|
00001050 80 cf ff 3f 00 00 00 00 00 00 00 00 02 00 00 00 |...?............|
00001060 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001080 00 20 00 00 00 00 00 00 df cf ff 3f 00 00 00 00 |. .........?....|
00001090 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000010a0 00 00 00 00 00 00 00 00 16 da e7 e3 42 f3 48 7f |............B.H.|
000010b0 fb ea c4 3a 71 cf 1f 63 00 00 08 00 48 00 00 00 |...:q..c....H...|
000010c0 54 f0 89 59 00 00 00 00 a7 00 00 00 00 00 00 00 |T..Y............|
000010d0 ff ff ff ff ff ff ff ff 9c 43 8c 51 80 00 00 00 |.........C.Q....|
000010e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001100 00 00 01 00 fe ff fe ff fe ff fe ff fe ff fe ff |................|
00001110 fe ff fe ff fe ff fe ff fe ff fe ff fe ff fe ff |................|
*
00001200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00002000 62 69 74 6d 04 00 00 00 5f 3e 82 83 7f 83 1b 85 |bitm...._>......|
00002010 bc 19 58 b9 6f 27 87 a4 a7 00 00 00 00 00 00 00 |..X.o'..........|
00002020 a7 00 00 00 00 00 00 00 80 cf ff 3f 00 00 00 00 |...........?....|
00002030 00 00 00 00 00 00 00 01 05 00 00 00 00 00 00 00 |................|
00002040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003100 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00004000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
003ffe00
dev: 36001405a04ed0c104881200000000000p2
00001000 fc 4e 2b a9 01 00 00 00 01 00 00 00 00 00 00 00 |.N+.............|
00001010 5f 3e 82 83 7f 83 1b 85 bc 19 58 b9 6f 27 87 a4 |_>........X.o'..|
00001020 31 30 2e 34 2e 32 33 37 2e 31 32 2d 76 6f 6c 75 |10.4.237.12-volu|
00001030 6d 65 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |me..............|
00001040 64 50 7a 59 00 00 00 00 01 00 00 00 00 00 00 00 |dPzY............|
00001050 80 cf ff 3f 00 00 00 00 00 00 00 00 02 00 00 00 |...?............|
00001060 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001080 00 20 00 00 00 00 00 00 df cf ff 3f 00 00 00 00 |. .........?....|
00001090 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000010a0 01 00 00 00 00 00 00 00 ef 61 2b dd e4 75 fe 02 |.........a+..u..|
000010b0 5d 3f c5 5e 53 61 2f 34 00 00 08 00 48 00 00 00 |]?.^Sa/4....H...|
000010c0 54 f0 89 59 00 00 00 00 a7 00 00 00 00 00 00 00 |T..Y............|
000010d0 ff ff ff ff ff ff ff ff 5b 34 95 c3 80 00 00 00 |........[4......|
000010e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001100 00 00 01 00 fe ff fe ff fe ff fe ff fe ff fe ff |................|
00001110 fe ff fe ff fe ff fe ff fe ff fe ff fe ff fe ff |................|
*
00001200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00002000 62 69 74 6d 04 00 00 00 5f 3e 82 83 7f 83 1b 85 |bitm...._>......|
00002010 bc 19 58 b9 6f 27 87 a4 a7 00 00 00 00 00 00 00 |..X.o'..........|
00002020 a7 00 00 00 00 00 00 00 80 cf ff 3f 00 00 00 00 |...........?....|
00002030 00 00 00 00 00 00 00 01 05 00 00 00 00 00 00 00 |................|
00002040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003100 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff |................|
*
00004000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
003ffe00
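The header fields reported in the superblock listings can be read straight out of the hex dumps. The following is a minimal sketch, assuming little-endian fields at the offsets suggested by lining the dump up with those listings (magic at 0x00, version at 0x04, feature map at 0x08, event counter at 0xc8, all relative to the superblock start at 0x1000). The helper names `le32_at` and `le64_at` are hypothetical and are not mdadm functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value starting at buf[off]. */
uint32_t le32_at(const unsigned char *buf, size_t off)
{
	assert(buf != NULL);
	return (uint32_t)buf[off] |
	       ((uint32_t)buf[off + 1] << 8) |
	       ((uint32_t)buf[off + 2] << 16) |
	       ((uint32_t)buf[off + 3] << 24);
}

/* Read a little-endian 64-bit value as two 32-bit halves. */
uint64_t le64_at(const unsigned char *buf, size_t off)
{
	return (uint64_t)le32_at(buf, off) |
	       ((uint64_t)le32_at(buf, off + 4) << 32);
}

/* Row 00001000 of the first dump: magic, version, feature map, padding. */
const unsigned char row_1000[16] = {
	0xfc, 0x4e, 0x2b, 0xa9, 0x01, 0x00, 0x00, 0x00,
	0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Row 000010c0 of the first dump: update time, then the event counter. */
const unsigned char row_10c0[16] = {
	0x54, 0xf0, 0x89, 0x59, 0x00, 0x00, 0x00, 0x00,
	0xa7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
```

With these rows, `le32_at(row_1000, 0)` gives 0xa92b4efc, matching the Magic line, and `le64_at(row_10c0, 8)` gives 167, matching the Events line.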
Assemble: prevent segfault with faulty "best" devices
In Assemble(), after context reload, best[i] can be -1 in some cases,
and before checking if this value is negative we use it to access
devices[j].i.disk.raid_disk, potentially causing a segfault.
Check if best[i] is negative before using it to prevent this potential
segfault.
Signed-off-by: Andrea Righi <andrea@betterlinux.com>
Fixes: 69a481166be6 ("Assemble array with write journal")
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Robert LeBlanc <robert@leblancnet.us>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
-rw-r--r-- | Assemble.c | 4 |
1 files changed, 2 insertions, 2 deletions
@@ -1671,6 +1671,8 @@ try_again:
 		int j = best[i];
 		unsigned int desired_state;
+		if (j < 0)
+			continue;
 		if (devices[j].i.disk.raid_disk == MD_DISK_ROLE_JOURNAL)
 			desired_state = (1<<MD_DISK_JOURNAL);
 		else if (i >= content->array.raid_disks * 2)
@@ -1680,8 +1682,6 @@ try_again:
 		else
 			desired_state = (1<<MD_DISK_ACTIVE) | (1<<MD_DISK_SYNC);
-		if (j<0)
-			continue;
 		if (!devices[j].uptodate)
 			continue;
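The ordering issue the hunk addresses can be shown in isolation. This is a minimal sketch, not the code from Assemble.c: `struct dev_slot` and `count_usable` are hypothetical stand-ins, and the only thing carried over from the patch is that the `j < 0` test comes before any use of `devices[j]`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mdadm's per-device bookkeeping. */
struct dev_slot {
	int raid_disk;
	int uptodate;
};

/*
 * Count the slots in best[] that point at an up-to-date device.
 * A negative entry means "no device for this slot" and must be
 * skipped before devices[] is indexed with it.
 */
int count_usable(const int *best, size_t bestcnt,
		 const struct dev_slot *devices, size_t ndevices)
{
	int usable = 0;

	for (size_t i = 0; i < bestcnt; i++) {
		int j = best[i];

		if (j < 0)
			continue;	/* guard first, as in the patch */
		assert((size_t)j < ndevices);
		if (!devices[j].uptodate)
			continue;
		usable++;
	}
	return usable;
}
```

If the guard came after the first `devices[j]` access, an entry of -1 would read memory before the start of the array, which is the kind of access the commit message describes.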