path: root/Manage.c
Commit message (Author, Age, Files, Lines)
* Fix race of "mdadm --add" and "mdadm --incremental" (Li Xiao Keng, 2023-10-26, 1 file, -8/+16)
  There is a raid1 with sda and sdb. When we add sdc to this raid, it may return -EBUSY.
  The main process of --add:
  1. dev_open(sdc) in Manage_add
  2. store_super1(st, di->fd) in write_init_super1
  3. fsync(fd) in store_super1
  4. close(di->fd) in write_init_super1
  5. ioctl(ADD_NEW_DISK)
  Steps 2 and 3 will add sdc to the metadata of raid1. There will be a udev (change of sdc) event after step 4. Then "/usr/sbin/mdadm --incremental --export $devnode --offroot $env{DEVLINKS}" will be run, and sdc will be added to the raid1. Then step 5 will return -EBUSY because it checks that the device isn't claimed in md_import_device()->lock_rdev()->blkdev_get_by_dev()->blkdev_get().
  This is confusing for users because sdc is being added for the first time. "--incremental" takes map_lock before adding sdc to raid1, so we take map_lock before write_init_super in "mdadm --add" to fix the race between "add" and "incremental".
  Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
  Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
  Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
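The ordering this fix relies on can be sketched as below. This is a minimal illustration only: it assumes a plain flock() on a lock file as a stand-in for mdadm's map lock, and the names with_map_lock, add_steps_fn and demo_add_steps are hypothetical, not taken from Manage.c.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical callback: stands in for steps 2-5 (write superblock,
 * fsync, close, ADD_NEW_DISK) of the --add path. */
typedef int (*add_steps_fn)(void *ctx);

/*
 * Run steps() while holding an exclusive flock() on lock_path.
 * Returns -1 if the lock cannot be taken, otherwise the result of steps().
 * Illustration only; this is not mdadm's map_lock() implementation.
 */
static int with_map_lock(const char *lock_path, add_steps_fn steps, void *ctx)
{
	int fd = open(lock_path, O_RDWR | O_CREAT, 0600);
	int rv;

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX) != 0) {
		close(fd);
		return -1;
	}
	/* Another process taking the same lock has to wait until we unlock. */
	rv = steps(ctx);
	flock(fd, LOCK_UN);
	close(fd);
	return rv;
}

/* Demo callback: counts how often the protected steps were run. */
static int demo_add_steps(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return 0;
}
```

Because the udev-triggered "--incremental" run takes the same lock, it cannot claim the device between the metadata write and the ioctl in this arrangement.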
* Fix memory leak in file Manage (Guanqin Miao, 2023-09-01, 1 file, -2/+11)
  When we tested mdadm with asan, we found some memory leaks in Manage.c. We fix these memory leaks based on code logic.
  v2: Fix free() of uninitialized 'tst' in abort path.
  Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
  Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
  Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* Bump minimum kernel version to 2.6.32 (Jes Sorensen, 2023-04-10, 1 file, -17/+0)
  Summary: At this point it probably is reasonable to drop support for anything prior to 2.6.32.
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* manage: move comment with function description (Kinga Tanska, 2023-01-05, 1 file, -28/+44)
  Move the function description from the function body to outside to obey kernel coding style.
  Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
  Acked-by: Coly Li <colyli@suse.de>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* incremental, manage: do not verify if remove is safe (Kinga Tanska, 2023-01-05, 1 file, -3/+4)
  Function is_remove_safe() was introduced to verify if removing member device won't cause failed state of the array. This verification should be used only with set-faulty command. Add special mode indicating that Incremental removal was executed. If this mode is used do not execute is_remove_safe() routine.
  Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* Manage: do not check array state when drive is removed (Kinga Tanska, 2023-01-05, 1 file, -2/+1)
  Array state doesn't need to be checked when drive is removed, but until now clean state was required. Result of the is_remove_safe() function will be independent from array state.
  Signed-off-by: Kinga Tanska <kinga.tanska@intel.com>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* Manage&Incremental: code refactor, string to enum (Mateusz Kusiak, 2023-01-04, 1 file, -18/+17)
  Prepare Manage and Incremental for later changing context->update to enum. Change update from string to enum in multiple functions and pass enum where already possible.
  Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* Change update to enum in update_super and update_subarray (Mateusz Kusiak, 2023-01-04, 1 file, -6/+8)
  Use already existing enum, change update_super and update_subarray update to enum globally. Refactor function references also. Remove code specific options from update_options.
  Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* Fix --update-subarray on active volume (Mateusz Kusiak, 2023-01-04, 1 file, -0/+7)
  Options: bitmap, ppl and name should not be updated when array is active. Those features are mutually exclusive and share the same data area in IMSM (danger of overwriting by kernel). Remove check for active subarrays from super-intel. Since ddf is not supported, apply it globally for all options.
  Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
  Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
* Manage: Block unsafe member failing (Mateusz Kusiak, 2022-09-08, 1 file, -1/+52)
  Kernel may or may not block mdadm from removing a member device if doing so will put the array into a failed state. It depends on the raid personality implementation in the kernel. Add verification on the requested removal path (#mdadm --set-faulty command).
  Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* mdadm: Replace obsolete usleep with nanosleep (Mateusz Grzonka, 2022-08-22, 1 file, -5/+5)
  According to POSIX.1-2001, usleep is considered obsolete. Replace it with a wrapper that uses nanosleep, as recommended in man. Add handy macros for conversions between msec, usec and nsec.
  Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
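A wrapper of the kind this commit describes might look roughly like the sketch below. The function name sleep_for_sketch and the exact macro spellings are assumptions for illustration; the commit message only states that a nanosleep-based wrapper and conversion macros were added.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Conversion helpers; names are illustrative, not necessarily mdadm's. */
#define MSEC_TO_NSEC(ms) ((ms) * 1000000L)
#define USEC_TO_NSEC(us) ((us) * 1000L)

/*
 * Sleep for sec seconds plus nsec nanoseconds using nanosleep().
 * If a signal interrupts the sleep, resume with the remaining time.
 * Returns 0 on success, -1 on any error other than EINTR
 * (for example EINVAL when nsec is not below one second).
 */
static int sleep_for_sketch(time_t sec, long nsec)
{
	struct timespec req = { .tv_sec = sec, .tv_nsec = nsec };
	struct timespec rem;

	while (nanosleep(&req, &rem) != 0) {
		if (errno != EINTR)
			return -1;
		req = rem;
	}
	return 0;
}
```

A call that previously read usleep(10000) would become sleep_for_sketch(0, USEC_TO_NSEC(10000)) under these assumed names.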
* mdadm: block update=ppl for non raid456 levels (Lukasz Florczak, 2022-06-24, 1 file, -2/+12)
  Option ppl should be used only for raid levels 4, 5 and 6. Cancel update for other levels.
  Applied globally for imsm and ddf format.
  Additionally introduce is_level456() helper function.
  Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
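The helper named in this commit can be pictured as the following sketch. Only the name is_level456() comes from the commit message; the signature and return type are assumptions.

```c
#include <stdbool.h>

/* Sketch of the helper named in the commit; the signature is assumed. */
static bool is_level456(int level)
{
	return level == 4 || level == 5 || level == 6;
}
```

A caller handling --update=ppl would reject the request when is_level456() returns false for the array's level.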
* Manage: Call validate_geometry when adding drive to external container (Mariusz Tkaczyk, 2021-05-26, 1 file, -0/+7)
  When adding drive to container call validate_geometry to verify whether drive is supported and can be added to container.
  Remove unused parameters from validate_geometry_imsm_container(). There is no need to pass them. Don't calculate freesize if it is not mandatory. Make it configurable.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Manage, imsm: Write metadata before add (Tkaczyk Mariusz, 2020-04-27, 1 file, -5/+1)
  New drive in container always appears as spare. Manager is able to handle that, and queues the appropriate update to monitor. No update from mdadm side has to be processed, just insert the drive and ping the mdmon. Metadata has to be written if no mdmon is running (case for Raid0 or container without arrays).
  If bare drive is added very early on startup (by custom bare rule), there is a possibility that mdmon was not restarted after switch root. Old one is not able to handle new drive. New one fails because there is drive without metadata in container and metadata cannot be loaded.
  To prevent this, write spare metadata before adding device to container. Mdmon will overwrite it (same case as spare migration, if drive appears it writes the most recent metadata). Metadata has to be written only on new drive before sysfs_add_disk(), don't race with mdmon if running.
  Signed-off-by: Tkaczyk Mariusz <mariusz.tkaczyk@intel.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Fix up a few formatting issues (Jes Sorensen, 2019-11-27, 1 file, -4/+9)
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Remove last traces of HOT_ADD_DISK (Jes Sorensen, 2019-11-27, 1 file, -2/+0)
  This ioctl is no longer used, so remove all references to it.
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Manage: Remove the legacy code for md driver prior to 0.90.03 (Xiao Yang, 2019-11-27, 1 file, -12/+0)
  Previous re-add operation only calls ioctl(HOT_ADD_DISK) for array without metadata (e.g. mdadm -B/--build) when md driver is less than 0.90.02, but commit 091e8e6 breaks the logic and current re-add operation can call ioctl(HOT_ADD_DISK) even if md driver is 0.90.03.
  This issue is reproduced by 05r1-re-add-nosuper:
  ------------------------------------------------
  ++ die 'resync or recovery is happening!'
  ++ echo -e '\n\tERROR: resync or recovery is happening! \n'
  ERROR: resync or recovery is happening!
  ------------------------------------------------
  Fixes: 091e8e6("Manage: Remove all references to md_get_version()")
  Reported-by: kernel test robot <lkp@intel.com>
  Signed-off-by: Xiao Yang <ice_yangxiao@163.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Manage_subdevs(): Use a dev_t (Jes Sorensen, 2017-09-30, 1 file, -1/+1)
  Use the correct type for rdev
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Error messages should end with a newline character. (NeilBrown, 2017-08-16, 1 file, -1/+1)
  Add "\n" to the end of error messages which don't already have one. Also spell "opened" correctly.
  Signed-off-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* mdadm/r5cache: allow adding journal to array without journal (Song Liu, 2017-08-02, 1 file, -6/+0)
  Currently, --add-journal can only be used to recreate a broken journal for arrays that have had a journal since creation. As the kernel code is getting more mature, this constraint is no longer necessary.
  This patch allows --add-journal to add a journal to an array without a journal.
  Signed-off-by: Song Liu <songliubraving@fb.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* mdadm: Fixup more broken logical operator formatting (Jes Sorensen, 2017-05-16, 1 file, -2/+2)
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* mdadm: Fixup a large number of bad formatting of logical operators (Jes Sorensen, 2017-05-16, 1 file, -16/+13)
  Logical operators never belong at the beginning of a line.
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* retire the APIs that driver no longer supports (Zhilong Liu, 2017-05-11, 1 file, -4/+0)
  Refer to commit e6e5f8f1267d ("Build: Stop bothering about supporting md driver ..."). Continue to retire the APIs that the md driver hasn't supported for a very long period of time.
  Signed-off-by: Zhilong Liu <zlliu@suse.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* mdadm/util: unify stat checking blkdev into function (Zhilong Liu, 2017-05-05, 1 file, -10/+1)
  Declare function stat_is_blkdev() to consolidate the repeated stat checks for block devices. It returns 'true/1' when the path is a block device, and 'false/0' when it isn't. The devname parameter is required; *rdev is optional: if the dev_t *rdev pointer is valid, the device number is assigned to it; if it is NULL, it is ignored.
  Signed-off-by: Zhilong Liu <zlliu@suse.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
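The contract described for stat_is_blkdev() can be sketched as follows. This is a reconstruction from the commit message alone, not the code in util.c: the _sketch suffix marks it as hypothetical and the error messages are invented for illustration.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Return 1 if devname names a block device, 0 otherwise.
 * If rdev is non-NULL and the check succeeds, the device number is
 * stored there; a NULL rdev is simply ignored.
 */
static int stat_is_blkdev_sketch(const char *devname, dev_t *rdev)
{
	struct stat stb;

	if (stat(devname, &stb) != 0) {
		fprintf(stderr, "stat failed for %s\n", devname);
		return 0;
	}
	if (!S_ISBLK(stb.st_mode)) {
		fprintf(stderr, "%s is not a block device\n", devname);
		return 0;
	}
	if (rdev)
		*rdev = stb.st_rdev;
	return 1;
}
```

Callers that only need the yes/no answer pass NULL for rdev; callers that go on to use the device number pass the address of a dev_t.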
* mdadm/util: unify fstat checking blkdev into function (Zhilong Liu, 2017-05-05, 1 file, -1/+1)
  Declare function fstat_is_blkdev() to consolidate the repeated fstat checks for block devices. It returns true/1 when the descriptor refers to a block device, and false/0 when it doesn't. The fd and devname parameters are required; *rdev is optional: if the dev_t *rdev pointer is valid, the device number is assigned to it; if it is NULL, it is ignored.
  Signed-off-by: Zhilong Liu <zlliu@suse.com>
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Manage: Manage_ro(): Use md_array_active() (Jes Sorensen, 2017-05-02, 1 file, -4/+2)
  One call less to md_get_array_info() for determining whether an array is active or not.
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* sysfs: Parse array_state in sysfs_read() (Jes Sorensen, 2017-04-20, 1 file, -1/+1)
  Rather than copying in the array_state string, parse it and use an enum to indicate the state.
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
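Parsing the array_state string into an enum, as this commit describes, could look like the sketch below. The enum and function names are hypothetical, and the list of state strings is an assumption based on the values the kernel's md documentation gives for the array_state attribute.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative enum; the real enumerator names in mdadm may differ. */
enum array_state_sketch {
	STATE_CLEAR,
	STATE_INACTIVE,
	STATE_SUSPENDED,
	STATE_READONLY,
	STATE_READ_AUTO,
	STATE_CLEAN,
	STATE_ACTIVE,
	STATE_WRITE_PENDING,
	STATE_ACTIVE_IDLE,
	STATE_UNKNOWN
};

/*
 * Map the text read from sysfs (optionally newline-terminated) to an
 * enum value. The names[] order must match the enum order above.
 */
static enum array_state_sketch parse_array_state(const char *buf)
{
	static const char *const names[] = {
		"clear", "inactive", "suspended", "readonly", "read-auto",
		"clean", "active", "write-pending", "active-idle"
	};
	size_t len = strcspn(buf, "\n");
	size_t i;

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		/* Compare lengths too, so "active" does not match "active-idle". */
		if (strlen(names[i]) == len && strncmp(buf, names[i], len) == 0)
			return (enum array_state_sketch)i;
	}
	return STATE_UNKNOWN;
}
```

Once the state is an enum, callers can compare or switch on it instead of repeating strcmp() against the raw string.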
* Retire mdassemble (Jes Sorensen, 2017-04-11, 1 file, -9/+1)
  mdassemble doesn't handle container based arrays, no support for sysfs, etc. It has not been actively maintained for years, so time to send it off to retirement.
  Signed-off-by: Jes Sorensen <jsorensen@fb.com>
* Manage: Remove all references to md_get_version() (Jes Sorensen, 2017-04-05, 1 file, -19/+1)
  At this point, support for md driver prior to 0.90.03 is going to disappear.
  Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
* sysfs: Make sysfs_init() return an error code (Jes Sorensen, 2017-03-30, 1 file, -2/+5)
  Rather than have the caller inspect the returned content, return an error code from sysfs_init(). In addition make all callers actually check it.
  Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
* util: Introduce md_get_disk_info() (Jes Sorensen, 2017-03-29, 1 file, -10/+9)
  This removes all the inline ioctl calls for GET_DISK_INFO, allowing us to switch to sysfs in one place, and improves type checking.
  Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
* util: Introduce md_get_array_info() (Jes Sorensen, 2017-03-29, 1 file, -7/+6)
  Remove most direct ioctl calls for GET_ARRAY_INFO, except for one, which will be addressed in the next patch.
  This is the start of the effort to clean up the use of ioctl calls and introduce a more structured API, which will use sysfs and fall back to ioctl for backup.
  Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
* Add 'force' flag to *hot_remove_disk(). (NeilBrown, 2017-03-28, 1 file, -5/+5)
  In rare circumstances, the short period that *hot_remove_disk() waits isn't long enough for IO to complete. This particularly happens when a device is failing and many retries are still happening.
  We don't want to increase the normal wait time for "mdadm --remove" as that might be used just to test if a device is active or not, and a delay would be problematic. So allow "--force" to mean that mdadm should try extra hard for a --remove to complete, waiting up to 5 seconds.
  Note that this patch fixes a comment which claimed the previous wait time was half a second, where it was really 50msec.
  Signed-off-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
* Introduce sys_hot_remove_disk() (NeilBrown, 2017-03-28, 1 file, -5/+1)
  The new hot_remove_disk() will retry HOT_REMOVE_DISK several times in the face of EBUSY. However we sometimes remove a device by writing "remove" to the "state" attribute. This should be retried as well. So introduce sys_hot_remove_disk() to repeat this action a few times.
  Signed-off-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
* Retry HOT_REMOVE_DISK a few times. (NeilBrown, 2017-03-28, 1 file, -2/+2)
  HOT_REMOVE_DISK can fail with EBUSY if there are outstanding IO requests that have not completed yet. It can sometimes be helpful to wait a little while for these to complete.
  We already do this in impose_level() when reshaping a device, but not in Manage.c in response to an explicit --remove request.
  So create hot_remove_disk() to centralise this code, and call it wherever it makes sense to wait for a HOT_REMOVE_DISK to succeed.
  Signed-off-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
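The retry-on-EBUSY pattern this commit centralises can be sketched generically as below. Everything here is hypothetical scaffolding: the real hot_remove_disk() issues the HOT_REMOVE_DISK ioctl on an md file descriptor, whereas this sketch takes a callback so that it can be run without a kernel array, and the retry count and delay are left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical operation: returns 0 on success, -1 with errno set. */
typedef int (*remove_fn)(void *ctx);

/*
 * Call op() up to 'tries' times (at least once), sleeping delay_ms
 * between attempts, for as long as it keeps failing with EBUSY.
 * Any other failure, or success, is returned immediately.
 */
static int retry_while_busy(remove_fn op, void *ctx, int tries, long delay_ms)
{
	int rv;

	while ((rv = op(ctx)) != 0 && errno == EBUSY && --tries > 0) {
		struct timespec ts = {
			.tv_sec = delay_ms / 1000,
			.tv_nsec = (delay_ms % 1000) * 1000000L
		};
		nanosleep(&ts, NULL);
	}
	return rv;
}

/* Demo operation: reports EBUSY *ctx times, then succeeds. */
static int busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		errno = EBUSY;
		return -1;
	}
	return 0;
}
```

Keeping the loop in one helper means the total wait can later be tuned in a single place.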
* Introduce enum flag_mode for setting and clearing flags. (NeilBrown, 2016-11-29, 1 file, -16/+16)
  We currently use '1' to indicate that a flag (writemostly or failfast) needs to be set, and '2' to indicate that it needs to be cleared.
  Using magic numbers like this is not best practice. So replace them with values from an enum.
  No functional change.
  Signed-off-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
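The change from magic numbers to an enum can be illustrated as follows. The enumerator names and the apply_flag() helper are assumptions for the sake of the example; the commit message only states that the values 1 (set) and 2 (clear) were replaced by an enum called flag_mode.

```c
/* Illustrative stand-in for enum flag_mode; enumerator names are assumed. */
enum flag_mode_sketch {
	FLAG_DEFAULT,	/* leave the flag as it is */
	FLAG_SET,	/* previously the magic value 1 */
	FLAG_CLEAR	/* previously the magic value 2 */
};

/* Hypothetical helper: apply the requested mode to one bit of a state word. */
static unsigned int apply_flag(unsigned int state, unsigned int bit,
			       enum flag_mode_sketch mode)
{
	switch (mode) {
	case FLAG_SET:
		return state | bit;
	case FLAG_CLEAR:
		return state & ~bit;
	case FLAG_DEFAULT:
	default:
		break;
	}
	return state;
}
```

With named values, a call site reads as apply_flag(state, bit, FLAG_CLEAR) rather than passing a bare 2.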
* Add failfast support. (NeilBrown, 2016-11-28, 1 file, -1/+19)
  Allow per-device "failfast" flag to be set when creating an array or adding devices to an array.
  When re-adding a device which had the failfast flag, it can be removed using --nofailfast.
  failfast status is printed in --detail and --examine output.
  Signed-off-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Remove: container should wait for an array to release a drive (Tomasz Majchrzak, 2016-07-21, 1 file, -13/+28)
  A 'faulty' drive is removed from a container after it has been released by an array, however there is a race there. The drive is released asynchronously by a monitor but sometimes this doesn't happen before the container checks it. The result is a container refusing to remove a drive because it still seems to be part of some array.
  It seems 'ping_monitor' could be a solution here, to ensure the monitor has had a chance to process the events. However it doesn't resolve the problem: sometimes an array has to request a release of the drive a few times (as the array is busy) and a single 'ping_monitor' call is not sufficient.
  As there is no way to query monitor progress, we are forced to retry the check several times before an error is returned.
  Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Manage: Manage_subdevs(): Remove unnecessary NULL initialization (Jes Sorensen, 2016-03-22, 1 file, -1/+1)
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Manage: Manage_add(): Avoid NULL initialization of dev_st (Jes Sorensen, 2016-03-22, 1 file, -13/+12)
  dev_st is only ever assigned if array->not_persistent == 0, so move the second use of it into the same scope where the assignment is made.
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Manage: Manage_add(): Fix memory leak (Jes Sorensen, 2016-03-22, 1 file, -0/+3)
  sysfs_read() allocates and populates a struct mdinfo, however the code forgot to free it again, before dropping the reference to the pointer.
  Reviewed-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Fix regression during add devices (Hannes Reinecke, 2016-03-10, 1 file, -1/+1)
  Commit d180d2aa2a17 ("Manage: fix test for 'is array failed'.") introduced a regression which would not allow re-adding new drives to a failed array.
  Fixes: d180d2aa2a17 ("Manage: fix test for 'is array failed'.")
  Signed-off-by: Hannes Reinecke <hare@suse.de>
  Cc: Coly Li <colyli@suse.de>
  Cc: Neil Brown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Manage: Manage_subdevs() fix file descriptor leak (Jes Sorensen, 2016-03-09, 1 file, -2/+3)
  Reviewed-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Manage: Manage_add(): Fix potential NULL pointer dereference (Jes Sorensen, 2016-03-08, 1 file, -0/+4)
  sysfs_read() may return NULL, so we should check the validity of the pointer before dereferencing it.
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Manage: Remove unnecessary NULL pointer checks (Jes Sorensen, 2016-03-08, 1 file, -6/+3)
  sysfs_free() handles NULL pointers, so remove superfluous NULL pointer checks before calling it.
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* Manage.c: Only issue change events for kernels older than 2.6.28 (Jes Sorensen, 2016-02-17, 1 file, -8/+11)
  2.6.28+ kernels handle this themselves and issuing the event here can cause a race.
  Reported-by: Sebastian Parschauer <sebastian.riemer@profitbricks.com>
  Suggested-by: NeilBrown <neilb@suse.com>
  Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
* in --add assign raid_disk of 0 to journal (Song Liu, 2015-12-21, 1 file, -1/+1)
  Signed-off-by: Song Liu <songliubraving@fb.com>
  Signed-off-by: Shaohua Li <shli@fb.com>
  Signed-off-by: NeilBrown <neilb@suse.com>
* recreate journal in mdadm (Song Liu, 2015-12-16, 1 file, -3/+39)
  This patch tries to recreate a missing/faulty journal in mdadm.
  Example:
  ./mdadm --fail /dev/md1 /dev/sdb2
  mdadm: set /dev/sdb2 faulty in /dev/md1
  ./mdadm --stop /dev/md1
  mdadm: stopped /dev/md1
  ./mdadm -A --scan --force
  mdadm: Journal is missing or stale, starting array read only.
  mdadm: /dev/md/1 has been started with 15 drives.
  ./mdadm --add-journal /dev/md1 /dev/sdb2
  mdadm: added /dev/sdb2
  Signed-off-by: Song Liu <songliubraving@fb.com>
  Signed-off-by: Shaohua Li <shli@fb.com>
  Signed-off-by: NeilBrown <neilb@suse.com>
* re-add: make re-add try to write sysfs node first (Guoqing Jiang, 2015-10-08, 1 file, -0/+13)
  If the sysfs node exists, we should try to write "re-add" to it.
  Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
  Signed-off-by: NeilBrown <neilb@suse.com>
* mdadm: make cluster raid also could support re-add (Guoqing Jiang, 2015-09-28, 1 file, -0/+9)
  If it is a cluster raid, the disc.state needs to be changed accordingly when doing a re-add.
  Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
  Signed-off-by: NeilBrown <neilb@suse.com>