Commit message | Author | Age | Files | Lines
...
* ReadMe: Fix stylistic issues | Mariusz Tkaczyk | 2024-11-05 | 3 | -211/+155
  No functional changes; just adapt the style so that checkpatch passes.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdmon: delegate removal to managemon | Mariusz Tkaczyk | 2024-11-04 | 4 | -41/+117
  Starting from [1], the kernel requires the suspend lock on the member drive removal path. This causes a deadlock with external management because the monitor thread may be blocked on suspend and unable to switch the array to active, for example if a badblock is reported at that time. Removal is a blocking action now, so it must be delegated to the managemon thread, but we must ensure that the monitor updates the metadata first, right after detecting "faulty".

  This patch adds the appropriate support. The monitor thread detects "faulty" and updates the metadata. After that, it asks the manager thread to remove the device. The manager must be careful because closing descriptors used by select() may lead to an abort with D_FORTIFY_SOURCE=2. First, it must ensure that the device descriptors are not used by the monitor.

  There is an unlimited number of remove retries, and recovery is blocked until all failed drives are removed. This is safe because a "faulty" device is no longer used by MD.

  The issue will also be mitigated by an optimization on the badblock recording path in the kernel, which will check that the device is not failed before a badblock is recorded, but relying on this is not correct in principle. Userspace must stay compatible with the kernel, and since removal is a blocking action, we must treat it as a blocking action.

  [1] kernel commit cfa078c8b80d ("md: use new apis to suspend array for adding/removing rdev from state_store()")
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* monitor: Add DS_EXTERNAL_BB flag | Mariusz Tkaczyk | 2024-11-04 | 2 | -21/+31
  If this flag is set, the metadata handler must support external badblocks. Remove checks for superswitch functions. If mdi->state_fd is not set, we should not try to record a badblock because we cannot trust this device. No functional changes.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* sysfs: add sysfs_open_memb_attr() | Mariusz Tkaczyk | 2024-11-04 | 3 | -83/+67
  The function is added to avoid repeating the "dev-%s", disk_name formatting. Related code branches are updated. The ioctl path for setting a disk faulty or removing it is dropped; sysfs is always used now. Some non-functional style issues are fixed in Manage_subdevs().
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* [PATCH] mdadm: Grow.c distinguish takeover vs reshape on grow operation | Nigel Croxon | 2024-10-28 | 1 | -1/+2
  Correct the terminology in the output when doing a takeover vs a reshape.
  Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
  Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
* mdadm/Grow: Check new_level interface rather than kernel version | Xiao Ni | 2024-10-18 | 1 | -1/+1
  Different OS distributions ship different kernel versions. Check for the new_level sysfs interface rather than the kernel version.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
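The feature-probing idea in this commit can be sketched as follows. This is only an illustration, not the code from the commit: `md_attr_exists()` and its `sysfs_root` parameter are hypothetical names, and the `<root>/<devnm>/md/<attr>` layout is assumed to match the usual `/sys/block/mdX/md/` directory.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns true if the array exposes the given md sysfs
 * attribute. Probing for the attribute avoids guessing from the kernel
 * version, which differs between distributions.
 */
static bool md_attr_exists(const char *sysfs_root, const char *devnm,
			   const char *attr)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "%s/%s/md/%s",
			 sysfs_root, devnm, attr);

	if (n < 0 || (size_t)n >= sizeof(path))
		return false; /* path would not fit, treat as unsupported */

	return access(path, F_OK) == 0;
}
```

A caller would use something like `md_attr_exists("/sys/block", "md0", "new_level")` and fall back to the older behaviour when it returns false.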
* mdadm/Manage: Clear superblock if adding new device fails | Xiao Ni | 2024-10-18 | 1 | -0/+4
  The superblock is kept if adding a new device fails. Clear the superblock when adding a new disk fails.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* util: use only /dev directory in open_dev() | Kinga Stefaniuk | 2024-10-16 | 1 | -11/+0
  Previously, open_dev() tried to open the device in two ways: using the /dev directory and the /tmp directory. This path could be taken by callers that have no access to the /tmp directory (e.g. udev), in which case dev_open() fails, which may affect many processes. Remove the attempt to open in the /tmp directory.
  Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* mdadm.man: Add udev-rules flag | Andre Paiusco | 2024-10-16 | 1 | -0/+10
  Add the --udev-rules flag and point to the mdadm.conf man page for further explanation of POLICY.
  Signed-off-by: Andre Paiusco <github@paiusco.org>
* mdadm.conf.man: Explain udev rule | Andre Paiusco | 2024-10-16 | 1 | -10/+14
  Clarify that a filename is accepted and that the udev rules need to be reloaded. Small correction to the example order.
  Signed-off-by: Andre Paiusco <github@paiusco.org>
* mdadm: Add mdadm_status.h | Anna Sztukowska | 2024-10-10 | 3 | -8/+16
  Move mdadm_status_t to the mdadm_status.h file. Add a status for memory allocation failure.
  Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* mdadm.man: elaborate more about mdmonitor.service | Mariusz Tkaczyk | 2024-10-10 | 2 | -29/+34
  Describe how it behaves and how it can be configured to work.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdmonitor: Abandon custom configuration files | Mariusz Tkaczyk | 2024-10-10 | 3 | -53/+15
  Operating system vendors are customizing the mdmonitor service because the default form is not satisfactory for them (except SUSE). As a result, support is complicated (maintainers have to check the system) and the man page is not detailed.

  I propose to abandon custom configuration files via sysconfig and keep the configuration inside mdadm.conf only. A detailed comment for OSV maintainers is added to the service to help with the transition.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* super-intel: move scsi_get_serial from sg_io | Kinga Stefaniuk | 2024-10-08 | 3 | -66/+45
  The scsi_get_serial() function is used only by super-intel.c. Move the function to this file and remove the sg_io.c file.
  Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* Rename Monitor.c to mdmonitor.c | Kinga Stefaniuk | 2024-10-07 | 2 | -1/+1
  Rename Monitor.c to mdmonitor.c to avoid errors during compilation on case-insensitive filesystems.
  Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* util: fix sys_hot_remove_disk() | Mariusz Tkaczyk | 2024-10-04 | 1 | -1/+1
  "faulty" was written instead of "remove".
  Fixes: d95edceb362a ("sysfs: add function for writing to sysfs fd")
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* md.man: update refference to raid5-ppl.rst | Mariusz Tkaczyk | 2024-10-04 | 1 | -8/+2
  Documentation/md has moved to Documentation/driver-api/md. Update and rework the sentence. Remove the reference to an unsupported kernel near the updated text.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm: add xmalloc.h | Mariusz Tkaczyk | 2024-09-27 | 33 | -44/+100
  Move the memory helper declarations out of mdadm.h. They seem to be useful, so keep them but include them separately. Rework them so they do not refer to Name[] declared internally in mdadm/mdmon. This is the first step towards simplifying mdadm.h.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* Mdmonitor: Fix startup with missing directory | Anna Sztukowska | 2024-09-27 | 1 | -7/+7
  Commit 0a07dea8d3b78 ("Mdmonitor: Refactor check_one_sharer() for better error handling") introduced an issue: if the directory /run/mdadm is missing, the monitor fails to start. Move the directory creation earlier to ensure it is always created.
  Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* sysfs: add function for writing to sysfs fd | Mariusz Tkaczyk | 2024-09-27 | 6 | -56/+101
  The proposed function sysfs_write_descriptor() unifies error handling for write() calls on sysfs files. Its main purpose is use with MD sysfs files, but it can be used elsewhere. No functional changes.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
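A minimal sketch of what unified error handling around write() can look like. It is not the function added by this commit: `write_fd_checked()` is a hypothetical name, and treating a short write as EIO is an assumption made for the example.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: writes the whole string to fd.
 * Returns 0 on success, -1 on failure with errno describing the cause.
 */
static int write_fd_checked(int fd, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;

	if (fd < 0) {
		errno = EBADF;
		return -1;
	}

	/* Retry only when the call was interrupted by a signal. */
	do {
		n = write(fd, value, len);
	} while (n < 0 && errno == EINTR);

	if (n < 0)
		return -1;	/* errno was set by write() */

	if ((size_t)n != len) {
		errno = EIO;	/* assumption: a short write counts as failure */
		return -1;
	}
	return 0;
}
```

Callers then check a single return value instead of repeating the comparison of the write() result with the string length at every call site.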
* Incremental: Rename IncrementalRemove | Mariusz Tkaczyk | 2024-09-27 | 3 | -5/+4
  Rename it to Incremental_remove for better readability. No functional changes.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* CI: do not install unnecessary packages | Kinga Stefaniuk | 2024-09-26 | 1 | -2/+2
  Updating all of the packages every time is not needed and costs a lot of resources. Install only necessary packages and their dependencies.
  Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
* Remove INSTALL and dev/null | Mariusz Tkaczyk | 2024-09-23 | 2 | -13/+0
  INSTALL is not needed because it was added to README.md. dev/null was created accidentally. Remove them.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/Manage: record errno | Xiao Ni | 2024-09-23 | 1 | -3/+5
  Sometimes it reports:
    mdadm: failed to stop array /dev/md0: Success
  This happens because errno is reset. So record errno during the loop.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
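The "failed ...: Success" symptom comes from reading errno after something else has overwritten it. A minimal sketch of the save-and-restore pattern, assuming a hypothetical retry loop (`retry_keep_errno()` and the stand-in `always_busy()` operation are illustrative names, not mdadm functions):

```c
#include <errno.h>

/* Stand-in for an operation that keeps failing with EBUSY. */
static int always_busy(void)
{
	errno = EBUSY;
	return -1;
}

/*
 * Hypothetical retry loop: op() returns 0 on success or -1 with errno set.
 * The errno of the last failure is recorded inside the loop and restored
 * before returning, so the caller can report it reliably.
 */
static int retry_keep_errno(int (*op)(void), int attempts)
{
	int saved_errno = 0;
	int rv = -1;

	for (int i = 0; i < attempts; i++) {
		rv = op();
		if (rv == 0)
			return 0;

		saved_errno = errno;	/* capture right after the failing call */
		errno = 0;		/* simulates a later call overwriting errno */
	}

	errno = saved_errno;
	return rv;
}
```

With the stand-in above, `retry_keep_errno(always_busy, 3)` returns -1 and leaves errno set to EBUSY, so strerror(errno) would not print "Success".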
* mdadm/tests: remove 09imsm-assemble.broken | Xiao Ni | 2024-09-23 | 1 | -6/+0
  09imsm-assemble can run successfully.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: 07testreshape5 fix | Xiao Ni | 2024-09-23 | 2 | -12/+1
  Init dir to avoid test failure.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: Remove 07reshape5intr.broken | Xiao Ni | 2024-09-23 | 1 | -45/+0
  07reshape5intr can run successfully now.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: 07changelevels fix | Xiao Ni | 2024-09-23 | 4 | -24/+16
  There are five changes to this case.

  1. Remove the testdev check. It no longer works; check directly whether it is a block device.

  2. Level and chunk size can't be changed at the same time.

  3. Sleep more than 10s before check wait. The test devices are small, so the reshape can sometimes finish right after it starts, and mdadm gets stuck waiting for the reshape to start. So the sync speed is limited, and it is restored when waiting for the reshape to finish. That is enough for the case without a backup file.
     When a backup file is specified, the systemd service mdadm-grow-continue monitors the reshape progress. If the reshape finishes before the service starts monitoring, the daemon gets stuck too, because reshape_progress is 0, which means the reshape hasn't started. So give the service more time to get the right information from kernel space. Before getting this information it needs to suspend the array, while the reshape is running at the same time. The kernel reshape daemon updates the metadata every 10s, so the sync speed must stay limited for more than 10s before it is restored. Then the systemd service can suspend the array and start monitoring the reshape progress.

  4. Wait until the mdadm-grow-continue service exits. mdadm --wait doesn't wait for the systemd service. In the case that needs a backup file, the systemd service deletes the backup file after the reshape finishes. This test runs the next case as soon as the reshape finishes, and that case fails because it can't create the backup file while the old one still exists.

  5. Don't reshape from raid5 to raid1. It doesn't work now.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/tests: wait until level changes | Xiao Ni | 2024-09-23 | 1 | -0/+4
  check wait waits for the reshape to finish, but it doesn't wait for the level change. The level change happens in a forked child process, so we need to find the child process and monitor it.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/Grow: sleep a while after removing disk in impose_level | Xiao Ni | 2024-09-23 | 1 | -0/+7
  Disks need to be removed when reshaping from raid456 to raid0. In kernel space this sets MD_RECOVERY_RUNNING, and the level change then fails. So wait a while to let the md thread clear this flag.

  This was found by test case 05r6tor0.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/Grow: Can't open raid when running --grow --continue | Xiao Ni | 2024-09-23 | 1 | -3/+6
  Grow_continue passes 'array' as devname, so it fails to open the raid device. Use mdinfo to open the raid device.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/Grow: Update reshape_progress to need_back after reshape finishes | Xiao Ni | 2024-09-23 | 1 | -4/+10
  mdadm tries to update the data offset when kicking off a reshape. If it can't change the data offset, it needs to use child_monitor to monitor the reshape progress and do the backup job. It also needs to update reshape_progress to need_back when the reshape finishes; otherwise it ends up in an infinite loop.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm/Grow: Update new level when starting reshape | Xiao Ni | 2024-09-23 | 1 | -0/+9
  A reshape needs a backup file when it can't update the data offset of the member disks. In this situation, it first starts the reshape and then kicks off the mdadm-grow-continue service, which does the backup job and monitors the reshape process. The service is a new process, so it needs to read the superblock from the member disks to get its information. But the first step doesn't update the new level in the superblock, so the level can't be changed after the reshape finishes because the new level is not right. So record the new level in the first step.
  Signed-off-by: Xiao Ni <xni@redhat.com>
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm: Add compilation process to README.md | Anna Sztukowska | 2024-09-23 | 1 | -0/+55
  Add compilation process and dependencies to README.md.
  Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* Detail.c: Fix divide_by_zero issue | Anna Sztukowska | 2024-09-23 | 1 | -6/+9
  Fix divide_by_zero issue reported by SAST analysis in Detail.c when calling enough() from util.c. Also add missing spaces for better code readability.
  Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* Incremental: support devnode in IncrementalRemove. | Mariusz Tkaczyk | 2024-09-10 | 3 | -23/+46
  There is no reason to keep this interface different from the others. Allow using a devnode but keep the old way for backward compatibility. A method is added to verify that only a devnode or kernel name is used.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* dlink.h: Fix checkpatch warnings for function args | Anna Sztukowska | 2024-09-10 | 1 | -7/+7
  Checkpatch issued a warning due to missing function argument names. Add the names to resolve the warnings.
  Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* Examine.c: Fix memory leaks in Examine() | Anna Sztukowska | 2024-09-10 | 3 | -5/+33
  Fix memory leaks in Examine() reported by SAST analysis. Implement a method to traverse and free all the nodes of the doubly linked list. Replace the for loop with a while loop to improve the readability of the code and to free allocated memory correctly.
  Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
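Traversing and freeing a doubly linked list with a while loop can be sketched as below. This is a generic illustration, not the dlink.h code: `struct node`, `push_front()` and `free_list()` are hypothetical, and the list here is a plain NULL-terminated one.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type; the payload is owned by the node. */
struct node {
	struct node *prev;
	struct node *next;
	char *payload;
};

/* Prepends a copy of text; returns the new head (or the old head on OOM). */
static struct node *push_front(struct node *head, const char *text)
{
	struct node *n = calloc(1, sizeof(*n));
	size_t len = strlen(text) + 1;

	if (!n)
		return head;

	n->payload = malloc(len);
	if (n->payload)
		memcpy(n->payload, text, len);

	n->next = head;
	if (head)
		head->prev = n;
	return n;
}

/* Frees every node and its payload; returns the number of nodes freed. */
static size_t free_list(struct node *head)
{
	size_t freed = 0;

	while (head) {
		/* Read the link before the node is freed. */
		struct node *next = head->next;

		free(head->payload);
		free(head);
		head = next;
		freed++;
	}
	return freed;
}
```

The key point is that the next pointer is saved before free(), which a for loop advancing with `head = head->next` after the free would get wrong.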
* imsm: save checkpoint prior to exit | Mateusz Kusiak | 2024-09-04 | 1 | -2/+3
  If a reshape (e.g. chunk size migration) is gracefully stopped via SIGTERM, the checkpoint is not saved and the reshape cannot be resumed due to "data being present in copy area". This is because UNIT_SRC_NORMAL isn't set if SIGTERM occurred.

  Move SIGTERM handling to the end of the loop to allow saving the checkpoint (and state) so reshapes can be properly resumed.
  Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
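The ordering described here, save progress first and only then honor the stop request, can be sketched like this. The loop is purely illustrative: `migrate_units()` and `sigterm_seen` are hypothetical names, and the real IMSM checkpoint logic is not reproduced.

```c
#include <signal.h>

/* Set from the signal handler, polled by the work loop. */
static volatile sig_atomic_t sigterm_seen;

static void on_sigterm(int sig)
{
	(void)sig;
	sigterm_seen = 1;
}

/*
 * Hypothetical migration loop; returns the last checkpoint that was saved.
 * The SIGTERM check sits at the end of the iteration, so the unit that was
 * just processed is always checkpointed before the loop exits.
 */
static int migrate_units(int total_units)
{
	int saved_checkpoint = 0;

	for (int unit = 1; unit <= total_units; unit++) {
		/* ... copy one unit of data here ... */

		saved_checkpoint = unit;	/* persist progress first */

		if (sigterm_seen)		/* only then stop gracefully */
			break;
	}
	return saved_checkpoint;
}
```

If the check were at the top of the loop, a SIGTERM arriving mid-unit would exit before the checkpoint for that unit was written, which is the situation the commit message describes.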
* mdadm: Increase number limit in md device name to 1024. | Shminderjit Singh | 2024-09-04 | 1 | -1/+1
  Updated the maximum device number in md device names from 127 to 1024. The previous limit was causing issues in the automation framework. This change ensures backward compatibility and allows for future scalability.
  Fixes: 25aa7329141c ("mdadm: numbered names verification")
  Signed-off-by: Shminderjit Singh <shminderjit.singh@oracle.com>
* imsm: add IMSM_OROM_CAPABILITIES_TPV to nvme orom | Mariusz Tkaczyk | 2024-08-30 | 2 | -40/+39
  Add it so that it is not excluded. It has some value for users even if it is always true for the nvme virtual orom.

  Rework the detail-platform printing code: move printing of 3rd party nvmes to print_imsm_capability (where it belongs), but keep it meaningful only for nvme controllers (NVME and VMD hba types). Pass the whole orom_entry there instead of orom. Squash the code responsible for printing NVME and VMD hbas.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* imsm: Remove warning and refactor add_to_super_imsm code | Mariusz Tkaczyk | 2024-08-30 | 1 | -63/+39
  Intel x8 drives are not supported, remove unnecessary warning and refactor add_to_super_imsm code.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* mdadm: Change displaying of devices in --detail | Anna Sztukowska | 2024-08-30 | 1 | -10/+4
  The counts of active, working, failed and spare devices were not printed when the number was zero. Refactor the code to always display the counts of all device types, regardless of their number. This way, it is more reliable for users.
  Signed-off-by: Anna Sztukowska <anna.sztukowska@intel.com>
* platform-intel: refactor path_attached_to_hba() | Mateusz Kusiak | 2024-08-30 | 3 | -22/+19
  The dprintf() call in path_attached_to_hba() is too noisy. Remove the call and refactor the function. Remove the obsolete env variables check.
  Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com>
* imsm: get bus from VMD driver directory | Mariusz Tkaczyk | 2024-08-30 | 1 | -11/+77
  Enumeration of VMD child devices starts early; the kernel does not wait for VMD enumeration to finish. As a result, the /sys/bus/pci/drivers/vmd/{dev}/domain/device link might not be ready yet.

  With PCI gen5 devices we can observe mdadm failing to start IMSM raid arrays because of that. In that case, it needs to find the bus path manually.

  Look for the bus device in the VMD driver directory if realpath() failed with ENOENT.
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
* imsm: add read OROM form ACPI UEFI tables | Blazej Kucman | 2024-08-13 | 1 | -25/+299
  OROM - IMSM hardware capabilities

  EFI vars depend on userspace; they need to be mounted to be accessible. Sporadic problems with their availability have been observed at an early assemble stage. It is not possible to fully synchronize EFI vars mounts with udev rules processing.

  For this reason, reading the IMSM OROM from ACPI tables is added as a secondary option. This method will be used for SATA and VMD family controllers. ACPI tables are exposed through sysfs earlier in the boot process, before the stage of RAID assembly.

  Loading the OROM via EFI vars is retained; ACPI tables will be a backup. Both paths will be maintained because IMSM hardware capabilities are necessary for RAID assembly during boot, so access to them must be provided.
  Signed-off-by: Blazej Kucman <blazej.kucman@intel.com>
* mdadm: sysfs.c fix coverity issues | Nigel Croxon | 2024-08-13 | 1 | -1/+3
  Fix the following coding errors found by the Coverity tools:
    * Event fixed_size_dest: You might overrun the 32-character fixed-size string "mdi->sys_name" by copying "devnm" without checking the length.
    * Event fixed_size_dest: You might overrun the 50-character fixed-size string "sra->text_version" by copying "buf + 9" without checking the length.
    * Event string_overflow: You might overrun the 32-character destination string "dev->sys_name" by writing 256 characters from "de->d_name".
  Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
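Findings of the fixed_size_dest kind are typically addressed by bounding the copy with the destination size. A minimal sketch under that assumption (`copy_bounded()` is a hypothetical helper, not the change made in sysfs.c):

```c
#include <stdio.h>

/*
 * Hypothetical helper: copies src into dst, which holds dst_size bytes.
 * Returns 0 if the whole string fit, -1 otherwise. On truncation dst
 * still holds a NUL-terminated prefix of src.
 */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
	int n;

	if (dst_size == 0)
		return -1;

	/* snprintf() never writes more than dst_size bytes, NUL included. */
	n = snprintf(dst, dst_size, "%s", src);
	if (n < 0 || (size_t)n >= dst_size)
		return -1;

	return 0;
}
```

For a 32-character field such as the ones named in the report, the call would look like `copy_bounded(name, sizeof(name), devnm)`, with the return value telling the caller whether the name was cut short.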
* mdadm: util.c fix coverity issues | Nigel Croxon | 2024-08-13 | 1 | -16/+25
  Fix the following coding errors found by the Coverity tools:
    * Event check_return: Calling "open" without checking return value.
    * Event check_return: Calling "lseek(fd, sector_size, 0)" without checking return value.
    * Event leaked_handle: Handle variable "fd" going out of scope leaks the handle.
    * Event leaked_storage: Variable "dir" going out of scope leaks the storage it points to.
    * Event fixed_size_dest: You might overrun the 32-character fixed-size string "st->devnm" by copying "_devnm" without checking the length.
    * Event fixed_size_dest: You might overrun the 32-character fixed-size string "container" by copying "dev" without checking the length.
  Signed-off-by: Nigel Croxon <ncroxon@redhat.com>
* md.4: replace wrong word | Nicolas Roeser | 2024-08-13 | 1 | -1/+1
  There is a wrong word in the md(4) man page, this commit corrects it.
  Signed-off-by: Nicolas Roeser <nicolas.roeser@alumni.uni-ulm.de>
* mdstat: fix list detach issues | Mariusz Tkaczyk | 2024-08-06 | 1 | -2/+4
  Move ent = ent->next; into the while loop. It was outside the loop, so with more than 2 elements, looking for the 3rd element caused an infinite loop.

  Fix el->next zeroing. It caused a segfault in mdstat_free().

  These issues were not visible in my testing because I had only 2 MD devices.
  Fixes: 4b3644ab4ce6 ("mdstat: Rework mdstat external arrays handling")
  Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
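Both problems named in this commit, advancing the cursor inside the loop and clearing the next pointer only after unlinking, show up in any singly linked list detach. A generic sketch, assuming a hypothetical `struct ent` keyed by id rather than the real mdstat entry type:

```c
#include <stddef.h>

/* Hypothetical list entry; not the mdstat_ent structure. */
struct ent {
	struct ent *next;
	int id;
};

/*
 * Unlinks the entry with the given id from the list at *head.
 * Returns the detached entry (with next cleared) or NULL if not found.
 */
static struct ent *detach(struct ent **head, int id)
{
	struct ent **link = head;

	while (*link) {
		struct ent *cur = *link;

		if (cur->id == id) {
			*link = cur->next;	/* unlink first ... */
			cur->next = NULL;	/* ... then clear the pointer */
			return cur;
		}
		link = &cur->next;	/* advance on every pass of the loop */
	}
	return NULL;
}
```

Clearing `cur->next` before unlinking would cut off the rest of the list, and forgetting to advance inside the loop would spin forever on the first non-matching entry, which only shows once the list has more than two elements.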