mdadm - mdadm

	Commit message (Collapse)	Author	Age	Files	Lines
*	Monitor/msg: Don't print error message if mdmon doesn't run	Mariusz Tkaczyk	2017-11-21	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 4515fb28a53a ("Add detail information when can not connect monitor") was added to warn about failed connection to monitor in WaitClean function (see link below). Mdmon runs for IMSM containers when they have array with redundancy so if mdmon doesn't run, mdadm prints this error. This is misleading and unnecessary. Just print it in WaitClean function. The sock in WaitClean is deprecated so it is removed. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1375002 Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: Check redundancy for arrays	Mariusz Tkaczyk	2017-10-02	1	-4/+4
\| \| \| \| \| \| \| \| \|	GET_MISMATCH option doesn't exist for RAID arrays without redundancy so sysfs_read fails if this information is requested. Set options according to the device using information from /proc/mdstat. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: Include containers in spare migration	Mariusz Tkaczyk	2017-08-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Spare migration doesn't work for external metadata. mdadm skips a container with spare device because it is inactive. It used to work because GET_ARRAY_INFO ioctl returned valid structure for a container and mdadm treated such response as active container. Current implementation checks it in sysfs where container is shown as inactive. Adapt sysfs implementation to work the same way as ioctl. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: containers don't have the same sysfs properties as arrays	Mariusz Tkaczyk	2017-08-16	1	-18/+28
\| \| \| \| \| \| \| \| \|	GET_MISMATCH option doesn't exist for containers so sysfs_read fails if this information is requested. Set options according to the device using information from /proc/mdstat. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: don't assume mdadm parameter is a block device	Tomasz Majchrzak	2017-07-10	1	-2/+11
\| \| \| \| \| \| \| \| \| \| \|	If symlink (e.g. /dev/md/raid) is passed as a parameter to mdadm --wait, it fails as it's not able to find a corresponding entry in /proc/mdstat output. Get parameter file major:minor and look for block device name in sysfs. This commit is partial revert of commit 9e04ac1c43e6 ("mdadm/util: unify stat checking blkdev into function"). Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
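A minimal sketch of the lookup described above, assuming the standard /sys/dev/block/<major>:<minor> sysfs layout. The sketch_* helper names are hypothetical and are not mdadm's own functions; the actual change in Monitor.c may be structured differently.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Hypothetical helper: format the sysfs path for a device number.
 * Returns 0 on success, -1 if the buffer is too small. */
int sketch_sysfs_path_for_dev(dev_t dev, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/sys/dev/block/%u:%u",
			 major(dev), minor(dev));

	return (n > 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: stat() follows symlinks, so a name such as
 * /dev/md/raid resolves to the device node it points at.
 * Returns -1 if the path is missing or is not a block device. */
int sketch_sysfs_path_for_name(const char *path, char *buf, size_t len)
{
	struct stat stb;

	if (stat(path, &stb) != 0 || !S_ISBLK(stb.st_mode))
		return -1;
	return sketch_sysfs_path_for_dev(stb.st_rdev, buf, len);
}
```

For a device with major 9 and minor 0 the first helper yields /sys/dev/block/9:0; that sysfs entry is a symlink whose final component is the kernel device name, which is the name to match against /proc/mdstat.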
*	Get failed disk count from array state	Tomasz Majchrzak	2017-06-05	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Recent commit has changed the way failed disks are counted. It breaks recovery for external metadata arrays as failed disks are not part of the array and have no corresponding entries in sysfs (they are only reported for containers) so degraded arrays show no failed disks. Recent commit overwrites GET_DEGRADED result prior to GET_STATE and it is not set again if GET_STATE has not been requested. As GET_STATE provides the same information as GET_DEGRADED, the latter is not needed anymore. Remove GET_DEGRADED option and replace it with GET_STATE option. Don't count number of failed disks looking at sysfs entries but calculate it at the end. Do it only for arrays as containers report no disks, just spares. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	mdadm: Fixup more broken logical operator formatting	Jes Sorensen	2017-05-16	1	-2/+2
\| \| \| \|	Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: Fixup a pile of whitespace issues	Jes Sorensen	2017-05-11	1	-55/+55
\| \| \| \| \| \|	No code was hurt in this event Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: mailfrom is initialized correctly	Jes Sorensen	2017-05-11	1	-1/+1
\| \| \| \| \| \|	Remove gratuitous variable initialization. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: Not much point declaring mdlist in both forks of the if() statement	Jes Sorensen	2017-05-11	1	-2/+3
\| \| \| \|	Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Use working_disks from sysfs	Jes Sorensen	2017-05-09	1	-2/+2
\| \| \| \| \| \|	sysfs now provides working_disks information, so let's use it too. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Get nr_disks, active_disks and spare_disks from sysfs	Jes Sorensen	2017-05-09	1	-7/+7
\| \| \| \| \| \| \|	This leaves working_disks and utime missing before we can eliminate check_array()'s call to md_get_array_info() Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Get array_disks from sysfs	Jes Sorensen	2017-05-09	1	-2/+2
\| \| \| \|	Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Get 'failed_disks' from sysfs	Jes Sorensen	2017-05-09	1	-3/+4
\| \| \| \|	Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Obtain RAID level from syfs	Jes Sorensen	2017-05-09	1	-3/+3
\| \| \| \|	Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Read sysfs entry earlier	Jes Sorensen	2017-05-09	1	-6/+10
\| \| \| \| \| \| \|	This will allow us to pull additional info from sysfs, such as level and device info. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Declate mdinfo instance globally	Jes Sorensen	2017-05-09	1	-2/+2
\| \| \| \| \| \|	We can pull in more information from sysfs earlier, so move sra to the top. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Reduce duplicated error handling	Jes Sorensen	2017-05-09	1	-24/+15
\| \| \| \| \| \| \|	Avoid closing fd in multiple places, and duplicating the error message for when a device disappeared. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor/check_array: Centralize exit path	Jes Sorensen	2017-05-09	1	-10/+14
\| \| \| \| \| \| \|	Improve exit handling to make it easier to share error handling and free sysfs entries later. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Add sector size as spare selection criterion	Alexey Obitotskiy	2017-05-09	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \|	Add sector size as new spare selection criterion. Assume that 0 means there is no requirement for the sector size in the array. Skip disks with unsuitable sector size when looking for a spare to move across containers. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Allow more spare selection criteria	Alexey Obitotskiy	2017-05-09	1	-14/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Disks can be moved across containers in order to be used as a spare drive for rebuild. At the moment the only requirement checked for such disk is its size (if it matches donor expectations). In order to introduce more criteria rename corresponding superswitch method to more generic name and move function parameter to a structure. This change is a big edit but it doesn't introduce any changes in code logic, it just updates function naming and parameters. Signed-off-by: Alexey Obitotskiy <aleksey.obitotskiy@intel.com> Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: Code is 80 characters per line	Jes Sorensen	2017-05-08	1	-34/+27
\| \| \| \| \| \| \|	Fix up some lines that are too long for no reason, and some that have silly line breaks. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	Monitor: Use md_array_active() instead of manually fiddling in sysfs	Jes Sorensen	2017-05-08	1	-28/+11
\| \| \| \| \| \| \|	This removes a pile of clutter that can easily be handled with a simple check of array_state. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	mdadm/util: unify stat checking blkdev into function	Zhilong Liu	2017-05-05	1	-12/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Declare function stat_is_blkdev() to consolidate the repeated stat checks for block devices. It returns 'true/1' when the path is a block device and 'false/0' when it isn't. devname is a required parameter; rdev is optional: if the dev_t *rdev pointer is valid, the device number is assigned through it; if it is NULL, it is ignored. Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
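The contract described above can be sketched as follows. This is an illustration only: the function is named sketch_stat_is_blkdev to make clear it is not the helper from mdadm's util.c, whose error reporting and details may differ.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Sketch of the described contract (hypothetical name):
 * returns 1 if devname is a block device, 0 otherwise.
 * If rdev is non-NULL, the device number is stored through it
 * on success; a NULL rdev is simply ignored. */
int sketch_stat_is_blkdev(const char *devname, dev_t *rdev)
{
	struct stat stb;

	/* A path that cannot be stat'ed is treated as "not a block device". */
	if (stat(devname, &stb) != 0)
		return 0;
	if (!S_ISBLK(stb.st_mode))
		return 0;
	if (rdev != NULL)
		*rdev = stb.st_rdev;
	return 1;
}
```

A caller that only needs the yes/no answer passes NULL for rdev; a caller that also needs the major:minor pair passes the address of a dev_t.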
*	Retire mdassemble	Jes Sorensen	2017-04-11	1	-3/+0
\| \| \| \| \| \| \| \|	mdassemble doesn't handle container based arrays, no support for sysfs, etc. It has not been actively maintained for years, so time to send it off to retirement. Signed-off-by: Jes Sorensen <jsorensen@fb.com>
*	sysfs: Make sysfs_init() return an error code	Jes Sorensen	2017-03-30	1	-1/+3
\| \| \| \| \| \| \| \|	Rather than have the caller inspect the returned content, return an error code from sysfs_init(). In addition make all callers actually check it. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
*	util: Introduce md_get_disk_info()	Jes Sorensen	2017-03-29	1	-1/+1
\| \| \| \| \| \| \|	This removes all the inline ioctl calls for GET_DISK_INFO, allowing us to switch to sysfs in one place, and improves type checking. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
*	util: Introduce md_get_array_info()	Jes Sorensen	2017-03-29	1	-3/+4
\| \| \| \| \| \| \| \| \| \| \|	Remove most direct ioctl calls for GET_ARRAY_INFO, except for one, which will be addressed in the next patch. This is the start of the effort to clean up the use of ioctl calls and introduce a more structured API, which will use sysfs and fall back to ioctl for backup. Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
*	mdadm/Monitor: Fix NULL pointer dereference when stat2devnm return NULL	Zhilong Liu	2017-03-28	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \|	Wait(): stat2devnm() returns NULL for non block devices. Check that the pointer is valid before dereferencing it. This can happen when --wait is given a path that is not a block device, such as the 'f' and 'd' file types, and it causes a core dump, for example: ./mdadm --wait /dev/md/ Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Zhilong Liu <zlliu@suse.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
*	Monitor: release /proc/mdstat fd when no arrays present	Tomasz Majchrzak	2016-07-21	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If md kernel module is reloaded, /proc/mdstat cannot be accessed ("cat: /proc/mdstat: No such file or directory"). The reason is mdadm monitor still holds a file descriptor to previous /proc/mdstat instance. It leads to really confusing outcome of the following operations - mdadm seems to run without errors, however some udev rules don't get executed and new array doesn't work. Add a check if lseek was successful as it fails if md kernel module has been unloaded - close a file descriptor then. The problem is mdadm monitor doesn't always do it before next operation takes place. To prevent it monitor always releases /proc/mdstat descriptor when there are no arrays to be monitored, just in case driver unload happens in a moment. Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com> Reviewed-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
*	Monitor: Use sysfs_free() to free object returned by sysfs_read()	Jes Sorensen	2016-06-10	1	-1/+1
\| \| \| \| \| \| \|	We should always use sysfs_free() to release sysfs_* allocated objects. Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
*	Fix some type comparison problems	Xiao Ni	2016-02-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	As 26714713cd2bad9e0bf7f4669f6cc4659ceaab6c said, 32 bit signed timestamps will overflow in the year 2038. It already changed the utime and ctime in struct mdu_array_info_s from int to unsigned int. So we need to change the values that are compared with them to unsigned int too. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
*	Monitor: don't Wait forever on a 'frozen' array.	NeilBrown	2015-07-06	1	-2/+10
\| \| \| \| \| \| \|	If Wait() finds the array resync is 'frozen', then wait a little while to avoid races, but don't wait forever. Signed-off-by: NeilBrown <neilb@suse.com>
*	mdadm: monitor: fix nullptr dereference when get_md_name() returns NULL	Sergey Vidishev	2015-05-20	1	-1/+9
\| \| \| \| \| \| \| \| \|	Function add_new_arrays() expects get_md_name() to return a pointer to devname, but get_md_name() may also return NULL. So check the pointer before using it in add_new_arrays(). Signed-off-by: Sergey Vidishev <sergeyv@yandex-team.ru> Signed-off-by: NeilBrown <neilb@suse.de>
*	Monitor: use the "space protocol" for "Wrong-Level".	NeilBrown	2015-04-08	1	-1/+1
\| \| \| \| \| \| \|	"Wrong-Level" is a reason, not a component device, so it should start with a space to indicate this to alert(). Signed-off-by: NeilBrown <neilb@suse.de>
*	Monitor: Obey "space protocol" when writing to syslog.	NeilBrown	2015-04-08	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \|	"alert" treats the "disc" arg differently if it starts with a space. At least it does for sending email. It doesn't for writing to syslog. Make this consistent and obey the 'space protocol' when writing to syslog. Signed-off-by: NeilBrown <neilb@suse.de>
*	Don't break long strings onto multiple lines.	NeilBrown	2015-02-12	1	-23/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is best to keep strings all together so that they are easier to search for in the source code. If a string is so long that it looks ugly on one line, then maybe it should be broken into multiple lines for display too. Only strings which contain a newline can be broken into multiple lines: "It is OK to\n" "break this string\n" Signed-off-by: NeilBrown <neilb@suse.de>
*	Change way of printing name of a process	Pawel Baldysiak	2015-02-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Sometimes mdadm prints messages with wrong name "mdmon", and vice versa. This patch solves this problem by changing method of determining process name. Now "Name" will be set in const at start of a program, previously was hardcoded as #define. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	Monitor: fix for regression with container devices	Artur Paszkiewicz	2015-02-11	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \|	This patch fixes 2 problems introduced by commit 9a518d8: not closing a file descriptor and ignoring container devices. Array state is always "inactive" for containers, so we make sure that the device is not a container by reading also the "level" sysfs entry. Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com> Reviewed-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	Monitor: don't open md array that doesn't exist.	NeilBrown	2014-11-25	1	-1/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Opening a block-special-device for an array that doesn't exist causes that array to be instantiated (as an empty array). Races at array shutdown can cause the array to spontaneously re-appear if some daemon notices a 'change' event and goes to investigate. Teach "mdadm --monitor" to avoid this race by checking the "array_state" before opening the device. Reported-by: Francis Moreau <francis.moro@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	Monitor: Stop monitoring devices that have disappeared.	NeilBrown	2014-08-14	1	-6/+18
\| \| \| \| \| \| \| \|	If we are only monitoring a device because we found it in /proc/mdstat, and it has been gone for 5 checks, forget about it completely. Signed-off-by: NeilBrown <neilb@suse.de>
*	New function: sysfs_wait	NeilBrown	2013-07-01	1	-8/+2
\| \| \| \| \| \| \|	We have several places that wait for activity on a sysfs file. Combine most of these into a single 'sysfs_wait' function. Signed-off-by: NeilBrown <neilb@suse.de>
*	Remove lots of unnecessary white space.	NeilBrown	2013-06-19	1	-7/+5
\| \| \| \| \| \| \|	Now that I am using white-space mode in Emacs I can see all of this, and I don't like it :-) Signed-off-by: NeilBrown <neilb@suse.de>
*	Wait: also wait if an action is about to start.	NeilBrown	2013-05-01	1	-0/+13
\| \| \| \| \| \| \| \| \| \|	If a sync/recover action is about to start but hasn't actually begun yet, /proc/mdstat won't show it, but md/sync_action will (it checks MD_RECOVERY_NEEDED). So when /proc/mdstat seems to say nothing is happening, double check with md/sync_action. Signed-off-by: NeilBrown <neilb@suse.de>
*	Discard devnum in favour of devnm	NeilBrown	2013-02-21	1	-46/+45
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	We widely use a "devnum" which is 0 or +ve for md%d devices and -ve for md_d%d devices. But I want to be able to use md_%s device names. So get rid of devnum (a number) and use devnm (a 32char string). eg. md0 md_d2 md_home Signed-off-by: NeilBrown <neilb@suse.de>
*	Allow --wait to wait for delayed resync.	NeilBrown	2012-11-21	1	-1/+1
\| \| \| \| \| \| \| \|	If a resync is delayed, then e->percent will be negative but not RESYNC_NONE. In that case we still want to wait. Reported-by: Ross Boylan <ross@biostat.ucsf.edu> Signed-off-by: NeilBrown <neilb@suse.de>
*	Monitor: don't complain about non-monitorable arrays in mdadm.conf	NeilBrown	2012-10-24	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we are asked to monitor a RAID0 or Linear - which cannot be monitored - we complain with "Device Disappeared .... Wrong-Level". However if the RAID0 or Linear is being requested because it is in mdadm.conf then the message is inappropriate and confusing. So track which arrays are added from the config file, and suppress that message in that case. Reported-by: "Johnson Yan" <johnson_yan@usish.com> Signed-off-by: NeilBrown <neilb@suse.de>
*	Change Monitor to take a struct context	NeilBrown	2012-07-09	1	-13/+14
\| \| \| \|	Signed-off-by: NeilBrown <neilb@suse.de>
*	Remove scattered checks for malloc success.	NeilBrown	2012-07-09	1	-15/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	malloc should never fail, and if it does it is unlikely that anything else useful can be done. Best approach is to abort and let some super-daemon restart. So define xmalloc, xcalloc, xrealloc, xstrdup which don't fail but just print a message and exit. Then use those removing all the tests for failure. Also replace all "malloc;memset" sequences with 'xcalloc'. Signed-off-by: NeilBrown <neilb@suse.de>
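A rough sketch of the "allocate or exit" wrappers described above. The sketch_ prefix, the message text, and the exit status are assumptions made for illustration; they are not copied from mdadm's xmalloc.c.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Common failure path: report and exit rather than return NULL. */
static void sketch_alloc_fail(void)
{
	fputs("memory allocation failure - aborting\n", stderr);
	exit(EXIT_FAILURE);
}

void *sketch_xmalloc(size_t len)
{
	/* Ask for at least one byte so NULL always means failure. */
	void *rv = malloc(len ? len : 1);

	if (rv == NULL)
		sketch_alloc_fail();
	return rv;
}

/* Replaces the "malloc; memset" pattern with one zeroing call. */
void *sketch_xcalloc(size_t num, size_t size)
{
	void *rv = calloc(num ? num : 1, size ? size : 1);

	if (rv == NULL)
		sketch_alloc_fail();
	return rv;
}

char *sketch_xstrdup(const char *str)
{
	size_t len = strlen(str) + 1;
	char *rv = sketch_xmalloc(len);

	memcpy(rv, str, len);
	return rv;
}
```

Because these wrappers never return NULL, callers can drop their per-call failure checks, which is what allows the scattered tests to be removed.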
*	Introduce pr_err for printing error messages.	NeilBrown	2012-07-09	1	-12/+12
\| \| \| \| \| \| \|	'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": ' cont_err() is also available. Signed-off-by: NeilBrown <neilb@suse.de>