| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
After 1db03765("Subdevs can't be all missing when create raid device")
raid volume can't be created with link to container. This feature should
not be blocked in Create function. IMSM code forbids creation of
container with missing disk, so case like all dev's missing is already
handled.
Permit IMSM volume creation when devices are given as link to container.
Signed-off-by: Michal Zylowski <michal.zylowski@intel.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
|
|
|
|
|
| |
Signed-off-by: Xiao Ni <xni@redhat.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
declare function fstat_is_blkdev() to integrate repeated fstat
checking block device operations, it returns true/1 when it is
a block device, and returns false/0 when it isn't.
The fd and devname are necessary parameters, *rdev is optional,
parse the pointer of dev_t *rdev, if valid, assigned the device
number to dev_t *rdev, if NULL, ignores.
Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an array is created the content is not initialized,
so it could have remnants of an old filesystem or md array
etc on it.
udev will see this and might try to activate it, which is almost
certainly not what is wanted.
So create a mechanism for mdadm to communicate with udev to tell
it that the device isn't ready. This mechanism is the existance
of a file /run/mdadm/created-mdXXX where mdXXX is the md device name.
When creating an array, mdadm will create the file.
A new udev rule file, 01-md-raid-creating.rules, will detect the
precense of thst file and set ENV{SYSTEMD_READY}="0".
This is fairly uniformly used to suppress actions based on the
contents of the device.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <jsorensen@fb.com>
|
|
|
|
|
|
| |
More legacy code moved to the bit-bucket.
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
| |
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
|
|
| |
These always go at the end of the line, never at the front
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
|
|
|
|
|
| |
Create:declaring 'struct stat stb' twice within the same
function, rename stb as stb2 when declares 'struct stat'
at the second time.
Signed-off-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
|
|
|
|
| |
Rather than have the caller inspect the returned content, return an
error code from sysfs_init(). In addition make all callers actually
check it.
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Remove most direct ioctl calls for GET_ARRAY_INFO, except for one,
which will be addressed in the next patch.
This is the start of the effort to clean up the use of ioctl calls and
introduce a more structured API, which will use sysfs and fall back to
ioctl for backup.
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable creating and assembling raid5 arrays with PPL for 1.x metadata.
When creating, reserve enough space for PPL and store its size and
location in the superblock and set MD_FEATURE_PPL bit. Write an initial
empty header in the PPL area on each device. PPL is stored in the
metadata region reserved for internal write-intent bitmap, so don't
allow using bitmap and PPL together.
While at it, fix two endianness issues in write_empty_r5l_meta_block()
and write_init_super1().
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new parameter to mdadm: --consistency-policy=. It determines how
the array maintains consistency in case of unexpected shutdown. This
maps to the md sysfs attribute 'consistency_policy'. It can be used to
create a raid5 array using PPL. Add the necessary plumbing to pass this
option to metadata handlers. The write journal and bitmap
functionalities are treated as different policies, which are implicitly
selected when using --write-journal or --bitmap options.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently use '1' to indicate that a flag (writemostly or failfast)
needs to be set, and '2' to indicate that it needs to be cleared.
Using magic number like this is not a best-practice.
So replaced them with values from a enum.
No functional change.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow per-device "failfast" flag to be set when creating an
array or adding devices to an array.
When re-adding a device which had the failfast flag, it can be removed
using --nofailfast.
failfast status is printed in --detail and --examine output.
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
|
|
|
|
|
|
|
|
| |
add_internal_bitmap() returned 1 on success and 0 on error which is
inconsistent. This changes it to return 0 on success and use more
reasonable error codes on error.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
|
|
|
|
|
|
|
|
|
| |
It doesn't make sense to create a clustered raid
with only 1 node.
Reported-by: Zhilong Liu <zlliu@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
|
|
|
|
|
|
|
|
|
| |
The check of "is there a filesystem here" is still appropriate for a
journal device.
Also set active_disks correctly - even though it is ignored.
Signed-off-by: NeilBrown <neilb@suse.com>
|
|
|
|
|
|
|
| |
Recent commit caused 'missing' declarations to not be handled correctly.
Fixes: cc1799c3ddc9 ("Enable create array with write journal (--write-journal DEVICE).")
Signed-off-by: NeilBrown <neilb@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specify the write journal device with --write-journal DEVICE
./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
Only one journal device is allowed. If multiple --write-journal
are given, mdadm will use the first and ignore others
./mdadm --create -f /dev/md0 --assume-clean -c 32 --raid-devices=4 --level=5 /dev/sd[c-f] --write-journal /dev/sdb1 --write-journal /dev/sdx
mdadm: Please specify only one journal device for the array.
mdadm: Ignoring --write-journal /dev/sdx...
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: NeilBrown <neilb@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels
to assemble a clustered device.
In order to maximize compatibility, the major version is set to
BITMAP_MAJOR_CLUSTERED *only* if the bitmap is clustered.
Also, added MD_FEATURE_CLUSTERED in order to return error
for older kernels which would assemble MD in case bitmap is
corrupted.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: NeilBrown <neilb@suse.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The home-cluster is stored in the bitmap super block of the
array. The device can be assembled on a cluster with the
cluster name same as the one recorded in the bitmap.
If home-cluster is not specified, this is auto-detected using
dlopen corosync cmap library.
neilb: allow code to compile when corosync-devel is not installed.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
Specifies the maximum number of nodes in the cluster that may use
this device simultaneously. This is equivalent to the number of
bitmaps created in the internal superblock (patches to follow).
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a clustered MD, create bitmaps equal to number of nodes so
each node has an independent bitmap.
Only the first bitmap is has the bits set so that the first node
that assembles the device also performs the sync.
The bitmaps are aligned to 4k boundaries.
On-disk format:
0 4k 8k 12k
-------------------------------------------------------------------
| idle | md super | bm super [0] + bits |
| bm bits[0, contd] | bm super[1] + bits | bm bits[1, contd] |
| bm super[2] + bits | bm bits [2, contd] | bm super[3] + bits |
| bm bits [3, contd] | | |
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is best to keep strings all together so that they
are easier to search for in the source code.
If a string is so long that it looks ugly one line,
them maybe it should be broken into multiple lines
for display too.
Only strings which contain a newline can be broken
into multiple lines:
"It is OK to\n"
"break this string\n"
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
Infrastructure is there, so use it.
This requires making sure that ->data_offset is correctly set, even
for containers.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For large arrays (component size > 100GB) if write-intent bitmap is not
enabled, then it is set by default to "internal", even if the metadata
format does support internal bitmaps, which causes Create to fail.
This patch adds checking if add_internal_bitmap is set in the
superswitch before setting bitmap_file to "internal".
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This modifies locking in Create to eliminate a situation where
--incremental can assemble a device between write_init_super() and
add_disk(), which causes Create to fail.
It sporadically occurs e.g. when metadata is written on a device,
causing an udev change event which triggers mdadm --incremental.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
| |
(and various cosmetic fixes)
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
Some people want to create truely enormous arrays.
As we sometimes need to hold one file descriptor for each
device, this can hit the NOFILE limit.
So raise the limit if it ever looks like it might be a problem.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
If module parameter start_ro is set, arrays start readonly.
This is OK when assembling, but is very surprising when creating
an array as the resync won't start.
So over-ride the setting (unless --read-only was given) make
arrays RW when created.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
With the 'devnm' infrastructure fixed, it is quite easy to support
names like "md_home" for md arrays.
The currently defaults to "off" and can be enabled in mdadm.conf with
CREATE names=yes
This is incase other tools get confused by the new names.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
| |
Test for VARIABLE_OFFSET was wrong.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
| |
Here, "large" means components are 100G or more. It is
usually beneficial to have write-intent bitmaps on such arrays.
They can be suppressed with --bitmap=none
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We widely use a "devnum" which is 0 or +ve for md%d devices
and -ve for md_d%d devices.
But I want to be able to use md_%s device names.
So get rid of devnum (a number) and use devnm (a 32char string).
eg.
md0
md_d2
md_home
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
| |
"freesize" can be equal 0, particularly after rounding to the chunk's size.
Creating should be aborted in such case.
Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
| |
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
| |
mdadm --create /dev/md0 .... /dev/sda1:1024 /dev/sdb1:2048 ...
The size is in K unless a suffix: K M G is given.
The suffix 's' means sectors.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
| |
This can be used to over-ride the automatic assignment of
data offset.
For --create, it is useful to re-create old arrays where different
defaults applied.
For --grow it may be able to force a reshape in the reverse direction.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
So if ->data_offset is already set, use that rather than
computing one.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
This is needed to return correct available size. It isn't
really used yet.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Usually, 'mdadm --detail-platform -e imsm' scans all the controllers
looking for IMSM capabilities. This patch provides the possibility
to specify a controller to scan, enabling custom usage by other
processes - especially with the --export switch.
$ mdadm --detail-platform
Platform : Intel(R) Matrix Storage Manager
Version : 9.5.0.1037
RAID Levels : raid0 raid1 raid10 raid5
Chunk Sizes : 4k 8k 16k 32k 64k 128k
2TB volumes : supported
2TB disks : not supported
Max Disks : 7
Max Volumes : 2 per array, 4 per controller
I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)
$ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.2
Platform : Intel(R) Matrix Storage Manager
Version : 9.5.0.1037
RAID Levels : raid0 raid1 raid10 raid5
Chunk Sizes : 4k 8k 16k 32k 64k 128k
2TB volumes : supported
2TB disks : not supported
Max Disks : 7
Max Volumes : 2 per array, 4 per controller
I/O Controller : /sys/devices/pci0000:00/0000:00:1f.2 (SATA)
$ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.2 --export
MD_FIRMWARE_TYPE=imsm
IMSM_VERSION=9.5.0.1037
IMSM_SUPPORTED_RAID_LEVELS=raid0 raid1 raid10 raid5
IMSM_SUPPORTED_CHUNK_SIZES=4k 8k 16k 32k 64k 128k
IMSM_2TB_VOLUMES=yes
IMSM_2TB_DISKS=no
IMSM_MAX_DISKS=7
IMSM_MAX_VOLUMES_PER_ARRAY=2
IMSM_MAX_VOLUMES_PER_CONTROLLER=4
$ mdadm --detail-platform /sys/devices/pci0000:00/0000:00:1f.0 # This isn't an IMSM-capable controller
mdadm: no active Intel(R) RAID controller found under /sys/devices/pci0000:00/0000:00:1f.0
Signed-off-by: Maciej Naruszewicz <maciej.naruszewicz@intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
| |
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
| |
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
| |
'size' is already unsigned long long, so no need to cast it.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
| |
Both are impossible, and '1' allows size to be unsigned,
which is neater.
Also #define MAX_SIZE to be '1' to make it all more explicit.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
| |
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
| |
Allow array to be created read-only
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
malloc should never fail, and if it does it is unlikely
that anything else useful can be done. Best approach is to
abort and let some super-daemon restart.
So define xmalloc, xcalloc, xrealloc, xstrdup which don't
fail but just print a message and exit. Then use those
removing all the tests for failure.
Also replace all "malloc;memset" sequences with 'xcalloc'.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
| |
'pr_err("' is a lot shorter than 'fprintf(stderr, Name ": '
cont_err() is also available.
Signed-off-by: NeilBrown <neilb@suse.de>
|
|
|
|
|
|
|
|
|
|
|
| |
RAID1 arrays don't have a chunk size, but if you ever convert
one to RAID5 you will need at least a small one >= 4K.
So round of size to a multiple of 64K.
This only affect Create, not "--grow --size=max". The latter
is too hard and with smaller returns.
Signed-off-by: NeilBrown <neilb@suse.de>
|