ceph - ceph

	Commit message (Collapse)	Author	Age	Files	Lines
*	v0.46	Sage Weil	2012-04-30	2	-1/+7
\|
*	librbd: use unique error code for image removal failures	Josh Durgin	2012-04-30	2	-3/+10
\| \| \| \| \| \| \|	This allows the rbd tool to provide a useful error message, instead of compounding more possible causes into one error code. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
*	FileJournal: simply flush by waiting for completions to empty	Samuel Just	2012-04-27	3	-46/+6
\| \| \| \|	Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
*	PG: in GetInfo Notify handler, fix peer_info_requested filter	Samuel Just	2012-04-27	1	-1/+1
\| \| \| \|	Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
*	librados: test get/set of debug levels	Sage Weil	2012-04-27	1	-0/+35
\| \| \| \| \| \|	Also do some sanity checks on the subsystem log level settings. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
*	config: allow {get,set}_val on subsystem debug levels	Sage Weil	2012-04-27	2	-3/+37
\| \| \| \| \| \| \| \| \|	This allows you to get and set subsystem debug levels via the normal config access methods. Among other things, this allows librados users to set debug levels. Fixes: #2350 Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
*	librbd: the length argument of aio_discard should be uint64_t	Josh Durgin	2012-04-27	3	-5/+5
\| \| \| \| \| \|	size_t was accidentally copy-pasted. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
*	filestore: interprect any fiemap error as EOPNOTSUPP	Sage Weil	2012-04-27	1	-1/+1
\| \| \| \| \| \|	On 2.6.32-5-amd64 (debian) and XFS I'm getting EINVAL. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
*	filestore: fix a journal replay issue with collection_add()	Joao Eduardo Luis	2012-04-27	1	-0/+8
\| \| \| \|	Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
*	osd: filter osds removed from probe set from peer_info_requested	Sage Weil	2012-04-27	1	-51/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	peer_info_requested should be a strict subset of the probe set. Filter osds that are dropped from probe from peer_info_requested.
	We could also restart peering from scratch here, but this is less expensive, because we don't have to re-probe everyone.
	Once we adjust the probe and peer_info_requested sets, (re)check if we're done: we may have been blocked on a previous peer_info_requested entry.
	The situation I saw was:
	  "recovery_state": [
	    { "name": "Started\/Primary\/Peering\/GetInfo",
	      "enter_time": "2012-04-25 14:39:56.905748",
	      "requested_info_from": [
	        { "osd": 193}]},
	    { "name": "Started\/Primary\/Peering",
	      "enter_time": "2012-04-25 14:39:56.905748",
	      "probing_osds": [ 79, 191, 195],
	      "down_osds_we_would_probe": [],
	      "peering_blocked_by": []},
	    { "name": "Started",
	      "enter_time": "2012-04-25 14:39:56.905742"}]}
	Once in this state, cycling osd.193 doesn't help, because the prior_set is not affected.
	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
	Reviewed-by: Samuel Just <samuel.just@dreamhost.com>
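The filtering step in this commit message can be illustrated with a small sketch. This is a Python illustration of the idea only, not the OSD's C++ code; `filter_requested`, `probe`, and `requested` are hypothetical names, and "done" is simplified here to "no outstanding info requests remain".

```python
def filter_requested(probe, requested):
    """Drop OSDs that left the probe set, then re-check completion.

    probe     -- set of OSD ids currently being probed
    requested -- set of OSD ids we asked for info and still wait on
    Returns (new_requested, done).
    """
    # Keep the outstanding requests a subset of the probe set.
    new_requested = requested & probe
    # (Re)check: we may have been blocked only on a dropped entry.
    done = len(new_requested) == 0
    return new_requested, done


# The situation from the commit message: still waiting on osd 193,
# which is not among the probing OSDs 79, 191, 195.
requested, done = filter_requested({79, 191, 195}, {193})
print(requested, done)
```

Without the filter, the stale entry for osd 193 would keep the wait from ever completing, which matches the stuck GetInfo state quoted in the message.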
*	PG: get_infos() should not post GotInfo	Samuel Just	2012-04-27	1	-4/+3
\| \| \| \| \| \| \|	The MNotifyRec handler also posts GotInfo under the same conditions after calling get_infos(). Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
*	Revert "PG: whitelist MNotifyRec in started"	Samuel Just	2012-04-27	1	-2/+0
\| \| \| \|	This reverts commit 9579365720818125a4b15741ae65e58948b9c69f.
*	PG: whitelist MNotifyRec in started	Samuel Just	2012-04-26	1	-0/+2
\| \| \| \|	Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
*	mon: decode old PGMap Incrementals differently from new ones	Greg Farnum	2012-04-25	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \|	We need to distinguish between the old 0 (meaning undefined) and the new 0 (meaning switch to 0 and disable the flags). So rev the encoding version on PGMap::Incremental, and if you decode an old version with [near]full_ratio == 0, set the ratio to -1 instead. Then when applying the Incremental interpret -1 as no change. Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com> Reviewed-by: Sage Weil <sage@newdream.net>
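A rough sketch of the decode rule described here, in Python rather than the monitor's C++. The version numbers, `NO_CHANGE`, `decode_ratio`, and `apply_ratio` are assumptions made for illustration; only the "old 0 becomes -1, and -1 means no change" behavior comes from the commit message.

```python
NO_CHANGE = -1.0  # sentinel: leave the current ratio untouched


def decode_ratio(struct_v, raw_ratio, new_version=2):
    """Interpret a decoded [near]full_ratio for a given encoding version.

    new_version is a hypothetical number for the revved encoding.
    """
    if struct_v < new_version and raw_ratio == 0:
        # Old encoders used 0 to mean "undefined".
        return NO_CHANGE
    # New encoders use 0 to mean "switch to 0 and disable the flags".
    return raw_ratio


def apply_ratio(current, inc_ratio):
    """Apply an incremental's ratio to the current value."""
    if inc_ratio == NO_CHANGE:
        return current
    return inc_ratio


print(apply_ratio(0.95, decode_ratio(1, 0)))  # old encoding, 0 decoded
print(apply_ratio(0.95, decode_ratio(2, 0)))  # new encoding, 0 decoded
```

The sentinel keeps the two meanings of 0 apart after decoding, so the apply step does not need to know which version produced the incremental.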
*	Merge remote branch 'origin/wip-rbd-snapid' into next	Josh Durgin	2012-04-24	2	-62/+113
\|\ \| \| \| \| \| \|	Reviewed-by: Sage Weil <sage.weil@dreamhost.com>
\| *	test_rbd: add tests for snap_set and more complicated resizing	Josh Durgin	2012-04-24	1	-2/+56
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* snap_set to a deleted (and recreated) snapshot
	* resizing down (truncating) and back up
	* resizing to non-object-aligned sizes
	Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
\| *	librbd: reset needs_refresh flag before re-reading header	Josh Durgin	2012-04-24	1	-4/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This way we can't miss an update if we get a notify during ictx_refresh. Specifically, a race like this:
	  Thread 1                Thread 2                Process 2
	  ictx_refresh()
	    read_header()
	                                                  snap_create()
	                                                  notify()
	                          need_refresh = true
	    process header...
	    need_refresh = false
	If this happened, we would not re-read the header with the new snapshot, so the snapshot would not happen at the intended point in time, but only after we re-read the header again.
	Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
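The ordering fix can be shown with a deterministic simulation. This is a Python sketch, not librbd code; `ImageCtx` here is a stand-in class, and `refresh`, `racy_read`, and `header` are hypothetical names. The notify is simulated as a side effect of the read callback.

```python
class ImageCtx:
    """Minimal stand-in holding only the state needed for the sketch."""

    def __init__(self):
        self.needs_refresh = True
        self.snaps = []


def refresh(ictx, read_header):
    # Clear the flag *before* reading, so a notify that arrives during
    # the read sets it again and forces another refresh later.
    ictx.needs_refresh = False
    ictx.snaps = read_header()


header = ["snap1"]
ictx = ImageCtx()


def racy_read():
    seen = list(header)          # header content as read
    header.append("snap2")       # another process creates a snapshot...
    ictx.needs_refresh = True    # ...and its notify lands mid-refresh
    return seen


refresh(ictx, racy_read)
print(ictx.snaps, ictx.needs_refresh)
```

Because the flag is still set after the refresh, the next check would re-read the header and pick up the new snapshot. Clearing the flag after the read would have overwritten the notify.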
\| *	librbd: clean up snapshot handling a bit	Josh Durgin	2012-04-24	1	-51/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* snapid should determine whether our mapped snapshot is gone, not snapname
	* snap_set(<nonexistent_snap>) shouldn't reset us to CEPH_NOSNAP
	* snapname should be set before using it in the perfcounter name
	* snapname and image name don't need to be passed as arguments since an ImageCtx already contains that info
	* ictx_check() doesn't need to check for non-existent snaps - only I/Os care, so check in check_io() instead
	Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
\| *	librbd: clarify handle_sparse_read condition	Josh Durgin	2012-04-24	1	-5/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The earlier condition is >. != means < at this point, and the nesting is unnecessary. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
* \|	librbd: pass errors removing head back to user	Sage Weil	2012-04-24	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In particular, the OSD may return EBUSY if there are still watchers. Ignore ENOENT, as that may indicate we are cleaning up a previously aborted removal. Fixes: #2311 Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
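The error policy described here is small enough to sketch directly. This is a Python illustration under the assumption that results follow the usual "0 or negative errno" convention; `remove_head_result` is a hypothetical helper, not a librbd function.

```python
import errno


def remove_head_result(r):
    """Decide what to return to the caller after removing the head object.

    r is 0 on success or a negative errno value.
    """
    if r < 0 and r != -errno.ENOENT:
        # Real failure (for example EBUSY while watchers remain):
        # pass it back to the user.
        return r
    # Success, or ENOENT from cleaning up a previously aborted removal.
    return 0


print(remove_head_result(-errno.EBUSY))
print(remove_head_result(-errno.ENOENT))
```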
* \|	mon: clean up handle_osd_timeouts a bit	Sage Weil	2012-04-24	1	-6/+4
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>
* \|	mon: fix pg stats timeout	Sage Weil	2012-04-24	2	-8/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We clear out the osd entry when an osd goes up or down. Thus, if we find it missing from an up osd, we should start the timer. Otherwise we get behavior like this
	  2012-04-24 13:22:47.888291 7fa5bc587700 mon.peon5752@0(leader).osd e21633 OSDMonitor::handle_osd_timeouts: never got MOSDPGStat info from osd 521. Marking down!
	  2012-04-24 13:22:50.076394 7fa5bcd88700 log [INF] : osd.521 [2607:f298:4:2243::7088]:6806/53217 boot
	  2012-04-24 13:22:52.903558 7fa5bc587700 mon.peon5752@0(leader).osd e21638 OSDMonitor::handle_osd_timeouts: never got MOSDPGStat info from osd 521. Marking down!
	  2012-04-24 13:23:15.144532 7fa5bcd88700 log [INF] : osd.521 [2607:f298:4:2243::7088]:6806/53217 boot
	  2012-04-24 13:23:17.967118 7fa5bc587700 mon.peon5752@0(leader).osd e21663 OSDMonitor::handle_osd_timeouts: never got MOSDPGStat info from osd 521. Marking down!
	  2012-04-24 13:23:22.173778 7fa5bcd88700 log [INF] : osd.521 [2607:f298:4:2243::7088]:6806/53217 boot
	  2012-04-24 13:23:22.981556 7fa5bc587700 mon.peon5752@0(leader).osd e21668 OSDMonitor::handle_osd_timeouts: never got MOSDPGStat info from osd 521. Marking down!
	  2012-04-24 13:23:45.245380 7fa5bcd88700 log [INF] : osd.521 [2607:f298:4:2243::7088]:6806/53217 boot
	when the pg stats message doesn't arrive quickly enough.
	Fixes: #2341
	Signed-off-by: Sage Weil <sage@newdream.net>
	Reviewed-by: Greg Farnum <gregory.farnum@dreamhost.com>
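A sketch of the timer behavior this message describes, written in Python for illustration. The function name, the `last_report` dictionary, and the numeric timeout are assumptions; the monitor's actual data structures are not shown on this page.

```python
def check_osd_timeouts(now, up_osds, last_report, timeout):
    """Return the OSDs to mark down; start timers for unseen up OSDs.

    last_report maps osd id -> time of the last pg stats report, and
    is cleared for an OSD whenever it goes up or down.
    """
    to_mark_down = []
    for osd in sorted(up_osds):
        if osd not in last_report:
            # Entry was cleared by a state change: start the timer now
            # instead of marking the OSD down right away.
            last_report[osd] = now
        elif now - last_report[osd] > timeout:
            to_mark_down.append(osd)
    return to_mark_down


last_report = {}
print(check_osd_timeouts(100, {521}, last_report, timeout=30))
print(check_osd_timeouts(131, {521}, last_report, timeout=30))
```

Treating a missing entry as "never reported" is what produced the boot and mark-down loop in the log above: a freshly booted OSD has no entry until its first stats message arrives.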
* \|	mon: fix whitespace	Sage Weil	2012-04-24	1	-2/+2
\| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* \|	mon: fix pgmonitor ratio commands	Greg Farnum	2012-04-24	1	-2/+2
\|/ \| \| \| \| \| \|	The indices were set incorrectly when I whipped this up. That's what you get for not testing nor being careful enough in review. Signed-off-by: Greg Farnum <gregory.farnum@dreamhost.com>
*	run_seed_to.sh: rework the script, make it more flexible and broaden the tests.	Sage Weil	2012-04-24	1	-26/+268
\| \| \| \| \| \| \| \| \|	Allow for '-h' and other options such as disabling the journal sync tests, declaring that the script is to be run on a btrfs FS, enabling exit on error (default is now 'off'), and allow certain env variables to specify additional options to each store. Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
*	librbd: rev version for discard addition	Sage Weil	2012-04-24	1	-1/+1
\| \| \| \|	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
*	Merge remote-tracking branch 'gh/wip-discard'	Sage Weil	2012-04-23	11	-278/+808
\|\
\| *	perfcounters: tolerate multiple loggers with the same name	Sage Weil	2012-04-22	2	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make them unique by appending -<ptr>, so that the json we dump will remain valid. We may also want to allow people to share counters of the same type. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
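The renaming idea can be sketched as follows. This is Python, not the perfcounters C++ code: `register_logger` is a hypothetical helper, `id()` stands in for the pointer value, and renaming only on a collision is an assumption about when the suffix is applied.

```python
import json


def register_logger(loggers, name, logger):
    """Register logger under name, appending -<id> on a name collision."""
    if name in loggers:
        # Make the key unique so the dumped JSON has no duplicate keys.
        name = "%s-%x" % (name, id(logger))
    loggers[name] = logger
    return name


loggers = {}
a, b = {"rd": 1}, {"rd": 2}
first = register_logger(loggers, "librbd", a)
second = register_logger(loggers, "librbd", b)
print(json.dumps(loggers, sort_keys=True))
```

Both sets of counters survive in the dump under distinct keys, at the cost of a name that is not stable across runs.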
\| *	Merge branch 'master' into wip-discard	Sage Weil	2012-04-22	8	-241/+356
\| \|\
\| * \|	Makefile: disable format-security warning	Sage Weil	2012-04-22	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The prt() varargs function generates this warning
	  test/rbd/fsx.c: In function ‘prt’:
	  warning: test/rbd/fsx.c:203:2: format not a string literal and no format arguments [-Wformat-security]
	  warning: test/rbd/fsx.c:205:3: format not a string literal and no format arguments [-Wformat-security]
	Disable that check for the fsx build only.
	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| * \|	filestore: verify that fiemap works	Sage Weil	2012-04-21	2	-16/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Check for a bug present in older versions of ext4. If present, disable FIEMAP. See #2328. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| * \|	rados: fix error printout for mapext	Sage Weil	2012-04-21	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| * \|	librbd: instrument with perfcounters	Sage Weil	2012-04-21	1	-7/+97
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Track IO operations on a per-image basis. Implements: #1451 Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
\| * \|	librbd: allow image resize to non-block boundaries	Sage Weil	2012-04-21	1	-5/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The caller is still invalidating the entire cache, so we don't need to deal with discard at this level. That might be worth cleaning up later, though. Fixes: #2296 Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	objectcacher: rename truncate_set -> discard_set, and use discard	Sage Weil	2012-04-21	4	-21/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do not assume the object extents are at the trailing edge of objects. Instead, discard arbitrary extents. Fix callers. Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	objectcacher: implement Object::discard()	Sage Weil	2012-04-21	2	-1/+35
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Discard a range of bytes from an object. Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	librbd: fix debug output	Sage Weil	2012-04-21	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	objects is misleading here, these are byte offsets Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	librbd: make discard invalidate the range in cache	Sage Weil	2012-04-21	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fed this to test_librbd_fsx and it was happy. Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	librbd: fix zeroing of trailing bits on short reads that span objects	Sage Weil	2012-04-21	1	-10/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	handle_sparse_read() was taking buf_ofs and buf_len, but buf_len was being interpreted as the total size of the buffer, not the length of the extent in the buffer starting at buf_ofs. Both callers pass in an extent length, so fix the zero code to do the right thing.
	Specifically, the behavior I saw was:
	- read range spanning 2 objects, trailing 20k and leading 50k
	- first object didn't exist, zeroed first 20k of buffer
	- second object didn't exist, zeroed next 30k (50k-20k) of buffer
	- the last 20k of buffer was unzeroed.
	Signed-off-by: Sage Weil <sage@newdream.net>
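The corrected interpretation of `buf_len` can be sketched in Python. `zero_extent_tail` and `bytes_read` are hypothetical names used for illustration; only the meaning of `buf_ofs` and `buf_len` is taken from the commit message.

```python
def zero_extent_tail(buf, buf_ofs, buf_len, bytes_read):
    """Zero the unread tail of the extent [buf_ofs, buf_ofs + buf_len)."""
    start = buf_ofs + bytes_read
    end = buf_ofs + buf_len  # end of the extent, not total buffer size
    if start < end:
        buf[start:end] = bytes(end - start)


K = 1024
buf = bytearray(b"\xff" * (70 * K))
# Two missing objects: a 20k extent followed by a 50k extent.
zero_extent_tail(buf, 0, 20 * K, 0)
zero_extent_tail(buf, 20 * K, 50 * K, 0)
print(buf == bytes(70 * K))
```

Treating `buf_len` as the buffer's end offset would have zeroed only bytes 20k to 50k in the second call, leaving the last 20k untouched, which is the behavior reported in the message.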
\| * \|	librbd: fix debug output for image resize	Sage Weil	2012-04-21	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Print old -> new, not new -> old. Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	test_librbd_fsx: port newer xfsprogs version	Sage Weil	2012-04-21	1	-411/+176
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Builds and runs... for a few ops at least. Signed-off-by: Sage Weil <sage@newdream.net>
\| * \|	revert to xfstests' fsx, which has discard support	Sage Weil	2012-04-21	1	-332/+862
\| \| \|
* \| \|	run_seed_to.sh: remove stray arg	Sage Weil	2012-04-23	1	-1/+0
\| \|/ \|/\| \| \| \| \| \| \| \| \|	This crept in in commit d1740bd586db80068fc0292223cf21911de66428. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* \|	librbd: fix ictx_check pointer weirdness by using std::string	Sage Weil	2012-04-21	1	-4/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I was seeing failures of LibRBD.TestIOToSnapshot where we would fail to refresh after rollback, even though the snap existed. I assume it is because the std::string whose c_str() we were pointing to was reallocated. Use a std::string here instead. This code is weird. Signed-off-by: Sage Weil <sage@newdream.net>
* \|	FileJournal: don't wait flusher until completions are queued	Samuel Just	2012-04-21	1	-19/+20
\| \| \| \| \| \| \| \| \| \|	Fixes: #2324 Signed-off-by: Samuel Just <samuel.just@dreamhost.com>
* \|	filestore: fix collection_add journal replay problem	Sage Weil	2012-04-21	3	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In collection_add we have a two-phase guard set on the linked object via the old name. During replay, we might see that the dest name is missing and replay the operation, and in the process overwrite a newer guard with an older one. Avoid this by checking the source name too, and skipping the operation entirely if a new guard exists. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* \|	FileStoreDiff: flip sense of diff*() methods around	Sage Weil	2012-04-21	2	-42/+66
\| \| \| \| \| \| \| \| \| \| \| \|	true means diff, false means same. Signed-off-by: Sage Weil <sage.weil@dreamhost.com>
* \|	test_idempotent_sequence: Use FileStoreDiff class instead.	Sage Weil	2012-04-21	4	-210/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use FileStoreDiff instead of having the diff code embedded in the test, allowing for more tests and people to use the code in case it comes in handy. Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
* \|	test_idempotent_sequence: Output missing options on "usage".	Joao Eduardo Luis	2012-04-21	1	-1/+3
\| \| \| \| \| \| \| \|	Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>
* \|	FileStoreDiff: check if two FileStores match.	Joao Eduardo Luis	2012-04-21	2	-0/+282
\|/ \| \| \| \| \| \| \|	This code should be in a stand-alone class, instead of being embedded in a single test, in case someone or something finds it useful somewhere down the line. Signed-off-by: Joao Eduardo Luis <jecluis@gmail.com>