ceph: fix denial of service issue in ceph_update_snap_trace() #1440
Open
vfsci-bot[bot] wants to merge 1 commit into
A WARN_ON() fires inside ceph_update_snap_trace() when the client receives a malformed snap trace from the MDS. The kernel additionally logs:

  ceph_update_snap_trace do remount to continue after corrupted snaptrace is fixed

indicating that the client cannot recover the snap state and that the operator must remount the filesystem. The warning is reached through the call chain ceph_con_process_message() -> mds_dispatch() -> ceph_update_snap_trace().

Impact: denial of service of the affected mount until remount; the client refuses further snap-related operations after the warn, so any open file in a snap-realm becomes unusable.

[ 230.026879] WARNING: fs/ceph/snap.c:926 at ceph_update_snap_trace+0x308/0x3300, CPU#1: kworker/1:1/58
[ 230.028125] Modules linked in:
[ 230.028427] CPU: 1 UID: 0 PID: 58 Comm: kworker/1:1 Not tainted 6.19.0-g44331bd6a610-dirty #9 PREEMPT(lazy)
[ 230.029089] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 230.029815] Workqueue: ceph-msgr ceph_con_workfn
[ 230.030177] RIP: 0010:ceph_update_snap_trace+0x31c/0x3300
[ 230.030550] Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 23 26 00 00 48 8d 3d 28 d2 3a 05 48 8b 73 28 49 89 e9 4d 89 e8 4c 89 f1 48 c7 c2 40 40 8e 9e <67> 48 0f b9 3a e8 3a 7c db fe 48 b8 00 00 00 00 00 fc ff df 4d 8d
[ 230.031708] RSP: 0018:ffffc900003f7728 EFLAGS: 00010246
[ 230.032084] RAX: dffffc0000000000 RBX: ffff88812282d000 RCX: ffffffff9e8e3b40
[ 230.032533] RDX: ffffffff9e8e4040 RSI: 0000000000016f97 RDI: ffffffffa087e7d0
[ 230.033008] RBP: ffffffff9e8e3c00 R08: ffffffff9e8e3b40 R09: ffffffff9e8e3c00
[ 230.033460] R10: 00000000fffffffb R11: ffff888102928040 R12: ffff88811031a3e0
[ 230.033937] R13: ffffffff9e8e3b40 R14: ffffffff9e8e3b40 R15: 1ffff9200007eefd
[ 230.034401] FS: 0000000000000000(0000) GS:ffff888254a05000(0000) knlGS:0000000000000000
[ 230.034941] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 230.035315] CR2: 00007f7efa7be740 CR3: 000000012435c000 CR4: 0000000000750ef0
[ 230.035803] PKRU: 55555554
[ 230.036001] Call Trace:
[ 230.036175] <TASK>
[ 230.036342] ? mds_dispatch+0x1ceb/0x6f60
[ 230.036645] ? __pfx___might_resched+0x10/0x10
[ 230.037010] ? iget5_locked+0x44/0xb0
[ 230.037316] ? __pfx_ceph_update_snap_trace+0x10/0x10
[ 230.037670] ? __pfx_down_write+0x10/0x10
[ 230.038005] mds_dispatch+0x1dd1/0x6f60
[ 230.038306] ? ceph_con_process_message+0x1ab/0x270
[ 230.039024] ? lock_release+0xc7/0x270
[ 230.039321] ? __pfx_mds_dispatch+0x10/0x10
[ 230.039622] ? __local_bh_enable_ip+0xa1/0x110
[ 230.039992] ceph_con_process_message+0x1f4/0x270
--
[ 321.986495] libceph: failed to decode MOSDOpReply for tid 9: -22
[ 330.121991] ------------[ cut here ]------------

The error path in ceph_update_snap_trace() is reached whenever snap trace decoding fails. The two entry points are:

(1) bad: a ceph_decode_need() macro fails when the encoded snap trace does not contain enough bytes for the declared number of snaps or prior-parent snaps.

(2) fail: reached directly from -ENOMEM returns (ceph_create_snap_realm(), dup_array(), adjust_snap_realm_parent()).

This patch fixes the issue by replacing WARN(1, ...) with pr_warn_ratelimited().

Fixes: 38d4640 ("ceph: print cluster fsid and client global_id in all debug logs")
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
cc: Alex Markuze <amarkuze@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Patrick Donnelly <pdonnell@redhat.com>
cc: Ceph Development <ceph-devel@vger.kernel.org>
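For readers unfamiliar with the pattern, the two error labels described above can be illustrated with a small user-space sketch. This is not the code in fs/ceph/snap.c: the wire format (a u32 count followed by that many u64 snap ids in host byte order), the function decode_snap_trace(), the helper warn_ratelimited(), and the choice of -EIO for the bad label are all assumptions made for illustration. The helper only imitates the idea of a rate-limited warning with a fixed budget; it is not the kernel's pr_warn_ratelimited().

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a rate-limited warning: emit at most
 * warn_budget messages, then only count what was suppressed. */
static int warn_budget = 3;
static int warn_suppressed;

static void warn_ratelimited(const char *msg, int err)
{
	if (warn_budget > 0) {
		warn_budget--;
		fprintf(stderr, "warning: %s (err %d)\n", msg, err);
	} else {
		warn_suppressed++;
	}
}

/* Decode a hypothetical trace: u32 num_snaps, then num_snaps u64 ids.
 * Returns 0 on success or a negative errno value on failure. */
static int decode_snap_trace(const uint8_t *p, const uint8_t *end,
			     uint64_t *snaps, size_t cap, size_t *count)
{
	uint32_t num_snaps;
	int err;

	/* Bounds check before reading the header. */
	if ((size_t)(end - p) < sizeof(num_snaps))
		goto bad;
	memcpy(&num_snaps, p, sizeof(num_snaps));
	p += sizeof(num_snaps);

	/* The declared count must fit in the bytes that remain. */
	if ((size_t)(end - p) / sizeof(uint64_t) < num_snaps)
		goto bad;

	/* Resource failure: jumps straight to fail, skipping bad. */
	if (num_snaps > cap) {
		err = -ENOMEM;
		goto fail;
	}

	memcpy(snaps, p, (size_t)num_snaps * sizeof(uint64_t));
	*count = num_snaps;
	return 0;

bad:
	err = -EIO;	/* assumed error code for a malformed trace */
fail:
	/* Report and return instead of emitting a full backtrace. */
	warn_ratelimited("corrupted snap trace", err);
	return err;
}
```

In this sketch a peer that repeatedly sends truncated traces produces a bounded number of log lines and an error return for each message, rather than a backtrace per message.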
Series: https://patchwork.kernel.org/project/linux-fsdevel/list/?series=1098303
Submitter: Viacheslav Dubeyko
Version: 1
Patches: 1/1
Message-ID: <20260520214700.1265095-2-slava@dubeyko.com>
Base: vfs.base.ci
Lore: https://lore.kernel.org/linux-fsdevel/20260520214700.1265095-2-slava@dubeyko.com
Automated by ml2pr