ceph: fix denial of service issue in ceph_update_snap_trace() #1440

Open: vfsci-bot[bot] wants to merge 1 commit into vfs.base.ci from pw/1098303/vfs.base.ci

vfsci-bot[bot] commented May 20, 2026

Series: https://patchwork.kernel.org/project/linux-fsdevel/list/?series=1098303
Submitter: Viacheslav Dubeyko
Version: 1
Patches: 1/1
Message-ID: <20260520214700.1265095-2-slava@dubeyko.com>
Base: vfs.base.ci
Lore: https://lore.kernel.org/linux-fsdevel/20260520214700.1265095-2-slava@dubeyko.com


Automated by ml2pr

A WARN() fires inside ceph_update_snap_trace() when the client
receives a malformed snap trace from the MDS. The kernel additionally
logs:

ceph_update_snap_trace do remount to continue after corrupted snaptrace is fixed

indicating that the client cannot recover the snap state and that the
operator must remount the filesystem. The warning is reached through the
call chain ceph_con_process_message() -> mds_dispatch()
-> ceph_update_snap_trace(). Impact: denial of service of the affected
mount until remount; the client refuses further snap-related operations
after the warning, so any open file in a snap realm becomes unusable.

[  230.026879] WARNING: fs/ceph/snap.c:926 at ceph_update_snap_trace+0x308/0x3300, CPU#1: kworker/1:1/58
[  230.028125] Modules linked in:
[  230.028427] CPU: 1 UID: 0 PID: 58 Comm: kworker/1:1 Not tainted 6.19.0-g44331bd6a610-dirty #9 PREEMPT(lazy)
[  230.029089] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  230.029815] Workqueue: ceph-msgr ceph_con_workfn
[  230.030177] RIP: 0010:ceph_update_snap_trace+0x31c/0x3300
[  230.030550] Code: fa 48 c1 ea 03 80 3c 02 00 0f 85 23 26 00 00 48 8d 3d 28 d2 3a 05 48 8b 73 28 49 89 e9 4d 89 e8 4c 89 f1 48 c7 c2 40 40 8e 9e <67> 48 0f b9 3a e8 3a 7c db fe 48 b8 00 00 00 00 00 fc ff df 4d 8d
[  230.031708] RSP: 0018:ffffc900003f7728 EFLAGS: 00010246
[  230.032084] RAX: dffffc0000000000 RBX: ffff88812282d000 RCX: ffffffff9e8e3b40
[  230.032533] RDX: ffffffff9e8e4040 RSI: 0000000000016f97 RDI: ffffffffa087e7d0
[  230.033008] RBP: ffffffff9e8e3c00 R08: ffffffff9e8e3b40 R09: ffffffff9e8e3c00
[  230.033460] R10: 00000000fffffffb R11: ffff888102928040 R12: ffff88811031a3e0
[  230.033937] R13: ffffffff9e8e3b40 R14: ffffffff9e8e3b40 R15: 1ffff9200007eefd
[  230.034401] FS:  0000000000000000(0000) GS:ffff888254a05000(0000) knlGS:0000000000000000
[  230.034941] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  230.035315] CR2: 00007f7efa7be740 CR3: 000000012435c000 CR4: 0000000000750ef0
[  230.035803] PKRU: 55555554
[  230.036001] Call Trace:
[  230.036175]  <TASK>
[  230.036342]  ? mds_dispatch+0x1ceb/0x6f60
[  230.036645]  ? __pfx___might_resched+0x10/0x10
[  230.037010]  ? iget5_locked+0x44/0xb0
[  230.037316]  ? __pfx_ceph_update_snap_trace+0x10/0x10
[  230.037670]  ? __pfx_down_write+0x10/0x10
[  230.038005]  mds_dispatch+0x1dd1/0x6f60
[  230.038306]  ? ceph_con_process_message+0x1ab/0x270
[  230.039024]  ? lock_release+0xc7/0x270
[  230.039321]  ? __pfx_mds_dispatch+0x10/0x10
[  230.039622]  ? __local_bh_enable_ip+0xa1/0x110
[  230.039992]  ceph_con_process_message+0x1f4/0x270
--
[  321.986495] libceph: failed to decode MOSDOpReply for tid 9: -22
[  330.121991] ------------[ cut here ]------------

The error path in ceph_update_snap_trace() is reached whenever snap trace
decoding fails. It has two entry points:
(1) bad: a ceph_decode_need() check fails because the encoded snap trace
does not contain enough bytes for the declared number of snaps or
prior-parent snaps.
(2) fail: reached directly on -ENOMEM returns from ceph_create_snap_realm(),
dup_array(), or adjust_snap_realm_parent().
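
The "bad" entry point can be illustrated with a minimal userspace sketch.
It is not the kernel code: need_bytes() is a hypothetical stand-in for
ceph_decode_need(), the record layout (a 32-bit little-endian snap count
followed by that many 64-bit snap ids) is a simplification assumed for
illustration, and the -EIO return value is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for ceph_decode_need(): jump to `label` unless
 * at least `n` bytes remain between `p` and `end`.
 */
#define need_bytes(p, end, n, label)                          \
	do {                                                  \
		if ((uint64_t)((end) - (p)) < (uint64_t)(n))  \
			goto label;                           \
	} while (0)

/*
 * Validate a simplified, assumed record layout:
 *   u32 num_snaps (little-endian), then num_snaps * u64 snap ids.
 * Returns 0 and stores the count on success, -EIO on a short buffer.
 */
static int decode_snaps(const uint8_t *buf, size_t len, uint32_t *num_out)
{
	const uint8_t *p = buf;
	const uint8_t *end = buf + len;
	uint32_t num;

	assert(buf != NULL && num_out != NULL);

	/* The fixed-size header must be present before it is read. */
	need_bytes(p, end, sizeof(uint32_t), bad);
	num = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	      ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
	p += sizeof(uint32_t);

	/* The declared count must be backed by actual payload bytes. */
	need_bytes(p, end, (uint64_t)num * sizeof(uint64_t), bad);

	*num_out = num;
	return 0;

bad:
	/* Malformed input: report an error instead of reading past `end`. */
	return -EIO;
}
```

In this sketch, a buffer that declares 2 snaps but carries only 8 payload
bytes takes the bad path and returns -EIO, which is the kind of input
that reaches the warning in the trace above.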

This patch fixes the issue by replacing WARN(1, ...) with
pr_warn_ratelimited().
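
The effect of rate limiting a warning can be sketched in userspace as
follows. This is only an analogy, not the kernel implementation: struct
ratelimit, ratelimit_ok() and warn_ratelimited() are hypothetical names,
and the caller supplies the clock so the behaviour is easy to check. The
5 second interval and burst of 10 used in the usage note mirror the
kernel's default ratelimit settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace rate-limit state (not the kernel's struct). */
struct ratelimit {
	long interval;  /* window length in seconds */
	int burst;      /* messages allowed per window */
	long begin;     /* start time of the current window */
	int printed;    /* messages emitted in the current window */
	int missed;     /* messages suppressed so far */
};

/* Return true if a message may be emitted at time `now` (seconds). */
static bool ratelimit_ok(struct ratelimit *rs, long now)
{
	assert(rs != NULL);

	/* Start a fresh window once the previous one has expired. */
	if (now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/* Emit `msg` to stderr unless the limit for this window is exhausted. */
static bool warn_ratelimited(struct ratelimit *rs, long now, const char *msg)
{
	if (!ratelimit_ok(rs, now))
		return false;
	fprintf(stderr, "warning: %s\n", msg);
	return true;
}
```

With interval = 5 and burst = 10, a stream of malformed messages arriving
within the same window produces at most 10 log lines; the rest are only
counted. Unlike WARN(), such a message does not print a stack trace.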

Fixes: 38d4640 ("ceph: print cluster fsid and client global_id in all debug logs")
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
cc: Alex Markuze <amarkuze@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Patrick Donnelly <pdonnell@redhat.com>
cc: Ceph Development <ceph-devel@vger.kernel.org>