
[pull] master from axboe:master#314

Open
pull[bot] wants to merge 1374 commits into kubestone:master from axboe:master

Conversation


pull[bot] commented Dec 10, 2021

See Commits and Changes for more details.



calebsander and others added 27 commits July 17, 2025 11:02
…seen events"

This reverts commit ae8646a.

fio_ioring_cqring_reap() returns up to max - events CQEs. However, the
return value of fio_ioring_cqring_reap() is used to both add to events
and subtract from max. This means that if fewer than min CQEs are
available and the CQ needs to be polled again, max is effectively
lowered by the number of CQEs that were available. Adding to events is
sufficient to ensure the next call to fio_ioring_cqring_reap() will only
return the remaining CQEs. Commit ae8646a ("engines/io_uring:
update getevents max to reflect previously seen events") added an
incorrect subtraction from max as well, so revert it.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Fixes: ae8646a ("engines/io_uring: update getevents max to reflect previously seen events")
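The corrected accumulation logic above can be sketched as follows. This is a minimal stand-in, not fio's actual code: `reap()` models fio_ioring_cqring_reap()'s contract (return up to max - events CQEs), and the loop only adds to events while max stays fixed.

```c
#include <assert.h>

/* Hypothetical stand-in for the CQ: 'available' CQEs are ready.
 * reap() returns at most (max - events) of them, mirroring the
 * contract described in the commit message above. */
static unsigned reap(unsigned available, unsigned events, unsigned max)
{
	unsigned want = max - events;

	return available < want ? available : want;
}

/* The corrected loop: only 'events' grows; 'max' is never lowered. */
static unsigned getevents(unsigned available, unsigned min, unsigned max)
{
	unsigned events = 0;

	do {
		events += reap(available, events, max);
		available = 0;		/* pretend the CQ is drained */
		if (events >= min)
			break;
		/* real code would io_uring_enter() here to wait;
		 * simulate the wait producing just enough CQEs */
		available = min - events;
	} while (1);

	return events;
}
```

With the buggy version, max would shrink by each partial reap and the loop could return fewer than min events.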
fio_ioring_cqring_reap() takes both an events and a max argument and
will return up to max - events CQEs. Only one of the two callers passes
an existing events count. So remove the events argument and have
fio_ioring_getevents() pass max - events instead. This simplifies the
function signature and avoids an addition inside the loop over CQEs.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Currently fio_ioring_cqring_reap() loops over each available CQE,
re-loading the tail index, incrementing local variables, and checking
whether the max requested CQEs have been seen.
Avoid the loop by computing the number of available CQEs as tail - head
and capping it to the requested max.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
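The loop-free computation described above boils down to one unsigned subtraction plus a cap. A sketch, assuming 32-bit ring indices; unsigned arithmetic handles index wraparound for free.

```c
#include <assert.h>

/* Number of CQEs to reap: tail - head (unsigned subtraction is
 * wraparound-safe), capped at the requested max. A sketch of the
 * idea, not fio's exact code. */
static unsigned cq_available(unsigned tail, unsigned head, unsigned max)
{
	unsigned n = tail - head;

	return n < max ? n : max;
}
```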
fio_ioring_cqring_reap() can't fail and returns an unsigned variable. So
change its return type from int to unsigned.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
There is no point in comparing events to min again after calling
io_uring_enter() to wait for events, as it doesn't change either events
or min. So remove the loop condition and only compare events to min
after updating events. Don't bother repeating fio_ioring_cqring_reap()
before calling io_uring_enter() if less than the min requested events
were available, as it's highly unlikely the CQ tail will have changed.
Avoid breaking and then branching on the return value by just returning
the value from inside the loop.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Add a relaxed-ordering atomic store helper, analogous to
atomic_store_release() and atomic_load_relaxed().

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
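A relaxed-ordering store helper of this kind can be sketched with the GCC/Clang `__atomic` builtins; fio's arch headers use their own per-arch definitions, so this is illustrative only.

```c
#include <assert.h>

/* Sketch of relaxed-ordering atomic helpers using compiler builtins.
 * Relaxed ordering guarantees atomicity but imposes no ordering on
 * surrounding memory accesses. */
#define atomic_store_relaxed(p, v) \
	__atomic_store_n((p), (v), __ATOMIC_RELAXED)
#define atomic_load_relaxed(p) \
	__atomic_load_n((p), __ATOMIC_RELAXED)

/* trivial single-threaded round trip through the helpers */
static unsigned roundtrip(unsigned v)
{
	unsigned x = 0;

	atomic_store_relaxed(&x, v);
	return atomic_load_relaxed(&x);
}
```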
fio_ioring_getevents() advances the io_uring CQ head index in
fio_ioring_cqring_reap() before fio_ioring_event() is called to read the
CQEs. In general this would allow the kernel to reuse the CQE slot
prematurely, but the CQ is sized large enough for the maximum iodepth
and a new io_uring operation isn't submitted until the CQE is processed.
Add a comment to explain why it's safe to advance the CQ head index
early. Use relaxed ordering for the store, as there aren't any accesses
to the CQEs that need to be ordered before the store.

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
…/fio

* 'fix/io_uring-cq-reap' of https://github.com/calebsander/fio:
  engines/io_uring: relax CQ head atomic store ordering
  arch: add atomic_store_relaxed()
  engines/io_uring: simplify getevents control flow
  engines/io_uring: return unsigned from fio_ioring_cqring_reap()
  engines/io_uring: remove loop over CQEs in fio_ioring_cqring_reap()
  engines/io_uring: consolidate fio_ioring_cqring_reap() arguments
  Revert "engines/io_uring: update getevents max to reflect previously seen events"
The filetype option allows skipping the 'stat' syscall for each file
defined in a job at initialization, optimizing the huge-set-of-files
fio usage scenario.

Signed-off-by: Sergei Truschev <s.truschev@yadro.com>
For some reason folks thought this was a good idea, but sprinkling
strcmp() calls in a hot path is pretty crazy. Particularly when
you can just check the io_ops address for the right IO engine,
trading a string compare for a simple address compare.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Replace the memory compare with ioengine_uring_cmd with checking the
prep pointer, as that should always be sane.

Outside of that, about half the comparisons are either redundant (eg
it's ONLY run in a uring_cmd specific handler), or should be factored
out into separate code.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
For the love of deity, let's use functions where they make sense.
It nicely encapsulates code that is specific to one thing, AND it
avoids having a ton of indented levels making the code utterly
unreadable.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Don't use an overly long line if it can be avoided.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
Don't repeat the code for open/close file, just have the cmd variants
call the normal helper for the actual open or close part.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_u->numberio is used to keep track of the sequence number of writes
and verify reads. It is entirely feasible to issue millions or even
billions of IOs in a single load, so let's use enough bits to handle
that.

numberio is copied into io_piece and verify_header, so update those
structs accordingly.

Signed-off-by: Riley Thomasson <riley.thomasson@gmail.com>
Currently, fio keeps track of a finite number of recent write
completions for each file in fio_file->last_write_comp. This information
is saved/loaded as part of the "verify state." The verify code
(verify_state_should_stop(), specifically) assumes that any write issued
before these recorded writes must have been successfully completed. This
is not generally true for iodepth > 1 and if writes are completed
out-of-order.

Consider this example: a single write stalls while all other writes
complete normally. This condition can persist for an arbitrarily long
time, and the stalled write will fall out of the range "covered" by
last_write_comp. Saving state at this point (e.g. via a trigger) and
halting the workload (e.g. via power-cycling the machine) will result in
that stalled write being verified when the state is loaded despite the
fact that the write may have never completed.

Instead of tracking the last N write completions, we can instead track
(1) the maximum issued sequence number, and (2) the sequence numbers for
all in-flight writes. The "sequence number" here is a monotonically
increasing value assigned to each write and verifying read (this is
already implemented as io_u->numberio). An "in-flight" write is a write
which has been issued but not yet completed. Furthermore, the number of
in-flight writes is bounded by the iodepth.

We can accomplish this by using a simple array of sequence numbers,
which are initialized to an invalid value. Before issuing a write, its
sequence number is written to a "free" slot, and then the maximum issued
sequence number is incremented. After completing a write, its slot is
changed back to the invalid value. On the verify side, we are allowed to
verify as long as the current sequence number is <= the maximum issued
sequence number AND it is not present in the inflight list.

Saving/loading this information in the verify state and using it in
verify_state_should_stop() is covered in a subsequent patch.

Fixes: Issue #1950

Signed-off-by: Riley Thomasson <riley.thomasson@gmail.com>
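The slot-array scheme described above can be sketched as follows. This is a simplified model, not fio's implementation: `INFLIGHT_SLOTS` stands in for the iodepth bound, and `max_issued` counts issued writes so a sequence number is valid iff it is below that count and absent from the in-flight slots.

```c
#include <assert.h>
#include <stdint.h>

#define INFLIGHT_SLOTS	4		/* bounded by the iodepth */
#define SEQ_INVALID	UINT64_MAX

static uint64_t inflight[INFLIGHT_SLOTS];
static uint64_t max_issued;		/* next sequence number to issue */

static void inflight_init(void)
{
	for (int i = 0; i < INFLIGHT_SLOTS; i++)
		inflight[i] = SEQ_INVALID;
	max_issued = 0;
}

/* record the write's sequence number in a free slot, then bump
 * the issued counter; returns the slot index */
static int issue_write(void)
{
	for (int i = 0; i < INFLIGHT_SLOTS; i++) {
		if (inflight[i] == SEQ_INVALID) {
			inflight[i] = max_issued++;
			return i;
		}
	}
	return -1;			/* no free slot: iodepth exceeded */
}

static void complete_write(int slot)
{
	inflight[slot] = SEQ_INVALID;
}

/* verify is allowed iff seq was issued and is not still in flight */
static int can_verify(uint64_t seq)
{
	if (seq >= max_issued)
		return 0;
	for (int i = 0; i < INFLIGHT_SLOTS; i++)
		if (inflight[i] == seq)
			return 0;
	return 1;
}

/* scenario: issue seq 0 and 1, complete only seq 0 */
static int demo(void)
{
	inflight_init();
	int a = issue_write();		/* seq 0 */
	issue_write();			/* seq 1, left in flight */
	complete_write(a);
	return can_verify(0) && !can_verify(1) && !can_verify(2);
}
```

This captures why a stalled write can no longer be mistakenly verified: it stays in the in-flight set until its completion actually arrives.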
Plumb the new inflight write information from thread_data through
thread_io_list and the saved/loaded verify state.

Use this information in verify_state_should_stop() to halt verify as
soon as the first inflight sequence number is reached.

Fixes: Issue #1950

Signed-off-by: Riley Thomasson <riley.thomasson@gmail.com>
When loops are used, the sequence number invariants in the inflight log
are broken. In particular, experimental verify can issue writes in
between loops, which ends up incrementing numberio without logging the
writes to the inflight log.

The intended interaction between loops and verify state save/load is a
bit murky to me, but it seems reasonable to clear the inflight log in
between each loop.

Signed-off-by: Riley Thomasson <riley.thomasson@gmail.com>
Add a version field to verify_header and print a helpful message when
it does not match expectations. This can happen if a user tries to
verify data written by a different version of fio.

The version field is split from the verify_type field in order to avoid
messing with the layout of the struct. This also makes distinguishing
the new versioned header from older unversioned headers easier, as the
old header format has a very limited set of valid values at this offset,
regardless of endian-ness. Set the MSB in the new version value to
distinguish it from the old header format.

Signed-off-by: Riley Thomasson <riley.thomasson@gmail.com>
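The MSB trick described above can be sketched as follows. The field width and flag value here are illustrative assumptions, not fio's actual constants: the point is only that old headers never have the MSB set at this offset, so a set MSB marks the new versioned format.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag: the MSB of the version field, which old
 * unversioned headers can never have set at this offset. */
#define VHDR_VERSION_FLAG	0x8000u

static uint16_t make_version(uint16_t version)
{
	return (uint16_t)(VHDR_VERSION_FLAG | version);
}

static int header_is_versioned(uint16_t field)
{
	return (field & VHDR_VERSION_FLAG) != 0;
}
```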
Mark longest_existing_path() as static since it is only used within
filesetup.c. Also, declare the 'path' parameter as const char *
because it is not modified within the function.

Signed-off-by: Tomas Winkler <tomas.winkler@sandisk.com>
Link: https://lore.kernel.org/r/20250731122011.539660-1-tomas.winkler@sandisk.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Update skip_eta() to use an ANSI C function declaration
by explicitly specifying the void parameter list.

Signed-off-by: Tomas Winkler <tomas.winkler@sandisk.com>
Link: https://lore.kernel.org/r/20250731122011.539660-2-tomas.winkler@sandisk.com
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
To reduce pointer chasing in the hot path just store whether we are
using the io_uring or io_uring_cmd ioengine in the ioengine data.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20250725175808.2632-2-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Move two inline helper functions from C source file to header file. In
later patches we will need to use these helper functions outside of
nvme.c.

No functional change intended.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20250725175808.2632-3-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Factor out of fio_nvme_pi_fill() the code that generates and fills in
the Guard Protection Information field. This is so that later patches
can use this code without filling in the fields in the NVMe command.

No functional change intended.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20250725175808.2632-4-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just free the buffer unconditionally. If the pointer is null nothing
will happen, so no harm is done.

This lets us also use this function for the io_uring ioengine when it
gains the ability to handle metadata which will happen in a subsequent
patch.

No functional change intended.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20250725175808.2632-5-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The protection information check flags, apptag, and apptag mask are
fixed for every single operation in a job. So we should just set these
values at init time instead of populating this structure anew for every
single IO.

Any uses of this structure are guarded by the device's protection
information type, so it is not a problem to fill in this structure even
when the device is formatted with Type 0 PI (no protection).

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
Link: https://lore.kernel.org/r/20250725175808.2632-6-vincent.fu@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
vincentkfu and others added 30 commits March 18, 2026 17:58
* 'posix-errnos' of https://github.com/minwooim/fio:
  options: add support more POSIX errnos
* 'fix-null-comm-prctl' of https://github.com/Criticayon/fio:
  backend: guard prctl(PR_SET_NAME) against NULL thread name
Issue: __show_running_run_stats() acquires stat_sem then blocks on each
worker's rusage_sem. But workers need stat_sem to reach the code that
posts rusage_sem, creating an ABBA deadlock. The verify path deadlocks
via a blocking fio_sem_down(stat_sem). The IO path's trylock loop can
mitigate the problem but still times out under sustained contention
with multiple workers.

Fix: Moved rusage collection before the stat_sem acquire so the stat
thread never holds stat_sem while waiting on rusage_sem. Added a
double-check of td->runstate after setting update_rusage to guard
against blocking on a worker that has already exited. The trylock
loop and check_update_rusage() calls are retained as precautions.

Signed-off-by: Ryan Tedrick <ryan.tedrick@nutanix.com>
…/fio

* 'fix_statsem_deadlock' of https://github.com/RyanTedrick/fio:
  Fix stat_sem/rusage_sem deadlock during stats collection
prune_io_piece_log() is called only at the start of each loop iteration, so
io_piece entries accumulated during the final do_io() run are never explicitly
freed.

When fio runs as a process this goes unnoticed because the OS reclaims the heap
on exit. When fio is embedded as a pthread, which is a use-case of unvme-cli,
the parent process keeps running, so those allocations become a genuine
memory leak proportional to the number of write IOs logged for verify.

Signed-off-by: Haeun Kim <hanee.kim@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
* 'ipo' of https://github.com/minwooim/fio:
  iolog: free io_piece log on thread cleanup
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Introduce a new ioengine that mmaps anonymous memory and copies data
on read/write to trigger page faults. This allows us to leverage fio's
powerful framework for MM-related testing, and will ideally allow us to
quickly expand testing by reusing existing FS-related fio scripts.

Signed-off-by: Nico Pache <npache@redhat.com>
Link: https://patch.msgid.link/20260408012004.198115-2-npache@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Document the new page fault engine.

Signed-off-by: Nico Pache <npache@redhat.com>
Link: https://patch.msgid.link/20260408012004.198115-3-npache@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Merge page fault engine from Nico:

"This series introduces a new page_fault ioengine for Anonymous memory
  testing. This enables using fio’s existing framework and job files for
  memory management style workloads without relying on a filesystem. An
  example job file is included to demonstrate usage and lays the
  groundwork for how we plan on utilizing fio to test a number of MM
  related workloads."

* anon-fault:
  engines/page_fault: minor style cleanups
  Documentation: update the documentation to include the page_fault engine
  page_fault: add mmap-backed ioengine for anonymous faults
On our ARM platform, select() could return -1 with errno EINTR fairly
often, while we have almost never observed this on x86 platforms.
This breaks the helper_thread loop with A_EXIT, and stops status update
at stdout as well as bandwidth logging (the one enabled by
`write_bw_log` and `log_avg_msec`), causing `bw` logs to look like
getting cutoff since a random time for prolonged runs (~1 hour).
The issue can be easily reproduced on our ARM platform even with
`ioengine=null` and `filename=/dev/null` by spawning ~30 individual
fio processes with each logging `bw`, and observe the lines of all
produced logs with `wc -l` once all processes finish.

Added action enum A_NOOP and a check to handle the situation
as no error.

Tested on both ARM and x86 platforms with and without
CONFIG_HAVE_TIMERFD_CREATE macro defined. The x86 platform never reproduces
the issue in any situation, and result looks good. ARM platform
no longer reproduces the bug and retains the full `bw` log
after the fix.

Signed-off-by: Alex Qiu <xqiu@google.com>
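The fix above amounts to classifying an EINTR-interrupted select() as a no-op rather than a fatal error. A sketch of just that decision, with names mirroring the commit message's A_NOOP; the surrounding helper-thread loop is fio's, not shown here.

```c
#include <assert.h>
#include <errno.h>

enum action { A_EXIT, A_NOOP, A_WORK };

/* Classify a select() result: an EINTR interruption is harmless and
 * should not tear down the helper thread. */
static enum action classify_select(int ret, int err)
{
	if (ret < 0)
		return err == EINTR ? A_NOOP : A_EXIT;
	return A_WORK;
}
```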
This reverts commit 981c372.

The previous patch "fio.h: treat io_size > size as offset overlap risk"
made the verify table use an rb-tree rather than the simple flist when
io_size > size, which means we can simply revert this patch to allow
overlap for sequential readwrite workloads.
* 'arm-select-eintr' of https://github.com/alex310110/fio:
  helper_thread: Handle EINTR errno from select()
Update `total_bytes` for `td_write(td)` rather than `td_rw(td)` so
writes keep going when they are being overlapped.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Wrap around `f->last_pos` in case of `io_size` > `size` to consider
overlap.  Also, for write-only sequential jobs, `total_bytes` was capped
at size regardless of `io_size`, causing the write phase to stop after
one pass over the file even when `io_size` > `size`.  Unlike rw mode
where reads and writes both consume `bytes_issued`, write-only jobs only
count writes, so there is no risk of `io_size` being consumed by reads.
Allow `io_size` to control how much to write for this case.

This allows online verification when `io_size` > `size` with write-only
sequential jobs.

	fio \
	--name=online \
	--ioengine=io_uring_cmd --filename=/dev/ng0n1 \
	--cmd_type=nvme \
	--rw=write --bs=128k --size=1M --io_size=2M \
	--verify=crc32 --do_verify=1 --debug=io,verify

Before this patch:

- Writes didn't happen twice (no overlap) even though `io_size` >
  `size` since reads consumed the `bytes_issued`.

io       238438 complete: io_u 0x74a23c000d80: off=0x0,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x20000,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x40000,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x60000,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x80000,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0xa0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0xc0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0xe0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x0,len=0x20000,ddir=0,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x20000,len=0x20000,ddir=0,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x40000,len=0x20000,ddir=0,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x60000,len=0x20000,ddir=0,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0x80000,len=0x20000,ddir=0,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0xa0000,len=0x20000,ddir=0,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0xc0000,len=0x20000,ddir=0,file=/dev/ng0n1
io       238438 complete: io_u 0x74a23c000d80: off=0xe0000,len=0x20000,ddir=0,file=/dev/ng0n1

After this patch:

- Writes are overlapped, but verify once with the latest `numberio`
  verification.

io       237335 complete: io_u 0x71c1f4000d80: off=0x0,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x20000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x40000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x60000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x80000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xa0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xc0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xe0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x0,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x20000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x40000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x60000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x80000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xa0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xc0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xe0000,len=0x20000,ddir=1,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x0,len=0x20000,ddir=0,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x20000,len=0x20000,ddir=0,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x40000,len=0x20000,ddir=0,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x60000,len=0x20000,ddir=0,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0x80000,len=0x20000,ddir=0,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xa0000,len=0x20000,ddir=0,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xc0000,len=0x20000,ddir=0,file=/dev/ng0n1
io       237335 complete: io_u 0x71c1f4000d80: off=0xe0000,len=0x20000,ddir=0,file=/dev/ng0n1

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
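The wraparound of `f->last_pos` described above can be sketched as a one-line offset adjustment. This is a simplified model under the assumption that the sequential position advances past `size` and must fold back to the start of the file so later passes overlap earlier ones.

```c
#include <assert.h>
#include <stdint.h>

/* When io_size > size, fold a position that has run past the end of
 * the file back to the start, so the next pass overlaps the first.
 * A sketch of the idea, not fio's exact code. */
static uint64_t wrap_last_pos(uint64_t last_pos, uint64_t size)
{
	if (last_pos >= size)
		return last_pos - size;
	return last_pos;
}
```

For the example job above (size=1M, bs=128k), the position after the eighth write is 0x100000 and wraps back to 0x0, matching the repeated offsets in the "After this patch" log.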
reset_io_counters() clears td->nr_done_files so that keep_running()
does not return false prematurely because fio_files_done() sees all
files already marked done.

The existing conditions (time_based, loops > 1, do_verify) cover the
cases where the job is expected to restart file iteration.  A job with
io_size > size also requires restarting: the sequential write pointer
wraps around and visits every offset a second time, so the "file done"
bit must be cleared at the start of each outer iteration.

Without this fix, the first pass sets each fio_file's done flag, and
keep_running() exits the outer loop early instead of writing the
second pass -- the overlap that the io_size > size feature is meant to
produce never happens.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Add a Python test suite testing fio's numberio overlap verification for
the three fixes that preceded this commit:

  - fio.h: treat io_size > size as offset overlap risk
  - io_u.c: check wrap around f->last_pos if verify_only=1
  - backend.c: use io_size as limit for seq write-only
  - libfio.c: reset nr_done_files when io_size > size

The test creates situations where io_size > size so that every offset
is written more than once.  fio_offset_overlap_risk() must return true
to activate the rb-tree io_hist backend, which retains only the latest
io_piece per offset.  The verify phase then reads each block exactly
once and checks that numberio on disk matches the latest write.

13 test cases, grouped by rw mode and verify style, are run with two
ioengines (one sync, one async) in turn.

Offline verify (OfflineOverlapVerifyTest) runs two separate fio
invocations: a write phase (verify_state_save=1) followed by a verify
phase (verify_only=1, verify_state_load=1).  The verify phase performs
a dry-run write pass to rebuild io_hist, then reads each block and
checks numberio via verify_write_sequence=1.

Online verify (OnlineOverlapVerifyTest) runs a single fio invocation
with do_verify=1, which writes and verifies in the same job.

The filesize < size cases pre-truncate the file smaller than size=
before running fio, verifying that fio_offset_overlap_risk() activates
the rb-tree even before the first I/O because real_file_size < io_size
at setup time.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
…m/fio

* 'check-numberio-read-only' of https://github.com/minwooim/fio:
  t/numberio_overlap: add overlap write verification tests
  libfio: reset nr_done_files when io_size > size
  io_u: check wrap around `f->last_pos` for overlapped case
  backend: update `total_bytes` for TD_DDIR_WRITE
  Revert "backend: fix verify issue during readwrite"
  fio.h: treat io_size > size as offset overlap risk
Fix a segfault in teardown caused by the `--trigger-timeout=` option,
regardless of the --verify= and --verify_state_save= options.

The `td->inflight_numberio` array is allocated in `init_inflight_logging()`
only if ``td->o.verify != VERIFY_NONE && td->o.verify_state_save``.
But when those two options are not given, --trigger-timeout= is given,
and the timer expires, fio still attempts to save the verify state. In
this case, the `td->inflight_numberio` array is NULL, causing the
segfault. This patch fixes the segfault by checking
`td->inflight_numberio` before assigning it to the state.

This can be reproduced with --trigger-timeout= and a QEMU device that
ignores incoming commands by simply not issuing completions; dm-flakey
does not yet support ignoring completions.

Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
…wooim/fio

* 'fix-trigger-timeout-segfault' of https://github.com/minwooim/fio:
  verify: fix segfault in timedout teardown
Large TRIM ranges combined with high iodepth can cause memory usage
to spike and trigger Out Of Memory (OOM) errors. Since TRIM is a
metadata-only operation, it does not require a data payload buffer.
This patch adds a function to calculate the buffer size without DDIR_TRIM,
preventing unnecessary memory allocation.

Fixes: Issue #2056
Signed-off-by: Dennis Chang <cherhungc@google.com>
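The buffer-size calculation described above can be sketched as a max over the configured block sizes that skips DDIR_TRIM. A simplified model: the enum and `demo()` job values are illustrative, not fio's actual structures.

```c
#include <assert.h>

enum fio_ddir { DDIR_READ, DDIR_WRITE, DDIR_TRIM, DDIR_RWDIR_CNT };

/* Largest configured block size across data directions, skipping
 * DDIR_TRIM since trim is metadata-only and needs no data payload. */
static unsigned long long max_bs_no_trim(const unsigned long long bs[DDIR_RWDIR_CNT])
{
	unsigned long long max = 0;

	for (int ddir = 0; ddir < DDIR_RWDIR_CNT; ddir++) {
		if (ddir == DDIR_TRIM)
			continue;
		if (bs[ddir] > max)
			max = bs[ddir];
	}
	return max;
}

/* hypothetical job: 4k reads, 8k writes, 1G trim ranges */
static unsigned long long demo(void)
{
	unsigned long long bs[DDIR_RWDIR_CNT] = { 4096, 8192, 1ULL << 30 };

	return max_bs_no_trim(bs);
}
```

The 1G trim range no longer inflates the allocation: the buffer is sized for the 8k writes instead.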
…https://github.com/dennischerchang/fio

* 'prevent_large_trim_size_and_high_iodepth_causing_oom' of https://github.com/dennischerchang/fio:
  fio: prevent OOM by not allocating buffer for TRIM range
HDD drives with the Zone Domains feature enabled have two types of
zones, SOBR and SWR. When running fio with the libzbc engine on such
drives, fio would abort unexpectedly because libzbc_report_zones()
fails prematurely when it encounters SOBR zones: the function lacks
handling for the SOBR zone type. With this fix, fio no longer quits
early when reading zone information from HDD drives with the Zone
Domains feature enabled.

Signed-off-by: rorychen <rory.c.chen@seagate.com>
* 'run_HSMR' of https://github.com/rorychen/fio-HSMR:
  libzbc: fix fio abort prematurely when encountering SOBR zones
There is no a priori reason why the SPRandom approach cannot work with
block sizes that are not a power of 2. Lift the constraint that allows
only power of 2 block sizes when sprandom is enabled.

This allows sprandom to be used when devices are formatted in extended
LBA (DIF) mode with 520- or 4160-byte sectors.

Signed-off-by: Vincent Fu <vincent.fu@samsung.com>
cgroup path construction uses fixed-size buffers in a few places.

get_cgroup_root() allocates 64 bytes and then uses sprintf() to
construct the cgroup directory path. write_int_to_file() similarly
uses a fixed 256-byte stack buffer for the target file path. Long
cgroup paths can therefore overflow these buffers and cause fio to
abort.

Replace the fixed-size buffers with dynamically sized allocation based
on the actual path length. Keep the existing cgroup semantics and let
normal filesystem errors such as ENOENT or ENAMETOOLONG flow through
the existing error handling.

Add unit tests covering long cgroup root paths and long control file
paths. Without the fix, the long-path test aborts with a buffer
overflow. With the fix, both new tests pass.

Closes #2084

Signed-off-by: yangpeng <yangpengxdu@gmail.com>
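The dynamic sizing described above can be sketched as follows: compute the exact path length from the components, allocate that, and format into it. The function name is a hypothetical stand-in, not fio's actual helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Build "root/name", sizing the buffer from the actual component
 * lengths instead of a fixed 64- or 256-byte array, so long cgroup
 * paths cannot overflow. Caller frees the result. */
static char *cgroup_path(const char *root, const char *name)
{
	size_t len = strlen(root) + 1 + strlen(name) + 1;
	char *p = malloc(len);

	if (p)
		snprintf(p, len, "%s/%s", root, name);
	return p;
}

static int demo(void)
{
	char *p = cgroup_path("/sys/fs/cgroup", "fio_job");
	int ok = p && strcmp(p, "/sys/fs/cgroup/fio_job") == 0;

	free(p);
	return ok;
}
```

Filesystem-level failures such as ENOENT or ENAMETOOLONG still surface later, through the existing error handling, as the commit message notes.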
…xholic/fio

* 'codex/cgroup-path-overflows' of https://github.com/linuxholic/fio:
  cgroup: fix path buffer overflows for long cgroup names
Add --hugetlb <path> option to allocate IO buffers from a hugetlbfs
file instead of posix_memalign. This allows observing IO performance
effects of hugepage-backed buffers (fewer TLB misses, contiguous
physical memory).

It is also useful for testing ublk SHMEM_ZC where the client and
server share the same physical pages via hugetlbfs.

The hugetlb file is mmap'd with MAP_SHARED | MAP_POPULATE. The main
thread pre-assigns per-thread buffer offsets into struct submitter
before starting worker threads, so no locking is needed.

Signed-off-by: Ming Lei <tom.leiming@gmail.com>
* 't_io_uring_htlb' of https://github.com/ming1/fio:
  t/io_uring: add --hugetlb option for hugetlbfs IO buffers