[rlc-10] Fix idpf tx timeout issue by roxanan1996 · Pull Request #917 · ctrliq/kernel-src-tree

roxanan1996 · 2026-02-25T13:29:54Z

Description

Customer request to fix a Tx timeout issue on C4A metal instances in
Google Cloud.

From mainline submission explanation:

This series fixes a stability issue in the flow scheduling Tx send/clean
path that results in a Tx timeout.

The existing guardrails in the Tx path were not sufficient to prevent
the driver from reusing completion tags that were still in flight (held
by the HW). This collision would cause the driver to erroneously clean
the wrong packet thus leaving the descriptor ring in a bad state.

The main point of this fix is to replace the flow scheduling buffer ring
with a large pool/array of buffers. The completion tag then simply is
the index into this array. The driver tracks the free tags and pulls
the next free one from a refillq. The cleaning routines simply use the
completion tag from the completion descriptor to index into the array to
quickly find the buffers to clean.
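
The pool design described above can be sketched in a few lines of userspace C. This is a simplified model and not the driver code: the names tx_pool, pool_get_tag and pool_clean, the POOL_SIZE value, and the per-buffer fields are all hypothetical, and the real driver tracks skbs and DMA mappings rather than a byte count. It only illustrates why a tag cannot be handed out again while the hardware still holds it: a tag returns to the free ring only when its completion is cleaned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 8u	/* hypothetical; the real pool is far larger */

struct tx_buf {
	bool in_flight;		/* true while HW still owns this buffer */
	uint32_t bytes;		/* stand-in for per-packet state */
};

struct tx_pool {
	struct tx_buf bufs[POOL_SIZE];	/* completion tag == index into bufs[] */
	uint16_t free_tags[POOL_SIZE];	/* ring of free tags (the "refillq") */
	uint32_t head;			/* next free tag to hand out */
	uint32_t tail;			/* slot where a cleaned tag is returned */
	uint32_t nfree;			/* number of tags currently free */
};

static void pool_init(struct tx_pool *p)
{
	for (uint32_t i = 0; i < POOL_SIZE; i++) {
		p->bufs[i].in_flight = false;
		p->bufs[i].bytes = 0;
		p->free_tags[i] = (uint16_t)i;
	}
	p->head = 0;
	p->tail = 0;
	p->nfree = POOL_SIZE;
}

/* Send path: pull the next free tag, or return -1 so the caller can stop Tx. */
static int pool_get_tag(struct tx_pool *p, uint32_t bytes)
{
	uint16_t tag;

	if (p->nfree == 0)
		return -1;

	tag = p->free_tags[p->head];
	p->head = (p->head + 1) % POOL_SIZE;
	p->nfree--;

	assert(!p->bufs[tag].in_flight);	/* a free tag is never in flight */
	p->bufs[tag].in_flight = true;
	p->bufs[tag].bytes = bytes;
	return tag;
}

/* Clean path: the completion carries the tag, so index the array directly.
 * Returns the bytes cleaned, or 0 for a bogus or repeated completion. */
static uint32_t pool_clean(struct tx_pool *p, uint16_t tag)
{
	uint32_t bytes;

	if (tag >= POOL_SIZE || !p->bufs[tag].in_flight)
		return 0;

	bytes = p->bufs[tag].bytes;
	p->bufs[tag].in_flight = false;

	/* Only now does the tag become reusable. */
	p->free_tags[p->tail] = tag;
	p->tail = (p->tail + 1) % POOL_SIZE;
	p->nfree++;
	return bytes;
}
```

Because completions may arrive out of order, the free ring ends up holding tags in whatever order they were cleaned; the send path never needs to know that order, it just takes the next entry.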

All of the code to support this is added first to ensure traffic still
passes with each patch. The final patch then removes all of the
obsolete stashing code.

COMMITS

Most of these are not clean cherry-picks because
1a49cf8 ("idpf: add Tx timestamp flows") is missing from this tree.
I did not backport that commit because it was part of a larger submission:
https://lore.kernel.org/all/20250425215227.3170837-1-anthony.l.nguyen@intel.com/
After checking the code, the Tx timestamp support does not interfere with
replacing the single buffer ring with the new large pool/array of buffers.

idpf: fix a race in txq wakeup

jira KERNEL-170
commit-author Brian Vazquez <brianvv@google.com>
commit 7292af042bcf22e2c18b96ed250f78498a5b28ab

idpf: add support for Tx refillqs in flow scheduling mode

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit cb83b559bea39f207ee214ee2972657e8576ed18
upstream-diff |
	adjusted the number of bytes expected in
	libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct introduced in commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").

idpf: improve when to set RE bit logic

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit f2d18e16479cac7a708d77cbfb4220a9114a71fc

idpf: simplify and fix splitq Tx packet rollback error path

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit b61dfa9bc4430ad82b96d3a7c1c485350f91b467
upstream-diff |
	adjusted context in 2 places:
	- when removing func idpf_tx_dma_map_error due to different memset
	call that uses the hardcoded struct type;
	- in func idpf_tx_splitq_frame due to missing expected
	union idpf_flex_tx_ctx_desc *ctx_desc;
	both differences were introduced in commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").

idpf: replace flow scheduling buffer ring with buffer pool

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 5f417d551324d2894168b362f2429d120ab06243
upstream-diff |
	adjusted context in:
	- idpf_tx_splitq_frame and idpf_tx_clean_bufs;
	- libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct;
	all cases are due to missing commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").

idpf: stop Tx if there are insufficient buffer resources

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 0c3f135e840d4a2ba4253e15d530ec61bc30718e
upstream-diff |
	adjusted conflict in idpf_tx_splitq_frame func due to missing
	1a49cf814fe1e ("idpf: add Tx timestamp flows").

idpf: remove obsolete stashing code

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 6c4e68480238274f84aa50d54da0d9e262df6284
upstream-diff |
	- adjusted context due to missing idpf_tx_read_tstamp func;
	- adjusted the number of bytes expected in
	libeth_cacheline_set_assert for struct idpf_tx_queue due to missing
	of some elements in the struct;
	both are due to missing commit
	1a49cf814fe1e ("idpf: add Tx timestamp flows").

BUILD

/home/rnicolescu/ciq/kernels/rlc-10-latest/kernel-src-tree
Running make mrproper...
[TIMER]{MRPROPER}: 5s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96"
Making olddefconfig
--
  HOSTCC  scripts/kconfig/util.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
Starting Build
  GEN     arch/x86/include/generated/asm/orc_hash.h
  WRAP    arch/x86/include/generated/uapi/asm/bpf_perf_event.h
  WRAP    arch/x86/include/generated/uapi/asm/errno.h
  WRAP    arch/x86/include/generated/uapi/asm/fcntl.h
  WRAP    arch/x86/include/generated/uapi/asm/ioctl.h
--
  LD [M]  net/qrtr/qrtr-mhi.ko
  LD [M]  virt/lib/irqbypass.ko
  BTF [M] net/qrtr/qrtr.ko
  BTF [M] net/qrtr/qrtr-mhi.ko
  BTF [M] virt/lib/irqbypass.ko
[TIMER]{BUILD}: 4091s
Making Modules
  SYMLINK /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/build
  INSTALL /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/modules.order
  INSTALL /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/modules.builtin
  INSTALL /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/modules.builtin.modinfo
--
  STRIP   /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/kernel/virt/lib/irqbypass.ko
  SIGN    /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/kernel/net/qrtr/qrtr.ko
  SIGN    /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/kernel/net/qrtr/qrtr-mhi.ko
  SIGN    /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+/kernel/virt/lib/irqbypass.ko
  DEPMOD  /lib/modules/6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+
[TIMER]{MODULES}: 10s
Making Install
  INSTALL /boot
[TIMER]{INSTALL}: 15s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+ and Index to 2
The default is /boot/loader/entries/2adeea37d6d145f09ba612fce6891624-6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+.conf with index 2 and kernel /boot/vmlinuz-6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+
The default is /boot/loader/entries/2adeea37d6d145f09ba612fce6891624-6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+.conf with index 2 and kernel /boot/vmlinuz-6.12.0-rnicolescu_rlc-10_6.12.0-124.38.1.el10_1-044a3252be96+
Generating grub configuration file ...
Adding boot menu entry for UEFI Firmware Settings ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 4091s
[TIMER]{MODULES}: 10s
[TIMER]{INSTALL}: 15s
[TIMER]{TOTAL} 4126s
Rebooting in 10 seconds

kernel-build-after.log

Kselftests

./kselftest-after.log
480
./kselftest-before.log
481
Before: ./kselftest-after.log
After: ./kselftest-before.log
Diff:
-ok 2 selftests: syscall_user_dispatch: sud_benchmark
+ok 4 selftests: cgroup: test_freezer
+ok 66 selftests: net: gre_gso.sh

kselftest-after.log
kselftest-before.log

Check_kernel_commits

> python3 ~/ciq/kernel-src-tree-tools/check_kernel_commits.py --repo . --pr_branch {rnicolescu}/rlc-10/6.12.0-124.38.1.el10_1 --base_branch origin/rlc-10/6.12.0-124.38.1.el10_1
All referenced commits exist upstream and have no Fixes: tags.

Run interdiff

[DIFF] PR commit 37f70602c1f2 (idpf: add support for Tx refillqs in flow scheduling mode) → upstream cb83b559bea3
Differences found:

  ================================================================================
  *    DELTA DIFFERENCES - code changes that differ between the patches          *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -683,7 +683,7 @@
   	__cacheline_group_end_aligned(cold);
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
  -			    96 + sizeof(struct u64_stats_sync),
  +			    88 + sizeof(struct u64_stats_sync),
   			    24);

   /**
  ################################################################################
  !    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
  ################################################################################

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -694,7 +696,7 @@
   	__cacheline_group_end_aligned(cold);
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
  -			    112 + sizeof(struct u64_stats_sync),
  +			    120 + sizeof(struct u64_stats_sync),
   			    24);

   /**

  ================================================================================
  *    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -3306,5 +3431,5 @@
   skip_data:
  -		rx_buf->page = NULL;
  +		rx_buf->netmem = 0;

   		idpf_rx_post_buf_refill(refillq, buf_id);
   		IDPF_RX_BUMP_NTC(rxq, ntc);
  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -678,7 +686,7 @@
   	__cacheline_group_end_aligned(cold);
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
  -			    88 + sizeof(struct u64_stats_sync),
  +			    112 + sizeof(struct u64_stats_sync),
   			    24);

   /**

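This is expected. As explained in the upstream-diff note of the commit message,
the byte count passed to libeth_cacheline_set_assert differs because
1a49cf814fe1e ("idpf: add Tx timestamp flows") is missing from this tree.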
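
As a rough illustration of why the asserted byte count has to change, the sketch below models a group of struct fields with and without the two timestamp pointers. The struct names, the counters field, and the helper are hypothetical; only refillq, cached_tstamp_caps and tstamp_task are taken from the diffs in this PR, and the real struct idpf_tx_queue layout and its exact sizes are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical group of fields as it would look with the Tx timestamp
 * commit applied. */
struct group_with_tstamp {
	uint64_t counters[4];		/* placeholder for other members */
	void *refillq;
	void *cached_tstamp_caps;	/* present only with the timestamp commit */
	void *tstamp_task;		/* present only with the timestamp commit */
};

/* The same group in a tree that lacks the timestamp commit. */
struct group_without_tstamp {
	uint64_t counters[4];
	void *refillq;
};

/* A compile-time size check in the spirit of libeth_cacheline_set_assert:
 * each tree has to assert the size of the layout it actually contains. */
static_assert(sizeof(struct group_with_tstamp) ==
	      sizeof(struct group_without_tstamp) + 2 * sizeof(void *),
	      "the two pointer fields account for the size difference");

/* Hypothetical helper: bytes by which the two layouts differ. */
static size_t group_size_delta(void)
{
	return sizeof(struct group_with_tstamp) -
	       sizeof(struct group_without_tstamp);
}
```

A cherry-picked hunk that carries the upstream constant would fail such an assertion at build time in a tree with the smaller layout, which is why the constant is adjusted in the backport.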

Check colordiff log as well
37f70602c1f2_colordiff.log

[DIFF] PR commit 49bb62933e77 (idpf: simplify and fix splitq Tx packet rollback error path) → upstream b61dfa9bc443
Differences found:

  ================================================================================
  *    DELTA DIFFERENCES - code changes that differ between the patches          *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -2289,4 +2289,55 @@

   /**
  + * idpf_tx_dma_map_error - handle TX DMA map errors
  + * @txq: queue to send buffer on
  + * @skb: send buffer
  + * @first: original first buffer info buffer for packet
  + * @idx: starting point on ring to unwind
  + */
  +void idpf_tx_dma_map_error(struct idpf_tx_queue *txq, struct sk_buff *skb,
  +			   struct idpf_tx_buf *first, u16 idx)
  +{
  +	struct libeth_sq_napi_stats ss = { };
  +	struct libeth_cq_pp cp = {
  +		.dev	= txq->dev,
  +		.ss	= &ss,
  +	};
  +
  +	u64_stats_update_begin(&txq->stats_sync);
  +	u64_stats_inc(&txq->q_stats.dma_map_errs);
  +	u64_stats_update_end(&txq->stats_sync);
  +
  +	/* clear dma mappings for failed tx_buf map */
  +	for (;;) {
  +		struct idpf_tx_buf *tx_buf;
  +
  +		tx_buf = &txq->tx_buf[idx];
  +		libeth_tx_complete(tx_buf, &cp);
  +		if (tx_buf == first)
  +			break;
  +		if (idx == 0)
  +			idx = txq->desc_count;
  +		idx--;
  +	}
  +
  +	if (skb_is_gso(skb)) {
  +		union idpf_tx_flex_desc *tx_desc;
  +
  +		/* If we failed a DMA mapping for a TSO packet, we will have
  +		 * used one additional descriptor for a context
  +		 * descriptor. Reset that here.
  +		 */
  +		tx_desc = &txq->flex_tx[idx];
  +		memset(tx_desc, 0, sizeof(struct idpf_flex_tx_ctx_desc));
  +		if (idx == 0)
  +			idx = txq->desc_count;
  +		idx--;
  +	}
  +
  +	/* Update tail in case netdev_xmit_more was previously true */
  +	idpf_tx_buf_hw_update(txq, idx, false);
  +}
  +
  +/**
    * idpf_tx_splitq_bump_ntu - adjust NTU and generation
    * @txq: the tx ring to wrap
  @@ -2337,35 +2388,4 @@

   /**
  - * idpf_tx_splitq_pkt_err_unmap - Unmap buffers and bump tail in case of error
  - * @txq: Tx queue to unwind
  - * @params: pointer to splitq params struct
  - * @first: starting buffer for packet to unmap
  - */
  -static void idpf_tx_splitq_pkt_err_unmap(struct idpf_tx_queue *txq,
  -					 struct idpf_tx_splitq_params *params,
  -					 struct idpf_tx_buf *first)
  -{
  -	struct libeth_sq_napi_stats ss = { };
  -	struct idpf_tx_buf *tx_buf = first;
  -	struct libeth_cq_pp cp = {
  -		.dev    = txq->dev,
  -		.ss     = &ss,
  -	};
  -	u32 idx = 0;
  -
  -	u64_stats_update_begin(&txq->stats_sync);
  -	u64_stats_inc(&txq->q_stats.dma_map_errs);
  -	u64_stats_update_end(&txq->stats_sync);
  -
  -	do {
  -		libeth_tx_complete(tx_buf, &cp);
  -		idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);
  -	} while (idpf_tx_buf_compl_tag(tx_buf) == params->compl_tag);
  -
  -	/* Update tail in case netdev_xmit_more was previously true. */
  -	idpf_tx_buf_hw_update(txq, params->prev_ntu, false);
  -}
  -
  -/**
    * idpf_tx_splitq_map - Build the Tx flex descriptor
    * @tx_q: queue to send buffer on

  ################################################################################
  !    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
  ################################################################################

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -2339,57 +2339,6 @@
   	return count;
   }

  -/**
  - * idpf_tx_dma_map_error - handle TX DMA map errors
  - * @txq: queue to send buffer on
  - * @skb: send buffer
  - * @first: original first buffer info buffer for packet
  - * @idx: starting point on ring to unwind
  - */
  -void idpf_tx_dma_map_error(struct idpf_tx_queue *txq, struct sk_buff *skb,
  -			   struct idpf_tx_buf *first, u16 idx)
  -{
  -	struct libeth_sq_napi_stats ss = { };
  -	struct libeth_cq_pp cp = {
  -		.dev	= txq->dev,
  -		.ss	= &ss,
  -	};
  -
  -	u64_stats_update_begin(&txq->stats_sync);
  -	u64_stats_inc(&txq->q_stats.dma_map_errs);
  -	u64_stats_update_end(&txq->stats_sync);
  -
  -	/* clear dma mappings for failed tx_buf map */
  -	for (;;) {
  -		struct idpf_tx_buf *tx_buf;
  -
  -		tx_buf = &txq->tx_buf[idx];
  -		libeth_tx_complete(tx_buf, &cp);
  -		if (tx_buf == first)
  -			break;
  -		if (idx == 0)
  -			idx = txq->desc_count;
  -		idx--;
  -	}
  -
  -	if (skb_is_gso(skb)) {
  -		union idpf_tx_flex_desc *tx_desc;
  -
  -		/* If we failed a DMA mapping for a TSO packet, we will have
  -		 * used one additional descriptor for a context
  -		 * descriptor. Reset that here.
  -		 */
  -		tx_desc = &txq->flex_tx[idx];
  -		memset(tx_desc, 0, sizeof(*tx_desc));
  -		if (idx == 0)
  -			idx = txq->desc_count;
  -		idx--;
  -	}
  -
  -	/* Update tail in case netdev_xmit_more was previously true */
  -	idpf_tx_buf_hw_update(txq, idx, false);
  -}
  -
   /**
    * idpf_tx_splitq_bump_ntu - adjust NTU and generation
    * @txq: the tx ring to wrap
  @@ -2438,6 +2387,37 @@
   	return true;
   }

  +/**
  + * idpf_tx_splitq_pkt_err_unmap - Unmap buffers and bump tail in case of error
  + * @txq: Tx queue to unwind
  + * @params: pointer to splitq params struct
  + * @first: starting buffer for packet to unmap
  + */
  +static void idpf_tx_splitq_pkt_err_unmap(struct idpf_tx_queue *txq,
  +					 struct idpf_tx_splitq_params *params,
  +					 struct idpf_tx_buf *first)
  +{
  +	struct libeth_sq_napi_stats ss = { };
  +	struct idpf_tx_buf *tx_buf = first;
  +	struct libeth_cq_pp cp = {
  +		.dev    = txq->dev,
  +		.ss     = &ss,
  +	};
  +	u32 idx = 0;
  +
  +	u64_stats_update_begin(&txq->stats_sync);
  +	u64_stats_inc(&txq->q_stats.dma_map_errs);
  +	u64_stats_update_end(&txq->stats_sync);
  +
  +	do {
  +		libeth_tx_complete(tx_buf, &cp);
  +		idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);
  +	} while (idpf_tx_buf_compl_tag(tx_buf) == params->compl_tag);
  +
  +	/* Update tail in case netdev_xmit_more was previously true. */
  +	idpf_tx_buf_hw_update(txq, params->prev_ntu, false);
  +}
  +
   /**
    * idpf_tx_splitq_map - Build the Tx flex descriptor
    * @tx_q: queue to send buffer on
  @@ -2482,8 +2462,9 @@
   	for (frag = &skb_shinfo(skb)->frags[0];; frag++) {
   		unsigned int max_data = IDPF_TX_MAX_DESC_DATA_ALIGNED;

  -		if (dma_mapping_error(tx_q->dev, dma))
  -			return idpf_tx_dma_map_error(tx_q, skb, first, i);
  +		if (unlikely(dma_mapping_error(tx_q->dev, dma)))
  +			return idpf_tx_splitq_pkt_err_unmap(tx_q, params,
  +							    first);

   		first->nr_frags++;
   		idpf_tx_buf_compl_tag(tx_buf) = params->compl_tag;
  @@ -2939,7 +2920,9 @@
   static netdev_tx_t idpf_tx_splitq_frame(struct sk_buff *skb,
   					struct idpf_tx_queue *tx_q)
   {
  -	struct idpf_tx_splitq_params tx_params = { };
  +	struct idpf_tx_splitq_params tx_params = {
  +		.prev_ntu = tx_q->next_to_use,
  +	};
   	union idpf_flex_tx_ctx_desc *ctx_desc;
   	struct idpf_tx_buf *first;
   	unsigned int count;

  ================================================================================
  *    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -2328,7 +2380,7 @@
   		 * descriptor. Reset that here.
   		 */
   		tx_desc = &txq->flex_tx[idx];
  -		memset(tx_desc, 0, sizeof(struct idpf_flex_tx_ctx_desc));
  +		memset(tx_desc, 0, sizeof(*tx_desc));
   		if (idx == 0)
   			idx = txq->desc_count;
   		idx--;
  @@ -2819,4 +2871,5 @@
   {
   	struct idpf_tx_splitq_params tx_params = { };
  +	union idpf_flex_tx_ctx_desc *ctx_desc;
   	struct idpf_tx_buf *first;
   	unsigned int count;

The last chunk is the relevant one here, as explained in the commit message.
I am not sure why interdiff reports the rest.
Check the colordiff log as well:
49bb62933e77_colordiff.log

[DIFF] PR commit 9c1f0f3ca76a (idpf: replace flow scheduling buffer ring with buffer pool) → upstream 5f417d551324
Differences found:

  ================================================================================
  *    DELTA DIFFERENCES - code changes that differ between the patches          *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -1917,11 +1917,8 @@
   		.napi	= budget,
   	};

  -	tx_buf = &txq->tx_buf[buf_id];
  -	if (tx_buf->type == LIBETH_SQE_SKB) {
  +	if (tx_buf->type == LIBETH_SQE_SKB)
   		libeth_tx_complete(tx_buf, &cp);
  -		idpf_post_buf_refill(txq->refillq, buf_id);
  -	}

   	while (idpf_tx_buf_next(tx_buf) != IDPF_TXBUF_NULL) {
   		buf_id = idpf_tx_buf_next(tx_buf);

  ################################################################################
  !    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
  ################################################################################

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -1962,6 +1962,7 @@
   		     idpf_tx_buf_compl_tag(tx_buf) != compl_tag))
   		return false;

  +	tx_buf = &txq->tx_buf[buf_id];
   	if (tx_buf->type == LIBETH_SQE_SKB) {
   		if (skb_shinfo(tx_buf->skb)->tx_flags & SKBTX_IN_PROGRESS)
   			idpf_tx_read_tstamp(txq, tx_buf->skb);
  @@ -1965,6 +1966,7 @@
   			idpf_tx_read_tstamp(txq, tx_buf->skb);

   		libeth_tx_complete(tx_buf, &cp);
  +		idpf_post_buf_refill(txq->refillq, buf_id);
   	}

   	idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);
  @@ -2892,6 +2859,7 @@
   	struct idpf_tx_buf *first;
   	unsigned int count;
   	int tso, idx;
  +	u32 buf_id;

   	count = idpf_tx_desc_count_required(tx_q, skb);
   	if (unlikely(!count))
  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -707,7 +715,7 @@
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
   			    120 + sizeof(struct u64_stats_sync),
  -			    24);
  +			    32);

   /**
    * struct idpf_buf_queue - software structure representing a buffer queue

  ================================================================================
  *    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -1917,8 +1965,12 @@
   		     idpf_tx_buf_compl_tag(tx_buf) != compl_tag))
   		return false;

  -	if (tx_buf->type == LIBETH_SQE_SKB)
  +	if (tx_buf->type == LIBETH_SQE_SKB) {
  +		if (skb_shinfo(tx_buf->skb)->tx_flags & SKBTX_IN_PROGRESS)
  +			idpf_tx_read_tstamp(txq, tx_buf->skb);
  +
   		libeth_tx_complete(tx_buf, &cp);
  +	}

   	idpf_tx_clean_buf_ring_bump_ntc(txq, idx, tx_buf);

  @@ -2746,4 +2798,4 @@
  -	struct idpf_flex_tx_ctx_desc *desc;
  +	union idpf_flex_tx_ctx_desc *desc;
   	int i = txq->next_to_use;

   	txq->tx_buf[i].type = LIBETH_SQE_CTX;
  @@ -2804,6 +2856,6 @@
   	struct idpf_tx_buf *first;
   	unsigned int count;
  -	int tso;
  +	int tso, idx;

   	count = idpf_tx_desc_count_required(tx_q, skb);
   	if (unlikely(!count))
  @@ -2842,4 +2962,4 @@
  -		u64_stats_update_end(&tx_q->stats_sync);
  +		idpf_tx_set_tstamp_desc(ctx_desc, idx);
   	}

   	/* record the location of the first descriptor for this packet */
  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -685,7 +693,7 @@
   	__cacheline_group_end_aligned(cold);
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
  -			    96 + sizeof(struct u64_stats_sync),
  +			    120 + sizeof(struct u64_stats_sync),
   			    24);

   /**

This is due to the missing 1a49cf8 ("idpf: add Tx timestamp flows"); the same applies to the other hunks here.

9c1f0f3ca76a_colordiff.log

[DIFF] PR commit 0facc01c7a6a (idpf: stop Tx if there are insufficient buffer resources) → upstream 0c3f135e840d
Differences found:

  ################################################################################
  !    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
  ################################################################################

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -2909,7 +2926,7 @@
   	};
   	union idpf_flex_tx_ctx_desc *ctx_desc;
   	struct idpf_tx_buf *first;
  -	unsigned int count;
  +	u32 count, buf_count = 1;
   	int tso, idx;
   	u32 buf_id;


  ================================================================================
  *    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -2770,7 +2821,8 @@
   	};
  +	union idpf_flex_tx_ctx_desc *ctx_desc;
   	struct idpf_tx_buf *first;
   	unsigned int count;
  -	int tso;
  +	int tso, idx;
   	u32 buf_id;

   	count = idpf_tx_desc_count_required(tx_q, skb);

The second difference is due to the same missing commit. All good.

0facc01c7a6a_colordiff.log

[DIFF] PR commit 044a3252be96 (idpf: remove obsolete stashing code) → upstream 6c4e68480238
Differences found:

  ================================================================================
  *    DELTA DIFFERENCES - code changes that differ between the patches          *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -1559,6 +1559,82 @@
   	wake_up(&vport->sw_marker_wq);
   }

  +/**
  + * idpf_tx_clean_stashed_bufs - clean bufs that were stored for
  + * out of order completions
  + * @txq: queue to clean
  + * @compl_tag: completion tag of packet to clean (from completion descriptor)
  + * @cleaned: pointer to stats struct to track cleaned packets/bytes
  + * @budget: Used to determine if we are in netpoll
  + */
  +static void idpf_tx_clean_stashed_bufs(struct idpf_tx_queue *txq,
  +				       u16 compl_tag,
  +				       struct libeth_sq_napi_stats *cleaned,
  +				       int budget)
  +{
  +	struct idpf_tx_stash *stash;
  +	struct hlist_node *tmp_buf;
  +	struct libeth_cq_pp cp = {
  +		.dev	= txq->dev,
  +		.ss	= cleaned,
  +		.napi	= budget,
  +	};
  +
  +	/* Buffer completion */
  +	hash_for_each_possible_safe(txq->stash->sched_buf_hash, stash, tmp_buf,
  +				    hlist, compl_tag) {
  +		if (unlikely(idpf_tx_buf_compl_tag(&stash->buf) != compl_tag))
  +			continue;
  +
  +		hash_del(&stash->hlist);
  +		libeth_tx_complete(&stash->buf, &cp);
  +
  +		/* Push shadow buf back onto stack */
  +		idpf_buf_lifo_push(&txq->stash->buf_stack, stash);
  +	}
  +}
  +
  +/**
  + * idpf_stash_flow_sch_buffers - store buffer parameters info to be freed at a
  + * later time (only relevant for flow scheduling mode)
  + * @txq: Tx queue to clean
  + * @tx_buf: buffer to store
  + */
  +static int idpf_stash_flow_sch_buffers(struct idpf_tx_queue *txq,
  +				       struct idpf_tx_buf *tx_buf)
  +{
  +	struct idpf_tx_stash *stash;
  +
  +	if (unlikely(tx_buf->type <= LIBETH_SQE_CTX))
  +		return 0;
  +
  +	stash = idpf_buf_lifo_pop(&txq->stash->buf_stack);
  +	if (unlikely(!stash)) {
  +		net_err_ratelimited("%s: No out-of-order TX buffers left!\n",
  +				    netdev_name(txq->netdev));
  +
  +		return -ENOMEM;
  +	}
  +
  +	/* Store buffer params in shadow buffer */
  +	stash->buf.skb = tx_buf->skb;
  +	stash->buf.bytes = tx_buf->bytes;
  +	stash->buf.packets = tx_buf->packets;
  +	stash->buf.type = tx_buf->type;
  +	stash->buf.nr_frags = tx_buf->nr_frags;
  +	dma_unmap_addr_set(&stash->buf, dma, dma_unmap_addr(tx_buf, dma));
  +	dma_unmap_len_set(&stash->buf, len, dma_unmap_len(tx_buf, len));
  +	idpf_tx_buf_compl_tag(&stash->buf) = idpf_tx_buf_compl_tag(tx_buf);
  +
  +	/* Add buffer to buf_hash table to be freed later */
  +	hash_add(txq->stash->sched_buf_hash, &stash->hlist,
  +		 idpf_tx_buf_compl_tag(&stash->buf));
  +
  +	tx_buf->type = LIBETH_SQE_EMPTY;
  +
  +	return 0;
  +}
  +
   #define idpf_tx_splitq_clean_bump_ntc(txq, ntc, desc, buf)	\
   do {								\
   	if (unlikely(++(ntc) == (txq)->desc_count)) {		\
  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -653,7 +660,7 @@
   	__cacheline_group_end_aligned(cold);
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
  -			    80 + sizeof(struct u64_stats_sync),
  +			    96 + sizeof(struct u64_stats_sync),
   			    32);

   /**

  ################################################################################
  !    REJECTED PATCH2 HUNKS - could not be compared; manual review needed       !
  ################################################################################

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -1602,87 +1462,6 @@
   	spin_unlock_bh(&tx_tstamp_caps->status_lock);
   }

  -/**
  - * idpf_tx_clean_stashed_bufs - clean bufs that were stored for
  - * out of order completions
  - * @txq: queue to clean
  - * @compl_tag: completion tag of packet to clean (from completion descriptor)
  - * @cleaned: pointer to stats struct to track cleaned packets/bytes
  - * @budget: Used to determine if we are in netpoll
  - */
  -static void idpf_tx_clean_stashed_bufs(struct idpf_tx_queue *txq,
  -				       u16 compl_tag,
  -				       struct libeth_sq_napi_stats *cleaned,
  -				       int budget)
  -{
  -	struct idpf_tx_stash *stash;
  -	struct hlist_node *tmp_buf;
  -	struct libeth_cq_pp cp = {
  -		.dev	= txq->dev,
  -		.ss	= cleaned,
  -		.napi	= budget,
  -	};
  -
  -	/* Buffer completion */
  -	hash_for_each_possible_safe(txq->stash->sched_buf_hash, stash, tmp_buf,
  -				    hlist, compl_tag) {
  -		if (unlikely(idpf_tx_buf_compl_tag(&stash->buf) != compl_tag))
  -			continue;
  -
  -		hash_del(&stash->hlist);
  -
  -		if (stash->buf.type == LIBETH_SQE_SKB &&
  -		    (skb_shinfo(stash->buf.skb)->tx_flags & SKBTX_IN_PROGRESS))
  -			idpf_tx_read_tstamp(txq, stash->buf.skb);
  -
  -		libeth_tx_complete(&stash->buf, &cp);
  -
  -		/* Push shadow buf back onto stack */
  -		idpf_buf_lifo_push(&txq->stash->buf_stack, stash);
  -	}
  -}
  -
  -/**
  - * idpf_stash_flow_sch_buffers - store buffer parameters info to be freed at a
  - * later time (only relevant for flow scheduling mode)
  - * @txq: Tx queue to clean
  - * @tx_buf: buffer to store
  - */
  -static int idpf_stash_flow_sch_buffers(struct idpf_tx_queue *txq,
  -				       struct idpf_tx_buf *tx_buf)
  -{
  -	struct idpf_tx_stash *stash;
  -
  -	if (unlikely(tx_buf->type <= LIBETH_SQE_CTX))
  -		return 0;
  -
  -	stash = idpf_buf_lifo_pop(&txq->stash->buf_stack);
  -	if (unlikely(!stash)) {
  -		net_err_ratelimited("%s: No out-of-order TX buffers left!\n",
  -				    netdev_name(txq->netdev));
  -
  -		return -ENOMEM;
  -	}
  -
  -	/* Store buffer params in shadow buffer */
  -	stash->buf.skb = tx_buf->skb;
  -	stash->buf.bytes = tx_buf->bytes;
  -	stash->buf.packets = tx_buf->packets;
  -	stash->buf.type = tx_buf->type;
  -	stash->buf.nr_frags = tx_buf->nr_frags;
  -	dma_unmap_addr_set(&stash->buf, dma, dma_unmap_addr(tx_buf, dma));
  -	dma_unmap_len_set(&stash->buf, len, dma_unmap_len(tx_buf, len));
  -	idpf_tx_buf_compl_tag(&stash->buf) = idpf_tx_buf_compl_tag(tx_buf);
  -
  -	/* Add buffer to buf_hash table to be freed later */
  -	hash_add(txq->stash->sched_buf_hash, &stash->hlist,
  -		 idpf_tx_buf_compl_tag(&stash->buf));
  -
  -	tx_buf->type = LIBETH_SQE_EMPTY;
  -
  -	return 0;
  -}
  -
   #define idpf_tx_splitq_clean_bump_ntc(txq, ntc, desc, buf)	\
   do {								\
   	if (unlikely(++(ntc) == (txq)->desc_count)) {		\
  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -599,9 +565,6 @@
    * @cleaned_pkts: Number of packets cleaned for the above said case
    * @stash: Tx buffer stash for Flow-based scheduling mode
    * @refillq: Pointer to refill queue
  - * @compl_tag_bufid_m: Completion tag buffer id mask
  - * @compl_tag_cur_gen: Used to keep track of current completion tag generation
  - * @compl_tag_gen_max: To determine when compl_tag_cur_gen should be reset
    * @cached_tstamp_caps: Tx timestamp capabilities negotiated with the CP
    * @tstamp_task: Work that handles Tx timestamp read
    * @stats_sync: See struct u64_stats_sync
  @@ -650,10 +611,6 @@
   	struct idpf_txq_stash *stash;
   	struct idpf_sw_queue *refillq;

  -	u16 compl_tag_bufid_m;
  -	u16 compl_tag_cur_gen;
  -	u16 compl_tag_gen_max;
  -
   	struct idpf_ptp_vport_tx_tstamp_caps *cached_tstamp_caps;
   	struct work_struct *tstamp_task;

  @@ -671,7 +628,7 @@
   	__cacheline_group_end_aligned(cold);
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
  -			    120 + sizeof(struct u64_stats_sync),
  +			    104 + sizeof(struct u64_stats_sync),
   			    32);

   /**

  ================================================================================
  *    CONTEXT DIFFERENCES - surrounding code differences between the patches    *
  ================================================================================

  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.c
  @@ -4,4 +4,4 @@
  -#include "idpf.h"
  +#include "idpf_ptp.h"
   #include "idpf_virtchnl.h"

   struct idpf_tx_stash {
  @@ -1727,6 +1770,11 @@
   			continue;

   		hash_del(&stash->hlist);
  +
  +		if (stash->buf.type == LIBETH_SQE_SKB &&
  +		    (skb_shinfo(stash->buf.skb)->tx_flags & SKBTX_IN_PROGRESS))
  +			idpf_tx_read_tstamp(txq, stash->buf.skb);
  +
   		libeth_tx_complete(&stash->buf, &cp);

   		/* Push shadow buf back onto stack */
  --- b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
  @@ -632,2 +638,4 @@
    * @compl_tag_gen_max: To determine when compl_tag_cur_gen should be reset
  + * @cached_tstamp_caps: Tx timestamp capabilities negotiated with the CP
  + * @tstamp_task: Work that handles Tx timestamp read
    * @stats_sync: See struct u64_stats_sync
  @@ -682,6 +690,6 @@
   	u16 compl_tag_cur_gen;
   	u16 compl_tag_gen_max;

  -	struct u64_stats_sync stats_sync;
  -	struct idpf_tx_queue_stats q_stats;
  -	__cacheline_group_end_aligned(read_write);
  +	struct idpf_ptp_vport_tx_tstamp_caps *cached_tstamp_caps;
  +	struct work_struct *tstamp_task;
  +
  @@ -696,7 +707,7 @@
   	__cacheline_group_end_aligned(cold);
   };
   libeth_cacheline_set_assert(struct idpf_tx_queue, 64,
  -			    96 + sizeof(struct u64_stats_sync),
  +			    120 + sizeof(struct u64_stats_sync),
   			    32);

   /**

Not sure why it complains so much here.
044a3252be96_colordiff.log

The only relevant change is this one, as explained in the commit message.

jira KERNEL-170
commit-author Brian Vazquez <brianvv@google.com>
commit 7292af0

Add a helper function to correctly handle the lockless synchronization
when the sender needs to block. The paradigm is

    if (no_resources()) {
        stop_queue();
        barrier();
        if (!no_resources())
            restart_queue();
    }

netif_subqueue_maybe_stop already handles the paradigm correctly, but
the code split the check for resources in three parts, the first one
(descriptors) followed the protocol, but the other two (completions and
tx_buf) were only doing the first part and so race prone.

Luckily netif_subqueue_maybe_stop macro already allows you to use a
function to evaluate the start/stop conditions so the fix only requires
the right helper function to evaluate all the conditions at once.

The patch removes idpf_tx_maybe_stop_common since it's no longer needed
and instead adjusts separately the conditions for singleq and splitq.

Note that idpf_tx_buf_hw_update doesn't need to check for resources
since that will be covered in idpf_tx_splitq_frame.

To reproduce:

Reduce the threshold for pending completions to increase the chances of
hitting this pause by changing your kernel:

drivers/net/ethernet/intel/idpf/idpf_txrx.h

-#define IDPF_TX_COMPLQ_OVERFLOW_THRESH(txcq)	((txcq)->desc_count >> 1)
+#define IDPF_TX_COMPLQ_OVERFLOW_THRESH(txcq)	((txcq)->desc_count >> 4)

Use pktgen to force the host to push small pkts very aggressively:

./pktgen_sample02_multiqueue.sh -i eth1 -s 100 -6 -d $IP -m $MAC \
  -p 10000-10000 -t 16 -n 0 -v -x -c 64

Fixes: 6818c4d ("idpf: add splitq start_xmit")
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Signed-off-by: Josh Hay <joshua.a.hay@intel.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 7292af0)
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
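
The stop/recheck/restart paradigm from the commit message above can be sketched in userspace C with C11 atomics. This is only an illustration of the idea, not the driver code: `struct sketch_txq` and all `sketch_*` functions are hypothetical names, and the sequentially consistent fence stands in for the `barrier()` step that the kernel helper is said to handle. The point is that all three resource conditions are evaluated by one helper, both before stopping and again after stopping.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a Tx queue; not the driver's struct. */
struct sketch_txq {
	atomic_int free_descs;    /* free descriptor slots */
	atomic_int pending_compl; /* completions not yet processed */
	atomic_int free_bufs;     /* free Tx buffers */
	atomic_bool stopped;      /* models the "subqueue stopped" state */
	int compl_thresh;         /* max tolerated pending completions */
};

static void sketch_txq_init(struct sketch_txq *q, int descs, int pending,
			    int bufs, int thresh)
{
	atomic_init(&q->free_descs, descs);
	atomic_init(&q->pending_compl, pending);
	atomic_init(&q->free_bufs, bufs);
	atomic_init(&q->stopped, false);
	q->compl_thresh = thresh;
}

/* Evaluate every resource condition in one place. */
static bool sketch_txq_has_room(struct sketch_txq *q, int descs, int bufs)
{
	return atomic_load(&q->free_descs) >= descs &&
	       atomic_load(&q->pending_compl) <= q->compl_thresh &&
	       atomic_load(&q->free_bufs) >= bufs;
}

/* Returns true if the sender has to back off, false if it may send. */
static bool sketch_txq_maybe_stop(struct sketch_txq *q, int descs, int bufs)
{
	if (sketch_txq_has_room(q, descs, bufs))
		return false;

	atomic_store(&q->stopped, true);
	/* Order the stop against the recheck (the barrier() step). */
	atomic_thread_fence(memory_order_seq_cst);

	/* The cleaner may have freed resources between check and stop. */
	if (!sketch_txq_has_room(q, descs, bufs))
		return true;

	atomic_store(&q->stopped, false);
	return false;
}
```

If only the first condition were rechecked after the stop, a cleaner that frees buffers or completions between the first check and the stop would never wake the queue, which is the race the commit describes.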

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit cb83b55
upstream-diff |
	adjusted the number of bytes expected in
	libeth_cacheline_set_assert for struct idpf_tx_queue because some
	elements of the struct, introduced in commit 1a49cf8 ("idpf: add
	Tx timestamp flows"), are missing.

In certain production environments, it is possible for completion tags
to collide, meaning N packets with the same completion tag are in flight
at the same time. In this environment, any given Tx queue is effectively
used to send both slower traffic and higher throughput traffic
simultaneously. This is the result of a customer's specific
configuration in the device pipeline, the details of which Intel cannot
provide. This configuration results in a small number of out-of-order
completions, i.e., a small number of packets in flight. The existing
guardrails in the driver only protect against a large number of packets
in flight. The slower flow completions are delayed which causes the
out-of-order completions. The fast flow will continue sending traffic
and generating tags. Because tags are generated on the fly, the fast
flow eventually uses the same tag for a packet that is still in flight
from the slower flow. The driver has no idea which packet it should
clean when it processes the completion with that tag, but it will look
for the packet on the buffer ring before the hash table. If the slower
flow packet completion is processed first, it will end up cleaning the
fast flow packet on the ring prematurely. This leaves the descriptor
ring in a bad state resulting in a crash or Tx timeout.

In summary, generating a tag when a packet is sent can lead to the same
tag being associated with multiple packets. This can lead to resource
leaks, crashes, and/or Tx timeouts.

Before we can replace the tag generation, we need a new mechanism for
the send path to know what tag to use next. The driver will allocate and
initialize a refillq for each TxQ with all of the possible free tag
values. During send, the driver grabs the next free tag from the refillq
from next_to_clean. While cleaning the packet, the clean routine posts
the tag back to the refillq's next_to_use to indicate that it is now
free to use.

This mechanism works exactly the same way as the existing Rx refill
queues, which post the cleaned buffer IDs back to the buffer queue to be
reposted to HW. Since we're using the refillqs for both Rx and Tx now,
genericize some of the existing refillq support.

Note: the refillqs will not be used yet. This is only demonstrating how
they will be used to pass free tags back to the send path.

Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit cb83b55)
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
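
A minimal sketch of the free-tag refill queue described above, assuming a plain ring with an explicit count of available entries. The names `sketch_refillq`, `sketch_refillq_get_tag` and `sketch_refillq_post_tag` are hypothetical, and the `avail` counter is a simplification chosen for this example, not necessarily how the driver tracks ring occupancy. The send path consumes at `next_to_clean`; the clean path produces at `next_to_use`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical refill queue holding free completion tags. */
struct sketch_refillq {
	uint16_t *ring;
	uint32_t desc_count;    /* number of slots, at most 65536 */
	uint32_t next_to_use;   /* where the clean path posts a freed tag */
	uint32_t next_to_clean; /* where the send path takes a free tag */
	uint32_t avail;         /* tags currently free */
};

/* Fill the ring with every possible tag value: 0 .. count - 1. */
static bool sketch_refillq_init(struct sketch_refillq *rq, uint32_t count)
{
	assert(count > 0 && count <= 65536);
	rq->ring = malloc(count * sizeof(*rq->ring));
	if (!rq->ring)
		return false;
	for (uint32_t i = 0; i < count; i++)
		rq->ring[i] = (uint16_t)i;
	rq->desc_count = count;
	rq->next_to_use = 0;
	rq->next_to_clean = 0;
	rq->avail = count;
	return true;
}

static void sketch_refillq_destroy(struct sketch_refillq *rq)
{
	free(rq->ring);
	rq->ring = NULL;
}

/* Send path: take the next free tag; false means none are left. */
static bool sketch_refillq_get_tag(struct sketch_refillq *rq, uint16_t *tag)
{
	if (!rq->avail)
		return false;
	*tag = rq->ring[rq->next_to_clean];
	rq->next_to_clean = (rq->next_to_clean + 1) % rq->desc_count;
	rq->avail--;
	return true;
}

/* Clean path: hand a tag back once its packet has completed. */
static void sketch_refillq_post_tag(struct sketch_refillq *rq, uint16_t tag)
{
	assert(rq->avail < rq->desc_count);
	rq->ring[rq->next_to_use] = tag;
	rq->next_to_use = (rq->next_to_use + 1) % rq->desc_count;
	rq->avail++;
}
```

Because a tag can only be handed out after it has been posted back, two in-flight packets cannot share a tag in this model, regardless of the order in which completions arrive.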

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit f2d18e1

Track the gap between next_to_use and the last RE index. Set RE again if
the gap is large enough to ensure RE bit is set frequently. This is
critical before removing the stashing mechanisms because the
opportunistic descriptor ring cleaning from the out-of-order completions
will go away. Previously the descriptors would be "cleaned" by both the
descriptor (RE) completion and the out-of-order completions. Without the
latter, we must ensure the RE bit is set more frequently. Otherwise,
it's theoretically possible for the descriptor ring next_to_clean to
never advance. The previous implementation was dependent on the start of
a packet falling on a 64th index in the descriptor ring, which is not
guaranteed with large packets.

Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit f2d18e1)
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
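
The gap tracking described above can be sketched as follows. The struct, the function names and the threshold of 64 are assumptions made for this example (the commit message only mentions that the old logic depended on a 64th index); the sketch shows why measuring the distance from the last RE index works even when packets never start on a multiple of 64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RE_MIN_GAP 64u /* assumed threshold, for illustration only */

/* Hypothetical subset of Tx queue state needed for the RE decision. */
struct sketch_txq_re {
	uint32_t desc_count;  /* descriptor ring size */
	uint32_t next_to_use; /* next descriptor index to fill */
	uint32_t last_re;     /* index where RE was last requested */
};

/* True when enough descriptors were queued since the last RE request. */
static bool sketch_need_re(const struct sketch_txq_re *q)
{
	/* Modular distance, so the result is valid across ring wraparound. */
	uint32_t gap = (q->next_to_use + q->desc_count - q->last_re) %
		       q->desc_count;

	return gap >= SKETCH_RE_MIN_GAP;
}

/* Queue a packet of ndescs descriptors; returns true if RE was set on it. */
static bool sketch_queue_packet(struct sketch_txq_re *q, uint32_t ndescs)
{
	bool re;

	assert(ndescs > 0 && ndescs < q->desc_count);
	re = sketch_need_re(q);
	if (re)
		q->last_re = q->next_to_use;
	q->next_to_use = (q->next_to_use + ndescs) % q->desc_count;
	return re;
}
```

With 17-descriptor packets on a 512-entry ring, packets start at indexes 0, 17, 34, 51, 68 and so on. None of the early starts is a multiple of 64 apart from 0, yet the packet starting at index 68 still requests RE because the gap since the last RE index has reached the threshold.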

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit b61dfa9
upstream-diff |
	adjusted context in 2 places:
	- when removing func idpf_tx_dma_map_error due to different memset
	  call that uses the hardcoded struct type;
	- in func idpf_tx_splitq_frame due to missing expected
	  union idpf_flex_tx_ctx_desc *ctx_desc;
	both differences were introduced in commit 1a49cf8 ("idpf: add Tx
	timestamp flows").

Move (and rename) the existing rollback logic to singleq.c since that
will be the only consumer. Create a simplified splitq specific rollback
function to loop through and unmap tx_bufs based on the completion tag.
This is critical before replacing the Tx buffer ring with the buffer
pool since the previous rollback indexing will not work to unmap the
chained buffers from the pool.

Cache the next_to_use index before any portion of the packet is put on
the descriptor ring. In case of an error, the rollback will bump tail to
the correct next_to_use value. Because the splitq path now supports
different types of context descriptors (and potentially multiple in the
future), this will take care of rolling back any and all context
descriptors encoded on the ring for the erroneous packet. The previous
rollback logic was broken for PTP packets since it would not account for
the PTP context descriptor.

Fixes: 1a49cf8 ("idpf: add Tx timestamp flows")
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit b61dfa9)
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
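
The "cache next_to_use before anything is queued" idea can be sketched like this. Everything here is hypothetical (`sketch_desc_ring`, `sketch_tx_frame`, and the `map_fail_at` parameter that simulates a DMA mapping failure); it only shows that restoring the cached index undoes context descriptors as well as data descriptors, however many of each were placed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor ring state. */
struct sketch_desc_ring {
	uint32_t desc_count;
	uint32_t next_to_use; /* next free descriptor slot */
	uint32_t tail;        /* models the value last written to HW tail */
};

static uint32_t sketch_ring_next(const struct sketch_desc_ring *r, uint32_t i)
{
	return (i + 1) % r->desc_count;
}

/*
 * Place ctx_descs context descriptors followed by data_descs data
 * descriptors. map_fail_at is the index of the data descriptor whose
 * (simulated) mapping fails, or -1 for no failure.
 */
static bool sketch_tx_frame(struct sketch_desc_ring *r, uint32_t ctx_descs,
			    uint32_t data_descs, int map_fail_at)
{
	/* Cached before any part of the packet touches the ring. */
	uint32_t saved_ntu = r->next_to_use;

	for (uint32_t i = 0; i < ctx_descs; i++)
		r->next_to_use = sketch_ring_next(r, r->next_to_use);

	for (uint32_t i = 0; i < data_descs; i++) {
		if (map_fail_at >= 0 && i == (uint32_t)map_fail_at) {
			/* Roll back context and data descriptors alike. */
			r->next_to_use = saved_ntu;
			r->tail = saved_ntu;
			return false;
		}
		r->next_to_use = sketch_ring_next(r, r->next_to_use);
	}

	r->tail = r->next_to_use;
	return true;
}
```

A rollback that only counted data descriptors would leave `next_to_use` one slot too far ahead for every context descriptor the failed packet had placed, which matches the PTP problem the commit message mentions.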

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 5f417d5
upstream-diff |
	adjusted context in:
	- idpf_tx_splitq_frame and idpf_tx_clean_bufs;
	- libeth_cacheline_set_assert for struct idpf_tx_queue, because
	  some elements are missing from the struct;
	all cases are due to the missing commit 1a49cf8 ("idpf: add Tx
	timestamp flows").

Replace the TxQ buffer ring with one large pool/array of buffers (only
for flow scheduling). This eliminates the tag generation and makes it
impossible for a tag to be associated with more than one packet.

The completion tag passed to HW through the descriptor is the index into
the array. That same completion tag is posted back to the driver in the
completion descriptor, and used to index into the array to quickly
retrieve the buffer during cleaning. In this way, the tags are treated
as a fixed-size resource. If all tags are in use, no more packets can be
sent on that particular queue (until some are freed up). The tag pool
size is 64K since the completion tag width is 16 bits.

For each packet, the driver pulls a free tag from the refillq to get the
next free buffer index. When cleaning is complete, the tag is posted
back to the refillq. A multi-frag packet spans multiple buffers in the
driver, therefore it uses multiple buffer indexes/tags from the pool.
Each frag pulls from the refillq to get the next free buffer index.
These are tracked in a next_buf field that replaces the completion tag
field in the buffer struct. This chains the buffers together so that the
packet can be cleaned from the starting completion tag taken from the
completion descriptor, then from the next_buf field for each subsequent
buffer.

If a dma_mapping_error occurs or the refillq runs out of free buf_ids,
the packet will execute the rollback error path. This unmaps any buffers
previously mapped for the packet. Since several free buf_ids could have
already been pulled from the refillq, we need to restore its original
state as well. Otherwise, the buf_ids/tags will be leaked and not used
again until the queue is reallocated.

Descriptor completions only advance the descriptor ring index to "clean"
the descriptors. The packet completions only clean the buffers
associated with the given packet completion tag and do not update the
descriptor ring index.

When operating in queue based scheduling mode, the array still acts as a
ring and will only have TxQ descriptor count entries. The tx_bufs are
still associated 1:1 with the descriptor ring entries and we can use the
conventional indexing mechanisms.

Fixes: c2d548c ("idpf: add TX splitq napi poll support")
Signed-off-by: Luigi Rizzo <lrizzo@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 5f417d5)
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
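
The buffer pool, the next_buf chaining and the rollback of pulled tags can be sketched together. This is a simplified model under stated assumptions: the pool has 8 entries instead of 64K, free tags are kept on a small stack rather than in a refillq ring, and every `sketch_*` name is hypothetical. The completion tag of a packet is the pool index of its first buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_POOL_SIZE 8u         /* tiny pool; the real tag space is 64K */
#define SKETCH_BUF_NULL  UINT32_MAX /* end-of-chain marker */

struct sketch_tx_buf {
	bool in_use;
	uint32_t next_buf; /* pool index of the packet's next buffer */
};

struct sketch_pool {
	struct sketch_tx_buf bufs[SKETCH_POOL_SIZE];
	uint16_t free_tags[SKETCH_POOL_SIZE]; /* stack of free indexes */
	uint32_t nfree;
};

static void sketch_pool_init(struct sketch_pool *p)
{
	for (uint32_t i = 0; i < SKETCH_POOL_SIZE; i++) {
		p->bufs[i].in_use = false;
		p->bufs[i].next_buf = SKETCH_BUF_NULL;
		p->free_tags[i] = (uint16_t)i;
	}
	p->nfree = SKETCH_POOL_SIZE;
}

/* Walk a chain and free each buffer; optionally post its tag back. */
static uint32_t sketch_release_chain(struct sketch_pool *p, uint32_t tag,
				     bool repost)
{
	uint32_t n = 0;

	while (tag != SKETCH_BUF_NULL) {
		struct sketch_tx_buf *buf = &p->bufs[tag];
		uint32_t next = buf->next_buf;

		assert(buf->in_use);
		buf->in_use = false;
		buf->next_buf = SKETCH_BUF_NULL;
		if (repost)
			p->free_tags[p->nfree++] = (uint16_t)tag;
		tag = next;
		n++;
	}
	return n;
}

/* Claim nbufs chained buffers; on success *compl_tag is the first index. */
static bool sketch_send(struct sketch_pool *p, uint32_t nbufs,
			uint16_t *compl_tag)
{
	uint32_t saved_nfree = p->nfree; /* free-list state before the packet */
	uint32_t head = SKETCH_BUF_NULL, prev = SKETCH_BUF_NULL;

	for (uint32_t i = 0; i < nbufs; i++) {
		uint32_t tag;

		if (!p->nfree) {
			/* Rollback: free claimed buffers, restore the free list. */
			sketch_release_chain(p, head, false);
			p->nfree = saved_nfree;
			return false;
		}
		tag = p->free_tags[--p->nfree];
		p->bufs[tag].in_use = true;
		p->bufs[tag].next_buf = SKETCH_BUF_NULL;
		if (prev == SKETCH_BUF_NULL)
			head = tag;
		else
			p->bufs[prev].next_buf = tag;
		prev = tag;
	}
	if (head == SKETCH_BUF_NULL)
		return false; /* nbufs was 0: nothing to send */
	*compl_tag = (uint16_t)head;
	return true;
}

/* Completion: index by tag, follow next_buf, post the tags back. */
static uint32_t sketch_clean(struct sketch_pool *p, uint16_t compl_tag)
{
	return sketch_release_chain(p, compl_tag, true);
}
```

Completions may arrive in any order in this model: cleaning one packet touches only the buffers reachable from its own tag, so a delayed completion for a slow flow cannot free buffers that belong to a later packet.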

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 0c3f135
upstream-diff |
	adjusted conflict in idpf_tx_splitq_frame func due to missing
	1a49cf8 ("idpf: add Tx timestamp flows").

The Tx refillq logic will cause packets to be silently dropped if there
are not enough buffer resources available to send a packet in flow
scheduling mode. Instead, determine how many buffers are needed along
with number of descriptors. Make sure there are enough of both resources
to send the packet, and stop the queue if not.

Fixes: 7292af0 ("idpf: fix a race in txq wakeup")
Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 0c3f135)
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
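
A small sketch of the "check both resources up front" behaviour described above, with hypothetical names throughout (`sketch_res`, `sketch_xmit_frame`, and the two result codes). It contrasts with a design that checks only descriptors and then fails part-way through a packet when buffers run out: here the packet is either fully accepted or the queue is stopped and the caller is told to retry.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result codes: accepted, or "queue stopped, try again". */
enum sketch_xmit {
	SKETCH_XMIT_OK,
	SKETCH_XMIT_BUSY,
};

/* Hypothetical per-queue resource counters. */
struct sketch_res {
	unsigned int free_descs; /* unused descriptor ring entries */
	unsigned int free_bufs;  /* free buffer IDs available to the sender */
	bool stopped;            /* models the stopped subqueue */
};

/*
 * Accept the packet only when both the descriptors and the buffers it
 * needs are available; otherwise stop the queue instead of dropping.
 */
static enum sketch_xmit sketch_xmit_frame(struct sketch_res *q,
					  unsigned int descs_needed,
					  unsigned int bufs_needed)
{
	if (q->free_descs < descs_needed || q->free_bufs < bufs_needed) {
		q->stopped = true;
		return SKETCH_XMIT_BUSY;
	}

	q->free_descs -= descs_needed;
	q->free_bufs -= bufs_needed;
	return SKETCH_XMIT_OK;
}
```

The sketch leaves out the recheck after stopping; in a lockless Tx path the stop would be followed by a second evaluation of the same conditions, as in the txq wakeup fix earlier in this series.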

jira KERNEL-170
commit-author Joshua Hay <joshua.a.hay@intel.com>
commit 6c4e684
upstream-diff |
	- adjusted context due to missing idpf_tx_read_tstamp func;
	- adjusted the number of bytes expected in
	  libeth_cacheline_set_assert for struct idpf_tx_queue, because
	  some elements are missing from the struct;
	both are due to the missing commit 1a49cf8 ("idpf: add Tx
	timestamp flows").

With the new Tx buffer management scheme, there is no need for all of
the stashing mechanisms, the hash table, the reserve buffer stack, etc.
Remove all of that.

Signed-off-by: Joshua Hay <joshua.a.hay@intel.com>
Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
(cherry picked from commit 6c4e684)
Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>

Brian Vazquez and others added 3 commits February 25, 2026 10:57

roxanan1996 marked this pull request as draft February 25, 2026 13:30

jahay1 added 4 commits February 25, 2026 14:41

roxanan1996 force-pushed the {rnicolescu}/rlc-10/6.12.0-124.38.1.el10_1 branch from 044a325 to 622375b February 25, 2026 13:41

roxanan1996 self-assigned this Feb 25, 2026

roxanan1996 wants to merge 7 commits into rlc-10/6.12.0-124.38.1.el10_1 from {rnicolescu}/rlc-10/6.12.0-124.38.1.el10_1
