Skip to content

Rrivera/dma fix#140

Draft
rrivera747 wants to merge 9 commits into
developfrom
rrivera/DMAFix
Draft

Rrivera/dma fix#140
rrivera747 wants to merge 9 commits into
developfrom
rrivera/DMAFix

Conversation

@rrivera747
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown
Contributor

@corrodis corrodis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really able to follow all these changes, but this is the code that we tested, and it works - so I guess we go with it.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves DAQ DMA debugging and robustness by extending diagnostic output and updating subevent DMA parsing to better handle subevent headers that straddle DMA buffer boundaries.

Changes:

  • Extend mu2edev::spy() “wide view” mode to also print the tail of each DMA buffer.
  • Persist the last successfully parsed DTC_SubEventHeader across GetSubEventData() calls and log it on exceptions.
  • Refactor ReadNextDAQSubEventDMA() to handle subevent headers split across DMA buffer boundaries by assembling the header/payload contiguously.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
dtcInterfaceLib/mu2edev.cpp Adds “wide view” tail-buffer printing to improve DMA buffer diagnostics.
dtcInterfaceLib/DTC.h Adds members to retain the last good subevent header for exception-time diagnostics.
dtcInterfaceLib/DTC.cpp Adds exception diagnostics and reworks subevent DMA extraction for split-header cases.
Comments suppressed due to low confidence (5)

dtcInterfaceLib/DTC.cpp:255

  • The raw header dump reads 32-bit words via *reinterpret_cast<const uint32_t*>(&ptr[i]) from a uint8_t* pointing into lastGoodSubEventHeader_. This can be undefined behavior due to alignment/strict-aliasing (and may crash while handling an exception). Safer approach is to memcpy each 4-byte chunk into a uint32_t before formatting, or iterate bytes.
			auto ptr = reinterpret_cast<const uint8_t*>(&lastGoodSubEventHeader_);
			std::stringstream rawSS;
			rawSS << "Last good subevent header raw data (" << sizeof(DTC_SubEventHeader) << " bytes): 0x ";
			for (size_t i = 0; i < sizeof(DTC_SubEventHeader); i += 4)
				rawSS << std::hex << std::setw(8) << std::setfill('0') << *reinterpret_cast<const uint32_t*>(&ptr[i]) << ' ';
			DTC_TLOG(TLVL_ERROR) << rawSS.str();

dtcInterfaceLib/DTC.cpp:275

  • The raw header dump reads 32-bit words via *reinterpret_cast<const uint32_t*>(&ptr[i]) from a uint8_t* pointing into lastGoodSubEventHeader_. This can be undefined behavior due to alignment/strict-aliasing (and may crash while handling an exception). Safer approach is to memcpy each 4-byte chunk into a uint32_t before formatting, or iterate bytes.
			auto ptr = reinterpret_cast<const uint8_t*>(&lastGoodSubEventHeader_);
			std::stringstream rawSS;
			rawSS << "Last good subevent header raw data (" << sizeof(DTC_SubEventHeader) << " bytes): 0x ";
			for (size_t i = 0; i < sizeof(DTC_SubEventHeader); i += 4)
				rawSS << std::hex << std::setw(8) << std::setfill('0') << *reinterpret_cast<const uint32_t*>(&ptr[i]) << ' ';
			DTC_TLOG(TLVL_ERROR) << rawSS.str();

dtcInterfaceLib/DTC.cpp:1273

  • subEventByteCount is extracted with *reinterpret_cast<uint32_t*>(headerBuf), where headerBuf is a uint8_t[]. This can be undefined behavior due to alignment/strict-aliasing. Use memcpy (or std::bit_cast from std::array<std::byte,4>) into a uint32_t before masking.
			// Extract subEventByteCount from the assembled header (first 25 bits of first 4 bytes)
			auto subEventByteCount = static_cast<size_t>(
				*reinterpret_cast<uint32_t*>(headerBuf) & 0x1FFFFFF);

dtcInterfaceLib/DTC.cpp:1344

  • In the split-header continuation loop, bytes_read is incremented by buffer_size even though only copySize bytes are copied. If copySize < buffer_size (common on the final buffer), this will overcount and can terminate the loop early, leaving the assembled subevent incomplete/corrupt. Increment bytes_read by copySize (the actual copied bytes).

				memcpy(const_cast<uint8_t*>(static_cast<const uint8_t*>(inmem->GetRawBufferPointer()) + bytes_read),
					   daqDMAInfo_.currentReadPtr, copySize);
				bytes_read += buffer_size;

				daqDMAInfo_.currentReadPtr = reinterpret_cast<char*>(daqDMAInfo_.currentReadPtr) + copySize;

dtcInterfaceLib/DTC.cpp:1502

  • In the multi-DMA continuation path, bytes_read is incremented by buffer_size even though only copySize bytes are copied. If copySize < buffer_size, this can cause an incomplete subevent assembly and incorrect pointer/remaining-size calculations. Increment bytes_read by copySize (or the actual bytes appended).
					DTC_TLOG(TLVL_ReadNextDAQPacket) << "ReadNextDAQSubEventDMA got new buffer for subevent continuation, buffer_size=" << buffer_size << " copySize=" << copySize << " remainingEventSize=" << remainingEventSize;
					DTC_TLOG(TLVL_ReadNextDAQPacket) << "ReadNextDAQSubEventDMA so far bytes_read = " << bytes_read << " packets = " << bytes_read / 16 - sizeof(DTC_SubEventHeader) / 16;
					DTC_TLOG(TLVL_ReadNextDAQPacket) << "ReadNextDAQSubEventDMA copySize = " << copySize << " packets = " << copySize / 16;
					memcpy(const_cast<uint8_t*>(static_cast<const uint8_t*>(inmem->GetRawBufferPointer()) + bytes_read), daqDMAInfo_.currentReadPtr, copySize);
					bytes_read += buffer_size;

Comment thread dtcInterfaceLib/DTC.cpp
@rrivera747 rrivera747 requested a review from eflumerf May 13, 2026 17:51
@rrivera747 rrivera747 mentioned this pull request May 13, 2026
std::cout << " ...";
for (size_t dd = totalQwords - 13; dd < totalQwords; ++dd)
{
std::cout << " " << std::hex << std::setw(16) << std::setfill('0') << datap[dd];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty much guaranteed to blow up log files to an unacceptable degree

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this only happens when void mu2edev::spy(int chn, unsigned optsmsk) is called with 8 in the mask (which is on exception handling only .. which is when we need to see the data)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed with d77422c

Copy link
Copy Markdown
Contributor

@pavel1murat pavel1murat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, at this point the best is to bite this bullet and give it a try

@rrivera747
Copy link
Copy Markdown
Contributor Author

We know there is an issue with this (fails when all emulated ROCs are set to 700 packets) ... need to evaluate. Seems like a software issue, because another Opus 4.6 iteration was able to read 100K events but abused memory.

@rrivera747 rrivera747 marked this pull request as draft May 14, 2026 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants