stomp: emit content-length header for bodies containing embedded NUL #1158
Devansh-567 wants to merge 1 commit into
OK, the zypper plugin does not respond with messages containing NUL. Apart from that: if the Stomp class does not trust the user to insert a content-length if required, why should it trust the user to insert a content-length with the correct value?
That is a completely fair point regarding API responsibility. My reasoning for the fallback is the structural difference between a user providing explicit (but incorrect) metadata versus the library performing a silently destructive default serialization:
The goal was defensive masking: ensuring the library's automated serialization path remains protocol-compliant even if binary data is passed implicitly. However, since the zypper plugin doesn't exchange payloads containing |
Yes, all the zypper plugin does is AFAIS reply with Anyway thanks for the review. |
Description
This PR addresses an edge-case serialization bug in the STOMP message writer: frame bodies containing binary payloads or embedded null (`\0`) bytes cannot safely round-trip. It ensures that a `content-length` header is generated whenever the STOMP specification requires one for the payload.

Problem

The current implementation of `write_message` serializes headers exactly as they are populated in the `Message::headers` map. However, the STOMP protocol specification makes a `content-length` header mandatory if the frame body contains embedded null bytes.

When a frame with an embedded `\0` in the body is sent without a `content-length` header:

1. `write_message` writes out the raw bytes followed by the trailing frame terminator.
2. When the reader (`read_message`) receives this frame, it takes the fallback branch because `has_content_length` evaluates to false.
3. That branch calls `getline(is, msg.body, '\0')`, which treats the first embedded null byte in the payload as the end-of-frame delimiter.

This results in silent data truncation, corrupting binary or multiplexed payloads during a serialization round-trip.
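The truncation described above can be reproduced in isolation. The following is a minimal sketch, not the module's actual code: `frame_without_length` and `read_body_by_delimiter` are hypothetical helpers that only imitate a writer emitting no `content-length` and a reader falling back to `getline` on `\0`.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: wire image of a header-less frame written
// WITHOUT a content-length header.
std::string frame_without_length(const std::string& body) {
    std::string wire = "SEND\n\n";  // command line, then the blank line that ends the headers
    wire += body;                   // raw body bytes, embedded NULs included
    wire.push_back('\0');           // frame terminator
    return wire;
}

// Hypothetical stand-in for the reader's delimiter-based fallback branch.
std::string read_body_by_delimiter(const std::string& wire) {
    std::istringstream is(wire);
    std::string line;
    std::getline(is, line);  // command line
    std::getline(is, line);  // blank line after the (empty) header block
    assert(line.empty());    // this sketch only handles frames without headers

    std::string body;
    std::getline(is, body, '\0');  // stops at the FIRST NUL, embedded or not
    return body;
}
```

Feeding a 7-byte body built as `std::string("abc\0def", 7)` through both helpers returns only `"abc"`, while a body without NUL bytes comes back unchanged.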
How I Found This
While tracing the data path within the `stomp` module to ensure end-to-end data integrity for non-text payloads, I compared the streaming constraints in the reader against the writer.

I observed that `read_message` uses one of two parsing modes depending on the presence of the `content-length` header: fixed-size block reads (`is.read`) or delimiter-based reads (`getline` splitting on `\0`).

`write_message`, by contrast, had no fallback or validation ensuring that the writer emits a length-prefixed frame when the body contains embedded nulls. To confirm the mismatch, I ran a test case with a payload containing internal null bytes. As suspected, the writer serialized the entire string, but the reader cut off the data at the first null character, confirming the round-trip failure.

Solution
I updated `write_message` to comply with the STOMP specification:

- It tracks a flag (`has_content_length`) during the primary header serialization loop to check whether the user manually set a `content-length`.
- If `content-length` was not explicitly set and `msg.body.find('\0') != string::npos` indicates an embedded null character, the encoder calculates and emits the appropriate `content-length:<size>` header line into the outgoing stream.

This change lets payloads containing arbitrary binary data round-trip intact without breaking downstream parsers that rely strictly on standard STOMP delimiter rules.
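For reviewers who want the logic at a glance, here is a rough sketch of the approach. It is not the actual diff: the `Message` struct, `write_message_sketch`, and `serialize` are simplified, hypothetical stand-ins, and the exact-match comparison on the header key is an assumption.

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical, simplified stand-in for the real message type.
struct Message {
    std::string command;
    std::map<std::string, std::string> headers;
    std::string body;
};

// Sketch of the writer: emit content-length only when the user did not
// supply one AND the body contains an embedded NUL.
void write_message_sketch(std::ostream& os, const Message& msg) {
    os << msg.command << '\n';

    bool has_content_length = false;
    for (const auto& kv : msg.headers) {
        if (kv.first == "content-length")  // assumption: exact, case-sensitive key match
            has_content_length = true;
        os << kv.first << ':' << kv.second << '\n';
    }

    if (!has_content_length && msg.body.find('\0') != std::string::npos)
        os << "content-length:" << msg.body.size() << '\n';

    os << '\n';  // blank line separates headers from body
    os.write(msg.body.data(), static_cast<std::streamsize>(msg.body.size()));
    os.put('\0');  // frame terminator
}

// Convenience wrapper for inspecting the serialized frame in memory.
std::string serialize(const Message& msg) {
    std::ostringstream os;
    write_message_sketch(os, msg);
    assert(os.good());  // an in-memory stream is not expected to fail here
    return os.str();
}
```

In this sketch a user-supplied `content-length` is passed through untouched, even if its value is wrong, which is the trade-off raised in the review discussion.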
If any cosmetic changes are needed, please do let me know :)