Implement a UNIQUE constraint on the archive_id column (XEP-0359 stanza-id) to prevent duplicate messages in the chat log. This addresses cases where the same message could be stored multiple times if it was received both via MAM and via regular delivery.

Signed-off-by: Michael Vetter <jubalh@iodoru.org>
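To illustrate the effect of the constraint, here is a minimal sketch using Python's built-in `sqlite3` module. The `chat_log` table and the `store` helper are hypothetical and much simpler than the real schema; only the `archive_id` column name comes from the commit message. Rows without an archive id (NULL) are not affected, because SQLite treats NULL values as distinct in a UNIQUE column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table; the real chat log has more columns.
conn.execute(
    "CREATE TABLE chat_log ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " archive_id TEXT UNIQUE,"
    " message TEXT NOT NULL)"
)


def store(archive_id, message):
    """Return True if the row was stored, False if UNIQUE rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO chat_log (archive_id, message) VALUES (?, ?)",
                (archive_id, message),
            )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    store("abc-123", "hello"),  # first copy, e.g. live delivery
    store("abc-123", "hello"),  # same stanza-id again, e.g. via MAM
    store(None, "no stanza-id"),
    store(None, "also no stanza-id"),  # NULLs never collide
]
row_count = conn.execute("SELECT COUNT(*) FROM chat_log").fetchone()[0]
```

With a plain `INSERT`, the duplicate surfaces as an `IntegrityError` that the caller has to handle.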
Implement security checks to ensure that the 'archive_id' (XEP-0359) used for database deduplication originates from a trusted source.

XEP-0359 Section 3.1: "The 'by' attribute MUST be the XMPP address (JID) of the entity assigning the unique and stable stanza ID."
Furthermore, Section 4 (Security Considerations) specifies: "A client SHOULD only trust <stanza-id/> elements from its own server or from a MUC service it is joined to."

XEP-0313 Section 4.1.2: "The 'by' attribute of the <result/> element is the JID of the archive being queried." and "If the 'by' attribute is not present, the recipient MUST assume that the results are from their own personal archive."

Let _handle_chat verify that the <stanza-id/> 'by' attribute matches our bare JID.
Let _handle_groupchat verify that the <stanza-id/> 'by' attribute matches the MUC's bare JID.
Let _handle_mam verify that the <result/> 'by' attribute matches the outer message's 'from' (the archive JID). If 'by' is missing, verify that 'from' matches our own bare JID (personal archive).

Signed-off-by: Michael Vetter <jubalh@iodoru.org>
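The three checks described above can be sketched as follows. This is an illustration in Python, not the code from this change: all function names are hypothetical, and `bare_jid` is a simplification that strips the resource and lowercases the rest instead of applying full JID normalization.

```python
def bare_jid(jid):
    """Strip the resource part: 'user@host/res' -> 'user@host'.

    Simplification: lowercasing stands in for proper JID normalization.
    """
    return jid.split("/", 1)[0].lower()


def trusted_chat_stanza_id(by, own_jid):
    """1:1 chat: <stanza-id/> 'by' must be our own bare JID."""
    return by is not None and bare_jid(by) == bare_jid(own_jid)


def trusted_muc_stanza_id(by, room_jid):
    """Groupchat: <stanza-id/> 'by' must be the bare JID of the MUC."""
    return by is not None and bare_jid(by) == bare_jid(room_jid)


def trusted_mam_result(by, msg_from, own_jid):
    """MAM: <result/> 'by' must match the outer message's 'from'.

    If 'by' is absent, the result is assumed to come from the personal
    archive, so 'from' must be our own bare JID.
    """
    if by is not None:
        return bare_jid(by) == bare_jid(msg_from)
    return bare_jid(msg_from) == bare_jid(own_jid)


own = "alice@example.org/laptop"
room = "room@conference.example.org"
```

An archive id that fails its check would then simply not be used for deduplication.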
Replace 'INSERT OR IGNORE' with 'INSERT ... ON CONFLICT(`archive_id`) DO NOTHING RETURNING id'. The RETURNING clause is only available since SQLite 3.35.0.

Deduplication now only applies to `archive_id`, and we no longer silently ignore other errors or constraint violations (such as NOT NULL). We can detect whether an insertion was skipped due to duplication by checking the result of 'RETURNING id'.

We don't log skipped duplicates, since they occur often and the output would be too noisy. This matches what Dino does.

Signed-off-by: Michael Vetter <jubalh@iodoru.org>
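The difference from 'INSERT OR IGNORE' can be shown with a small Python sketch against an in-memory database. The `chat_log` table and the `insert_message` helper are hypothetical stand-ins for the real schema and code. The fallback branch for SQLite versions older than 3.35.0 is an addition for this example only, so that it also runs where RETURNING is unavailable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table; the real chat log has more columns.
conn.execute(
    "CREATE TABLE chat_log ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " archive_id TEXT UNIQUE,"
    " message TEXT NOT NULL)"
)

# RETURNING needs SQLite 3.35.0 or later.
HAS_RETURNING = sqlite3.sqlite_version_info >= (3, 35, 0)

INSERT_SQL = (
    "INSERT INTO chat_log (archive_id, message) VALUES (?, ?) "
    "ON CONFLICT(archive_id) DO NOTHING"
)


def insert_message(archive_id, message):
    """Return the new row id, or None if archive_id was already stored."""
    with conn:  # commits on success, rolls back on error
        if HAS_RETURNING:
            cur = conn.execute(INSERT_SQL + " RETURNING id", (archive_id, message))
            rows = cur.fetchall()  # drain the statement before committing
            return rows[0][0] if rows else None
        # Fallback for older SQLite (example only): use the affected row count.
        cur = conn.execute(INSERT_SQL, (archive_id, message))
        return cur.lastrowid if cur.rowcount == 1 else None


first = insert_message("abc-123", "hello")
duplicate = insert_message("abc-123", "hello again")

# Unlike INSERT OR IGNORE, other constraint violations still raise.
try:
    insert_message("def-456", None)
    not_null_raised = False
except sqlite3.IntegrityError:
    not_null_raised = True
```

An empty result from 'RETURNING id' signals a skipped duplicate, while the NOT NULL violation is still reported as an error instead of being swallowed.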
Anybody ready to review this?