Fix "list quarantined media" API to have semi-stable ordering by turt2live · Pull Request #19308 · element-hq/synapse

turt2live · 2025-12-16T05:13:24Z

Note: This cleans up previous code introduced by #19268

When the API was first written, "unquarantining" media was not properly considered. If media was unquarantined while an application was treating `from` tokens as long-lived when listing quarantined media, the endpoint could skip rows like so:

Media A is quarantined
Media B is quarantined
Application lists quarantined media, caches from=2
Media A is unquarantined, shifting B from row 1 to 0
Media C is quarantined, becoming the new row 1
Media D is quarantined
Application lists media with from=2, gets [D]
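
The sequence above can be reproduced with a small in-memory sketch. It assumes the original endpoint behaved like a plain offset into an ordered list; `list_quarantined` and the list of media names are hypothetical stand-ins for illustration, not Synapse code.

```python
from typing import List, Tuple


def list_quarantined(rows: List[str], start: int, limit: int = 100) -> Tuple[List[str], int]:
    """Offset-style pagination: return a page and the next offset to use."""
    page = rows[start:start + limit]
    return page, start + len(page)


# Media A and B are quarantined, then the application lists and caches from=2.
quarantined = ["A", "B"]
first_page, cached_from = list_quarantined(quarantined, 0)

# Media A is unquarantined (B shifts to row 0), then C and D are quarantined.
quarantined.remove("A")
quarantined.append("C")
quarantined.append("D")

# Listing with the cached offset starts at row 2, so C (row 1) is never returned.
second_page, _ = list_quarantined(quarantined, cached_from)
missed = [m for m in quarantined if m not in first_page + second_page]
```

Here `missed` ends up holding the one row the application never sees, which is the skipped row described above.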

To fix this, we invent a pagination token which combines a timestamp and a relative index. It is still not fully stable, because the relative index can change, but it is likely stable enough for most usage (iterate as fast as possible to the end).
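
As a rough illustration of the idea, the sketch below pages over in-memory rows using a `timestamp-index` token. It assumes the index counts rows already consumed at the token's timestamp, and that rows are ordered by timestamp and then media ID. The function `list_page` and the way `next_batch` is computed here are assumptions for illustration and may differ from what the PR actually does.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, str]  # (quarantined_ts, media_id)


def list_page(rows: List[Row], token: str, limit: int) -> Tuple[List[Row], Optional[str]]:
    """Return one page plus a hypothetical next_batch token, or None at the end."""
    ts_str, _, index_str = token.partition("-")
    since_ts, index = int(ts_str), int(index_str)

    # Only rows at or after the token's timestamp are candidates; the index
    # is then an offset relative to the first of those rows.
    candidates = sorted(r for r in rows if r[0] >= since_ts)
    page = candidates[index:index + limit]
    if not page:
        return [], None

    # The next token points at the last timestamp seen, plus how many rows
    # sharing that timestamp have been consumed so far.
    last_ts = page[-1][0]
    first_pos = next(i for i, r in enumerate(candidates) if r[0] == last_ts)
    next_index = index + len(page) - first_pos
    return page, f"{last_ts}-{next_index}"


rows = [(0, "a"), (0, "b"), (0, "c"), (10, "d"), (20, "e")]
page1, token1 = list_page(rows, "0-0", limit=2)
page2, token2 = list_page(rows, token1, limit=2)
page3, token3 = list_page(rows, token2, limit=2)
```

The relative index is what lets a page boundary fall inside a large group of rows that share one timestamp. It is also the part that can drift if rows at that timestamp are removed between requests.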

If an application requires a proper time-based stable token, it can generate a timestamp and append `-0` to it, setting the relative position to the zeroth row. This may return rows the application has already seen, as described by the admin API docs. Generating the timestamp manually is deliberately not documented because it is less stable than relying on the timestamp inside the last seen `next_batch`.
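
A client-side sketch of this approach might look like the following. It assumes tokens have the `timestamp-index` shape described in this PR; the helper names `to_long_lived_token` and `drop_seen` are hypothetical, and the media identifiers are treated as plain strings purely for illustration.

```python
from typing import Iterable, List, Set


def to_long_lived_token(next_batch: str) -> str:
    """Reset the relative index to zero, keeping only the timestamp part."""
    timestamp, sep, _index = next_batch.partition("-")
    if not sep or not timestamp.isdigit():
        raise ValueError(f"unexpected token shape: {next_batch!r}")
    return f"{timestamp}-0"


def drop_seen(media: Iterable[str], seen: Set[str]) -> List[str]:
    """Filter out media already processed, since zero-index tokens can repeat rows."""
    fresh = []
    for item in media:
        if item not in seen:
            seen.add(item)
            fresh.append(item)
    return fresh


token = to_long_lived_token("1234-5678")

seen: Set[str] = set()
first = drop_seen(["mxc-a", "mxc-b"], seen)
second = drop_seen(["mxc-b", "mxc-c"], seen)
```

The deduplication set grows with every row processed, so a long-running consumer would probably want to persist or bound it in some way.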

This PR doesn't shift the whole endpoint to timestamp-only tokens because the prior PR populates existing rows with `0` for a timestamp. That group of identical timestamps may span thousands (or millions) of rows, which would break the ability to use `limit` properly.

Pull Request Checklist

Pull request is based on the develop branch
Pull request includes a changelog file. The entry should:
- Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
- Use markdown where necessary, mostly for code blocks.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
Code style is correct (run the linters)

https://github.com/element-hq/synapse/pull/19268/changes#r2614217300 wasn't applied and the migration had copy/paste artifacts. PR: #19268

turt2live · 2025-12-16T05:19:22Z

+`from` and `limit` are optional parameters, and default to the first page and `100` respectively. `from` is the `next_batch`
+token returned by a previous request and `limit` is the number of rows to return. Note that `next_batch` is not intended 
+to survive longer than about a minute and may produce inconsistent results if used after that time. Neither `from` or 
+`limit` is a timestamp, though `from` does encode a timestamp.
+
+If you require a long-lived `from` token, split `next_batch` on `-` and combine the first part with a `0`, separated by
+a `-` again. For example: `1234-5678` becomes `1234-0`. Your application will need to deduplicate `media` rows it has 
+already seen if using this method.


I do accept that this is custom, weird, and not-quite how pagination works. My defence is I have a custom, weird, not-quite-paginating use case 😇

(though, if someone more qualified wants to make this use PaginationHandler instead, I'd happily close this PR in favour of that one)

turt2live · 2025-12-16T06:10:08Z

+        elif len(start) > 0:
+            start_index = int(start)


We probably don't need this backwards compatibility given the PR which introduced the endpoint has only been on develop for a few days as of writing. I've kept it anyway because I honestly just don't want to fight the unit tests too much. I can be convinced to enter battle, however.

We should get rid of the tech debt

MadLittleMods · 2025-12-23T21:26:01Z

+If you require a long-lived `from` token, split `next_batch` on `-` and combine the first part with a `0`, separated by
+a `-` again. For example: `1234-5678` becomes `1234-0`. Your application will need to deduplicate `media` rows it has 
+already seen if using this method.


huh, why are we suggesting this? Why is this a use case?

The use case is part of the linked PRs/projects: https://github.com/matrix-org/hma-matrix/blob/4f0b9676beb7b5d72b2d55ae7034609593f5fba3/matrix_exchanges/synapse_quarantined.py requires a time-relative token to work from, so it creates one.

The use case should be explained in the PR description (and also probably in #19268)

MadLittleMods · 2025-12-23T21:26:51Z

+token returned by a previous request and `limit` is the number of rows to return. Note that `next_batch` is not intended 
+to survive longer than about a minute and may produce inconsistent results if used after that time. Neither `from` or 


Where is this limitation coming from?

We don't want people to be storing them forever (or if they are, they're using the ts-0 trick) because the token will become invalid upon media being (un)quarantined. On some servers this may happen often, but on others it could be close to never. We strike a balance here and choose "about a minute" to indicate the estimated volatility.

Alternatively, we could try to express the detail of volatility, but that felt a bit too technical at the time.

MadLittleMods · 2025-12-23T21:27:35Z

+to survive longer than about a minute and may produce inconsistent results if used after that time. Neither `from` or 
+`limit` is a timestamp, though `from` does encode a timestamp.


Superfluous details. The end user can treat them as opaque.

Suggested change

-to survive longer than about a minute and may produce inconsistent results if used after that time. Neither `from` or
-`limit` is a timestamp, though `from` does encode a timestamp.
+to survive longer than about a minute and may produce inconsistent results if used after that time.

This sort of detail was requested in the prior PR: #19268 (comment)

From the linked thread, it seems the suggestion there was also to treat them as opaque, but the discussion then somehow settled on a different outcome, and I don't see this change mentioned at all.

MadLittleMods · 2025-12-23T21:33:36Z

+        elif len(start) > 0:
+            start_index = int(start)


We should get rid of the tech debt

MadLittleMods · 2025-12-23T21:34:37Z

+            # Batch tokens are structured as `timestamp-index`, where `index` is relative
+            # to the timestamp. This is done to support pages having many records with
+            # the same timestamp (like existing servers having a ton of `ts=0` records).


This is different (partial) from the reason in the PR description

The reason from the PR description explains that we're doing all of this because media can be unquarantined and we want to make sure to get a semi-stable order.

MadLittleMods · 2025-12-23T21:41:03Z

            # known) to ensure the ordering is stable for established servers.
            if local:
-                sql = "SELECT '' as media_origin, media_id FROM local_media_repository WHERE quarantined_by IS NOT NULL ORDER BY quarantined_ts, media_id ASC LIMIT ? OFFSET ?"
+                sql = "SELECT '' as media_origin, media_id, quarantined_ts FROM local_media_repository WHERE quarantined_by IS NOT NULL AND quarantined_ts >= ? ORDER BY quarantined_ts, media_id ASC LIMIT ? OFFSET ?"


Instead of relying on quarantined_ts and index for pagination (which is still flawed), we could add a new column quarantined_stream_id that is filled in when something is quarantined.

Docs: docs/development/synapse_architecture/streams.md#cheatsheet-for-creating-a-new-stream

A stream feels like way more overhead than we need for this. Is there something lighter weight we can use instead? (like just a simple ID generator?)

I think a stream is the Synapse way to solve this. We only need to do the first few steps from the docs since this isn't something that is going to be part of /sync or the StreamToken.

This is sufficiently complicated that someone from the Synapse team will likely need to take it over, or provide more precise instructions as to what is expected here. The sync steps are the very last bit, at which point there's already a ton of what feels like excess infrastructure.

> The sync steps are the very last bit, at which point there's already a ton of what feels like excess infrastructure.

The sync stuff is not necessary (mentioned above). This isn't something that's going to be used in /sync.

turt2live · 2026-01-06T22:59:31Z

This feature was removed in #19351 and doesn't look like it'll pass review, so closing.

Fixes #19352 (See issue for history of this feature and previous PRs)

> First, a [naive implementation](#19268) of the endpoint was introduced, but it quickly ran into [performance issues on query](#19312) and [long startup times](#19349), leading to its [removal](#19351). It also didn't actually work, and would fail to expose media when it was "unquarantined", so a [partial fix](#19308) was attempted, where the suggested direction is to use a [stream](https://element-hq.github.io/synapse/latest/development/synapse_architecture/streams.html#cheatsheet-for-creating-a-new-stream) instead of a timestamp column.

This PR re-introduces the API building on the previous feedback:

* Adds a stream which tracks when media becomes (un)quarantined.
* Runs a background update to capture already-quarantined media.
* Adds a new admin API to return rows from the stream table.

We track both quarantine and unquarantine actions in the stream to allow downstream consumers to process the records appropriately. Namely, to allow our Synapse exchange in HMA to remove hashes for unquarantined media (use case further explained in the [issue](#19352)).

**Note**: This knowingly does not capture all cases of media being quarantined. Other call sites are lower priority for T&S, and can be addressed in a future PR. ~~An issue will be created after this PR is merged to track those sites.~~ #19672

### Pull Request Checklist

* [x] Pull request is based on the develop branch
* [x] Pull request includes a [changelog file](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#changelog). The entry should:
  - Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from `EventStore` to `EventWorkerStore`.".
  - Use markdown where necessary, mostly for `code blocks`.
  - End with either a period (.) or an exclamation mark (!).
  - Start with a capital letter.
  - Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
* [x] [Code style](https://element-hq.github.io/synapse/latest/code_style.html) is correct (run the [linters](https://element-hq.github.io/synapse/latest/development/contributing_guide.html#run-the-linters))

---------

Co-authored-by: turt2live <1190097+turt2live@users.noreply.github.com>
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
Co-authored-by: Eric Eastwood <erice@element.io>

turt2live added 2 commits December 15, 2025 21:24

Cleanup from previous PR

32ce7a3

https://github.com/element-hq/synapse/pull/19268/changes#r2614217300 wasn't applied and the migration had copy/paste artifacts. PR: #19268

Switch to a semi-stable pagination ordering

a4036bf

turt2live changed the title ~~Travis/fix quarantine list~~ Fix "list quarantined media" API to have semi-stable ordering Dec 16, 2025

changelog

ad230c4

turt2live commented Dec 16, 2025

Comment thread synapse/rest/admin/media.py Outdated

turt2live commented Dec 16, 2025

turt2live force-pushed the travis/fix-quarantine-list branch from c7ec79d to 1128c59 December 16, 2025 06:28

linting & fix tests

516b740

turt2live force-pushed the travis/fix-quarantine-list branch from 1128c59 to 516b740 December 16, 2025 06:41

turt2live marked this pull request as ready for review December 16, 2025 06:53

turt2live requested a review from a team as a code owner December 16, 2025 06:53

This was referenced Dec 16, 2025

Add a Synapse Quarantined Media exchange matrix-org/hma-matrix#7

Merged

Add indexes for quarantined media lookups #19312

Closed

Merge branch 'develop' into travis/fix-quarantine-list

ca8d945

MadLittleMods added the A-Admin-API label Dec 23, 2025

MadLittleMods reviewed Dec 23, 2025

turt2live mentioned this pull request Jan 6, 2026

DB Delta 04_add_quarantined_ts_to_media.sql is slow and blocks media access while updating #19349

Closed

turt2live closed this Jan 6, 2026

turt2live mentioned this pull request Jan 6, 2026

Admin API to list quarantined media #19352

Closed

MadLittleMods mentioned this pull request Mar 23, 2026

Add an API to list changes to quarantine state of media #19558

Merged
