Conversation
```rust
#[derive(Deserialize, Debug)]
pub struct OrganizationDetails {
    pub id: String,
    pub links: OrganizationLinks,
```
This endpoint returns a bunch more information, but what we really care about are the `id` and `links`.
The `region_url` contains the regional API URL, e.g. `us.sentry.io`. This way we can hit that endpoint directly instead of going through the control silo (sentry.io).
If I understand correctly, I think this information might already be somehow embedded in org tokens, but not in personal tokens. Or maybe I'm confusing this with something else.

> If I understand correctly I think this information might already be somehow embedded in org tokens, but not in personal tokens
This is 100% correct

OK, then I should see how to resolve the ID from the token instead of using the API call when we have an org token.

To clarify, only the region URL is embedded. Org ID is not
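
For reference, a sketch of the companion struct (illustrative only; the camelCase `regionUrl` field name is an assumption about the response shape, not confirmed by this PR):

```rust
// Illustrative sketch, not the actual implementation: the `links` struct
// referenced in the excerpt above, assuming the response exposes the
// regional API URL as `links.regionUrl`.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct OrganizationLinks {
    /// e.g. "https://us.sentry.io"; lets the CLI hit the region silo
    /// directly instead of going through the control silo.
    pub region_url: String,
}
```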

Let's discuss the need to resolve the org and project IDs in the first place. Almost all other endpoints are capable of accepting both slugs and IDs interchangeably in the backend (Sentry side). To me it seems like a pretty serious limitation if objectstore cannot accept either.
That said, what you mentioned about having arbitrary keys in your other comment makes sense, and I guess it makes it harder to accept both IDs and slugs. Let's see if we can come up with a solution here, though

I understand, and this is the point I'd like us to think about:

> That's why I'm saying it should be the responsibility of the user to normalize the IDs to write to the expected path if needed.
Would be nice if the CLI did not need to perform this normalization. If objectstore doesn't do it, maybe we can introduce a layer in between on the server side that does.
Admittedly I'm not too familiar with Objectstore's architecture, so maybe this is not a great idea at all, let's think about it though 😅
I'm happy to jump in on this discussion!
@runningcode We discussed this today and came up with a better idea: we could extend the /preprod/retention endpoint (https://github.com/getsentry/sentry-cli/pull/3110/changes#diff-4715d1ca31922b72b1d5c42d64759dfa3e247bf3fb4a9066d1c4711e6f8478f5R994)
to be something like /preprod/upload-options instead.
This could return the retention to use for a particular org, the Objectstore proxy URL to upload to (with org and project normalized to IDs), as well as the token to use for Objectstore auth once we roll that out.
This would require just a single request pre-upload and has the additional benefit of leaving the door open if we ever decide to expose Objectstore to the internet at some point in the future: we would just need to change the URL we return to CLI.
What do you think @runningcode @rbro112?
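
For discussion purposes, a sketch of what such a response could deserialize into (all field names here are made up, not the actual shape in getsentry/sentry#108312):

```rust
// Illustrative only; field names are assumptions for the sake of discussion.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
#[serde(rename_all = "camelCase")]
pub struct UploadOptions {
    /// Retention to apply for this org's uploads.
    pub retention_days: u32,
    /// Objectstore proxy URL to upload to, with org and project already
    /// normalized to IDs.
    pub upload_url: String,
    /// Token to use for Objectstore auth, once that is rolled out.
    pub objectstore_token: Option<String>,
}
```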
PR to illustrate the idea: getsentry/sentry#108312
I like this idea quite a bit and have gone ahead and pre-stamped that sentry PR. I'm good if you want to land this backend API then update this command to use it @lcian !
```rust
use anyhow::{Context, Result};

/// Given org and project slugs or IDs, returns the IDs of both.
pub fn get_org_project_id(api: impl AsRef<Api>, org: &str, project: &str) -> Result<(u64, u64)> {
```
As far as I know, for all commands a user can provide `--project` and `--org` as either slugs or IDs.
These utils are needed to get the corresponding IDs, so that objects from the same org/project all have the same paths in objectstore regardless of whether the user passed in slugs or IDs.
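
A minimal sketch of how such a helper might short-circuit when numeric IDs are passed, assuming slugs can never be purely numeric; `resolve_org_id` and `resolve_project_id` are hypothetical stand-ins for the underlying Sentry API lookups:

```rust
// Sketch only, under the stated assumptions; not the actual implementation.
pub fn get_org_project_id(api: impl AsRef<Api>, org: &str, project: &str) -> Result<(u64, u64)> {
    let api = api.as_ref();
    // If the user already passed numeric IDs, skip the extra API round trips.
    let org_id = match org.parse::<u64>() {
        Ok(id) => id,
        Err(_) => resolve_org_id(api, org)?,
    };
    let project_id = match project.parse::<u64>() {
        Ok(id) => id,
        Err(_) => resolve_project_id(api, org, project)?,
    };
    Ok((org_id, project_id))
}
```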

It might be weird to take in `Api`, and these functions might belong somewhere else, IDK.
They certainly don't belong on the `Api` struct, as its methods all seem to map 1:1 to Sentry API calls.

Can the backend interpret the slugs into IDs instead of the frontend?

> Can the backend interpret the slugs into IDs instead of the frontend?

Indeed, I agree that this would be preferable. The current code adds two additional API calls that could be avoided if the backend resolved the slugs/IDs appropriately.

Not sure what backend and frontend refer to here.
How objectstore works is that the scope (org and project in this case) is an arbitrary sequence of key-value pairs. We recommend using the org and project IDs, or at least the org ID, but that's not mandatory.
Therefore it's not the responsibility of objectstore or its endpoint in the monolith to normalize org and project; we simply take what the user provides. It's the responsibility of the user to normalize.
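
To make the normalization point concrete, an illustration (the key names are made up for the example):

```rust
// Conceptual illustration only; key names are invented. The scope is just an
// ordered sequence of key-value pairs, so these two scopes address two
// different paths even if they refer to the same logical project:
let by_id = [("org", "123"), ("project", "456")];
let by_slug = [("org", "my-org"), ("project", "my-project")];
// Hence someone (currently the CLI) has to normalize to IDs before writing.
```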
Cargo.toml (outdated)
```toml
lazy_static = "1.4.0"
libc = "0.2.139"
log = { version = "0.4.17", features = ["std"] }
objectstore-client = { git = "https://github.com/getsentry/objectstore.git", branch = "lcian/feat/rust-batch-client" }
```
This indirectly adds a dependency on reqwest, and we need to double-check the implications of that, especially with regard to rustls vs. native-tls, which could be problematic.
It also adds a bunch of deps but I think most of them are inevitable.
Let's discuss the TLS stuff next week, because I am uncertain what problems you're concerned about.
@lcian and I discussed offline today. We agreed we should use native-tls and native-tls-vendored (the latter possibly only on Linux), as these should pull in the same underlying dependencies as curl, thus limiting the increase in binary size.
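
A sketch of what that could look like in Cargo.toml (versions are placeholders, and the exact feature wiring depends on how objectstore-client exposes its TLS backend):

```toml
# Illustrative only: select reqwest's native-tls backend instead of rustls.
reqwest = { version = "0.12", default-features = false, features = ["native-tls"] }

# Vendor OpenSSL on Linux so the binary doesn't depend on the system libssl,
# mirroring the dependencies curl already pulls in.
[target.'cfg(target_os = "linux")'.dependencies]
openssl = { version = "0.10", features = ["vendored"] }
```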
Hey @lcian, are you ready for me to review this, or are you still planning to check the feedback from Bugbot and/or iterate further?
Adds the initial snapshots POST API. This API does a few things and is intended to be invoked by the CLI:

- Creates `PreprodArtifact`, `PreprodSnapshotMetrics` and `CommitComparison` DB models
- Creates image metadata based on what's uploaded from the CLI
- Stores metadata in objectstore

Notably, images are uploaded directly to objectstore from the CLI.

Tested E2E locally with objectstore and the WIP CLI branch (getsentry/sentry-cli#3110)

Resolves EME-773

I left some comments. Most of them are minor and optional, but I would call your attention to the comment about memory usage and the potential for leaking tokens.
Let's discuss further next week
Cargo.toml (outdated)
```toml
lazy_static = "1.4.0"
libc = "0.2.139"
log = { version = "0.4.17", features = ["std"] }
objectstore-client = { git = "https://github.com/getsentry/objectstore.git", branch = "lcian/feat/rust-batch-client" }
```
h: Before we merge these changes, we should ensure we are depending on a version that's been published to crates.io, not a Git branch.
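
For illustration, the dependency line would then become something like this (version is a placeholder):

```toml
# Placeholder version; use whatever actually gets published to crates.io.
objectstore-client = "0.1"
```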
@lcian will let you take this one as it's more objectstore related
l: I would move this to src/utils/api.rs for now
@lcian will let you take this one as it's more objectstore related
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
```rust
// SOF markers: C0-CF except C4 (DHT), C8 (JPG extension), and CC (DAC)
if (0xC0..=0xCF).contains(&marker) && marker != 0xC4 && marker != 0xC8 && marker != 0xCC {
    if i + 7 < data.len() {
```
Off-by-one in JPEG dimension bounds check
Low Severity
The bounds check i + 7 < data.len() is one byte too strict. The highest index accessed is i + 6, which only requires i + 7 <= data.len(). The current check requires data.len() >= i + 8, meaning the function will incorrectly return None for valid JPEG files whose SOF segment ends exactly at the data boundary.
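
A sketch of the corrected guard, taking the report at face value that the highest index read is i + 6; the exact offsets of the height and width bytes within the SOF segment are assumptions here, not copied from the PR:

```rust
// Sketch of the fix; offsets assume `i` points at the first byte of the SOF
// segment's length field (length u16, precision u8, height u16, width u16).
if i + 6 < data.len() {
    let height = u16::from_be_bytes([data[i + 3], data[i + 4]]);
    let width = u16::from_be_bytes([data[i + 5], data[i + 6]]);
    return Some((u32::from(width), u32::from(height)));
}
```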
```rust
debug!("Processing image: {}", image.path.display());

let contents = fs::read(&image.path)
    .with_context(|| format!("Failed to read image: {}", image.path.display()))?;
```
Double file read causes TOCTOU hash-content mismatch
Medium Severity
Each image file is read twice: once in collect_image_info to compute the SHA256 hash and dimensions, and again in upload_images to get the upload content. The hash from the first read becomes the objectstore key, but the content from the second read is what gets uploaded. If a file changes between reads, the manifest hash won't match the uploaded content. Storing the file contents in ImageInfo from the first read would fix both the correctness issue and avoid redundant I/O.
```rust
let hash = compute_sha256_hash(&contents);
Ok(Some(ImageInfo {
    path: path.to_path_buf(),
    relative_path: relative,
    hash,
    width,
    height,
}))
```
Bug: Image files are read twice, creating a race condition where a file modified between reads can cause silent data corruption in the uploaded snapshot.
Severity: MEDIUM
Suggested Fix
Read the file contents only once. Store the file bytes in memory within the ImageInfo struct alongside the computed hash during the collection phase. Use these in-memory bytes for the upload phase instead of re-reading the file from disk.
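
A sketch of that fix; field types other than `contents` are guesses based on the excerpt above and may not match the real struct:

```rust
use std::path::PathBuf;

// Capture the bytes at hash time so the upload phase never re-reads the file.
pub struct ImageInfo {
    pub path: PathBuf,
    pub relative_path: PathBuf,
    pub hash: String,
    pub width: u32,
    pub height: u32,
    /// Bytes read when the hash was computed; uploading these guarantees the
    /// content matches the hash used as the objectstore key.
    pub contents: Vec<u8>,
}
```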
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.
Location: src/commands/build/snapshots.rs#L215-L222
Potential issue: A race condition exists where image files are read twice without protection against concurrent modification. First, in `collect_image_info`, a file is read to compute its SHA-256 hash. Later, in `upload_images`, the same file is read again from disk for the upload process. If the file is modified between these two reads, the content uploaded to the object store will not match the hash used as its key. This results in silent data corruption, where the snapshot manifest and the stored file content are misaligned.
Updated version of #3049 to discuss and iterate on things.
Notable changes:
- Removed the `shard_index` parameter from the command. I'm not sure what the purpose of that was originally.
- Uses the `many` (batch) API from `objectstore_client`. All uploads are executed as batch requests, reducing network overhead. Unfortunately, with the way things are implemented now, we will still have to buffer all files in memory before sending the request, as we need to hash their contents to determine the filename (see the sketch below). If we could just use the filename as the key in objectstore, it would be much better, because that way we could stream the files over.

Note that auth enforcement still needs to be enabled for objectstore, so that's currently blocking this from being used for anything but internal testing.
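
A minimal sketch of why the buffering happens, assuming the key is the hex-encoded SHA-256 of the contents (the exact key derivation in the real code may differ):

```rust
use sha2::{Digest, Sha256};

// The objectstore key is derived from the content hash, so the whole file
// must be in memory (and fully hashed) before the upload can even start.
fn object_key(contents: &[u8]) -> String {
    Sha256::digest(contents)
        .iter()
        .map(|b| format!("{b:02x}"))
        .collect()
}
```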
Ref FS-233