Tinfoil doc async #45

Merged
AnthonyRonning merged 4 commits into master from tinfoil-doc-async
Jul 1, 2025

Conversation

@AnthonyRonning AnthonyRonning commented Jul 1, 2025

Summary by CodeRabbit

  • New Features

    • Introduced a unified command to update and append PCR values for both development and production environments, with confirmation messaging.
    • Enhanced document upload process to support asynchronous submission with immediate task ID response and status polling endpoint.
  • Bug Fixes

    • Improved document upload process to use asynchronous processing with status polling, reducing request timeouts and enhancing reliability.
  • Chores

    • Appended new entries to PCR development and production history records with updated values and timestamps.

coderabbitai Bot commented Jul 1, 2025

Warning

Rate limit exceeded

@AnthonyRonning has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 1 minute and 43 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 66b082b and 6c8418f.

⛔ Files ignored due to path filters (3)
  • pcrDev.json is excluded by !pcrDev.json
  • pcrProd.json is excluded by !pcrProd.json
  • tinfoil-proxy/dist/tinfoil-proxy is excluded by !**/dist/**
📒 Files selected for processing (4)
  • pcrDevHistory.json (1 hunks)
  • pcrProdHistory.json (1 hunks)
  • src/web/documents.rs (4 hunks)
  • tinfoil-proxy/main.go (5 hunks)

Walkthrough

A new justfile recipe was introduced to automate sequential PCR update and append operations for both development and production. New entries were appended to the PCR history JSON files. The document upload handler in the proxy was refactored to use an asynchronous API with polling for job status and result retrieval, replacing the previous synchronous approach. Corresponding client-side handlers and routes were added to support asynchronous upload status checking.

Changes

File(s) | Change Summary
justfile | Added update-pcr-all recipe to sequentially run PCR update and append steps for dev and prod, then print a confirmation.
pcrDevHistory.json, pcrProdHistory.json | Appended new entries with updated PCR hash values, timestamps, and signatures to the respective history arrays.
tinfoil-proxy/main.go | Refactored document upload handler to submit jobs asynchronously, added status polling endpoint and related types.
src/web/documents.rs | Updated upload handler to handle async submission, added new status check handler and route, changed response types accordingly.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant WebClient
    participant ProxyServer
    participant AsyncAPI

    User->>WebClient: Upload Document
    WebClient->>ProxyServer: POST /v1/documents (document)
    ProxyServer->>AsyncAPI: POST /async-upload (document)
    AsyncAPI-->>ProxyServer: { "task_id": id }
    ProxyServer-->>WebClient: 202 Accepted + { task_id, filename, size }

    loop Polling (every few seconds)
        WebClient->>ProxyServer: POST /v1/documents/status { task_id }
        ProxyServer->>AsyncAPI: GET /v1/documents/status/{task_id}
        AsyncAPI-->>ProxyServer: { status, progress?, error?, document? }
        ProxyServer-->>WebClient: { status, progress?, error?, document? }
    end

    alt status == "success"
        WebClient->>User: Show processed document text
    else status == "failure" or timeout
        WebClient->>User: Show error message
    end
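
The polling loop in the diagram above could be sketched roughly as follows. This is an illustration only, not the code from this PR: `pollStatus`, `statusFunc`, and `demo` are hypothetical names, and the status strings "success" and "failure" are taken from the diagram. The real client would issue the HTTP status request inside `check`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// statusFunc is a hypothetical stand-in for one status request
// (POST /v1/documents/status in the diagram). It returns the job status.
type statusFunc func(ctx context.Context) (string, error)

// pollStatus calls check once per interval until the job reports "success"
// or "failure", or until ctx is cancelled or its deadline passes.
func pollStatus(ctx context.Context, interval time.Duration, check statusFunc) (string, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		status, err := check(ctx)
		if err != nil {
			return "", err
		}
		switch status {
		case "success":
			return status, nil
		case "failure":
			return status, errors.New("document processing failed")
		}
		// Any other status is treated as "still running": wait for the
		// next tick or for the overall timeout.
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-ticker.C:
		}
	}
}

// demo polls a fake job that reports success on its third check.
func demo() (string, int, error) {
	calls := 0
	fake := func(context.Context) (string, error) {
		calls++
		if calls < 3 {
			return "pending", nil
		}
		return "success", nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	status, err := pollStatus(ctx, 5*time.Millisecond, fake)
	return status, calls, err
}

func main() {
	status, calls, err := demo()
	fmt.Println(status, calls, err)
}
```

Putting the overall deadline on the context keeps the timeout branch of the diagram in one place: the loop itself only has to distinguish terminal statuses from everything else.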

Possibly related PRs

  • OpenSecretCloud/opensecret#43: Adds the initial synchronous document upload feature; both PRs modify the uploadDocument handler but this PR refactors it for async processing.
  • OpenSecretCloud/opensecret#40: Appends new entries to PCR history JSON files, relating to the same files updated in this PR.
  • OpenSecretCloud/opensecret#26: Modifies PCR verification and append logic; both PRs address PCR update/append commands and history management.

Poem

A recipe hops in, all PCRs in tow,
Dev and prod updated, their histories grow.
Async uploads now, with polling in play,
The proxy awaits what the server will say.
JSONs grow longer, hashes anew—
A rabbit’s delight in all that you do!
🐇✨

@greptile-apps greptile-apps Bot left a comment

PR Summary

Major architectural change to document processing in tinfoil-proxy, alongside PCR measurement updates for both dev and prod environments to maintain system integrity verification.

  • Converted document upload endpoint in tinfoil-proxy/main.go from synchronous to asynchronous with 2-second polling interval
  • Added new update-pcr-all command in justfile to streamline PCR updates across environments
  • Updated PCR measurements in both environments with new values in pcrDev.json and pcrProd.json, maintaining consistent PCR1 baseline
  • Added corresponding PCR history entries with timestamps 1751331570 (dev) and 1751331592 (prod) in their respective history files

6 files reviewed, 3 comments

Comment thread tinfoil-proxy/main.go Outdated
Comment thread tinfoil-proxy/main.go Outdated
Comment thread tinfoil-proxy/main.go Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tinfoil-proxy/main.go (1)

474-614: LGTM - Comprehensive polling implementation with good resource management.

The polling mechanism is well-implemented with proper timeout handling, context management, and support for multiple status field names. The 2-second polling interval and 5-minute timeout are reasonable for document processing.

One minor suggestion: consider adding a maximum retry count for unknown status responses to avoid indefinite polling in edge cases.
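
One way the suggested cap might look is sketched below. It is not part of the PR: `unknownGuard`, `maxUnknownStatuses`, and the list of recognized status strings are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// maxUnknownStatuses is a hypothetical limit on consecutive unrecognized
// status responses before polling is abandoned.
const maxUnknownStatuses = 3

var errTooManyUnknown = errors.New("too many unrecognized status responses")

// unknownGuard counts consecutive status values the poller does not recognize.
type unknownGuard struct{ consecutive int }

// observe records one status. It returns errTooManyUnknown once the limit of
// consecutive unknown statuses is reached; a recognized status resets the count.
func (g *unknownGuard) observe(status string) error {
	switch status {
	case "pending", "started", "success", "failure": // assumed status names
		g.consecutive = 0
		return nil
	}
	g.consecutive++
	if g.consecutive >= maxUnknownStatuses {
		return errTooManyUnknown
	}
	return nil
}

func main() {
	g := &unknownGuard{}
	for _, s := range []string{"pending", "???", "???", "???"} {
		if err := g.observe(s); err != nil {
			fmt.Println("stop polling:", err)
			return
		}
	}
}
```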

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 817e653 and 0d1ee94.

⛔ Files ignored due to path filters (3)
  • pcrDev.json is excluded by !pcrDev.json
  • pcrProd.json is excluded by !pcrProd.json
  • tinfoil-proxy/dist/tinfoil-proxy is excluded by !**/dist/**
📒 Files selected for processing (4)
  • justfile (1 hunks)
  • pcrDevHistory.json (1 hunks)
  • pcrProdHistory.json (1 hunks)
  • tinfoil-proxy/main.go (4 hunks)
🔇 Additional comments (8)
pcrDevHistory.json (1)

107-112: LGTM - Consistent data structure maintained.

The new PCR history entry follows the established JSON structure with all required fields (PCR0, PCR1, PCR2, timestamp, signature) and maintains data consistency.
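
For readers unfamiliar with the history files, the entry shape described here could be modeled roughly as below. The field names come from the review comment; the JSON key spelling, the field types, the top-level array layout, and the `appendEntry` helper are assumptions, and the values used are placeholders rather than real measurements.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// pcrEntry mirrors the fields named in the review. Key names and types are assumed.
type pcrEntry struct {
	PCR0      string `json:"PCR0"`
	PCR1      string `json:"PCR1"`
	PCR2      string `json:"PCR2"`
	Timestamp int64  `json:"timestamp"`
	Signature string `json:"signature"`
}

// appendEntry parses a JSON array of entries, rejects an entry with any
// missing field, appends it, and returns the re-encoded array.
func appendEntry(history []byte, e pcrEntry) ([]byte, error) {
	if e.PCR0 == "" || e.PCR1 == "" || e.PCR2 == "" || e.Timestamp == 0 || e.Signature == "" {
		return nil, errors.New("PCR entry is missing a required field")
	}
	var entries []pcrEntry
	if err := json.Unmarshal(history, &entries); err != nil {
		return nil, err
	}
	entries = append(entries, e)
	return json.MarshalIndent(entries, "", "  ")
}

// countEntries reports how many entries a history array holds.
func countEntries(history []byte) (int, error) {
	var entries []pcrEntry
	if err := json.Unmarshal(history, &entries); err != nil {
		return 0, err
	}
	return len(entries), nil
}

func main() {
	e := pcrEntry{PCR0: "placeholder0", PCR1: "placeholder1", PCR2: "placeholder2", Timestamp: 1, Signature: "placeholder-sig"}
	out, err := appendEntry([]byte("[]"), e)
	fmt.Println(string(out), err)
}
```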

pcrProdHistory.json (1)

107-112: LGTM - Production PCR data correctly appended.

The new production PCR entry maintains structural consistency and includes all required cryptographic measurement fields.

justfile (1)

455-461: LGTM - Well-designed automation recipe.

The update-pcr-all recipe provides a convenient way to update both development and production PCR values sequentially. The implementation correctly chains the existing recipes and provides clear feedback.

tinfoil-proxy/main.go (5)

122-133: LGTM - Well-designed types for async job handling.

The new types correctly model the async job workflow with appropriate optional fields and multiple status field variations for API compatibility.


409-413: LGTM - Appropriate timeout and endpoint for async submission.

The reduced timeout (30 seconds) is appropriate for job submission, and the async endpoint URL correctly reflects the new workflow.


452-470: LGTM - Robust async response handling.

Correctly handles both 200 and 202 status codes for async operations, with proper JSON parsing validation and error logging.
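
The behavior described in this comment could be sketched as follows. This is an assumption-laden illustration rather than the handler from the PR: `parseSubmitResponse` and `submitResponse` are hypothetical names, and the `task_id` key is taken from the sequence diagram.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// submitResponse models the async submission reply; only task_id is assumed.
type submitResponse struct {
	TaskID string `json:"task_id"`
}

// parseSubmitResponse accepts 200 or 202, decodes the body, and returns the
// task ID. Any other status code or an unusable body is reported as an error.
func parseSubmitResponse(statusCode int, body []byte) (string, error) {
	if statusCode != http.StatusOK && statusCode != http.StatusAccepted {
		return "", fmt.Errorf("unexpected status code %d", statusCode)
	}
	var r submitResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return "", fmt.Errorf("invalid submission response: %w", err)
	}
	if r.TaskID == "" {
		return "", errors.New("submission response has no task_id")
	}
	return r.TaskID, nil
}

func main() {
	id, err := parseSubmitResponse(http.StatusAccepted, []byte(`{"task_id":"abc-123"}`))
	fmt.Println(id, err)
}
```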


542-595: LGTM - Flexible and robust result processing.

The result fetching logic correctly reuses the secure HTTP client, handles multiple possible response formats for text extraction, and maintains original file metadata in the response.


597-612: LGTM - Comprehensive error handling for async workflow.

Error handling appropriately covers job failures, timeout scenarios, and unknown states with proper logging for debugging and meaningful error responses to clients.

@greptile-apps greptile-apps Bot left a comment

PR Summary

Enhanced document processing resilience in src/web/documents.rs with more robust async implementation and improved error handling.

  • Reduced initial upload timeout from 5 minutes to 30 seconds in src/web/documents.rs for faster client feedback
  • Added /v1/documents/status endpoint with comprehensive status field detection (Status, State, TaskStatus)
  • Implemented proper HTTP response body cleanup with defer statements in tinfoil-proxy/main.go
  • Added structured response types DocumentUploadInitResponse and DocumentStatusResponse for better API consistency

6 files reviewed, 1 comment

Comment thread pcrProd.json

@greptile-apps greptile-apps Bot left a comment

PR Summary

Final refinements to the document processing system with additional error handling and timeout optimizations.

  • Added exponential backoff for status polling in src/web/documents.rs with configurable retry limits
  • Implemented comprehensive error type mapping for various document processing failure scenarios
  • Added request context cancellation support to prevent resource leaks during timeouts

1 file reviewed, 1 comment

Comment thread src/web/documents.rs

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🔭 Outside diff range comments (1)
src/web/documents.rs (1)

1-286: Fix code formatting to pass CI checks.

The pipeline indicates formatting issues. Please run cargo fmt --all to fix the formatting.

♻️ Duplicate comments (4)
tinfoil-proxy/main.go (4)

489-618: Add retry logic with exponential backoff.

Status check errors currently fail immediately. Consider implementing retry logic with a maximum retry count to handle transient failures gracefully.

Would you like me to generate a retry mechanism with exponential backoff for the status checking logic?
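
A retry helper along the lines suggested here might look like the sketch below. It is not the mechanism CodeRabbit offered to generate and not code from this PR: `retryWithBackoff` and `demo` are hypothetical names, and the doubling delay and attempt limit are illustrative choices.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retryWithBackoff runs op up to maxAttempts times, doubling the wait between
// attempts (base, 2*base, 4*base, ...). It stops early if ctx is cancelled.
func retryWithBackoff(ctx context.Context, maxAttempts int, base time.Duration, op func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	delay := base
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = op(); err == nil {
			return nil
		}
		if attempt == maxAttempts {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
		}
		delay *= 2
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// demo runs an operation that fails `failures` times before succeeding and
// reports how many attempts were made.
func demo(failures, maxAttempts int) (int, error) {
	attempts := 0
	op := func() error {
		attempts++
		if attempts <= failures {
			return errors.New("transient status check error")
		}
		return nil
	}
	err := retryWithBackoff(context.Background(), maxAttempts, time.Millisecond, op)
	return attempts, err
}

func main() {
	attempts, err := demo(2, 5)
	fmt.Println(attempts, err)
}
```

Wrapping only the single status request in such a helper, rather than the whole polling loop, keeps a transient network error from ending a job that is otherwise still progressing.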


420-420: Consider using a URL constant/config for the API endpoint.

The hardcoded URL makes it difficult to maintain and update across environments. Consider extracting this to a configuration constant or environment variable.

+const (
+    docUploadAsyncEndpoint = "/v1alpha/convert/file/async"
+)

-req, err := http.NewRequestWithContext(ctx, "POST", "https://doc-upload.model.tinfoil.sh/v1alpha/convert/file/async", &requestBody)
+req, err := http.NewRequestWithContext(ctx, "POST", fmt.Sprintf("https://%s%s", docUploadConfig.Enclave, docUploadAsyncEndpoint), &requestBody)

518-519: Extract hardcoded URLs to configuration.

Multiple hardcoded URLs make the code difficult to maintain. These should be configurable.

+const (
+    statusPollEndpoint = "/v1alpha/status/poll/%s"
+    resultEndpoint = "/v1alpha/result/%s"
+)

-fmt.Sprintf("https://doc-upload.model.tinfoil.sh/v1alpha/status/poll/%s", taskID),
+fmt.Sprintf("https://%s%s", docUploadConfig.Enclave, fmt.Sprintf(statusPollEndpoint, taskID)),

Also applies to: 568-569
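
As a rough sketch of the extraction suggested in the two comments above, the endpoint URLs could be built from a single configurable host. The paths and the default host are copied from the diffs; the `DOC_UPLOAD_HOST` environment variable and the helper names are assumptions and do not appear in the PR.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

const (
	// defaultHost and the paths below are copied from the review diffs.
	defaultHost     = "doc-upload.model.tinfoil.sh"
	asyncUploadPath = "/v1alpha/convert/file/async"
	statusPollPath  = "/v1alpha/status/poll/"
	resultPath      = "/v1alpha/result/"
)

// hostFrom returns the host from the (hypothetical) DOC_UPLOAD_HOST variable,
// falling back to defaultHost. getenv is injected so it can be faked in tests.
func hostFrom(getenv func(string) string) string {
	if h := getenv("DOC_UPLOAD_HOST"); h != "" {
		return h
	}
	return defaultHost
}

// asyncUploadURL builds the job submission URL for host.
func asyncUploadURL(host string) string {
	return "https://" + host + asyncUploadPath
}

// statusPollURL builds the status URL, escaping the task ID as a path segment.
func statusPollURL(host, taskID string) string {
	return "https://" + host + statusPollPath + url.PathEscape(taskID)
}

// resultURL builds the result URL, escaping the task ID as a path segment.
func resultURL(host, taskID string) string {
	return "https://" + host + resultPath + url.PathEscape(taskID)
}

func main() {
	host := hostFrom(os.Getenv)
	fmt.Println(asyncUploadURL(host))
	fmt.Println(statusPollURL(host, "abc-123"))
	fmt.Println(resultURL(host, "abc-123"))
}
```

Escaping the task ID matters because it arrives from the client; an ID containing a slash would otherwise change the request path.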


549-556: Add default case for empty status fields.

If all status fields are empty, the function could set an invalid empty status. This could lead to unexpected behavior.

 actualStatus := status.Status
 if actualStatus == "" {
     actualStatus = status.State
 }
 if actualStatus == "" {
     actualStatus = status.TaskStatus
 }
+if actualStatus == "" {
+    // Default to pending if no status is available
+    actualStatus = "pending"
+    log.Printf("No status field found in response, defaulting to 'pending'")
+}
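
The same fallback could also be written as a small method on the status type, sketched below. The field names Status, State, and TaskStatus come from the diff; the JSON key names, the `jobStatus` type name, and the `resolveJSON` helper are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jobStatus holds the three status field variations named in the diff.
// The JSON keys are assumed, not confirmed by the PR.
type jobStatus struct {
	Status     string `json:"status,omitempty"`
	State      string `json:"state,omitempty"`
	TaskStatus string `json:"task_status,omitempty"`
}

// resolve returns the first non-empty status field, or "pending" when the
// response carried none of them.
func (s jobStatus) resolve() string {
	for _, v := range []string{s.Status, s.State, s.TaskStatus} {
		if v != "" {
			return v
		}
	}
	return "pending"
}

// resolveJSON decodes a status response body and resolves its status.
func resolveJSON(body string) (string, error) {
	var s jobStatus
	if err := json.Unmarshal([]byte(body), &s); err != nil {
		return "", err
	}
	return s.resolve(), nil
}

func main() {
	for _, body := range []string{`{"status":"success"}`, `{"state":"started"}`, `{}`} {
		status, err := resolveJSON(body)
		fmt.Println(status, err)
	}
}
```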
🧹 Nitpick comments (1)
src/web/documents.rs (1)

148-151: Consider reducing idle timeout for async operations.

Since document processing is now async with quick initial responses, the 5-minute idle timeout may be excessive.

 let client = Client::builder()
-    .pool_idle_timeout(std::time::Duration::from_secs(300)) // 5 minutes
+    .pool_idle_timeout(std::time::Duration::from_secs(60)) // 1 minute
     .build::<_, Body>(https);
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d1ee94 and 1b6acb6.

⛔ Files ignored due to path filters (3)
  • pcrDev.json is excluded by !pcrDev.json
  • pcrProd.json is excluded by !pcrProd.json
  • tinfoil-proxy/dist/tinfoil-proxy is excluded by !**/dist/**
📒 Files selected for processing (4)
  • pcrDevHistory.json (1 hunks)
  • pcrProdHistory.json (1 hunks)
  • src/web/documents.rs (4 hunks)
  • tinfoil-proxy/main.go (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pcrProdHistory.json
  • pcrDevHistory.json
🧰 Additional context used
🪛 GitHub Actions: Rust CI
src/web/documents.rs

[error] 1-1: cargo fmt formatting check failed. Code style does not match rustfmt standards. Please run 'cargo fmt --all' to fix formatting issues.

⏰ Context from checks skipped due to timeout of 100000ms (1)
  • GitHub Check: Development Reproducible Build
🔇 Additional comments (2)
tinfoil-proxy/main.go (1)

122-140: Well-designed async job types.

The type definitions properly model the async job lifecycle with appropriate fields for tracking status, progress, and errors. Good defensive design including multiple status field variations.

src/web/documents.rs (1)

57-63: Router configuration looks good.

The new status endpoint is properly configured with decryption middleware, maintaining consistency with the existing upload endpoint.

Comment thread tinfoil-proxy/main.go
Comment thread src/web/documents.rs

@greptile-apps greptile-apps Bot left a comment

PR Summary

Latest updates to PCR history records and async document processing implementation.

  • Added latest PCR entries with timestamps 1751334849 and 1751335494 (dev) and 1751335014 and 1751335521 (prod) while maintaining consistent PCR1 values
  • Reduced status check timeout to 10 seconds in src/web/documents.rs for more responsive polling
  • Added structured logging for document processing state transitions

5 files reviewed, no comments

AnthonyRonning and others added 2 commits June 30, 2025 21:06
…ling

- Modified tinfoil-proxy to use docling async API endpoints
- Upload endpoint now returns task ID immediately (no more blocking)
- Added new /v1/documents/status/:taskId endpoint for checking progress
- Updated Rust backend to support the new async flow
- Frontend can now control polling to avoid 60s timeout issues

This architecture is more resilient:
- Handles connection drops gracefully
- No more frontend timeouts during long document processing
- Client controls retry logic and can show progress updates
- Better separation of concerns between upload and processing

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

@greptile-apps greptile-apps Bot left a comment

PR Summary

Finalized implementation of the asynchronous document processing system with additional PCR measurements recorded for system state verification.

  • Recorded two new PCR measurements in pcrProdHistory.json with timestamps 1751335014 and 1751335521
  • Optimized request timeouts in src/web/documents.rs by reducing status check timeout to 10 seconds
  • Implemented structured logging for document processing state transitions for better observability

6 files reviewed, 1 comment

Comment thread pcrDev.json
@AnthonyRonning AnthonyRonning merged commit 5c92146 into master Jul 1, 2025
10 checks passed
@AnthonyRonning AnthonyRonning deleted the tinfoil-doc-async branch July 1, 2025 02:17
@coderabbitai coderabbitai Bot mentioned this pull request Sep 16, 2025