
Docs: image and PDF use cases + misc touchups #343

Open
anyacherniss wants to merge 19 commits into main from anya/image-use-case

Conversation

@anyacherniss
Collaborator

@anyacherniss anyacherniss commented Sep 26, 2025

Closes: #210
Closes: #322
Closes: #432

@vercel

vercel bot commented Sep 26, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| anotherai | Ready | Preview | Comment | Oct 7, 2025 5:33pm |
| anotherai-docs | Ready | Preview | Comment | Oct 7, 2025 5:33pm |

Example:

```
Can you retry the last 5 completions of anotherai/agent/image-description-generator and compare the outputs with
```
Contributor

@anyacherniss does this use case work?
Is Claude/Cursor able to fetch the image via its URL to understand what the image is about?

Collaborator Author

It appears so, yes. I asked Claude to analyze an image from https://anotherai.dev/experiments/01999b76-0e5c-7086-4849-3ef9e28f0ff6, which reused inputs from https://anotherai.dev/experiments/01999b4a-5300-73c2-9ff7-a7e75094c124.

The description initially had an error: it said four donuts rather than three. I also made up a question about a "green thing" to see whether it would note that no such thing exists. After I prompted it to recheck, it appears it can view and analyze the image.

This is Sonnet 4.5

[Screenshot: CleanShot 2025-09-30 at 12 43 02@2x]

@anyacherniss changed the title from "Docs: image and PDF use cases" to "Docs: image and PDF use cases + misc touchups" on Oct 2, 2025
@anyacherniss
Collaborator Author

@claude can you review and suggest any improvements that either:

  • increase the readability of the changes, particularly for new/prospective users who have minimal context about the product
  • fix technical inaccuracies

@claude
Contributor

claude bot commented Oct 3, 2025

Claude finished @anyacherniss's task


PR Review: Documentation Improvements for Image and PDF Use Cases

I've reviewed all the documentation changes in this PR. The overall structure and content are strong, but I've found several areas where readability can be improved for new/prospective users and a few technical inaccuracies that should be addressed.

✅ Strengths

  • Excellent step-by-step guides for both image and PDF processing agents
  • Good use of callouts and accordions to organize information
  • Helpful visual examples with screenshots
  • Clear workflow for experimentation and optimization

🔍 Issues Found

1. Missing Space in Prompt Examples (image-agents.mdx:41, pdf-processing-agents.mdx:41)

Create a new AnotherAI agent that provides a list of detected foods and their approximate calorie count from a given imagewhere the average response time is less than 4 seconds

Missing space between "image" and "where". Should be:

Create a new AnotherAI agent that provides a list of detected foods and their approximate calorie count from a given image where the average response time is less than 4 seconds

2. Typo (Both files)

Line 58: "User identication" → "User identification"

3. Confusing Wording for Dataset Structure (image-agents.mdx:101-103, pdf-processing-agents.mdx:101-103)

The current wording is unclear for new users:

"A dataset is a collection of images in a file format that your AI assistant can understand. Datasets can contain expected outputs, but do not need to."

Suggested rewrite:

"A dataset is a JSON file containing URLs or file paths to your images/PDFs along with optional expected outputs. This allows you to test multiple inputs at once."

4. Technical Inaccuracy in Callout (image-agents.mdx:144-156)

Based on the review comments, the callout about file path references is partially incorrect: Anya noted that PDFs dragged and dropped into Claude Code do include file paths. Consider clarifying or removing this callout to avoid confusion.

5. Grammatical Error (Both files)

Line 97: "There a couple ways to do this" → "There are a couple ways to do this"

6. Unclear JSON Structure (image-agents.mdx:124, 130, 135)

The JSON examples have trailing commas that make them invalid:

{
  "image_url": "https://example.com/images/photo1.jpg",
}

Should be:

{
  "image_url": "https://example.com/images/photo1.jpg"
}
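
One way to catch this class of error before publishing is to run each JSON example through a parser. The sketch below uses Python's standard `json` module, which rejects trailing commas. The helper name `check_json` is illustrative and not part of AnotherAI, and the sample strings simply mirror the two snippets above.

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# The snippet as it appears in the docs, with a trailing comma.
invalid = '{\n  "image_url": "https://example.com/images/photo1.jpg",\n}'
# The corrected snippet.
valid = '{\n  "image_url": "https://example.com/images/photo1.jpg"\n}'

print(check_json(invalid))  # an error description
print(check_json(valid))    # None

# Generating examples with json.dumps avoids the problem entirely,
# because the serializer never emits trailing commas.
generated = json.dumps(
    {"image_url": "https://example.com/images/photo1.jpg"}, indent=2
)
print(generated)
```

Running a check like this over every fenced JSON block in the guides would flag all three affected examples at once.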

7. Duplicate Content Issue

The PDF processing document appears to be largely copied from the image agents document with minimal changes. While the structure is appropriate, consider:

  • Adding PDF-specific considerations (e.g., multi-page handling, text extraction vs visual analysis)
  • Including different examples that showcase PDF-specific use cases
  • Mentioning any limitations or differences in PDF processing vs image processing

8. Missing Context for New Users (user-feedback.mdx:33-45)

The example jumps into asking the AI assistant to create a function without explaining:

  • What the annotations API is
  • How completion IDs are obtained
  • What the API key is and where to get it

Consider adding a brief introduction or linking to relevant documentation.

9. Inconsistent Image References

Some image references use generic placeholder names (e.g., /images/add-annotation-completion-output.png) which may not exist. Ensure all image paths are correct and images are included in the PR.
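
A check like this can be automated. The following is a minimal sketch, assuming the docs reference images with site-absolute paths (such as `/images/...`) that are served from a single static directory. The function names, the regular expression, and the directory layout are assumptions for illustration, not the repository's actual tooling. The demo builds a temporary directory so it can run anywhere.

```python
import re
import tempfile
from pathlib import Path

# Matches Markdown images ![alt](/images/x.png) and HTML/JSX src="/images/x.png".
IMAGE_REF = re.compile(r'!\[[^\]]*\]\(([^)\s]+)\)|src=["\']([^"\']+)["\']')


def find_image_refs(text):
    """Return the image paths referenced in a Markdown/MDX string."""
    return [md or src for md, src in IMAGE_REF.findall(text)]


def missing_images(text, public_dir):
    """Return site-absolute image paths that do not exist under public_dir."""
    root = Path(public_dir)
    return [
        ref
        for ref in find_image_refs(text)
        if ref.startswith("/") and not (root / ref.lstrip("/")).is_file()
    ]


# Demo: one referenced image exists in the (temporary) static directory,
# the other does not.
sample = (
    "![Annotation](/images/add-annotation-completion-output.png)\n"
    '<img src="/images/exists.png" />'
)
with tempfile.TemporaryDirectory() as public_dir:
    images = Path(public_dir) / "images"
    images.mkdir()
    (images / "exists.png").touch()
    missing = missing_images(sample, public_dir)

print(missing)
```

In a real run, `text` would be the contents of each changed `.mdx` file and `public_dir` the directory the docs site serves static assets from.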

💡 Suggestions for Improvement

  1. Add a Brief Introduction: Both the image and PDF guides could benefit from a 2-3 sentence introduction explaining what these agents do and common use cases before diving into the tutorial.

  2. Clarify Local vs URL Inputs: Add a clear section explaining when to use URLs vs local files, and any limitations of each approach.

  3. Include Troubleshooting Section: Consider adding common issues users might encounter (e.g., image size limits, supported formats, timeout issues).

  4. Link Between Related Docs: Add cross-references between the image and PDF processing guides since they share similar workflows.

  5. Real-World Examples: Include more concrete examples of what users might build (e.g., "receipt scanner", "document classifier", "product image analyzer").

✅ Todo List

  • Examine all changed documentation files
  • Review image-agents.mdx for readability and accuracy
  • Review pdf-processing-agents.mdx for readability and accuracy
  • Review other modified documentation files
  • Compile and provide comprehensive feedback

The documentation additions are valuable and well-structured overall. Addressing these issues will make the guides more accessible to new users and technically accurate.




#### Uploading Individual PDFs

If you have a small number of PDFs to test, you can also upload them individually to your AI assistant. Since PDFs uploaded this way aren't stored in a persistent dataset, this approach is best suited for ad-hoc testing rather than systematic experimentation. Any completions created using these PDFs will be stored in AnotherAI and can be reused in future experiments, but to test a specific PDF again at a later date, you will need to provide links to the specific completions that used that document.
Contributor

How does "upload a PDF to an AI assistant" work? Is there a screenshot or video that we can show?

Collaborator Author

I can record a video, yes. Adding as a todo for later in the process - want to get the text part polished first

Collaborator Author

Added the video in 4a997d1. Due to edits I made to the content in this section, I chose to put the video a bit lower on the page. I feel strongly that placing it under this text is not the best place for it, given the additional content I've added since you reviewed.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • Clarify deployment naming in documentation
  • Documentation: use case for PDF extraction
  • Documentation: use case for images as input

3 participants