Question about the completion rate of camera app tasks in the benchmark #4

@ldya97

Hello, I am a researcher working on large-scale intelligent agents. I have recently been testing with your benchmark and found that the agent performs very poorly on tasks in the camera app. One obvious reason is that the settings and the camera-mode switch are not shown on the initial interface; they only become visible after swiping left. Unless the large model is given a relevant cue, I believe it is difficult for the agent to perform this swipe on its own. Since my own agent performs quite poorly, I would like to ask what the completion rates are for each app in your benchmark. This information seems to be missing from the paper.
