
AgentTools

Updates go here; data download link: (to be added by Jinyu)

Daily experiment logs go here; please also send over previous attempts: (to be added by Li Kun)

Data description goes here: (to be added by Lin Yao)

README sample (8.23)

An Autonomous Agent for Human Mobility Prediction

This repository contains the implementation of a novel autonomous agent for predicting human mobility patterns. The agent leverages a Large Language Model (LLM) to emulate a human-like reasoning process, moving beyond traditional statistical models by incorporating a dynamic, multi-step cognitive cycle.

Overview

Predicting human movement is a complex task that depends on historical habits, semantic context, and logical reasoning. This project introduces an agent that tackles this challenge by employing a Plan-Execute-Reflect architecture. Instead of making a single, monolithic prediction, the agent breaks down the problem into a sequence of logical steps, using a set of specialized "tools" to gather information and build a conclusion, progressively refining its understanding of a user's behavior.

The core capabilities of the agent include:

  • Dynamic Planning: Using an LLM to generate a step-by-step plan to solve a prediction task.
  • Tool Use: Executing a series of tools to gather and process information, such as finding historical precedents and enriching location data.
  • Reflective Learning: Analyzing the outcomes of its actions to discover and memorize high-level behavioral patterns.

Architecture: The Plan-Execute-Reflect Cycle

The agent operates in a continuous cognitive loop for each prediction it needs to make. This loop is orchestrated by the TrajectoryPredictionAgent (core/agent.py).

  1. Planner (core/planner.py): This is the agent's "brain." At each step, it assesses the overall goal and the history of actions taken so far. It then queries the LLM to decide the single most logical next_action to take from a predefined set of tools.
  2. Executor (core/executor.py): This is the agent's "hands." It receives the action command from the Planner and executes the corresponding tool. The tools are granular and specific:
    • find_similar_historical_visits: Finds relevant past visits to generate candidate locations.
    • enrich_candidate_details: Fetches semantic descriptions for candidate locations.
    • make_final_prediction: The final step, where the Planner uses the LLM to synthesize all gathered evidence into a final prediction.
  3. Reflector (core/reflector.py): After each action, the Reflector analyzes the outcome (observation) to identify meaningful patterns (e.g., "User visits the gym on weekday evenings"). These insights are stored in the PatternMemory.
  4. Memory (memory/): The agent is equipped with short-term memory modules that are active during a prediction task:
    • OperationMemory: A "scratchpad" that remembers the sequence of actions and observations in the current loop.
    • PatternMemory: Stores high-level insights discovered by the Reflector, providing long-term context for future tasks.
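The cycle above can be sketched as a minimal loop. All class and method names below are illustrative stand-ins, not the actual implementations in core/agent.py, core/planner.py, core/executor.py, and core/reflector.py; in particular, the LLM-backed planner is replaced by a fixed tool order.

```python
# Minimal sketch of the Plan-Execute-Reflect loop (hypothetical names,
# loosely mirroring the architecture described above).

class ToyAgent:
    TOOLS = ("find_similar_historical_visits",
             "enrich_candidate_details",
             "make_final_prediction")

    def __init__(self):
        self.operation_memory = []   # scratchpad: (action, observation) pairs
        self.pattern_memory = []     # high-level insights from reflection

    def plan(self):
        # Stand-in for the LLM call: pick the next tool not yet executed.
        done = {action for action, _ in self.operation_memory}
        return next(tool for tool in self.TOOLS if tool not in done)

    def execute(self, action):
        # Stand-in for real tool execution.
        return f"observation from {action}"

    def reflect(self, action, observation):
        # Stand-in for the Reflector: record a pattern after retrieval.
        if action == "find_similar_historical_visits":
            self.pattern_memory.append("user often revisits recent locations")

    def run(self):
        while True:
            action = self.plan()
            observation = self.execute(action)
            self.operation_memory.append((action, observation))
            self.reflect(action, observation)
            if action == "make_final_prediction":
                return observation

agent = ToyAgent()
result = agent.run()
```

The real Planner chooses actions dynamically via the LLM rather than in a fixed order, but the control flow (plan, execute, record, reflect, stop on the final prediction) follows this shape.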

Project Structure

Agent/
├── config/
│   └── config.yaml           # Main configuration for models, paths, etc.
├── core/
│   ├── agent.py              # Orchestrates the main agent loop
│   ├── planner.py            # The "brain": decides the next action via LLM
│   ├── executor.py           # The "hands": executes tools
│   └── reflector.py          # Analyzes results and creates patterns
├── memory/
│   ├── short_term/           # Operation and Pattern memory modules
│   └── ...
├── prompts/
│   ├── templates/
│   │   ├── system_prompt.txt # Defines the agent's role and goals
│   │   ├── user_prompt.txt   # Structures the input for the LLM at each step
│   │   └── tool_definitions.txt # Describes the available tools to the LLM
│   └── prompt_manager.py     # Loads and formats prompts
├── retrieval/
│   └── retriever.py          # Data access layer for historical trajectories
├── main_evaluation.py        # Main script for running multi-model, multi-user evaluations
└── requirements.txt          # Python dependencies
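To illustrate how prompts/prompt_manager.py might assemble a per-step prompt from the templates above, here is a hedged sketch; the template contents and placeholder names are assumptions, not the repository's actual files.

```python
# Hypothetical prompt assembly in the spirit of prompt_manager.py.
# The template strings and {goal}/{history} placeholders are assumptions.
SYSTEM_TEMPLATE = "You are a mobility-prediction agent. Goal: {goal}"
USER_TEMPLATE = "History so far:\n{history}\nChoose the next tool."

def build_prompt(goal, history_lines):
    """Fill the system and user templates for one planning step."""
    system = SYSTEM_TEMPLATE.format(goal=goal)
    user = USER_TEMPLATE.format(history="\n".join(history_lines))
    return system, user

system, user = build_prompt(
    "predict user 42's next location",
    ["find_similar_historical_visits -> 3 candidates"],
)
```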

Getting Started

1. Prerequisites

  • Python 3.9+
  • NVIDIA GPU with CUDA for running local transformer models.
  • Conda or another virtual environment manager is recommended.

2. Installation

Clone the repository and install the required dependencies:

git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name/Agent
pip install -r requirements.txt

3. Configuration

The primary configuration is handled in config/config.yaml. You will need to set up:

  • Model Configuration: Specify the models you want to evaluate (e.g., qwen3-8b, qwen3-32b). The system uses Hugging Face Transformers to load models locally, so ensure the model_id matches the Hugging Face Hub identifier. You can also configure quantization and other performance settings.
  • Data Paths: Provide the correct paths to your trajectory data files.
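A config.yaml along these lines would satisfy the description above. All key names and values here are illustrative assumptions; consult the shipped config/config.yaml for the actual schema.

```yaml
# Illustrative sketch only; key names and values are assumptions.
models:
  qwen3-8b:
    model_id: Qwen/Qwen3-8B       # Hugging Face Hub identifier
    quantization: 4bit
  qwen3-32b:
    model_id: Qwen/Qwen3-32B
    quantization: 8bit
data:
  trajectory_file: data/user_grid_trajectories.parquet
  grid_details_file: data/tokyo_grid.csv
```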

4. Data Preparation

The agent expects user trajectory data in a specific format. The Retriever is configured to load:

  • A main trajectory file (e.g., user_grid_trajectories.parquet).
  • A grid details file containing semantic descriptions (e.g., tokyo_grid.csv).

Ensure your data is pre-processed and placed in the paths specified in your config.yaml.
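The exact column schema is defined by your pre-processing pipeline. As a sanity-check sketch, the snippet below verifies that a file header contains the columns the Retriever needs; the column names are assumptions, not the repository's required schema.

```python
# Hypothetical schema check; the column names are assumptions.
REQUIRED_TRAJECTORY_COLUMNS = {"user_id", "timestamp", "grid_id"}
REQUIRED_GRID_COLUMNS = {"grid_id", "description"}

def missing_columns(columns, required):
    """Return required columns absent from a file's header, sorted."""
    return sorted(required - set(columns))

# e.g. a header read from the trajectory file
header = ["user_id", "timestamp", "grid_id", "weekday"]
gap = missing_columns(header, REQUIRED_TRAJECTORY_COLUMNS)
```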

Usage

The project is designed as a comprehensive evaluation framework. You can run multi-model and multi-user comparisons directly from the command line.

Running the Evaluation

The main entry point is main_evaluation.py. The script is pre-configured to run a batch evaluation. You can modify the following constants directly in the main() function of the script:

  • USER_IDS_TO_PROCESS: A list of user IDs to evaluate.
  • MODELS_TO_COMPARE: A list of model keys (which must be defined in config.yaml).
  • KNOWLEDGE_CUTOFF, PREDICTION_START, PREDICTION_END: Date strings to define the training/testing split.
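The date constants partition each user's trajectory into a knowledge (history) window and a prediction window. A minimal sketch of that split, with illustrative constant values:

```python
from datetime import date

# Illustrative values; the real constants are set in main_evaluation.py.
KNOWLEDGE_CUTOFF = date(2023, 6, 30)
PREDICTION_START = date(2023, 7, 1)
PREDICTION_END = date(2023, 7, 14)

def split_visits(visits):
    """Partition (date, location) visits into history vs. prediction targets."""
    history = [v for v in visits if v[0] <= KNOWLEDGE_CUTOFF]
    targets = [v for v in visits
               if PREDICTION_START <= v[0] <= PREDICTION_END]
    return history, targets

visits = [(date(2023, 6, 1), "gym"),
          (date(2023, 7, 2), "office"),
          (date(2023, 7, 20), "park")]   # after PREDICTION_END: excluded
history, targets = split_visits(visits)
```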

To run the evaluation, simply execute the script:

python main_evaluation.py

Output

The script will generate:

  1. Detailed Logs: Real-time console output showing the agent's step-by-step reasoning process for each prediction.
  2. CSV Results: For each model and user, two CSV files are saved in the evaluation_results/ directory:
    • raw_predictions_...csv: The raw output from the agent.
    • evaluated_results_...csv: The predictions merged with the ground truth, used for metric calculation.
  3. Final Summary: A summary report is printed to the console at the end, ranking the models by performance (Acc@1, Acc@5).
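Acc@k counts a prediction as correct when the ground-truth location appears among the agent's top-k ranked candidates. A minimal sketch of that metric (the prediction format shown is an assumption about the CSV contents):

```python
def acc_at_k(ranked_predictions, ground_truths, k):
    """Fraction of cases where the true location is in the top-k candidates."""
    hits = sum(truth in preds[:k]
               for preds, truth in zip(ranked_predictions, ground_truths))
    return hits / len(ground_truths)

preds = [["gym", "office", "cafe", "park", "home"],
         ["home", "gym", "office", "cafe", "park"]]
truth = ["gym", "park"]
acc1 = acc_at_k(preds, truth, 1)   # 0.5: only the first case hits at rank 1
acc5 = acc_at_k(preds, truth, 5)   # 1.0: both truths appear in the top 5
```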

Citation

If you find this work useful in your research, please consider citing:
