Yaxin9Luo/README.md


   About Me

   My long-term goal is to develop intelligent machines capable of understanding, generating, reasoning over, and acting agentically on multimodal content.

   Currently working on multimodal foundation models

   Based in Abu Dhabi | PhD Candidate

   Quick Links

   

   

   



Tech Stack

Core

Python PyTorch MATLAB

AI / ML Frameworks

Hugging Face Transformers Diffusers DeepSpeed OpenAI API

Tools

Git Linux Docker W&B


GitHub Stats

  


Trophies



Contributions

  


Activity Graph



Let's Connect

Website | LinkedIn | Twitter | Google Scholar | Email


Profile Views

Pinned

  1. Gamma-MOD (Public)

    [ICLR 2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models

    Python · 43 stars · 2 forks

  2. MetaAgentX/OpenCaptchaWorld (Public)

    [NeurIPS 2025] The first web-based benchmark and platform for evaluating the visual reasoning and interaction capabilities of MLLM-powered agents through diverse and dynamic CAPTCHA puzzles.

    JavaScript · 67 stars · 2 forks

  3. MetaAgentX/NextGen-CAPTCHAs (Public)

    [ICML 2026] A defense framework against MLLM-based web GUI agents. This repository provides both the generative CAPTCHA system and tools for evaluating agent resistance.

    Python · 20 stars