The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Reliable Automation Agents at Scale
PsyDI: Towards a Personalized and Progressively In-depth Chatbot for Psychological Measurements (e.g., an MBTI measurement agent).
Modern video analytics with VLMs
Run Surfer-H agents powered by Holo1 using the Surfer-H-CLI. Includes example tasks, scripts, and configurations.
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
🎭 Real-time voice-controlled 3D avatar with multimodal AI - speak naturally and watch your AI companion respond with perfect lip-sync
Official n8n custom node for VLM Run
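For context on what a custom n8n node like this involves, here is a minimal TypeScript sketch of the standard node skeleton. The class shape and the `n8n-workflow` types are n8n's public extension API, but the node name, parameters, and the idea of forwarding an image URL to a VLM endpoint are illustrative assumptions, not the actual VLM Run node (exact typings also vary slightly between n8n versions).

```typescript
import type {
  IExecuteFunctions,
  INodeExecutionData,
  INodeType,
  INodeTypeDescription,
} from 'n8n-workflow';

// Hypothetical minimal node: the real VLM Run node exposes different
// operations and credentials; this only shows the standard skeleton.
export class VlmExampleNode implements INodeType {
  description: INodeTypeDescription = {
    displayName: 'VLM Example',
    name: 'vlmExample',
    group: ['transform'],
    version: 1,
    description: 'Send an image URL to a vision-language model endpoint',
    defaults: { name: 'VLM Example' },
    inputs: ['main'],
    outputs: ['main'],
    properties: [
      {
        displayName: 'Image URL',
        name: 'imageUrl',
        type: 'string',
        default: '',
      },
    ],
  };

  async execute(this: IExecuteFunctions): Promise<INodeExecutionData[][]> {
    const items = this.getInputData();
    const returnData: INodeExecutionData[] = [];

    for (let i = 0; i < items.length; i++) {
      const imageUrl = this.getNodeParameter('imageUrl', i) as string;
      // A real node would call the provider's API here (e.g. via
      // this.helpers.httpRequest) and attach the model's response.
      returnData.push({ json: { imageUrl, status: 'queued' } });
    }

    return [returnData];
  }
}
```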
A Node-based CLI tool to generate test plans from video recordings using Google's Gemini models.
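As a rough illustration of that workflow, here is a sketch of sending a screen recording to a Gemini model from Node and asking for a test plan. It uses the public `@google/generative-ai` SDK; the model name, prompt, and file handling are assumptions, not the repository's actual code.

```typescript
import { readFileSync } from 'node:fs';
import { GoogleGenerativeAI } from '@google/generative-ai';

// Assumed model name and prompt; the actual tool may use a different
// Gemini variant and the Files API for large recordings.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? '');
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-pro' });

async function generateTestPlan(videoPath: string): Promise<string> {
  const video = readFileSync(videoPath).toString('base64');
  const result = await model.generateContent([
    'Watch this screen recording and draft a step-by-step test plan ' +
      'covering the user flows it demonstrates.',
    { inlineData: { data: video, mimeType: 'video/mp4' } },
  ]);
  return result.response.text();
}

generateTestPlan('recording.mp4').then(console.log).catch(console.error);
```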
An implementation of FastVLM/LLaVA (or any LLM/VLM) using FastAPI (backend) and React (frontend), with Action/Caption modes and frame control.
A GUI agent application based on UI-TARS (a vision-language model) that lets you control your computer using natural language.
This repository contains the frontend code for Ailert.tech, built with Next.js, Tailwind CSS, and Python.
AI-Powered Multi-Camera Vision LLM System for Factory Optimization
AI-powered disaster alert system for Pakistan that automatically processes official emergency warnings and delivers location-targeted alerts to communities in real-time.
👀 Monitor camera streams in real-time with AI vision models for object detection, contextual understanding, and intelligent video search.
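To make the pattern concrete, here is a hedged TypeScript sketch of the core loop such a monitor might run: grab a frame, send it to an OpenAI-compatible vision endpoint, and log the description. The endpoint, model id, and frame source are assumptions, not the project's actual implementation.

```typescript
// Minimal monitoring-loop sketch (Node 18+, global fetch).
// Assumes an OpenAI-compatible /v1/chat/completions endpoint that
// accepts image_url content parts; the real project may differ.
const ENDPOINT =
  process.env.VLM_ENDPOINT ?? 'http://localhost:8000/v1/chat/completions';

async function describeFrame(frameBase64: string): Promise<string> {
  const res = await fetch(ENDPOINT, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'vision-model', // placeholder model id
      messages: [
        {
          role: 'user',
          content: [
            { type: 'text', text: 'Describe what is happening in this camera frame.' },
            { type: 'image_url', image_url: { url: `data:image/jpeg;base64,${frameBase64}` } },
          ],
        },
      ],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content as string;
}

// Poll a hypothetical frame grabber every few seconds and log results.
function monitor(getFrame: () => Promise<string>): void {
  setInterval(async () => {
    const description = await describeFrame(await getFrame());
    console.log(new Date().toISOString(), description);
  }, 5000);
}
```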
An AI-powered location discovery system using multi-modal data (text, images, reviews, real-time factors)