AEON-7

NVFP4 quantizations · Abliterated LLMs · DGX Spark deployments · AI media production toolchain



I build deployment-ready open releases for next-gen GPUs: NVFP4-quantized abliterated LLMs (Gemma 4, Qwen 3.6, Nemotron 3) running on NVIDIA DGX Spark (GB10 / Blackwell / sm_121a), DFlash speculative decoding, EAGLE drafters, and a complete agent-driven AI media production toolchain.

Everything below is public, MIT/Apache-licensed, and reproducible: Docker stacks, pre-built vLLM images, deployment guides, and benchmark numbers are included.


🎬 AEON Media Production

Open-source AI-driven media production toolchain. Five focused repositories, each generating one kind of media (music, radio drama, music video, cinematic video, or the base ComfyUI stack), all designed for AI agents through skill MD files and CLI scripts. No node-graph wrangling.

| Repo | What it does |
|---|---|
| comfyui-aeon-spark | Bleeding-edge ComfyUI for DGX Spark (CUDA 13 + SageAttention v3 + NVFP4 + 14 custom-node packs + Flux 2 Dev / LTX 2.3 22B / ACE-Step v1.5 XL Turbo pre-bundled). Foundation for every other repo in this section. |
| aeon-music-maker | ACE Step 1.5 XL music generation with dynamics-preserving mastering chain (HPF → EQ → tape sat → LUFS gain-match → true-peak ceiling). FLAC-lossless output, auto-detected mastering presets, CLI-driven. |
| aeon-radio-drama | Full-pipeline radio drama / audiobook production: dialogue (Qwen3-TTS) + music (ACE Step) + SFX (MMAudio / Stable Audio Open / ACE) + sidechain mix in one command. Three-Lock voice persistence. Bundles standalone music_maker.py + sfx_maker.py for one-shot music or SFX generation. |
| aeon-music-video | Audio-reactive music video builder. librosa-driven beat / onset / RMS / spectral-centroid detection drives ffmpeg filter chains for synced visual effects. CPU-only: no GPU, no ComfyUI, no model downloads required. |
| aeon-movie-maker | Fast cinematic video via LTX 2.3 22B. Single clips, full screenplays with character continuity (last-frame carry-forward + per-character seed offsets), and sidechain-mixed final cuts. CLI-tunable LoRA strengths + saturation troubleshooting guide. |

What every AEON Media Production repo ships with

| File | What it is |
|---|---|
| README.md | Quick start, configuration table, local-vs-remote ComfyUI execution modes, env-var reference, model-installation paths |
| AGENTS.md | Step-0 execution-mode detection guide for AI agents, invocation contract, recovery patterns |
| SKILL.md | Full prompt-engineering recipes, troubleshooting decision trees, the canonical agent skill definition |
| ATTRIBUTION.md | Upstream credits: every model, library, and custom node properly attributed |
| .env.example | Verbose and self-documenting: every variable has inline instructions on where to get values (HF tokens, Civitai tokens, ComfyUI URL patterns) |
| setup.sh | First-time install: validates ComfyUI reachability, installs Python deps, inventories model files, prints download commands for missing pieces |
| sync.sh | Incremental update: diff preview, auto-stash local edits, ff-only pull, refresh deps, re-run model check. Supports --dry-run / --yes / --no-models |
| .gitignore | Standard: never commits output/, models/, .env, __pycache__, etc. |
| LICENSE | MIT |

Lifecycle (every repo, same pattern)

git clone https://github.com/AEON-7/<repo>     →   ./setup.sh        →   start using
                                                       │
                                                       ▼
                                              copy .env.example → .env
                                              edit COMFYUI_URL etc.
                                                       │
                                                       ▼
                                          python scripts/<tool>.py ...
                                                       │
                            (later, when upstream updates) ▼
                                                ./sync.sh
                                              (preview → confirm → pull → refresh)

Local vs remote ComfyUI

Every tool that uses ComfyUI supports two execution modes, documented per-repo:

  • Mode A (Local): the CLI runs on the same machine as ComfyUI. Just run python scripts/<tool>.py ...
  • Mode B (Remote): ComfyUI runs on a GPU box (DGX Spark, headless server). Either invoke the CLI over SSH (ssh user@gpu-host 'cd repo && python ...') or hit the remote ComfyUI HTTP API directly via an SSH tunnel or --listen 0.0.0.0.

aeon-movie-maker documents additional constraints: I2V and screenplay carry-forward need filesystem-level access, so pure HTTP-only remote execution works for T2V single clips only.
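As a rough illustration of the two modes and the aeon-movie-maker constraint, here is a minimal Python sketch. Every name in it (`detect_mode`, `task_supported`, the task labels, the example URLs) is hypothetical and not taken from the AEON repos; only `COMFYUI_URL` as a setting name comes from the lifecycle above. It assumes the mode can be guessed from the host part of the URL, which is only a heuristic.

```python
from urllib.parse import urlparse

# Hypothetical helper names and task labels; not the repos' actual code.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}
NEEDS_FILESYSTEM = {"i2v", "screenplay"}  # tasks the text says need file access


def detect_mode(comfyui_url: str) -> str:
    """Guess 'local' or 'remote' from the host in COMFYUI_URL.

    Caveat: a remote ComfyUI reached through an SSH tunnel also shows up
    as a loopback host, so this guess says nothing about file access.
    """
    host = urlparse(comfyui_url).hostname
    return "local" if host in LOOPBACK_HOSTS else "remote"


def task_supported(task: str, has_filesystem_access: bool) -> bool:
    """HTTP-only access is enough for T2V single clips, not for I2V/screenplay."""
    return has_filesystem_access or task not in NEEDS_FILESYSTEM
```

Filesystem access is passed in explicitly rather than derived from the URL, because an SSH tunnel makes a remote instance look local.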


💎 Gemma 4 Models: NVFP4 quantizations for DGX Spark

Abliterated Gemma 4 deployments at NVFP4 precision (4-bit weights + 8-bit activations) for NVIDIA DGX Spark / Blackwell GPUs. Includes EAGLE speculative-decoding drafters for both base models.
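To give a feel for what 4-bit block quantization does to weights, here is a simplified pure-Python sketch. It assumes the commonly described NVFP4 layout: 4-bit E2M1 elements with one shared scale per 16-element block. It keeps the scale as an ordinary float instead of encoding it in 8 bits, and it is not the calibration (AWQ, SVDQuant) or kernel path used by these releases. All function names are illustrative.

```python
# Magnitudes representable by a 4-bit E2M1 float (sign stored separately).
E2M1_GRID = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK = 16  # assumed block size: elements sharing one scale


def quantize_block(values):
    """Return (scale, codes): codes are signed E2M1 grid values."""
    amax = max(abs(v) for v in values)
    scale = amax / 6.0 if amax > 0 else 1.0  # map the largest value to 6.0
    codes = []
    for v in values:
        mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
        codes.append(-mag if v < 0 else mag)
    return scale, codes


def dequantize_block(scale, codes):
    return [scale * c for c in codes]


def fake_quantize(weights):
    """Quantize then dequantize, block by block, to expose rounding error."""
    out = []
    for start in range(0, len(weights), BLOCK):
        scale, codes = quantize_block(weights[start:start + BLOCK])
        out.extend(dequantize_block(scale, codes))
    return out
```

Because each block gets its own scale, a single outlier only coarsens the 16 values it shares a block with, not the whole tensor.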

| Repo | Model | Architecture | Description |
|---|---|---|---|
| Gemma-4-31B-DECKARD-HERETIC-Uncensored-NVFP4 | Gemma 4 31B DECKARD HERETIC | Dense, thinking | NVFP4-quantized abliterated 31B dense reasoning model. AWQ_FULL + SVDQuant variants. |
| Gemma-4-26B-A4B-it-Uncensored-NVFP4 | Gemma 4 26B A4B-it | MoE | NVFP4-quantized 26B MoE. 50 tok/s single, 1430 tok/s aggregate on DGX Spark. |
| supergemma4-26b-abliterated-multimodal-nvfp4 | SuperGemma 4 26B Abliterated Multimodal | MoE, multimodal | NVFP4 AWQ Full quantization for Blackwell GPUs. Pre-built vLLM container + patches included. |
| Gemma-4-E4B-it-Uncensored-NVFP4 | EAGLE drafter for 26B MoE | Speculative decoding | EAGLE E4B speculative-decoding drafter for the Gemma 4 26B MoE. NVFP4 AWQ. |
| Gemma-4-E4B-DECKARD-HERETIC-Uncensored-NVFP4 | EAGLE drafter for 31B DECKARD | Speculative decoding | EAGLE E4B speculative-decoding drafter for the 31B DECKARD HERETIC. |

πŸ‰ Qwen 3.5 Models

Reserved for upcoming Qwen 3.5 abliterated NVFP4 deployments. No releases yet β€” coming soon.


πŸ‰ Qwen 3.6 Models β€” NVFP4 + DFlash on DGX Spark

Lossless abliteration of Qwen 3.6 with hardware NVFP4 quantization, optionally combined with DFlash speculative decoding for higher single-stream throughput.
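For readers new to speculative decoding, the sketch below shows the generic greedy draft-and-verify loop in plain Python. It is a toy under stated assumptions: models are stand-in functions mapping a token list to the next token, verification is written as a loop where a real engine would use one batched forward pass, and it does not reproduce DFlash's block-diffusion drafting or EAGLE's drafter design. All names are illustrative.

```python
def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Greedy draft-and-verify; returns (tokens, number_of_verify_rounds)."""
    tokens = list(prompt)
    verify_rounds = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal, ctx = [], list(tokens)
        for _ in range(k):
            nxt = draft_next(ctx)
            proposal.append(nxt)
            ctx.append(nxt)

        # 2. The target checks the proposal (one batched pass in practice).
        verify_rounds += 1
        accepted, ctx = [], list(tokens)
        for tok in proposal:
            expected = target_next(ctx)
            if expected != tok:
                accepted.append(expected)  # replace the first mismatch
                break
            accepted.append(tok)
            ctx.append(tok)
        else:
            accepted.append(target_next(ctx))  # bonus token when all match

        tokens.extend(accepted)
    return tokens[:len(prompt) + max_new_tokens], verify_rounds


def greedy_decode(target_next, prompt, max_new_tokens):
    """Baseline: one target call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target_next(tokens))
    return tokens
```

With greedy verification the output matches target-only greedy decoding token for token; the gain is that each verify round can commit several tokens whenever the drafter guesses well.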

| Repo | Model | Architecture | Description |
|---|---|---|---|
| Qwen3.6-27B-AEON-Ultimate-Uncensored-DFlash | Qwen 3.6 27B AEON Ultimate Uncensored | Dense | Lossless abliteration with NVFP4 hardware quantization. BF16 (51 GB) + NVFP4 (26 GB) deployment guide, docker-compose, QuickStart. |
| Qwen3.6-NVFP4-DFlash | Qwen 3.6 35B-A3B-heretic | MoE | NVFP4 + DFlash speculative decoding on DGX Spark (GB10 / sm_121a). Source-built vLLM image + 7 patches + comprehensive deployment guide. |

🌌 Nemotron Models

NVIDIA Nemotron deployments for Blackwell-class hardware.

| Repo | Model | Architecture | Description |
|---|---|---|---|
| Nemotron-3-Nano-Omni-AEON-Ultimate-Uncensored | Nemotron 3 Nano Omni | 12-D abliterated multimodal | BF16 + NVFP4 multimodal reasoning model for DGX Spark / Blackwell. Source-built vLLM v0.20.0 image + 4 patches + benchmark + deployment guide. |

πŸ› οΈ Inference & Optimization Tools

Lower-level building blocks: speculative decoding, KV-cache compression, quantization tooling, and vLLM image builds.

| Repo | What it does |
|---|---|
| vllm-dflash | DFlash vLLM for DGX Spark: Plug & Play Block-Diffusion Speculative Decoding. Pre-built Docker image with NVFP4, sm_121a kernels, and Qwen-targeted optimizations. |
| turboquant | Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration. |
| Model-Optimizer | Unified library of SOTA model-optimization techniques (quantization, pruning, distillation, speculative decoding) for TensorRT-LLM / TensorRT / vLLM deployment. |
| modelopt-fast-moe | MoE-targeted quantization + AWQ calibration tooling. NVFP4 routing, expert-aware modelopt. |
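The turboquant row quotes 3-bit keys and 2-bit values. A back-of-the-envelope calculation shows what that means for KV-cache size against a 16-bit baseline. This sketch ignores per-block scales and other metadata, so a real implementation would land somewhat above these numbers, and the model dimensions in the usage lines are made up for illustration, not taken from any model listed here.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, key_bits, value_bits):
    """Payload bytes for one sequence's KV cache, ignoring scale metadata."""
    elements = layers * kv_heads * head_dim * seq_len  # per K and per V
    return elements * (key_bits + value_bits) // 8


# Hypothetical model shape, chosen only to make the arithmetic concrete.
shape = dict(layers=32, kv_heads=8, head_dim=128, seq_len=4096)

baseline = kv_cache_bytes(**shape, key_bits=16, value_bits=16)
compressed = kv_cache_bytes(**shape, key_bits=3, value_bits=2)

print(baseline // 2**20, "MiB at 16-bit")         # 512 MiB
print(compressed // 2**20, "MiB at 3-bit/2-bit")  # 80 MiB
print(baseline / compressed, "x smaller")         # 6.4
```

The ratio depends only on the bit widths, (16 + 16) / (3 + 2) = 6.4, so it holds for any model shape under the same no-overhead assumption.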

🧪 Apps & Utilities

Side projects, tools, and infrastructure that aren't model deployments but might be useful.

| Repo | What it does |
|---|---|
| matrix-voip-agent | Headless Matrix WebRTC voice agent: auto-answers VoIP calls and bridges audio to any AI agent via PipeWire. |
| cosmic-mind | Security-and-resiliency-focused deployment of the Quartz web app. A place to build your second mind and share it. |
| regex-builder | Simple and elegant RegEx builder. |
| quartz | Fast batteries-included static-site generator that transforms Markdown into fully functional websites. |


☕ Support the work

If any of these releases have been useful to you, tips are deeply appreciated; they go directly toward more compute, more models, and more open releases. Scan a QR with your wallet, or click any address below to copy.

₿ Bitcoin (BTC)
bc1q09xmzn00q4z3c5raene0f3pzn9d9pvawfm0py4
Ξ Ethereum (ETH)
0x1512667F6D61454ad531d2E45C0a5d1fd82D0500
◎ Solana (SOL)
DgQsjHdAnT5PNLQTNpJdpLS3tYGpVcsHQCkpoiAKsw8t
ⓜ Monero (XMR)
836XrSKw4R76vNi3QPJ5Fa9ugcyvE2cWmKSPv3AhpTNNKvqP8v5ba9JRL4Vh7UnFNjDz3E2GXZDVVenu3rkZaNdUFhjAvgd

Ethereum L2s (Base, Arbitrum, Optimism, Polygon, etc.) and EVM-compatible tokens can be sent to the same Ethereum address.


🀝 Get in touch

  • 🌐 Open an issue on any repo for questions, bug reports, or feature requests
  • 📜 Most releases include a deployment guide + benchmark numbers, so start there

Built for the open source community on NVIDIA DGX Spark, RTX 5090, and Blackwell-class GPUs.