
# v1.0.0 - First Release 🎉 (#306)

Merged: kevin-mindverse merged 21 commits into master from release_0428 on Apr 29, 2025

## Conversation

@kevin-mindverse (Contributor) commented Apr 28, 2025

## Overview

We're excited to announce our first release! This version introduces multi-platform support, thinking mode capabilities, and CUDA integration.

## Deployment Options 🚀

  • Multi-platform Support
    • Docker deployment available
    • Native support for macOS and Linux
    • Use uv or another virtual environment manager to avoid package conflicts

## New Features ✨

### Thinking Mode (Beta)

  • Available in playground mode
  • Enhanced reasoning chain for improved response quality
  • Requirements:
    • Recommended for models with 3B+ parameters
    • DeepSeek API key required (currently exclusive to DeepSeek; a minimal call sketch follows this list)
  • Note: Responses may take slightly longer due to the extra reasoning step
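
A minimal sketch of what a thinking-mode call looks like against DeepSeek's OpenAI-compatible endpoint. The model name and response fields follow DeepSeek's published API docs; the prompt and variable names are illustrative, not this project's actual code.

```python
# Minimal sketch: calling DeepSeek's reasoning model via its
# OpenAI-compatible API. Assumes the `openai` package is installed and
# DEEPSEEK_API_KEY is set; prompt and variable names are illustrative.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-reasoner",  # DeepSeek's chain-of-thought model
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

# The reasoning chain and the final answer arrive as separate fields,
# which is also why responses take longer in thinking mode.
print(resp.choices[0].message.reasoning_content)  # intermediate reasoning
print(resp.choices[0].message.content)            # final answer
```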

### CUDA Support 🎮

Special thanks to our community contributors!

  • Supports A100 and equivalent consumer-grade GPUs
  • Training pipeline runs on CUDA (a minimal detection sketch follows this list)
  • Inference is still powered by llama.cpp (CPU-based)
  • Tested primarily in Linux + GPU environments
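
The CUDA commits below boil down to "train on the GPU when one is present, otherwise fall back to CPU". A minimal detection sketch, assuming PyTorch; `select_device` is an illustrative name, not the project's actual API:

```python
# Minimal sketch of CUDA detection with CPU fallback, assuming PyTorch.
# `select_device` is an illustrative name, not the project's actual API.
import torch

def select_device() -> torch.device:
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        vram_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
        print(f"Training on {name} ({vram_gb:.1f} GB VRAM)")
        return torch.device("cuda:0")
    print("No CUDA device found; falling back to CPU")
    return torch.device("cpu")

device = select_device()
```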

## Notes

This is our first stable release, and we welcome feedback from the community to help improve future versions.

Tags: release thinking-mode cuda multi-platform docker

yexiangle and others added 21 commits April 23, 2025 20:46
* fix: fix L1 save problem

* fix: simplify the code

* fix: delete unused imports

* fix: delete unused data
* fix password update logic when there is more than one load

* update fix
* fix: modify thinking_model loading configuration

* feat: implement thinkModel UI

* feat: store

* feat: add combined_llm_config_dto

* add thinking_model_config & database migration

* directly add thinking model to user_llm_config

* delete thinking model repo dto service

* delete thinkingmodel table migration

* add is_cot config

* feat: allow defining is_cot

* feat: simplify logs info

* feat: add training model

* feat: fix is_cot problem

* fix: fix chat message

* fix: fix progress error

* fix: disable no settings thinking

* feat: add thinking warning

* fix: fix start service error

* feat: fix init trainparams problem

* feat: change playGround prompt

* feat: Add Dimension Mismatch Handling for ChromaDB (#157) (#207)

* Fix Issue #157

Add chroma_utils.py to manage ChromaDB, and add docs explaining it

* Add logging and debugging process

- Enhanced the `reinitialize_chroma_collections` function in `chroma_utils.py` to properly check whether collections exist before attempting to delete them, preventing errors when they don't.
- Improved error handling in the `_handle_dimension_mismatch` method in `embedding_service.py` by adding more robust exception handling and verification steps after reinitialization.
- Enhanced the collection initialization process in `embedding_service.py` to provide more detailed error messages and better handle cases where collections still have incorrect dimensions after reinitialization.
- Added verification steps to ensure that collection dimensions match the expected dimension after creation or retrieval.
- Improved logging throughout the code to provide more context in error messages, making debugging easier. (A sketch of the reinitialize-on-mismatch idea follows.)
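
A minimal sketch of the reinitialize-on-mismatch idea, assuming chromadb's persistent-client API; `ensure_collection` and `expected_dim` are illustrative names, not the project's actual code:

```python
# Minimal sketch of reinitialize-on-dimension-mismatch, assuming chromadb.
# `ensure_collection` and `expected_dim` are illustrative names.
import chromadb

def ensure_collection(client, name: str, expected_dim: int):
    collection = client.get_or_create_collection(name)
    # Probe the stored dimension by peeking at an existing embedding, if any.
    sample = collection.peek(limit=1)
    embeddings = sample.get("embeddings")
    if embeddings is not None and len(embeddings) > 0 and len(embeddings[0]) != expected_dim:
        # Stored vectors came from a different embedding model: drop and
        # recreate the collection rather than failing on the next insert.
        client.delete_collection(name)
        collection = client.create_collection(name)
    return collection

client = chromadb.PersistentClient(path="./chroma_db")
docs = ensure_collection(client, "documents", expected_dim=768)
```

Note the trade-off: dropping and recreating the collection discards the stored vectors, so documents must be re-embedded afterwards, which is why the commit adds verification and detailed logging around the operation.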

* Change topics_generator timeout to 30 (#263)

* quick fix

* fix: shade -> shade_merge_info (#265)

* fix: shade -> shade_merge_info

* add convert array

* quick fix import error

* add log

* add heartbeat

* new strategy

* sse version (a sketch of the SSE heartbeat pattern follows this commit list)

* add heartbeat

* zh to en

* optimize code

* quick fix convert function
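
The "sse version" and "add heartbeat" commits above describe streaming progress over server-sent events with periodic keep-alives. A minimal sketch of that pattern, assuming a Flask endpoint; the route, queue, and payloads are illustrative, not the project's actual API:

```python
# Minimal sketch of an SSE endpoint with heartbeats, assuming Flask.
# Route, queue, and payloads are illustrative.
import json
import queue
from flask import Flask, Response

app = Flask(__name__)
events: queue.Queue = queue.Queue()  # producer code would events.put({...})

def event_stream():
    while True:
        try:
            event = events.get(timeout=10)
            yield f"data: {json.dumps(event)}\n\n"
        except queue.Empty:
            # No data for 10s: emit an SSE comment line as a heartbeat so
            # proxies and clients don't drop the idle connection.
            yield ": heartbeat\n\n"

@app.route("/api/progress")
def progress():
    return Response(event_stream(), mimetype="text/event-stream")
```

SSE comment lines (those starting with ":") are ignored by clients, which makes them a cheap heartbeat that keeps intermediaries from closing an idle stream.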

* Feat/new branch management (#267)

* feat: new branch management

* feat: fix multi-upload

* optimize contribution management

---------

Co-authored-by: Crabboss Mr <1123357821@qq.com>
Co-authored-by: Ye Xiangle <yexiangle@mail.mindverse.ai>
Co-authored-by: Xinghan Pan <sampan090611@gmail.com>
Co-authored-by: doubleBlack2 <108928143+doubleBlack2@users.noreply.github.com>
Co-authored-by: kevin-mindverse <kevin@mindverse.ai>
Co-authored-by: KKKKKKKevin <115385420+kevin-mindverse@users.noreply.github.com>
* feat: replace tutorial link

* replace video link

---------

Co-authored-by: kevin-mindverse <kevin@mindverse.ai>
* Add CUDA support

- CUDA detection
- Memory handling
- Ollama model release after training

* Fix logging issue

Added a CUDA support flag so the log accurately reflects the CUDA toggle

* Update llama.cpp rebuild

Changed llama.cpp to check whether CUDA support is enabled and, if so, rebuild during the first build rather than on each run

* Improved VRAM management

Enabled memory pinning and optimizer-state offload (a minimal pinning sketch follows)
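
A minimal sketch of the memory-pinning half of this change, assuming PyTorch; the dataset is a placeholder, and optimizer-state offload (typically done through a framework such as DeepSpeed ZeRO-Offload) is not shown:

```python
# Minimal sketch of pinned-memory host-to-GPU transfers, assuming PyTorch.
# The dataset is a placeholder; optimizer-state offload is not shown.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1024, 128), torch.randint(0, 2, (1024,)))
loader = DataLoader(
    dataset,
    batch_size=32,
    pin_memory=True,  # page-locked host buffers enable async H2D copies
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
for x, y in loader:
    # non_blocking transfers only overlap with compute when the source
    # tensor lives in pinned memory.
    x = x.to(device, non_blocking=True)
    y = y.to(device, non_blocking=True)
    # ... forward/backward/step would go here ...
```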

* Fix CUDA check

Rewrote the llama.cpp rebuild logic and added a manual y/n toggle for enabling CUDA support

* Added fast restart and fixed CUDA check command

Added make docker-restart-backend-fast to restart the backend and pick up code changes without triggering a full llama.cpp rebuild

Fixed the make docker-check-cuda command to correctly report CUDA support

* Added docker-compose.gpu.yml

Added docker-compose.gpu.yml to fix an error on machines without an NVIDIA GPU, and ensured a "\n" is added before the .env modification

* Fixed cuda toggle

The last push accidentally broke the CUDA toggle

* Code review fixes

Fixed errors resulting from removed code:
- Added `return save_path` to the end of the `save_hf_model` function
- Rolled back the `download_file_with_progress` function

* Update Makefile

Use CUDA by default when using docker-restart-backend-fast

* Minor cleanup

Removed an unnecessary Makefile command and fixed GPU logging

* Delete .gpu_selected

* Simplified cuda training code

- Removed dtype setting to let torch handle it automatically
- Removed VRAM logging
- Removed unnecessary/old comments

* Fixed gpu/cpu selection

Made "make docker-use-gpu/cpu" command work with .gpu_selected flag and changed "make docker-restart-backend-fast" command to respect flag instead of always using gpu

* Fix Ollama embedding error

Added a custom exception class for Ollama embeddings, which seemed to be returning keyword arguments while Python's built-in exception class only accepts positional ones (see the sketch below)
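
A minimal sketch of a keyword-argument-friendly exception, as described above; the class name and fields are illustrative, not the project's actual code:

```python
# Minimal sketch of an exception class that tolerates keyword arguments.
# Python's built-in Exception only accepts positional args, so an error
# payload delivered as kwargs (as the Ollama embedding path seemed to do)
# would itself raise a TypeError. Names here are illustrative.
class OllamaEmbeddingError(Exception):
    def __init__(self, *args, **kwargs):
        # Fold keyword details into the message instead of dropping them.
        detail = ", ".join(f"{k}={v!r}" for k, v in kwargs.items())
        message = " ".join(str(a) for a in args)
        super().__init__(f"{message} ({detail})" if detail else message)
        self.details = kwargs

try:
    raise OllamaEmbeddingError("embedding request failed", status=500)
except OllamaEmbeddingError as e:
    print(e)          # embedding request failed (status=500)
    print(e.details)  # {'status': 500}
```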

* Fixed model selection & memory error

Fixed training defaulting to the 0.5B model regardless of selection, and fixed the "free(): double free detected in tcache 2" error caused by the CUDA flag being passed incorrectly
* feature: use uv to set up the Python environment

* TrainProcessService: add singleton method get_instance (a sketch of the pattern follows)
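
A minimal sketch of the get_instance singleton pattern this commit names; the class body is illustrative, not the actual TrainProcessService:

```python
# Minimal sketch of the get_instance singleton pattern; the class body is
# illustrative, not the actual TrainProcessService.
import threading

class TrainProcessService:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get_instance(cls) -> "TrainProcessService":
        # Double-checked locking so concurrent callers share one service.
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = cls()
        return cls._instance

service = TrainProcessService.get_instance()
assert service is TrainProcessService.get_instance()  # same object every time
```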

* feat: fix code

* Added CUDA support (#228)


* fix: train service singleton

---------

Co-authored-by: Zachary Pitroda <30330004+zpitroda@users.noreply.github.com>
* fix: adjust status order

* fix: adjust train status

* fix: split the service and train statuses
* Update README.md

Updated the tutorial link

* Update README.md with FAQ

Added a new FAQ section to the docs
* feat: adjust train rule
* feat: what? no llama.cpp

* add cache
* fix: add relative
@Llux-C (Contributor) left a comment:

:)

@yingapple (Contributor) left a comment:

great work

@kevin-mindverse changed the title from "Release 0428" to "v1.0.0 - First Release 🎉" on Apr 28, 2025
@kevin-mindverse merged commit ddfcd15 into master on Apr 29, 2025 (1 check passed)
Cybercricetus pushed a commit to Cybercricetus/Second-Me that referenced this pull request May 29, 2025
EOMZON pushed a commit to EOMZON/Second-Me that referenced this pull request Feb 1, 2026