Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .devops/nix/package.nix
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
vulkan-loader,
openssl,
shaderc,
spirv-headers,
useBlas ?
builtins.all (x: !x) [
useCuda
Expand Down Expand Up @@ -145,6 +146,7 @@ effectiveStdenv.mkDerivation (finalAttrs: {
ninja
pkg-config
git
spirv-headers
]
++ optionals useCuda [
cudaPackages.cuda_nvcc
Expand Down
24 changes: 6 additions & 18 deletions .github/workflows/build-riscv.yml
Original file line number Diff line number Diff line change
Expand Up @@ -47,22 +47,10 @@ jobs:
steps:
- name: Install dependencies
run: |
sudo apt-get update

# Install necessary packages
sudo apt-get install -y libatomic1 libtsan2 gcc-14 g++-14 cmake build-essential wget git-lfs

# Set gcc-14 and g++-14 as the default compilers
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-14 100

if ! which rustc; then
# Install Rust stable version
sudo apt-get install -y rustup
rustup install stable
rustup default stable
fi

git lfs install

- name: GCC version check
Expand All @@ -74,12 +62,12 @@ jobs:
id: checkout
uses: actions/checkout@v6

# FIXME: Enable when ggml-org/ccache-action works on riscv64
# - name: ccache
# uses: ggml-org/ccache-action@v1.2.21
# with:
# key: ubuntu-riscv64-native-sanitizer-${{ matrix.sanytizer }}-${{ matrix.build_type }}
# save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}
- name: ccache
uses: ggml-org/ccache-action@afde29e5b5422e5da23cb1f639e8baecadeadfc3 # https://github.com/ggml-org/ccache-action/pull/1
with:
key: ubuntu-riscv64-native-sanitizer-${{ matrix.sanitizer }}-${{ matrix.build_type }}
evict-old-files: 1d
save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

- name: Build
id: cmake_build
Expand Down
25 changes: 8 additions & 17 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1001,22 +1001,14 @@ jobs:
steps:
- name: Install dependencies
run: |
sudo apt-get update

# Install necessary packages
sudo apt-get install -y libatomic1 libtsan2 gcc-14 g++-14 cmake build-essential libssl-dev wget git-lfs
sudo apt-get update
sudo apt-get install -y libssl-dev

# Set gcc-14 and g++-14 as the default compilers
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-14 100

if ! which rustc; then
# Install Rust stable version
sudo apt-get install -y rustup
rustup install stable
rustup default stable
fi

git lfs install

- name: Check environment
Expand All @@ -1032,13 +1024,12 @@ jobs:
id: checkout
uses: actions/checkout@v6

# FIXME: Enable when ggml-org/ccache-action works on riscv64
# - name: ccache
# uses: ggml-org/ccache-action@v1.2.21
# with:
# key: ubuntu-cpu-riscv64-native
# evict-old-files: 1d
# save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}
- name: ccache
uses: ggml-org/ccache-action@afde29e5b5422e5da23cb1f639e8baecadeadfc3 # https://github.com/ggml-org/ccache-action/pull/1
with:
key: ubuntu-cpu-riscv64-native
evict-old-files: 1d
save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

- name: Build
id: cmake_build
Expand Down
18 changes: 17 additions & 1 deletion CODEOWNERS
Original file line number Diff line number Diff line change
@@ -1,5 +1,21 @@
# collaborators can optionally add themselves here to indicate their availability for reviewing related PRs
# multiplie collaborators per item can be specified
# multiple collaborators per item can be specified
#
# ggml-org/ci : CISC, danbev, ggerganov, netrunnereve, ngxson, taronaeo
# ggml-org/ggml-cann : hipudding
# ggml-org/ggml-cuda : JohannesGaessler, am17an, IMbackK, ORippler
# ggml-org/ggml-hexagon : lhez, max-krasnyansky
# ggml-org/ggml-metal : ggerganov
# ggml-org/ggml-opencl : lhez, max-krasnyansky
# ggml-org/ggml-rpc : rgerganov
# ggml-org/ggml-sycl : arthw
# ggml-org/ggml-vulkan : 0cc4m, jeffbolznv
# ggml-org/ggml-webgpu : reeselevine
# ggml-org/ggml-zdnn : taronaeo
# ggml-org/llama-common : ggerganov, aldehir, angt, danbev, ngxson, pwilkin
# ggml-org/llama-mtmd : ngxson
# ggml-org/llama-server : ggerganov, ngxson, allozaur, angt, ServeurpersoCom
# ggml-org/llama-webui : allozaur

/.devops/*.Dockerfile @ngxson
/.github/actions/ @ggml-org/ci
Expand Down
59 changes: 58 additions & 1 deletion convert_hf_to_gguf.py
Original file line number Diff line number Diff line change
Expand Up @@ -10893,7 +10893,64 @@ def set_gguf_parameters(self):
self.gguf_writer.add_moe_latent_size(latent_size)

def set_vocab(self):
super().set_vocab()
# The NemotronH config uses pattern characters (e.g. '-') that may not
# be supported by the installed transformers version. AutoTokenizer
# internally calls AutoConfig which triggers this parsing failure.
# Using trust_remote_code=True to load the model's own config class.
tokens: list[str] = []
toktypes: list[int] = []

from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(self.dir_model, trust_remote_code=True)

# Pad vocab size (from Mamba2Model/GraniteHybridModel)
self.hparams["pad_vocab_size_multiple"] = 8 # Setting this here since GraniteHybridModel.set_vocab() isn't being invoked now.
# From Mamba2Model.set_vocab():
vocab_size = self.hparams["vocab_size"]
pad_vocab = self.hparams.get("pad_vocab_size_multiple", 16)
# ref: https://stackoverflow.com/a/17511341/22827863
vocab_size = -(vocab_size // -pad_vocab) * pad_vocab
self.hparams["vocab_size"] = vocab_size

assert max(tokenizer.vocab.values()) < vocab_size

Check failure on line 10915 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10915:20: unresolved-attribute: Attribute `vocab` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10915 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10915:20: unresolved-attribute: Attribute `vocab` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

tokpre = self.get_vocab_base_pre(tokenizer)

reverse_vocab = {id_: encoded_tok for encoded_tok, id_ in tokenizer.vocab.items()}

Check failure on line 10919 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10919:67: unresolved-attribute: Attribute `vocab` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10919 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10919:67: unresolved-attribute: Attribute `vocab` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default
added_vocab = tokenizer.get_added_vocab()

Check failure on line 10920 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10920:23: unresolved-attribute: Attribute `get_added_vocab` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10920 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10920:23: unresolved-attribute: Attribute `get_added_vocab` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

added_tokens_decoder = tokenizer.added_tokens_decoder

Check failure on line 10922 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10922:32: unresolved-attribute: Attribute `added_tokens_decoder` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10922 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10922:32: unresolved-attribute: Attribute `added_tokens_decoder` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

for i in range(vocab_size):
if i not in reverse_vocab:
tokens.append(f"[PAD{i}]")
toktypes.append(gguf.TokenType.UNUSED)
else:
token: str = reverse_vocab[i]
if token in added_vocab:
if not added_tokens_decoder[i].normalized:
previous_token = token
token = tokenizer.decode(tokenizer.encode(token, add_special_tokens=False))

Check failure on line 10933 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10933:50: unresolved-attribute: Attribute `encode` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10933 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10933:33: unresolved-attribute: Attribute `decode` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10933 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (invalid-assignment)

convert_hf_to_gguf.py:10933:33: invalid-assignment: Object of type `Unknown | str | list[str]` is not assignable to `str` convert_hf_to_gguf.py:10933:25: Declared type `str` info: rule `invalid-assignment` is enabled by default

Check failure on line 10933 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10933:50: unresolved-attribute: Attribute `encode` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10933 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (unresolved-attribute)

convert_hf_to_gguf.py:10933:33: unresolved-attribute: Attribute `decode` is not defined on `None` in union `Unknown | TokenizersBackend | None | SentencePieceBackend` info: rule `unresolved-attribute` is enabled by default

Check failure on line 10933 in convert_hf_to_gguf.py

View workflow job for this annotation

GitHub Actions / python type-check

ty (invalid-assignment)

convert_hf_to_gguf.py:10933:33: invalid-assignment: Object of type `Unknown | str | list[str]` is not assignable to `str` convert_hf_to_gguf.py:10933:25: Declared type `str` info: rule `invalid-assignment` is enabled by default
if previous_token != token:
logger.info(f"{repr(previous_token)} is encoded and decoded back to {repr(token)} using AutoTokenizer")

if added_tokens_decoder[i].special or self.does_token_look_special(token):
toktypes.append(gguf.TokenType.CONTROL)
else:
token = token.replace(b"\xe2\x96\x81".decode("utf-8"), " ") # pre-normalize user-defined spaces
toktypes.append(gguf.TokenType.USER_DEFINED)
else:
toktypes.append(gguf.TokenType.NORMAL)
tokens.append(token)

# From TextModel.set_vocab_gpt2():
self.gguf_writer.add_tokenizer_model("gpt2")
self.gguf_writer.add_tokenizer_pre(tokpre)
self.gguf_writer.add_token_list(tokens)
self.gguf_writer.add_token_types(toktypes)

special_vocab = gguf.SpecialVocab(self.dir_model, load_merges=True)
special_vocab.add_to_gguf(self.gguf_writer)

# The tokenizer _does_ add a BOS token (via post_processor type
# TemplateProcessing) but does not set add_bos_token to true in the
Expand Down
1 change: 1 addition & 0 deletions docs/backend/SYCL.md
Original file line number Diff line number Diff line change
Expand Up @@ -689,6 +689,7 @@ use 1 SYCL GPUs: [0] with Max compute units:512
| GGML_SYCL_F16 | OFF *(default)* \|ON *(optional)* | Enable FP16 build with SYCL code path. (1.) |
| GGML_SYCL_GRAPH | OFF *(default)* \|ON *(Optional)* | Enable build with [SYCL Graph extension](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc). |
| GGML_SYCL_DNN | ON *(default)* \|OFF *(Optional)* | Enable build with oneDNN. |
| GGML_SYCL_HOST_MEM_FALLBACK | ON *(default)* \|OFF *(Optional)* | Allow host memory fallback when device memory is full during quantized weight reorder. Enables inference to continue at reduced speed (reading over PCIe) instead of failing. Requires Linux kernel 6.8+. |
| CMAKE_C_COMPILER | `icx` *(Linux)*, `icx/cl` *(Windows)* | Set `icx` compiler for SYCL code path. |
| CMAKE_CXX_COMPILER | `icpx` *(Linux)*, `icx` *(Windows)* | Set `icpx/icx` compiler for SYCL code path. |

Expand Down
16 changes: 8 additions & 8 deletions docs/ops.md
Original file line number Diff line number Diff line change
Expand Up @@ -22,13 +22,13 @@ Legend:
| ARANGE | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ARGMAX | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ |
| ARGSORT | ❌ | ✅ | ✅ | ✅ | ✅ | 🟡 | 🟡 | ✅ | ✅ | ❌ | ❌ |
| CEIL | ❌ | ❌ | ✅ | 🟡 | | ❌ | ✅ | 🟡 | ✅ | ❌ | ❌ |
| CEIL | ❌ | ❌ | ✅ | 🟡 | | ❌ | ✅ | 🟡 | ✅ | ❌ | ❌ |
| CLAMP | ❌ | ✅ | ✅ | ✅ | ✅ | 🟡 | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| CONCAT | ❌ | ✅ | ✅ | 🟡 | ✅ | 🟡 | ✅ | ✅ | ✅ | ❌ | ❌ |
| CONT | ❌ | 🟡 | ✅ | ✅ | 🟡 | 🟡 | 🟡 | ✅ | 🟡 | ❌ | ❌ |
| CONT | ❌ | 🟡 | ✅ | ✅ | | 🟡 | 🟡 | ✅ | 🟡 | ❌ | ❌ |
| CONV_2D | ❌ | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
| CONV_2D_DW | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |
| CONV_3D | ❌ | ❌ | ✅ | ❌ | | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| CONV_3D | ❌ | ❌ | ✅ | ❌ | | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| CONV_TRANSPOSE_1D | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| CONV_TRANSPOSE_2D | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |
| COS | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
Expand All @@ -46,7 +46,7 @@ Legend:
| EXPM1 | ❌ | ❌ | ✅ | 🟡 | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
| FILL | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |
| FLASH_ATTN_EXT | ❌ | 🟡 | ✅ | 🟡 | 🟡 | 🟡 | 🟡 | 🟡 | 🟡 | ❌ | ❌ |
| FLOOR | ❌ | ❌ | ✅ | 🟡 | | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| FLOOR | ❌ | ❌ | ✅ | 🟡 | | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| GATED_DELTA_NET | ❌ | ❌ | ✅ | ❌ | 🟡 | ❌ | ✅ | ❌ | ✅ | ❌ | ❌ |
| GATED_LINEAR_ATTN | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| GEGLU | ❌ | ✅ | ✅ | ✅ | 🟡 | ✅ | ✅ | 🟡 | ✅ | ❌ | ❌ |
Expand Down Expand Up @@ -84,10 +84,10 @@ Legend:
| REPEAT_BACK | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| RMS_NORM | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| RMS_NORM_BACK | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ROLL | ❌ | ❌ | ✅ | ✅ | | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ROLL | ❌ | ❌ | ✅ | ✅ | | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ROPE | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| ROPE_BACK | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| ROUND | ❌ | ❌ | ✅ | 🟡 | | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| ROUND | ❌ | ❌ | ✅ | 🟡 | | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| RWKV_WKV6 | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| RWKV_WKV7 | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ |
| SCALE | ❌ | 🟡 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
Expand Down Expand Up @@ -116,6 +116,6 @@ Legend:
| TIMESTEP_EMBEDDING | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ |
| TOP_K | ❌ | ❌ | ✅ | ❌ | ✅ | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| TRI | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ❌ |
| TRUNC | ❌ | ❌ | ✅ | 🟡 | | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| TRUNC | ❌ | ❌ | ✅ | 🟡 | | ❌ | 🟡 | 🟡 | ✅ | ❌ | ❌ |
| UPSCALE | ❌ | 🟡 | ✅ | ✅ | ✅ | 🟡 | ✅ | ✅ | ❌ | ❌ | ❌ |
| XIELU | ❌ | ❌ | ✅ | ❌ | | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |
| XIELU | ❌ | ❌ | ✅ | ❌ | | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ |
Loading
Loading