Add general name to train #6752

Merged
ggerganov merged 3 commits into ggml-org:master from teleprint-me:add-general-name-to-train
Apr 19, 2024

@teleprint-me teleprint-me commented Apr 18, 2024

This commit writes the model name (general.name) into the GGUF metadata of a model trained with train-text-from-scratch.

19:41:23 | /mnt/valerie/forked/ggerganov/llama.cpp
 git:(add-general-name-to-train | θ) λ python gguf-py/scripts/gguf-dump.py models/valerie/v0.1/ggml-valerie-v0.1-256x32-f32-LATEST.gguf --no-tensors
* Loading: models/valerie/v0.1/ggml-valerie-v0.1-256x32-f32-LATEST.gguf
* File is LITTLE endian, script is running on a LITTLE endian host.

* Dumping 24 key/value pair(s)
      1: UINT32     |        1 | GGUF.version = 3
      2: UINT64     |        1 | GGUF.tensor_count = 147
      3: UINT64     |        1 | GGUF.kv_count = 21
      4: STRING     |        1 | general.architecture = 'llama'
      5: STRING     |        1 | general.name = 'llama'  # Adds the model's name
      6: UINT32     |        1 | general.file_type = 0
      7: UINT32     |        1 | llama.context_length = 256
      8: UINT32     |        1 | llama.embedding_length = 256
      9: UINT32     |        1 | llama.feed_forward_length = 768
     10: UINT32     |        1 | llama.attention.head_count = 8
     11: UINT32     |        1 | llama.block_count = 16
     12: UINT32     |        1 | llama.rope.dimension_count = 32
     13: FLOAT32    |        1 | llama.attention.layer_norm_rms_epsilon = 9.999999747378752e-06
     14: FLOAT32    |        1 | llama.rope.freq_base = 10000.0
     15: FLOAT32    |        1 | llama.rope.scale_linear = 1.0
     16: STRING     |        1 | tokenizer.ggml.model = 'llama'
     17: [FLOAT32]  |    32000 | tokenizer.ggml.scores
     18: [INT32]    |    32000 | tokenizer.ggml.token_type
     19: [STRING]   |    32000 | tokenizer.ggml.tokens
     20: UINT32     |        1 | tokenizer.ggml.bos_token_id = 1
     21: UINT32     |        1 | tokenizer.ggml.eos_token_id = 2
     22: UINT32     |        1 | tokenizer.ggml.unknown_token_id = 0
     23: UINT32     |        1 | tokenizer.ggml.seperator_token_id = 4294967295
     24: UINT32     |        1 | tokenizer.ggml.padding_token_id = 4294967295

This commit simply uses the model's architecture as the name, which keeps the change minimal and simple until I have time to come up with a more customizable approach.
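For readers unfamiliar with how a key such as general.name is stored, here is a minimal sketch in stdlib-only Python. It is not the C++ change from this PR. It only illustrates the GGUF key/value layout (magic, version, tensor count, kv count, then length-prefixed string pairs) and the "name falls back to architecture" behaviour described above. The helper names (`build_metadata`, `write_header`, `read_header`) are hypothetical, and the blob is metadata-only with zero tensors, so it is not a loadable model file.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"
GGUF_VERSION = 3
GGUF_TYPE_STRING = 8  # value-type id used for strings in the GGUF format


def pack_string(text):
    """GGUF string: little-endian uint64 byte length followed by UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack("<Q", len(data)) + data


def build_metadata(arch, name=None):
    # Hypothetical helper: with no explicit name, reuse the architecture
    # string, mirroring the fallback this PR describes.
    return {"general.architecture": arch, "general.name": name or arch}


def write_header(kv):
    """Serialize string-only key/value pairs as a GGUF header with no tensors."""
    out = GGUF_MAGIC + struct.pack("<IQQ", GGUF_VERSION, 0, len(kv))
    for key, value in kv.items():
        out += pack_string(key)
        out += struct.pack("<I", GGUF_TYPE_STRING)
        out += pack_string(value)
    return out


def read_string(buf):
    (length,) = struct.unpack("<Q", buf.read(8))
    return buf.read(length).decode("utf-8")


def read_header(blob):
    """Parse the string-only header produced by write_header."""
    buf = io.BytesIO(blob)
    if buf.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF blob")
    _version, _tensor_count, kv_count = struct.unpack("<IQQ", buf.read(20))
    kv = {}
    for _ in range(kv_count):
        key = read_string(buf)
        (value_type,) = struct.unpack("<I", buf.read(4))
        if value_type != GGUF_TYPE_STRING:
            raise ValueError("this sketch only handles string values")
        kv[key] = read_string(buf)
    return kv


blob = write_header(build_metadata("llama"))
meta = read_header(blob)
print(meta["general.name"])
```

Passing an explicit name, for example `build_metadata("llama", "valerie")`, is roughly the kind of customization a later change could expose. For real files, gguf-dump.py as shown above remains the right tool.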

@ggerganov ggerganov merged commit 8b1b1f4 into ggml-org:master Apr 19, 2024
@teleprint-me teleprint-me deleted the add-general-name-to-train branch May 9, 2024 00:06
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* llama : make general.name optional

* train: Add 'general.name' to model metadata

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>

---------

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
* llama : make general.name optional

* train: Add 'general.name' to model metadata

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>

---------

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>