Support Llama 3 conversion #6745

Merged: ggerganov merged 8 commits into ggml-org:master from pcuenca:llama3-conversion on Apr 21, 2024

pcuenca (Contributor) commented Apr 18, 2024

The tokenizer is BPE.

@osanseviero

What a 🐐

Josh-XT commented Apr 18, 2024

What a champion lol. PR open within 30 minutes of model release.

@m18coppola (Contributor)

The eos_token doesn't seem to work with either of the convert scripts in this PR.

USBhost commented Apr 18, 2024

I can't convert the 70B model with this.

EDIT: run with "--vocab-type bpe"

mchiang0610 commented Apr 18, 2024

This is what we did to get the model out. The special tokens don't seem to be added properly.

We are looking deeper for further improvements / fixes.
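
For anyone comparing configs, a quick way to see which added tokens are not flagged as special is to scan the `added_tokens_decoder` map. This is only a minimal sketch: the helper name `non_special_tokens` is hypothetical, and the inline sample is a shortened excerpt of the JSON quoted below, not the full file.

```python
import json


def non_special_tokens(config):
    """Return {token_id: content} for added tokens whose "special" flag is false."""
    decoder = config.get("added_tokens_decoder", {})
    return {
        int(token_id): entry["content"]
        for token_id, entry in decoder.items()
        if not entry.get("special", False)
    }


# Shortened excerpt of the quoted config (assumption: same layout as the full file).
sample = json.loads("""
{
  "added_tokens_decoder": {
    "128000": {"content": "<|begin_of_text|>", "special": true},
    "128006": {"content": "<|start_header_id|>", "special": false},
    "128009": {"content": "<|eot_id|>", "special": false}
  }
}
""")

print(non_special_tokens(sample))
```

Running the same function over the full JSON would list every entry with `"special": false`, which may help narrow down why the end-of-turn handling looks off.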

Edit by JG: made collapsible
{
  "added_tokens_decoder": {
    "128000": {
      "content": "<|begin_of_text|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128001": {
      "content": "<|end_of_text|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128002": {
      "content": "<|reserved_special_token_0|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128003": {
      "content": "<|reserved_special_token_1|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128004": {
      "content": "<|reserved_special_token_2|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128005": {
      "content": "<|reserved_special_token_3|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128006": {
      "content": "<|start_header_id|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "128007": {
      "content": "<|end_header_id|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "128008": {
      "content": "<|reserved_special_token_4|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128009": {
      "content": "<|eot_id|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": false
    },
    "128010": {
      "content": "<|reserved_special_token_5|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128011": {
      "content": "<|reserved_special_token_6|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128012": {
      "content": "<|reserved_special_token_7|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128013": {
      "content": "<|reserved_special_token_8|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128014": {
      "content": "<|reserved_special_token_9|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128015": {
      "content": "<|reserved_special_token_10|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128016": {
      "content": "<|reserved_special_token_11|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128017": {
      "content": "<|reserved_special_token_12|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128018": {
      "content": "<|reserved_special_token_13|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128019": {
      "content": "<|reserved_special_token_14|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128020": {
      "content": "<|reserved_special_token_15|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128021": {
      "content": "<|reserved_special_token_16|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128022": {
      "content": "<|reserved_special_token_17|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128023": {
      "content": "<|reserved_special_token_18|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128024": {
      "content": "<|reserved_special_token_19|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128025": {
      "content": "<|reserved_special_token_20|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128026": {
      "content": "<|reserved_special_token_21|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128027": {
      "content": "<|reserved_special_token_22|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128028": {
      "content": "<|reserved_special_token_23|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128029": {
      "content": "<|reserved_special_token_24|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128030": {
      "content": "<|reserved_special_token_25|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128031": {
      "content": "<|reserved_special_token_26|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128032": {
      "content": "<|reserved_special_token_27|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128033": {
      "content": "<|reserved_special_token_28|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128034": {
      "content": "<|reserved_special_token_29|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128035": {
      "content": "<|reserved_special_token_30|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128036": {
      "content": "<|reserved_special_token_31|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128037": {
      "content": "<|reserved_special_token_32|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128038": {
      "content": "<|reserved_special_token_33|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128039": {
      "content": "<|reserved_special_token_34|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128040": {
      "content": "<|reserved_special_token_35|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128041": {
      "content": "<|reserved_special_token_36|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128042": {
      "content": "<|reserved_special_token_37|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128043": {
      "content": "<|reserved_special_token_38|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128044": {
      "content": "<|reserved_special_token_39|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128045": {
      "content": "<|reserved_special_token_40|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128046": {
      "content": "<|reserved_special_token_41|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128047": {
      "content": "<|reserved_special_token_42|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128048": {
      "content": "<|reserved_special_token_43|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128049": {
      "content": "<|reserved_special_token_44|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128050": {
      "content": "<|reserved_special_token_45|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128051": {
      "content": "<|reserved_special_token_46|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128052": {
      "content": "<|reserved_special_token_47|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128053": {
      "content": "<|reserved_special_token_48|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128054": {
      "content": "<|reserved_special_token_49|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128055": {
      "content": "<|reserved_special_token_50|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128056": {
      "content": "<|reserved_special_token_51|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128057": {
      "content": "<|reserved_special_token_52|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128058": {
      "content": "<|reserved_special_token_53|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128059": {
      "content": "<|reserved_special_token_54|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128060": {
      "content": "<|reserved_special_token_55|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128061": {
      "content": "<|reserved_special_token_56|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128062": {
      "content": "<|reserved_special_token_57|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128063": {
      "content": "<|reserved_special_token_58|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128064": {
      "content": "<|reserved_special_token_59|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128065": {
      "content": "<|reserved_special_token_60|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128066": {
      "content": "<|reserved_special_token_61|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128067": {
      "content": "<|reserved_special_token_62|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128068": {
      "content": "<|reserved_special_token_63|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128069": {
      "content": "<|reserved_special_token_64|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128070": {
      "content": "<|reserved_special_token_65|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128071": {
      "content": "<|reserved_special_token_66|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128072": {
      "content": "<|reserved_special_token_67|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128073": {
      "content": "<|reserved_special_token_68|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128074": {
      "content": "<|reserved_special_token_69|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128075": {
      "content": "<|reserved_special_token_70|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128076": {
      "content": "<|reserved_special_token_71|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128077": {
      "content": "<|reserved_special_token_72|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128078": {
      "content": "<|reserved_special_token_73|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128079": {
      "content": "<|reserved_special_token_74|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128080": {
      "content": "<|reserved_special_token_75|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128081": {
      "content": "<|reserved_special_token_76|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128082": {
      "content": "<|reserved_special_token_77|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128083": {
      "content": "<|reserved_special_token_78|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128084": {
      "content": "<|reserved_special_token_79|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128085": {
      "content": "<|reserved_special_token_80|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128086": {
      "content": "<|reserved_special_token_81|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128087": {
      "content": "<|reserved_special_token_82|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128088": {
      "content": "<|reserved_special_token_83|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128089": {
      "content": "<|reserved_special_token_84|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128090": {
      "content": "<|reserved_special_token_85|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128091": {
      "content": "<|reserved_special_token_86|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128092": {
      "content": "<|reserved_special_token_87|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128093": {
      "content": "<|reserved_special_token_88|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128094": {
      "content": "<|reserved_special_token_89|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128095": {
      "content": "<|reserved_special_token_90|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128096": {
      "content": "<|reserved_special_token_91|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128097": {
      "content": "<|reserved_special_token_92|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128098": {
      "content": "<|reserved_special_token_93|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128099": {
      "content": "<|reserved_special_token_94|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128100": {
      "content": "<|reserved_special_token_95|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128101": {
      "content": "<|reserved_special_token_96|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128102": {
      "content": "<|reserved_special_token_97|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128103": {
      "content": "<|reserved_special_token_98|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128104": {
      "content": "<|reserved_special_token_99|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128105": {
      "content": "<|reserved_special_token_100|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128106": {
      "content": "<|reserved_special_token_101|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128107": {
      "content": "<|reserved_special_token_102|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128108": {
      "content": "<|reserved_special_token_103|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128109": {
      "content": "<|reserved_special_token_104|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128110": {
      "content": "<|reserved_special_token_105|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128111": {
      "content": "<|reserved_special_token_106|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128112": {
      "content": "<|reserved_special_token_107|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128113": {
      "content": "<|reserved_special_token_108|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128114": {
      "content": "<|reserved_special_token_109|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128115": {
      "content": "<|reserved_special_token_110|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128116": {
      "content": "<|reserved_special_token_111|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128117": {
      "content": "<|reserved_special_token_112|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128118": {
      "content": "<|reserved_special_token_113|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128119": {
      "content": "<|reserved_special_token_114|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128120": {
      "content": "<|reserved_special_token_115|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128121": {
      "content": "<|reserved_special_token_116|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128122": {
      "content": "<|reserved_special_token_117|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128123": {
      "content": "<|reserved_special_token_118|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128124": {
      "content": "<|reserved_special_token_119|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128125": {
      "content": "<|reserved_special_token_120|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128126": {
      "content": "<|reserved_special_token_121|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128127": {
      "content": "<|reserved_special_token_122|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128128": {
      "content": "<|reserved_special_token_123|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128129": {
      "content": "<|reserved_special_token_124|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128130": {
      "content": "<|reserved_special_token_125|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128131": {
      "content": "<|reserved_special_token_126|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128132": {
      "content": "<|reserved_special_token_127|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128133": {
      "content": "<|reserved_special_token_128|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128134": {
      "content": "<|reserved_special_token_129|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128135": {
      "content": "<|reserved_special_token_130|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128136": {
      "content": "<|reserved_special_token_131|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128137": {
      "content": "<|reserved_special_token_132|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128138": {
      "content": "<|reserved_special_token_133|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128139": {
      "content": "<|reserved_special_token_134|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128140": {
      "content": "<|reserved_special_token_135|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128141": {
      "content": "<|reserved_special_token_136|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128142": {
      "content": "<|reserved_special_token_137|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128143": {
      "content": "<|reserved_special_token_138|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128144": {
      "content": "<|reserved_special_token_139|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128145": {
      "content": "<|reserved_special_token_140|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128146": {
      "content": "<|reserved_special_token_141|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128147": {
      "content": "<|reserved_special_token_142|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128148": {
      "content": "<|reserved_special_token_143|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128149": {
      "content": "<|reserved_special_token_144|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128150": {
      "content": "<|reserved_special_token_145|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128151": {
      "content": "<|reserved_special_token_146|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128152": {
      "content": "<|reserved_special_token_147|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128153": {
      "content": "<|reserved_special_token_148|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128154": {
      "content": "<|reserved_special_token_149|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128155": {
      "content": "<|reserved_special_token_150|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128156": {
      "content": "<|reserved_special_token_151|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128157": {
      "content": "<|reserved_special_token_152|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128158": {
      "content": "<|reserved_special_token_153|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128159": {
      "content": "<|reserved_special_token_154|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128160": {
      "content": "<|reserved_special_token_155|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128161": {
      "content": "<|reserved_special_token_156|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128162": {
      "content": "<|reserved_special_token_157|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128163": {
      "content": "<|reserved_special_token_158|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128164": {
      "content": "<|reserved_special_token_159|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128165": {
      "content": "<|reserved_special_token_160|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128166": {
      "content": "<|reserved_special_token_161|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128167": {
      "content": "<|reserved_special_token_162|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128168": {
      "content": "<|reserved_special_token_163|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128169": {
      "content": "<|reserved_special_token_164|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128170": {
      "content": "<|reserved_special_token_165|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128171": {
      "content": "<|reserved_special_token_166|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128172": {
      "content": "<|reserved_special_token_167|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128173": {
      "content": "<|reserved_special_token_168|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128174": {
      "content": "<|reserved_special_token_169|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128175": {
      "content": "<|reserved_special_token_170|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128176": {
      "content": "<|reserved_special_token_171|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128177": {
      "content": "<|reserved_special_token_172|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128178": {
      "content": "<|reserved_special_token_173|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128179": {
      "content": "<|reserved_special_token_174|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128180": {
      "content": "<|reserved_special_token_175|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128181": {
      "content": "<|reserved_special_token_176|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128182": {
      "content": "<|reserved_special_token_177|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128183": {
      "content": "<|reserved_special_token_178|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128184": {
      "content": "<|reserved_special_token_179|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128185": {
      "content": "<|reserved_special_token_180|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128186": {
      "content": "<|reserved_special_token_181|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128187": {
      "content": "<|reserved_special_token_182|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128188": {
      "content": "<|reserved_special_token_183|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128189": {
      "content": "<|reserved_special_token_184|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128190": {
      "content": "<|reserved_special_token_185|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128191": {
      "content": "<|reserved_special_token_186|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128192": {
      "content": "<|reserved_special_token_187|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128193": {
      "content": "<|reserved_special_token_188|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128194": {
      "content": "<|reserved_special_token_189|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128195": {
      "content": "<|reserved_special_token_190|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128196": {
      "content": "<|reserved_special_token_191|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128197": {
      "content": "<|reserved_special_token_192|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128198": {
      "content": "<|reserved_special_token_193|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128199": {
      "content": "<|reserved_special_token_194|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128200": {
      "content": "<|reserved_special_token_195|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128201": {
      "content": "<|reserved_special_token_196|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128202": {
      "content": "<|reserved_special_token_197|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128203": {
      "content": "<|reserved_special_token_198|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128204": {
      "content": "<|reserved_special_token_199|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128205": {
      "content": "<|reserved_special_token_200|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128206": {
      "content": "<|reserved_special_token_201|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128207": {
      "content": "<|reserved_special_token_202|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128208": {
      "content": "<|reserved_special_token_203|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128209": {
      "content": "<|reserved_special_token_204|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128210": {
      "content": "<|reserved_special_token_205|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128211": {
      "content": "<|reserved_special_token_206|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128212": {
      "content": "<|reserved_special_token_207|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128213": {
      "content": "<|reserved_special_token_208|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128214": {
      "content": "<|reserved_special_token_209|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128215": {
      "content": "<|reserved_special_token_210|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128216": {
      "content": "<|reserved_special_token_211|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128217": {
      "content": "<|reserved_special_token_212|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128218": {
      "content": "<|reserved_special_token_213|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128219": {
      "content": "<|reserved_special_token_214|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128220": {
      "content": "<|reserved_special_token_215|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128221": {
      "content": "<|reserved_special_token_216|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128222": {
      "content": "<|reserved_special_token_217|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128223": {
      "content": "<|reserved_special_token_218|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128224": {
      "content": "<|reserved_special_token_219|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128225": {
      "content": "<|reserved_special_token_220|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128226": {
      "content": "<|reserved_special_token_221|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128227": {
      "content": "<|reserved_special_token_222|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128228": {
      "content": "<|reserved_special_token_223|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128229": {
      "content": "<|reserved_special_token_224|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128230": {
      "content": "<|reserved_special_token_225|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128231": {
      "content": "<|reserved_special_token_226|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128232": {
      "content": "<|reserved_special_token_227|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128233": {
      "content": "<|reserved_special_token_228|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128234": {
      "content": "<|reserved_special_token_229|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128235": {
      "content": "<|reserved_special_token_230|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128236": {
      "content": "<|reserved_special_token_231|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128237": {
      "content": "<|reserved_special_token_232|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128238": {
      "content": "<|reserved_special_token_233|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128239": {
      "content": "<|reserved_special_token_234|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128240": {
      "content": "<|reserved_special_token_235|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128241": {
      "content": "<|reserved_special_token_236|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128242": {
      "content": "<|reserved_special_token_237|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128243": {
      "content": "<|reserved_special_token_238|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128244": {
      "content": "<|reserved_special_token_239|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128245": {
      "content": "<|reserved_special_token_240|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128246": {
      "content": "<|reserved_special_token_241|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128247": {
      "content": "<|reserved_special_token_242|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128248": {
      "content": "<|reserved_special_token_243|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128249": {
      "content": "<|reserved_special_token_244|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128250": {
      "content": "<|reserved_special_token_245|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128251": {
      "content": "<|reserved_special_token_246|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128252": {
      "content": "<|reserved_special_token_247|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128253": {
      "content": "<|reserved_special_token_248|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128254": {
      "content": "<|reserved_special_token_249|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "128255": {
      "content": "<|reserved_special_token_250|>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<|begin_of_text|>",
  "chat_template": "{% set loop_messages = messages %}{% for message in loop_messages %}{% set content = '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' %}{% if loop.index0 == 0 %}{% set content = bos_token + content %}{% endif %}{{ content }}{% endfor %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|end_of_text|>",
  "model_input_names": [
    "input_ids",
    "attention_mask"
  ],
  "model_max_length": 1000000000000000019884624838656,
  "tokenizer_class": "PreTrainedTokenizerFast"
}

@pcuenca
Contributor Author

pcuenca commented Apr 18, 2024

I can't convert 70b on this

@USBhost did you try with convert-hf-to-gguf.py?

@jxy
Contributor

jxy commented Apr 18, 2024

The instruct models need the tokenizer.ggml.eos_token_id to be 128009, or <|eot_id|>.

@USBhost

USBhost commented Apr 18, 2024

I can't convert 70b on this

@USBhost did you try with convert-hf-to-gguf.py?

python convert-hf-to-gguf.py /mnt/36TB/AI/Meta-Llama-3-70B/ --outtype f16
Loading model: Meta-Llama-3-70B
gguf: This GGUF file is for Little Endian only
Set model parameters
gguf: context length = 8192
gguf: embedding length = 8192
gguf: feed forward length = 28672
gguf: head count = 64
gguf: key-value head count = 8
gguf: rope theta = 500000.0
gguf: rms norm epsilon = 1e-05
gguf: file type = 1
Set model tokenizer
Traceback (most recent call last):
  File "/home/usbhost/llama.cpp/convert-hf-to-gguf.py", line 1302, in set_vocab
    self._set_vocab_sentencepiece()
  File "/home/usbhost/llama.cpp/convert-hf-to-gguf.py", line 330, in _set_vocab_sentencepiece
    raise FileNotFoundError(f"File not found: {tokenizer_path}")
FileNotFoundError: File not found: /mnt/36TB/AI/Meta-Llama-3-70B/tokenizer.model

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/usbhost/llama.cpp/convert-hf-to-gguf.py", line 2736, in <module>
    main()
  File "/home/usbhost/llama.cpp/convert-hf-to-gguf.py", line 2723, in main
    model_instance.set_vocab()
  File "/home/usbhost/llama.cpp/convert-hf-to-gguf.py", line 1305, in set_vocab
    self._set_vocab_llama_hf()
  File "/home/usbhost/llama.cpp/convert-hf-to-gguf.py", line 377, in _set_vocab_llama_hf
    vocab = LlamaHfVocab(self.dir_model)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/usbhost/llama.cpp/convert.py", line 539, in __init__
    raise FileNotFoundError('Cannot find Llama BPE tokenizer')
FileNotFoundError: Cannot find Llama BPE tokenizer

When I add https://huggingface.co/meta-llama/Meta-Llama-3-70B/blob/main/original/tokenizer.model, I get the same error as with convert.py.

@pcuenca
Contributor Author

pcuenca commented Apr 18, 2024

Doesn't seem that the eos_token is working with either of the convert scripts in this PR

@m18coppola the instruct models use two different EOS tokens: the standard one (<|end_of_text|>), and a second one that signals the end of the assistant turn (<|eot_id|>). Generation must stop when either one is encountered.

I'm not sure how to replicate this behaviour yet. The best solution would be to use a list of eos/stop tokens, but I don't know how to do it. Any suggestions on where to look?

Another idea would be to use <|eot_id|> (the assistant finalization token) as the only EOS when converting an instruct model, and <|end_of_text|> when converting a pre-trained model.
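
The second idea can be sketched as a tiny helper. This is only an illustration, assuming the token IDs quoted in this thread (128001 for <|end_of_text|>, 128009 for <|eot_id|>); the function name `pick_eos_token_id` is hypothetical and is not part of either convert script.

```python
# Token IDs as quoted in this thread's tokenizer config and comments.
END_OF_TEXT_ID = 128001  # <|end_of_text|>, the standard EOS
EOT_ID = 128009          # <|eot_id|>, end of the assistant turn


def pick_eos_token_id(is_instruct: bool) -> int:
    """Choose the single EOS id to record at conversion time.

    Hypothetical helper: instruct models stop on <|eot_id|>,
    pre-trained (base) models stop on <|end_of_text|>.
    """
    return EOT_ID if is_instruct else END_OF_TEXT_ID
```

The trade-off is that a single EOS id cannot express "stop on either token", which is why a list of stop tokens is the more general option.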

@mchiang0610

@pcuenca for the changes:

"special": false on <|start_header_id|> <|end_header_id|> <|eot_id|>

@pcuenca
Contributor Author

pcuenca commented Apr 18, 2024

The instruct models need the tokenizer.ggml.eos_token_id to be 128009, or <|eot_id|>.

@jxy Our comments were sent at the same time :) Yes, that's one of the solutions I mentioned, but I'm not sure it will work consistently. I've seen models that use various terminators depending on context.

We can try it out though, I'll take a look.

@USBhost

USBhost commented Apr 18, 2024

Sorry lads, I had to run with --vocab-type bpe.
So automatic detection is broken.

@arch-btw arch-btw mentioned this pull request Apr 18, 2024
@ddh0
Contributor

ddh0 commented Apr 18, 2024

The instruct models need the tokenizer.ggml.eos_token_id to be 128009, or <|eot_id|>.

@jxy Our comments were sent at the same time :) Yes, that's one of the solutions I mentioned, but I'm not sure it will work consistently. I've seen models that use various terminators depending on context.

We can try it out though, I'll take a look.

From the model card on HF:

terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

Not sure if this is helpful or not 😅 but thought I might as well mention it.

@jxy
Contributor

jxy commented Apr 18, 2024

It seems the model generates <|eot_id|> with the official chat template. Otherwise it may generate <|end_of_text|>.

@teleprint-me
Contributor

teleprint-me commented Apr 18, 2024

It's always the tokenizer. The tokenizers are always a mess.

Special tokens apply to the instruct tuned model.

The ChatFormat class in the source code shows how they implemented it.

The encode_header is interesting. That's a new one? Then they have encode_message and encode_dialog_prompt.

They're using tiktoken for the Tokenizer.

Lots of new special tokens.

        special_tokens = [
            "<|begin_of_text|>",
            "<|end_of_text|>",
            "<|reserved_special_token_0|>",
            "<|reserved_special_token_1|>",
            "<|reserved_special_token_2|>",
            "<|reserved_special_token_3|>",
            "<|start_header_id|>",
            "<|end_header_id|>",
            "<|reserved_special_token_4|>",
            "<|eot_id|>",  # end of turn
        ] + [
            f"<|reserved_special_token_{i}|>"
            for i in range(5, self.num_reserved_special_tokens - 5)
        ]
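
As a rough check of how that list maps to the IDs discussed in this thread, here is a minimal sketch. It assumes the special tokens are numbered consecutively right after a 128000-entry base vocabulary and that `num_reserved_special_tokens` is 256; both values are assumptions chosen to agree with the n_vocab=128256 and the token IDs shown elsewhere in this conversation.

```python
NUM_BASE_TOKENS = 128_000          # assumption: size of the base BPE vocabulary
NUM_RESERVED_SPECIAL_TOKENS = 256  # assumption: 128000 + 256 = 128256 = n_vocab

# Same layout as the list quoted above.
special_tokens = [
    "<|begin_of_text|>",
    "<|end_of_text|>",
    "<|reserved_special_token_0|>",
    "<|reserved_special_token_1|>",
    "<|reserved_special_token_2|>",
    "<|reserved_special_token_3|>",
    "<|start_header_id|>",
    "<|end_header_id|>",
    "<|reserved_special_token_4|>",
    "<|eot_id|>",  # end of turn
] + [
    f"<|reserved_special_token_{i}|>"
    for i in range(5, NUM_RESERVED_SPECIAL_TOKENS - 5)
]

# Each special token gets the next free id after the base vocabulary.
special_token_ids = {
    token: NUM_BASE_TOKENS + offset for offset, token in enumerate(special_tokens)
}

print(special_token_ids["<|eot_id|>"])
print(len(special_tokens))
```

Under these assumptions <|eot_id|> sits at offset 9, which matches the 128009 mentioned above, and the last reserved token lines up with the final entry of the added_tokens_decoder map.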

This should be interesting (and not in a fun way either). This is gonna create another level of complexity.

@bullno1
Contributor

bullno1 commented Apr 18, 2024

Doesn't seem that the eos_token is working with either of the convert scripts in this PR

@m18coppola the instruct models use two different EOS tokens: the standard one (<|end_of_text|>), and a second one that signals the end of the assistant turn (<|eot_id|>). Generation must stop when either one is encountered.

I'm not sure how to replicate this behaviour yet. The best solution would be to use a list of eos/stop tokens, but I don't know how to do it. Any suggestions on where to look?

Another idea would be to use <|eot_id|> (the assistant finalization token) as the only EOS when converting an instruct model, and <|end_of_text|> when converting a pre-trained model.

Instead of remapping, which creates more confusion, just update the generation code to stop on eot_id.
It's roughly a one-line config/code change.

At least from my cursory tests, all special texts are tokenized properly out of the box.

I did a bit of testing and chat works.

@teleprint-me
Contributor

teleprint-me commented Apr 18, 2024

Okay, it's in there.

        # BOS / EOS token IDs
        self.bos_id: int = self.special_tokens["<|begin_of_text|>"]
        self.eos_id: int = self.special_tokens["<|end_of_text|>"]
        self.pad_id: int = -1
        self.stop_tokens = {
            self.special_tokens["<|end_of_text|>"],
            self.special_tokens["<|eot_id|>"],
        }

@pcuenca The list of stop tokens is usually added during inference. Chat templates have recently been embedded into llama.cpp. I haven't gotten that far yet, though.

I think I get it now.

Completions:

<|end_of_text|>

Instructions:

<|eot_id|>

That's how I'm interpreting it at the moment. Feel free to correct me.
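
To make the stop-set idea concrete, here is a minimal sketch of cutting a token stream at the first stop token. It is plain Python, not llama.cpp code; the helper name `take_until_stop` is hypothetical, and the IDs are the ones quoted in this thread.

```python
from typing import Iterable, Iterator

# <|end_of_text|> and <|eot_id|>, using the IDs quoted in this thread.
STOP_TOKENS = frozenset({128001, 128009})


def take_until_stop(
    token_ids: Iterable[int], stop_tokens: frozenset = STOP_TOKENS
) -> Iterator[int]:
    """Yield generated token ids until any stop token appears (hypothetical helper)."""
    for token_id in token_ids:
        if token_id in stop_tokens:
            return  # stop on either terminator, without emitting it
        yield token_id


# Example: a fake stream where the model ends its turn with <|eot_id|>.
generated = [9906, 1917, 128009, 42]
print(list(take_until_stop(generated)))
```

Checking membership in a set keeps the loop the same whether the model ends with <|end_of_text|> (completions) or <|eot_id|> (instructions).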

@bullno1
Contributor

bullno1 commented Apr 18, 2024

@teleprint-me Yep, you just have to stop on eot_id instead, which is 128009.

You can use the tokenization tool to test: https://github.com/ggerganov/llama.cpp/blob/master/examples/tokenize/tokenize.cpp

<|begin_of_text|>, <|start_header_id|>, <|end_header_id|>, <|eot_id|> are all mapped correctly.

@dranger003
Contributor

This appears to work for chatting with the model (instruct):

./build/bin/main -ngl 33 -c 0 --interactive-first --color -e --in-prefix '<|start_header_id|>user<|end_header_id|>\n\n' --in-suffix '<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n' -r '<|eot_id|>' -m ggml-meta-llama-3-8b-instruct-f16.gguf
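
The prefix and suffix in that command mirror the chat_template from the tokenizer config posted earlier in this thread. As a sketch of what that template produces, here is a plain-Python equivalent; the function name `format_llama3_prompt` is hypothetical, and the logic is a hand translation of the Jinja template rather than code taken from llama.cpp or transformers.

```python
from typing import Dict, List


def format_llama3_prompt(
    messages: List[Dict[str, str]], bos_token: str = "<|begin_of_text|>"
) -> str:
    """Render messages the way the quoted chat_template does (hypothetical helper)."""
    parts = []
    for index, message in enumerate(messages):
        # Header with the role, a blank line, the trimmed content, then end of turn.
        content = (
            "<|start_header_id|>" + message["role"] + "<|end_header_id|>\n\n"
            + message["content"].strip()
            + "<|eot_id|>"
        )
        if index == 0:
            # Only the first message is prefixed with the BOS token.
            content = bos_token + content
        parts.append(content)
    # The template always ends by opening an assistant turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


print(format_llama3_prompt([{"role": "user", "content": "Hello!"}]))
```

Every turn ends with <|eot_id|>, which is why the command above uses it both in the input suffix and as the reverse prompt.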

@teleprint-me
Contributor

teleprint-me commented Apr 18, 2024

In the original convert.py, before the refactoring, BpeVocab would scan the first line if it could and assume it was a plaintext format.

Now it assumes the huggingface BPE format instead of a general BPE implementation, as it did originally. These changes repeatedly break convert.py, even though any huggingface "features" should be isolated to convert-hf-to-gguf.py or to the HfVocab class... See the referenced PRs below, which keep pushing these changes over time.

The current implementation for convert.py with BpeVocab now solely relies upon the huggingface format, which is blocking the conversion process for the torch Llama 3 model.

17:46:36 | /mnt/valerie/remote/ggerganov/llama.cpp
(.venv) git:(llama3-conversion | θ) λ python convert.py --vocab-type bpe /mnt/valerie/models/meta-llama/Meta-Llama-3-8B-Instruct
Loading model file /mnt/valerie/models/meta-llama/Meta-Llama-3-8B-Instruct/consolidated.00.pth
params = Params(n_vocab=128256, n_embd=4096, n_layer=32, n_ctx=4096, n_ff=14336, n_head=32, n_head_kv=8, n_experts=None, n_experts_used=None, f_norm_eps=1e-05, rope_scaling_type=None, f_rope_freq_base=500000.0, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=None, path_model=PosixPath('/mnt/valerie/models/meta-llama/Meta-Llama-3-8B-Instruct'))
Traceback (most recent call last):
  File "/mnt/valerie/remote/ggerganov/llama.cpp/convert.py", line 1555, in <module>
    main()
  File "/mnt/valerie/remote/ggerganov/llama.cpp/convert.py", line 1522, in main
    vocab, special_vocab = vocab_factory.load_vocab(vocab_types, model_parent_path)
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/valerie/remote/ggerganov/llama.cpp/convert.py", line 1424, in load_vocab
    vocab = self._create_vocab_by_path(vocab_types)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/valerie/remote/ggerganov/llama.cpp/convert.py", line 1414, in _create_vocab_by_path
    raise FileNotFoundError(f"Could not find a tokenizer matching any of {vocab_types}")
FileNotFoundError: Could not find a tokenizer matching any of ['bpe']

The issue propagates from the BpeVocab constructor, which assumes that vocab.json and added_tokens.json are present, on the assumption that the model is a huggingface model. The facebook/meta models are not: they're PyTorch models built without the transformers framework.

These issues are not related to this PR but are now affecting it.

BPE tokenizer implementations keep me up at night.

Rant aside, the BpeVocab is assuming the huggingface "fast" or "slow" tokenizer, but this is neither.

It should be noted that Llama 1 and Llama 2 used sentencepiece models. Llama 3 uses a tiktoken implementation and was released with a plaintext BPE "model" file.

17:51:49 | /mnt/valerie/models/meta-llama
  λ file Meta-Llama-3-8B-Instruct/tokenizer.model 
Meta-Llama-3-8B-Instruct/tokenizer.model: ASCII text
17:52:05 | /mnt/valerie/models/meta-llama
  λ file /mnt/scsm/models/facebook/llama-2/llama-2-7b/tokenizer.model 
/mnt/scsm/models/facebook/llama-2/llama-2-7b/tokenizer.model: data
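
The same distinction can be sketched in Python. This is a heuristic only, under the assumption that a tiktoken-style file holds one "base64-token rank" pair per line while a sentencepiece model is binary data; the helper name `looks_like_tiktoken_file` is hypothetical and is not part of convert.py.

```python
import base64
import binascii


def looks_like_tiktoken_file(path: str) -> bool:
    """Guess whether tokenizer.model is a plaintext tiktoken-style BPE file.

    Heuristic (an assumption, not a format check): the first line must be
    exactly two fields, a valid base64 token followed by an integer rank.
    """
    with open(path, "rb") as handle:
        first_line = handle.readline()
    fields = first_line.split()
    if len(fields) != 2:
        return False
    token_b64, rank = fields
    try:
        base64.b64decode(token_b64, validate=True)  # raises on non-base64 bytes
        int(rank)                                   # raises if the rank is not a number
    except (binascii.Error, ValueError):
        return False
    return True
```

A check like this only inspects the first line, so it is cheap, but it can be fooled by unusual files and should be treated as a hint rather than a guarantee.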

@kalomaze
Contributor

kalomaze commented Apr 19, 2024

cc @dranger003 - I really appreciated your ppl chart visual + measured ppl gap table for different quantization types for CommandR+. Would you be willing to recreate those comparisons on L3 70b (base or Instruct, preferably base)? Thanks

@bullno1
Contributor

bullno1 commented Apr 19, 2024

@teleprint-me Are you saying that it's a happy coincidence that the current llama.cpp implementation happens to tokenize correctly, or that there exist character sequences out there that will be tokenized incorrectly?

@ggerganov
Member

Does anyone have convert instructions that work - I'm trying both Meta and HF models using this PR and none of the convert scripts work:

$ python3.11 convert.py ~/Data/llama3/Meta-Llama-3-8B/ --outfile ./models/llama-8b-v3/ggml-model-f16.gguf --outtype f16 --vocab-type bpe

FileNotFoundError: Could not find a tokenizer matching any of ['bpe']
$ python3.11 convert-hf-to-gguf.py ~/Data/huggingface/Meta-Llama-3-8B/ --outfile ./models/llama-8b-v3/ggml-model-f16.gguf --outtype f16

FileNotFoundError: File not found: /Users/ggerganov/Data/huggingface/Meta-Llama-3-8B/tokenizer.model

I see a few other people reporting the same problems. Those who succeeded - what were the necessary changes?

@ddh0
Contributor

ddh0 commented Apr 19, 2024

@ggerganov Try --vocab-type bpe with convert-hf-to-gguf.py, that worked for me

@ggerganov
Member

@ddh0 The convert-hf-to-gguf.py script does not support the --vocab-type argument:

$ python3.11 convert-hf-to-gguf.py ~/Data/huggingface/Meta-Llama-3-8B/ --outfile ./models/llama-8b-v3/ggml-model-f16.gguf --outtype f16 --vocab-type bpe

usage: convert-hf-to-gguf.py [-h] [--vocab-only] [--awq-path AWQ_PATH] [--outfile OUTFILE] [--outtype {f32,f16}] [--bigendian] [--use-temp-file] model
convert-hf-to-gguf.py: error: unrecognized arguments: --vocab-type bpe

@dranger003
Contributor

@ggerganov These are working on my end.

convert.py works using master and this PR:

python convert.py /models/hub/Meta-Llama-3-8B/ --outfile /models/meta-llama/ggml-meta-llama-3-8b-f16.gguf --vocab-type bpe --outtype f16

convert-hf-to-gguf.py only works using this PR:

python convert-hf-to-gguf.py /models/hub/Meta-Llama-3-8B/ --outfile /models/meta-llama/ggml-meta-llama-3-8b-f16.gguf

@XiongjieDai

XiongjieDai commented May 2, 2024

Sorry for bothering you guys. It was just a missing slash in the path... Thank you for your prompt reply!

@teleprint-me
Contributor

Trust me, you're not alone 😅. I don't know how many times I've been stymied by a '/' or a '\'.

@oldgithubman

@abasu0713 @XiongjieDai is using the wrong script.

python3 convert.py

Should use HF script instead.

python convert-hf-to-gguf.py

Use the HF model too, not the one distributed from meta.

https://huggingface.co/meta-llama/Meta-Llama-3-8B

It will work afterwards.

The documentation really needs to be better. How many resources are being wasted because the documentation is telling people to use convert.py? I know it just cost me about a day. Why isn't there just one interface anyway? Very confusing. I've been making my own quants for months now and I still don't know which one to use when. There should be only one interface and the documentation should be up-to-date and accurate. Crazy ideas, I know

@LostRuins
Collaborator

Actually I've been wondering, what's the purpose of the convert.py script? If the hf one does everything needed, should convert.py be removed?

@oldgithubman

Actually I've been wondering, what's the purpose of the convert.py script? If the hf one does everything needed, should convert.py be removed?

Well, according to the main readme, that's the only one you should even know about and use. I feel like I'm taking crazy pills

@dranger003
Contributor

cc @dranger003 - I really appreciated your ppl chart visual + measured ppl gap table for different quantization types for CommandR+. Do you think you would be willing to recreate those comparisons on L3 70b (base or Instruct, preferably base?) Thanks

Thanks, sorry for the late response, just saw this one.
I'll see if I can find some time to do it.

@teleprint-me
Contributor

teleprint-me commented May 2, 2024

The purpose of the convert.py script is to partially load tensors dynamically as the conversion occurs. This reduces memory usage during the conversion process.

Normally, the entire model's weights are loaded, which is very RAM intensive. So that's why it exists. It should be easy to understand why this is valuable to have.
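
To make the idea of loading tensors on demand concrete, here is a minimal sketch. It is not convert.py's actual code; `LazyTensor` and `convert_lazily` are hypothetical names, and raw `bytes` stand in for real tensor data. The point is only that each tensor is materialized when it is needed rather than all at once.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Tuple


@dataclass
class LazyTensor:
    """A tensor whose bytes are read only when load() is called."""
    name: str
    load: Callable[[], bytes]


def convert_lazily(tensors: List[LazyTensor]) -> Iterator[Tuple[str, int]]:
    """Materialize one tensor at a time and yield (name, size in bytes)."""
    for tensor in tensors:
        data = tensor.load()          # only this tensor's bytes are loaded here
        yield tensor.name, len(data)  # stand-in for writing it to the output file
        # `data` is rebound on the next iteration, so the old bytes can be freed


if __name__ == "__main__":
    # Fake tensors of increasing size; nothing is allocated until load() runs.
    fake = [LazyTensor(f"layer.{i}", lambda i=i: bytes(1024 * (i + 1))) for i in range(3)]
    for name, size in convert_lazily(fake):
        print(name, size)
```

With this shape, peak memory is driven by the largest single tensor rather than by the sum of all tensors, assuming the loader itself does not read the whole checkpoint up front.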

@bartowski1182
Contributor

The purpose of the convert.py script is to partially load tensors dynamically as the conversion occurs. This reduces memory usage during the conversion process.

Normally, the entire model's weights are loaded, which is very RAM intensive. So that's why it exists. It should be easy to understand why this is valuable to have.

Is this true..? I need the full amount of RAM to load models when using convert-hf-to-gguf, multiple hundred GB for the biggest ones

@teleprint-me
Contributor

teleprint-me commented May 2, 2024

It depends on the model. Load a raw torch model (like a 7B one) and watch as it begins to consume about 40 GB of RAM. That's a lot of RAM. Just to load the model!

I think clarifying the script's name would probably help with the confusion. Perhaps convert-torch.py would be more appropriate.

@bartowski1182
Contributor

I load raw 7B models with only like 20gb of VRAM :S are you loading in FP32?
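
A back-of-the-envelope check of the numbers being discussed, as a minimal sketch: it counts weights only and assumes a nominal 7e9 parameters, so real usage during loading may be higher (temporary copies, framework overhead). `weight_memory_gib` is a hypothetical helper, not something from llama.cpp.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "bf16": 2}


def weight_memory_gib(n_params: float, dtype: str) -> float:
    """Memory for the weights alone, in GiB (2**30 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in ("fp32", "fp16", "bf16"):
        print(f"7B in {dtype}: {weight_memory_gib(7e9, dtype):.1f} GiB")
```

Under these assumptions FP32 weights take twice the memory of FP16 or BF16 weights, which is consistent with the gap between the two figures quoted above being largely a dtype question.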

@teleprint-me
Contributor

Well, the other option would be bfloat, or half, right? Quants weren't as popular and as widely available when I originally tested it. This was about 2 years ago... wow, time flies. 💀

@oldgithubman

I think clarifying the script's name would probably help with the confusion. Perhaps convert-torch.py would be more appropriate.

A step in the right direction. Why not merge the scripts into a unified convert.py?

@dranger003
Contributor

@kalomaze Here it is. This uses a 400-chunk imatrix computed on wiki.train for quants below Q6_K.

| Quantization | Size (GiB) | Perplexity (wiki.test) | Delta (FP16) |
|---|---|---|---|
| IQ1_S | 14.29 | 9.8655 +/- 0.0625 | 248.51% |
| IQ1_M | 15.60 | 8.5193 +/- 0.0530 | 200.95% |
| IQ2_XXS | 17.79 | 6.6705 +/- 0.0405 | 135.64% |
| IQ2_XS | 19.69 | 5.7486 +/- 0.0334 | 103.07% |
| IQ2_S | 20.71 | 5.5215 +/- 0.0318 | 95.05% |
| Q2_K_S | 22.79 | 5.4334 +/- 0.0325 | 91.94% |
| IQ2_M | 22.46 | 4.8959 +/- 0.0276 | 72.95% |
| Q2_K | 24.56 | 4.7763 +/- 0.0274 | 68.73% |
| IQ3_XXS | 25.58 | 3.9671 +/- 0.0211 | 40.14% |
| IQ3_XS | 27.29 | 3.7210 +/- 0.0191 | 31.45% |
| Q3_K_S | 28.79 | 3.6502 +/- 0.0192 | 28.95% |
| IQ3_S | 28.79 | 3.4698 +/- 0.0174 | 22.57% |
| IQ3_M | 29.74 | 3.4402 +/- 0.0171 | 21.53% |
| Q3_K_M | 31.91 | 3.3617 +/- 0.0172 | 18.75% |
| Q3_K_L | 34.59 | 3.3016 +/- 0.0168 | 16.63% |
| IQ4_XS | 35.30 | 3.0310 +/- 0.0149 | 7.07% |
| IQ4_NL | 37.30 | 3.0261 +/- 0.0149 | 6.90% |
| Q4_K_S | 37.58 | 3.0050 +/- 0.0148 | 6.15% |
| Q4_K_M | 39.60 | 2.9674 +/- 0.0146 | 4.83% |
| Q5_K_S | 45.32 | 2.8843 +/- 0.0141 | 1.89% |
| Q5_K_M | 46.52 | 2.8656 +/- 0.0139 | 1.23% |
| Q6_K | 53.91 | 2.8441 +/- 0.0138 | 0.47% |
| Q8_0 | 69.83 | 2.8316 +/- 0.0138 | 0.03% |
| F16 | 131.43 | 2.8308 +/- 0.0138 | 0.00% |

ggml-meta-llama-3-70b-ppl
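
The Delta (FP16) column appears to be the relative perplexity increase over the F16 row. That reading is an inference from the numbers, not something stated in the comment. A minimal sketch that recomputes it for a few rows, with `ppl_delta_percent` as a hypothetical helper name:

```python
def ppl_delta_percent(ppl: float, ppl_f16: float) -> float:
    """Relative perplexity increase over the F16 baseline, in percent."""
    return (ppl / ppl_f16 - 1.0) * 100.0


F16_PPL = 2.8308  # F16 row of the table above

if __name__ == "__main__":
    # A few rows copied from the table: (quantization, perplexity).
    rows = [("IQ1_S", 9.8655), ("Q4_K_M", 2.9674), ("Q8_0", 2.8316)]
    for name, ppl in rows:
        print(f"{name}: {ppl_delta_percent(ppl, F16_PPL):.2f}%")
```

Rounded to two decimals, these match the table entries for the same rows (248.51%, 4.83%, 0.03%).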

@mapleroyal

Does anyone have convert instructions that work - I'm trying both Meta and HF models using this PR and none of the convert scripts work:

$ python3.11 convert.py ~/Data/llama3/Meta-Llama-3-8B/ --outfile ./models/llama-8b-v3/ggml-model-f16.gguf --outtype f16 --vocab-type bpe

FileNotFoundError: Could not find a tokenizer matching any of ['bpe']

I see a few other people reporting the same problems. Those who succeeded - what were the necessary changes?

Did anyone find a solution to this?

python convert.py /Users/user/ai_models/Meta-Llama-3-70B-Instruct --vocab-type bpe
produces
FileNotFoundError: Could not find a tokenizer matching any of ['bpe']

@dranger003
Contributor

@mapleroyal You should use convert-hf-to-gguf.py instead.

python convert-hf-to-gguf.py ./Meta-Llama-3-8B/ --outfile ggml-model-f16.gguf --outtype f16

@mapleroyal

@dranger003

@mapleroyal You should use convert-hf-to-gguf.py instead.

python convert-hf-to-gguf.py ./Meta-Llama-3-8B/ --outfile ggml-model-f16.gguf --outtype f16

Even though I'm using the original meta (i.e. non-hf) model?

@dranger003
Contributor

By original you mean the .pth? I don't think either of the convert scripts supports converting the .pth weights, but it should work fine on the safetensors from Meta on HF.
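
The advice in this thread can be summarized as a small check on the model directory. This is a minimal sketch that encodes only what was said here (safetensors from HF go through convert-hf-to-gguf.py, while the original .pth checkpoint reportedly works with neither script); `suggest_converter` is a hypothetical helper and not part of llama.cpp.

```python
from pathlib import Path


def suggest_converter(model_dir: Path) -> str:
    """Suggest a convert script based on the weight files present in model_dir."""
    has_safetensors = any(model_dir.glob("*.safetensors"))
    has_pth = any(model_dir.glob("*.pth"))

    if has_safetensors:
        # HF-format checkpoint: the route that worked for people in this thread.
        return "convert-hf-to-gguf.py"
    if has_pth:
        # Original Meta checkpoint: reported above as unsupported by either script.
        return "unsupported: .pth checkpoint, use the safetensors release from HF"
    return "unknown: no .safetensors or .pth files found"
```

For example, `suggest_converter(Path("./Meta-Llama-3-8B"))` would return `convert-hf-to-gguf.py` when that directory holds the HF safetensors shards.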

@mapleroyal

By original you mean the .pth? I don't think either of the convert scripts supports converting the .pth weights, but it should work fine on the safetensors from Meta on HF.

Yes, exactly. Ok, got it. Thank you.

@mofosyne added the "Review Complexity : High" and "enhancement" labels on May 10, 2024
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* Support Llama 3 conversion

The tokenizer is BPE.

* style

* Accept suggestion

Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>

* llama : add llama_token_is_eog()

ggml-ci

* llama : auto-detect more EOT tokens when missing in KV data

* convert : replacing EOS token is a hack

* llama : fix codegemma EOT token + add TODOs

* llama : fix model type string for 8B model

---------

Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026