Fine-tuning chatglm3-6b fails with "Could not locate the tokenization_chatglm.py inside THUDM/chatglm3-6b." #488

@fan2goa1

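I adapted the chatglm3-alpaca-qlora config file provided by xtuner so that it loads a local model and a custom dataset (the dataset format is fine). Fine-tuning internlm, mistral, and qwen the same way works without problems, but fine-tuning chatglm3-6b fails with: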

Could not locate the tokenization_chatglm.py inside THUDM/chatglm3-6b.
Traceback (most recent call last):
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/urllib3/connection.py", line 198, in _new_conn
    sock = connection.create_connection(
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable

Following a fix suggested online, I changed this entry in tokenizer_config.json:

"AutoTokenizer": [
  "THUDM/chatglm3-6b--tokenization_chatglm.ChatGLMTokenizer",
  null
]

to

"AutoTokenizer": [
  "tokenization_chatglm.ChatGLMTokenizer",
  null
]
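
The manual edit above can also be scripted. This is a minimal sketch, assuming the `AutoTokenizer` entry sits under an `auto_map` key in the local `tokenizer_config.json`. The helper names `strip_repo_prefix` and `patch_tokenizer_config` are hypothetical, and the model directory is a placeholder for wherever the local copy lives.

```python
import json
from pathlib import Path


def strip_repo_prefix(auto_map: dict) -> dict:
    """Return a copy of auto_map with any 'repo_id--' prefix removed."""

    def clean(ref):
        # Entries can be null (None) in the JSON, so only touch strings.
        if isinstance(ref, str) and "--" in ref:
            return ref.split("--", 1)[1]
        return ref

    cleaned = {}
    for name, value in auto_map.items():
        if isinstance(value, list):
            cleaned[name] = [clean(v) for v in value]
        else:
            cleaned[name] = clean(value)
    return cleaned


def patch_tokenizer_config(model_dir: str) -> None:
    """Rewrite tokenizer_config.json in place (assumes an 'auto_map' key)."""
    config_path = Path(model_dir) / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if "auto_map" in config:
        config["auto_map"] = strip_repo_prefix(config["auto_map"])
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )


# Example usage with a placeholder path:
# patch_tokenizer_config("/path/to/local/chatglm3-6b")
```

Only the part before `--` is dropped, so the entry then names just the module and class (`tokenization_chatglm.ChatGLMTokenizer`), matching the hand-edited version.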

After that change, a different error is raised:

Traceback (most recent call last):
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/xtuner/tools/train.py", line 307, in <module>
    main()
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/xtuner/tools/train.py", line 303, in main
    runner.train()
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 1160, in train
    self._train_loop = self.build_train_loop(
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 958, in build_train_loop
    loop = LOOPS.build(
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/xtuner/engine/runner/loops.py", line 32, in __init__
    dataloader = runner.build_dataloader(
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/mmengine/runner/_flexible_runner.py", line 824, in build_dataloader
    dataset = DATASETS.build(dataset_cfg)
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 121, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/root/anaconda3/envs/xt-new/lib/python3.10/site-packages/xtuner/dataset/huggingface.py", line 235, in process_hf_dataset
    dataset = process(*args, **kwargs)
TypeError: process() got an unexpected keyword argument 'ataset'
Exception ignored in atexit callback: <function matmul_ext_update_autotune_table at 0x7f6911ca11b0>
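
The final `TypeError` suggests that a key named `ataset` reached `process()` as a keyword argument, possibly a misspelled `dataset` key in the edited config. One way to check for such keys before launching training is sketched below. The `process` function here is a hypothetical stand-in with made-up parameters, since the traceback does not show the signature of xtuner's real function, and `unexpected_kwargs` is likewise a hypothetical helper.

```python
import inspect


def unexpected_kwargs(func, kwargs):
    """List the keys in kwargs that func cannot accept as keyword arguments."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter accepts any keyword, so nothing is unexpected.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    accepted = {
        name
        for name, p in params.items()
        if p.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    return sorted(key for key in kwargs if key not in accepted)


# Hypothetical stand-in, NOT xtuner's actual signature.
def process(dataset, tokenizer=None, max_length=2048):
    return dataset


cfg = {"ataset": "my_data", "tokenizer": None}
print(unexpected_kwargs(process, cfg))  # ['ataset']
```

Running the same check against the dataset section of the config would show whether the stray key comes from the config file itself.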
