Is there an existing issue for this bug?
🐛 Describe the bug
I got an error when running `applications/Colossal-LLaMA/prepare_sft_dataset.py`.

The script is:

```shell
python /mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py \
    --data_input_dirs "/mnt/data/dataset/llama3/prepare/original/2000items" \
    --tokenizer_dir "/mnt/data/model/modelscope/Meta-Llama-3-8B-Instruct" \
    --data_output_dirs "/mnt/data/dataset/llama3/prepare/2000items-llama3" \
    --max_length 1024 \
    --num_spliced_dataset_bins 10 \
    --llama_version 3
```
The error is:

```
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[07/19/24 16:52:06] INFO colossalai - colossalai - INFO: /mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py:102 main
                    INFO colossalai - colossalai - INFO: Start to process part-0/10 of all original datasets.
Traceback (most recent call last):
  File "/mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py", line 147, in <module>
    main()
  File "/mnt/data/tool/ColossalAI-0.4.0/applications/Colossal-LLaMA/prepare_sft_dataset.py", line 106, in main
    "tokenizer": tokenizer,
    ^^^^^^^^^
UnboundLocalError: cannot access local variable 'default_conversation' where it is not associated with a value
```
I've solved this bug and will submit a PR soon.
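For context, the error above is the classic Python pattern where a local variable is only bound inside some branches of an `if`/`elif` chain, so any unmatched case leaves it undefined and the first later use raises `UnboundLocalError`. Below is a minimal, hypothetical sketch of that pattern and one common fix (failing fast with a clear message); the function and template names are illustrative and not taken from the actual `prepare_sft_dataset.py` source:

```python
def select_conversation_template(llama_version: int) -> str:
    """Illustrative: pick a conversation template by LLaMA version.

    Bug pattern: if `default_conversation` were only assigned in the
    matching branches, an unexpected `llama_version` would fall through
    and the `return` would raise UnboundLocalError.
    """
    if llama_version == 2:
        default_conversation = "llama2_template"  # placeholder value
    elif llama_version == 3:
        default_conversation = "llama3_template"  # placeholder value
    else:
        # Fix: cover the fall-through case explicitly instead of
        # leaving the variable unbound.
        raise ValueError(f"Unsupported llama_version: {llama_version}")
    return default_conversation
```

A subtle variant of the same failure is a type mismatch: if argparse delivers `llama_version` as the string `"3"`, the integer comparisons above never match and the fall-through is hit even for a "valid" version.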
Environment
● Ubuntu 22.04
● CPU: 96 cores
● RAM: 736 GiB
● GPU: 8 × NVIDIA V100 (32 GB)
● Python 3.11.5
● ColossalAI 0.4.0
● CUDA 11.8
● PyTorch 2.1.0+cu118