├── graph_stage1/     # Stage 1 code
│   ├── config/
│   └── models/
├── graphvq/          # Stage 2 code
│   ├── models/
│   ├── ds_stage2.py      (under construction)
│   └── stage2_base.py    (simple training procedure without DDP/DeepSpeed)
├── data/             # data
└── evaluate/         # TBD
For Stage 1:

deepspeed --num_gpus=2 new_stage1_training.py \
    --lm_model="/fs-computility/mabasic/shared/models/Qwen2.5-7B-Instruct" \
    --batch_size=1 \
    --num_epochs=1 \
    --learning_rate=2e-3 \
    --bf16 \
    --freeze_backbone \
    --deepspeed_config="zero2.json" \
    --max_seq_length=256 \
    --use_wandb
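The --deepspeed_config flag above references a zero2.json file whose contents are not shown in this repo listing. A minimal ZeRO stage-2 sketch consistent with the flags above (bf16, micro-batch size 1) might look like the following; the exact contents are an assumption, not the project's actual config:

```json
{
  "train_micro_batch_size_per_gpu": 1,
  "gradient_accumulation_steps": 1,
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true
  }
}
```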
For Stage 2:

cd graphvq
python main.py \
    --data_path ../data/tag/dataset \
    --dataset_name "cora" \
    --output_dir ./output/hiratoken_joint_qwen2_qlora_orig_prompt_ddp \
    --llm_model_name /fs-computility/mabasic/shared/models/Qwen2.5-7B-Instruct \
    --load_in_4bit \
    --use_lora \
    --lora_r 16 \
    --lora_alpha 32 \
    --lora_dropout 0.05 \
    --use_amp
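As a rough sanity check on the LoRA hyperparameters above (lora_r 16, lora_alpha 32), the sketch below estimates how many trainable parameters a rank-16 adapter adds to a single weight matrix and the alpha/r output scaling. The 4096x4096 projection size is a hypothetical example dimension, not taken from the Qwen2.5-7B architecture:

```python
# Hypothetical illustration: a LoRA adapter of rank r on a d_out x d_in
# weight matrix trains two small matrices, A (r x d_in) and B (d_out x r).
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    return r * d_in + d_out * r

lora_r, lora_alpha = 16, 32
scaling = lora_alpha / lora_r  # output of the adapter is scaled by alpha / r

# Example: a square 4096x4096 projection (hypothetical dimension)
full_params = 4096 * 4096
adapter_params = lora_param_count(4096, 4096, lora_r)

print(scaling)                        # 2.0
print(adapter_params)                 # 131072
print(adapter_params / full_params)   # ~0.78% of the full matrix
```

With r much smaller than the matrix dimensions, the adapter trains well under 1% of the parameters it modifies, which is what makes the --use_lora + --load_in_4bit combination fit on limited GPU memory.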