
refactor: split train and val dataset in preference dataset #1763

Merged
terrykong merged 15 commits into main from yukih/split-train-val-dataset-preference on Feb 4, 2026

@yuki-97 yuki-97 commented Jan 13, 2026

Closes #1050.

  1. Split train and validation data in the built-in preference datasets, which unblocks multiple-dataset support.
  2. Unify the built-in datasets under nemo_rl/data/datasets/preference_datasets/ into a similar format.
  3. Move setup_preference_data to nemo_rl/data/utils.py and reuse it.

Usage

data:
  # other data settings, see `examples/configs/sft.yaml` for more details
  ...
  # dataset settings
  train:
    # this dataset will override prompt_key and use the default values for other vars
    data_path: /path/to/local/train_dataset.jsonl  # local file or hf_org/hf_dataset_name (HuggingFace)
    prompt_key: context
    split: train  # used for HuggingFace datasets
  validation:
    # this dataset will use the default values for other vars except data_path
    data_path: /path/to/local/val_dataset.jsonl
  default:
    # the values below are used as defaults for any variable a dataset does not specify
    dataset_name: BinaryPreferenceDataset
    prompt_key: prompt
    chosen_key: chosen
    rejected_key: rejected
    prompt_file: null
    system_prompt_file: null
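
The way `default` fills in for keys that `train` or `validation` omit can be sketched in Python. This is only an illustration of the behavior described in the comments above, not the NeMo RL implementation: the helper name `resolve_dataset_config` is hypothetical, and a plain shallow merge is assumed.

```python
from typing import Any, Optional


def resolve_dataset_config(data_cfg: dict[str, Any], name: str) -> Optional[dict[str, Any]]:
    """Hypothetical helper: overlay data_cfg[name] on top of data_cfg["default"]."""
    dataset_cfg = data_cfg.get(name)
    if dataset_cfg is None:
        # A missing or null section means there is no such dataset.
        return None
    merged = dict(data_cfg.get("default") or {})  # start from the defaults
    merged.update(dataset_cfg)  # dataset-specific values win
    return merged


# The same settings as the YAML above, written as a plain dict.
data_cfg = {
    "train": {
        "data_path": "/path/to/local/train_dataset.jsonl",
        "prompt_key": "context",
        "split": "train",
    },
    "validation": {"data_path": "/path/to/local/val_dataset.jsonl"},
    "default": {
        "dataset_name": "BinaryPreferenceDataset",
        "prompt_key": "prompt",
        "chosen_key": "chosen",
        "rejected_key": "rejected",
        "prompt_file": None,
        "system_prompt_file": None,
    },
}

train_cfg = resolve_dataset_config(data_cfg, "train")
val_cfg = resolve_dataset_config(data_cfg, "validation")
print(train_cfg["prompt_key"])  # context (overridden by the train section)
print(val_cfg["prompt_key"])    # prompt (taken from default)
```

Under this assumption, `train` ends up with its own `prompt_key` plus every other default, while `validation` differs from the defaults only by its `data_path`.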

Migration Guide

  1. For datasets that load from a local JSONL file or HuggingFace (BinaryPreferenceDataset and PreferenceDataset):
# old
data:
  dataset_name: BinaryPreferenceDataset
  train_data_path: <PathToTrainingDataset>  # e.g., /path/to/local/dataset.jsonl or hf_org/hf_dataset_name (HuggingFace)
  val_data_path: <PathToValidationDataset>
  prompt_key: <PromptKey>  # default is "prompt"
  chosen_key: <ChosenKey>  # default is "chosen"
  rejected_key: <RejectedKey>  # default is "rejected"
  train_split: <TrainSplit>  # default is None; used for HuggingFace datasets
  val_split: <ValSplit>  # default is None; used for HuggingFace datasets

# new
data:
  # other data settings, see `examples/configs/sft.yaml` for more details
  ...
  # dataset settings
  train:
    # this dataset will override prompt_key and use the default values for other vars
    data_path: /path/to/local/train_dataset.jsonl  # local file or hf_org/hf_dataset_name (HuggingFace)
    prompt_key: context
    split: train  # used for HuggingFace datasets
  validation:
    # this dataset will use the default values for other vars except data_path
    data_path: /path/to/local/val_dataset.jsonl
  default:
    # the values below are used as defaults for any variable a dataset does not specify
    dataset_name: BinaryPreferenceDataset
    prompt_key: prompt
    chosen_key: chosen
    rejected_key: rejected
    prompt_file: null
    system_prompt_file: null
  2. For built-in datasets that need changes:
    1. HelpSteer3
      # old
      data:
        dataset_name: HelpSteer3
      
      # new
      data:
        train:
          dataset_name: HelpSteer3
          split: train
        validation:
          dataset_name: HelpSteer3
          split: validation
    2. Tulu3Preference
      # old
      data:
        dataset_name: Tulu3Preference
      
      # new
      data:
        train:
          dataset_name: Tulu3Preference
        validation: null
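
For the first case (local JSONL or HuggingFace datasets), the old-to-new key mapping can be sketched as a small Python function. This is a rough aid for updating configs by hand, not a tool shipped with the PR: `migrate_preference_data_config` is a hypothetical name, the choice to place the shared keys under `default` is one possible layout, and the built-in dataset cases (HelpSteer3, Tulu3Preference) are not handled.

```python
from typing import Any

# Old flat keys and the per-dataset keys they become in the new layout.
OLD_PER_SPLIT_KEYS = {
    "train": {"train_data_path": "data_path", "train_split": "split"},
    "validation": {"val_data_path": "data_path", "val_split": "split"},
}
# Keys that applied to both datasets; in this sketch they move under `default`.
SHARED_KEYS = ("dataset_name", "prompt_key", "chosen_key", "rejected_key")


def migrate_preference_data_config(old: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: convert the old flat `data` section to the new layout."""
    consumed = set(SHARED_KEYS)
    for mapping in OLD_PER_SPLIT_KEYS.values():
        consumed.update(mapping)

    # Settings unrelated to the dataset split are carried over unchanged.
    new = {key: value for key, value in old.items() if key not in consumed}

    for section, mapping in OLD_PER_SPLIT_KEYS.items():
        entry = {
            new_key: old[old_key]
            for old_key, new_key in mapping.items()
            if old.get(old_key) is not None  # drop unset values such as a None split
        }
        new[section] = entry or None  # nothing configured -> null section

    new["default"] = {key: old[key] for key in SHARED_KEYS if key in old}
    return new


old_cfg = {
    "dataset_name": "BinaryPreferenceDataset",
    "train_data_path": "/path/to/local/train_dataset.jsonl",
    "val_data_path": "/path/to/local/val_dataset.jsonl",
    "prompt_key": "context",
    "train_split": None,
    "val_split": None,
    "some_other_setting": 1,  # placeholder for unrelated data settings
}
new_cfg = migrate_preference_data_config(old_cfg)
print(new_cfg["train"])       # {'data_path': '/path/to/local/train_dataset.jsonl'}
print(new_cfg["validation"])  # {'data_path': '/path/to/local/val_dataset.jsonl'}
```

When the old config has no validation path or split, the sketch emits `validation: None`, mirroring the `validation: null` form used for Tulu3Preference above.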

Test Result
Nightly tests are all good.

Result plots by algorithm: rm (image), dpo (image)

@github-actions github-actions Bot added the Documentation Improvements or additions to documentation label Jan 13, 2026
@yuki-97 yuki-97 linked an issue Jan 13, 2026 that may be closed by this pull request
@yuki-97 yuki-97 changed the base branch from yukih/split-train-val-dataset to main January 13, 2026 08:22
@yuki-97 yuki-97 added the CI:L1 Run doctests, unit tests, and functional tests label Jan 13, 2026
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 13, 2026
@yuki-97 yuki-97 force-pushed the yukih/split-train-val-dataset-preference branch from 0923975 to 2fb1777 Compare January 13, 2026 11:32
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 13, 2026
@yuki-97 yuki-97 force-pushed the yukih/split-train-val-dataset-preference branch from 2fb1777 to 6086b51 Compare January 13, 2026 14:38
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 13, 2026
@yuki-97 yuki-97 force-pushed the yukih/split-train-val-dataset-preference branch from 6086b51 to 994a15f Compare January 13, 2026 15:26
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Jan 13, 2026
@terrykong terrykong marked this pull request as ready for review January 21, 2026 23:16
@terrykong terrykong requested review from a team as code owners January 21, 2026 23:16
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Feb 3, 2026
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Feb 3, 2026
@yuki-97 yuki-97 force-pushed the yukih/split-train-val-dataset-preference branch from eb00be0 to ba45d87 Compare February 3, 2026 05:51
@yuki-97 yuki-97 added CI:L1 Run doctests, unit tests, and functional tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Feb 3, 2026
@terrykong terrykong enabled auto-merge (squash) February 3, 2026 06:11
Signed-off-by: Yuki Huang <yukih@nvidia.com>
@yuki-97 yuki-97 force-pushed the yukih/split-train-val-dataset-preference branch from ba45d87 to 5a9ab6f Compare February 3, 2026 14:09

Labels

CI:L1 Run doctests, unit tests, and functional tests Documentation Improvements or additions to documentation

Development

Successfully merging this pull request may close these issues.

Decouple train and eval dataset

3 participants