
[FEATURE] Update documentation for execution loop and input/output language refactor #43

@Mattdl

Description


Problem

PR #34 (refactor: generalize dataset indexing from language-based to dataset_id-based) introduces API-breaking changes to the task interface and results structure, but README.md and CONTRIBUTING.md still document the old API. Once #34 is merged, new contributors following the docs will write code against a stale interface (`load_monolingual_data`, `lang_datasets`, `language_results`, etc.) that no longer exists.

Related: #33, #34

Proposal

  • Type:

    • [ ] New Ontology (data source for multiple tasks)
    • [ ] New Task(s)
    • [ ] New Model(s)
    • [ ] New Metric(s)
    • [x] Other
  • Area(s) of code: README.md, CONTRIBUTING.md, examples/custom_task_example.py

Update all documentation and examples to reflect the new dataset_id-based API introduced in #34. Specifically:

README.md

  1. Checkpointing section (line ~115): Change "saves result checkpoints after each task completion in a specific language" to reflect that checkpointing is now per-dataset (dataset_id), not per-language.

  2. Metrics & Aggregation section (lines ~174–181):

    • Step 1 currently says "Macro-average languages per task" — update to reflect the new dataset-based aggregation.
    • Document the new aggregation_mode parameter and the three supported modes:
      • monolingual_only (default)
      • crosslingual_group_input_languages
      • crosslingual_group_output_languages
    • Note that mean_per_language behavior now depends on the chosen aggregation mode.
  3. Results structure (line ~164): The checkpoint.json description should mention datasetid_results instead of implying language-keyed results.
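The three aggregation modes can be sketched in a few lines. This is an illustrative approximation, not the implementation from #34 — the function name `aggregate`, the `{dataset_id: score}` result shape, and the `{dataset_id: (input_language, output_language)}` mapping are all assumptions:

```python
from collections import defaultdict
from statistics import mean

def aggregate(datasetid_results, dataset_langs, aggregation_mode="monolingual_only"):
    """Group per-dataset scores by language and macro-average each group.

    datasetid_results: {dataset_id: score}
    dataset_langs: {dataset_id: (input_language, output_language)}
    """
    groups = defaultdict(list)
    for dataset_id, score in datasetid_results.items():
        in_lang, out_lang = dataset_langs[dataset_id]
        if aggregation_mode == "monolingual_only":
            if in_lang != out_lang:
                continue  # skip cross-lingual datasets entirely
            groups[in_lang].append(score)
        elif aggregation_mode == "crosslingual_group_input_languages":
            groups[in_lang].append(score)
        elif aggregation_mode == "crosslingual_group_output_languages":
            groups[out_lang].append(score)
        else:
            raise ValueError(f"unknown aggregation_mode: {aggregation_mode}")
    # The per-language mean depends on which datasets each mode groups together.
    return {lang: mean(scores) for lang, scores in groups.items()}
```

Under this sketch, the per-language mean for a given language changes with the mode because the set of datasets grouped under that language changes — which is exactly the caveat the docs should flag for mean_per_language.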

CONTRIBUTING.md

  1. "Adding a New Task" — Step 2 code example (lines ~138–206):

    • Rename load_monolingual_data(self, split, language) → load_dataset(self, dataset_id, split).
    • Update the RankingDataset construction accordingly.
    • Add guidance on the new optional override methods: languages_to_dataset_ids() and get_dataset_language() (with input_language/output_language distinction).
    • Briefly explain when a task author would need to override these (multi-dataset per language, cross-lingual, or multilingual tasks).
  2. "Adding a New Task" — Step 4 test example (line ~234): Update the test to call load_dataset(dataset_id, split) instead of load_monolingual_data(split, language).

  3. "Adding a New Task" — general: Add a note or subsection explaining the difference between monolingual, cross-lingual, and multilingual dataset scenarios and how the new dataset_id system handles them.
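To make the scenarios concrete for CONTRIBUTING.md, a hypothetical cross-lingual task might override the new hooks like this. The class name, the `"in-out"` dataset_id naming convention, and the dict returned in place of a real RankingDataset are illustrative assumptions, not the actual interface from #34:

```python
class MyCrossLingualTask:
    """Illustrative task using the dataset_id-based interface."""

    def languages_to_dataset_ids(self):
        # One or more dataset_ids per (input_language, output_language) pair.
        # A purely monolingual task could keep a default 1:1 language -> id
        # mapping and skip this override entirely.
        return {("en", "de"): ["en-de"], ("de", "en"): ["de-en"]}

    def get_dataset_language(self, dataset_id):
        # Distinguish input vs. output language for cross-lingual datasets.
        input_language, output_language = dataset_id.split("-")
        return input_language, output_language

    def load_dataset(self, dataset_id, split):
        input_language, output_language = self.get_dataset_language(dataset_id)
        # A real task would build and return a RankingDataset here; a plain
        # dict stands in for it in this sketch.
        return {
            "dataset_id": dataset_id,
            "split": split,
            "input_language": input_language,
            "output_language": output_language,
        }
```

A task would only need these overrides when the default one-dataset-per-language mapping breaks down: multiple datasets per language, cross-lingual pairs, or multilingual datasets.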

examples/custom_task_example.py

  1. load_monolingual_data method (line 81): Rename to load_dataset with the new (self, dataset_id, split) signature. This file is referenced by both README and CONTRIBUTING as the canonical example.
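The rename amounts to a signature swap along these lines (the placeholder body below stands in for the example file's real RankingDataset construction):

```python
class CustomTask:
    # Old interface (pre-#34), shown only for comparison:
    # def load_monolingual_data(self, split, language): ...

    # New interface: datasets are indexed by dataset_id, not language.
    def load_dataset(self, dataset_id, split):
        # Placeholder; examples/custom_task_example.py builds a
        # RankingDataset here from the resolved dataset_id and split.
        return {"dataset_id": dataset_id, "split": split}
```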

Additional Context

Implementation

  • [x] I plan to implement this in a PR
  • [ ] I am proposing the idea and would like someone else to pick it up

Metadata

Labels: enhancement (New feature or request)