Merged
1 change: 1 addition & 0 deletions .spellcheck-en-custom.txt
@@ -55,6 +55,7 @@ Splitter
src
subdirectory
subfolder
submodlib
Tatsu
templating
Tesseract
16 changes: 15 additions & 1 deletion CHANGELOG.md
@@ -1,4 +1,10 @@
## Upcoming v0.8.x
## v0.8.1

### Fixes

* Unpin submodlib-py dependency from 0.0.1 so we can pick up newer releases as they come out.

## v0.8.0

### Features

@@ -10,6 +16,14 @@ Each `LLMBlock` in a `Pipeline` can now specify `model_family` or `model_id` in

The parameters `model_family`, `model_id`, and `num_instructions_to_generate` are no longer required in `PipelineContext` objects. They were previously required; if passed in, they are still used as before. However, they can now be omitted if your `Pipeline` contains no `LLMBlock` entries or if your `LLMBlock` config specifies these values in the `Pipeline` yaml.
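
The fallback described above can be pictured with a small, self-contained sketch. This is not the library's implementation: `SketchContext` and `resolve_setting` are hypothetical names, and the precedence shown (block-level value first, then the context value) is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchContext:
    """Hypothetical stand-in for a pipeline context; not the real PipelineContext."""

    model_id: Optional[str] = None
    model_family: Optional[str] = None


def resolve_setting(name: str, block_config: dict, ctx: SketchContext) -> str:
    """Return the block-level value if set, else the context value, else raise."""
    value = block_config.get(name)
    if value is None:
        # Assumed fallback: use whatever the context was given, as before.
        value = getattr(ctx, name)
    if value is None:
        raise ValueError(f"{name} must be set in the block config or the context")
    return value


# The block config supplies the value, so the context can leave it unset.
print(resolve_setting("model_id", {"model_id": "teacher-a"}, SketchContext()))
# The block config omits it, so the value passed in the context is used.
print(resolve_setting("model_id", {}, SketchContext(model_id="teacher-b")))
```

In this sketch a setting only raises an error when neither the block config nor the context provides it, which mirrors the idea that the context parameters are optional rather than removed.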

### Added Knowledge Prompts and Pipelines for Llama-3.3-70B-Instruct teacher model

There is a new pipeline for knowledge data generation optimized for Llama-3.3-70B-Instruct as the teacher model. It's shipped under a new `llama` pipelines package, and can be activated via `ilab data generate --pipeline llama ...` when using the `ilab` command line interface.

### Added a new preview subset_selection Python API

There's a new `instructlab.sdg.subset_selection` API that can be used to select subsets of larger generated datasets.

## v0.7.3

### Fixes