[BREAKING] FEAT: Ensemble scoring for Crescendo #905
base: main
Conversation
@microsoft-github-policy-service agree
I think this has some merge conflicts & pipeline errors! Also, it would be great if you wrote some docs on the ensemble scorer itself and how it differs from the composite_scorer!
@martinpollack could you resolve the conflicts on this PR? Looks like a simple init file merge conflict. Thanks!
@eugeniavkim which part of this PR is "breaking"? Looks like a new scorer is being added.
Description
This change adds a full pipeline for ensemble scoring with Crescendo. It includes two new scorers: EnsembleScorer, which drives this change and aggregates the results of multiple scorers, and SubstringsMultipleScorer, which extends SubstringScorer to search a response for multiple strings. In addition, the Crescendo orchestrator no longer contains the logic for creating the objective scorer; the scorer is now created outside the orchestrator. A new notebook serves as a template that demonstrates the capabilities of a Crescendo ensemble orchestrator.
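For readers who have not opened the diff, here is a minimal, self-contained sketch of the general idea behind ensemble scoring. It is not the code in this PR and does not use the PyRIT API: `ScoreFn`, `substrings_score`, `WeightedScorer`, and `ensemble_score` are hypothetical names, and both the weighted-average aggregation and the "match if any substring is present" rule are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical stand-in for a scorer: any function that maps a response
# to a float in [0, 1]. This is NOT the PyRIT Scorer interface.
ScoreFn = Callable[[str], float]


def substrings_score(substrings: Sequence[str]) -> ScoreFn:
    """Build a scorer that returns 1.0 if ANY substring occurs, else 0.0.

    Assumption: "any match" semantics and case-sensitive matching.
    """
    def score(response: str) -> float:
        return 1.0 if any(s in response for s in substrings) else 0.0

    return score


@dataclass(frozen=True)
class WeightedScorer:
    """Pairs a scoring function with its weight in the ensemble."""
    score_fn: ScoreFn
    weight: float


def ensemble_score(response: str, members: Sequence[WeightedScorer]) -> float:
    """Aggregate member scores as a weighted average (assumed rule)."""
    total_weight = sum(m.weight for m in members)
    if total_weight <= 0:
        raise ValueError("ensemble needs a positive total weight")
    weighted_sum = sum(m.weight * m.score_fn(response) for m in members)
    return weighted_sum / total_weight


# Usage: one substring-based member and one constant member, equal weights.
members = [
    WeightedScorer(substrings_score(["secret", "password"]), weight=1.0),
    WeightedScorer(lambda response: 0.5, weight=1.0),
]
matched = ensemble_score("the password is hunter2", members)
unmatched = ensemble_score("nothing to see here", members)
```

With equal weights, the substring member shifts the aggregate up or down around the constant member's 0.5. The actual EnsembleScorer may aggregate differently; the diff is the source of truth.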
Received support from @eugeniavkim @jbolor21.
This change is breaking because it changes how a CrescendoOrchestrator object is instantiated. Instead of providing a PromptChatTarget as the scoring target, the user now creates a Scorer object outside of the CrescendoOrchestrator and passes it in as objective_float_scale_scorer. This moves construction of the objective scorer out of the orchestrator and allows for more flexibility.
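To illustrate the shape of the breaking change, here is a rough sketch of the injection pattern using stand-in classes. Only the parameter name `objective_float_scale_scorer` comes from this PR; `FloatScaleScorer`, `KeywordScorer`, `SketchOrchestrator`, and the `threshold` argument are hypothetical and do not reflect the real CrescendoOrchestrator signature.

```python
from typing import Protocol


class FloatScaleScorer(Protocol):
    """Hypothetical minimal scorer interface: response text in, float out."""

    def score(self, response: str) -> float: ...


class KeywordScorer:
    """Hypothetical scorer built by the caller, outside the orchestrator."""

    def __init__(self, keyword: str) -> None:
        self._keyword = keyword

    def score(self, response: str) -> float:
        return 1.0 if self._keyword in response else 0.0


class SketchOrchestrator:
    """Stand-in orchestrator that receives a ready-made scorer.

    It no longer takes a scoring target and builds the scorer itself;
    the caller decides which scorer (single or ensemble) to inject.
    """

    def __init__(
        self,
        objective_float_scale_scorer: FloatScaleScorer,
        threshold: float = 0.7,  # hypothetical cutoff for "objective achieved"
    ) -> None:
        self._scorer = objective_float_scale_scorer
        self._threshold = threshold

    def objective_achieved(self, response: str) -> bool:
        return self._scorer.score(response) >= self._threshold


# Usage: the scorer is created first, then passed to the orchestrator.
scorer = KeywordScorer("yes")
orchestrator = SketchOrchestrator(objective_float_scale_scorer=scorer)
achieved = orchestrator.objective_achieved("yes, here is how")
```

Because the orchestrator depends only on the scorer's interface, any object with a compatible `score` method can be swapped in without touching the orchestrator.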
Tests and Documentation
Still in progress.