-
-
Notifications
You must be signed in to change notification settings - Fork 2
Added disclaimer to the drafts #659
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
49103c7 to
98829d0
Compare
Update QA yaml values Update to machine 3.6.4 Add corpora validation & appropriate error messages Add missing '!' Add fileless corpus test; no files monolingual corpus check More tests Translation engine tests; pretranslation tests Update to client version 1.9.1 Changes
616758a to
08fa260
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #659 +/- ##
==========================================
+ Coverage 64.90% 64.93% +0.02%
==========================================
Files 342 342
Lines 18836 18871 +35
Branches 2455 2462 +7
==========================================
+ Hits 12225 12253 +28
- Misses 5728 5734 +6
- Partials 883 884 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to implement the insertion of the \rem marker in the USFM updater class, then silnlp would be able to use the same code.
Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval/src/Serval.Translation/Services/IPretranslationService.cs line 23 at r1 (raw file):
PretranslationUsfmMarkerBehavior embedBehavior, PretranslationUsfmMarkerBehavior styleMarkerBehavior, bool addDisclaimer = false,
We don't need this parameter.
src/Serval/src/Serval.Translation/Services/PretranslationService.cs line 184 at r1 (raw file):
if (addDisclaimer) { idPosition = usfm.IndexOf("\\id");
We should turn this into a function. Also, I believe that silnlp inserts the \rem after the \ide marker if it exists.
mudiagaobrikisil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted. Refactoring.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
|
If the model name is important enough for the disclaimer, it is also important enough for the build itself. What if we put it in executionData and then grab it from there. I would say open a new issue to grab the model name and put it in executionData but for right now, don't include it for this version of the disclaimer. |
|
Let me know if you need help, Mudi, with reworking this. Also, @johnml1135 and I discussed the issue of adding the 'model name' to the remark this morning. For context, the 'model name' in EITL drafts includes the name of the base model, names of source project(s) and the target project, and the book filter that was applied to the data in training (e.g., NT+GEN) - so maybe 'model name' isn't really the best moniker 😆 . John suggested that we could just try and get a basic disclaimer through and then spin off issues for getting the base model name and training data/filters specified. How does that sound, @ddaspit? |
Enkidu93
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 5 files at r1, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @johnml1135 and @mudiagaobrikisil)
mudiagaobrikisil
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnml1135, noted. I will refactor and get this disclaimer out and then open an issue for the model name. @Enkidu93 , would a call in the morning be fine?
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @johnml1135)
ddaspit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Enkidu93 I agree. We should just start with a basic disclaimer.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @johnml1135 and @mudiagaobrikisil)
|
Since the strategy that we've arrived at for implementing this is different than what Mudi coded, I'm just going to close this and open a new PR. |
This PR adds a disclaimer line on the generated draft. The draft is pf the format
\rem This draft of MAT was machine translated on from project <source_project_name>. It should be reviewed and edited carefully.
It adds a new property called addDisclaimer to the getUsfmAsync method, which is used in the translation engines controller class. This property, when set to true, ensures the disclaimer is shown and the other way round.
It also has two new tests to ensure the disclaimer is not added when the addDisclaimer is set to false, and that is is shown when the addDisclaimer is set to true.
The model name has not yet been added to the disclaimer. After discussing with @Enkidu93, it seemed there is no direct way to get the model name in serval, as it is in machine.py. It is captured in the settings.yaml of machine.py, e.g. parent_model_name: facebook/nllb-200-distilled-1.3B. The two suggestions were to either add a new variable to serval settings that captures the model name or to figure out a way to get it from clearml. However, I think the model name I have seen in ClearML is the one used for testing, not the actual model name. @johnml1135, @ddaspit, how would you suggest I go about the model name, and is it something I should do at this point?
Also, it has been manually tested on swagger and seems to work.
This change is