`RepresentationTestFactory` covers tests that run only on the dataset, not on the model. They check how well certain elements are represented within a dataset.
Note: @JulesBelveze is working on the classifier for gender-based tests, so please coordinate with him to integrate it into your implementation.
New tests to implement
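
Example config for `min_gender_representation_count`:

    representation:
      min_gender_representation_count:
        min_count:
          male: 30
          female: 30
          unknown: 40

Example config for `min_gender_representation_proportion`:

    representation:
      min_gender_representation_proportion:
        min_proportion:
          male: 0.30
          female: 0.30
          unknown: 0.40
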
Conditions:
- The user will not be able to pass or control the default in the `defaults` section of the config, since the number of classes differs for each `representation_proportion` test.
- The backend code should set the default for each class to `1/num_classes * 0.8` (this makes the passing conditions less strict).
- If the user wants to control the proportion for each class, they should write it the same way as in the config example above. There should be a check that the sum of the proportions for the classes specified by the user is not greater than 1; otherwise, throw an error.
- The user should be able to define a `min_proportion` for only some of the classes if they want to. For example, if I only want `male: 0.45`, then I don't have to define any proportions for `female` and `unknown`, and no proportion will be assigned to those classes. They should not be checked at all when running the tests; only `male` would be checked in this case.
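
The conditions above could be sketched roughly as follows. This is only an illustration, not the backend implementation: the function name `resolve_min_proportions`, its arguments, and the small floating-point tolerance are assumptions made for the example.

```python
from typing import Dict, List, Optional

DEFAULT_SLACK = 0.8  # factor from the conditions: 1/num_classes * 0.8


def resolve_min_proportions(
    classes: List[str],
    user_proportions: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Return the minimum proportion to enforce for each checked class.

    Hypothetical helper: name and signature are not from the codebase.
    """
    if not classes:
        raise ValueError("at least one class is required")

    # No user input: every class gets the relaxed uniform default.
    if not user_proportions:
        default = 1 / len(classes) * DEFAULT_SLACK
        return {label: default for label in classes}

    # User input: the specified proportions must not sum to more than 1.
    total = sum(user_proportions.values())
    if total > 1 + 1e-9:  # tolerance for float rounding (assumption)
        raise ValueError(
            f"min_proportion values sum to {total:.2f}, which is greater than 1"
        )

    # Only the classes the user listed are checked; others are skipped.
    return dict(user_proportions)
```

With the three gender classes and no user input, each class gets a default of about 0.267. Passing only `{"male": 0.45}` returns a single entry, so `female` and `unknown` are never checked.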
Results should look like this:
| category | test_type | original | test_case | expected_result | actual_result | pass |
|---|---|---|---|---|---|---|
| representation | min_gender_representation_count | - | male | 30 | 60 | TRUE |
| representation | min_gender_representation_count | - | female | 30 | 60 | TRUE |
| representation | min_gender_representation_count | - | unknown | 20 | 30 | TRUE |
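
As a rough sketch of how such result rows might be produced for a count test: the helper below assumes the gender classifier has already assigned one label per sentence, and that a class passes when its actual count is at least the configured minimum. The function name `min_count_results` and the dictionary keys are illustrative assumptions that follow the columns of the results table.

```python
from collections import Counter
from typing import Dict, List


def min_count_results(
    labels: List[str],
    min_count: Dict[str, int],
    test_type: str = "min_gender_representation_count",
) -> List[dict]:
    """Build one result row per class listed in min_count (hypothetical helper)."""
    counts = Counter(labels)  # number of sentences per predicted label
    rows = []
    for test_case, expected in min_count.items():
        actual = counts.get(test_case, 0)
        rows.append(
            {
                "category": "representation",
                "test_type": test_type,
                "original": "-",
                "test_case": test_case,
                "expected_result": expected,
                "actual_result": actual,
                "pass": actual >= expected,  # assumed passing rule
            }
        )
    return rows


# Labels chosen to mirror the counts in the results table.
labels = ["male"] * 60 + ["female"] * 60 + ["unknown"] * 30
rows = min_count_results(labels, {"male": 30, "female": 30, "unknown": 20})
```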
Report should look like this (minimum pass rate will be 100% for all metrics-based tests from now on):
| category | test_type | fail_count | pass_count | pass_rate | minimum_pass_rate | pass |
|---|---|---|---|---|---|---|
| representation | min_gender_representation_count | 0 | 3 | 100% | 100% | TRUE |
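
One way the report could be aggregated from per-class result rows is sketched below, assuming each row carries `category`, `test_type`, and `pass` fields. The function name `build_report` is hypothetical; the default minimum pass rate of 100% follows the note above about metrics-based tests.

```python
from typing import Dict, List, Tuple


def build_report(rows: List[dict], minimum_pass_rate: float = 1.0) -> List[dict]:
    """Aggregate result rows into one report line per (category, test_type)."""
    groups: Dict[Tuple[str, str], List[bool]] = {}
    for row in rows:
        key = (row["category"], row["test_type"])
        groups.setdefault(key, []).append(bool(row["pass"]))

    report = []
    for (category, test_type), outcomes in groups.items():
        pass_count = sum(outcomes)
        fail_count = len(outcomes) - pass_count
        pass_rate = pass_count / len(outcomes)
        report.append(
            {
                "category": category,
                "test_type": test_type,
                "fail_count": fail_count,
                "pass_count": pass_count,
                "pass_rate": f"{pass_rate:.0%}",
                "minimum_pass_rate": f"{minimum_pass_rate:.0%}",
                # The test passes only if the pass rate meets the minimum.
                "pass": pass_rate >= minimum_pass_rate,
            }
        )
    return report
```

With a 100% minimum, a single failing class is enough to fail the whole test type.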
Summary of the new gender tests (example configs appear under "New tests to implement" above):
- `min_gender_representation_count` @ArshaanNazir @alytarik: takes a `min_count` param that defines the minimum number of sentences that are male/female/unknown. The user should be able to define a custom count for each gender category.
- `min_gender_representation_proportion` @ArshaanNazir @alytarik: takes a `min_proportion` param that defines the minimum proportion for each gender category.
Re-implement in new test factory structure
- `min_ethnicity_name_representation_count` @ArshaanNazir
- `min_ethnicity_name_representation_proportion` @ArshaanNazir
- `min_label_representation_count` @ArshaanNazir
- `min_label_representation_proportion` @ArshaanNazir
- `min_religion_name_representation_count` @ArshaanNazir
- `min_religion_name_representation_proportion` @ArshaanNazir
- `min_country_economic_representation_count` @ArshaanNazir
- `min_country_economic_representation_proportion` @ArshaanNazir

Waiting for dataset
- `min_indian_caste_name_representation_count` @ArshaanNazir
- `min_indian_caste_name_representation_proportion` @ArshaanNazir