Add and re-implement tests to RepresentationTestFactory #143

@luca-martial

Description

RepresentationTestFactory is for tests that run only on the dataset, not on the model. They check how well certain elements are represented within a dataset.

Note: @JulesBelveze is working on the classifier for gender-based tests, so please coordinate with him to integrate it into your implementation.

New tests to implement

  • min_gender_representation_count @ArshaanNazir @alytarik
    Takes a min_count param that sets the minimum number of sentences required for each gender category (male/female/unknown). The user should be able to define a custom count for each category. An example config file looks like this:
representation:
    min_gender_representation_count: 
        min_count:
            male: 30
            female: 30
            unknown: 40
  • min_gender_representation_proportion @ArshaanNazir @alytarik
    Takes a min_proportion param that sets the minimum proportion required for each gender category. An example config file looks like this:
representation:
    min_gender_representation_proportion: 
        min_proportion:
            male: 0.30
            female: 0.30
            unknown: 0.40
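
As a rough illustration of the count-based test, the sketch below compares per-category sentence counts against the min_count values from the config. The function name check_min_gender_counts and the idea that each sentence already carries a male/female/unknown label (for example from the gender classifier mentioned above) are assumptions made for this example, not part of the existing codebase.

```python
from collections import Counter


def check_min_gender_counts(labels, min_count):
    """Compare per-category sentence counts against configured minimums.

    `labels` is assumed to hold one gender label per sentence
    ("male", "female" or "unknown"); `min_count` mirrors the
    `min_count` mapping from the config. Both names are hypothetical.
    """
    counts = Counter(labels)
    results = {}
    for category, minimum in min_count.items():
        actual = counts.get(category, 0)
        results[category] = {
            "expected_result": minimum,
            "actual_result": actual,
            "pass": actual >= minimum,
        }
    return results


# Usage with the thresholds from the example config above.
labels = ["male"] * 60 + ["female"] * 60 + ["unknown"] * 30
outcome = check_min_gender_counts(
    labels, {"male": 30, "female": 30, "unknown": 40}
)
```

Only the categories listed in min_count are evaluated, so a config that names a single category produces a single result.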

Conditions:

  • The user will not be able to set the default in the defaults section of the config, since the number of classes differs for each representation_proportion test.
  • The backend code should set the default for each class to (1 / num_classes) * 0.8; the 0.8 factor makes the passing conditions less strict. For example, with three classes the default is about 0.27 per class.
  • If the user wants to control the proportion for each class, they should write it the same way as in the config example above. There should be a check that the sum of the proportions specified by the user is not greater than 1; if it is, throw an error.
  • The user should be able to define a min_proportion for only some of the classes if they want to. For example, if I only want male: 0.45, then I don't have to define any proportions for female and unknown, and no proportion will be assigned to those classes. They should not be checked at all when running the tests; only male would be checked in this case.
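
The conditions above could be sketched roughly as follows. The function name resolve_min_proportions, its parameters, and the choice of ValueError are assumptions for illustration; the small floating-point tolerance on the sum check is also an addition that the issue does not specify.

```python
import math


def resolve_min_proportions(classes, user_min_proportion=None, slack=0.8):
    """Return the minimum proportion to enforce for each checked class.

    With no user config, every class gets (1 / num_classes) * slack.
    With a user config, only the classes the user listed are returned,
    so the remaining classes are not checked at all.
    """
    if not user_min_proportion:
        default = (1 / len(classes)) * slack
        return {name: default for name in classes}

    total = sum(user_min_proportion.values())
    # Reject configs whose proportions add up to more than 1
    # (allowing for floating-point rounding, which is an assumption here).
    if total > 1 and not math.isclose(total, 1.0):
        raise ValueError(
            f"min_proportion values sum to {total}, which is greater than 1"
        )
    return dict(user_min_proportion)


genders = ["male", "female", "unknown"]
defaults = resolve_min_proportions(genders)
partial = resolve_min_proportions(genders, {"male": 0.45})
```

Here defaults assigns roughly 0.267 to each of the three classes, while partial contains only male, so female and unknown would be skipped when the test runs.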

Results should look like this:

| category | test_type | original | test_case | expected_result | actual_result | pass |
| --- | --- | --- | --- | --- | --- | --- |
| representation | min_gender_representation_count | - | male | 30 | 60 | TRUE |
| representation | min_gender_representation_count | - | female | 30 | 60 | TRUE |
| representation | min_gender_representation_count | - | unknown | 20 | 30 | TRUE |

Report should look like this (minimum pass rate will be 100% for all metrics-based tests from now on):

| category | test_type | fail_count | pass_count | pass_rate | minimum_pass_rate | pass |
| --- | --- | --- | --- | --- | --- | --- |
| representation | min_gender_representation_count | 0 | 3 | 100% | 100% | TRUE |
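
One way the per-test-case results might be rolled up into a report row like the one above is sketched below. The function name summarize_results and the dict-based row format are assumptions for illustration; the 100% minimum pass rate follows the note about metrics-based tests.

```python
def summarize_results(results, minimum_pass_rate=1.0):
    """Aggregate per-test-case rows into one report row per test type.

    Each input row is assumed to be a dict with "category",
    "test_type" and a boolean "pass" key (a hypothetical format).
    """
    grouped = {}
    for row in results:
        key = (row["category"], row["test_type"])
        grouped.setdefault(key, []).append(row["pass"])

    report = []
    for (category, test_type), outcomes in grouped.items():
        pass_count = sum(outcomes)
        pass_rate = pass_count / len(outcomes)
        report.append({
            "category": category,
            "test_type": test_type,
            "fail_count": len(outcomes) - pass_count,
            "pass_count": pass_count,
            "pass_rate": pass_rate,
            "minimum_pass_rate": minimum_pass_rate,
            # The test type passes only if it meets the minimum pass rate.
            "pass": pass_rate >= minimum_pass_rate,
        })
    return report


rows = [
    {"category": "representation",
     "test_type": "min_gender_representation_count", "pass": True}
    for _ in ("male", "female", "unknown")
]
report = summarize_results(rows)
```

With a minimum pass rate of 100%, a single failing category is enough to fail the whole test type.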

Re-implement in new test factory structure

Waiting for dataset
