55 changes: 55 additions & 0 deletions src/python/evaluation/qodana/imitation_model/README.md
@@ -116,3 +116,58 @@ Output is a `predictions.csv` file with the column names matches the number of c
- `0` ‑ if the model didn't find an error in a sample.

- `1` ‑ if the error was found in a sample.


## How to use the models pretrained on Java code snippets from Stepik

Two trained models are available, along with the two datasets on which they were trained and evaluated.
Access to the datasets is restricted; the model weights are public.
### Model that uses program text as an input:
- [`train_dataset`](https://drive.google.com/drive/folders/1bdLExLIbY53SVobT0y4Lnz9oeZENqLmt?usp=sharing) – private access;
- [`evaluation_dataset`](https://drive.google.com/file/d/1hZlP7q3gVoIl8vmOur0UFpEyFDYyVZko/view?usp=sharing) – private access;
- [`test_dataset`](https://drive.google.com/file/d/1oappcDcH-p-2LwjdOfZHRSiRB9Vi39mc/view?usp=sharing) – private access;
- [`model_weights`](https://drive.google.com/file/d/1PFVHVd4JDjFUD3b5fDSGXoYBWEDlaEAg/view?usp=sharing) – public access.

The model was trained to detect 110 Qodana inspections. The full
list of inspections is available [here](https://drive.google.com/file/d/1PVqjx7QEot1dIXyiYP_-dJnWGup2Ef7v/view?usp=sharing).

Evaluation results are:

| Inspection | Description | F1-score |
| --- | --- | --- |
| No Errors | No errors from the [list](https://docs.google.com/spreadsheets/d/14BTj_lTTRrGlx-GPTcbMlc8zdt--WXLZHRnegKRrZYM/edit?usp=sharing) were detected by Qodana. | 0.73 |
| Syntax Error | Reports any irrelevant usages of Java syntax. | 0.99 |
| System Out Error | Reports any usages of System.out or System.err. | 0.99 |
| IO Resources | Reports any I/O resource which is not safely closed. | 0.97 |

The rest of the inspections were not learned by the model because of class imbalance.
### Model that uses a line of program text as an input:
- [`train_dataset`](https://drive.google.com/file/d/1c-kJUV4NKuehCoLiIC3JWrJh3_NgTmvi/view?usp=sharing) – private access;
- [`evaluation_dataset`](https://drive.google.com/file/d/1AVN4Uj4omPEquC3EAL6XviFATkYKcY_2/view?usp=sharing) – private access;
- [`test_dataset`](https://drive.google.com/file/d/1J3gz3wS_l63SI0_OMym8x5pCj7-PCIgG/view?usp=sharing) – private access;
- [`model_weights`](https://drive.google.com/file/d/1fc32-5XyUeOpZ5AkRotqv_3cWksHjat_/view?usp=sharing) – public access.

Each sample in the dataset consists of one target line of a program together with its context. The context is the 2 lines
of the same program before and after the target line. When there are not enough lines before or after the target line,
the special token `NOC` is added.
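
The windowing described above can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: it assumes the context means 2 lines on each side of the target line and that one `NOC` token stands in for each missing line. The function name `build_line_samples` is hypothetical, and how a window is serialized into model input is not specified here.

```python
from typing import List

NOC = "NOC"  # padding token named in this README
CONTEXT_SIZE = 2  # assumption: 2 context lines on each side of the target line


def build_line_samples(program: str, context_size: int = CONTEXT_SIZE) -> List[List[str]]:
    """Return one window per program line: context before, target line, context after."""
    lines = program.splitlines()
    samples = []
    for i in range(len(lines)):
        window = []
        for j in range(i - context_size, i + context_size + 1):
            # Positions outside the program are filled with the NOC token (assumed behavior).
            window.append(lines[j] if 0 <= j < len(lines) else NOC)
        samples.append(window)
    return samples
```

Under these assumptions, the first line of a three-line program gets two `NOC` tokens before it, and the last line gets two `NOC` tokens after it.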

The model was also trained to detect 110 inspections. The full
list of inspections is available [here](https://drive.google.com/file/d/1PVqjx7QEot1dIXyiYP_-dJnWGup2Ef7v/view?usp=sharing).

Evaluation results are:

| Inspection | Description | F1-score |
| --- | --- | --- |
| No Errors | No errors from the [list](https://docs.google.com/spreadsheets/d/14BTj_lTTRrGlx-GPTcbMlc8zdt--WXLZHRnegKRrZYM/edit?usp=sharing) were detected by Qodana. | 0.99 |
| Syntax Error | Reports any irrelevant usages of Java syntax. | 0.23 |
| System Out Error | Reports any usages of System.out or System.err. | 0.30 |
| IO Resources | Reports any I/O resource which is not safely closed. | 0.23 |

The rest of the inspections were not learned by the model because of class imbalance.
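
For readers who want to reproduce a per-inspection F1-score on their own data, the sketch below computes it from 0/1 labels and 0/1 predictions for a single inspection. The evaluation code behind the reported numbers is not shown here, so treat this as the standard F1 definition and not as the exact procedure used; the function name `f1_for_inspection` is hypothetical.

```python
from typing import Sequence


def f1_for_inspection(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1-score for one inspection, given 0/1 ground truth and 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: the F1-score is taken as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

A model that never predicts a rare inspection scores 0 on it regardless of how many negatives it gets right, which is one way class imbalance shows up in this metric.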

To use either model, follow the [`fine-tuning`](https://huggingface.co/transformers/training.html) tutorial from HuggingFace. Unzip the `model_weights` archive and pass the absolute path to its root folder in place of the built-in name of a pretrained model.

For example:

    from transformers import RobertaForSequenceClassification
    RobertaForSequenceClassification.from_pretrained(
        <path to the folder with weights>,
        num_labels=<unique number of inspections in your dataset>)
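
Because the weights must be referenced by an absolute path, a small wrapper can make the loading step less error-prone. The sketch below is one possible way to do it and assumes the `transformers` package is installed. The helper names `resolve_weights_dir` and `load_model` are hypothetical, and the folder path and `num_labels` value must come from your own setup.

```python
from pathlib import Path


def resolve_weights_dir(path: str) -> str:
    """Return the absolute path of the unzipped weights folder, or raise if it is missing."""
    folder = Path(path).expanduser().resolve()
    if not folder.is_dir():
        raise FileNotFoundError(f"Weights folder not found: {folder}")
    return str(folder)


def load_model(weights_dir: str, num_labels: int):
    """Load the classifier from a local folder instead of a built-in model name."""
    # Imported here so the path check above works even without transformers installed.
    from transformers import RobertaForSequenceClassification

    return RobertaForSequenceClassification.from_pretrained(
        resolve_weights_dir(weights_dir), num_labels=num_labels
    )
```

Checking the folder first gives a clear error when the archive was unzipped somewhere else, before the library tries to interpret the string as a model name.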