CodeChecker diff works inconsistently in local-local, local-remote, remote-remote cases, especially with tags

# GUI problems

CodeChecker diff is a more complicated feature than it seems to be. The greatest problem with it is that diffing is not done on sets of reports, but rather on sets of _outstanding reports_. Looking at this (cropped) screenshot, this is not evident at all:
![image](https://user-images.githubusercontent.com/23276031/232476954-1134412e-9c00-4b8e-b71d-65f1761aeaa4.png)

Nowhere in the report page (after pressing diff) can you see that the result is the set of outstanding reports:

![image](https://user-images.githubusercontent.com/23276031/232477251-c9591112-61a8-4c39-8393-501d4dd4c00d.png)

The icons for "Diff type" make this worse, since nothing explicitly states that we mean "Only _outstanding reports_ in Baseline":

![image](https://user-images.githubusercontent.com/23276031/232477657-0f324156-970b-47d5-9143-189dba451b08.png)

This means that it is easy to misinterpret the GUI and mistakenly believe that these are set operations on the set of all reports. The command line interface does a little better, as one of the following 3 flags is mandatory for `CodeChecker cmd diff`: `--new`, `--resolved`, `--unresolved`.

# Issues with diff
## Run diffs (without tags)

Suppose we have the following two files, test1.cpp and test2.cpp.

### test1.cpp

Analysis results on test1.cpp are stored in the folder `result1`, and remotely as `RunA`.

Analysis:
```
CodeChecker check -b "g++ -c test1.cpp" -o result1 --analyzers clangsa -c
```
Store:
```
CodeChecker store result1/ -n RunA --url 0.0.0.0/Default
```
```c++
// test1.cpp
#include "stdlib.h"

void a() {
  int i = 0;
  (void)(10 / i); // division by zero
}

void b() {
  int *i = 0;
  *i = 5; // nullptr dereference
}

void c() {
  int i = 0; // deadstore, this value is never read
  //
  i = 5;
}

// void d() {
//   int *k = new int;
//   (void)k;
// } // memory leak

void e() {
  int *k = new int;
  // codechecker_suppress [all] SUPPRESS ALL
  free(k); // mismatched deallocator, should be delete k;
}

void f() {
  int *k = new int;
  delete k;
  *k = 5; // use after free
}
```

Analysis results:

| Report ID | Checker name | Fixed at date & reason |
| --- | --- | --- |
| A | core.DivideZero | - |
| **B** | **core.NullDereference** | **-** |
| C | deadcode.DeadStores | - |
| E | unix.MismatchedDeallocator | fix1, in source |
| F | cplusplus.NewDelete | fix1, on GUI *|

*When stored as RunA, a review status rule marking this report as false positive is set in the GUI.

### test2.cpp
Analysis results on test2.cpp are stored in the folder `result2`, and remotely as `RunB`.

Analysis:
```
CodeChecker check -b "g++ -c test2.cpp" -o result2 --analyzers clangsa -c
```
Store:
```
CodeChecker store result2/ -n RunB --url 0.0.0.0/Default
```
```c++
// test2.cpp
#include "stdlib.h"

void a() {
  int i = 0;
  (void)(10 / i); // division by zero
}

// void b() {
//   int *i = 0;
//   *i = 5; // nullptr dereference
// }

void c() {
  int i = 0; // deadstore, this value is never read
  // codechecker_suppress [all] SUPPRESS ALL
  i = 5;
}

void d() {
  int *k = new int;
  (void)k;
} // memory leak

void e() {
  int *k = new int;
  //
  free(k); // mismatched deallocator, should be delete k;
}

void f() {
  int *k = new int;
  delete k;
  *k = 5; // use after free
}
```

Analysis results:

| Report ID | Checker name | Fixed at date & reason |
| --- | --- | --- |
| A | core.DivideZero | - |
| C | deadcode.DeadStores | fix2, in source |
| **D** | **cplusplus.NewDeleteLeaks** | **-** |
| E | unix.MismatchedDeallocator | - |
| F | cplusplus.NewDelete | - |

#### Summary

| Report ID | Checker name | In test1.cpp? | In test2.cpp? |
| --- | --- | --- | --- |
| A | core.DivideZero | :heavy_check_mark: | :heavy_check_mark: |
| **B** | **core.NullDereference** | :heavy_check_mark: | :x: |
| C | deadcode.DeadStores | :heavy_check_mark: | :heavy_check_mark:, source code suppressed  |
| **D** | **cplusplus.NewDeleteLeaks** | :x: | :heavy_check_mark: |
| E | unix.MismatchedDeallocator | :heavy_check_mark:, source code suppressed | :heavy_check_mark: |
| F | cplusplus.NewDelete | :heavy_check_mark:, GUI suppressed | :heavy_check_mark:, GUI suppressed |

### Preface to the results

**Note that each report has its own instance. Two reports can be identical (to the point of having the same bug hash), but if they are in two different runs, they are not the same, and have their own row in the reports table. As a result, they have their own fixed_at date!**

Diffing in theory is done with set operations on the set of _outstanding reports_. Concretely, a report is considered outstanding (or open) if **all of the following** are true:
* Its detection status is new, reopened, or unresolved
* Its review status is unreviewed or confirmed

A report is closed (_not outstanding_) if **any of the following** is true:
* Its review status is false positive or intentional
* Its detection status is resolved or off

However, in practice, we don't check these properties directly, but rather whether the report has its fixed_at property set. The idea is that outstanding reports **do not** have this property set, and closed reports **do**. Any deviation from this (unfortunately unwritten) rule is a bug, so whether fixed_at is set should perfectly reflect whether a report is closed.
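The two definitions and the fixed_at rule can be written down as a small predicate. This is only a sketch to make the invariant explicit: the `Report` class, the status strings and the timestamp format are assumptions made up for illustration, not CodeChecker's actual data model.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical status labels, chosen to mirror the lists above.
OPEN_DETECTION = {"new", "reopened", "unresolved"}
OPEN_REVIEW = {"unreviewed", "confirmed"}


@dataclass
class Report:
    bug_hash: str
    detection_status: str
    review_status: str
    fixed_at: Optional[str] = None  # hypothetical: any timestamp string


def is_outstanding(report: Report) -> bool:
    """Outstanding means open detection status AND open review status."""
    return (report.detection_status in OPEN_DETECTION
            and report.review_status in OPEN_REVIEW)


def fixed_at_is_consistent(report: Report) -> bool:
    """The unwritten rule: fixed_at is set exactly when the report is closed."""
    return (report.fixed_at is None) == is_outstanding(report)
```

A report for which `fixed_at_is_consistent` returns `False` would be exactly the kind of deviation described above.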

Since diffing is done on the set of outstanding reports, filters on the diff page (seen in an image above) don't matter much: reports with false positive or intentional review statuses should never appear, nor should reports with resolved or off detection status.

### Local-Local results

Commands used for the results:
```
CodeChecker cmd diff -b result1/ -n result2/ --new --review-status --detection-status
CodeChecker cmd diff -b result1/ -n result2/ --resolved --review-status --detection-status
CodeChecker cmd diff -b result1/ -n result2/ --unresolved --review-status --detection-status
```

Note that the last two options are there to make sure that all review statuses and detection statuses are included.

| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| result1/ | result2/ | NEW (only in Run set 2) | D | D, E |
| result1/ | result2/  | RESOLVED (only in Run set 1) | B, C  | B, C|
| result1/ | result2/ | UNRESOLVED (Run set 1 and Run set 2) | A, E, F |  A, F* |
| result1/ | result2/ | neither | - | - |

*TODO: Consider adding a feature where we query the review status rule even in the local-local case: if we did that, F should be in the neither set.

### Local-Remote / Remote-Local results

Commands used for the results:
```
CodeChecker cmd diff -b RunA -n result2/ --new --review-status --detection-status
CodeChecker cmd diff -b RunA -n result2/ --resolved --review-status --detection-status
CodeChecker cmd diff -b RunA -n result2/ --unresolved --review-status --detection-status
```

| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| RunA | result2/ | NEW (only in Run set 2) | D, **E** * | D, E |
| RunA | result2/  | RESOLVED (only in Run set 1) | B, C, **E** *, F | B, C  |
| RunA | result2/ | UNRESOLVED (Run set 1 and Run set 2) | A | A |
| RunA | result2/ | neither | - | F |

*FIXME: It's a reasonable suspicion that we don't precalculate the set of outstanding reports AND THEN do set calculations, which could be why E is in two disjoint sets.

```
CodeChecker cmd diff -b result1/ -n RunB --new --review-status --detection-status
CodeChecker cmd diff -b result1/ -n RunB --resolved --review-status --detection-status
CodeChecker cmd diff -b result1/ -n RunB --unresolved --review-status --detection-status
```

| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| result1/ | RunB | NEW (only in Run set 2) | **C** *, D, E, F | D, E|
| result1/ | RunB  | RESOLVED (only in Run set 1) | B, **C** * | B, C |
| result1/ | RunB | UNRESOLVED (Run set 1 and Run set 2) | A | A |
| result1/ | RunB | neither | - | F |

*FIXME: It's a reasonable suspicion that we don't precalculate the set of outstanding reports AND THEN do set calculations, which could be why C is in two disjoint sets.
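The invariant being violated here is that NEW, RESOLVED and UNRESOLVED must be pairwise disjoint. A minimal sketch of a check for it, fed with the "Actual result" columns of the two tables above (the function name is hypothetical, and this is not part of CodeChecker):

```python
from itertools import combinations


def overlapping_reports(diff_sets: dict) -> set:
    """Return every report that shows up in more than one diff set."""
    overlap = set()
    for (_, left), (_, right) in combinations(diff_sets.items(), 2):
        overlap |= left & right
    return overlap


# Actual results for "-b RunA -n result2/".
run_a_vs_result2 = {
    "new": {"D", "E"},
    "resolved": {"B", "C", "E", "F"},
    "unresolved": {"A"},
}

# Actual results for "-b result1/ -n RunB".
result1_vs_run_b = {
    "new": {"C", "D", "E", "F"},
    "resolved": {"B", "C"},
    "unresolved": {"A"},
}

print(overlapping_reports(run_a_vs_result2))  # the misplaced report E
print(overlapping_reports(result1_vs_run_b))  # the misplaced report C
```

A check like this could serve as a regression test once the set of outstanding reports is calculated before the set operations.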

### Remote-Remote results:

No filters were applied in the GUI (the default is no filters at all). I also checked the results on the command line, and got the same thing.

Commands used on the command line:
```
CodeChecker cmd diff -b RunA -n RunB --new --review-status --detection-status
CodeChecker cmd diff -b RunA -n RunB --resolved --review-status --detection-status
CodeChecker cmd diff -b RunA -n RunB --unresolved --review-status --detection-status
```

| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| RunA | RunB | NEW (only in Run set 2) | D, E | D, E|
| RunA | RunB  | RESOLVED (only in Run set 1) | B, C |B, C |
| RunA | RunB | UNRESOLVED (Run set 1 and Run set 2) | A | A |
| RunA | RunB | neither | F | F |

### Discussion

**First of all, all 3 of these should be identical, except for report F (discussed below). The fact that they aren't is already a sign of a bug (likely several bugs).**
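For reference, the "Expected result" columns follow from plain set operations on the outstanding reports of each run. The sketch below uses the report IDs from the summary table; `diff_sets` is a hypothetical helper, not a CodeChecker function. The only difference between the two calls is whether the review status rule on F is known, which is assumed to be the case only when a remote run takes part in the diff.

```python
def diff_sets(baseline: set, newer: set, all_reports: set) -> dict:
    """Classic diff categories over two sets of outstanding reports."""
    return {
        "new": newer - baseline,
        "resolved": baseline - newer,
        "unresolved": baseline & newer,
        "neither": all_reports - (baseline | newer),
    }


ALL_REPORTS = set("ABCDEF")

# The review status rule on F is known, so F is closed in both runs.
with_rule = diff_sets({"A", "B", "C"}, {"A", "D", "E"}, ALL_REPORTS)

# Local-local: the rule is unknown, so F counts as outstanding in both.
without_rule = diff_sets({"A", "B", "C", "F"}, {"A", "D", "E", "F"},
                         ALL_REPORTS)

print(with_rule)
print(without_rule)
```

Because the categories are computed from two precalculated sets, they are disjoint by construction, and F is the only report whose category changes between the two variants.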


[Here](https://github.com/Ericsson/codechecker/blob/master/docs/usage.md#how-diffs-between-runs-are-calculated) are some docs regarding review status rules. _Review status rules_ are created when a report's review status is categorized _in the GUI_.

This section is mostly about the difference between them and source code suppressions.

When it comes to _false positive_ (or _intentional_) review status rules, our current thinking is that if a report was categorized as such, every other report with the same bug hash is a false positive as well (and as a result, a _closed_ report). Mind that it doesn't matter whether a report's detection date is before or after the rule is set -- every already stored report, and every report that will be stored in the future, will have its review status set (and its fixed_at date as well). This implies that in some sense, review status rules are a timeless property.

[Source code suppressions](https://github.com/Ericsson/codechecker/blob/0dd2dd3159aa134eb655d8d0fd6d2bb70b63170d/docs/analyzer/user_guide.md#suppressing-false-positives-source-code-comments-for-review-status), when stored on the server, won't create a review status rule. A report marked as a false positive in the source code only affects the report instance in that particular analysis (and is only _closed_ in that run). This means that if the source code comment itself is removed from one run to another, we regard the report as a new outstanding report. This also implies that source code suppressions are NOT a timeless property -- reports E and C are not false positives in general, ONLY in the context of the runs that suppressed them.
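The difference in scope between the two kinds of suppression can be sketched as follows. This is an illustration under the assumptions above, not the server's implementation: a review status rule is modeled as a set of bug hashes shared by every run, while source code suppressions are passed in per run. All names are hypothetical.

```python
def closed_reports(detected: set, source_suppressed: set,
                   rule_hashes: set) -> set:
    """Reports of one run that are closed by either kind of suppression."""
    return {h for h in detected
            if h in source_suppressed or h in rule_hashes}


# Review status rules live outside of any run: F is closed everywhere.
RULES = {"F"}

run_a_detected = {"A", "B", "C", "E", "F"}
run_b_detected = {"A", "C", "D", "E", "F"}

# Source code suppressions only count for the run that contains them.
closed_in_a = closed_reports(run_a_detected, {"E"}, RULES)
closed_in_b = closed_reports(run_b_detected, {"C"}, RULES)

outstanding_in_a = run_a_detected - closed_in_a
outstanding_in_b = run_b_detected - closed_in_b

print(sorted(outstanding_in_a))  # E is closed here, C is outstanding
print(sorted(outstanding_in_b))  # C is closed here, E is outstanding
```

Under this model E is closed in RunA and outstanding in RunB (and the reverse for C), while F stays closed in both.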

We can definitely say that our logic regarding source code suppression doesn't work right at the moment. Aside from the remote-remote case, reports E and C are almost always misplaced.

In short, _source code suppressions_ and _review status rule suppressions_ are handled very differently for diffs, but look identical in the GUI. In the screenshot below, the first row is suppressed with a comment, the second one with a review status rule.
![image](https://user-images.githubusercontent.com/23276031/232779562-f1426270-d9fe-48b8-8ddd-b785fc6e12c3.png)
This is not necessarily a problem, just something to be aware of.

## Tag diffs

### Preface to the results

**When storing under the same run name but with different tags, the statement above on different report instances no longer holds. A report in tag1 and its counterpart in tag2 don't just look equivalent, they refer to the same bug report instance. For this reason, the fixed_at date can be changed by a subsequent tag!**

The following example (among other shortcomings) showcases that the result of diff(tag1, tag2) is different _before_ storing a new tag onto the run than _after_ it.

Store commands:
```
CodeChecker store result1/ -n "SingleRun" --tag tag1 --url 0.0.0.0/5569
CodeChecker store result2/ -n "SingleRun" --tag tag2 --url 0.0.0.0/5569
```

FIXME: Tags don't seem to work at all. I created a separate issue for this: #3889

Later, the results in `result1/` will be stored again under the tag `tag3`.
```
CodeChecker store result1/ -n "SingleRun" --tag tag3 --url 0.0.0.0/5569
```

#### Summary

| Report ID | Checker name | In test1.cpp? (tag1) | In test2.cpp? (tag2)| In test1.cpp? (tag3) |
| --- | --- | --- | --- | --- |
| A | core.DivideZero | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| **B** | **core.NullDereference** | :heavy_check_mark: | :x: | :heavy_check_mark: |
| C | deadcode.DeadStores | :heavy_check_mark: | :heavy_check_mark:, src suppressed  | :heavy_check_mark: |
| **D** | **cplusplus.NewDeleteLeaks** | :x: | :heavy_check_mark: | :x: |
| E | unix.MismatchedDeallocator | :heavy_check_mark:, src suppressed | :heavy_check_mark: | :heavy_check_mark:, src suppressed |
| F | cplusplus.NewDelete | :heavy_check_mark:, GUI suppressed | :heavy_check_mark:, GUI suppressed | :heavy_check_mark:, GUI suppressed |

### Local-Remote

```
CodeChecker cmd diff -b result1/ -n SingleRun --tag tag2 --new --review-status --detection-status
CodeChecker cmd diff -b result1/ -n SingleRun --tag tag2 --resolved --review-status --detection-status
CodeChecker cmd diff -b result1/ -n SingleRun --tag tag2 --unresolved --review-status --detection-status
```

(before tag3):
| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| result1/ | SingleRun tag2 | NEW (only in Run set 2) | D, E | D, E |
| result1/ | SingleRun tag2  | RESOLVED (only in Run set 1) | B, C | B, C |
| result1/ | SingleRun tag2 | UNRESOLVED (Run set 1 and Run set 2) | A | A |
| result1/ | SingleRun tag2 | neither | F | F |

```
CodeChecker cmd diff -b SingleRun --tag tag1 -n result2/ --new --review-status --detection-status
CodeChecker cmd diff -b SingleRun --tag tag1 -n result2/ --resolved --review-status --detection-status
CodeChecker cmd diff -b SingleRun --tag tag1 -n result2/ --unresolved --review-status --detection-status
```

(before tag3):
| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| SingleRun tag1 | result2/ | NEW (only in Run set 2) | - | D, E|
| SingleRun tag1 | result2/  | RESOLVED (only in Run set 1) | B, C | B, C |
| SingleRun tag1 | result2/ | UNRESOLVED (Run set 1 and Run set 2) | A, D*, E** | A |
| SingleRun tag1 | result2/ | neither | F | F |

*FIXME: How did this get here???
**Report E is no longer suppressed in the command line from tag1 to tag2, so this is a case where the fixed_at property was removed, but should not have been. For architectural reasons, it is impossible for previously stored tags to act as if no subsequent tags were stored on top of them (because the fixed_at date is irrecoverably overwritten). In the reverse case (report C) we could work around this: detection dates never change (theoretically speaking), and we may be able to express that the fixed_at date is precisely the detect_at date in tag2, and as such, the report is still outstanding in tag1.

### Remote-Remote

No filters were applied in the GUI (the default is no filters at all). TODO: Check cmd

(before tag3):
| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| SingleRun tag1 | SingleRun tag2 | NEW (only in Run set 2) | D |D, E |
| SingleRun tag1 | SingleRun tag2  | RESOLVED (only in Run set 1) | - |B, C |
| SingleRun tag1 | SingleRun tag2 | UNRESOLVED (Run set 1 and Run set 2) | A, E | A |
| SingleRun tag1 | SingleRun tag2 | neither | B, C, F | F |

**This should be identical to the non-tag remote-remote results.**

(after tag3):
| Run set 1 (-b) | Run set 2 (-n) | Diff type | Actual result | Expected result |
| --- | --- | --- | --- | --- |
| SingleRun tag1 | SingleRun tag2 | NEW (only in Run set 2) | -  | D, E|
| SingleRun tag1 | SingleRun tag2  | RESOLVED (only in Run set 1) | - | B, C |
| SingleRun tag1 | SingleRun tag2 | UNRESOLVED (Run set 1 and Run set 2) | D, A, E | A|
| SingleRun tag1 | SingleRun tag2 | neither | B, C, F |F |

**This should be identical to the results where tag3 hasn't been added yet.**

### Discussion

The problem here is that diffs are done on the latest status of the reports, but the reports should instead be considered only in the context of the diff. If `tag3` doesn't participate in the diff, we should act as if it never existed (certainly when calculating the set of outstanding reports).
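To illustrate why the latest status is not enough, here is a toy model of report E across the three tags. Both classes are hypothetical and only meant to contrast a single overwritten field with a per-tag history; neither describes the actual database schema.

```python
class SharedReport:
    """One instance shared by all tags: only the latest state survives."""

    def __init__(self):
        self.closed = False

    def store(self, tag: str, closed: bool) -> None:
        self.closed = closed  # overwrites whatever earlier tags stored

    def was_closed_in(self, tag: str) -> bool:
        return self.closed  # the tag cannot be taken into account


class VersionedReport:
    """Keeps the state per tag, so later tags cannot change earlier ones."""

    def __init__(self):
        self.closed_by_tag = {}

    def store(self, tag: str, closed: bool) -> None:
        self.closed_by_tag[tag] = closed

    def was_closed_in(self, tag: str) -> bool:
        return self.closed_by_tag[tag]


# Report E: suppressed in tag1, outstanding in tag2, suppressed in tag3.
history_of_e = [("tag1", True), ("tag2", False), ("tag3", True)]

shared, versioned = SharedReport(), VersionedReport()
for tag, closed in history_of_e:
    shared.store(tag, closed)
    versioned.store(tag, closed)

print(shared.was_closed_in("tag2"))     # reflects tag3, not tag2
print(versioned.was_closed_in("tag2"))  # reflects tag2 itself
```

With the shared instance, a query about `tag2` after storing `tag3` answers with `tag3`'s state, which is the kind of dependence on later tags described above.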
