GUI problems
CodeChecker diff is a more complicated feature than it seems to be. The greatest problem with it is that diffing is not done on sets of reports, but rather sets of outstanding reports. Looking at this (cropped) screenshot, this is not evident at all:

Nowhere in the report page (after pressing diff) can you see that the result is the set of outstanding reports:

The icons for "Diff type" make this worse, since nothing explicitly states that we mean "Only outstanding reports in Baseline":

Meaning that it is easy to misinterpret the GUI and mistakenly believe that these are set operations on the set of all reports. The command line interface does a little better as one of the following 3 flags are manditory to CodeChecker cmd diff: --new, --resolved, --unresolved.
Issues with diff
Run diffs (without tags)
Suppose we have the following two files, test1.cpp and test2.cpp.
test1.cpp
Analysis results on test1.cpp are stored in the folder result1, and remotely as RunA.
Analysis:
$ CodeChecker check -b "g++ -c test1.cpp" -o result1 --analyzers clangsa -c
Store:
CodeChecker store result1/ -n RunA --url 0.0.0.0/Default
// test1.cpp
#include "stdlib.h"
void a() {
int i = 0;
(void)(10 / i); // division by zero
}
void b() {
int *i = 0;
*i = 5; // nullptr dereference
}
void c() {
int i = 0; // deadstore, this value is never read
//
i = 5;
}
// void d() {
// int *k = new int;
// (void)k;
// } // memory leak
void e() {
int *k = new int;
// codechecker_suppress [all] SUPPRESS ALL
free(k); // mismatched deallocator, should be delete k;
}
void f() {
int *k = new int;
delete k;
*k = 5; // use after free
}
Analysis results:
| Report ID |
Checker name |
Fixed at date & reason |
| A |
core.DivideZero |
- |
| B |
core.NullDereference |
- |
| C |
deadcode.DeadStores |
- |
| E |
unix.MismatchedDeallocator |
fix1, in source |
| F |
cplusplus.NewDelete |
fix1, on GUI * |
*When stored as RunA, on the GUI, a review status rule is set to false positive.
test2.cpp
Analysis results on test2.cpp are stored in the folder result2, and remotely as RunB.
Analysis:
$ CodeChecker check -b "g++ -c test2.cpp" -o result2 --analyzers clangsa -c
Store:
CodeChecker store result2/ -n RunB --url 0.0.0.0/Default
// test2.cpp
#include "stdlib.h"
void a() {
int i = 0;
(void)(10 / i); // division by zero
}
// void b() {
// int *i = 0;
// *i = 5; // nullptr dereference
// }
void c() {
int i = 0; // deadstore, this value is never read
// codechecker_suppress [all] SUPPRESS ALL
i = 5;
}
void d() {
int *k = new int;
(void)k;
} // memory leak
void e() {
int *k = new int;
//
free(k); // mismatched deallocator, should be delete k;
}
void f() {
int *k = new int;
delete k;
*k = 5; // use after free
}
Analysis results:
| Report ID |
Checker name |
Fixed at date & reason |
| A |
core.DivideZero |
- |
| C |
deadcode.DeadStores |
fix2, in source |
| D |
cplusplus.NewDeleteLeaks |
- |
| E |
unix.MismatchedDeallocator |
- |
| F |
cplusplus.NewDelete |
- |
Summary
| Report ID |
Checker name |
In test1.cpp? |
In test2.cpp? |
| A |
core.DivideZero |
✔️ |
✔️ |
| B |
core.NullDereference |
✔️ |
❌ |
| C |
deadcode.DeadStores |
✔️ |
✔️, source code suppressed |
| D |
cplusplus.NewDeleteLeaks |
❌ |
✔️ |
| E |
unix.MismatchedDeallocator |
✔️, source code suppressed |
✔️ |
| F |
cplusplus.NewDelete |
✔️, GUI suppressed |
✔️, GUI suppressed |
Preface to the results
Note that each report has its own instance. Two reports can be identical (to the point of having the same bug hash), but if they are in two different runs, they are not the same, and have their own row in the reports table. As a result, they have their own fixed_at date!
Diffing in theory is done with set operations on the set of outstanding reports. In practice, a report is considered outstanding (or open) if all of the following is true:
- Its detection status is new, reopened, or unresolved
- Its review status is unreviewed or confirmed
A report is closed (not outstanding) if any of the following is true:
- Its review status is false positive or intentional
- Its detection status is resolved or off
However, in practice, we don't check these these properties, but whether they have their fixed_at property set. The idea is that outstanding reports do not have this property set, and closed reports do. Any deviation from this (unfortunately unwritten) rule is a bug, so whether fixed_at is set should perfectly reflect whether a report is closed.
Since diffing is done on the set of outstanding reports, filters on the diff page (seen in an image above) don't matter much, reports with false positive or intentional review statuses should never appear, nor reports with resolved or off detection status.
Local-Local results
Commands used for the results:
CodeChecker cmd diff -b result1/ -n result2/ --new --review-status --detection-status
CodeChecker cmd diff -b result1/ -n result2/ --resolved --review-status --detection-status
CodeChecker cmd diff -b result1/ -n result2/ --unresolved --review-status --detection-status
Note that the last two options are there to make sure that all review statuses and detection statuses are included.
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| result1/ |
result2/ |
NEW (only in Run set 2) |
D |
D, E |
| result1/ |
result2/ |
RESOLVED (only in Run set 1) |
B, C |
B, C |
| result1/ |
result2/ |
UNRESOLVED (Run set 1 and Run set 2) |
A, E, F |
A, F* |
| result1/ |
result2/ |
neither |
- |
- |
*TODO: Consider adding a feature where we query the review status rule even in the local-local case: if we did that, F should be in the neither set.
Local-Remote / Remote-Local results
Commands used for the results:
CodeChecker cmd diff -b RunA -n result2/ --new --review-status --detection-status
CodeChecker cmd diff -b RunA -n result2/ --resolved --review-status --detection-status
CodeChecker cmd diff -b RunA -n result2/ --unresolved --review-status --detection-status
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| RunA |
result2/ |
NEW (only in Run set 2) |
D, E * |
D, E |
| RunA |
result2/ |
RESOLVED (only in Run set 1) |
B, C, E *, F |
B, C |
| RunA |
result2/ |
UNRESOLVED (Run set 1 and Run set 2) |
A |
A |
| RunA |
result2/ |
neither |
- |
F |
*FIXME: Its a reasonable suspicion that we don't precalculate the set of outstanding reports AND THEN do set calculations, which could be why E is in two disjointed sets.
CodeChecker cmd diff -b result1/ -n RunB --new --review-status --detection-status
CodeChecker cmd diff -b result1/ -n RunB --resolved --review-status --detection-status
CodeChecker cmd diff -b result1/ -n RunB --unresolved --review-status --detection-status
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| result1/ |
RunB |
NEW (only in Run set 2) |
C *, D, E, F |
D, E |
| result1/ |
RunB |
RESOLVED (only in Run set 1) |
B, C * |
B, C |
| result1/ |
RunB |
UNRESOLVED (Run set 1 and Run set 2) |
A |
A |
| result1/ |
RunB |
neither |
- |
F |
*FIXME: Its a reasonable suspicion that we don't precalculate the set of outstanding reports AND THEN do set calculations, which could be why C is in two disjointed sets.
Remote-Remote results:
No filters were applied in the GUI as a result (besides the default, which is no filters at all). I also checked the results on the coammnd line, and got the same thing.
Commands for the cmd:
CodeChecker cmd diff -b RunA -n RunB --new --review-status --detection-status
CodeChecker cmd diff -b RunA -n RunB --resolved --review-status --detection-status
CodeChecker cmd diff -b RunA -n RunB --unresolved --review-status --detection-status
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| RunA |
RunB |
NEW (only in Run set 2) |
D, E |
D, E |
| RunA |
RunB |
RESOLVED (only in Run set 1) |
B, C |
B, C |
| RunA |
RunB |
UNRESOLVED (Run set 1 and Run set 2) |
A |
A |
| RunA |
RunB |
neither |
F |
F |
Discusssion
First if all, all 3 of these should be identical, except for report F (discussed below). The fact that they aren't is already a sign of a bug (likely several bugs).
Here are some docs regarding review status rules. Review status rules are created when a report's review status is categorized in the GUI.
This bit is mostly about what is the difference in between them and source code suppression.
When it comes to false positive (or intentional) review status rules, our current thinking is that if a report was categorized as such, every other report with the same bug hash is a false positive as well. (and as a result, a closed report). Mind that it doesn't matter whether a report's detection date is before or after the rule is set -- every already stored report, and those that will be stored in the future will have its review status set (and its fixed_at date as well). This to implies that in some sense, review status rules are a timeless property.
Source code suppressions, when stored on the server, won't create a review status rule. Reports marked as a FP in the source code only affect the report instance in that particular analysis (and are only closed in that run). This means that if the source code comment itself is removed from run to another, we regard it as a new outstanding report. This also implies that source code suppressions are NOT a timeless property -- reports E and C are not false positives, ONLY in the context of the runs that suppressed them.
We can definitely say that our logic regarding source code suppression doesn't work right at the moment. Aside from the remote-remote case, reports E and C are almost always misplaced.
In short, source code suppressions and review status rule suppressions are handled very differently for diffs, but look identical. The first row is suppressed with a comment, the second one with a review status rule.

This is not necessarily a problem, just something to be aware of.
Tag diffs
Preface to the results
When storing under the same run name but with different tags, the statement above on different report instances no longer hold. A report in tag1 and tag2 don't just look equivalent, they refer to the same bug report instance. For this reason, the fixed_at date can be changed by a subsequent tag!
The following example (among other shortcomings) showcases that the result of diff(tag1, tag2) before storing a new tag onto the run, than after it.
Store commands:
CodeChecker store result1/ -n "SingleRun" --tag tag1 --url 0.0.0.0/5569
CodeChecker store result2/ -n "SingleRun" --tag tag2 --url 0.0.0.0/5569
FIXME: Tags don't seem to work at all. I created a separate issue for this: #3889
Later, the results in result1/ will be stored again under the tag tag3.
CodeChecker store result1/ -n "SingleRun" --tag tag3 --url 0.0.0.0/5569
Summary
| Report ID |
Checker name |
In test1.cpp? (tag1) |
In test2.cpp? (tag2) |
In test1.cpp? (tag3) |
| A |
core.DivideZero |
✔️ |
✔️ |
✔️ |
| B |
core.NullDereference |
✔️ |
❌ |
✔️ |
| C |
deadcode.DeadStores |
✔️ |
✔️, src suppressed |
✔️ |
| D |
cplusplus.NewDeleteLeaks |
❌ |
✔️ |
❌ |
| E |
unix.MismatchedDeallocator |
✔️, src suppressed |
✔️ |
✔️, src suppressed |
| F |
cplusplus.NewDelete |
✔️, GUI suppressed |
✔️, GUI suppressed |
✔️, GUI suppressed |
Local-Remote
CodeChecker cmd diff -b result1/ -n SingleRun --tag tag2 --new --review-status --detection-status
CodeChecker cmd diff -b result1/ -n SingleRun --tag tag2 --resolved --review-status --detection-status
CodeChecker cmd diff -b result1/ -n SingleRun --tag tag2 --unresolved --review-status --detection-status
(before tag3):
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| result1/ |
SingleRun tag2 |
NEW (only in Run set 2) |
D, E |
D, E |
| result1/ |
SingleRun tag2 |
RESOLVED (only in Run set 1) |
B, C |
B, C |
| result1/ |
SingleRun tag2 |
UNRESOLVED (Run set 1 and Run set 2) |
A |
A |
| result1/ |
SingleRun tag2 |
neither |
F |
F |
CodeChecker cmd diff -b SingleRun --tag tag1 -n result2/ --new --review-status --detection-status
CodeChecker cmd diff -b SingleRun --tag tag1 -n result2/ --resolved --review-status --detection-status
CodeChecker cmd diff -b SingleRun --tag tag1 -n result2/ --unresolved --review-status --detection-status
(before tag3):
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| SingleRun tag1 |
result2/ |
NEW (only in Run set 2) |
- |
D, E |
| SingleRun tag1 |
result2/ |
RESOLVED (only in Run set 1) |
B, C |
B, C |
| SingleRun tag1 |
result2/ |
UNRESOLVED (Run set 1 and Run set 2) |
A, D*, E** |
A |
| SingleRun tag1 |
result2/ |
neither |
F |
F |
*FIXME: How did this get here???
**Report E is no longer suppressed in the command line from tag1 to tag2, so this is a case where the fixed_at property was removed, but should not have been. For architectural reasons, it is impossible for previously stored tags to act as if no subsequent tags were stored on top of it (because the fixed_at date is irrecoverably overwritten). In the reverse case (report C) we could work around this: detection dates never chance (theoretically speaking), and we may be able to express that that the fixed_at date is precisely the detect_at date in tag2, and as such, is still outstanding in tag1.
Remote-Remote
No filters were applied in the GUI as a result (besides the default, which is no filters at all). TODO: Check cmd
(before tag3):
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| SingleRun tag1 |
SingleRun tag2 |
NEW (only in Run set 2) |
D |
D, E |
| SingleRun tag1 |
SingleRun tag2 |
RESOLVED (only in Run set 1) |
- |
B, C |
| SingleRun tag1 |
SingleRun tag2 |
UNRESOLVED (Run set 1 and Run set 2) |
A, E |
A |
| SingleRun tag1 |
SingleRun tag2 |
neither |
B, C, F |
F |
This should be identical to the non-tag remote-remote results.
(after tag3):
| Run set 1 (-b) |
Run set 2 (-n) |
Diff type |
Actual result |
Expected result |
| SingleRun tag1 |
SingleRun tag2 |
NEW (only in Run set 2) |
- |
D, E |
| SingleRun tag1 |
SingleRun tag2 |
RESOLVED (only in Run set 1) |
- |
B, C |
| SingleRun tag1 |
SingleRun tag2 |
UNRESOLVED (Run set 1 and Run set 2) |
D, A, E |
A |
| SingleRun tag1 |
SingleRun tag2 |
neither |
B, C, F |
F |
This should be identical to the results where tag3 hasn't been added yet.
Discussion
The problem here is that diffs are done on the latest status of the reports, but should instead be considered in the only in the context of the diff. If tag3 doesn't participate in the diff, we should act as if it never existed (surely when calculating the set of outstanding reports).
GUI problems
CodeChecker diff is a more complicated feature than it seems to be. The greatest problem with it is that diffing is not done on sets of reports, but rather sets of outstanding reports. Looking at this (cropped) screenshot, this is not evident at all:

Nowhere in the report page (after pressing diff) can you see that the result is the set of outstanding reports:
The icons for "Diff type" make this worse, since nothing explicitly states that we mean "Only outstanding reports in Baseline":
Meaning that it is easy to misinterpret the GUI and mistakenly believe that these are set operations on the set of all reports. The command line interface does a little better as one of the following 3 flags are manditory to
CodeChecker cmd diff:--new,--resolved,--unresolved.Issues with diff
Run diffs (without tags)
Suppose we have the following two files, test1.cpp and test2.cpp.
test1.cpp
Analysis results on test1.cpp are stored in the folder
result1, and remotely asRunA.Analysis:
Store:
Analysis results:
*When stored as RunA, on the GUI, a review status rule is set to false positive.
test2.cpp
Analysis results on test2.cpp are stored in the folder
result2, and remotely asRunB.Analysis:
Store:
Analysis results:
Summary
Preface to the results
Note that each report has its own instance. Two reports can be identical (to the point of having the same bug hash), but if they are in two different runs, they are not the same, and have their own row in the reports table. As a result, they have their own fixed_at date!
Diffing in theory is done with set operations on the set of outstanding reports. In practice, a report is considered outstanding (or open) if all of the following is true:
A report is closed (not outstanding) if any of the following is true:
However, in practice, we don't check these these properties, but whether they have their fixed_at property set. The idea is that outstanding reports do not have this property set, and closed reports do. Any deviation from this (unfortunately unwritten) rule is a bug, so whether fixed_at is set should perfectly reflect whether a report is closed.
Since diffing is done on the set of outstanding reports, filters on the diff page (seen in an image above) don't matter much, reports with false positive or intentional review statuses should never appear, nor reports with resolved or off detection status.
Local-Local results
Commands used for the results:
Note that the last two options are there to make sure that all review statuses and detection statuses are included.
*TODO: Consider adding a feature where we query the review status rule even in the local-local case: if we did that, F should be in the neither set.
Local-Remote / Remote-Local results
Commands used for the results:
*FIXME: Its a reasonable suspicion that we don't precalculate the set of outstanding reports AND THEN do set calculations, which could be why E is in two disjointed sets.
*FIXME: Its a reasonable suspicion that we don't precalculate the set of outstanding reports AND THEN do set calculations, which could be why C is in two disjointed sets.
Remote-Remote results:
No filters were applied in the GUI as a result (besides the default, which is no filters at all). I also checked the results on the coammnd line, and got the same thing.
Commands for the cmd:
Discusssion
First if all, all 3 of these should be identical, except for report F (discussed below). The fact that they aren't is already a sign of a bug (likely several bugs).
Here are some docs regarding review status rules. Review status rules are created when a report's review status is categorized in the GUI.
This bit is mostly about what is the difference in between them and source code suppression.
When it comes to false positive (or intentional) review status rules, our current thinking is that if a report was categorized as such, every other report with the same bug hash is a false positive as well. (and as a result, a closed report). Mind that it doesn't matter whether a report's detection date is before or after the rule is set -- every already stored report, and those that will be stored in the future will have its review status set (and its fixed_at date as well). This to implies that in some sense, review status rules are a timeless property.
Source code suppressions, when stored on the server, won't create a review status rule. Reports marked as a FP in the source code only affect the report instance in that particular analysis (and are only closed in that run). This means that if the source code comment itself is removed from run to another, we regard it as a new outstanding report. This also implies that source code suppressions are NOT a timeless property -- reports E and C are not false positives, ONLY in the context of the runs that suppressed them.
We can definitely say that our logic regarding source code suppression doesn't work right at the moment. Aside from the remote-remote case, reports E and C are almost always misplaced.
In short, source code suppressions and review status rule suppressions are handled very differently for diffs, but look identical. The first row is suppressed with a comment, the second one with a review status rule.

This is not necessarily a problem, just something to be aware of.
Tag diffs
Preface to the results
When storing under the same run name but with different tags, the statement above on different report instances no longer hold. A report in tag1 and tag2 don't just look equivalent, they refer to the same bug report instance. For this reason, the fixed_at date can be changed by a subsequent tag!
The following example (among other shortcomings) showcases that the result of diff(tag1, tag2) before storing a new tag onto the run, than after it.
Store commands:
FIXME: Tags don't seem to work at all. I created a separate issue for this: #3889
Later, the results in
result1/will be stored again under the tagtag3.Summary
Local-Remote
(before tag3):
(before tag3):
*FIXME: How did this get here???
**Report E is no longer suppressed in the command line from tag1 to tag2, so this is a case where the fixed_at property was removed, but should not have been. For architectural reasons, it is impossible for previously stored tags to act as if no subsequent tags were stored on top of it (because the fixed_at date is irrecoverably overwritten). In the reverse case (report C) we could work around this: detection dates never chance (theoretically speaking), and we may be able to express that that the fixed_at date is precisely the detect_at date in tag2, and as such, is still outstanding in tag1.
Remote-Remote
No filters were applied in the GUI as a result (besides the default, which is no filters at all). TODO: Check cmd
(before tag3):
This should be identical to the non-tag remote-remote results.
(after tag3):
This should be identical to the results where tag3 hasn't been added yet.
Discussion
The problem here is that diffs are done on the latest status of the reports, but should instead be considered in the only in the context of the diff. If
tag3doesn't participate in the diff, we should act as if it never existed (surely when calculating the set of outstanding reports).