Skip to content

Conversation

@yiguolei
Copy link
Contributor

@yiguolei yiguolei commented Apr 6, 2022

Currently, there are 2 status code in BE, one is common/Status.h, and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations, I will unify these code to common/Status.

Proposed changes

Issue Number: close #xxx

Problem Summary:

Describe the overview of changes.

Checklist(Required)

  1. Does it affect the original behavior: (Yes/No/I Don't know)
  2. Has unit tests been added: (Yes/No/No Need)
  3. Has document been added or modified: (Yes/No/No Need)
  4. Does it need to update dependencies: (Yes/No)
  5. Are there any changes that cannot be rolled back: (Yes/No)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@yiguolei yiguolei changed the title [Refactor] remove OLAPStatus to Status [Refactor] Change ALL OLAPStatus to Status Apr 6, 2022
jackwener
jackwener previously approved these changes Apr 7, 2022
Copy link
Member

@jackwener jackwener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR

@jackwener jackwener added the kind/refactor Issues or PRs to refactor code label Apr 7, 2022
@morningman
Copy link
Contributor

We should separate this PR

@morningman morningman added the dev/backlog waiting to be merged in future dev branch label Apr 7, 2022
@yiguolei yiguolei force-pushed the remove_olap_status branch 2 times, most recently from e70fdbe to c513b19 Compare April 10, 2022 13:02
# Set boost/stacktrace use backtrace api to unwind
add_definitions(-DBOOST_STACKTRACE_USE_BACKTRACE)
add_definitions(-DPRINT_ALL_ERR_STATUS_STACKTRACE)
#add_definitions(-DPRINT_ALL_ERR_STATUS_STACKTRACE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why comment out this config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this marco is enabled, BE will print all error status to log, it may cause performance problem. So that I disabled it by default. Should enable it manually if it is needed.

Status err_code =
SnapshotManager::instance()->make_snapshot(snapshot_request, &snapshot_path, &allow_incremental_clone);
if (err_code != OLAP_SUCCESS) {
if (!err_code) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not err_code.ok()? !err_code is same with !err_code.ok() ??

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maybe ret_code is better.
err_code is not a good name.It seems like !error_code means no error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not modify variable name in this PR. Because there maybe multiple status variable in same function. Modify the variable name may cause logic bug. And also, there are too many variables, it may take lot of time.
We may refactor the status variable name in other PRs.

@yiguolei yiguolei force-pushed the remove_olap_status branch from 51bd540 to df40099 Compare April 13, 2022 03:40
if (res == OLAP_ERR_BE_NO_SUITABLE_VERSION) {
res = base_compaction.compact();
if (!res) {
if (res == Status::OLAPInternalError(OLAP_ERR_BE_NO_SUITABLE_VERSION)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you call Status::OLAPInternalError only for checking the error code?
Why not do thing like if (res.err_code == OLAP_ERR_BE_NO_SUITABLE_VERSION)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We'd better not change logic here. And there are many code like this because I use search and replace to change the code. Maybe we could use some code like isNoSuitableVersionError here to check it.

Status s = memtable->flush();
LOG(WARNING) << "Flush memtable failed with res = " << s;
// If s is not ok, ignore the code, just use other code is ok
_flush_status.store(s.ok() ? OLAP_SUCCESS : OLAP_ERR_OTHER_ERROR);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using the code from s?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error code in S is int16_t, it may not be converted to ErrorCode. And I find we did not justify the errorcode in other code. So that I print the error stack in LOG(WARNING) and using OTHER_ERROR here.

beta_rowset_reader.cpp
beta_rowset_writer.cpp)

target_compile_options(Rowset PUBLIC "-Wno-error=maybe-uninitialized")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some variables in rowset is not initialized, so that I has to add this option to compile successfully. I will add another PR to init all variables and remove this option.

yiguolei and others added 2 commits April 13, 2022 19:40
if (largest_segment_group->find_short_key(end_key, &helper_cursor, false, &end_pos) !=
OLAP_SUCCESS) {
if (largest_segment_group->find_last_row_block(&end_pos) != OLAP_SUCCESS) {
if (!largest_segment_group->find_short_key(end_key, &helper_cursor, false, &end_pos)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!largest_segment_group->find_short_key(end_key, &helper_cursor, false, &end_pos)) {
if (largest_segment_group->find_short_key(end_key, &helper_cursor, false, &end_pos) != Status::OK()) {

OLAP_SUCCESS) {
if (largest_segment_group->find_last_row_block(&end_pos) != OLAP_SUCCESS) {
if (!largest_segment_group->find_short_key(end_key, &helper_cursor, false, &end_pos)) {
if (!largest_segment_group->find_last_row_block(&end_pos)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!largest_segment_group->find_last_row_block(&end_pos)) {
if (largest_segment_group->find_last_row_block(&end_pos) != Status::OK()) {


if (_byte_reader->has_next()) {
if (OLAP_SUCCESS != (res = _byte_reader->next(&_current))) {
if (!(res = _byte_reader->next(&_current))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!(res = _byte_reader->next(&_current))) {
if (_byte_reader->next(&_current) != Status::OK()) {

Copy link
Contributor Author

@yiguolei yiguolei Apr 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we do not modify logic here, it will make the refactor work very difficult and may import bugs. For example, in this case, the res value maybe used in the latter code.


if (0 == _bits_left) {
if (OLAP_SUCCESS != (res = _read_byte())) {
if (!(res = _read_byte())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!(res = _read_byte())) {
if (_read_byte() != res) {

Status res = Status::OK();

if (OLAP_SUCCESS != (res = _byte_reader->seek(position))) {
if (!(res = _byte_reader->seek(position))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (!(res = _byte_reader->seek(position))) {
if (_byte_reader->seek(position) != res) {

if (OLAP_SUCCESS != res) {
OLAP_LOG_WARNING("fail to init segment reader. [res=%d]", res);
if (!res.ok()) {
LOG(WARNING) << "fail to init segment reader. res = " << res;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using OLAP_LOG_WARNING?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OLAP_LOG_XXXX will be replaced with LOG(WARNING) in the future. LOGGING SYSTEM should also be uniformed.

Copy link
Contributor

@caiconghui caiconghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic LGTM, but we need to keep one check status style in the future.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Apr 13, 2022
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@yiguolei
Copy link
Contributor Author

Add some performance test:

class Status {

public:

Status() {
_length = 0;
_precise_code = 0;
}

Status(int i) {
_precise_code = i % 17;
}

static Status OK() {
return Status();
}

static Status ERROR(int i) {
return Status(i);
}

bool operator == (const Status& st) { return _precise_code == st._precise_code; }

union {
char _state[2048];

  struct {
      // Message length == HEADER(7 bytes) + message size
      // Sometimes error message is empty, so that we could not use length==0 to indicate
      // whether there is error happens
      int64_t _length : 32;      
      int64_t _code : 8;          
      int64_t _precise_code : 16;
      int64_t _message : 8;       // save message since here
  };

};

};

static void newstatus(benchmark::State& state) {
// Code inside this loop is measured repeatedly
bool result = false;
int i = 0;
for (auto _ : state) {
Status status(i - 1);
++i;
if (status == Status::ERROR(i)) {
result = true;
} else {
result = false;
}
}
}
// Register the function as a benchmark
BENCHMARK(newstatus);

static void olapstatus(benchmark::State& state) {
// Code before the loop is not measured
bool result = false;
int i = 0;
for (auto _ : state) {
//Status status(i - 1);
++i;
if (i == 16) {
result = true;
} else {
result = false;
}
}
}
BENCHMARK(olapstatus);

new status is slower than olapstatus about 3.6%.

Since there are lot of other work during query and the status is returned batch by batch, so that the impact could be ignored.

@yiguolei
Copy link
Contributor Author

Add SSB benchmark:
Environment:

  1. ONE BE, ONE FE
  2. CREATE LINEORDER TABLE with bucket num == 1, lineorder table with 600037902 rows.
  3. set vectoried_engine = true;
  4. set scanner thread == 1;
  5. using select count(lo_shipmode) from lineorder; to test

new status:
image

old olapstatus:
image

It is very curious new status is a litter better than olapstatus.

@morningman
Copy link
Contributor

morningman commented Apr 14, 2022

I tested with jmeter, 50 concurrency, 1 FE, 1 BE
select * from tbl1 where k1=1 limit 1

The avg qps decreased from 1400/s to 1300/s, about 7% performance impact at high concurrency senario.
So I think it deserved to improve it.

But I think this PR can be merged first, and improve it later

@morningman morningman merged commit e5e0dc4 into apache:master Apr 14, 2022
morningman added a commit that referenced this pull request Apr 14, 2022
1. The compile bug is introduced from #8855
2. FE ut bug is introduced from #8848 and #8770
weizhengte pushed a commit to weizhengte/incubator-doris that referenced this pull request Apr 22, 2022
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
weizhengte pushed a commit to weizhengte/incubator-doris that referenced this pull request Apr 22, 2022
1. The compile bug is introduced from apache#8855
2. FE ut bug is introduced from apache#8848 and apache#8770
zhengshiJ pushed a commit to zhengshiJ/incubator-doris that referenced this pull request Apr 27, 2022
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
zhengshiJ pushed a commit to zhengshiJ/incubator-doris that referenced this pull request Apr 27, 2022
1. The compile bug is introduced from apache#8855
2. FE ut bug is introduced from apache#8848 and apache#8770
starocean999 pushed a commit to starocean999/incubator-doris that referenced this pull request May 19, 2022
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
starocean999 pushed a commit to starocean999/incubator-doris that referenced this pull request May 19, 2022
1. The compile bug is introduced from apache#8855
2. FE ut bug is introduced from apache#8848 and apache#8770
englefly pushed a commit to englefly/incubator-doris that referenced this pull request May 23, 2022
1. The compile bug is introduced from apache#8855
2. FE ut bug is introduced from apache#8848 and apache#8770
@morningman morningman mentioned this pull request Nov 21, 2022
@yiguolei yiguolei deleted the remove_olap_status branch March 30, 2023 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. area/vectorization dev/backlog waiting to be merged in future dev branch kind/refactor Issues or PRs to refactor code reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants