@xinyiZzz
Contributor

Proposed changes

This PR mainly includes:

  • Layering the OLAP_SCAN_NODE profile into three levels: OLAP_SCAN_NODE, OlapScanner, and SegmentIterator.
  • Deleting meaningless statistics, mainly in scan_node.cpp.
  • Adding the RowsConditionsFiltered counter, split out from RowsDelFiltered. It records the number of rows filtered by the various column indexes and exists only in segment V2.
  • Updating the documentation to match the changes above and improving its readability.
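
The layering described above can be pictured with a minimal sketch. `ProfileNode` and `build_scan_profile` are hypothetical names invented for illustration and are not the Doris `RuntimeProfile` API; which layer holds each counter is also an assumption here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a profile node; this is NOT the Doris RuntimeProfile API.
class ProfileNode {
public:
    explicit ProfileNode(std::string name) : _name(std::move(name)) {}

    const std::string& name() const { return _name; }

    // Creates a child owned by this node and returns a non-owning pointer to it.
    ProfileNode* add_child(const std::string& name) {
        _children.push_back(std::make_unique<ProfileNode>(name));
        return _children.back().get();
    }

    // Returns the direct child with the given name, or nullptr if there is none.
    const ProfileNode* find_child(const std::string& name) const {
        for (const auto& child : _children) {
            if (child->name() == name) return child.get();
        }
        return nullptr;
    }

    void update_counter(const std::string& counter_name, std::int64_t delta) {
        _counters[counter_name] += delta;
    }

    // Returns the counter value, or 0 if the counter was never updated.
    std::int64_t counter(const std::string& counter_name) const {
        auto it = _counters.find(counter_name);
        return it == _counters.end() ? 0 : it->second;
    }

private:
    std::string _name;
    std::map<std::string, std::int64_t> _counters;
    std::vector<std::unique_ptr<ProfileNode>> _children;
};

// Builds the three layers named in the PR description.
// Assumption: RowsDelFiltered sits on OlapScanner and
// RowsConditionsFiltered on SegmentIterator.
std::unique_ptr<ProfileNode> build_scan_profile(std::int64_t rows_del_filtered,
                                                std::int64_t rows_conditions_filtered) {
    assert(rows_del_filtered >= 0 && rows_conditions_filtered >= 0);
    auto root = std::make_unique<ProfileNode>("OLAP_SCAN_NODE");
    ProfileNode* scanner = root->add_child("OlapScanner");
    ProfileNode* segment = scanner->add_child("SegmentIterator");
    scanner->update_counter("RowsDelFiltered", rows_del_filtered);
    segment->update_counter("RowsConditionsFiltered", rows_conditions_filtered);
    return root;
}
```

Keeping the two counters separate lets a reader tell rows dropped by delete handling apart from rows dropped by index-based conditions.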

Types of changes

  • Documentation Update (if none of the other choices apply)
  • Code refactor (Modify the code structure, format the code, etc...)

}

void ScanNode::init_scan_profile() {
_scanner_profile.reset(new RuntimeProfile("OlapScanner"));
Contributor

Where is this _scanner_profile shown? It is not added to _runtime_profile.

RuntimeProfile::Counter* _num_scanner_threads_started_counter;

boost::scoped_ptr<RuntimeProfile> _scanner_profile;
boost::scoped_ptr<RuntimeProfile> _segment_profile;
Contributor

Use std::unique_ptr instead of boost::scoped_ptr.
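
As a rough illustration of this suggestion, the members can be declared as `std::unique_ptr` without changing the `reset(new ...)` call sites, since both smart pointer types provide `reset()` and `get()`. `Profile` and `ScanProfiles` below are hypothetical stand-ins, not Doris types.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical minimal profile type, used only for this sketch.
struct Profile {
    explicit Profile(std::string n) : name(std::move(n)) {}
    std::string name;
};

class ScanProfiles {
public:
    void init() {
        assert(!_scanner_profile && !_segment_profile);  // init() is expected to run once
        // Same call shape as with boost::scoped_ptr.
        _scanner_profile.reset(new Profile("OlapScanner"));
        // std::make_unique (C++14) is an equivalent alternative.
        _segment_profile = std::make_unique<Profile>("SegmentIterator");
    }

    const Profile* scanner() const { return _scanner_profile.get(); }
    const Profile* segment() const { return _segment_profile.get(); }

private:
    std::unique_ptr<Profile> _scanner_profile;
    std::unique_ptr<Profile> _segment_profile;
};

// Like boost::scoped_ptr, std::unique_ptr members make the owner non-copyable;
// unlike it, they still allow the owner to be moved.
static_assert(!std::is_copy_constructible<ScanProfiles>::value, "owner is not copyable");
static_assert(std::is_move_constructible<ScanProfiles>::value, "owner is movable");
```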

_scanner_profile.reset(new RuntimeProfile("OlapScanner"));
runtime_profile()->add_child(_scanner_profile.get(), true, NULL);

_segment_profile.reset(new RuntimeProfile("SegmentIterator"));
Contributor

Should _segment_profile and _scanner_profile live in ScanNode?
MysqlScanNode, for example, is also a subclass of ScanNode, and _segment_profile and _scanner_profile are useless there.

@xinyiZzz
Contributor Author

xinyiZzz commented Nov 2, 2020

@HappenLee Good question. I have moved _segment_profile and _scanner_profile to OlapScanNode and updated the document.
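
The ownership change discussed in this thread might look roughly like the sketch below. The classes are simplified stand-ins rather than the real Doris declarations, the `MYSQL_SCAN_NODE` profile name is a placeholder, and nesting `SegmentIterator` under `OlapScanner` is an assumption based on the layering listed in the PR description.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal profile type with non-owning child links.
struct Profile {
    explicit Profile(std::string n) : name(std::move(n)) {}
    std::string name;
    std::vector<Profile*> children;
};

// The base class keeps only what every scan node needs.
class ScanNode {
public:
    explicit ScanNode(const std::string& name) : _runtime_profile(new Profile(name)) {}
    virtual ~ScanNode() = default;
    Profile* runtime_profile() const { return _runtime_profile.get(); }

private:
    std::unique_ptr<Profile> _runtime_profile;
};

// Only the OLAP scan node owns the scanner and segment profiles.
class OlapScanNode : public ScanNode {
public:
    OlapScanNode() : ScanNode("OLAP_SCAN_NODE") { init_scan_profile(); }

private:
    void init_scan_profile() {
        assert(runtime_profile() != nullptr);
        _scanner_profile.reset(new Profile("OlapScanner"));
        runtime_profile()->children.push_back(_scanner_profile.get());
        _segment_profile.reset(new Profile("SegmentIterator"));
        _scanner_profile->children.push_back(_segment_profile.get());
    }

    std::unique_ptr<Profile> _scanner_profile;
    std::unique_ptr<Profile> _segment_profile;
};

// Other scan nodes no longer carry profiles they never fill in.
class MysqlScanNode : public ScanNode {
public:
    MysqlScanNode() : ScanNode("MYSQL_SCAN_NODE") {}  // placeholder profile name
};
```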

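- Many metrics under `OlapScanner`, such as `IOTimer` and `BlockFetchTime`, are accumulated across all Scanner threads, so their values can be large. Because Scanner threads read data asynchronously, these accumulated metrics only reflect the total working time of the Scanners and do not directly represent the time spent in the ScanNode. The ScanNode's share of time in the whole query plan is the value recorded in the `Active` field. Sometimes `IOTimer` is tens of seconds while `Active` is only a few seconds. This usually happens because:
    - `IOTimer` is the accumulated time of multiple Scanners, and there are many Scanners.
    - The upper-level node is slow. For example, the upper-level node takes 100 seconds while the underlying ScanNode needs only 10 seconds. `Active` may then show only a few milliseconds, because while the upper-level node is processing data, the ScanNode has already scanned and prepared data asynchronously. When the upper-level node fetches data from the ScanNode, the data is already prepared, so the Active time is very short.
- `NumScanners` is the number of Scanner threads. Too many or too few threads both hurt query efficiency. You can also divide some aggregated metrics by the number of threads to roughly estimate the time spent per thread.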
Contributor

NumScanners should represent the number of tasks submitted to the thread pool; the number of threads cannot be derived directly from it.
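
The rough estimate the document describes, a cumulative metric divided by `NumScanners`, can be sketched as below. Following the review comment, the result is read as an average per scanner task rather than per thread. The function name and the sample figures (40 seconds of `IOTimer`, 16 scanners) are illustrative assumptions, not values from a real profile.

```cpp
#include <cassert>
#include <cstdint>

// Rough average of a cumulative timer (e.g. IOTimer) per scanner task.
// Returns 0 when no scanner task was submitted, to avoid dividing by zero.
double avg_seconds_per_scanner(double cumulative_seconds, std::int64_t num_scanners) {
    assert(cumulative_seconds >= 0.0);
    if (num_scanners <= 0) {
        return 0.0;
    }
    return cumulative_seconds / static_cast<double>(num_scanners);
}
```

For example, `avg_seconds_per_scanner(40.0, 16)` gives 2.5 seconds per task, which is much closer to an `Active` value of a few seconds than the cumulative 40 seconds is.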

@HappenLee
Contributor

LGTM

Contributor

@morningman left a comment

LGTM

@morningman added the kind/docs, kind/improvement, kind/refactor, and approved labels Nov 9, 2020
@morningman merged commit 66132d2 into apache:master Nov 11, 2020
morningman added a commit that referenced this pull request Nov 12, 2020
Fix UT failed by #4825 and remove useless profile
morningman pushed a commit that referenced this pull request Nov 13, 2020
A bug introduced by PR #4825 causes `schema_change` to report an error:
```
schema_change.cpp:1271] fail to check row num! source_rows=1, merged_rows=0, filtered_rows=0, new_index_rows=0
schema_change.cpp:1921] failed to process the version. version=2-2
schema_change.cpp:1615] failed to alter tablet. base_tablet=44643.1383650721.b140317f6662c1e0-65bcbc87db8d22bc, drop new_tablet=45680.1530531459.474e41f3dd538fb6-9284085daac24f83
```
@yangzhg mentioned this pull request Feb 9, 2021

Development

Successfully merging this pull request may close these issues.

[Feature] Running Profile OLAP_SCAN_NODE node has poor readability