Skip to content

Conversation

@hubgeter
Copy link
Contributor

@hubgeter hubgeter commented Aug 8, 2023

Proposed changes

add information_schema.metadata_name_idsfor quickly get catlogs,db,table.

  1. table struct :
mysql> desc  internal.information_schema.metadata_name_ids;
+---------------+--------------+------+-------+---------+-------+
| Field         | Type         | Null | Key   | Default | Extra |
+---------------+--------------+------+-------+---------+-------+
| CATALOG_ID    | BIGINT       | Yes  | false | NULL    |       |
| CATALOG_NAME  | VARCHAR(512) | Yes  | false | NULL    |       |
| DATABASE_ID   | BIGINT       | Yes  | false | NULL    |       |
| DATABASE_NAME | VARCHAR(64)  | Yes  | false | NULL    |       |
| TABLE_ID      | BIGINT       | Yes  | false | NULL    |       |
| TABLE_NAME    | VARCHAR(64)  | Yes  | false | NULL    |       |
+---------------+--------------+------+-------+---------+-------+
6 rows in set (0.00 sec) 


mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G;
*************************** 1. row ***************************
   CATALOG_ID: 113008
 CATALOG_NAME: hive1
  DATABASE_ID: 113042
DATABASE_NAME: ssb1_parquet
     TABLE_ID: 114009
   TABLE_NAME: dates
1 row in set (0.07 sec)
  1. when you create / drop catalog , need not refresh catalog .
mysql> select count(*) from internal.information_schema.metadata_name_ids\G; 
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.34 sec)


mysql> drop catalog hive2;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G; 
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec) 


mysql> create catalog hive3 ... 
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;                                                                        
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.32 sec)
  1. create / drop table , need not refresh catalog .
mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ;


mysql> select count(*) from internal.information_schema.metadata_name_ids\G; 
*************************** 1. row ***************************
count(*): 10666
1 row in set (0.04 sec)

mysql> drop table demo.example_tbl;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G; 
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec) 
  1. you can set query time , prevent queries from taking too long .

fe.conf :  query_metadata_name_ids_timeout 

the time used to obtain all tables in one database

  1. add information_schema.profiling in order to Compatible with mysql
mysql> select * from information_schema.profiling;
Empty set (0.07 sec)

mysql> set profiling=1;                                                                                 
Query OK, 0 rows affected (0.01 sec)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions
Copy link
Contributor

github-actions bot commented Aug 8, 2023

clang-tidy review says "All clean, LGTM! 👍"

1 similar comment
@github-actions
Copy link
Contributor

github-actions bot commented Aug 8, 2023

clang-tidy review says "All clean, LGTM! 👍"

@hubgeter hubgeter changed the title add information_schema.simple_tables for quickly get catlogs,db,table. [feature](information_schema)add simple_tables for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql Aug 8, 2023
Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add regression test cases


//query internal.information_schema.simple_tables time
@VariableMgr.VarAttr(name = QUERY_SIMPLE_TABLES_TIMEOUT)
public int querySimpleTablesTimeout = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it can be a FE config, not session variable.
Because we always set it globally

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok good idea

SCH_COLUMN_STATISTICS,
SCH_PARAMETERS;
SCH_PARAMETERS,
SCH_SIMPLE_TABLES,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SCH_SIMPLE_TABLES,
SCH_SIMPLE_TABLES,

SCH_PARAMETERS;
SCH_PARAMETERS,
SCH_SIMPLE_TABLES,
SCH_PROFILING;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
SCH_PROFILING;
SCH_PROFILING;

}

struct TSimpleTableStatus {
1: required string name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use 4 spaces instead of tab.
and use optional instead of required

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok good idea

2: required i64 id
}
struct TListSimpleTableStatusResult {
1: required list<TSimpleTableStatus> tables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

.column("ROUTINE_TYPE", ScalarType.createVarchar(64))
.column("DATA_TYPEDTD_IDENDS", ScalarType.createVarchar(64))
.build()))
.put("simple_tables", new SchemaTable(SystemIdGenerator.getNextId(), "simple_tables", TableType.SCHEMA,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about just name it as metadata_name_ids?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok good idea

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

1 similar comment
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hubgeter hubgeter changed the title [feature](information_schema)add simple_tables for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql [feature](information_schema)add metadata_name_ids for quickly get catlogs,db,table and add profiling table in order to Compatible with mysql Aug 11, 2023
@hubgeter
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hubgeter
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hello-stephen
Copy link
Contributor

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 44.7 seconds
stream load tsv: 535 seconds loaded 74807831229 Bytes, about 133 MB/s
stream load json: 21 seconds loaded 2358488459 Bytes, about 107 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.3 seconds inserted 10000000 Rows, about 341K ops/s
storage size: 17162394725 Bytes

morningman
morningman previously approved these changes Aug 20, 2023
Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 20, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

morningman
morningman previously approved these changes Aug 24, 2023
Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hubgeter
Copy link
Contributor Author

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Aug 25, 2023
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hello-stephen
Copy link
Contributor

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 46.38 seconds
stream load tsv: 537 seconds loaded 74807831229 Bytes, about 132 MB/s
stream load json: 20 seconds loaded 2358488459 Bytes, about 112 MB/s
stream load orc: 65 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.5 seconds inserted 10000000 Rows, about 338K ops/s
storage size: 17162092058 Bytes

@hubgeter
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@hello-stephen
Copy link
Contributor

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 47.5 seconds
stream load tsv: 537 seconds loaded 74807831229 Bytes, about 132 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 64 seconds loaded 1101869774 Bytes, about 16 MB/s
stream load parquet: 31 seconds loaded 861443392 Bytes, about 26 MB/s
insert into select: 29.4 seconds inserted 10000000 Rows, about 340K ops/s
storage size: 17162023308 Bytes

Copy link
Contributor

@morningman morningman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Aug 25, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit e680d42 into apache:master Aug 31, 2023
hubgeter added a commit to hubgeter/doris that referenced this pull request Sep 1, 2023
…tlogs,db,table and add profiling table in order to Compatible with mysql (apache#22702)

add information_schema.metadata_name_idsfor quickly get catlogs,db,table.

1. table  struct :
```mysql
mysql> desc  internal.information_schema.metadata_name_ids;
+---------------+--------------+------+-------+---------+-------+
| Field         | Type         | Null | Key   | Default | Extra |
+---------------+--------------+------+-------+---------+-------+
| CATALOG_ID    | BIGINT       | Yes  | false | NULL    |       |
| CATALOG_NAME  | VARCHAR(512) | Yes  | false | NULL    |       |
| DATABASE_ID   | BIGINT       | Yes  | false | NULL    |       |
| DATABASE_NAME | VARCHAR(64)  | Yes  | false | NULL    |       |
| TABLE_ID      | BIGINT       | Yes  | false | NULL    |       |
| TABLE_NAME    | VARCHAR(64)  | Yes  | false | NULL    |       |
+---------------+--------------+------+-------+---------+-------+
6 rows in set (0.00 sec)

mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G;
*************************** 1. row ***************************
   CATALOG_ID: 113008
 CATALOG_NAME: hive1
  DATABASE_ID: 113042
DATABASE_NAME: ssb1_parquet
     TABLE_ID: 114009
   TABLE_NAME: dates
1 row in set (0.07 sec)
```

2. when you create / drop catalog , need not refresh catalog .
```mysql
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.34 sec)

mysql> drop catalog hive2;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

mysql> create catalog hive3 ...
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.32 sec)
```

3. create / drop table , need not refresh catalog .
```mysql
mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ;

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10666
1 row in set (0.04 sec)

mysql> drop table demo.example_tbl;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

```

4. you can set query time , prevent queries from taking too long .
```

fe.conf :  query_metadata_name_ids_timeout

the time used to obtain all tables in one database

```
5. add information_schema.profiling in order to Compatible with  mysql

```mysql
mysql> select * from information_schema.profiling;
Empty set (0.07 sec)

mysql> set profiling=1;
Query OK, 0 rows affected (0.01 sec)
```
hubgeter added a commit to hubgeter/doris that referenced this pull request Sep 5, 2023
…tlogs,db,table and add profiling table in order to Compatible with mysql (apache#22702)

add information_schema.metadata_name_idsfor quickly get catlogs,db,table.

1. table  struct :
```mysql
mysql> desc  internal.information_schema.metadata_name_ids;
+---------------+--------------+------+-------+---------+-------+
| Field         | Type         | Null | Key   | Default | Extra |
+---------------+--------------+------+-------+---------+-------+
| CATALOG_ID    | BIGINT       | Yes  | false | NULL    |       |
| CATALOG_NAME  | VARCHAR(512) | Yes  | false | NULL    |       |
| DATABASE_ID   | BIGINT       | Yes  | false | NULL    |       |
| DATABASE_NAME | VARCHAR(64)  | Yes  | false | NULL    |       |
| TABLE_ID      | BIGINT       | Yes  | false | NULL    |       |
| TABLE_NAME    | VARCHAR(64)  | Yes  | false | NULL    |       |
+---------------+--------------+------+-------+---------+-------+
6 rows in set (0.00 sec)

mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G;
*************************** 1. row ***************************
   CATALOG_ID: 113008
 CATALOG_NAME: hive1
  DATABASE_ID: 113042
DATABASE_NAME: ssb1_parquet
     TABLE_ID: 114009
   TABLE_NAME: dates
1 row in set (0.07 sec)
```

2. when you create / drop catalog , need not refresh catalog .
```mysql
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.34 sec)

mysql> drop catalog hive2;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

mysql> create catalog hive3 ...
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.32 sec)
```

3. create / drop table , need not refresh catalog .
```mysql
mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ;

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10666
1 row in set (0.04 sec)

mysql> drop table demo.example_tbl;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

```

4. you can set query time , prevent queries from taking too long .
```

fe.conf :  query_metadata_name_ids_timeout

the time used to obtain all tables in one database

```
5. add information_schema.profiling in order to Compatible with  mysql

```mysql
mysql> select * from information_schema.profiling;
Empty set (0.07 sec)

mysql> set profiling=1;
Query OK, 0 rows affected (0.01 sec)
```
xiaokang pushed a commit that referenced this pull request Sep 5, 2023
…tlogs,db,table and add profiling table in order to Compatible with mysql (#22702) (#23753)
BiteTheDDDDt added a commit that referenced this pull request Sep 5, 2023
…y get catlogs,db,table and add profiling table in order to Compatible with mysql (#22702) (#23753)"

This reverts commit 39c339c.
BiteTheDDDDt added a commit that referenced this pull request Sep 5, 2023
…y get catlogs,db,table and add profiling table in order to Compatible with mysql (#22702) (#23753)" (#23928)

This reverts commit 39c339c.
hubgeter added a commit to hubgeter/doris that referenced this pull request Sep 6, 2023
…tlogs,db,table and add profiling table in order to Compatible with mysql (apache#22702)

add information_schema.metadata_name_idsfor quickly get catlogs,db,table.

1. table  struct :
```mysql
mysql> desc  internal.information_schema.metadata_name_ids;
+---------------+--------------+------+-------+---------+-------+
| Field         | Type         | Null | Key   | Default | Extra |
+---------------+--------------+------+-------+---------+-------+
| CATALOG_ID    | BIGINT       | Yes  | false | NULL    |       |
| CATALOG_NAME  | VARCHAR(512) | Yes  | false | NULL    |       |
| DATABASE_ID   | BIGINT       | Yes  | false | NULL    |       |
| DATABASE_NAME | VARCHAR(64)  | Yes  | false | NULL    |       |
| TABLE_ID      | BIGINT       | Yes  | false | NULL    |       |
| TABLE_NAME    | VARCHAR(64)  | Yes  | false | NULL    |       |
+---------------+--------------+------+-------+---------+-------+
6 rows in set (0.00 sec)

mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G;
*************************** 1. row ***************************
   CATALOG_ID: 113008
 CATALOG_NAME: hive1
  DATABASE_ID: 113042
DATABASE_NAME: ssb1_parquet
     TABLE_ID: 114009
   TABLE_NAME: dates
1 row in set (0.07 sec)
```

2. when you create / drop catalog , need not refresh catalog .
```mysql
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.34 sec)

mysql> drop catalog hive2;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

mysql> create catalog hive3 ...
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.32 sec)
```

3. create / drop table , need not refresh catalog .
```mysql
mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ;

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10666
1 row in set (0.04 sec)

mysql> drop table demo.example_tbl;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

```

4. you can set query time , prevent queries from taking too long .
```

fe.conf :  query_metadata_name_ids_timeout

the time used to obtain all tables in one database

```
5. add information_schema.profiling in order to Compatible with  mysql

```mysql
mysql> select * from information_schema.profiling;
Empty set (0.07 sec)

mysql> set profiling=1;
Query OK, 0 rows affected (0.01 sec)
```
xiaokang pushed a commit that referenced this pull request Sep 6, 2023
…tlogs,db,table and add profiling table in order to Compatible with mysql (#22702) (#23980)
xiaokang added a commit that referenced this pull request Sep 6, 2023
…y get catlogs,db,table and add profiling table in order to Compatible with mysql (#22702) (#23980)"

This reverts commit 01ee595.
xiaokang added a commit that referenced this pull request Sep 6, 2023
…y get catlogs,db,table and add profiling table in order to Compatible with mysql (#22702) (#23980)" (#24002)

This reverts commit 01ee595.
hubgeter added a commit to hubgeter/doris that referenced this pull request Sep 7, 2023
…tlogs,db,table and add profiling table in order to Compatible with mysql (apache#22702)

add information_schema.metadata_name_idsfor quickly get catlogs,db,table.

1. table  struct :
```mysql
mysql> desc  internal.information_schema.metadata_name_ids;
+---------------+--------------+------+-------+---------+-------+
| Field         | Type         | Null | Key   | Default | Extra |
+---------------+--------------+------+-------+---------+-------+
| CATALOG_ID    | BIGINT       | Yes  | false | NULL    |       |
| CATALOG_NAME  | VARCHAR(512) | Yes  | false | NULL    |       |
| DATABASE_ID   | BIGINT       | Yes  | false | NULL    |       |
| DATABASE_NAME | VARCHAR(64)  | Yes  | false | NULL    |       |
| TABLE_ID      | BIGINT       | Yes  | false | NULL    |       |
| TABLE_NAME    | VARCHAR(64)  | Yes  | false | NULL    |       |
+---------------+--------------+------+-------+---------+-------+
6 rows in set (0.00 sec)

mysql> select * from internal.information_schema.metadata_name_ids where CATALOG_NAME="hive1" limit 1 \G;
*************************** 1. row ***************************
   CATALOG_ID: 113008
 CATALOG_NAME: hive1
  DATABASE_ID: 113042
DATABASE_NAME: ssb1_parquet
     TABLE_ID: 114009
   TABLE_NAME: dates
1 row in set (0.07 sec)
```

2. when you create / drop catalog , need not refresh catalog .
```mysql
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.34 sec)

mysql> drop catalog hive2;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

mysql> create catalog hive3 ...
mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 21301
1 row in set (0.32 sec)
```

3. create / drop table , need not refresh catalog .
```mysql
mysql> CREATE TABLE IF NOT EXISTS demo.example_tbl ... ;

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10666
1 row in set (0.04 sec)

mysql> drop table demo.example_tbl;
Query OK, 0 rows affected (0.01 sec)

mysql> select count(*) from internal.information_schema.metadata_name_ids\G;
*************************** 1. row ***************************
count(*): 10665
1 row in set (0.04 sec)

```

4. you can set query time , prevent queries from taking too long .
```

fe.conf :  query_metadata_name_ids_timeout

the time used to obtain all tables in one database

```
5. add information_schema.profiling in order to Compatible with  mysql

```mysql
mysql> select * from information_schema.profiling;
Empty set (0.07 sec)

mysql> set profiling=1;
Query OK, 0 rows affected (0.01 sec)
```
@xiaokang xiaokang mentioned this pull request Sep 30, 2023
@xiaokang xiaokang mentioned this pull request Dec 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/2.0.2-merged merge_conflict reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants