Skip to content

ddl: Add sql_require_primary_key sysvar#36146

Merged
ti-chi-bot merged 20 commits into
pingcap:masterfrom
joycse06:add-sql-require-primary-key
Aug 3, 2022
Merged

ddl: Add sql_require_primary_key sysvar#36146
ti-chi-bot merged 20 commits into
pingcap:masterfrom
joycse06:add-sql-require-primary-key

Conversation

@joycse06
Copy link
Copy Markdown
Contributor

@joycse06 joycse06 commented Jul 12, 2022

What problem does this PR solve?

Issue Number: close #28544

Problem Summary:

This is a feature from MySQL 8.0:
https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_sql_require_primary_key

It allows you to enforce tables have a primary key. In MySQL it is important because Row-based replication benefits from it.

With TiDB, TiCDC (mysql sink) will not replicate tables without a primary/unique key by default. If you chose to replicate tables without a primary key, the data will be duplicated. For our use-case this is an issue, since we use TiCDC to async replicate between clusters we intend to failover to.

What is changed and how it works?

sql_require_primary_key sysvar has been added. When enabled it will throw an error if a create table ddl doesn't have a primary key.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

The system variable `sql_require_primary_key` has been added. This behaves the same as the feature in MySQL 8.0, and when enabled tables can not be created without primary keys.

@joycse06 joycse06 requested a review from a team as a code owner July 12, 2022 10:48
@ti-chi-bot
Copy link
Copy Markdown
Member

ti-chi-bot commented Jul 12, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • dveeden
  • morgo

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Details

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 12, 2022
@sre-bot
Copy link
Copy Markdown
Contributor

sre-bot commented Jul 12, 2022

@joycse06
Copy link
Copy Markdown
Contributor Author

/run-unit-test
/run-check_dev_2

@dveeden
Copy link
Copy Markdown
Contributor

dveeden commented Jul 12, 2022

sql> SET sql_require_primary_key=ON;
Query OK, 0 rows affected (0.0007 sec)

sql> CREATE TABLE t1(id int primary key);
Query OK, 0 rows affected (0.0076 sec)

sql> CREATE TABLE t2(id SERIAL);
ERROR: 1173 (42000): This table type requires a primary key%!(EXTRA model.CIStr=t2)

sql> SET sql_require_primary_key=OFF;
Query OK, 0 rows affected (0.0007 sec)

sql> CREATE TABLE t2(id SERIAL);
Query OK, 0 rows affected (0.0091 sec)

sql> SHOW CREATE TABLE t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.0011 sec)

I think a unique key instead of a PK should be allowed. However I don't think this is common or advisable, so this is a very minor thing.

@dveeden
Copy link
Copy Markdown
Contributor

dveeden commented Jul 12, 2022

I don't think this should be allowed:

sql> SET SESSION sql_require_primary_key=ON;
Query OK, 0 rows affected (0.0007 sec)

sql> SHOW CREATE TABLE t4\G
*************************** 1. row ***************************
       Table: t4
Create Table: CREATE TABLE `t4` (
  `id` int(11) NOT NULL,
  `c1` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] NONCLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.0010 sec)

sql> ALTER TABLE t4 DROP PRIMARY KEY;
Query OK, 0 rows affected (0.0222 sec)

@dveeden
Copy link
Copy Markdown
Contributor

dveeden commented Jul 12, 2022

/cc @dveeden

@ti-chi-bot ti-chi-bot requested a review from dveeden July 12, 2022 12:57
@dveeden
Copy link
Copy Markdown
Contributor

dveeden commented Jul 12, 2022

Not sure if this should apply to temp tables:

sql> CREATE TEMPORARY TABLE t5 (id int);
ERROR: 1173 (42000): This table type requires a primary key%!(EXTRA model.CIStr=t5)

Another thing to check/test for is test for upgrades, internal tables, etc. as that caused some issues for MySQL when this was introduced.

@7yyo
Copy link
Copy Markdown
Contributor

7yyo commented Jul 12, 2022

Comment thread ddl/ddl_api.go Outdated
@morgo
Copy link
Copy Markdown
Contributor

morgo commented Jul 12, 2022

https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_sql_require_primary_key

It doesn't just work on create table

These alter tables probably are not supported in TiDB (ADD/DROP PRIMARY KEY is deprecated in TiDB; but it looks like it might work on non-clustered tables so we will need to test this.)

Another thing to check/test for is test for upgrades, internal tables, etc. as that caused some issues for MySQL when this was introduced.

Good point! I suggest we always enable sql_require_primary_key for the bootstrap session. I can't imagine we require non-PK tables in the internal dictionary and that will get us out of settings compatibility trouble.

sql> CREATE TABLE t2(id SERIAL);
ERROR: 1173 (42000): This table type requires a primary key%!(EXTRA model.CIStr=t2)

I think this might be a bug somewhere. The code that does tbInfo.GetPkName() is supposed to consider the InnoDB rules for promoting unique indexes to "PK", which should apply here.

@morgo
Copy link
Copy Markdown
Contributor

morgo commented Jul 12, 2022

Getting a linter error:

[2022-07-12T11:52:36.931Z]   ✘  https://revive.run/r#var-naming  const SqlRequirePrimaryKey should be SQLRequirePrimaryKey  
[2022-07-12T11:52:36.932Z]   sessionctx/variable/sysvar.go:1987:2

You may also need to run make errdoc to update the errors file.

@joycse06
Copy link
Copy Markdown
Contributor Author

Not sure if this should apply to temp tables:

MySQL 8 applies it to temporary tables as well.

From the doc:

sql_require_primary_key applies to both base tables and TEMPORARY tables

@joycse06
Copy link
Copy Markdown
Contributor Author

I wasn't using the correct error code but a similar one. I have now added the actual error code. Have also added check to stop dropping Primary Key indexes (non-clustered one's) when this is set.

tidb> set sql_require_primary_key = 1;
Query OK, 0 rows affected (0.00 sec)

tidb> SHOW VARIABLES LIKE 'sql_require_primary_key';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| sql_require_primary_key | ON    |
+-------------------------+-------+
1 row in set (0.00 sec)

tidb> CREATE TABLE `t4` (`id` int(11) NOT NULL, `c1` int(11) DEFAULT NULL);
ERROR 3750 (HY000): Unable to create or change a table without a primary key, when the system variable 'sql_require_primary_key' is set. Add a primary key to the table or unset this variable to avoid this message. Note that tables without a primary key can cause performance problems in row-based replication, so please consult your DBA before changing this setting.
tidb> CREATE TABLE `t4` (`id` int(11) NOT NULL, `c1` int(11) DEFAULT NULL, PRIMARY KEY (`id`));
Query OK, 0 rows affected (0.01 sec)

tidb> ALTER TABLE t4 DROP COLUMN id;
ERROR 8200 (HY000): Unsupported drop integer primary key
tidb> ALTER TABLE t4 DROP PRIMARY KEY;
ERROR 8200 (HY000): Unsupported drop primary key when the table's pkIsHandle is true
tidb> DROP TABLE t4;
Query OK, 0 rows affected (0.01 sec)

tidb> CREATE TABLE `t4` (`id` int(11) NOT NULL, `c1` int(11) DEFAULT NULL, PRIMARY KEY (`id`) NONCLUSTERED);
Query OK, 0 rows affected (0.00 sec)

tidb> ALTER TABLE t4 DROP COLUMN id;
ERROR 8200 (HY000): can't drop column id with composite index covered or Primary Key covered now
tidb> ALTER TABLE t4 DROP PRIMARY KEY;
ERROR 3750 (HY000): Unable to create or change a table without a primary key, when the system variable 'sql_require_primary_key' is set. Add a primary key to the table or unset this variable to avoid this message. Note that tables without a primary key can cause performance problems in row-based replication, so please consult your DBA before changing this setting.

For some cases TiDB is returning ERROR 8200 instead of ERROR 3750 for thats coming from existing code, I am not sure if I should change those.

Comment thread errors.toml

["schema:3750"]
error = '''
Unable to create or change a table without a primary key, when the system variable 'sql_require_primary_key' is set. Add a primary key to the table or unset this variable to avoid this message. Note that tables without a primary key can cause performance problems in row-based replication, so please consult your DBA before changing this setting.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same error mysql returns, copied from here

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 13, 2022
@dveeden
Copy link
Copy Markdown
Contributor

dveeden commented Jul 13, 2022

ERROR 3750 (HY000): Unable to create or change a table without a primary key, when the system variable 'sql_require_primary_key' is set. Add a primary key to the table or unset this variable to avoid this message. Note that tables without a primary key can cause performance problems in row-based replication, so please consult your DBA before changing this setting.

The "...row-based replication..." part is a bit weird in a TiDB context, we could change this to:

ERROR 3750 (HY000): Unable to create or change a table without a primary key, when the system variable 'sql_require_primary_key' is set. Add a primary key to the table or unset this variable to avoid this message.

But keeping this message identical to MySQL has my personal preference.

@dveeden
Copy link
Copy Markdown
Contributor

dveeden commented Jul 13, 2022

/cc @7yyo

@ti-chi-bot ti-chi-bot requested a review from 7yyo July 13, 2022 10:20
Copy link
Copy Markdown
Contributor

@7yyo 7yyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot
Copy link
Copy Markdown
Member

@7yyo: Thanks for your review. The bot only counts approvals from reviewers and higher roles in list, but you're still welcome to leave your comments.

Details

In response to this:

LGTM

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@dveeden
Copy link
Copy Markdown
Contributor

dveeden commented Jul 15, 2022

Note that DM doesn't use real PKs either:

sql> show create table dm_meta.`demo-mysql8-tidb_loader_checkpoint`\G
*************************** 1. row ***************************
       Table: demo-mysql8-tidb_loader_checkpoint
Create Table: CREATE TABLE `demo-mysql8-tidb_loader_checkpoint` (
  `id` char(32) NOT NULL,
  `filename` varchar(255) NOT NULL,
  `cp_schema` varchar(128) NOT NULL,
  `cp_table` varchar(128) NOT NULL,
  `offset` bigint(20) NOT NULL,
  `end_pos` bigint(20) NOT NULL,
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `uk_id_f` (`id`,`filename`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.0005 sec)

sql> show create table dm_meta.`demo-mysql8-tidb_syncer_checkpoint`\G
*************************** 1. row ***************************
       Table: demo-mysql8-tidb_syncer_checkpoint
Create Table: CREATE TABLE `demo-mysql8-tidb_syncer_checkpoint` (
  `id` varchar(32) NOT NULL,
  `cp_schema` varchar(128) NOT NULL,
  `cp_table` varchar(128) NOT NULL,
  `binlog_name` varchar(128) DEFAULT NULL,
  `binlog_pos` int(10) unsigned DEFAULT NULL,
  `binlog_gtid` text DEFAULT NULL,
  `exit_safe_binlog_name` varchar(128) DEFAULT '',
  `exit_safe_binlog_pos` int(10) unsigned DEFAULT '0',
  `exit_safe_binlog_gtid` text DEFAULT NULL,
  `table_info` json NOT NULL,
  `is_global` tinyint(1) DEFAULT NULL,
  `create_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  UNIQUE KEY `uk_id_schema_table` (`id`,`cp_schema`,`cp_table`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
1 row in set (0.0016 sec)

This is with DM v5.4.2. I assume the unique keys with all columns defined as NOT NULL should be seen as good-enough.

@bb7133
Copy link
Copy Markdown
Member

bb7133 commented Jul 21, 2022

Hi @joycse06 , thanks for the contribution!

If this variable is supposed to be used with TiCDC-based replication, should we consider unique key as well?

I know that unique key is not included in sql_require_primary_key for MySQL, is it better to introduce another system variables for both primary key and unique key check?

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 21, 2022
@morgo
Copy link
Copy Markdown
Contributor

morgo commented Jul 22, 2022

If this variable is supposed to be used with TiCDC-based replication, should we consider unique key as well?

We don't have a use-case beyond what the MySQL feature offers. It's not a typical use-case to have a table which doesn't have a primary key, but has a unique key.

…rimary-key

* upstream/master: (104 commits)
  br: fix compatibility issue with concurrent ddl (pingcap#36474)
  infoschema: fix PromQL for `tidb_distsql_copr_cache` (pingcap#36450)
  test: stabilize TestTopSQLCPUProfile (pingcap#36468)
  parser: add support of 'ADMIN SHOW DDL JOB QUERIES LIMIT m OFFSET n' transferring to AST (pingcap#36285)
  *: enable flaky test for all test (pingcap#36385)
  expression: fix return type of agg func `bit_or` when handling varbinary column (pingcap#36415)
  executor: fix aggregating enum zero value gets different results from mysql  (pingcap#36208)
  server: skip check tiflash version (pingcap#36451)
  *: Minor update to SECURITY.md to improved clarity (pingcap#36346)
  table partition: add telemetry for partition table (pingcap#36204)
  ddl: invalid multiple MAXVALUE partitions (pingcap#36329) (pingcap#36345)
  planner: Fixed `Merge` hint in nested CTE (pingcap#36432)
  metric: impove concurrency ddl metrics (pingcap#36405)
  planner: add more test cases for leading outer join (pingcap#36409)
  ddl: only set concurrent variable if no error (pingcap#36437)
  ddl: fix update panic in the middle of multi-schema change (pingcap#36421)
  session: Mising OptimizeWithPlanAndThenWarmUp in prepare-execute path (pingcap#36347)
  executor,metrics: add a metric for observing execution phases (pingcap#35906)
  br: unified docker image align with tidb (pingcap#36016)
  ddl: skip to close nil sessPool  (pingcap#36425)
  ...
@ti-chi-bot ti-chi-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 23, 2022
@joycse06
Copy link
Copy Markdown
Contributor Author

If this variable is supposed to be used with TiCDC-based replication, should we consider unique key as well?

@bb7133 I agree with Morgan on this. As we don't have a compelling use case to go beyond what MySQL offers atm, I think we should have it do what MySQL does.

Comment thread parser/parser.go Outdated
Copy link
Copy Markdown
Member

@bb7133 bb7133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please fix the lint as commented by @morgo: #36146 (comment)

@joycse06 joycse06 force-pushed the add-sql-require-primary-key branch from 9e9038b to d3c184e Compare August 3, 2022 00:13
@joycse06 joycse06 force-pushed the add-sql-require-primary-key branch from 2f60169 to 4eb5856 Compare August 3, 2022 01:11
@bb7133
Copy link
Copy Markdown
Member

bb7133 commented Aug 3, 2022

/merge

@ti-chi-bot
Copy link
Copy Markdown
Member

This pull request has been accepted and is ready to merge.

DetailsCommit hash: 4eb5856

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Aug 3, 2022
@ti-chi-bot ti-chi-bot merged commit 6a6fa8a into pingcap:master Aug 3, 2022
@sre-bot
Copy link
Copy Markdown
Contributor

sre-bot commented Aug 3, 2022

TiDB MergeCI notify

🔴 Bad News! [1] CI still failing after this pr merged.
These failed integration tests don't seem to be introduced by the current PR.

CI Name Result Duration Compare with Parent commit
idc-jenkins-ci-tidb/integration-common-test 🔴 failed 3, success 14, total 17 22 min Existing failure
idc-jenkins-ci/integration-cdc-test 🟢 all 36 tests passed 34 min Existing passed
idc-jenkins-ci-tidb/common-test 🟢 all 11 tests passed 14 min Existing passed
idc-jenkins-ci-tidb/integration-ddl-test 🟢 all 6 tests passed 10 min Existing passed
idc-jenkins-ci-tidb/sqllogic-test-2 🟢 all 28 tests passed 9 min 59 sec Existing passed
idc-jenkins-ci-tidb/tics-test 🟢 all 1 tests passed 7 min 55 sec Existing passed
idc-jenkins-ci-tidb/sqllogic-test-1 🟢 all 26 tests passed 7 min 25 sec Existing passed
idc-jenkins-ci-tidb/mybatis-test 🟢 all 1 tests passed 4 min 35 sec Existing passed
idc-jenkins-ci-tidb/integration-compatibility-test 🟢 all 1 tests passed 4 min 13 sec Existing passed
idc-jenkins-ci-tidb/plugin-test 🟢 build success, plugin test success 4min Existing passed

@bb7133
Copy link
Copy Markdown
Member

bb7133 commented Aug 3, 2022

Hi @joycse06 , thanks for your contribution! Would you like to add the document for this variable(https://github.com/pingcap/docs/blob/master/system-variables.md)?

@joycse06
Copy link
Copy Markdown
Contributor Author

joycse06 commented Aug 8, 2022

@bb7133 Sure, I will give it a go. I will raise a PR in that repo when ready. Thanks.

@joycse06
Copy link
Copy Markdown
Contributor Author

joycse06 commented Aug 9, 2022

@bb7133 Can you please have a look at pingcap/docs#9886 ? Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for sql_require_primary_key

8 participants