Skip to content

Conversation

@xieyi888
Copy link
Contributor

What problem does this PR solve?

Issue Number: close #600

Problem Description: TxnKV Scan lose data when table has more than one regions
see #600

What is changed and how does it work?

Code changes

Has methods of interface change

before change

   if (currentCache.size() < limit) {
        startKey = curRegionEndKey;
        lastKey = Key.toRawKey(curRegionEndKey);
      } else if (currentCache.size() > limit) {
        throw new IndexOutOfBoundsException(
            "current cache size = "
                + currentCache.size()
                + ", larger than "
                + conf.getScanBatchSize());
      } else {
        // Start new scan from exact next key in current region
        lastKey = Key.toRawKey(currentCache.get(currentCache.size() - 1).getKey());
        startKey = lastKey.next().toByteString();
      }

after change

      int scanLimit = Math.min(limit, conf.getScanBatchSize());
      if (currentCache.size() < scanLimit) {
        startKey = curRegionEndKey;
        lastKey = Key.toRawKey(curRegionEndKey);
      } else if (currentCache.size() > scanLimit) {
        throw new IndexOutOfBoundsException(
            "current cache size = "
                + currentCache.size()
                + ", larger than "
                + scanLimit);
      } else {
        // Start new scan from exact next key in current region
        lastKey = Key.toRawKey(currentCache.get(currentCache.size() - 1).getKey());
        startKey = lastKey.next().toByteString();
      }

Check List for Tests

  • Manual test (add detailed scripts or steps below)

step 1: change code

step 2: build the tikv-client-java 3.2.0-SNAPSHOT

mvn clean install -DskipTests

step 3: prepare a TiDB Table which has two regions

for example:
total count of tha table is 200000:

mysql> select count(1) from tikv_client_test;
+----------+
| count(1) |
+----------+
|   200000 |
+----------+
1 row in set (0.08 sec)

table regions:

mysql> show table tikv_client_test  regions \G;
*************************** 1. row ***************************
           REGION_ID: 15097
           START_KEY: t_172_
             END_KEY: t_172_r_535012
           LEADER_ID: 15100
     LEADER_STORE_ID: 5
               PEERS: 15098, 15099, 15100
          SCATTERING: 0
       WRITTEN_BYTES: 0
          READ_BYTES: 0
APPROXIMATE_SIZE(MB): 1
    APPROXIMATE_KEYS: 0
*************************** 2. row ***************************
           REGION_ID: 2
           START_KEY: t_172_r_535012
             END_KEY:
           LEADER_ID: 66
     LEADER_STORE_ID: 4
               PEERS: 3, 66, 76
          SCATTERING: 0
       WRITTEN_BYTES: 63913588
          READ_BYTES: 0
APPROXIMATE_SIZE(MB): 78
    APPROXIMATE_KEYS: 51395
2 rows in set (0.01 sec)

step 4: useing TxnKV to scan the table

tikv-client-java version: 3.2.0-SNAPSHOT which is built in step2
code

public class TikvScanTemplateRepeat {
    public static void main(String[] args) throws Exception {
        TiConfiguration conf = TiConfiguration.createDefault("****");
        TiSession session = TiSession.create(conf);
        KVClient scanClient = session.createKVClient();
        long startTs = session.getTimestamp().getVersion();

        final String database = "****";
        final String tableName = "tikv_client_test";
        long tableId = session.getCatalog().getTable(database, tableName).getId();

        long startPos = 2L;//the minKey of the table
        ByteString startKey = RowKey.toRowKey(tableId, startPos).toByteString();
        ByteString endKey = RowKey.toRowKey(tableId, Long.MAX_VALUE).toByteString();

        int totalSize = 0;
        try {
            while (true) {
                final List<Kvrpcpb.KvPair> segment =
                        scanClient.scan(startKey, endKey, startTs);

                if (segment.isEmpty()) {
                    break;
                }
                System.out.println("scan segment size:" + segment.size());
                totalSize+=segment.size();
                startKey =
                        RowKey.toRawKey(segment.get(segment.size() - 1).getKey())
                                .next()
                                .toByteString();
            }
        } finally {
            scanClient.close();
            session.close();
        }
        System.out.println("scan total size: "+totalSize);
    }
}

result:

total size: 200000
fix the data lose

Side effects

Related changes

@xieyi888 xieyi888 force-pushed the dev-issue600 branch 2 times, most recently from b226d00 to 222b4ff Compare May 10, 2022 14:55
@iosmanthus iosmanthus changed the title [fix 600]TxnKV Scan lose data when table has more than one regions [close #600] ScanIterator missing fetch the region keys because the wrong limit setting May 12, 2022
iosmanthus
iosmanthus previously approved these changes May 12, 2022
Copy link
Member

@iosmanthus iosmanthus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@codecov
Copy link

codecov bot commented May 12, 2022

Codecov Report

Merging #603 (25ca639) into master (660199a) will decrease coverage by 0.06%.
The diff coverage is 50.00%.

@@             Coverage Diff              @@
##             master     #603      +/-   ##
============================================
- Coverage     34.08%   34.01%   -0.07%     
+ Complexity     1369     1361       -8     
============================================
  Files           270      270              
  Lines         17174    17174              
  Branches       1956     1956              
============================================
- Hits           5853     5842      -11     
- Misses        10708    10717       +9     
- Partials        613      615       +2     
Impacted Files Coverage Δ
...g/tikv/common/operation/iterator/ScanIterator.java 76.31% <50.00%> (ø)
...ty/handler/codec/http2/Http2ConnectionHandler.java 48.16% <0.00%> (-3.92%) ⬇️
src/main/java/io/grpc/netty/WriteQueue.java 74.43% <0.00%> (-2.26%) ⬇️
.../codec/http2/DefaultHttp2RemoteFlowController.java 59.31% <0.00%> (-0.39%) ⬇️
src/main/java/io/grpc/internal/ClientCallImpl.java 56.82% <0.00%> (-0.25%) ⬇️
...rc/main/java/io/grpc/netty/NettyClientHandler.java 56.89% <0.00%> (-0.22%) ⬇️
.../main/java/org/tikv/common/region/RegionCache.java 50.92% <0.00%> (+1.85%) ⬆️
...va/org/tikv/common/region/StoreHealthyChecker.java 79.48% <0.00%> (+5.12%) ⬆️
src/main/java/org/tikv/common/region/TiStore.java 74.35% <0.00%> (+12.82%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 660199a...25ca639. Read the comment docs.

@xieyi888
Copy link
Contributor Author

@iosmanthus Thank you for your reply. Merging is blocked. what else should I do now, please?

@iosmanthus
Copy link
Member

@iosmanthus Thank you for your reply. Merging is blocked. what else should I do now, please?

You may execute the ./dev/javafmt to format the whole project. The format checker of the CI is failed.

Signed-off-by: Yi Xie <xieyi01@rd.netease.com>
@xieyi888
Copy link
Contributor Author

thanks for you reply. I had format the project.

@iosmanthus Thank you for your reply. Merging is blocked. what else should I do now, please?

You may execute the ./dev/javafmt to format the whole project. The format checker of the CI is failed.

@xieyi888 xieyi888 requested a review from iosmanthus May 12, 2022 10:03
Copy link
Member

@iosmanthus iosmanthus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@xieyi888
Copy link
Contributor Author

hi @iosmanthus, what else should I do now? Thank you for your patience

@iosmanthus iosmanthus merged commit 3163793 into tikv:master May 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TxnKV Scan lose data when table has more than one regions

3 participants