HDDS-4883. Persist replicationIndex on datanode side #2069
Conversation
Why was this test removed?
Very good question; I planned to discuss it.
- // This test is for if we upgrade, and then .container files added by new
- // server will have new fields added to .container file, after a while we
- // decided to rollback. Then older ozone can read .container files
- // created or not.
This is a limitation of the current checksum calculation. With this restriction we can never add new fields to the container file.
I think the proper fix here is to remove the test and rely on the upgrade framework to avoid this situation. This unit test is fine for master temporarily, but we need a way to add new fields once new features are allowed (after finalize).
Today that's not possible, as the upgrade work is not merged, but it will be possible with the upgrade framework (or by simply denying all new EC requests). Therefore, I think it's safe to remove the unit test, move forward, and add specific upgrade tests for EC later.
(cc @avijayanhwx )
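To make the rollback problem concrete, here is a minimal self-contained sketch (plain Java, not Ozone code; the field names are illustrative) of why a checksum computed over a fixed field set can never match again once a newer version writes an extra field:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Map;
import java.util.TreeMap;

public final class ContainerChecksumDemo {

  /** Checksum over a sorted field->value map, like a .container file body. */
  static String checksum(Map<String, String> fields) throws Exception {
    MessageDigest md = MessageDigest.getInstance("SHA-256");
    for (Map.Entry<String, String> e : new TreeMap<>(fields).entrySet()) {
      md.update((e.getKey() + "=" + e.getValue() + "\n")
          .getBytes(StandardCharsets.UTF_8));
    }
    StringBuilder sb = new StringBuilder();
    for (byte b : md.digest()) {
      sb.append(String.format("%02x", b));
    }
    return sb.toString();
  }

  public static void main(String[] args) throws Exception {
    Map<String, String> oldFields = new TreeMap<>();
    oldFields.put("containerID", "1");
    oldFields.put("state", "OPEN");

    // Newer code writes one extra field and stores a checksum over all three.
    Map<String, String> newFields = new TreeMap<>(oldFields);
    newFields.put("replicaIndex", "1");
    String storedChecksum = checksum(newFields);

    // After a rollback, the old code only knows the original two fields,
    // so its recomputed checksum can never match the stored one.
    System.out.println(storedChecksum.equals(checksum(oldFields))); // false
  }
}
```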
I understand your point.
I think we can try the following way to handle this (we discussed this point, and to keep the discussion open, I am posting the comment here): how about adding an additional field that holds a new checksum calculated over the new field as well? The old checksum field stays unchanged, so older clients verify the old checksum and newer clients verify the new checksum field.
Let's see if this can work.
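A rough illustration of this dual-checksum idea, reusing the checksum(...) helper from the sketch above (field names are again illustrative, not the real .container keys):

```java
// The old checksum stays computed over the original field set; a second
// checksum covers the extended set.
Map<String, String> v1Fields = new TreeMap<>();
v1Fields.put("containerID", "1");
v1Fields.put("state", "OPEN");

Map<String, String> v2Fields = new TreeMap<>(v1Fields);
v2Fields.put("replicaIndex", "1"); // the new EC-only field

String checksumV1 = checksum(v1Fields); // old readers verify this one
String checksumV2 = checksum(v2Fields); // new readers verify this one
// After a rollback, old code recomputes checksum(v1Fields) and still
// matches; new code validates the extended set through checksumV2.
```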
Yes, this is possible, but we wouldn't like to introduce a new checksum field every time we introduce new fields. It should be done (IMHO) together with another fix that ensures the new checksum field is always backward compatible (for example, because all the fields present in the file are used to calculate the checksum).
I totally agree that it should be done, but I think it should be done in a separate issue/patch.
I was thinking the newer client could always compute the checksum over the fields available in the file, rather than over a fixed set of Java enum fields. Older clients would behave as they do today.
Also consider the option of ignoring the additional field when replicaIndex is 0, which is the default; then newer clients can treat a non-existent field as 0.
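A hedged sketch of how these two ideas could combine on the read path; readFieldsFromContainerFile is a hypothetical helper, not an existing Ozone API:

```java
// Compute the expected checksum over the keys actually present in the
// file (field-driven) instead of a fixed Java enum, and treat a missing
// replicaIndex as the non-EC default 0.
Map<String, String> fileFields = readFieldsFromContainerFile(); // hypothetical
String expected = checksum(fileFields); // reuses the helper sketched earlier
int replicaIndex =
    Integer.parseInt(fileFields.getOrDefault("replicaIndex", "0"));
```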
We can do it, but it doesn't solve the compatibility issue, and it requires a bigger refactor of the write path. I checked, and it means refactoring ContainerDataRepresenter (or the surrounding code), which may be a bigger piece of work.
If we ignore replicaIndex=0, EC-disabled clusters should see no impact at all, right?
Thanks for creating the separate JIRA for the improvement.
If we ignore replicaIndex=0, EC-disabled clusters should see no impact at all, right?
Sorry, I am not sure I understood this question.
- Today we have generic code that serializes all the white-listed fields. It could be modified to support default values and only write a field when its value differs from the default, but that requires some code reorganization (see the sketch after this list).
- Even if we do this (i.e., do not write replicaIndex to the container file for non-EC containers), it doesn't solve the backward-compatibility problem. When a new EC-type container is written (e.g., replicaIndex=1), the old cluster code couldn't read it (it has no way to know whether it's an EC container or not). Therefore we should enable writing replicaIndex only after the finalize step. In that case there is no functional difference between writing and not writing replicaIndex to the container yaml file for normal containers. It's more of a code-style question: if you think it's worth the extra code reorganization, I am fine adding it, but behavior-wise the results are the same.
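For illustration only, a SnakeYAML 1.x representer along these lines could skip the field while it holds the default value; this is an assumption about how such a change might look, not the actual ContainerDataRepresenter patch:

```java
import org.yaml.snakeyaml.introspector.Property;
import org.yaml.snakeyaml.nodes.NodeTuple;
import org.yaml.snakeyaml.nodes.Tag;
import org.yaml.snakeyaml.representer.Representer;

/**
 * Omits replicaIndex from the emitted yaml while it still has the
 * default value 0, so non-EC container files keep their old layout.
 */
public class DefaultSkippingRepresenter extends Representer {

  @Override
  protected NodeTuple representJavaBeanProperty(Object javaBean,
      Property property, Object propertyValue, Tag customTag) {
    if ("replicaIndex".equals(property.getName())
        && Integer.valueOf(0).equals(propertyValue)) {
      return null; // returning null drops the key/value pair from the output
    }
    return super.representJavaBeanProperty(
        javaBean, property, propertyValue, customTag);
  }
}
```

With such a representer, an EC container (replicaIndex > 0) still gets the field written, while legacy files stay unchanged byte-for-byte.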
I meant that, after the EC branch is merged, clusters that don't use EC containers at all will see no impact if we skip writing the index.
For EC containers we still need to deal with it anyway. One question: do we allow an old-code DN to serve EC container data?
I am worried that an old DN will not know about any EC-specific logic added on the DN side.
For EC containers we still need to deal with it anyway. One question: do we allow an old-code DN to serve EC container data?
No. It's not possible, as EC data will be written only after finalization, which means the old DN code (downgrade) won't be supported any more.
Yes. Let's skip writing the replicationIndex=0 field to the yaml file in this patch.
@avijayanhwx If we bump the KeyValueContainer version, does the upgrade framework handle checking the version numbers?
Currently, as noted in this JIRA, we need to write one additional field to the yaml file for EC. Since it changes the on-disk metadata, we need to worry about compatibility. While an upgrade is in progress, the plan is not to allow new features to be used, right? In that case, if there is already a check for the KeyValueContainer version, bumping the version would make things cleaner. Could you please comment with your thoughts here? Thanks.
ContainerProtos.ContainerType.KeyValueContainer;
createRequest.setContainerType(containerType);
...
if (containerRequest.hasWriteChunk()) {
I think we decided to take the index from the pipeline object, not from the BlockID, right?
Sorry, I am not sure about the question. What do you mean by packing it from the BlockID?
I think you suggested including the replicaIndex in the block ID, which became a (containerID, localID, replicaIndex) tuple. That is the reason for the getBlockID().getReplicaIndex() call here.
But let me know if I misunderstood something.
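A minimal sketch of the tuple described here (an illustrative plain-Java model, not the exact Ozone BlockID class or its protobuf definition):

```java
/** Block identifier extended with the EC replica index. */
public final class BlockID {
  private final long containerID;
  private final long localID;
  private final int replicaIndex; // 0 for Ratis/standalone, 1..n for EC

  public BlockID(long containerID, long localID, int replicaIndex) {
    this.containerID = containerID;
    this.localID = localID;
    this.replicaIndex = replicaIndex;
  }

  public long getContainerID() { return containerID; }
  public long getLocalID() { return localID; }
  public int getReplicaIndex() { return replicaIndex; }
}
```

With such a shape, the datanode handler can read request.getBlockID().getReplicaIndex() when it creates the container, which is the call being discussed here.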
I got your point. I was confused because we haven't added the client-side code to send the index yet, but we are already trying to use it here. For completeness, might it be good to include the client side sending this index?
A simplified client-side change is here: elek@c868788
I think this PR is a well-scoped and unit-tested change, and the client-side change may need additional work. But if you would like to include this small commit, I would be happy to add it.
It is fine for me. When it's needed on the client side, we can add it.
@umamaheswararao #4c243349b persists the replicaIndex. Can you PTAL?
umamaheswararao left a comment
The latest changes look good to me.
Please take care of a nit comment before commit.
+1
import java.util.Set;
import java.util.TreeSet;
import java.util.*;
nit: can we just avoid this auto-organized (wildcard) import?
Thanks for the review @umamaheswararao. Merging it now (the import is fixed and we have a green build).
What changes were proposed in this pull request?
When a container is created for EC replication, the replicaIndex metadata should be persisted to the container yaml file and reported back to the SCM with the container reports.
Note: this PR requires #2055 to be merged (therefore I marked it as a draft, but feel free to comment on it).
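A toy, self-contained model of that flow (all names below are illustrative stand-ins, not the actual Ozone classes):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class ReplicaIndexFlowDemo {

  /** Stand-in for the datanode-side container metadata. */
  static final class ContainerData {
    final long containerID;
    final int replicaIndex;

    ContainerData(long containerID, int replicaIndex) {
      this.containerID = containerID;
      this.replicaIndex = replicaIndex;
    }

    /** Fields that would be written to the .container yaml file. */
    Map<String, Object> yamlFields() {
      Map<String, Object> fields = new LinkedHashMap<>();
      fields.put("containerID", containerID);
      fields.put("replicaIndex", replicaIndex);
      return fields;
    }
  }

  public static void main(String[] args) {
    // 1. A container is created for EC with replica index 3.
    ContainerData data = new ContainerData(42L, 3);
    // 2. The persisted fields now include the replicaIndex key...
    System.out.println("yaml: " + data.yamlFields());
    // 3. ...and the same value goes back to SCM in the container report.
    System.out.println("container report: containerID=" + data.containerID
        + ", replicaIndex=" + data.replicaIndex);
  }
}
```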
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-4883
How was this patch tested?
e2e tests with https://github.com/elek/ozone/tree/ec