@ArafatKhan2198
Contributor

@ArafatKhan2198 ArafatKhan2198 commented Mar 25, 2022

What changes were proposed in this pull request?

The ContentGenerator class in the freon suite has several parameterized constructors, each taking a different set of arguments, which hurts readability. One way to solve this is to add a builder for ContentGenerator, so that parameters introduced in the future can be supported without creating yet another constructor.
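To illustrate the idea, here is a minimal sketch of the builder pattern applied to a class of this shape. ContentGeneratorSketch, its field names (keySize, bufferSize, copyBufferSize) and the default values are assumptions made for the example and are not taken from the patch.

```java
// Hypothetical stand-in for the freon ContentGenerator. Field names and
// defaults are assumptions for illustration, not the real class.
final class ContentGeneratorSketch {
  private final long keySize;
  private final int bufferSize;
  private final int copyBufferSize;

  // The only constructor: it takes the builder, so adding a parameter
  // later means adding one field and one setter, not a new overload.
  private ContentGeneratorSketch(Builder builder) {
    this.keySize = builder.keySize;
    this.bufferSize = builder.bufferSize;
    this.copyBufferSize = builder.copyBufferSize;
  }

  long getKeySize() {
    return keySize;
  }

  int getBufferSize() {
    return bufferSize;
  }

  int getCopyBufferSize() {
    return copyBufferSize;
  }

  static final class Builder {
    // Assumed defaults; callers override only what they need.
    private long keySize = 1024;
    private int bufferSize = 1024;
    private int copyBufferSize = 1024;

    Builder setKeySize(long value) {
      this.keySize = value;
      return this;
    }

    Builder setBufferSize(int value) {
      this.bufferSize = value;
      return this;
    }

    Builder setCopyBufferSize(int value) {
      this.copyBufferSize = value;
      return this;
    }

    ContentGeneratorSketch build() {
      return new ContentGeneratorSketch(this);
    }
  }
}
```

A caller would then write something like new ContentGeneratorSketch.Builder().setKeySize(4096L).setBufferSize(512).build(), leaving the remaining parameters at their defaults.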

Background: in HDFS, when a file was closed or flushed, the DataNode wrote the file contents to the operating system using the normal close() and write() system calls.

The data may not be persisted to the underlying physical storage immediately; it may still reside in memory in the operating system's file cache. This creates a window of vulnerability: if multiple DataNode machines fail simultaneously (e.g. loss of power to a rack), previously written data may be lost. To address this, HDFS (as of Hadoop 2.0) introduced new APIs that provide a way to guarantee that written data is persisted to the underlying physical storage. The two APIs are described below.

hflush(): This method of FSDataOutputStream flushes all outstanding data (i.e. the current unfinished packet) from the client into the OS buffers on all DataNode replicas.

hsync(): This method of FSDataOutputStream flushes the data to the DataNodes, like hflush(), but should also force the data to the underlying physical storage via fsync (or equivalent).

Hence I have added an option that issues hsync() or hflush() after every write, if the user requests it.
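The behavior described here can be sketched roughly as follows. SyncableSink is a hypothetical stand-in for an HDFS-style output stream and is not the Hadoop API; RecordingSink and FlushingWriter are likewise invented for the example, and the two boolean flags mirror the hSync and hFlush fields quoted later in the review.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a stream that offers hflush() and hsync().
// It is NOT the Hadoop API, only a local interface for this sketch.
interface SyncableSink {
  void write(byte[] buffer, int offset, int length) throws IOException;

  void hflush() throws IOException;

  void hsync() throws IOException;
}

// Records every call so the ordering can be inspected without a cluster.
final class RecordingSink implements SyncableSink {
  final List<String> calls = new ArrayList<>();

  @Override
  public void write(byte[] buffer, int offset, int length) {
    calls.add("write:" + length);
  }

  @Override
  public void hflush() {
    calls.add("hflush");
  }

  @Override
  public void hsync() {
    calls.add("hsync");
  }
}

final class FlushingWriter {
  private FlushingWriter() {
  }

  // Writes totalBytes in buffer-sized chunks and, if requested, issues
  // hsync() or hflush() after each chunk. hSync takes precedence here;
  // rejecting the case where both are set is left to the caller.
  static void write(SyncableSink out, long totalBytes, byte[] buffer,
      boolean hSync, boolean hFlush) throws IOException {
    if (buffer.length == 0) {
      throw new IllegalArgumentException("buffer must not be empty");
    }
    long remaining = totalBytes;
    while (remaining > 0) {
      int chunk = (int) Math.min(buffer.length, remaining);
      out.write(buffer, 0, chunk);
      if (hSync) {
        out.hsync();
      } else if (hFlush) {
        out.hflush();
      }
      remaining -= chunk;
    }
  }
}
```

Issuing hsync() after every chunk trades throughput for durability, which is presumably why the option is off unless the user asks for it.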

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6455

How was this patch tested?

The existing unit tests for the ContentGenerator class ran successfully.

@kaijchen
Member

Thanks for the patch @ArafatKhan2198, I have left some comments for you to check.

@ArafatKhan2198
Contributor Author

> Thanks for the patch @ArafatKhan2198, I have left some comments for you to check.

@kaijchen all changes done !!

Comment on lines 142 to 143
ContentGenerator contentgenerator = new ContentGenerator(this);
return contentgenerator;
Contributor

Builder's build method should verify that the object being constructed is valid. Here hflush and hsync are not applicable at the same time. Please add a check for that.

Nit: no need for local variable, return directly.

Suggested change
- ContentGenerator contentgenerator = new ContentGenerator(this);
- return contentgenerator;
+ return new ContentGenerator(this);
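The validity check requested in this comment could look roughly like the sketch below. GeneratorOptions is a hypothetical class that keeps only the two flags under discussion, and the choice of IllegalStateException is an assumption, not something the patch or the reviewer specifies.

```java
// Hypothetical reduced example: only the two mutually exclusive flags.
final class GeneratorOptions {
  private final boolean hSync;
  private final boolean hFlush;

  private GeneratorOptions(Builder builder) {
    this.hSync = builder.hSync;
    this.hFlush = builder.hFlush;
  }

  boolean isHsync() {
    return hSync;
  }

  boolean isHflush() {
    return hFlush;
  }

  static final class Builder {
    private boolean hSync;
    private boolean hFlush;

    Builder setHsync(boolean value) {
      this.hSync = value;
      return this;
    }

    Builder setHflush(boolean value) {
      this.hFlush = value;
      return this;
    }

    // build() is the single place where the combination is validated,
    // so no invalid object can escape the builder.
    GeneratorOptions build() {
      if (hSync && hFlush) {
        throw new IllegalStateException(
            "hsync and hflush cannot be enabled at the same time");
      }
      // Return directly, without an intermediate local variable.
      return new GeneratorOptions(this);
    }
  }
}
```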


contentGenerator =
- new ContentGenerator(fileSize, bufferSize, copyBufferSize);
+ new Builder()
Contributor

Nit: I think it's more readable if builder is qualified where it is being used (and also more consistent with usage in other generator classes).

Suggested change
- new Builder()
+ new ContentGenerator.Builder()

import org.apache.hadoop.hdds.conf.OzoneConfiguration;

import com.codahale.metrics.Timer;
import org.apache.hadoop.ozone.freon.ContentGenerator.Builder;
Contributor

Suggested change
- import org.apache.hadoop.ozone.freon.ContentGenerator.Builder;

Contributor

@adoroszlai adoroszlai left a comment

Thanks @ArafatKhan2198 for updating the PR with the enum from #3192.

Comment on lines +146 to +148
public Builder setFlushMode(String flushmode) {
this.flushMode = flushmode;
Flags type = Flags.valueOf(flushmode);
Contributor

I think it would be better to pass an instance of the enum to this method, enforcing type safety.
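A small sketch of the difference, using the Flags enum and its constant names as they appear in the reviewed diff. FlushModeBuilder is a hypothetical class that exists only to place the two setter styles side by side.

```java
import java.util.Objects;

// Enum as named in the reviewed diff at this point.
enum Flags {
  hSync,
  hFlush,
  None
}

// Hypothetical builder holding only the flush setting.
final class FlushModeBuilder {
  private Flags flushMode = Flags.None;

  // String-based setter, as in the reviewed snippet. A misspelled or
  // wrongly cased name still compiles and only fails at run time, when
  // Flags.valueOf throws IllegalArgumentException.
  FlushModeBuilder setFlushMode(String name) {
    this.flushMode = Flags.valueOf(name);
    return this;
  }

  // Enum-based setter, as suggested. The compiler rejects anything
  // that is not one of the declared constants.
  FlushModeBuilder setFlushMode(Flags mode) {
    this.flushMode = Objects.requireNonNull(mode, "mode");
    return this;
  }

  Flags getFlushMode() {
    return flushMode;
  }
}
```

With the enum-based setter, any string parsing happens once at the boundary (for example where command-line options are read) rather than inside the builder.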

Comment on lines +52 to +60
/**
* Issue Hsync after every write ( Cannot be used with Hflush ).
*/
private final boolean hSync;

/**
* Issue Hflush after every write ( Cannot be used with Hsync ).
*/
private final boolean hFlush;
Contributor

Replacing these with a single instance of Flags would simplify the code.

/**
* Type of flags.
*/
public enum Flags {
Contributor

I think FlushMode would be a better name.

Comment on lines +66 to +68
hSync,
hFlush,
None,
Contributor

Constants are usually preferred in CAPS.
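Taken together, the review suggestions on the flags (a single enum field, the name FlushMode, upper-case constants) might lead to something like the sketch below. The helper methods parse and fromFlags are hypothetical additions for illustration and are not part of the patch.

```java
import java.util.Locale;

// One enum replaces the two booleans: the invalid "both set" state can
// no longer be represented once a FlushMode value exists.
enum FlushMode {
  NONE,
  HFLUSH,
  HSYNC;

  // Hypothetical helper: case-insensitive parsing of user input.
  // Throws IllegalArgumentException for unknown names.
  static FlushMode parse(String text) {
    return valueOf(text.trim().toUpperCase(Locale.ROOT));
  }

  // Hypothetical helper: maps the old boolean pair onto the enum and
  // rejects the one combination that was never valid.
  static FlushMode fromFlags(boolean hSync, boolean hFlush) {
    if (hSync && hFlush) {
      throw new IllegalArgumentException(
          "hsync and hflush cannot be used together");
    }
    if (hSync) {
      return HSYNC;
    }
    return hFlush ? HFLUSH : NONE;
  }
}
```

A class holding a single FlushMode field then needs no cross-field validation in its builder, since mutual exclusion is enforced by the type itself.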

@adoroszlai
Contributor

/pending

@github-actions github-actions bot left a comment

Marking this issue as un-mergeable as requested.

Please use /ready comment when it's resolved.

Please note that the PR will be closed after 21 days of inactivity from now. (But can be re-opened anytime later...)

/pending

@github-actions

Thank you very much for the patch. I am closing this PR temporarily as there was no activity recently and it is waiting for response from its author.

It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.

It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.

If you need ANY help to finish this PR, please contact the community on the mailing list or the Slack channel.

@github-actions github-actions bot closed this Jul 25, 2022