Skip to content

Conversation

@ArafatKhan2198
Copy link
Contributor

@ArafatKhan2198 ArafatKhan2198 commented Mar 14, 2022

What changes were proposed in this pull request?

Micro benchmarks for Apache HDFS/HCFS perf testing. The package includes the following TEST

WriteFileThroughputBenchmark.java :- A multi-threaded write IO test that allows tweaking write parameters like replication, IO size, number of threads etc. A single thread should be able to saturate the network or single disk bandwidth (whichever is lower).

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-1726?jql=text%20~%20%22FsPerfTest%22

How was this patch tested?

Since this patch is just a new feature for Ozone Freon, this won't have significant affect on the functioning of Ozone. The system that I have been using uses M1 ARM-64 architecture due to which I am not able to test out the functionality on my local cluster, hence I would need code-review so as to identify potential bugs if any.
@siddhantsangwan @jojochuang

@jojochuang
Copy link
Contributor

Re-trigger the test.

@adoroszlai
Copy link
Contributor

Please check test results before retriggering.

Acceptance test is failing due to:

Option name '-p' is used by both field String org.apache.hadoop.ozone.freon.WriteFileThroughputBenchmark.rootPath and field String org.apache.hadoop.ozone.freon.BaseFreonGenerator.prefix

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ArafatCloudera for working on this.

Ozone already has a file generator for Freon. Would it be possible to extend HadoopFsGenerator with additional parameters as necessary (e.g. flush/sync) instead of duplicating it?

public class HadoopFsGenerator extends BaseFreonGenerator

Comment on lines +187 to +189
// Enforcing throttle delay
final long ioEndTimeNs = (isThrottled ? System.nanoTime() : 0);
enforceThrottle(ioEndTimeNs - ioStartTimeNs, expectedIoTimeNs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please explain the need for throttling in a performance test?

If it is really necessary, shouldn't it happen outside of the timer.time() call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original FsPerfTest tool was created to measure both throughput and latency. The throttling option would be more relevant in the context of latency. We can drop the throttling for now as this PR does not support measuring latency. But later, throttling can be added back when latency metrics are added.

Link to the FsPerfTest tool :-https://github.com/arp7/FsPerfTest/blob/master/src/main/java/net/arp7/FsPerfTest/WriteFile.java

Comment on lines +110 to +116
if (outputStream instanceof FSDataOutputStream) {
if (hSync) {
((FSDataOutputStream) outputStream).hsync();
} else if (hFlush) {
((FSDataOutputStream) outputStream).hflush();
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be great if we could avoid this runtime check/cast by using inheritance and/or generics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we try to implement a check using inheritance then there would be a lot more code that will be added, hence I believe we could leave out this one.

Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the hflush/hsync inheritance issue, anything else?

@ArafatKhan2198 on a second thought, we can add a method

public void write(FSDataOutputStream outputStream) {}

and call flushOrSync() in this method.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ozone already has a file generator for Freon. Would it be possible to extend HadoopFsGenerator with additional parameters as necessary (e.g. flush/sync) instead of duplicating it?

I'm still curious about the above.

// Choosing which flag is to be set
boolean flush = false;
boolean sync = false;
Flags type = Flags.valueOf(flag);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid the need for this by changing flag from String to Flags. picocli supports enums.

Comment on lines +133 to +141
switch (type) {
case hSync:
sync = true;
break;
case hFlush:
flush = true;
break;
case None:
break;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Content generator also has the problem of mutually exclusive boolean flags. Can we pass type to ContentGenerator and let it use it directly in flushOrSync?

defaultValue = "1")
private short replication;

@Option(names = {"--flags"},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flags is a rather vague name. I think flush-mode or something similar would be better.


@Option(names = {"--flags"},
description = "Optionally issue hSync or hFlush after every write" +
"Cannot be used with hflush",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is leftover from previous version (with booleans).

@adoroszlai
Copy link
Contributor

/pending

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking this issue as un-mergeable as requested.

Please use /ready comment when it's resolved.

Please note that the PR will be closed after 21 days of inactivity from now. (But can be re-opened anytime later...)

/pending

@github-actions
Copy link

Thank you very much for the patch. I am closing this PR temporarily as there was no activity recently and it is waiting for response from its author.

It doesn't mean that this PR is not important or ignored: feel free to reopen the PR at any time.

It only means that attention of committers is not required. We prefer to keep the review queue clean. This ensures PRs in need of review are more visible, which results in faster feedback for all PRs.

If you need ANY help to finish this PR, please contact the community on the mailing list or the slack channel."

@github-actions github-actions bot closed this Jul 25, 2022
@kerneltime
Copy link
Contributor

@ArafatKhan2198 @adoroszlai @jojochuang This still seems like a good feature for Freon. Do you think we should revisit this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants