generated from amazon-archives/__template_Apache-2.0
Refactoring and updating data generating methods. #34
Merged
15 commits
2a61c7f Adds package to process ion schema file. (linlin-s)
38334bf Changes the way of processing ion schema file. (linlin-s)
1d1ff41 Updates float generating process. (linlin-s)
de35e99 Updates symbol and string generating process. (linlin-s)
3aeabc8 Updates blob and clob generating process. (linlin-s)
1eb60ff Updates int generating process. (linlin-s)
5e0619c Updates decimal generating process. (linlin-s)
8ee7330 Updates timestamp generating process. (linlin-s)
5be4df5 Temporarily comment some unit tests for generating container types o… (linlin-s)
89842cd Updates PR based on the suggestions from comments. (linlin-s)
b984845 Separates data constructing from data writing process and updates the… (linlin-s)
870c7a9 Updates based on the most recent comments: (linlin-s)
bca28f9 Remove the guava dependency and add comment for ReparsedConstraint. (linlin-s)
5b59119 Updates based on comments: (linlin-s)
b9e259e Adds comment to explain the case of 'open content'. (linlin-s)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| package com.amazon.ion.benchmark; | ||
| | ||
| import java.io.FilterOutputStream; | ||
| import java.io.IOException; | ||
| import java.io.OutputStream; | ||
| | ||
| public class CountingOutputStream extends FilterOutputStream { | ||
| private long count; | ||
| | ||
| /** | ||
| * Creates an output stream filter built on top of the specified | ||
| * underlying output stream. | ||
| * | ||
| * @param out the underlying output stream to be assigned to | ||
| * the field <tt>this.out</tt> for later use, or | ||
| * <code>null</code> if this instance is to be | ||
| * created without an underlying stream. | ||
| */ | ||
| public CountingOutputStream(OutputStream out) { | ||
| super(out); | ||
| } | ||
| | ||
| /** Returns the number of bytes written. */ | ||
| public long getCount() { | ||
| return count; | ||
| } | ||
| | ||
| @Override | ||
| public void write(byte[] b, int off, int len) throws IOException { | ||
| out.write(b, off, len); | ||
| this.count += len; | ||
| } | ||
| | ||
| @Override | ||
| public void write(int b) throws IOException { | ||
| out.write(b); | ||
| this.count++; | ||
| } | ||
| } | ||
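A minimal usage sketch for the class above, assuming it is copied as-is into a scratch file (made package-private here) and wrapped around an in-memory `ByteArrayOutputStream`. The `CountingDemo` class is hypothetical and not part of the pull request. It exercises all three `write` overloads: `write(byte[])` is inherited from `FilterOutputStream` and forwards to the overridden three-argument method, so it is counted too.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the pull request.
class CountingDemo {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (CountingOutputStream counter = new CountingOutputStream(sink)) {
            counter.write('a');                                     // write(int): 1 byte
            counter.write("bcd".getBytes(StandardCharsets.UTF_8));  // write(byte[]): 3 bytes
            counter.write(new byte[] {1, 2, 3, 4}, 1, 2);           // write(byte[], int, int): 2 bytes
            // prints "6 bytes counted, 6 bytes in sink"
            System.out.println(counter.getCount() + " bytes counted, " + sink.size() + " bytes in sink");
        }
    }
}

// Copy of the CountingOutputStream from the diff above.
class CountingOutputStream extends FilterOutputStream {
    private long count;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    /** Returns the number of bytes written. */
    long getCount() {
        return count;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        this.count += len;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        this.count++;
    }
}
```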
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,52 +1,83 @@ | ||
| package com.amazon.ion.benchmark; | ||
| | ||
| import com.amazon.ion.IonDatagram; | ||
| import com.amazon.ion.IonLoader; | ||
| import com.amazon.ion.IonReader; | ||
| import com.amazon.ion.IonStruct; | ||
| import com.amazon.ion.IonSystem; | ||
| import com.amazon.ion.IonType; | ||
| import com.amazon.ion.IonValue; | ||
| import com.amazon.ion.IonWriter; | ||
| import com.amazon.ion.system.IonReaderBuilder; | ||
| import com.amazon.ion.benchmark.schema.ReparsedType; | ||
| import com.amazon.ion.system.IonBinaryWriterBuilder; | ||
| import com.amazon.ion.system.IonSystemBuilder; | ||
| import com.amazon.ion.system.IonTextWriterBuilder; | ||
| import com.amazon.ionschema.Schema; | ||
| import com.amazon.ionschema.Type; | ||
| | ||
| import java.io.BufferedInputStream; | ||
| import java.io.File; | ||
| import java.io.FileInputStream; | ||
| import java.io.FileOutputStream; | ||
| import java.io.OutputStream; | ||
| | ||
| /** | ||
| * Parse Ion Schema file and get the general constraints in the file then pass the constraints to the Ion data generator. | ||
| * Parses an Ion Schema file, extracts the type definition as a ReparsedType object, and passes the re-parsed type definition to the Ion data generator. | ||
| */ | ||
| public class ReadGeneralConstraints { | ||
| public static final IonSystem SYSTEM = IonSystemBuilder.standard().build(); | ||
| public static final IonLoader LOADER = SYSTEM.newLoader(); | ||
| | ||
| /** | ||
| * Get general constraints of Ion Schema and call the relevant generator method based on the type. | ||
| * Constructs data that conforms to the ISL type definition and writes it to the output file. | ||
| * @param size is the size of the output file. | ||
| * @param path is the path of the Ion Schema file. | ||
| * @param schema an Ion Schema loaded by ion-schema-kotlin. | ||
| * @param format is the format of the generated file, select from set (ion_text | ion_binary). | ||
| * @param outputFile is the path of the generated file. | ||
| * @throws Exception if errors occur when reading and writing data. | ||
| * @throws Exception if errors occur when writing data. | ||
| */ | ||
| public static void readIonSchemaAndGenerate(int size, String path, String format, String outputFile) throws Exception { | ||
| try (IonReader reader = IonReaderBuilder.standard().build(new BufferedInputStream(new FileInputStream(path)))) { | ||
| IonDatagram schema = LOADER.load(reader); | ||
| for (int i = 0; i < schema.size(); i++) { | ||
| IonValue schemaValue = schema.get(i); | ||
| // Assume there's only one constraint between schema_header and schema_footer, if more constraints added, here is the point where developers should start. | ||
| if (schemaValue.getType().equals(IonType.STRUCT) && schemaValue.getTypeAnnotations()[0].equals(IonSchemaUtilities.KEYWORD_TYPE)) { | ||
| IonStruct constraintStruct = (IonStruct) schemaValue; | ||
| //Construct the writer and pass the constraints to the following writing data to files process. | ||
| File file = new File(outputFile); | ||
| try (IonWriter writer = WriteRandomIonValues.formatWriter(format, file)) { | ||
| WriteRandomIonValues.writeRequestedSizeFile(size, writer, file, constraintStruct); | ||
| } | ||
| // Print the successfully generated data notification which includes the file path information. | ||
| WriteRandomIonValues.printInfo(outputFile); | ||
| public static void constructAndWriteIonData(int size, Schema schema, String format, String outputFile) throws Exception { | ||
| // Assume there is only one type definition between schema_header and schema_footer. | ||
| // If more type definitions are added, this is the point where developers should start. | ||
| Type schemaType = schema.getTypes().next(); | ||
| ReparsedType parsedTypeDefinition = new ReparsedType(schemaType); | ||
| CountingOutputStream outputStreamCounter = new CountingOutputStream(new FileOutputStream(outputFile)); | ||
| try (IonWriter writer = formatWriter(format, outputStreamCounter)) { | ||
| int count = 0; | ||
| long currentSize = 0; | ||
| // Calibration: flush after every value until 5% of the target size has been written, counting the values. | ||
| // The loop below then writes that many values per flush, which reduces the total time spent in writer.flush(). | ||
| while (currentSize <= 0.05 * size) { | ||
| IonValue constructedData = DataConstructor.constructIonData(parsedTypeDefinition); | ||
| constructedData.writeTo(writer); | ||
| count++; | ||
| writer.flush(); | ||
| currentSize = outputStreamCounter.getCount(); | ||
| } | ||
| while (currentSize <= size) { | ||
| for (int i = 0; i < count; i++) { | ||
| IonValue constructedData = DataConstructor.constructIonData(parsedTypeDefinition); | ||
| constructedData.writeTo(writer); | ||
| } | ||
| writer.flush(); | ||
| currentSize = outputStreamCounter.getCount(); | ||
| } | ||
| } | ||
| // Print the successfully generated data notification which includes the file path information. | ||
| DataConstructor.printInfo(outputFile); | ||
| } | ||
| | ||
| /** | ||
| * Construct the writer based on the provided format (ion_text|ion_binary). | ||
| * @param format decides which writer should be constructed. | ||
| * @param outputStream represents the bytes stream which will be written into the output file. | ||
| * @return the writer which conforms with the required format. | ||
| */ | ||
| public static IonWriter formatWriter(String format, OutputStream outputStream) { | ||
| IonWriter writer; | ||
| Format formatName = Format.valueOf(format.toUpperCase()); | ||
| switch (formatName) { | ||
| case ION_BINARY: | ||
| writer = IonBinaryWriterBuilder.standard().withLocalSymbolTableAppendEnabled().build(outputStream); | ||
| break; | ||
| case ION_TEXT: | ||
| writer = IonTextWriterBuilder.standard().build(outputStream); | ||
| break; | ||
| default: | ||
| throw new IllegalStateException("Please input the format ion_text or ion_binary"); | ||
| } | ||
| return writer; | ||
| } | ||
| } |
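The two-phase flushing in `constructAndWriteIonData` can be tried without the Ion libraries. The sketch below is only an illustration under stated assumptions: a `BufferedOutputStream` stands in for the `IonWriter` (both hold data until `flush()`), `nextRecord` is a hypothetical replacement for `DataConstructor.constructIonData`, and `ByteCounter` mirrors `CountingOutputStream`. None of these names come from the pull request.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;

// Hypothetical demo class, not part of the pull request.
class BatchedFlushDemo {
    /** Hypothetical stand-in for DataConstructor.constructIonData: one random text record. */
    static byte[] nextRecord(Random random) {
        return (Long.toString(random.nextLong()) + "\n").getBytes(StandardCharsets.UTF_8);
    }

    /** Writes records until more than targetSize bytes reach the sink; returns the number of flushes. */
    static int writeUntil(long targetSize, ByteCounter counter, Random random) throws IOException {
        // The large buffer stands in for the IonWriter, which holds data until flush().
        OutputStream buffered = new BufferedOutputStream(counter, 1 << 20);
        int flushes = 0;
        int batch = 0;
        // Phase 1: flush after every record until 5% of the target is written,
        // counting how many records that took.
        while (counter.getCount() <= 0.05 * targetSize) {
            buffered.write(nextRecord(random));
            batch++;
            buffered.flush();
            flushes++;
        }
        // Phase 2: write `batch` records between flushes, so the size check
        // (and the flush) happens roughly once per 5% of the target.
        while (counter.getCount() <= targetSize) {
            for (int i = 0; i < batch; i++) {
                buffered.write(nextRecord(random));
            }
            buffered.flush();
            flushes++;
        }
        return flushes;
    }

    public static void main(String[] args) throws IOException {
        ByteCounter counter = new ByteCounter(new ByteArrayOutputStream());
        int flushes = writeUntil(10_000, counter, new Random(42));
        System.out.println(counter.getCount() + " bytes written with " + flushes + " flushes");
    }
}

// Mirrors CountingOutputStream: counts bytes that actually reach the underlying stream.
class ByteCounter extends FilterOutputStream {
    private long count;

    ByteCounter(OutputStream out) {
        super(out);
    }

    long getCount() {
        return count;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }
}
```

Because the counter only sees bytes after a flush, the final size overshoots the target by up to one batch. That is the trade-off for checking the size less often.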
You'll also need to override public void write(int b) throws IOException.
Right! My bad :( thanks for reminding. It will be resolved in the next commit.
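The point of the review comment can be reproduced with a small sketch. `FilterOutputStream.write(int)` forwards straight to the underlying stream, so a subclass that overrides only the three-argument `write` never sees single-byte writes. `ArrayOnlyCounter` and `FullCounter` are hypothetical names used for illustration; they are not classes from the pull request.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical demo class, not part of the pull request.
class WriteOverrideDemo {
    /** Emits one single-byte write and one two-byte array write. */
    static void emit(OutputStream out) throws IOException {
        out.write('x');
        out.write(new byte[] {1, 2}, 0, 2);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ArrayOnlyCounter partial = new ArrayOnlyCounter(sink);
        FullCounter full = new FullCounter(new ByteArrayOutputStream());
        emit(partial);
        emit(full);
        // prints "array-only: 2, full: 3, sink: 3"
        System.out.println("array-only: " + partial.getCount()
                + ", full: " + full.getCount() + ", sink: " + sink.size());
    }
}

// Overrides only the array method: single-byte writes bypass the counter.
class ArrayOnlyCounter extends FilterOutputStream {
    private long count;

    ArrayOnlyCounter(OutputStream out) {
        super(out);
    }

    long getCount() {
        return count;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }
}

// Overrides both methods, as the reviewer requested.
class FullCounter extends FilterOutputStream {
    private long count;

    FullCounter(OutputStream out) {
        super(out);
    }

    long getCount() {
        return count;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }
}
```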