Skip to content

Conversation

@openinx
Copy link
Member

@openinx openinx commented Nov 9, 2021

This PR is trying to address this comment: #3501 (comment)

@github-actions github-actions bot added the INFRA label Nov 9, 2021
@openinx
Copy link
Member Author

openinx commented Nov 9, 2021

Oh, I see. The correct way to enable checkstyle check for flink modules is removing the ignored flink paths in java-ci.yaml, instead of enabling all the engines checkstyle check by removing the -Pquick=true.

@openinx openinx changed the title Build: Enable checkstyle and error-prone check in Github CI Build: Enable engine's checkstyle and error-prone check in Github CI Nov 9, 2021
Copy link
Contributor

@kbendick kbendick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a comment. I too have noticed that checkstyle wasn't run for something, though that case might have been because we were in the middle of migrating to the broken out structure by version.

Comment on lines -31 to -36
- 'mr/**'
- 'hive3/**'
- 'hive3-orc-bundle/**'
- 'hive-runtime/**'
- 'spark/**'
- 'flink/**'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this will solve your problem, but it might also lead to running java-ci a lot more than is necessary, as it does more than just checkstyle.

Am I correct in thinking that we're just trying to ensure checkstyle runs over changed files? Is there a way we can avoid running the additional 6-11 minute action that makes up all of java-ci for all of these paths if we only need a subset of it?

Or maybe we can run checkstyle in its own workflow to avoid having to run it in every single workflow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my view, the overall checkstyle check is necessary. I think the problem is: removing all those ignored engine paths will trigger the extra iceberg-core unit tests check. Because if any engine related changes happen , it will always run the iceberg-core unit tests (even we don't ) I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @openinx. This common rule is the one that should be running the checks. This is to avoid making all of the CI builds more expensive because checkstyle and errorprone take a long time to run.

One thing that we still need to do after this is enable all Flink, Spark, and Hive versions in the job config, not just the default ones.

@kbendick
Copy link
Contributor

Oh, I see. The correct way to enable checkstyle check for flink modules is removing the ignored flink paths in java-ci.yaml, instead of enabling all the engines checkstyle check by removing the -Pquick=true.

I think it would be better to keep the ignors as are, but remote -Pquick=true instead. That way we keep the tests shorter by only running them for the change set they belong too (as well as more parallel).

I am also working on a PR to make them all exit early if one fails (like before when they were one file).

Running all of them if it's possile to avoid it uses up limited capacity for GH Actions (the whole Apache organization shares one set of runners which is pretty limited in size relatively speaking).

@openinx
Copy link
Member Author

openinx commented Nov 10, 2021

I think it would be better to keep the ignors as are, but remote -Pquick=true instead. That way we keep the tests shorter by only running them for the change set they belong too (as well as more parallel).

I choose another way in the latest commit because I don't expect to check the checkstyle for those common modules (such as iceberg-api, iceberg-core, iceberg-parquet, iceberg-data, iceberg-orc , iceberg-common etc) many times if a PR change codes from different engines or an engine with different versions (small bugfix port).

You can see the checkstyle check logs by running the command:

➜  apache-iceberg git:(master) ./gradlew -DflinkVersions=1.13 iceberg-flink:iceberg-flink-1.13:check -x javadoc -x test       

> Task :iceberg-common:compileJava
/Users/openinx/software/apache-iceberg/common/src/main/java/org/apache/iceberg/common/DynFields.java:198: warning: [ImplicitPublicBuilderConstructor] A Builder with a static factory method on the encapsulating class must have a private constructor. Minimizing unnecessary public API prevents future API breaks from impacting consumers. 
  public static class Builder {
                ^
    (see https://github.com/palantir/gradle-baseline#baseline-error-prone-checks)
  Did you mean 'private boolean defaultAlwaysNull = false;'?
/Users/openinx/software/apache-iceberg/common/src/main/java/org/apache/iceberg/common/DynClasses.java:35: warning: [ImplicitPublicBuilderConstructor] A Builder with a static factory method on the encapsulating class must have a private constructor. Minimizing unnecessary public API prevents future API breaks from impacting consumers. 
  public static class Builder {
                ^
    (see https://github.com/palantir/gradle-baseline#baseline-error-prone-checks)
  Did you mean 'private Set<String> classNames = new LinkedHashSet<>();'?
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2 warnings

> Task :iceberg-api:compileJava
/Users/openinx/software/apache-iceberg/api/src/main/java/org/apache/iceberg/util/ExceptionUtil.java:124: warning: [Finally] If you return or throw from a finally, then values returned or thrown from the try-catch block will be ignored. Consider using try-with-resources instead.
            throw new RuntimeException("Unknown exception in finally block", failure);
            ^
    (see https://errorprone.info/bugpattern/Finally)
/Users/openinx/software/apache-iceberg/api/src/main/java/org/apache/iceberg/io/FileIO.java:77: warning: [MissingOverride] close implements method in Closeable; expected @Override
  default void close() {
               ^
    (see https://errorprone.info/bugpattern/MissingOverride)
  Did you mean '@Override default void close() {'?
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
2 warnings

> Task :iceberg-api:compileTestJava
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :iceberg-core:compileJava
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/encryption/InputFilesDecryptor.java:46: warning: [StreamToIterable] Using stream::iterator creates a one-shot Iterable, which may cause surprising failures.
    Iterable<InputFile> decryptedFiles = encryption.decrypt(encrypted::iterator);
                                                            ^
    (see https://errorprone.info/bugpattern/StreamToIterable)
  Did you mean 'Iterable<InputFile> decryptedFiles = encryption.decrypt(encrypted.collect(toImmutableList()));'?
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/mapping/MappedFields.java:116: warning: [ObjectsHashCodeUnnecessaryVarargs] java.util.Objects.hash(non-varargs) should be replaced with java.util.Objects.hashCode(value) to avoid unnecessary varargs array allocations.
    return Objects.hash(fields);
                       ^
    (see https://github.com/palantir/gradle-baseline#baseline-error-prone-checks)
  Did you mean 'return Objects.hashCode(fields);'?
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/SnapshotSummary.java:67: warning: [ImplicitPublicBuilderConstructor] A Builder with a static factory method on the encapsulating class must have a private constructor. Minimizing unnecessary public API prevents future API breaks from impacting consumers. 
  public static class Builder {
                ^
    (see https://github.com/palantir/gradle-baseline#baseline-error-prone-checks)
  Did you mean 'private boolean trustPartitionMetrics = true;'?
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/ScanSummary.java:298: warning: [JavaUtilDate] Date has a bad API that leads to bugs; prefer java.time.Instant or LocalDate.
          new Date(dataTimestampMillis).toString() : null;
                                                ^
    (see https://errorprone.info/bugpattern/JavaUtilDate)
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/ScanSummary.java:298: warning: [JavaUtilDate] Date has a bad API that leads to bugs; prefer java.time.Instant or LocalDate.
          new Date(dataTimestampMillis).toString() : null;
          ^
    (see https://errorprone.info/bugpattern/JavaUtilDate)
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/RemoveSnapshots.java:120: warning: [JavaUtilDate] Date has a bad API that leads to bugs; prefer java.time.Instant or LocalDate.
    LOG.info("Expiring snapshots older than: {} ({})", new Date(timestampMillis), timestampMillis);
                                                       ^
    (see https://errorprone.info/bugpattern/JavaUtilDate)
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/util/StructLikeSet.java:179: warning: [ObjectsHashCodeUnnecessaryVarargs] java.util.Objects.hash(non-varargs) should be replaced with java.util.Objects.hashCode(value) to avoid unnecessary varargs array allocations.
    return Objects.hash(type) + wrapperSet.stream().mapToInt(StructLikeWrapper::hashCode).sum();
                       ^
    (see https://github.com/palantir/gradle-baseline#baseline-error-prone-checks)
  Did you mean 'return Objects.hashCode(type) + wrapperSet.stream().mapToInt(StructLikeWrapper::hashCode).sum();'?
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/util/Tasks.java:574: warning: [StreamToIterable] Using stream::iterator creates a one-shot Iterable, which may cause surprising failures.
    return new Builder<>(items::iterator);
                         ^
    (see https://errorprone.info/bugpattern/StreamToIterable)
  Did you mean 'return new Builder<>(items.collect(toImmutableList()));'?
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/MetricsModes.java:132: warning: [ObjectsHashCodeUnnecessaryVarargs] java.util.Objects.hash(non-varargs) should be replaced with java.util.Objects.hashCode(value) to avoid unnecessary varargs array allocations.
      return Objects.hash(length);
                         ^
    (see https://github.com/palantir/gradle-baseline#baseline-error-prone-checks)
  Did you mean 'return Objects.hashCode(length);'?
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/jdbc/JdbcUtil.java:32: warning: [ProtectedMembersInFinalClass] Make members of final classes package-private: CATALOG_TABLE_NAME, CATALOG_NAME, TABLE_NAMESPACE, TABLE_NAME, METADATA_LOCATION, PREVIOUS_METADATA_LOCATION, CREATE_CATALOG_TABLE, GET_TABLE_SQL, LIST_TABLES_SQL, RENAME_TABLE_SQL, DROP_TABLE_SQL, GET_NAMESPACE_SQL, LIST_NAMESPACES_SQL, DO_COMMIT_CREATE_TABLE_SQL
  protected static final String CATALOG_TABLE_NAME = "iceberg_tables";
                                ^
    (see https://errorprone.info/bugpattern/ProtectedMembersInFinalClass)
  Did you mean 'static final String CATALOG_TABLE_NAME = "iceberg_tables";'?
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/ManifestFiles.java:263: warning: [Finally] If you return or throw from a finally, then values returned or thrown from the try-catch block will be ignored. Consider using try-with-resources instead.
          throw new RuntimeIOException(e, "Failed to close manifest: %s", outputFile);
          ^
    (see https://errorprone.info/bugpattern/Finally)
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/avro/Avro.java:79: warning: [ImmutableEnumChecker] enums should be immutable: 'CodecName' has field 'avroCodec' of type 'org.apache.avro.file.CodecFactory', the declaration of type 'org.apache.avro.file.CodecFactory' is not annotated with @com.google.errorprone.annotations.Immutable
    private final CodecFactory avroCodec;
                               ^
    (see https://errorprone.info/bugpattern/ImmutableEnumChecker)
/Users/openinx/software/apache-iceberg/core/src/main/java/org/apache/iceberg/avro/Avro.java:562: warning: [UnnecessaryLambda] Returning a lambda from a helper method or saving it in a constant is unnecessary; prefer to implement the functional interface method directly and use a method reference instead.
    private final Function<Schema, DatumReader<?>> defaultCreateReaderFunc = readSchema -> {
                                                   ^
    (see https://errorprone.info/bugpattern/UnnecessaryLambda)
  Did you mean 'private  DatumReader<?> defaultCreateReaderFunc(Schema readSchema){'?
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
13 warnings

> Task :iceberg-orc:compileJava
/Users/openinx/software/apache-iceberg/orc/src/main/java/org/apache/iceberg/orc/IdToOrcName.java:42: warning: [UnescapedEntity] This HTML entity is invalid. Enclosing the code in this <pre>/<code> tag with a {@code } block will force Javadoc to interpret HTML literally.
 * struct -> field                      struct.field      struct.field
           ^
    (see https://errorprone.info/bugpattern/UnescapedEntity)
  Did you mean '* <pre><code>{@code'?

> Task :iceberg-hive-metastore:compileJava
/Users/openinx/software/apache-iceberg/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:419: warning: [ReverseDnsLookup] Calling address.getHostName may result in a reverse DNS lookup which is a network request, making the invocation significantly more expensive than expected depending on the environment.
        InetAddress.getLocalHost().getHostName());
                                              ^
  This check is intended to be advisory - it's fine to @SuppressWarnings("ReverseDnsLookup") in certain cases, but is usually not recommended.
    (see https://github.com/palantir/gradle-baseline#baseline-error-prone-checks)
  Did you mean 'InetAddress.getLocalHost().getHostAddress());'?
Note: /Users/openinx/software/apache-iceberg/hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
1 warning

> Task :iceberg-orc:compileJava
/Users/openinx/software/apache-iceberg/orc/src/main/java/org/apache/iceberg/data/orc/GenericOrcWriters.java:338: warning: [JavaLocalDateTimeGetNano] localDateTime.getNano() only accesss the nanos-of-second field. It's rare to only use getNano() without a nearby getSecond() call.
      cv.nanos[rowId] = (data.getNano() / 1_000) * 1_000; // truncate nanos to only keep microsecond precision
                                     ^
    (see https://errorprone.info/bugpattern/JavaLocalDateTimeGetNano)
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
2 warnings

> Task :iceberg-hive-metastore:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: /Users/openinx/software/apache-iceberg/hive-metastore/src/test/java/org/apache/iceberg/hive/TestHiveCatalog.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :iceberg-parquet:compileJava
/Users/openinx/software/apache-iceberg/parquet/src/main/java/org/apache/iceberg/parquet/BasePageIterator.java:166: warning: [NonCanonicalType] The type `BasePageIterator.NullIntIterator` was referred to by the non-canonical name `PageIterator.NullIntIterator`. This may be misleading.
        return new PageIterator.NullIntIterator();
                               ^
    (see https://errorprone.info/bugpattern/NonCanonicalType)
  Did you mean 'return new NullIntIterator();'?
/Users/openinx/software/apache-iceberg/parquet/src/main/java/org/apache/iceberg/parquet/ParquetMetricsRowGroupFilter.java:507: warning: [UnusedMethod] Private method 'converterFor' is never used.
  private static Function<Object, Object> converterFor(PrimitiveType parquetType, Type icebergType) {
                                          ^
    (see https://errorprone.info/bugpattern/UnusedMethod)
  Did you mean to remove this line?

> Task :iceberg-core:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :iceberg-parquet:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
2 warnings

> Task :iceberg-data:compileJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

> Task :iceberg-data:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

> Task :iceberg-flink:iceberg-flink-1.13:compileJava
/Users/openinx/software/apache-iceberg/flink/v1.13/flink/src/main/java/org/apache/iceberg/flink/FlinkCatalog.java:489: warning: [MixedMutabilityReturnType] This method returns both mutable and immutable collections or maps from different paths. This may be confusing for users of the method.
  private static List<String> toPartitionKeys(PartitionSpec spec, Schema icebergSchema) {
                              ^
    (see https://errorprone.info/bugpattern/MixedMutabilityReturnType)
  Did you mean 'private static ImmutableList<String> toPartitionKeys(PartitionSpec spec, Schema icebergSchema) {'?
/Users/openinx/software/apache-iceberg/flink/v1.13/flink/src/main/java/org/apache/iceberg/flink/data/FlinkOrcWriters.java:151: warning: [JavaInstantGetSecondsGetNano] instant.getNano() only accesses the underlying nanosecond adjustment from the whole second.
      cv.nanos[rowId] = (instant.getNano() / 1_000) * 1_000;
                                        ^
    (see https://errorprone.info/bugpattern/JavaInstantGetSecondsGetNano)
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: /Users/openinx/software/apache-iceberg/flink/v1.13/flink/src/main/java/org/apache/iceberg/flink/data/RowDataUtil.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.
2 warnings

> Task :iceberg-flink:iceberg-flink-1.13:compileTestJava
Note: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: /Users/openinx/software/apache-iceberg/flink/v1.13/flink/src/test/java/org/apache/iceberg/flink/source/BoundedTestSource.java uses unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 8.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

See https://docs.gradle.org/7.2/userguide/command_line_interface.html#sec:command_line_warnings

BUILD SUCCESSFUL in 1m 7s
34 actionable tasks: 32 executed, 2 up-to-date

@rdblue
Copy link
Contributor

rdblue commented Nov 14, 2021

Rerunning CI since I think the Javadoc failure was unrelated.

@rdblue rdblue merged commit 0759b28 into apache:master Nov 14, 2021
@rdblue
Copy link
Contributor

rdblue commented Nov 14, 2021

Thanks, @openinx!

@kbendick
Copy link
Contributor

Thank you @openinx! This was an important find!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants