Skip to content

[MNG-7129] Incremental build and shared cache feature#526

Merged
hboutemy merged 1 commit intoapache:MNG-7129from
deutschebank:MNG-7129/maven-3.6.3-incremental
Sep 8, 2021
Merged

[MNG-7129] Incremental build and shared cache feature#526
hboutemy merged 1 commit intoapache:MNG-7129from
deutschebank:MNG-7129/maven-3.6.3-incremental

Conversation

@maximilian-novikov-db
Copy link
Copy Markdown

@maximilian-novikov-db maximilian-novikov-db commented Aug 20, 2021

To familiarize yourself with the feature we recommend to start with the provided doc: https://github.com/deutschebank/maven/blob/7d1b8c094f2145909978142a0faa4257a244dc22/Documentation/CACHE.md

An example of nginx config for remote cache:

worker_processes  4;

events {
    worker_connections  1024;
}


http {
    include       mime.types;
    default_type  application/octet-stream;

    sendfile        on;

    keepalive_timeout  65;

uht="$upstream_header_time" urt="$upstream_response_time"';

    server {
        listen       10080;
        server_name  localhost;

upstream_time;
        access_log   off;

        location / {
            root  <PATH_TO_CACHE_FOLDER>;
            autoindex on;
            client_max_body_size 1G;
            client_body_temp_path <PATH_TO_TEMP_CACHE_FOLDER>;
            dav_methods PUT DELETE MKCOL COPY MOVE;
            create_full_put_path  on;
            dav_access            group:rw  all:r;
        }
    }
}

…en-3.6.3 to MNG-7129/maven-3.6.3-incremental

Squashed commit of the following:

commit 21c620a34ddbbf84b3c44014b069a7b222cf91a7
Author: Maximilian Novikov <maximilian.novikov@db.com>
Date:   Tue Aug 10 19:41:41 2021 +0200

    Merge pull request apache#3 in ABFX/maven-forked from code-cleanup-2 to incremental-maven-3.6.3

    Squashed commit of the following:

    commit 3401739ec9eb957fb03887f6a3c24477c9c3327b
    Author: Alexander Ashitkin <alexander.ashitkin@db.com>
    Date:   Tue Aug 10 13:37:13 2021 -0400

        cleanup db specifics

    commit 5d32f00559951445ecaf7d5ec4d847541c10e7e9
    Author: Alexander Ashitkin <alexander.ashitkin@db.com>
    Date:   Tue Aug 10 11:05:28 2021 -0400

        cleanup db specifics

commit fcec28f82c0d447de64032c610b39fa75bb344b0
Merge: 6fd401af2 5b0dc1858
Author: Alexander Ashitkin <alexander.ashitkin@db.com>
Date:   Mon Aug 9 13:22:16 2021 +0100

    Pull request apache#9: Remote cache setup

    Merge in BND/maven-forked from feature/remote-cache-setup to incremental-maven-3.6.3

    * commit '5b0dc1858cf4ca499502f0741b2f3bf2ac55530c':
      remote cache setup - fail fast, baseline build, diff reporting, documentation.

commit 5b0dc1858cf4ca499502f0741b2f3bf2ac55530c
Author: Alexander Ashitkin <alexander.ashitkin@db.com>
Date:   Mon Mar 22 03:30:03 2021 +0100

    remote cache setup - fail fast, baseline build, diff reporting, documentation.

commit 6fd401af285692df568312ee66ee00edfe21f1cc
Author: Maximilian Novikov <maximilian.novikov@db.com>
Date:   Tue Apr 13 17:36:49 2021 +0100

    Pull request apache#8: Incremental Maven - added non-overrideable cache entries

    Merge in BND/maven-forked from non-overritable-build-info to incremental-maven-3.6.3

    Squashed commit of the following:

    commit 405bbef8a59cabc0b83e9a64717f65fbd6fb9e3d
    Author: maximilian.novikov@db.com <maximilian.novikov@db.com>
    Date:   Mon Apr 12 20:18:27 2021 +0200

        Incremental Maven - added non-overrideable cache entries - review fixes

    commit 197704672756e558fb3521e16859d0be324044b8
    Author: maximilian.novikov@db.com <maximilian.novikov@db.com>
    Date:   Fri Apr 9 11:35:48 2021 +0200

        Incremental Maven - added non-overrideable cache entries

commit 64c4b1f5cbdf920ad43560f34cc5be56d4557acf
Author: Alexander Ashitkin <alexander.ashitkin@db.com>
Date:   Tue Mar 9 04:10:20 2021 +0100

    injected version

commit 727e507ae2fd90dce7011f3f1c76eb4969577d5e
Author: maximilian.novikov@db.com <maximilian.novikov@db.com>
Date:   Wed Jan 13 16:17:25 2021 +0100

    Incremental Maven - Java 11 fix

commit b2f297566d5afcb0a285985cb5540685a28cafec
Author: maximilian.novikov@db.com <maximilian.novikov@db.com>
Date:   Tue Jan 12 14:10:08 2021 +0100

    Incremental Maven - initial commit
@maximilian-novikov-db maximilian-novikov-db changed the title Incremental build and shared cache feature [MNG-7129] Incremental build and shared cache feature Aug 20, 2021
@michael-o
Copy link
Copy Markdown
Member

Massive! I already see room for improvement.
How long did it take to go from concept to production?

@rmannibucau
Copy link
Copy Markdown
Contributor

Hi,

Looks very promishing, congrats!
I just have 2 small notes on the overall PR.

  1. Since we got a positive vote on mvnd, did you think about making both converging to avoid to have two concurrent impl in the repository at the end? Think it is very doable. Can be worth a thread on the list but at the end I see it very beneficial for maven.
  2. The only small technical note I would do right now is that jaxb should probably be replaced by standard maven xml parsing (xpp3 or alike), http client by wagon or other built-in dependencies - long story short no maven-core/pom.xml dep change.
    For the charset detection I wonder if something like https://github.com/apache/johnzon/blob/7751e96c4731a16a620a99e1165efac98db30566/johnzon-core/src/main/java/org/apache/johnzon/core/RFC4627AwareInputStreamReader.java#L86 would help and avoid the ENCODING_DETECTOR which looks a bit overkill for text file only.

Hope it makes sense.

@rfscholte
Copy link
Copy Markdown
Contributor

If I recall correclty there were one or two existing classes that needed to be extended to be able to support the cache. Could we move those to a separate PR? That will make reduce the amount of code to validate the rest of the and gives us enough time to experiment with it.
One of the things we need to discuss is whether the rest of the code should be moved to an extension.
With a separate PR we can verify all options.

@hboutemy
Copy link
Copy Markdown
Member

@rmannibucau

Since we got a positive vote on mvnd

@olamy any news on this? need help?

@@ -0,0 +1,49 @@
package org.apache.maven.caching; /* * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information * regarding copyright ownership. The ASF licenses this file * to you under the Apache License, Version 2.0 (the * "License"); you may not use this file except in compliance * with the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, * software distributed under the License is distributed on an * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY * KIND, either express or implied. See the License for the * specific language governing permissions and limitations * under the License. */ /* * Licensed to the Apache Software Foundation (ASF) under one * or more contributor license agreements. See the NOTICE file * distributed with this work for additional information * regarding copyright ownership. The ASF licenses this file * to you under the Apache License, Version 2.0 (the * "License"); you may not use this file except in compliance * with the License. You may obtain a copy of the License at * * http://www.apache.org/licenses/LICENSE-2.0 * * Unless required by applicable law or agreed to in writing, * software distributed under the License is distributed on an * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY * KIND, either express or implied. See the License for the * specific language governing permissions and limitations * under the License. */
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks like a merge typo

Comment thread Documentation/CACHE-HOWTO.md

### Minimal config

Absolutely minimal config which enables incremental maven with local cache
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should state to put the content in .mvn/maven-cache-config.xml file

@hboutemy
Copy link
Copy Markdown
Member

If I recall correclty there were one or two existing classes that needed to be extended to be able to support the cache. Could we move those to a separate PR? That will make reduce the amount of code to validate the rest of the and gives us enough time to experiment with it.
One of the things we need to discuss is whether the rest of the code should be moved to an extension.
With a separate PR we can verify all options.

from first PR review, most code is put in maven-core in a new org.apache.maven.caching Java package (and generated jaxb sub-package): I suppose this whole code could me moved to a separate maven-caching Maven module...

this caching is injected in maven-core through quite a few core classes modifications:

  • org.apache.maven.graph.DefaultGraphBuilder
  • org.apache.maven.lifecycle.internal.DependencyContext
  • org.apache.maven.lifecycle.internal.MojoExecutor
  • org.apache.maven.lifecycle.internal.NoResolutionContext
  • org.apache.maven.plugin.DefaultBuildPluginManager
  • org.apache.maven.plugin.MojoCheker
  • org.apache.maven.plugin.MojoExecution
  • org.apache.maven.project.DefaultProjectBuilder
  • org.apache.maven.project.MavenProject

I fear swiching these updates to core extension will be hard...

on how to learn the code, I don't see the PR being split in multiple PRs, but perhaps the unique commit being split in multiple

one source of noisy modifications (and many new dependencies addition) is using jaxb to read XML config files instead of using our usual Modello

mojoExecutionListener.beforeMojoExecution( mojoExecutionEvent );

mojo.execute();
if ( mojoExecution.canRun( mojo, session ) )
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of the modifications for this class seem to be purely formatting: restoring initial formatting would ease review

@AlexanderAshitkin
Copy link
Copy Markdown

AlexanderAshitkin commented Aug 22, 2021

Hi
Thanks a lot for the feedback. The idea of this PR is not to finalize the feature but to demonstrate curent state, and to gather initial feedback to decide on the further plan. Realistically looking at the current state, there is an expectedly long way forward to align this feature with all the guidelines and architecure requirements. Though the feedback is reasonable and this is exactly what we are looked for, the implementation in this branch is stable and better to be kept as is. Reworks in small increments looks easier to move forward. At the moment forward following apporach appears reasonable:

  1. Merge this to a feature branch as is as a baseline
  2. Find way to safely enable for those who is interested under some "Experimental feature" umbrella (with minimal modifications). Ideally merge it to an official distribution (of course safely and compliantly) under a feature flag which will provide a sort of user consent.
  3. Gather and accumulate community feedback to produce a sort of feature roadmap. The ultimate target is integrating it to an official release with all the requirements satisified.
  4. Continue work in feature branch to adress p.2 decisions and achive mutual agreement between community members
  5. integrate it in an agreed/safe way and remove "Experimental flag"

Small clarifications to review comments:

  1. @rmannibucau rework repositories - absolutely reasonable, should be aligned. There are some open question to discuss, like best layout for cache, associated use cases - like purging cache, store cache metadata, etc. All this could evolve to some repositories design decisions. Overall approach was is to minimize changes in existing classes.
  2. @rmannibucau Replace with built-in dependencies - reasonable, to be discussed. On use of jaxb - the reasone to use it was schema validation. If modello supports that - could easily be reused. ENCODING_DETECTOR - it supports wide set ofencodings beyond UTF unlike of AwareInputStreamReader. So to be discussed case by case, but in overall agree here.
  3. @rfscholte on splitting PR - though technically of course possible, but decomposing seems to be a lot of work by itself and will leave feature in a half state. Our idea is to merge this as is to a feature branch and continue with smaller enhancements from there as necessary. They could be targeted to master or to another feature branch.
  4. regarding location of the feature - it feels like it should be a seprate module or an extension. Happy to rework in that direction - guidance here is very welcome
  5. @hboutemy - we will restore original formatting in core and similar to not add noise in review

From my perspective there are folowing implementation related streams of work:

  • Move feature to a separate module/extension safely isolated from core and find a proper way to plug it in safely.
  • Align implementation with built-in dependencies (like the mentioned Wagon, Modello, etc)
  • Agree proper approach to tests implementation for this feature (like introducing wiremock?) and achieve necessary level of confidence in tests coverage
  • Work on supplementary features like versionless projects/builds and version injection, better plugins paramters api and some other

Thank you

@olamy
Copy link
Copy Markdown
Member

olamy commented Aug 23, 2021

@olamy any news on this? need help?

@hboutemy my bad totally forgot this... I will have a try next week.

@maximilian-novikov-db
Copy link
Copy Markdown
Author

Massive! I already see room for improvement.
How long did it take to go from concept to production?

2 weeks for a PoC and a month for 1st working version in prod :)

@michael-o
Copy link
Copy Markdown
Member

Massive! I already see room for improvement.
How long did it take to go from concept to production?

2 weeks for a PoC and a month for 1st working version in prod :)

How many FTEs?

@maximilian-novikov-db
Copy link
Copy Markdown
Author

Massive! I already see room for improvement.
How long did it take to go from concept to production?

2 weeks for a PoC and a month for 1st working version in prod :)

How many FTEs?

~1.5

@Tibor17
Copy link
Copy Markdown
Contributor

Tibor17 commented Aug 30, 2021 via email

@rmannibucau
Copy link
Copy Markdown
Contributor

@qweek
Copy link
Copy Markdown

qweek commented Sep 1, 2021

Hi Alex, Excellent work and an elaborat. Unfortunately the xxHash algorithm is not in the list https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html We must guarantee that every java would support it. T

Used open-source implementation:
https://github.com/OpenHFT/Zero-Allocation-Hashing

@rmannibucau
Copy link
Copy Markdown
Contributor

Hi Alex, Excellent work and an elaborat. Unfortunately the xxHash algorithm is not in the list https://docs.oracle.com/javase/8/docs/technotes/guides/security/StandardNames.html We must guarantee that every java would support it. T

Used open-source implementation:
https://github.com/OpenHFT/Zero-Allocation-Hashing

FYI it is trivial to fork and takes only <150 lines and enables to drop the unsafe usage and other deps - it only depends on bytebuffer or alike - which requires some careness on recent java versions - does not dramatically change performances compared to sha algorithms.

@qweek
Copy link
Copy Markdown

qweek commented Sep 8, 2021

FYI it is trivial to fork and takes only <150 lines and enables to drop the unsafe usage and other deps - it only depends on bytebuffer or alike - which requires some careness on recent java versions - does not dramatically change performances compared to sha algorithms.

Unfortunately it's not 150 lines of code, I removed all unused code and it still depends on Unsafe and ByteBuffer

@rmannibucau
Copy link
Copy Markdown
Contributor

@qweek had this kind of thing in mind: https://gist.github.com/rmannibucau/756156a53e792f2b68c3339db8ee97a6. Depending on bytebuffer is fine (it is in the JRE), unsafe is acceptable even if not needed and depends how you integrate with JVM - Java agreed to keep it but some version(s) can require some integration so I preferred to drop it in the project I'm using it.

@hboutemy hboutemy merged commit a66d468 into apache:MNG-7129 Sep 8, 2021
@hboutemy
Copy link
Copy Markdown
Member

hboutemy commented Sep 8, 2021

PR merged to branch: thanks a lot for the great contribution

@qweek
Copy link
Copy Markdown

qweek commented Sep 9, 2021

@rmannibucau I see what you mean: it will kill a few optimizations, generate more objects and probably we need to change ByteOrder (default byte buffer order is BIG_ENDIAN and algorithm order is LITTLE_ENDIAN). But you're right, it's much cleaner and simpler, thank you for sharing your vision.
Safe version with almost 100% code coverage (about 7% slower, but still very fast compared to SHA)

@Dunemaster
Copy link
Copy Markdown

@maximilian-novikov-db , how does this work with Idea?
Does the IDE use cache?
Can I import a maven submodule, or should I always import the whole monorepo in idea?

@maximilian-novikov-db
Copy link
Copy Markdown
Author

maximilian-novikov-db commented Nov 3, 2021

@maximilian-novikov-db , how does this work with Idea? Does the IDE use cache? Can I import a maven submodule, or should I always import the whole monorepo in idea?

@Dunemaster

  1. Yes, it uses the cache. The know limitation - if you have a code generation in your project, you will need to manually reload the project in Idea after the restoration from the cache, to sync Idea's caches. (literally click Maven/'Reload project' in menu)
  2. You can do both ways, you can just import a submodule.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants