Expand value interning optimization and fix XML factory transformer usage #2495
Merged
gnodet merged 1 commit into apache:master on Jul 17, 2025
elharo (Contributor) reviewed on Jun 24, 2025 and left a comment:
Are there any benchmarks for this?
Contributor
Author
I haven't seen any performance changes really. But the idea was more to save some memory when loading big projects, where lots of data is actually the same in all poms.
773956d to 2558e2b
2558e2b to 4515e4e
cstamas approved these changes on Jul 11, 2025
…erty

This PR expands the value interning optimization in Maven's XML parsing and fixes transformer usage across all XML factories to improve memory efficiency during Maven builds.

Expanded the InterningTransformer in DefaultModelBuilder to intern 27 commonly repeated contexts:

**Core Maven coordinates:**
- groupId, artifactId, version, namespaceUri, packaging

**Dependency-related fields:**
- scope, type, classifier

**Build and plugin-related fields:**
- phase, goal, execution

**Repository-related fields:**
- layout, policy, checksumPolicy, updatePolicy

**Common metadata fields:**
- modelVersion, name, url, system, distribution, status

**SCM fields:**
- connection, developerConnection, tag

**Common enum-like values:**
- id, inherited, optional

Added MAVEN_MODEL_BUILDER_INTERNS session property to allow users to customize which XML contexts are interned during POM parsing:
- Supports comma-separated list of field names
- User properties take precedence over system properties
- Falls back to default contexts when property not set
- Handles whitespace and empty values gracefully

Usage examples:
- mvn clean install -Dmaven.modelBuilder.interns="groupId,artifactId,version"
- maven.modelBuilder.interns=groupId,artifactId,version,scope,type

Fixed all XML factories to properly use the transformer from XmlReaderRequest:
- DefaultSettingsXmlFactory - Now uses transformer
- DefaultToolchainsXmlFactory - Now uses transformer
- DefaultPluginXmlFactory - Now uses transformer
- DefaultModelXmlFactory - Already working, verified

Added comprehensive test coverage:
- InterningTransformerTest.java - Tests interning logic and session property functionality
- XmlFactoryTransformerTest.java - Tests transformer usage across all XML factories

1. **Memory Efficiency**: String interning reduces memory usage by ensuring identical string values share the same object reference
2. **Performance**: Faster string comparisons using == instead of .equals() for interned strings
3. **Comprehensive Coverage**: All XML parsing in Maven (POMs, settings, toolchains, plugins) now benefits from interning
4. **Customizable**: Users can tailor interning to their specific use cases
5. **Maven-specific Optimization**: Targets the most commonly repeated values in Maven files

- **Backward Compatible**: No breaking changes - optimization is transparent to users
- **Automatic Application**: All XML parsing automatically benefits from interning
- **Proper Integration**: Transformers are correctly passed through the XML factory chain
- **Conservative Approach**: Only interns commonly repeated values to avoid memory overhead
- **Configurable**: Users can customize which fields are interned via session properties
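The property handling and context-restricted interning described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class `InterningSketch`, its method names, and the shortened default set are all hypothetical, and treating a blank property value as "intern nothing" is an assumption, since the message only says such values are handled gracefully.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: class and method names are illustrative, not Maven's own.
final class InterningSketch {
    // Property key as it appears in the usage examples above.
    static final String INTERNS_KEY = "maven.modelBuilder.interns";

    // A subset of the 27 default contexts, kept short for readability.
    static final Set<String> DEFAULT_CONTEXTS = Set.of("groupId", "artifactId", "version", "scope", "type");

    /** User properties win over system properties; defaults apply when neither is set. */
    static Set<String> resolveContexts(Map<String, String> userProps, Map<String, String> systemProps) {
        String raw = userProps.get(INTERNS_KEY);
        if (raw == null) {
            raw = systemProps.get(INTERNS_KEY);
        }
        if (raw == null) {
            return DEFAULT_CONTEXTS;
        }
        // Split on commas, trim whitespace, and drop empty entries.
        // Assumption: a blank value yields an empty set, i.e. nothing is interned.
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .collect(Collectors.toUnmodifiableSet());
    }

    /** Interns the value only when its context (field name) is in the configured set. */
    static String transform(Set<String> contexts, String value, String context) {
        return value != null && contexts.contains(context) ? value.intern() : value;
    }

    public static void main(String[] args) {
        Set<String> contexts = resolveContexts(Map.of(INTERNS_KEY, " groupId, ,version "), Map.of());
        String a = new String("org.example");
        String b = new String("org.example");
        System.out.println(contexts.contains("groupId")); // true
        // Interned context: both calls return the same canonical instance.
        System.out.println(transform(contexts, a, "groupId") == transform(contexts, b, "groupId")); // true
        // Non-interned context: the two distinct objects are returned unchanged.
        System.out.println(transform(contexts, a, "description") == transform(contexts, b, "description")); // false
    }
}
```

The last line of `main` also shows why `==` comparison is only safe when both operands are known to be interned; for any other field, `.equals()` is still required.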
4515e4e to 968190c
gnodet added a commit to gnodet/maven that referenced this pull request on Jul 17, 2025
…erty (apache#2495)
(cherry picked from commit e5d985c)
# Conflicts:
# src/site/markdown/configuration.properties
Contributor
Author
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation.
gnodet added a commit that referenced this pull request on Jul 18, 2025
…erty (#2495) (#10932)
(cherry picked from commit e5d985c)
# Conflicts:
# src/site/markdown/configuration.properties
gnodet added a commit to gnodet/maven that referenced this pull request on Jul 24, 2025
…erty (apache#2495)
Summary
This PR expands the value interning optimization in Maven's XML parsing and fixes transformer usage across all XML factories to improve memory efficiency during Maven builds.
Changes Made
1. Expanded InliningTransformer Coverage
Expanded the InliningTransformer in DefaultModelBuilder to intern 16 additional commonly repeated contexts (from 11 to 27 total):
Added contexts:
- type, classifier
- goal, execution
- modelVersion, name, url, system, distribution, status
- connection, developerConnection, tag
- id, inherited, optional

2. Fixed XML Factory Transformer Usage
Fixed all XML factories to properly use the transformer from XmlReaderRequest:
- DefaultSettingsXmlFactory - Now uses transformer
- DefaultToolchainsXmlFactory - Now uses transformer
- DefaultPluginXmlFactory - Now uses transformer
- DefaultModelXmlFactory - Already working, verified

3. Added Comprehensive Tests
Added two new JUnit test files:
- InliningTransformerTest.java - Tests interning logic and context coverage
- XmlFactoryTransformerTest.java - Tests transformer usage across all XML factories

Benefits
- Performance: Faster string comparisons using == instead of .equals() for interned strings

Technical Details
Testing
This addresses the memory optimization request for repeated string values in Maven's XML processing, providing better memory efficiency during Maven builds and operations.
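To illustrate the factory fix (an XML reader honoring the transformer carried by its request), here is a rough, self-contained sketch using the JDK's StAX API. The `Request` record and `TransformingReaderSketch` class are hypothetical stand-ins, not Maven's `XmlReaderRequest` or its factories, and the reader only collects leaf element text into a map rather than building a real model.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: these types only mimic the idea of a request-supplied transformer.
final class TransformingReaderSketch {
    /** The XML text plus an optional (value, context) -> value transformer. */
    record Request(String xml, BinaryOperator<String> transformer) {}

    /** Reads leaf element text into a map, passing every value through the request's transformer. */
    static Map<String, String> read(Request request) {
        // Fall back to identity when the caller supplied no transformer.
        BinaryOperator<String> transformer =
                request.transformer() != null ? request.transformer() : (value, context) -> value;
        Map<String, String> fields = new LinkedHashMap<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(request.xml()));
            String current = null;
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT -> {
                        current = reader.getLocalName();
                        text.setLength(0);
                    }
                    case XMLStreamConstants.CHARACTERS -> text.append(reader.getText());
                    case XMLStreamConstants.END_ELEMENT -> {
                        // Only leaf elements still match the most recently opened name.
                        if (current != null && current.equals(reader.getLocalName())) {
                            fields.put(current, transformer.apply(text.toString().trim(), current));
                        }
                        current = null;
                    }
                    default -> { }
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Unable to parse XML", e);
        }
        return fields;
    }
}
```

A caller could pass an interning function such as `(value, context) -> context.equals("id") ? value.intern() : value` as the transformer; a reader that ignores the request's transformer would silently skip that step, which is the kind of omission this PR reports fixing for the settings, toolchains, and plugin factories.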
Files Changed
- impl/maven-impl/src/main/java/org/apache/maven/impl/model/DefaultModelBuilder.java - Expanded CONTEXTS set
- impl/maven-impl/src/main/java/org/apache/maven/impl/DefaultSettingsXmlFactory.java - Added transformer usage
- impl/maven-impl/src/main/java/org/apache/maven/impl/DefaultToolchainsXmlFactory.java - Added transformer usage
- impl/maven-impl/src/main/java/org/apache/maven/impl/DefaultPluginXmlFactory.java - Added transformer usage
- impl/maven-impl/src/main/java/org/apache/maven/impl/DefaultModelXmlFactory.java - Verified transformer usage
- impl/maven-impl/src/test/java/org/apache/maven/impl/model/InliningTransformerTest.java - New test file
- impl/maven-impl/src/test/java/org/apache/maven/impl/XmlFactoryTransformerTest.java - New test file

Pull Request opened by Augment Code with guidance from the PR author