
7662 make solrconfig less static #8320

Closed
poikilotherm wants to merge 60 commits into IQSS:develop from poikilotherm:7662-solrconfig

@poikilotherm (Contributor) commented Dec 21, 2021

What this PR does / why we need it:

Solr updates become easier when changes are trackable. Moving the config files around makes searching the change history cumbersome, and it is hard to track why and when certain additions happened. Declaring the necessary changes against a vanilla solrconfig.xml makes this much easier.

This pull request will also add the necessary bits to the installer, docs etc. to make use of this. Tools like container builds and dataverse-ansible (@donsizemore) might reuse the XSLTs directly.

TODOs:

  • Integrate distributable ZIP with installer Makefile
  • Make it easier to configure stuff like Solr request header size, limits, etc
  • Add solr-maven-plugin to start a Solr server with a single Maven command (great for ephemeral dev usage)
  • Add a container profile to create images from Maven
  • Add GitHub Workflows to test the module on changes
  • Add docs to installation guide (changes to how this works when installing Solr)
  • Add docs to development guide (changes to how you can use this to simplify your dev setup)
  • Add docs about the container image + tips and tricks
  • Add release notes
  • Speak to @donsizemore about how to integrate this with dataverse-ansible

Which issue(s) this PR closes:

Special notes for your reviewer:
This is a draft as long as outstanding TODOs are present.

Note the change to Sphinx guides retrieving the Solr version from the parent POM to make updates easier and avoid missing changes.

Suggestions on how to test this:
Run mvn -f modules/solr-configset/pom.xml package to compile the config, create the ZIP and execute the integration test.
Usual integration testing should make sure the index configuration is working properly.

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Nope.

Is there a release notes update needed for this change?:
Yes, that will be necessary.

Additional documentation:
None so far. Will include updated docs later.

@poikilotherm added the labels Type: Feature (a feature request), Feature: Installer, Feature: Installation Guide, User Role: Sysadmin (installs, upgrades, and configures the system, connects via ssh), Component: Containers (anything related to cloudy Dataverse, shipped in containers) on Dec 21, 2021
@poikilotherm self-assigned this on Dec 21, 2021
@donsizemore (Contributor) commented

I take it someone saw my morning's failed attempt at a drop-in replacement of Solr-8.11.1?

@poikilotherm (Contributor, Author) commented Dec 23, 2021

> I take it someone saw my morning's failed attempt at a drop-in replacement of Solr-8.11.1?

Actually @donsizemore: no, I didn't. This one is completely selfish: I want to create an up-to-date solrconfig.xml for container images with arbitrary Solr versions.

@poikilotherm (Contributor, Author) commented Dec 23, 2021

@donsizemore WDYT: should the dvinstall.zip ship with an already adapted solrconfig.xml? The Makefile could download the Solr release and patch the file before packaging it.

Or should we even ship a complete Solr configset? (Doing this in container images right now) Installs could untar that one and create the core from it...

What would you prefer for gdcc/dataverse-ansible?

@coveralls commented Feb 28, 2022

Coverage remained the same at 18.969% when pulling ff0e91c on poikilotherm:7662-solrconfig into e713194 on IQSS:develop.

@poikilotherm force-pushed the 7662-solrconfig branch 2 times, most recently from a8c783b to 490ea3c on March 11, 2022 23:15
Simple Makefile to download Solr, extract the default configset
and create a Dataverse flavored one.

- Uses Maven to find the Solr distribution version to download.
- Uses xsltproc to apply our XSLT transformations to solrconfig.xml
- Replaces the managed-schema with the static one we provide
- Zips the configset to make it distributable as artifact
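
The XSLT step above can be sketched roughly as follows. This is only an illustration: it uses the JDK's built-in javax.xml.transform instead of xsltproc, and both the stylesheet and the shortened sample config are made up for this example (the cache class swap is a hypothetical rule, not the PR's actual stylesheet). The idea is an identity template that copies the vanilla file unchanged, plus one small override per declared change.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SolrConfigPatchSketch {

    // Hypothetical stylesheet, not taken from the PR: copy the whole document
    // and only rewrite class="solr.FastLRUCache" attributes.
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">",
        "  <xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>",
        "  <!-- Identity template: copy every node and attribute unchanged. -->",
        "  <xsl:template match=\"@*|node()\">",
        "    <xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>",
        "  </xsl:template>",
        "  <!-- The declared change: swap one cache implementation for another. -->",
        "  <xsl:template match=\"@class[.='solr.FastLRUCache']\">",
        "    <xsl:attribute name=\"class\">solr.CaffeineCache</xsl:attribute>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Stand-in for a vanilla solrconfig.xml (heavily shortened, made up for this sketch).
    static final String SAMPLE =
        "<config><query><filterCache class=\"solr.FastLRUCache\" size=\"512\"/></query></config>";

    // Applies the given stylesheet to the given XML and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE, XSLT));
    }
}
```

Because every change lives in its own template, the history of the stylesheet shows why and when a tweak was added, independent of the Solr version the vanilla file came from.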

As the SolrJ dependency is going to be used in more than just the WAR module,
it is moved to the parent POM.

At the same time, the dependency is versioned with a property "solr.version",
which makes it easy to switch to a different Solr version.

The property was present before, but is now updated to 8.11.1 to
reflect the latest upstream changes.

The Sphinx guide conf.py is also updated to retrieve the property from
the parent POM instead of the WAR POM.

The TSV parser needs to verify whether a given line is a header line
that matches the spec. To avoid duplicated validation code,
this validator can be used with an arbitrary list of strings
(so it can be reused for blocks, fields and vocabularies).

As we will need to validate URLs in certain fields, this validator
also offers a helper function to create predicates checking for valid URLs.
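
A minimal sketch of what such a validator could look like, assuming tab-separated header lines and a strict column-by-column comparison. The class name, method names and the column list used in main are hypothetical and not taken from the PR; the actual Validator class may differ.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

final class HeaderValidatorSketch {

    // True if the tab-separated line has exactly the expected columns, in order.
    // The same method works for block, field and vocabulary headers, because
    // the expected columns are passed in as an arbitrary list of strings.
    static boolean isHeaderLine(String line, List<String> expectedColumns) {
        if (line == null) {
            return false;
        }
        List<String> actual = new ArrayList<>();
        for (String column : line.split("\t", -1)) {
            actual.add(column.trim());
        }
        return actual.equals(expectedColumns);
    }

    // Helper creating a predicate that accepts only syntactically valid URLs
    // with one of the allowed (lowercase) schemes and a host part.
    static Predicate<String> validUrl(String... allowedSchemes) {
        List<String> schemes = Arrays.asList(allowedSchemes);
        return value -> {
            if (value == null) {
                return false;
            }
            try {
                URI uri = new URI(value);
                return uri.getScheme() != null
                    && schemes.contains(uri.getScheme().toLowerCase(Locale.ROOT))
                    && uri.getHost() != null;
            } catch (URISyntaxException e) {
                return false;
            }
        };
    }

    public static void main(String[] args) {
        // Assumed header spec for illustration only.
        List<String> spec = Arrays.asList("#metadataBlock", "name", "displayName");
        System.out.println(isHeaderLine("#metadataBlock\tname\tdisplayName", spec));
        System.out.println(validUrl("http", "https").test("https://dataverse.org/schema/"));
    }
}
```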

The Block POJO now contains the header specification (it uses the Validator class
to perform the validation) and allows parsing a line into a List.
A later relaxation of the spec, allowing for reordering of fields etc., is possible,
while the calling code of the parser can reuse the found header definition.

A builder pattern is used to parse and validate the actual definition.
As the block may only be used once the definition, all fields and all vocabularies
have been parsed (if there is an error within the TSV, the parsing has to fail!),
the builder pattern is a natural match for that.
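
To illustrate why a builder fits here, a rough sketch follows. It assumes data lines are tab-separated with the name in the second column; the method names (parseBlockLine, parseFieldLine) and the error handling are hypothetical and much simpler than a real parser. The builder collects everything, refuses to build when any line was invalid, and hands out a Block that callers can no longer modify.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BlockBuilderSketch {

    // Sealed result: no setters, and the field list is unmodifiable.
    static final class Block {
        private final String name;
        private final List<String> fields;

        private Block(String name, List<String> fields) {
            this.name = name;
            this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
        }

        String name() { return name; }
        List<String> fields() { return fields; }
    }

    static final class BlockBuilder {
        private String name;
        private final List<String> fields = new ArrayList<>();
        private final List<String> errors = new ArrayList<>();

        // Returns the trimmed column at the given index, or "" if it is missing.
        private static String column(String line, int index) {
            String[] columns = line.split("\t", -1);
            return index < columns.length ? columns[index].trim() : "";
        }

        BlockBuilder parseBlockLine(String line) {
            String candidate = column(line, 1);
            if (candidate.isEmpty()) {
                errors.add("Block definition has no name: '" + line + "'");
            } else {
                name = candidate;
            }
            return this;
        }

        BlockBuilder parseFieldLine(String line) {
            String candidate = column(line, 1);
            if (candidate.isEmpty()) {
                errors.add("Field definition has no name: '" + line + "'");
            } else {
                fields.add(candidate);
            }
            return this;
        }

        // Fails if anything went wrong while parsing; otherwise seals the block.
        Block build() {
            if (name == null || !errors.isEmpty()) {
                throw new IllegalStateException("Invalid block definition: " + errors);
            }
            return new Block(name, fields);
        }
    }

    public static void main(String[] args) {
        Block block = new BlockBuilder()
            .parseBlockLine("\tcitation\t\tCitation Metadata")
            .parseFieldLine("\ttitle\tTitle")
            .build();
        System.out.println(block.name() + " " + block.fields());
    }
}
```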

This simple class will make the parser somewhat configurable,
so future changes and command line options can be integrated more
easily.

Instead of defining a static trigger, we want to be able to configure
the trigger sign. Because of this, we use the keyword only and move the
trigger handling into the ParsingState (which analyses the line for
state transitions anyway).
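
One way this could look, as a sketch only: each state knows just its keyword, and the trigger sign comes from a configuration object at transition time. The Configuration class, the keyword values and the transition rule (any recognised section header switches to that section, with no ordering enforced) are assumptions made for this example, not the PR's code.

```java
final class ParsingStateSketch {

    // Hypothetical configuration holder; only the trigger sign is configurable here.
    static final class Configuration {
        private final String trigger;

        Configuration(String trigger) {
            if (trigger == null || trigger.isEmpty()) {
                throw new IllegalArgumentException("Trigger must not be empty");
            }
            this.trigger = trigger;
        }

        String trigger() { return trigger; }
    }

    enum ParsingState {
        INIT(null),
        BLOCK("metadataBlock"),
        FIELDS("datasetField"),
        VOCABULARIES("controlledVocabulary");

        // Keyword only; the trigger sign is prepended when analysing a line.
        private final String keyword;

        ParsingState(String keyword) {
            this.keyword = keyword;
        }

        // Returns the state whose header (trigger + keyword) starts the line,
        // or the current state if the line is not a section header.
        ParsingState transition(String line, Configuration config) {
            if (line == null) {
                return this;
            }
            for (ParsingState candidate : values()) {
                if (candidate.keyword != null
                        && line.startsWith(config.trigger() + candidate.keyword)) {
                    return candidate;
                }
            }
            return this;
        }
    }

    public static void main(String[] args) {
        Configuration config = new Configuration("%");
        System.out.println(ParsingState.INIT.transition("%metadataBlock\tname", config));
    }
}
```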

- Implement first details of the Block POJO
- Change parsing with BlockBuilder to use an internal state with a non-exposed Block object
- The BlockBuilder may manipulate the Block, but after calling build() the calling code
  has no way to edit the POJO (proper encapsulation and sealing)

Add field types and make them usable as predicates for fields.
Add test.
Predicates are not null safe - need to make validate() check for null
Includes all the predicates according to spec and tests for them.
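
The null-safety concern can be illustrated with a small sketch in which the field types themselves implement Predicate and guard against null before delegating. The three types and their rules are a hypothetical subset chosen for this example, not the spec's full list, and the PR's own validate() method is not shown here.

```java
import java.util.function.Predicate;

final class FieldTypeSketch {

    enum FieldType implements Predicate<String> {
        TEXT(value -> !value.isEmpty()),
        INT(value -> value.matches("-?\\d+")),
        FLOAT(value -> value.matches("-?\\d+(\\.\\d+)?"));

        // The raw check is not null safe on its own.
        private final Predicate<String> check;

        FieldType(Predicate<String> check) {
            this.check = check;
        }

        // Null-safe entry point: null is rejected before the raw check runs.
        @Override
        public boolean test(String value) {
            return value != null && check.test(value);
        }
    }

    public static void main(String[] args) {
        System.out.println(FieldType.INT.test("42"));
        System.out.println(FieldType.INT.test(null));
    }
}
```

Putting the null check in one place means the individual predicates can stay simple and still be passed anywhere a Predicate is expected.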
@pdurbin (Member) commented Oct 1, 2022

@poikilotherm does this PR close the following issue? (And if so, can we please add the "closes" syntax to the description?)

@mreekie added the bk2211 label Nov 1, 2022
@mreekie removed the bk2211 label Jan 11, 2023

Development

Successfully merging this pull request may close these issues.

  • Solr 8.8 upgrade - remaining issues with solrconfig.xml
  • solrconfig.xml is using deprecated cache implementation

5 participants