9434 app container for dev#9439

Merged
kcondon merged 24 commits into develop from 9434-app-container
Mar 20, 2023

poikilotherm (Contributor) commented Mar 13, 2023

What this PR does / why we need it:

This pull request adds the ability to build a Dataverse application image and run it alongside all necessary dependencies.

Which issue(s) this PR closes:

Special notes for your reviewer:
There are some TODOs:

  • Get Dataverse communicating with maildev properly
  • (Optional) Maybe add some more resource limitations. Maybe tweak Dataverse container to fit into less than 2 GiB of RAM
  • Add docs (can be copied from 9415 containers #9424 and extend from there). DONE.

Suggestions on how to test this:
Clone the repository or switch to this branch. Make sure Docker is running. Execute mvn -Pct clean package docker:run, then, once startup is done, bootstrap with ./scripts/dev/docker-final-setup.sh and go to localhost:8080.
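
The "bootstrap when startup is done" step can be scripted. What follows is a minimal sketch, assuming a POSIX shell; the helper name wait_until is hypothetical and not part of the repository. It retries any command a fixed number of times and stops as soon as the command succeeds.

```shell
# wait_until ATTEMPTS DELAY CMD...: retry CMD until it succeeds.
# Returns 0 on the first success, 1 if every attempt failed.
# (Hypothetical helper for illustration; not part of the Dataverse repo.)
wait_until() {
  attempts="$1"
  delay="$2"
  shift 2
  while [ "$attempts" -gt 0 ]; do
    # Run the probe command exactly as passed in.
    if "$@"; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Only sleep when another attempt will follow.
    if [ "$attempts" -gt 0 ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

One possible use, assuming the version endpoint is an acceptable readiness signal for the app container: wait_until 60 5 curl -sf http://localhost:8080/api/info/version && ./scripts/dev/docker-final-setup.sh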

Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Nope.

Is there a release notes update needed for this change?:
Yes, included.

Additional documentation:
None yet.

@poikilotherm poikilotherm added Feature: Installation Guide User Role: Sysadmin Installs, upgrades, and configures the system, connects via ssh Component: Containers Anything related to cloudy Dataverse, shipped in containers. D: DataverseInDocker Deliverable of running Dataverse within Docker labels Mar 13, 2023
@poikilotherm poikilotherm requested review from GPortas and pdurbin March 13, 2023 16:06
@poikilotherm poikilotherm self-assigned this Mar 13, 2023
@pdurbin pdurbin self-assigned this Mar 14, 2023
pdurbin (Member) commented Mar 14, 2023

@poikilotherm this is working great!

Thanks for the heads up about email. I tried using the "Support" link in the header but when I clicked "Send Message" I got this error:

dev_dataverse> Caused by: java.lang.NullPointerException
dev_dataverse> 	at edu.harvard.iq.dataverse.MailServiceBean.setContactDelegation(MailServiceBean.java:233)

Honestly, I can live without email working but it would be nice to see if we can fix this.

I can also live without the TODO above about limiting resources (for now). Future PR, I'd say. I marked it as optional.

I added docs and a release note. I tried to indicate that this is dev only.

Hopefully the last thing is the SMTP thing. Or, again, I'm fine with fixing later.

There is a TON of value in this PR. Thanks!!

pdurbin (Member) commented Mar 14, 2023

dev_dataverse> Caused by: java.lang.NullPointerException
dev_dataverse> at edu.harvard.iq.dataverse.MailServiceBean.setContactDelegation(MailServiceBean.java:233)

Here's a stacktrace of the MailDev/SMTP problem: stacktrace.txt

Apparently, fromAddress here is null:

String personal = fromAddress.getPersonal() != null

In the domain.xml, the from address is dataverse@localhost:

payara@dataverse:~$ grep mail-resource /opt/payara/appserver/glassfish/domains/domain1/config/domain.xml
<mail-resource auth="false" host="postfix" from="dataverse@localhost" user="dataversenotify" jndi-name="mail/notifyMailSession"></mail-resource>
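
To see at a glance which SMTP host and sender the app is configured with, the attributes can be pulled out of that line. This is a rough sketch, assuming the element sits on a single line as in the grep output above; xml_attr is a hypothetical helper, and a real XML parser would be more robust.

```shell
# xml_attr NAME: print the value of attribute NAME for each input line
# that contains it. Requires whitespace before NAME, so "name" does not
# match "jndi-name". (Hypothetical helper; line-based, not a real XML parser.)
xml_attr() {
  sed -n "s/.*[[:space:]]$1=\"\([^\"]*\)\".*/\1/p"
}

# Sample line copied from the domain.xml output quoted in this comment.
sample='<mail-resource auth="false" host="postfix" from="dataverse@localhost" user="dataversenotify" jndi-name="mail/notifyMailSession"></mail-resource>'

printf '%s\n' "$sample" | xml_attr host   # prints: postfix
printf '%s\n' "$sample" | xml_attr from   # prints: dataverse@localhost
```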

The from address is not defined as a database setting:

payara@dataverse:~$ curl -s http://localhost:8080/api/admin/settings | jq .
{
  "status": "OK",
  "data": {
    ":Authority": "10.5072",
    ":Shoulder": "FK2/",
    "BuiltinUsers.KEY": "burrito",
    ":DoiProvider": "FAKE",
    ":BlockedApiPolicy": "localhost-only",
    ":Protocol": "doi",
    ":AllowSignUp": "yes",
    ":SignUpUrl": "/dataverseuser.xhtml?editMode=CREATE",
    ":UploadMethods": "native/http"
  }
}

Setting it might fix it.

If I set :SystemEmail like this...

$ curl -X PUT -d 'dataverse@localhost' http://localhost:8080/api/admin/settings/:SystemEmail
{"status":"OK","data":{":SystemEmail":"dataverse@localhost"}}
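
A quick way to confirm whether a setting is present in the settings response, sketched under the assumption that a crude text match is good enough; has_setting is a hypothetical helper, not an existing script.

```shell
# has_setting NAME: succeed if the JSON on stdin contains NAME as an object key.
# NAME is treated as a basic regular expression, so this is only a rough check.
# (Hypothetical helper for illustration.)
has_setting() {
  grep -q "\"$1\"[[:space:]]*:"
}

# Responses shaped like the ones quoted in this comment.
before='{"status":"OK","data":{":Authority":"10.5072",":Protocol":"doi"}}'
after='{"status":"OK","data":{":SystemEmail":"dataverse@localhost"}}'

if printf '%s\n' "$after" | has_setting ":SystemEmail"; then
  echo ":SystemEmail is set"
fi
```

Against a running instance, the input would come from curl -s http://localhost:8080/api/admin/settings instead of the sample strings. Where jq is available, jq -e '.data | has(":SystemEmail")' is a stricter alternative.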

I get a new error:

dev_dataverse> [#|2023-03-14T11:51:01.758+0000|INFO|Payara 5.2022.4||_ThreadID=87;_ThreadName=http-thread-pool::http-listener-1(3);_TimeMillis=1678794661758;_LevelValue=800;|
dev_dataverse>   com.sun.mail.util.MailConnectException: Couldn't connect to host, port: postfix, 25; timeout -1;
dev_dataverse>   nested exception is:
dev_dataverse> 	java.net.ConnectException: Connection refused (Connection refused)
dev_dataverse> 	at com.sun.mail.smtp.SMTPTransport.openServer(SMTPTransport.java:2209)
dev_dataverse> 	at com.sun.mail.smtp.SMTPTransport.protocolConnect(SMTPTransport.java:722)
dev_dataverse> 	at javax.mail.Service.connect(Service.java:342)
dev_dataverse> 	at javax.mail.Service.connect(Service.java:222)
dev_dataverse> 	at javax.mail.Service.connect(Service.java:171)
dev_dataverse> 	at javax.mail.Transport.send0(Transport.java:230)
dev_dataverse> 	at javax.mail.Transport.send(Transport.java:100)
dev_dataverse> 	at edu.harvard.iq.dataverse.MailServiceBean.sendMail(MailServiceBean.java:217)
dev_dataverse> 	at edu.harvard.iq.dataverse.MailServiceBean.sendMail(MailServiceBean.java:181)

From my Mac, the SMTP server is running on port 25...

pdurbin@air dataverse % nc localhost 25
220 postfix ESMTP

... but the dataverse-1 (app container) can't reach it:

payara@dataverse:~$ nc postfix 25
payara@dataverse:~$ 

Why?

@GPortas GPortas self-assigned this Mar 14, 2023
GPortas (Contributor) commented Mar 14, 2023

@poikilotherm

I have done the following steps so far:

  1. Built the image through mvn -Pct clean package
  2. Executed the containers through docker-compose -f docker-compose-dev.yml up
  3. Executed ./scripts/dev/docker-final-setup.sh when startup is done

Received this error within the docker-compose logs while applying the third step:

dev_dataverse_1         | [#|2023-03-14T13:25:45.002+0000|WARNING|Payara 5.2022.4|javax.enterprise.ejb.container|_ThreadID=81;_ThreadName=http-thread-pool::http-listener-1(1);_TimeMillis=1678800345002;_LevelValue=900;|
dev_dataverse_1         |   javax.ejb.EJBException: getSingleResult() did not retrieve any entities.
dev_dataverse_1         | 	at com.sun.ejb.containers.EJBContainerTransactionManager.processSystemException(EJBContainerTransactionManager.java:723)
dev_dataverse_1         | 	at com.sun.ejb.containers.EJBContainerTransactionManager.completeNewTx(EJBContainerTransactionManager.java:652)
dev_dataverse_1         | 	at com.sun.ejb.containers.EJBContainerTransactionManager.postInvokeTx(EJBContainerTransactionManager.java:482)
dev_dataverse_1         | 	at com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4601)
dev_dataverse_1         | 	at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2134)
dev_dataverse_1         | 	at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2104)

. . .

Dataverse runs on 8080, but when trying to Log In with dataverseAdmin / admin credentials, the authentication fails. Am I missing something?

poikilotherm (Contributor Author) commented

@GPortas please note that the login is "dataverseAdmin:admin1". That "1" is easy to forget.

Also, the "getSingleResult()" thing is very annoying but harmless. For some reason no one ever fixed it, although it would be low-hanging fruit. The same goes for all the SQL exceptions, which are triggered by us generating the DDL every time we start the application.

All containers have static names now, which might be a problem down the road.
But at least consistent now.
Now Maildev will listen on port 25 as necessary to let Dataverse reach it.
Also adapted the storage path to be easier addressable with a volume.
Switched to using a tmpfs, as otherwise another initializer would be necessary
to switch the volume permissions (like with Solr).

Added configuring the SystemEmail to match the default value from init_2_config_payara.sh.
Otherwise no mail gets sent.
pdurbin (Member) commented Mar 14, 2023

javax.ejb.EJBException: getSingleResult() did not retrieve any entities.

This happens with every installation of Dataverse, so it has nothing to do with Docker. It's caused by this code:

/**
 * @todo Move this to
 * AuthenticationServiceBean.createAuthenticatedUser
 */
boolean rootDataversePresent = false;
try {
    Dataverse rootDataverse = dataverseSvc.findRootDataverse();
    if (rootDataverse != null) {
        rootDataversePresent = true;
    }
} catch (Exception e) {
    logger.info("The root dataverse is not present. Don't send a notification to dataverseAdmin.");
}
if (rootDataversePresent && sendEmailNotification) {
    userNotificationSvc.sendNotification(au,
            new Timestamp(new Date().getTime()),
            UserNotification.Type.CREATEACC, null);
}

Which is triggered by creating the dataverseAdmin user:

echo "Setting up the admin user (and as superuser)"
adminResp=$(curl -s -H "Content-type:application/json" -X POST -d @data/user-admin.json "$SERVER/builtin-users?password=$DV_SU_PASSWORD&key=burrito")
echo $adminResp
curl -X POST "$SERVER/admin/superuser/dataverseAdmin"

Then the dataverseAdmin user creates the root collection and the error stops happening for future users. Let's please treat this as out of scope. Again, it's an old bug.

be sent.


Note that the script ``init_2_configure.sh`` will apply a few very important defaults to enable quick usage
Contributor

I'm not sure whether this section, with its implementation details, contains information that someone using the image needs.

Contributor Author

Well this becomes very important if you configure the app to use some S3 storage... By switching the default storage to something other than "local", the local file storage will vanish.

Member

Right now the target user is a developer, someone hacking on the Dataverse software itself. I think these developers have enough information, right? Can we pick up this thread in a future pull request when there is a different, non-developer audience?

Development Usage
=================

Please note! This Docker setup is not for production!
Contributor

If it cannot be used in production, please provide some insight into why that is not a good idea.

Contributor Author

I disagree. This section has a heading "Development Usage" and is targeting someone hacking on Dataverse and who wants to testrun changes in containers rather than a local installation of all components.

Member

Please note! This Docker setup is not for production! 😄

We could add, "You're in the dev-usage doc!" 😄

We are deliberately walking before we run. 🚶 🏃

Intro
-----

Assuming you have `Docker <https://docs.docker.com/engine/install/>`_, `Docker Desktop <https://www.docker.com/products/docker-desktop/>`_,
Contributor

Could we just link to the OCI container spec (https://opencontainers.org/) without endorsing one implementation?

poikilotherm (Contributor Author), Mar 16, 2023

No we can't. We can add Rancher Desktop if you want. Other than that, the Docker Maven Plugin requires using some sort of Moby daemon (Docker is a supported, paid distribution of Moby). You will not be able to build an image with podman or JIB etc with the current configuration.

It also is important to keep in mind that a future vision is to use these images with Testcontainers for more extensive tests than available now. This will also require some kind of Docker/Moby.

johannes-darms (Contributor), Mar 16, 2023

It would be nice to have this information in the documentation.

Member

@johannes-darms sounds fine but this PR is in "Ready for QA" and it would be a shame to lose momentum. If you can make a PR into this PR, I'm happy to look and hopefully merge.

-------------------

See also the :doc:`/container/index`.
The :doc:`testing` section mentions using docker-aio for integration tests. We do not plan to keep this project alive.
Contributor

A deprecation plan or end-of-life notice would be nice.

Contributor Author

I'm not sure I get you here. Do you want such a notice in here, right now? Or is this something you'd like to see in the future?

johannes-darms (Contributor), Mar 16, 2023

For the sake of getting the feature merged, I'd rather say in the future. However, saying that there are no plans to keep docker-aio alive without giving an EOL date isn't that nice to users of the package.

pdurbin (Member), Mar 16, 2023

Oh! Is anyone (except me, occasionally) using docker-aio?!?

GPortas (Contributor), Mar 16, 2023

I have used docker-aio occasionally, like you @pdurbin (following the testing documentation, as it recommends using it). I agree that we should add the deprecation warning to every doc page that mentions docker-aio, to make the documentation more consistent.


``mvn -Pct clean package``

Now, start all the containers with a single command:
Contributor

Mention that this only starts the dataverse container, and that up-and-running dependencies (Postgres, SMTP, and Solr) are needed.

Contributor Author

Huh? Did you test this? Because this should start all the dependencies, too, as configured in the compose file. Let me know if this doesn't work for you.

Member

Yes, works on my machine! Containers go! 🤖 🤖 🤖 🤖

Contributor

it also works for me!

To avoid having to install service dependencies like PostgreSQL or Solr directly on your localhost, there is the alternative of using the ``docker-compose-dev.yml`` file available in the repository root. For this option you need to have Docker and Docker Compose installed on your machine.

The ``docker-compose-dev.yml`` file runs the necessary service dependencies to support a development Dataverse installation running on localhost. In addition to PostgreSQL and Solr, it also runs a SMTP server.
The ``docker-compose-dev.yml`` can be configured to only run the service dependencies necessary to support a Dataverse installation running directly on localhost. In addition to PostgreSQL and Solr, it also runs a SMTP server.
Contributor

How can I configure that?

Contributor Author

This is described below. Could you make a comment with a suggestion (Ctrl+g) on how to phrase this better? Thx!

johannes-darms (Contributor) commented

The approach is also missing R and Rserve. They are listed as a prerequisite within the documentation but it looks like they are not really needed. If R/RServe is optional the documentation must be updated, otherwise another container wrapping RServe is needed.

https://guides.dataverse.org/en/latest/installation/prerequisites.html?highlight=rserve

poikilotherm (Contributor Author) commented

Thanks @johannes-darms for the extensive review of the PR! I added comments to your comments.

The approach is also missing R and Rserve. They are listed as a prerequisite within the documentation but it looks like they are not really needed. If R/RServe is optional the documentation must be updated, otherwise another container wrapping RServe is needed.

https://guides.dataverse.org/en/latest/installation/prerequisites.html?highlight=rserve

Two things here: 1) Rserve is not a necessary component for developing Dataverse, which is what this is about. 2) Technically, it is not even a requirement for an installation if you don't do any ingest. But yes, that should be stated in the docs. Would you be able to create an issue and/or a pull request?

pdurbin (Member) commented Mar 16, 2023

not a necessary component for developing Dataverse, which is what this is about

Good point. @poikilotherm how do you feel about a new title for this PR?

"app container" -> "app container for dev"

@johannes-darms thanks for all the feedback! I also left comments on your comments. Much appreciated!!

@poikilotherm poikilotherm changed the title 9434 app container 9434 app container for dev Mar 16, 2023
GPortas (Contributor) commented Mar 17, 2023

@poikilotherm

Although the asadmin commands are enough to manage the Payara server, sometimes it is very useful to access the Payara administration console for better visualization.

By default, Payara exposes the administration console on port 4848. It would be interesting to map this port to outside the container within the docker-compose file.

I've tested this with this branch, by adding - "4848:4848" to the dev_dataverse mapped ports.

The problem is that the admin console now expects user and password credentials to login (As far as I know, these credentials are not documented). With the normal installation (without containers), we don't need credentials for accessing the Payara admin console. Is that login requirement a configuration that comes from the base image? It would be great to make admin credentials configurable through container environment variables.

poikilotherm (Contributor Author) commented Mar 17, 2023

@GPortas yes, this comes from the base image. We could add a section to the base image documentation, similar to upstream Payara as this works exactly the same in our custom base image.

In fact, the credentials are available via the (currently undocumented) env vars ADMIN_USER and ADMIN_PASSWORD, but changing these env vars doesn't update the running domain.

@GPortas probably this is beyond scope for this PR. Would you add a note to https://docs.google.com/document/d/15-sqdKzpCgQBtaPaAGYMaqcxsQXmt-2CwAVNPmmAPRY so we don't forget?

poikilotherm (Contributor Author) commented

@pdurbin @johannes-darms I added a few pieces of information about a future sunset of docker-aio in 7bcadab. Let me know if that helps. Feel free to hack on it.

pdurbin (Member) commented Mar 17, 2023

a future sunset of docker-aio

Related:

@kcondon kcondon assigned kcondon and unassigned kcondon Mar 20, 2023
kcondon (Contributor) commented Mar 20, 2023

Issues so far:

  1. Running into build failure due to test failures. Am I doing something wrong?
    app_contain_bld_fail.txt
    [Kevin] I think it was because I didn't do switch -c branch, gets further now, see issue 2
  2. Mvn script doesn't complete, pauses, shows errors. Ran set up script and localhost:8080 shows only Payara.
    mvn_err_pause.txt
    final_setup_run.txt

poikilotherm (Contributor Author) commented Mar 20, 2023

Hi @kcondon, thx for trying this. The log says that the database connection could not be established. It looks like you already had the local volumes from before (most likely #9414 / #9417), so there might be a different database already in place. Could you delete the whole ./docker-dev-volumes folder and try again? Thx!
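
The cleanup step can be wrapped in a small guard so that a mistyped path does no harm. This is a sketch only; reset_dev_volumes is a hypothetical helper, and the default path simply mirrors the folder named above.

```shell
# reset_dev_volumes [DIR]: remove the local volume folder so the next start
# begins with a fresh database. Defaults to ./docker-dev-volumes.
# (Hypothetical helper for illustration; all data in DIR is lost.)
reset_dev_volumes() {
  dir="${1:-./docker-dev-volumes}"
  case "$dir" in
    /)
      # Never let a typo wipe the filesystem root.
      echo "refusing to remove '$dir'" >&2
      return 1
      ;;
  esac
  if [ -d "$dir" ]; then
    rm -rf -- "$dir"
  fi
}
```

Stop the containers before running it. Files written by the containers may belong to a different user, in which case the removal may need elevated permissions; that depends on the local Docker setup.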

Without setting the "env var" for docker compose within the
Maven POM, the value will not be set, leading to a broken
DB connection. The DMP does not read the values from .env.
@kcondon kcondon merged commit a9a8311 into develop Mar 20, 2023
@kcondon kcondon deleted the 9434-app-container branch March 20, 2023 19:17
poikilotherm (Contributor Author) commented

Woo-hoo thanks for merging @kcondon ! Much appreciated, as always!!

@pdurbin pdurbin added this to the 5.14 milestone May 10, 2023
@cmbz cmbz added FY26 Sprint 14 FY26 Sprint 14 (2025-12-31 - 2026-01-14) and removed FY26 Sprint 14 FY26 Sprint 14 (2025-12-31 - 2026-01-14) labels Jan 14, 2026
Labels

Component: Containers Anything related to cloudy Dataverse, shipped in containers. D: DataverseInDocker Deliverable of running Dataverse within Docker Feature: Installation Guide Size: 10 A percentage of a sprint. 7 hours. User Role: Sysadmin Installs, upgrades, and configures the system, connects via ssh

Development

Successfully merging this pull request may close these issues.

Build locally a Dataverse app image for dev

6 participants