Skip to content

Refuse startup with no server admin set up#2389

Merged
janl merged 8 commits intomasterfrom
feat/disable-launch-without-admin-password
Jan 4, 2020
Merged

Refuse startup with no server admin set up#2389
janl merged 8 commits intomasterfrom
feat/disable-launch-without-admin-password

Conversation

@janl
Copy link
Copy Markdown
Member

@janl janl commented Jan 3, 2020

In the context of #2191, we wanted to forbid starting CouchDB without an admin user. This patch does this.

Please bike shed on the error message and whatnot. I’ve picked a format that is different from all other error messages so that this particular thing very visible before the boot error garbage is thrown. If somebody knows how make this even nicer, do let me know.

Here is an example of how CouchDB exits without a server admin configured:

[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
[info] 2020-01-03T18:25:20.832839Z node1@127.0.0.1 <0.281.0> -------- Preflight check: Asserting Admin Account

[info] 2020-01-03T18:25:20.835798Z node1@127.0.0.1 <0.281.0> -------- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  No Admin Account Found, aborting startup.                  
  Please configure an admin account in your local.ini file.  
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

{"init terminating in do_boot",{{error,{bad_return,{{couch_app,start,[normal,[]]},admin_account_required}}},[{boot_node,start_app,3,[{file,"dev/boot_node.erl"},{line,146}]},{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},{boot_node,start_app,3,[{file,"dev/boot_node.erl"},{line,136}]},{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},{boot_node,start_app,3,[{file,"dev/boot_node.erl"},{line,136}]},{lists,foldl,3,[{file,"lists.erl"},{line,1263}]},{init,start_em,1,[]},{init,do_boot,3,[]}]}}
init terminating in do_boot ({{error,{bad_return,{{_},admin_account_required}}},[{boot_node,start_app,3,[{_},{_}]},{lists,foldl,3,[{_},{_}]},{boot_node,start_app,3,[{_},{_}]},{lists,foldl,3,[{_},{_}]}

Crash dump is being written to: erl_crash.dump...done
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed

Copy link
Copy Markdown
Member

@rnewson rnewson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@janl
Copy link
Copy Markdown
Member Author

janl commented Jan 3, 2020

ah needs some work to run with the test suite.

we probably need an override. How about setting an environment variable in make eunit, something like COUCH_TEST_ADMIN_PARTY_OVERRIDE that, if set, disables the mechanisms that refuses to start without an admin pass?

@wohali
Copy link
Copy Markdown
Member

wohali commented Jan 3, 2020

@janl Had this dance with @rnewson recently who tried to special case a setting in the .ini file. He was able to work around it using ./rel/files/eunit.ini for eunit, and modifying dev/run for the latter.

We already have a command line way to pass a static admin:password to dev/run so perhaps just pass that in and bake admin:password as the auth into the tests? I bet we can do it in one place for each of Elixir and JS, except maybe for any admin level tests.

Personally I'd rather not have an escape hatch on this one, otherwise people will find it and abuse it, just as they did when we rushed 2.0 out the door and left 5986 open.

@janl
Copy link
Copy Markdown
Member Author

janl commented Jan 3, 2020

@wohali the problem is that we have tests that rely on setting/changing admin-party, so it’s not that we could just set a admin/pass, we’d have to rework many gnarly tests which might exceed our time budget for this.

I hear you on the escape hatch, my idea was make this a COUCH_TEST prefix. One additional thing we could do is add a little process that logs an error message once a minute or hour if that variable is set and no admin is configured, just to make sure folks get the message.

@janl
Copy link
Copy Markdown
Member Author

janl commented Jan 3, 2020

One thing we’d not do is document this escape hatch, which sets it apart from 5986 which we didn’t leave open by accident, but advertised its use for administrative purposes.

Copy link
Copy Markdown
Member

@wohali wohali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment.

I was pondering a sasl-type log here (or the new logger that's replaced it) because I believe Adam was working on removing all multi-line logging in #1373 , but this is a special case. I think a multi-line error in this case is reasonable.

I wonder about perhaps not putting in the timer:sleep. systemd's DefaultStartLimitIntervalSec is 10s and DefaultStartLimitBurst is 5s, meaning systemd will only stop trying to restart a process if a process has been attempted to start more than 5 times in 10 seconds and failed each time. We probably want to allow it to mark couchdb.service as failed in the case of no configured admin user.

For Windows, we by default do not restart on failure. runit, of course, will try forever. And I believe your macOS installer doesn't setup as a service by default, but if it does, I'm not sure the default launchd behaviour.

@wohali
Copy link
Copy Markdown
Member

wohali commented Jan 3, 2020

@janl ok, if you don't want to do the heavy lifting around fixing all the tests as a blocker, I understand. We at least have an easy way to get something into the .ini file, that might be better than an env var...unless you want to do something sneaky like put it in the app context (though IIRC that requires a recompile, doesn't it?)

Good idea on periodic logging. How about every 5 minutes? :) We should really be noisy about this.

Comment thread Makefile Outdated
@wohali
Copy link
Copy Markdown
Member

wohali commented Jan 3, 2020

FYI Jan and I are having a sidebar on IRC; he's reducing the sleep to 500ms to deal with some couch_log latency problems.

When I get back from lunch, I will do a test package of this PR and see if systemd correctly throttles restarting when no admin is configured.

includes an admin party assert escape hatch for tests

adds a log message every 5 minutes, if escape hatch is enabled.

should play nice with systemd restart policies
@janl janl force-pushed the feat/disable-launch-without-admin-password branch from 314e566 to e8c2e7a Compare January 3, 2020 19:47
@janl
Copy link
Copy Markdown
Member Author

janl commented Jan 3, 2020

squashed it all into one commit, includes the reduced timeout, escape hatch and annoyance reporter.

@janl janl requested a review from wohali January 3, 2020 19:48
@janl
Copy link
Copy Markdown
Member Author

janl commented Jan 3, 2020

and the Makefile fix

@wohali
Copy link
Copy Markdown
Member

wohali commented Jan 3, 2020

Tests still failing, will need to look more closely

@wohali
Copy link
Copy Markdown
Member

wohali commented Jan 4, 2020

Aha, of course, my advice was wrong. I am working on the fix now.

@wohali
Copy link
Copy Markdown
Member

wohali commented Jan 4, 2020

This should have fixed it, tests are passing locally.

The unrelated Python change was to make make python-black succeed, the test was actually broken by the addition of $(TEST_OPTS) and was missing the formatting problem in the new file build-aux/show-test-results.py. (fyi @davisp )

Setting to approved, I'll merge tomorrow if no one else notices it first.

@wohali
Copy link
Copy Markdown
Member

wohali commented Jan 4, 2020

@janl tag. Forgot to update Makefile.win, would you please? thanks :) zzzz.

@janl
Copy link
Copy Markdown
Member Author

janl commented Jan 4, 2020

ported to Makefile.win as best I could, and simplified the annoyance process to be only spawned when the env var is set, instead of checking the env var on each iteration, which won’t ever change.

@AndyA
Copy link
Copy Markdown

AndyA commented Mar 4, 2020

This is a breaking change for many users. I understand the implications of Admin Party so this change just breaks a bunch of working setups.

Wasn't the fact that it broke your own tests a bit of a clue? Why isn't there an easy way to disable the check?

Would you accept a PR to make it possible to disable this check? I've tried setting COUCHDB_TEST_ADMIN_PARTY_OVERRIDE=1 but that didn't seem to work for me. Will investigate further but using options intended for testing doesn't seem like a good idea long term.

Thanks.

@wohali
Copy link
Copy Markdown
Member

wohali commented Mar 4, 2020

@AndyA That's why this became 3.0, not 2.4. We introduced many breaking changes in 3.0, all of which are documented at http://docs.couchdb.org/en/stable/whatsnew/3.0.html#deprecated-feature-removal .

The lack of solid out-of-the-box security in CouchDB has been a concern for a very long time, expressed by many, and amplified by some high profile leaks of CouchDB-hosted data on the web. CouchDB 0.x was written in the spirit of an open web, open data, but as people started to rely on CouchDB for more secure needs, pressure increased on the project to move in this direction.

We wouldn't accept the proposed patch, sorry.

@AndyA
Copy link
Copy Markdown

AndyA commented Mar 4, 2020

Thanks @wohali. I understand the desire for this to be the default but don't see the downside to being able to disable it (with suitably dire warnings in the documentation).

As far as I can see the other breaking changes are mainly a concern for server admins. This potentially has far wider effects. We have many scripts that assume password free admin access all of which have to be fixed before we can move to 3.0.

we_understand_the_risks_and_we_still_want_to_party = "I PROMISE NOT TO SUE"

@janl
Copy link
Copy Markdown
Member Author

janl commented Mar 4, 2020

You're welcome to stay on 2.x or maintain a custom patch. The project is not going to change course here.

@AndyA
Copy link
Copy Markdown

AndyA commented Mar 4, 2020

It seems that anyone who's installing from https://apache.bintray.com/couchdb-deb will have automatically received the breaking change.

Is there a repo address that's pinned to 2.x?

@wohali
Copy link
Copy Markdown
Member

wohali commented Mar 4, 2020

@AndyA Standard deploy practice should be:

apt install couchdb=2.3.1

If you're not pinning your production environment to specific versions, you're going to get unexpected surprises 😉

@AndyA
Copy link
Copy Markdown

AndyA commented Mar 4, 2020

Thanks.

This isn't production and I'm aware of how to pin a particular version :)

@wohali
Copy link
Copy Markdown
Member

wohali commented Mar 4, 2020

@AndyA Sorry about that. We're appreciative of your support of CouchDB over the years :)

We're not planning to separate out into different release streams for 3.x, but we probably will for CouchDB 4, since it's a complete replacement of the storage and clustering code.

@AndyA
Copy link
Copy Markdown

AndyA commented Mar 4, 2020

Thanks @wohali :)

zetavg added a commit to zetavg/couchdb-action that referenced this pull request Nov 18, 2023
zetavg added a commit to zetavg/couchdb-action that referenced this pull request Nov 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants