Conversation
```erlang
        _ -> ok
    end
end,
case test_util:wait(WaitFun, 5000) of
```
The default timeout for an individual test is 5 seconds.
Running the test takes time, and the wait function can take up to 5 seconds.
Adding them together would exceed 5 seconds and cause a timeout.
Shall we use a smaller number, like 4500?
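If we went with that, a minimal sketch might look like the following (illustrative only: the `WaitFun` body here is made up, and I'm assuming `test_util:wait/2` keeps retrying while the fun returns `wait` and returns the atom `timeout` when it gives up):

```erlang
%% Sketch: keep the wait budget under the 5 s eunit default so the
%% rest of the test body still has time to run.
WaitFun = fun() ->
    case couch_server:get_js_engine() of
        <<"quickjs">> -> ok;
        _ -> wait
    end
end,
case test_util:wait(WaitFun, 4500) of
    timeout -> erlang:error(wait_timed_out);
    ok -> ok
end
```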
Works as expected.
(node1@127.0.0.1)1> couch_server:get_js_engine().
<<"quickjs">>
(node1@127.0.0.1)2> config:set("couchdb", "js_engine", "spidermonkey").
ok
(node1@127.0.0.1)3> couch_server:get_js_engine().
<<"spidermonkey">>
(node1@127.0.0.1)4> config:set("couchdb", "js_engine", "quickjs").
ok
(node1@127.0.0.1)5> couch_server:get_js_engine().
<<"quickjs">>

Just curious what happens if I make a typo... Is it possible to avoid this error?
(node1@127.0.0.1)6> config:set("couchdb", "js_engine", "spider_monkey").
ok
*** ERROR: Shell process terminated! (^G to start new job) ***
$ curl $DB
curl: (7) Failed to connect to 127.0.0.1 port 15984 after 0 ms: Couldn't connect to server
Good idea @jiahuili430 to lower the time a bit. It will blow up in both cases with an exception, but the one in test_util will have a more helpful message.
Currently it's somewhat intentional that it would crash on function clause on an invalid engine setting. But perhaps it's better to emit an error in the logs and then use the default? But I think that's for a separate PR.
Made a PR (#5329) to handle it better: just stick with the default value.
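A hedged sketch of what that fallback could look like (the function name, default macro, and log message are assumptions for illustration, not the exact code from #5329):

```erlang
%% Sketch: validate the configured engine and fall back to the default
%% instead of crashing on a function clause for an unknown value.
-define(DEFAULT_JS_ENGINE, "spidermonkey").

get_js_engine() ->
    case config:get("couchdb", "js_engine", ?DEFAULT_JS_ENGINE) of
        "spidermonkey" -> <<"spidermonkey">>;
        "quickjs" -> <<"quickjs">>;
        Invalid ->
            couch_log:error("Invalid js_engine setting ~p, using default", [Invalid]),
            list_to_binary(?DEFAULT_JS_ENGINE)
    end.
```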
```erlang
?assertNotEqual(OldVal, get_proc_manager_default_js()),
NewVal = get_proc_manager_default_js(),

% Toggle back to the original default (test config:delete/3)
```
```diff
-% Toggle back to the original default (test config:delete/3)
+% Toggle back to the original default
```
Not sure if you want to leave that in?
Yeah, that's a note that we're testing the case of a user running config:delete/3 and returning to the set of defaults.
I'm leaving this here by way of an official record for reviewing this. In the old times, we loaded the Erlang MFAs from CouchDB config and then executed those. This allowed CouchDB admins to change what native code would be running.

This patch makes loading those values dynamic again, but by the above analysis, that should not open us up to new (or old) attack vectors, as someone who can change the env of a running process is already further ahead than this vector would get them. Can someone double check my logic here?

[1]:
I don't see the pressing need for online reconfiguration here. As Jan pointed out, we had it, and removed it (for security reasons) with very little complaint (I don't recall any, tbh). There is some protection in not allowing an attacker that can manipulate a process's environment from magically changing what (potentially compromised) native code it executes. Sure, an attacker that can do that can probably cause a couchdb restart or crash, though that is at least a tangible event: it could be logged, the pid changes, etc. So the "from env" protection and the "no dynamic change after boot" protection are both providing something.
bb96489 to 240c2cc (force-push)
could we put just

e.g. 1) couchdb boots with all the env vars it needs to know how to use spidermonkey and quickjs 2) a runtime toggle as to which is used for new
@rnewson There is a need when deploying this change at a cloud db provider ;-), where otherwise it would require putting each node in maintenance mode, restarting it, putting it back in production, and so on. It also makes it easy for any user to toggle the config. @janl Thanks for the background. The analysis is correct that this doesn't allow adding any new MFA-based options or changing arguments. Unless they attach a debugger to the beam process and update the OTP cached env red-black tree https://github.com/erlang/otp/blob/master/erts/emulator/sys/common/erl_osenv.c
Hmm, that's what this PR does? There is already a config change handler, it reconfigures all the native (MFA) bits but didn't do the js engine switch.
That's effectively what's happening here. Erlang/OTP's env is cached https://github.com/erlang/otp/blob/master/erts/emulator/sys/common/erl_osenv.c and I am not sure how it would be possible to alter it without already having high-enough access to the process to do a lot worse damage. We could cache it in the process again, I suppose, but the danger is that environment variables hold security bits, passwords, tokens, etc., so a proc manager crashing would spill those into the logs. Maybe cache the filtered prefix bits only? It still seems redundant.
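If caching were ever added, the "filtered prefix bits only" idea could look something like this sketch (the `COUCHDB_` prefix and function name are assumptions for illustration):

```erlang
%% Sketch: cache only environment variables with a known prefix so a
%% crash dump of the process state cannot leak unrelated secrets.
%% os:getenv/0 returns a list of "Key=Value" strings; string:split/2
%% splits on the first "=" by default.
cache_couch_env() ->
    Prefix = "COUCHDB_",
    Pairs = [list_to_tuple(string:split(E, "=")) || E <- os:getenv()],
    [{K, V} || {K, V} <- Pairs, lists:prefix(Prefix, K)].
```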
as I read it, it moves the pull-from-env from
gut feeling here, and no strong opinion, but the way we had it before is strictly better at expressing in our code what we mean to happen, rather than relying on how OTP works today. Of course, if we did this every time, we would reimplement all of OTP, and that is obviously silly.
@janl
I can cache it, I guess, even though I can't think of a way anyone would change the runtime env var settings of the process. Even for a regular C process, how would changing its environment variable set work after it has already started?
```erlang
handle_config_change("query_server_config", _, _, _, _) ->
    gen_server:cast(?MODULE, reload_config),
    {ok, undefined};
handle_config_change("couchdb", "js_engine", _, _, _) ->
```
Maybe out of scope of this PR, but we might want to handle `handle_config_change("native_query_servers", _, _, _, _)` in a similar way.
> Maybe out of scope of this PR, but we might want to handle `handle_config_change("native_query_servers", _, _, _, _)` in a similar way.
We already do; it's just above, on lines 283:285.
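For context, the full set of handler clauses being discussed might look roughly like this (a sketch; the exact bodies and ordering in the real module may differ):

```erlang
%% Sketch: each config section the proc manager cares about triggers
%% the same async reload; unmatched changes are ignored.
handle_config_change("native_query_servers", _, _, _, _) ->
    gen_server:cast(?MODULE, reload_config),
    {ok, undefined};
handle_config_change("query_server_config", _, _, _, _) ->
    gen_server:cast(?MODULE, reload_config),
    {ok, undefined};
handle_config_change("couchdb", "js_engine", _, _, _) ->
    gen_server:cast(?MODULE, reload_config),
    {ok, undefined};
handle_config_change(_, _, _, _, _) ->
    {ok, undefined}.
```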
240c2cc to d9c4ccd (force-push)
This should make it possible to toggle between engines without having to restart the node. We already had a config listener, so make sure it also sets up the js_engine bits. Add a test to demonstrate it works both when setting a non-default value and when reverting back to the default value.
d9c4ccd to f77a170 (force-push)
In the previous PR [1], Jessica had discovered that an invalid engine setting crashes the server. Initially I had thought we would be strict about it and just let it crash, but that seems a bit harsh for this, especially if there is a large cluster and the configuration is pushed to all nodes: it would crash the whole cluster. Instead, since we have a default, stick with that. Especially now that users can check whether a non-default js_engine is active from the welcome endpoint. [1] #5327