Conversation
This avoids needlessly making cross-cluster fabric:update_docs(Db, [], Opts) calls.
- fix `function_clause` error on invalid DB security objects when the request body of a `PUT /db/_security` request is not valid JSON. Closes #1384
Previously `end_time` was generated by converting `start_time` to universal time and then passing the result to `httpd_util:rfc1123_date/1`. However, `rfc1123_date/1` also translates its argument from local to UTC time; that is, it expects its input to be in local time. Fixes #1841
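A minimal Python sketch of the double conversion, assuming a hypothetical local zone of UTC+2 (the real code is Erlang using `httpd_util:rfc1123_date/1`; the offset and helper names here are illustrative only):

```python
from datetime import datetime, timedelta

# Fake a local zone at UTC+2 for illustration; the offset is an assumption.
LOCAL_OFFSET = timedelta(hours=2)

def to_universal(local_dt):
    """Local -> UTC, as the old code did before formatting."""
    return local_dt - LOCAL_OFFSET

def rfc1123_date(local_dt):
    """Mimics httpd_util:rfc1123_date/1: expects LOCAL time and converts
    it to UTC itself before formatting."""
    return to_universal(local_dt).strftime("%a, %d %b %Y %H:%M:%S GMT")

start_local = datetime(2019, 2, 1, 12, 0, 0)

# Buggy: converting first means the offset gets subtracted twice.
buggy = rfc1123_date(to_universal(start_local))
# Fixed: hand the local time straight to the formatter.
fixed = rfc1123_date(start_local)
print(buggy)  # 08:00:00 GMT, two hours off
print(fixed)  # 10:00:00 GMT, the correct UTC time
```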
There was a subtle bug when opening specific revisions in
fabric_doc_open_revs due to a race condition between updates being
applied across a cluster.
The underlying cause here was due to the stemming after a document had
been updated more than revs_limit number of times along with concurrent
reads to a node that had not yet applied the update. To illustrate, let's
consider a document A which has a revision history from `{N, RevN}` to
`{N+1000, RevN+1000}` (assuming revs_limit is the default 1000). From a
single node's perspective, when an update comes in we add the new revision
and stem the oldest one. The revisions on that node would then be
`{N+1, RevN+1}` to `{N+1001, RevN+1001}`.
The bug appears when we attempt to open revisions on a different node
that has yet to apply the new update. In this case fabric_doc_open_revs
could be called with `{N+1000, RevN+1000}`. This results in a response
from fabric_doc_open_revs that includes two different `{ok, Doc}` results
instead of the expected single instance. The reason is that one document
has revisions `{N+1, RevN+1}` to `{N+1000, RevN+1000}` from the node that
has applied the update, while the node without the update responds with
revisions `{N, RevN}` to `{N+1000, RevN+1000}`.
To rephrase that, a node that has applied an update can end up returning
a revision path that contains `revs_limit - 1` revisions while a node
without the update returns all `revs_limit` revisions. This slight
change in the path prevented the responses from being properly combined
into a single response.
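The mismatch can be sketched in a few lines of Python (an illustration only; the real code is Erlang in fabric_doc_open_revs, and the generation numbers and `path_for` helper here are hypothetical):

```python
# Illustrative sketch of the revision-path mismatch, not the actual Erlang.
REVS_LIMIT = 1000
N = 1  # arbitrary starting generation for illustration

def path_for(held_gens, requested_gen):
    """Ancestor path a node reports when asked to open a given generation."""
    return [g for g in held_gens if g <= requested_gen]

# Node A applied the 1001st update and stemmed the oldest revision:
node_a = list(range(N + 1, N + REVS_LIMIT + 2))   # N+1 .. N+1001
# Node B has not applied that update yet:
node_b = list(range(N, N + REVS_LIMIT + 1))       # N   .. N+1000

requested = N + REVS_LIMIT                        # open {N+1000, RevN+1000}
path_a = path_for(node_a, requested)              # N+1 .. N+1000
path_b = path_for(node_b, requested)              # N   .. N+1000

# The two paths differ by one revision, so the coordinator cannot fold the
# answers into a single {ok, Doc} and returns two results instead.
print(len(path_a), len(path_b))
```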
This bug has existed for many years. However, read repair effectively
prevents it from being a significant issue by immediately fixing the
revision history discrepancy. This was discovered due to the recent bug
in read repair during a mixed cluster upgrade to a release including
clustered purge. In this situation we end up crashing the design
document cache which then leads to all of the design document requests
being direct reads which can end up causing cluster nodes to OOM and
die. The conditions require a significant number of design document
edits coupled with already significant load to those modified design
documents. The most direct example observed was a cluster that had a
significant number of filtered replications in and out of the cluster.
This server admin-only endpoint forces an n-way sync of all shards across all nodes on which they are hosted. This can be useful for an administrator adding a new node to the cluster, after updating _dbs so that the new node hosts an existing db with content, to force the new node to sync all of that db's shards. Users may want to bump their `[mem3] sync_concurrency` value to a larger figure for the duration of the shards sync. Closes #1807
It includes a fix that reverts the user socket buffer size to 8192 and
also allows setting these buffer values directly (not necessarily
via `{recbuf, ...}`).
Fixes #1810
Warning:
2.19.0 blacklists a series of OTP releases: 21.2, 21.2.1, 21.2.2
This is done via a runtime check of the ssl application version.
The blacklist seems valid as there is a bug which prevents data from
being delivered on TLS sockets. That could affect either the CouchDB
server side (chttpd) or the replication client side (ibrowse).
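The idea of the runtime check can be sketched as a membership test against known-bad ssl application versions; the version strings below are illustrative placeholders, not the exact list ibrowse blacklists:

```python
# Sketch of a runtime version blacklist (the real check is Erlang code in
# ibrowse 2.19.0; these version strings are placeholders, not the real list).
BLACKLISTED_SSL_VERSIONS = frozenset({"9.1", "9.1.1", "9.1.2"})

def ssl_version_ok(running_version):
    """Refuse to use TLS if the loaded ssl application is a known-bad build."""
    return running_version not in BLACKLISTED_SSL_VERSIONS

print(ssl_version_ok("9.2"))    # a version outside the blacklist passes
print(ssl_version_ok("9.1.1"))  # a blacklisted version is rejected
```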
This restricts _purge and _purged_infos_limit to server admins in terms of the security level required to run them. Fixes #1799
This commit introduces a new option `snooze_period_ms` (measured in milliseconds) and deprecates `snooze_period`, while still supporting it for backwards compatibility.
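The fallback logic might look like this sketch (the helper and default names are assumptions, not CouchDB's actual config API):

```python
# Hypothetical default; the real value lives in CouchDB's config.
DEFAULT_SNOOZE_MS = 3000

def snooze_period_ms(config):
    """Prefer the new millisecond option; fall back to the deprecated
    seconds-based snooze_period; otherwise use the default."""
    if "snooze_period_ms" in config:
        return int(config["snooze_period_ms"])
    if "snooze_period" in config:  # deprecated, interpreted as seconds
        return int(config["snooze_period"]) * 1000
    return DEFAULT_SNOOZE_MS
```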
The Makefile target builds a python3 venv at .venv and installs black if possible. Since black requires Python 3.6 or later, we skip the check on systems with an older Python 3.x.
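The version gate amounts to a check like this sketch (the helper name is hypothetical; the real logic lives in the Makefile):

```python
import sys

def should_run_black(version_info=None):
    """black supports Python 3.6+, so only run the format check when the
    interpreter is at least that new."""
    vi = sys.version_info if version_info is None else version_info
    return vi[:2] >= (3, 6)
```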
Hi @janl , c5d9cfe (#1766) <-- you missed this one. Would you consider also cherry-picking these minor fixes? These are all small in scope but moderately important, covering some corner cases, especially upgrade-related ones, so I think they'd be good fits for a patch release. c68863a (#1808) #1794 which is: #1824 which is:
Heya @wohali most of these are in. Can you double check and dedupe? :)
@janl updated to dedupe, sorry about that, was going off the wrong info
@wohali thanks for the dedupe, sorry it wasn't clearer what was included already. I may have read things wrong when skimming, but I assumed this was only relevant past the partitioned databases commit. Happy to reconsider if this is generally useful cc @davisp. I had ruled dep ups for anything other than critical things out of scope for a .1, but on reread I do agree we should do this one.
Felt a bit risky for a .1, but happy to include.
Also thought this was moving things around too much for a .1, but am happy to be convinced otherwise (cc @nickva). Any I didn't comment on I agree on adding. Will do so over the weekend while Fauxton gets into shape.
Most changes in the PR were a code move, copying the streams logic to its own fabric module in a separate commit: The main logic was here: I think it would mostly affect larger clusters with many requests timing out and being cleaned up improperly, so they'd leak their rexi workers. The other ones might affect smaller embedded systems with restricted resources (a low max_dbs_open value). But maybe not as critical for average CouchDB deployments, and it's a bug that's been there for years, so I can see keeping it back to reduce the .1 commit set. Oh and thank you for helping with 2.3.1!
This enables backwards compatibility with nodes still running the old version of fabric_rpc when a cluster is upgraded to master. This has no effect once all nodes are upgraded to the latest version.
This fixes the inability to set keys with regex symbols in them.
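The underlying idea can be sketched as escaping the key before it is embedded in a regular expression (a hypothetical Python illustration; the actual fix is in CouchDB's Erlang code, and the `key_pattern` helper is invented here):

```python
import re

# Escape a user-supplied key before building a regex from it, so characters
# like [ ] . + in the key match literally instead of acting as metacharacters.
def key_pattern(key):
    return re.compile("^" + re.escape(key) + r"\s*=")

# A key containing regex symbols now matches its own config line:
assert key_pattern("log[level]").match("log[level] = debug")
# Unescaped, the dot in "a.b" would wrongly match "axb"; escaped, it doesn't:
assert key_pattern("a.b").match("axb = 1") is None
```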
This adds an API call for looking up a single design doc regardless of whether the database is clustered or not.
The underlying clustered _all_docs call can cause significant extra load during compaction.
This ensures that admin password hashes are the same on all nodes when passwords are set directly on each node rather than through the coordinator node.
@jaydoane merged, thanks!
Mainly to get a CI build status for this set of cherry-picked commits between 2.3.0 and master.