@TimLFletcher
Contributor

https://jira.issues.couchbase.com/browse/DOC-13694

This is part of a CBSE, so it has some priority.

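A split lock occurs when an atomic instruction operates on data that spans two cache lines, which forces the CPU to lock the memory bus for the duration of the access. When the Linux kernel's split lock mitigation is working as intended, it can dramatically slow CB processes for no major benefit. This PR adds some information on how to mitigate that.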
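
As a rough illustration of when a memory access straddles two cache lines, the check is simple integer arithmetic. The sketch below is not part of the PR: the function name `crosses_cache_line` is hypothetical, and the 64-byte default line size is an assumption that matches common x86 CPUs but should be verified for the actual hardware.

```bash
# Hypothetical helper: succeeds (exit status 0) when an access of SIZE bytes
# starting at byte address ADDR touches more than one cache line.
# LINE defaults to 64 bytes, which is an assumption about the CPU.
crosses_cache_line() {
    local addr=$1 size=$2 line=${3:-64}
    local first=$(( addr / line ))               # line holding the first byte
    local last=$(( (addr + size - 1) / line ))   # line holding the last byte
    (( first != last ))
}
```

For example, `crosses_cache_line 60 8` succeeds because bytes 60 to 67 cover two 64-byte lines, while `crosses_cache_line 64 8` fails because the access is aligned to the start of a line.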

I've added details to the deployment guidelines as well as the troubleshooting doc. I also did a minor cleanup on the troubleshooting doc, although I admit this is basically a drop in the ocean. Links tested and working.

@TimLFletcher
Contributor Author

I'll also note that I did consider making a partial for this. But with only 2 pages and with info that is highly unlikely to change, I didn't think a partial would reduce the complexity of maintenance.

Contributor

@sarahlwelton left a comment

Some fixes please :)

** xref:install:thp-disable.adoc[Disable THP]
** xref:install:install-swap-space.adoc[Configure Kernel Swappiness]
** xref:install:install-security-bp.adoc[Security Considerations]
** xref:install:install-splitlock-mitigation.adoc[Split lock mitigation]
Contributor

Suggested change
- ** xref:install:install-splitlock-mitigation.adoc[Split lock mitigation]
+ ** xref:install:install-splitlock-mitigation.adoc[]

Title case for topic titles. Also no need to add this, it will populate from the title in the .adoc file. But change to title case if you do want to keep it, please.

xref:deployment-considerations-lt-3nodes.adoc[About Deploying Clusters With Less Than Three Nodes]

| *Split lock*
| On some CPUs, the kernel's split lock mitigation can cause a severe performance drop when a process triggers a misaligned atomic memory access ("split lock").
Contributor

Why is the term in the definition that we're providing for it? Can we find a better way to phrase this?

Also please try to avoid using quotation marks.

@@ -0,0 +1,42 @@
= Split Lock Mitigation

:description: Disable Linux split lock mitigation on affected systems
Contributor

Suggested change
- :description: Disable Linux split lock mitigation on affected systems
+ :description: Learn how to disable Linux split lock mitigation on affected systems.

Do you want to add a keywords attribute to your front matter, too?

[abstract]
{description}

On some Linux kernels that support split lock detection, the kernel's split lock mitigation can cause a severe performance drop when a process triggers a misaligned atomic memory access ("split lock").
Contributor

Same comment as before, just having the word in use before we define it seems... Off.

Disabling split lock mitigation changes kernel behavior intended to reduce the impact of split locks.
Apply this setting only on systems where you have confirmed it's contributing to performance degradation.
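
One possible way to gather evidence before changing anything is to check whether the CPU advertises split lock detection and whether the kernel log records split lock traps. The following is a minimal sketch under stated assumptions, not a procedure from this PR: both function names are hypothetical, the optional file arguments exist only so the checks can run against saved copies of the data, and the `split_lock trap` pattern is an assumption about how the kernel words its warning, which may differ between kernel versions.

```bash
# Hypothetical helpers for inspecting a host before disabling the mitigation.

# Succeeds when the CPU flags list includes split_lock_detect.
# Reads /proc/cpuinfo unless another file is given.
cpu_has_split_lock_detect() {
    grep -qw 'split_lock_detect' "${1:-/proc/cpuinfo}"
}

# Prints the number of lines in a saved kernel log that mention a split lock
# trap. The search pattern is an assumption about the kernel message text.
# Note: grep -c prints 0 and returns a non-zero status when nothing matches.
count_split_lock_traps() {
    grep -c 'split_lock trap' "$1"
}
```

A typical use would be to save the kernel log with `dmesg > kernel.log` (which may require elevated privileges) and then run `count_split_lock_traps kernel.log`. A non-zero count alongside the observed slowdown suggests, but does not prove, that the mitigation is involved.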

For background, see: https://lwn.net/Articles/790464/
Contributor

Please properly introduce and format the link. Give users some hint about where you're sending them.

See: https://docs.couchbase.com/styleguide/links.html

Contributor Author

I think I will push back on the dev for including this. It's not an official source, so I don't like it.

Comment on lines 256 to 277
=== Resolution

Disable split lock mitigation on the affected Couchbase Server host.

. Permanent (boot-time) fix
+
Add `split_lock_detect=off` to the kernel command line and reboot the host.
+
[source,bash]
----
# Example kernel cmdline parameter
split_lock_detect=off
----

. Immediate (runtime) mitigation (kernel-dependent)
+
If present on your kernel, you can disable mitigation via `/proc`:
+
[source,bash]
----
echo 0 > /proc/sys/kernel/split_lock_mitigate
----
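
The two steps above could be wrapped in guarded helpers, for example as follows. This is a sketch only and not part of the reviewed change: the function names `boot_param_is_set` and `disable_split_lock_mitigation` are hypothetical, and their optional path arguments are an assumption added so the logic can be tried against scratch files instead of the live `/proc` entries.

```bash
# Hypothetical helpers wrapping the boot-time check and the runtime setting.

# Succeeds when split_lock_detect=off appears on the kernel command line.
# Reads /proc/cmdline unless another file is given.
boot_param_is_set() {
    grep -qw 'split_lock_detect=off' "${1:-/proc/cmdline}"
}

# Writes 0 to the split_lock_mitigate knob if this kernel exposes it.
# Returns 0 on success, 1 if the knob is absent, 2 if the write fails.
disable_split_lock_mitigation() {
    local knob="${1:-/proc/sys/kernel/split_lock_mitigate}"

    if [ ! -e "$knob" ]; then
        echo "split_lock_mitigate not available at $knob" >&2
        return 1
    fi

    # Writing to the real /proc entry requires root privileges.
    if ! echo 0 > "$knob"; then
        echo "could not write to $knob" >&2
        return 2
    fi

    echo "split_lock_mitigate is now $(cat "$knob")"
}
```

Run as root, `disable_split_lock_mitigation` with no argument targets the real knob, and `boot_param_is_set` with no argument reports whether the boot-time parameter took effect after a reboot. A value written through `/proc` does not persist across reboots.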
Contributor

Do we want to repeat this or just link to the page with details? No need for a partial, just a link.

Contributor Author

Fair. I will leave some details on the symptoms and link to the fix.

Typically, to enable certain features such as client-server SSL or secure XDCR, all the nodes of the cluster need to be at a required compatibility level or higher that supports the feature.

Difficulties communicating with the cluster. Displaying cached information.::
== Difficulties communicating with the cluster. Displaying cached information.
Contributor

Suggested change
- == Difficulties communicating with the cluster. Displaying cached information.
+ == Difficulties Communicating With The Cluster. Displaying Cached Information.

This error often occurs due to a networking issue, particularly around DHCP and DNS resolution.

Cluster version compatibility mismatch::
== Cluster version compatibility mismatch
Contributor

Suggested change
- == Cluster version compatibility mismatch
+ == Cluster Version Compatibility Mismatch

----

IP address seems to have changed::
== IP address seems to have changed
Contributor

Suggested change
- == IP address seems to have changed
+ == IP Address Seems To Have Changed

{description}

File descriptor and core file size limits::
== File descriptor and core file size limits
Contributor

Suggested change
- == File descriptor and core file size limits
+ == File Descriptor and Core File Size Limits

@TimLFletcher
Contributor Author

Rewrote as a procedure

Contributor

@sarahlwelton left a comment

Just one nit.

If the CPU supports split lock detection, a misaligned atomic memory access can trigger Linux split lock mitigation.
The mitigation can artificially slow execution, causing a pronounced performance drop for the affected workload.

For mitigating information, see the article on xref:install:install-splitlock-mitigation.adoc[]
Contributor

Suggested change
- For mitigating information, see the article on xref:install:install-splitlock-mitigation.adoc[]
+ For more information about how to resolve issues with split lock mitigation, see xref:install:install-splitlock-mitigation.adoc[].

Contributor Author

Thanks. Fixed.

Contributor

@sarahlwelton left a comment

🚢

@TimLFletcher merged commit 8eb4512 into release/8.0 on Jan 26, 2026
@TimLFletcher deleted the DOC-13694 branch on January 26, 2026 17:12