Skip to content

Conversation

@kgiusti
Copy link
Contributor

@kgiusti kgiusti commented Jul 31, 2020

No description provided.

@kgiusti kgiusti added WIP PR is a Work In Progress proposal Router Improvement Proposal labels Jul 31, 2020
Copy link
Contributor

@jiridanek jiridanek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

////

= Mutex Locking Validation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose a header for RIPs, consisting of the following metadata.

Authors: Jane Doe
Status: Submitted | Superseded | Implemented
Last updated: 2020-04-05

(I am taking this from https://docs.google.com/document/d/1ef7_drjTl4NnzfAmGPzC4_R9vGHkpIzUrlGhxPqTbwg/edit)

////

= Mutex Locking Validation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a README.adoc that explains what the RIPs are. I think that the following phrase (or a variant of it) should be part of it

It’s possible that designs change as they are implemented in practice. The published design documents capture the initial design, and not the ongoing changes as designs are implemented.

Always go to the documentation for descriptions of current Qpid Dispatch functionality.

(this is from https://bazel.build/designs/)

In order to track lock nesting within a thread each thread will
maintain a per-thread stack of active locks. Taking a new lock will
result in that lock being pushed onto the stack. Attempting to push a
lock in the wrong priority order will result in calling abort().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you intend to check that the lock being unlocked is the one on top of the lock stack?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example would be really helpful. Assign some priorities to existing locks (connection work lock, log lock, python lock, etc.) and see what and what isn't allowed.


A lock with a priority of zero is an exclusive lock. A thread must
never hold another lock when taking a priority 0 lock. While holding a
priority 0 lock it is an error to take any other lock.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any example of a lock that would get priority 0? Is this actually good for anything?

I was thinking of possibly using MAX_INT instead of 0, to avoid a special case. But there still would be a special case, because more than one lock may need to have this priority (and the rest of the proposal afaik disallows having multiple locks with the same priority).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this feels inverted and it will be difficult to choose an appropriate priority for your lock.

Instead of saying 0 can't be acquired while another lock is held, could we say that when a 0 is held, no other locks may be acquired? This would suggest a default priority of 0, which would apply to locks that are acquired and released in-line while data structures are manipulated and no functions are called.

This would mean that a lock's priority would be based on the priorities of other locks that may be recursively acquired, establishing a lock hierarchy. Locks like the Python lock should have a relatively high priority since they are held over long-running operations.

////

= Mutex Locking Validation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That README.adoc should explain what RIP stands for ;) I actually don't know. (But it sounds really cool.)

while a thread is running within the Python interpreter. Occasionally
the Python interpreter will invoke native C functions. These
functions may take addional locks, such as the global logging lock
(log_lock).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offtopic: is it necessary to have logging lock? There is lock-free logging in Java, so maybe C has it too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log locking is used to keep the written log in strict chronological order. gettimeofday, vsnprintf, and write_log are the items that need to be locked but the code locks only a little more than that.
It would be bad to have the logs out of order with respect to time of day.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant something like http://logging.apache.org/log4j/2.x/manual/async.html, this keeps a queue of in-flight log messages and adding to the queue does not require holding a lock.

////

= Mutex Locking Validation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposal files could be numbered, so possibly 001-lock-validation.adoc or maybe dated, 2020-07-31-lock-validation.adoc.

state. The locking API is defined in
include/qpid/dispatch/threading.h

In general lock nesting (e.g. taking a lock while holding another) is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this statement is a bit strong. There are a number of cases where locks are nested in the code base.


Every lock will be assigned a fixed priority on creation. The
priority will be a simple positive integer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you proposing that the sys_mutex function be modified to include a lock priority? Alternatively, are you planning to add an alternate constructor (sys_mutex_priority or similar) and having the existing one default to a particular value? If so, what's the default?

Copy link
Member

@ted-ross ted-ross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks like a valuable enhancement. We still need to answer some questions and hammer out some details.

@ssorj
Copy link
Contributor

ssorj commented Aug 7, 2020

I propose putting these under /docs/rips/ (as a peer to notes/, man/, and so on).

@jiridanek
Copy link
Contributor

Somebody built a tool that does a lot of what's proposed here http://manpages.ubuntu.com/manpages/bionic/man8/deadlock_detector-bpfcc.8.html. TSan is of course another such tool.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

proposal Router Improvement Proposal WIP PR is a Work In Progress

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants