Skip to content

libnetwork/osl: Namespace: assorted cleanups #46315

Merged
thaJeztah merged 10 commits intomoby:masterfrom
thaJeztah:libnetwork_remove_sandbox_interface_step2
Sep 20, 2023
Merged

libnetwork/osl: Namespace: assorted cleanups #46315
thaJeztah merged 10 commits intomoby:masterfrom
thaJeztah:libnetwork_remove_sandbox_interface_step2

Conversation

@thaJeztah
Copy link
Member

libnetwork/osl: implement Namespace.RemoveInterface

Interface.Remove() was directly accessing Namespace "internals", such
as locking/unlocking. Move the code from Interface.Remove() into the
Namespace instead.

libnetwork/osl: Namespace: make mutex private

Make the mutex internal to the Namespace; locking/unlocking should not
be done externally, and this makes it easier to see where it's used.

libnetwork/osl: Namespace: programRoute, removeRoute rm path arg

Remove the argument, because it was not used.

libnetwork/osl: Namespace: make error-handling more idiomatic

Check for non-nil errors (and return early) instead of the reverse.

libnetwork/osl: Namespace: inline setGateway and setGatewayIPv6

They were not consistently used, and the locations where they were
used were already "setters", so we may as well inline the code.

Also updating Namespace.Restore to keep the lock slightly longer,
instead of locking/unlocking for each property individually, although
we should consider to keep the long for the duration of the whole
function to make it more atomic.

libnetwork/osl: Namespace.Restore: rename vars for readability

libnetwork/osl: Namespace.Restore: flatten nested conditions

Flatten some nested "if"-statements, and improve error.

Errors returned by this function are not handled, and only logged, so
make them more informative if debugging is needed.

libnetwork/osl: Namespace.Restore: conditionally fetch IPs

We're only using the results if the interface doesn't have an address
yet, so skip this step if we don't use it.

libnetwork/osl: make constructing Interfaces more atomic

It's still not "great", but implement a newInterface() constructor
to create a new Interface instance, instead of creating a partial
instance and applying "options" after the fact.

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah thaJeztah added status/2-code-review area/networking Networking kind/refactor PR's that refactor, or clean-up code labels Aug 24, 2023
@thaJeztah thaJeztah force-pushed the libnetwork_remove_sandbox_interface_step2 branch 2 times, most recently from dad12c9 to 9c5fed3 Compare August 29, 2023 19:28
@thaJeztah thaJeztah force-pushed the libnetwork_remove_sandbox_interface_step2 branch from 9c5fed3 to 2e10816 Compare September 13, 2023 23:53
Copy link
Member

@akerouanton akerouanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, only a few comments are actionable; but this package is doing weird stuff.


// RemoveInterface removes an interface from the namespace by renaming to
// original name and moving it out of the sandbox.
func (n *Namespace) RemoveInterface(i *Interface) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we have a libnetwork/ns package, but Namespace is declared in libnetwork/osl 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a weird package; also still not sure where osl stands for (I'm thinking "sandbox linux" or "operating system linux" don't know?)

Originally this code WAS libnetwork (Network), but then

So now we had a Network and Sandbox (wrapping platform-specific implementations), and "Endpoints", and Containers.

}
n.Unlock()

n.checkLoV6()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function will disable IPv6 on lo interface if the removed interface was the last one offering IPv6 connectivity. First, I think that's a weird behavior; second, I don't like the idea of hiding this deep down in this function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a TODO for this one as well; all this code clearly needs a deeper look.

return err
}

err = nlh.LinkSetName(iface, i.SrcName())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why we're doing this 🤔 And this would fail if the initial interface set up failed before the "dest interface" was moved into its own namespace.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points; this commit is only moving the code, but I can add TODO's to some of these lines to point out that this is code to look into.

}
} else if !isDefault {
// Move the network interface to caller namespace.
if err := nlh.LinkSetNsFd(iface, ns.ParseHandlerInt()); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, so, if the interface isn't a bridge and doesn't sit in dockerd's netns, we just move back the interface to it.

Except the call 5 lines above, I don't see any call to LinkDel in this package. Now I'm wondering, is this code really used? I see Interface.Remove() has 3 callers, I'll need to check what they're really doing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add a TODO for this as well

(FWIW; my main reason for this move was that I wanted to un-export the mutex; I really didn't like interface calling ns.Lock(), and wanted to internalise that (move the mutex to be non-exported, which makes it a lot easier to see where it's used / called, and then check "is it actually needed", and "do we have locking in the right places"?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(FWIW; my main reason for this move was that I wanted to un-export the mutex; I really didn't like interface calling ns.Lock(), and wanted to internalise that (move the mutex to be non-exported, which makes it a lot easier to see where it's used / called, and then check "is it actually needed", and "do we have locking in the right places"?)

💯

I'm fine with TODOs, but I think we should track them somehow to make sure we address them.

@@ -306,11 +306,12 @@ func createNamespaceFile(path string) (err error) {
// before trying to create the file.
gpmWg.Wait()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL: we have a GC only to delete file paths, and it serializes concurrent calls to (*Namespace).Destroy() through a lock.

Investigating a bit this weirdo:

So, I'm wondering if we can safely drop it. Otherwise, we should at least document why it exists and when we'll be able to ditch it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'm wondering if we can safely drop it.

I guess that depends whether RHEL backported it to their 3.10 kernels. I'll have a look at adding a comment / TODO (but maybe for this one in a follow-up as I'd have to look "what" to document exactly 😂)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that depends whether RHEL backported it to their 3.10 kernels.

If the original comment from @mrjana is right, then 3.10 should be unaffected.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be happy if we can remove it! I just pushed an updated version; I decided to just use a separate commit for all the TODO's (to not "pollute" the other commits with them) 👍

}

return err
n.setGateway(gw)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On one hand, this "just" locks n.mu and then set n.gateway; thus it serializes concurrent calls. OTOH, n.programGateway locks nothing and thus concurrent calls aren't serialized 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, locking is a big mess in libnetwork (too many locks in one place, and no locks in places where it actually matters). That's (see my other comment) also a reason I'm un-exporting the mutexes, which also makes it easier to find where locking happens.

}
for _, opt := range options {
if opt != nil {
opt(i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IfaceOption are only setters, so I think we can remove its error return type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH! Actually, I think I messed up a rebase here (looks like I dropped the error return?)

I think we can consider the functional arguments to return an error so that an invalid option would actually produce an error (where it makes sense). In either case, I should move the code "as is" from the old location.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should move the code "as is" from the old location.

Yeah that's probably better.

I think we can consider the functional arguments to return an error so that an invalid option would actually produce an error (where it makes sense).

Except no IfaceOption actually return an error, as they're all data setters. So, I think we should simply change the declaration of IfaceOption to func(i *Interface) in a follow-up PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to dig in memory; ISTR I found some places where they were setters, but other code then validated it after it was set (maybe that was the master i this case (see the code below this), but there may have been other cases).

In either case; let's leave that for a follow-up.

Interface.Remove() was directly accessing Namespace "internals", such
as locking/unlocking. Move the code from Interface.Remove() into the
Namespace instead.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Make the mutex internal to the Namespace; locking/unlocking should not
be done externally, and this makes it easier to see where it's used.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Remove the argument, because it was not used.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Check for non-nil errors (and return early) instead of the reverse.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
They were not consistently used, and the locations where they were
used were already "setters", so we may as well inline the code.

Also updating Namespace.Restore to keep the lock slightly longer,
instead of locking/unlocking for each property individually, although
we should consider to keep the long for the duration of the whole
function to make it more atomic.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Flatten some nested "if"-statements, and improve error.

Errors returned by this function are not handled, and only logged, so
make them more informative if debugging is needed.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
We're only using the results if the interface doesn't have an address
yet, so skip this step if we don't use it.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
It's still not "great", but implement a `newInterface()` constructor
to create a new Interface instance, instead of creating a partial
instance and applying "options" after the fact.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
These came up during review of a refactor, and need further investigating.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@thaJeztah thaJeztah force-pushed the libnetwork_remove_sandbox_interface_step2 branch from 2e10816 to 313a090 Compare September 20, 2023 10:46
Copy link
Member

@akerouanton akerouanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@thaJeztah thaJeztah added this to the 25.0.0 milestone Sep 20, 2023
@thaJeztah thaJeztah merged commit e061793 into moby:master Sep 20, 2023
@thaJeztah thaJeztah deleted the libnetwork_remove_sandbox_interface_step2 branch September 20, 2023 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/networking Networking kind/refactor PR's that refactor, or clean-up code status/2-code-review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants