Container errors on restart

### Description

This is a variant of #3350 

Containers cannot be restarted after they have been shut down by containerd stopping, and are generally left in a broken state.

This is against containerd v1.7 (unlike #3350, which was tested against containerd v2).

### Steps to reproduce the issue

Reproduction is:
```
nerdctl rm -f foo
nerdctl run -d --name foo debian sleep Inf
systemctl --user stop containerd
systemctl --user start containerd
```

Then
```
nerdctl start foo
```
or 
```
nerdctl stop foo
```

### Describe the results you received and expected

There are clearly multiple issues.

First is:
- [x] inability of the container to re-acquire its name in the name store

This issue affects only `main` (and not 1.7).
I have a local patch for it that I will send shortly.


Second is: 
- [x] bridge plugin refusing to return an already allocated IP
```
level=fatal msg="failed to call cni.Setup: plugin type=\"bridge\" failed (add): failed to allocate for range 0: 10.4.0.229 has been allocated to default-ec2a02d4f734a18adf2292b4a5efbcb0d5e2581198ea54653c63bdde05bdc1f1, duplicate allocation is not allowed": unknown
```

This is definitely coming from https://github.com/containernetworking/plugins/blob/main/plugins/ipam/host-local/backend/allocator/allocator.go#L83 

This has been there for some time and affects both 1.7 and main.

This needs discussion.
Should we modify the allocator over there so that it returns the already allocated IP instead of failing?
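
To make the question concrete, here is a minimal sketch of the two behaviours side by side. It is not the host-local code: `store`, `allocateStrict`, `allocateIdempotent` and `reserve` are hypothetical names, and the in-memory map only stands in for the on-disk reservations. It assumes the owner ID survives the containerd restart unchanged.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDuplicate mimics the "duplicate allocation is not allowed" failure.
var ErrDuplicate = errors.New("duplicate allocation is not allowed")

// store is a hypothetical in-memory stand-in for the reservation
// backend: it maps an IP to the ID that owns it.
type store map[string]string

// findByID returns the IP already reserved for id, if any.
func (s store) findByID(id string) (string, bool) {
	for ip, owner := range s {
		if owner == id {
			return ip, true
		}
	}
	return "", false
}

// reserve hands out the first free IP in pool and records its owner.
func reserve(s store, id string, pool []string) (string, error) {
	for _, ip := range pool {
		if _, taken := s[ip]; !taken {
			s[ip] = id
			return ip, nil
		}
	}
	return "", errors.New("no IP addresses available in range")
}

// allocateStrict fails when id already holds a reservation, which is
// the behaviour reported above.
func allocateStrict(s store, id string, pool []string) (string, error) {
	if ip, ok := s.findByID(id); ok {
		return "", fmt.Errorf("%s has been allocated to %s: %w", ip, id, ErrDuplicate)
	}
	return reserve(s, id, pool)
}

// allocateIdempotent returns the existing reservation instead of
// failing, which is the alternative raised in the question.
func allocateIdempotent(s store, id string, pool []string) (string, error) {
	if ip, ok := s.findByID(id); ok {
		return ip, nil
	}
	return reserve(s, id, pool)
}

func main() {
	pool := []string{"10.4.0.229", "10.4.0.230"}
	s := store{}

	first, _ := allocateIdempotent(s, "default-foo", pool)
	_, strictErr := allocateStrict(s, "default-foo", pool)
	again, err := allocateIdempotent(s, "default-foo", pool)

	fmt.Println("first:", first)
	fmt.Println("strict retry:", strictErr)
	fmt.Println("idempotent retry:", again, err)
}
```

The trade-off is that an idempotent allocator can no longer tell a legitimate retry from two distinct callers that happen to share an ID, which is presumably what the strict check guards against.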

Third is:
- [x] if `stop` cannot find the container's Task, it returns `container not found`
This is probably widespread in our codebase, and other commands may also fail for the same reason.
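
As an illustration of the distinction at stake, the sketch below separates "the container does not exist" from "the container exists but has no Task". It uses toy types only: `fakeRuntime`, `stopNaive` and `stopTolerant` are hypothetical and do not reflect the nerdctl or containerd client APIs.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("not found")

// fakeRuntime is a hypothetical stand-in: containers holds the known
// container records, tasks holds those that currently have a Task.
type fakeRuntime struct {
	containers map[string]bool
	tasks      map[string]bool
}

// lookupTask reports errNotFound when the container has no Task.
func (r *fakeRuntime) lookupTask(id string) error {
	if !r.tasks[id] {
		return fmt.Errorf("task %s: %w", id, errNotFound)
	}
	return nil
}

// stopNaive surfaces a missing Task as if the container were missing.
func stopNaive(r *fakeRuntime, id string) error {
	if err := r.lookupTask(id); err != nil {
		return fmt.Errorf("container %s: %w", id, errNotFound)
	}
	delete(r.tasks, id)
	return nil
}

// stopTolerant checks the container record first and treats a missing
// Task as "already stopped" rather than as an error.
func stopTolerant(r *fakeRuntime, id string) error {
	if !r.containers[id] {
		return fmt.Errorf("container %s: %w", id, errNotFound)
	}
	if err := r.lookupTask(id); errors.Is(err, errNotFound) {
		return nil // nothing is running, so there is nothing to stop
	} else if err != nil {
		return err
	}
	delete(r.tasks, id)
	return nil
}

func main() {
	// "foo" exists as a container but lost its Task, as after the
	// containerd restart in the reproduction steps.
	r := &fakeRuntime{
		containers: map[string]bool{"foo": true},
		tasks:      map[string]bool{},
	}
	fmt.Println("naive:", stopNaive(r, "foo"))
	fmt.Println("tolerant:", stopTolerant(r, "foo"))
}
```

Whether a missing Task should be a silent success or a distinct, clearly worded error is a design choice; the sketch only shows that the two cases can be told apart.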

Issues 2 and 3 might be related.

I'll look into these and figure out whether we can fix or work around them, then test with different network types, reboots, and also containerd v2.


cc @AkihiroSuda we should flag this as urgent: although this is apparently not new, it is a pretty bad set of issues.

### What version of nerdctl are you using?

main

### Are you using a variant of nerdctl? (e.g., Rancher Desktop)

None

### Host information

```
Client:
 Namespace:	default
 Debug Mode:	false

Server:
 Server Version: v1.7.16
 Storage Driver: overlayfs
 Logging Driver: json-file
  Cgroup Driver:  : systemd
  Cgroup Version: : 2
 Plugins:
  Log:     fluentd journald json-file syslog
  Storage: native overlayfs stargz fuse-overlayfs
 Security Options:
  apparmor
  seccomp
   Profile:	builtin
  cgroupns
  rootless
 Kernel Version:   6.8.0-41-generic
 Operating System: Ubuntu 24.04 LTS
 OSType:           linux
 Architecture:     aarch64
 CPUs:             4
 Total Memory:     3.814GiB
 Name:             lima-default
 ID:               cd6896f4-2884-435e-b455-72137115b4fe

WARNING: AppArmor profile "nerdctl-default" is not loaded.
         Use 'sudo nerdctl apparmor load' if you prefer to use AppArmor with rootless mode.
         This warning is negligible if you do not intend to use AppArmor.
```

Container errors on restart #3352
