Description
I'm seeing an issue where a Linux client using libnvme and nvme-cli falls into recursive discovery: it keeps adding discovery controllers whenever a discovery controller is listed in the discovery log. This happens with the latest master as well, although the Linux kernel version is 5.14. It implies that the host is not able to detect the duplicate controller.
I see this problem only when the target sends an AEN, not when I manually connect to the same discovery controller twice from the host.
In the nvmf autoconnect rules we have:
# Events from persistent discovery controllers or nvme-fc transport events
# NVME_AEN:
# type 0x2 (NOTICE) info 0xf0 (DISCOVERY_LOG_CHANGE) log-page-id 0x70 (DISCOVERY_LOG_PAGE)
ACTION=="change", SUBSYSTEM=="nvme", ENV{NVME_AEN}=="0x70f002", \
ENV{NVME_TRTYPE}=="*", ENV{NVME_TRADDR}=="*", \
ENV{NVME_TRSVCID}=="*", ENV{NVME_HOST_TRADDR}=="*", ENV{NVME_HOST_IFACE}=="*", \
RUN+="@SYSTEMCTL@ --no-block restart nvmf-connect@--device\x3d$kernel\t--transport\x3d$env{NVME_TRTYPE}\t--traddr\x3d$env{NVME_TRADDR}\t--trsvcid\x3d$env{NVME_TRSVCID}\t--host-traddr\x3d$env{NVME_HOST_TRADDR}\t--host-iface\x3d$env{NVME_HOST_IFACE}.service"
If I remove \t--host-traddr\x3d$env{NVME_HOST_TRADDR}\t--host-iface\x3d$env{NVME_HOST_IFACE} from this rule, I do not see the problem. The problem seems to be in how we interpret these variables. By default the rules set them to 'none', and the code treats that as the literal string 'none' rather than NULL, which makes the candidate match fail in libnvme's controller lookup.
In libnvme, when we check whether two controllers are the same, we take this path:
static bool _tcp_opt_params_match(struct nvme_ctrl *c, struct candidate_args *candidate)
{
	char *src_addr, buffer[INET6_ADDRSTRLEN];

	/* Check if src_addr is available (kernel 6.1 or later) */
	src_addr = nvme_ctrl_get_src_addr(c, buffer, sizeof(buffer));
	if (!src_addr)
		return _tcp_opt_params_match_no_src_addr(c, candidate);

	/* Check host_traddr only if candidate is interested */
	if (candidate->host_traddr &&
	    !candidate->addreq(candidate->host_traddr, src_addr))
		return false;

	/* Check host_iface only if candidate is interested */
	if (candidate->host_iface &&
	    !streq0(candidate->host_iface,
		    nvme_iface_matching_addr(candidate->iface_list, src_addr)))
		return false;

	return true;
}
Here, host_traddr is set to 'none' after AEN processing, so the comparison against src_addr fails and the duplicate controller is not detected.
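To illustrate the mismatch, here is a minimal standalone sketch. It is not libnvme code: opt_or_null() is a hypothetical helper, and host_traddr_matches() is a simplified stand-in for the host_traddr check above that uses strcmp() in place of candidate->addreq(). It assumes that 'none' (or an empty string) is meant to signal "parameter not specified".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of libnvme): map the placeholder "none"
 * and the empty string to NULL, i.e. "parameter not specified".
 */
static const char *opt_or_null(const char *value)
{
	if (!value || value[0] == '\0' || strcmp(value, "none") == 0)
		return NULL;
	return value;
}

/*
 * Simplified stand-in for the host_traddr check in _tcp_opt_params_match():
 * the comparison only runs when the candidate supplied a value.
 * strcmp() is an assumption replacing candidate->addreq().
 */
static bool host_traddr_matches(const char *candidate_host_traddr,
				const char *src_addr)
{
	if (candidate_host_traddr &&
	    strcmp(candidate_host_traddr, src_addr) != 0)
		return false;
	return true;
}
```

With these definitions, host_traddr_matches("none", "192.168.1.10") returns false, which corresponds to the failed duplicate detection described above, while host_traddr_matches(opt_or_null("none"), "192.168.1.10") returns true because the check is skipped. Whether such normalization belongs in the udev rule, in nvme-cli argument parsing, or in libnvme is an open question here.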