scsi: lpfc: Fix incomplete NVME discovery when target

NVMe device re-discovery does not complete. Dev_loss_tmo messages seen on
initiator after recovery from a link disturbance.

The failing case is the following:

When the driver (as a NVME target) receives a PLOGI, the driver initiates
an "unreg rpi" mailbox command. While the mailbox command is in progress,
the driver requests that an ACC be sent to the initiator. The target's ACC
is received by the initiator and the initiator then transmits a PLOGI. The
driver receives the PLOGI prior to receiving the completion for the PLOGI
response WQE that sent the ACC. (Different delivery sources from the hw so
the race is very possible). Given the PLOGI is prior to the ACC completion
(signifying PLOGI exchange complete), the driver LS_RJT's the PRLI. The
"unreg rpi" mailbox then completes. Since PRLI has been received, the
driver transmits a PLOGI to restart discovery, which the initiator then
ACC's.  If the driver processes the (re)PLOGI ACC prior to the completing
the handling for the earlier ACC it sent the intiators original PLOGI,
there is no state change for completion of the (re)PLOGI. The ndlp remains
in "PLOGI Sent" and the initiator continues sending PRLI's which are
rejected by the target until timeout or retry is reached.

Fix by: When in target mode, defer sending an ACC for the received PLOGI
until unreg RPI completes.

Link: https://lore.kernel.org/r/20191218235808.31922-2-jsmart2021@gmail.com
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
1 file changed