author: Sarita Patra <saritap@vmware.com> 2020-08-21 08:33:09 +0200
committer: Sarita Patra <saritap@vmware.com> 2020-08-21 08:36:22 +0200
commit: 6c4d8732e975cf67a815fc97430166bb4d7008aa
tree: 190e5a6189c5036230f7a2f1513dc07ea89d2976 /bgpd/bgp_fsm.c
parent: bgpd: Don't stop hold timer in OpenConfirm State
bgpd: Fix BGP session stuck in OpenConfirm state

Issue:
 1. Initially BGP starts listening on the socket.
 2. The start timer expires, BGP tries to connect to the peer, and the
    peer moves from Idle to Connect (call this peer data structure X).
 3. The connect for X succeeds, so X has moved from Idle to Connect
    with FD-x.
 4. An incoming connection is accepted, and a new peer data structure Y
    is created with FD-y; it moves from Idle to Active.
 5. Y (FD-y) sends an OPEN and moves from Active to OpenSent.
 6. Y (FD-y) receives an OPEN and moves from OpenSent to OpenConfirm.
 7. Meanwhile, X (FD-x) sends an OPEN message and moves from Connect to
    OpenSent.
 8. A keepalive is received for Y (FD-y), and it moves from OpenConfirm
    to Established.
 9. Because Y (FD-y) is an accepted connection, we try to copy all its
    parameters to X and delete Y.
10. During this process the TCP connection for the accepted connection
    (FD-y) goes down, so getting the remote address and port fails.
11. On this failure, bgp_stop is called for both X and Y.
12. By this time all the parameters, including state, of X and Y have
    been exchanged. Y (FD-y) still had state OpenConfirm when it
    entered this function, and that state has been moved to X.
13. bgp_stop stops all timers and takes further action only if the peer
    is in Established state. Since neither X nor Y is in Established
    state (in this function), it simply stops all timers, closes the
    socket, and sets the socket of both peer data structures to -1.
14. Y is deleted because it was created by accepting a connection,
    whereas X is kept because it was created from configuration.
15. X now holds state OpenConfirm without any timers running.
16. As a result, no new incoming connection can ever establish, because
    the configured connection X is stuck in OpenConfirm.

Fix:
While transferring peer data structure Y (FD-y, the accepted connection)
to peer data structure X, if the TCP connection for FD-y goes down, then
 1. Raise the FSM event BGP_Stop for X (clean up with bgp_stop and move
    the state to Idle), and
 2. Raise the FSM event BGP_Stop for Y (clean up with bgp_stop; Y is
    then deleted because it is an accepted connection).

Signed-off-by: Sarita Patra <saritap@vmware.com>
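
The difference between the old and new behaviour can be sketched with a toy model. This is only an illustration, not FRR code: every name below (toy_peer, toy_stop, toy_stop_event, toy_demo) is hypothetical, and the sketch assumes, as the fix description states, that handling a stop event through the FSM runs the cleanup and then moves the peer to Idle, whereas calling the cleanup function directly leaves the status field untouched.

```c
#include <assert.h>

/* Hypothetical toy model; none of these names are FRR identifiers. */
enum toy_state { TOY_IDLE, TOY_OPENCONFIRM, TOY_ESTABLISHED };

struct toy_peer {
	enum toy_state status;
	int fd;             /* -1 means no socket */
	int timers_running; /* 1 while any timer is armed */
};

/* Cleanup only, as described for a peer that is not Established:
 * timers are stopped and the socket is closed, but status is unchanged. */
static void toy_stop(struct toy_peer *p)
{
	p->timers_running = 0;
	p->fd = -1;
}

/* Assumed FSM handling of a stop event: run the cleanup action,
 * then apply the state transition to Idle. */
static void toy_stop_event(struct toy_peer *p)
{
	toy_stop(p);
	p->status = TOY_IDLE;
}

/* Returns 0 when both paths behave as the commit message describes. */
static int toy_demo(void)
{
	struct toy_peer direct = { TOY_OPENCONFIRM, 7, 1 };
	struct toy_peer via_fsm = { TOY_OPENCONFIRM, 7, 1 };

	toy_stop(&direct);        /* old path: direct function call */
	toy_stop_event(&via_fsm); /* new path: event handled by the FSM */

	/* Old path: stuck in OpenConfirm with no timers and no socket. */
	assert(direct.status == TOY_OPENCONFIRM);
	assert(direct.timers_running == 0 && direct.fd == -1);

	/* New path: same cleanup, but the peer is back in Idle. */
	assert(via_fsm.status == TOY_IDLE);
	assert(via_fsm.timers_running == 0 && via_fsm.fd == -1);
	return 0;
}
```

In the toy model the directly stopped peer reproduces steps 13 to 15: it has no timers and no socket but still reports OpenConfirm, so nothing will ever move it forward. The peer stopped through the event path ends in Idle and can start over.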
Diffstat (limited to '')
-rw-r--r-- bgpd/bgp_fsm.c 4
1 files changed, 2 insertions, 2 deletions
diff --git a/bgpd/bgp_fsm.c b/bgpd/bgp_fsm.c
index 357061b44..c8e5a308e 100644
--- a/bgpd/bgp_fsm.c
+++ b/bgpd/bgp_fsm.c
@@ -304,8 +304,8 @@ static struct peer *peer_xfer_conn(struct peer *from_peer)
 				 ? "accept"
 				 : ""),
 			peer->host, peer->fd, from_peer->fd);
-		bgp_stop(peer);
-		bgp_stop(from_peer);
+		BGP_EVENT_ADD(peer, BGP_Stop);
+		BGP_EVENT_ADD(from_peer, BGP_Stop);
 		return NULL;
 	}
 	if (from_peer->status > Active) {