Internet-Draft | BGP BFD Strict-Mode | July 2024 |
Zheng, et al. | Expires 6 January 2025 |
This document specifies extensions to RFC4271 BGP-4 that enable a BGP speaker to negotiate additional Bidirectional Forwarding Detection (BFD) extensions using a BGP capability. This BFD Strict-Mode Capability enables a BGP speaker to prevent a BGP session from being established until a BFD session is established. This is referred to as BFD "strict-mode".¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 6 January 2025.¶
Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Bidirectional Forwarding Detection (BFD) [RFC5880] enables routers to monitor data plane connectivity and to detect faults in the bidirectional forwarding path between them. This functionality is leveraged by routing protocols such as BGP [RFC4271] to rapidly react to topology changes in the face of path failures.¶
The BFD interaction with BGP is specified in Section 10.2 of [RFC5882]. When BFD is enabled for a BGP neighbor, faults in the bidirectional forwarding path detected by BFD result in BGP session termination. In some failure scenarios, the network can be in a state in which a BGP session can be established but a BFD session cannot. In other scenarios, it may be possible to establish a BGP session, but a degraded or poor-quality link may cause the corresponding BFD session to go up and down frequently.¶
To avoid situations that result in routing churn and to minimize the impact of network interruptions, it is beneficial to prevent BGP from establishing a session until the BFD session is successfully established and has stabilized. We refer to this mode of operation as BFD "strict-mode". However, always using "strict-mode" would preclude BGP operation in an environment where not all routers support BFD strict-mode or have BFD enabled.¶
This document defines BFD "strict-mode" operation as preventing BGP session establishment until both the local and remote speakers have an established BFD session. The document also specifies a BGP capability [RFC5492] for announcing BFD parameters including a BGP speaker's support for "strict-mode"; i.e., requiring a BFD session for BGP session establishment.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
Defined in this document:¶
Defined in RFC 5880:¶
The BFD Strict-Mode Capability is a BGP Capability [RFC5492] defined as follows:¶
Capability code: 74¶
Capability length: 0 octets¶
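As an illustration, the capability could be placed on the wire as shown below. This is a minimal sketch that assumes the Capabilities Optional Parameter framing (Parameter Type 2) from [RFC5492]; the constant and function names are hypothetical and are not defined by this document.

```python
# Illustrative sketch only; all names here are hypothetical.
BFD_STRICT_MODE_CAP_CODE = 74   # Capability code defined in this document
CAPABILITIES_PARAM_TYPE = 2     # Capabilities Optional Parameter type [RFC5492]


def encode_bfd_strict_capability() -> bytes:
    """Encode the capability as Code (1 octet), Length (1 octet), no value."""
    return bytes([BFD_STRICT_MODE_CAP_CODE, 0])


def wrap_in_optional_parameter(capabilities: bytes) -> bytes:
    """Wrap encoded capabilities in a Capabilities Optional Parameter."""
    if len(capabilities) > 255:
        raise ValueError("capabilities do not fit in a one-octet length field")
    return bytes([CAPABILITIES_PARAM_TYPE, len(capabilities)]) + capabilities


param = wrap_in_optional_parameter(encode_bfd_strict_capability())
print(param.hex())  # 02024a00
```

Because the capability length is zero, the presence of the code alone carries the signal.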
A BGP speaker which supports capabilities advertisement and has BFD strict-mode enabled MUST include the BFD Strict-Mode Capability in its OPEN message.¶
A BGP speaker which supports the BFD Strict-Mode Capability examines the list of capabilities received from its peer. If both the local and remote BGP speakers include the BFD Strict-Mode Capability, the BfdStrictNegotiated session attribute (Section 3 below) is set to TRUE.¶
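The negotiation rule above can be sketched as follows. The example assumes that the capabilities from each OPEN message are already available as a concatenated sequence of (code, length, value) triplets per [RFC5492]; the helper names are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical.
BFD_STRICT_MODE_CAP_CODE = 74


def parse_capability_codes(data: bytes) -> set[int]:
    """Return the set of capability codes found in concatenated capabilities."""
    codes = set()
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated capability header")
        code, length = data[offset], data[offset + 1]
        if offset + 2 + length > len(data):
            raise ValueError("truncated capability value")
        codes.add(code)
        offset += 2 + length  # skip code, length, and value octets
    return codes


def bfd_strict_negotiated(local_codes: set[int], remote_codes: set[int]) -> bool:
    """TRUE only if both speakers included the BFD Strict-Mode Capability."""
    return (BFD_STRICT_MODE_CAP_CODE in local_codes
            and BFD_STRICT_MODE_CAP_CODE in remote_codes)


# Remote peer advertises Multiprotocol (code 1, IPv4 unicast) and BFD Strict-Mode.
remote = parse_capability_codes(bytes([1, 4, 0, 1, 0, 1, 74, 0]))
local = parse_capability_codes(bytes([74, 0]))
print(bfd_strict_negotiated(local, remote))  # True
```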
Implementations SHOULD start the BFD session associated with the BGP BFD strict-mode session prior to the BGP FSM starting. The motivation is to avoid delaying BGP FSM transitions while waiting for the BFD session to reach the Up state.¶
Similarly, to support BFD hold-down requirements for detecting BFD session stability (see Section 10), implementations SHOULD NOT immediately destroy BFD sessions when associated BGP connections transition to Idle.¶
When BFD is enabled, and BFD strict-mode is enabled and negotiated, the BGP finite state machine is prevented from sending a KEEPALIVE to the remote BGP speaker and advancing to the OpenConfirm state until the associated BFD session has reached the Up state.¶
In the FSM defined in [RFC4271], sending a KEEPALIVE to the remote BGP speaker and advancing to the OpenConfirm state happen:¶
For each of these scenarios, when BFD is enabled, and BFD strict-mode is negotiated, a sub-state is introduced to track the pending BFD Up event:¶
If the BFD strict-mode configuration is changed once the BGP FSM has started executing, but has not reached the Established state, the session is reset to the Idle state to ensure consistent behavior. That is, no unexpected timers are running, and the BGP session's transition to Established is not lingering on a pending event. Once the BGP session has reached the Established state, changes to BFD strict-mode are irrelevant since the work of this feature has been completed.¶
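The configuration-change rule above can be sketched as a function from the current state to the resulting state. This is a simplified model: it captures only the resulting FSM state, not the individual actions taken while resetting, and the enum and function names are hypothetical.

```python
# Illustrative sketch only; models the resulting state, not the reset actions.
from enum import Enum


class BgpState(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


def on_strict_config_changed(state: BgpState) -> BgpState:
    """Return the FSM state after a BFD strict-mode configuration change."""
    if state is BgpState.ESTABLISHED:
        return state       # the change is irrelevant once Established
    return BgpState.IDLE   # started but not Established: reset to Idle


print(on_strict_config_changed(BgpState.OPEN_SENT).value)    # Idle
print(on_strict_config_changed(BgpState.ESTABLISHED).value)  # Established
```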
The following changes are made to the BGP FSM defined in Section 8.2.2 of [RFC4271]:¶
In the "Idle State", the BfdAdminDown, BfdDown, BfdUp, Bfd_Disabled, BfdStrict_ConfigChanged events are ignored.¶
In the "Idle State", the BfdHoldTimer_Expires event is ignored; it would occur only as an error in the FSM implementation.¶
The BfdHoldTimer is reset to zero and stopped on any transition to the Idle state.¶
In response to the BfdAdminDown event (Event TBD-X), the Bfd_Disabled event (Event TBD-X), or the BfdUp event (Event TBD-X), the local system checks to see if it is in the ConnectDelayOpenBfdUpPending sub-state. If the FSM is in the ConnectDelayOpenBfdUpPending sub-state, the local system:¶
if the HoldTimer initial value is non-zero,¶
If the FSM is not in the ConnectDelayOpenBfdUpPending sub-state, the local system:¶
The BfdDown event (Event TBD-X) is ignored while in the Connect state.¶
A BFD session can transition to Down from the Init state, indicating the session has failed to come Up, or transition to Down from the AdminDown state as part of starting the BFD state machine.¶
In response to the BfdHoldTimer_Expires event (Event TBD-X), the local system:¶
In response to the BfdStrict_ConfigChanged event (Event TBD-X), the local system:¶
In the "Connect State", the handling of Event 20 (an OPEN message is received while the DelayOpenTimer is running) is revised as follows:¶
Old Text:¶
if the HoldTimer initial value is non-zero,¶
else, if the HoldTimer initial value is zero,¶
If the value of the autonomous system field is the same as the local Autonomous System number, set the connection status to an internal connection; otherwise it will be "external".¶
New Text:¶
If BfdEnabled is TRUE, and BfdStrictNegotiated is TRUE, and bfd.SessionState is neither Up nor AdminDown,¶
else,¶
If the value of the autonomous system field is the same as the local Autonomous System number, set the connection status to an internal connection; otherwise it will be "external".¶
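The condition that opens the New Text above can be read as a simple predicate. The sketch below only evaluates that condition; it does not model the actions taken in either branch. The numeric state values follow bfd.SessionState in [RFC5880], and the function name is hypothetical.

```python
# Illustrative sketch only; evaluates the gating condition and nothing else.
from enum import Enum


class BfdSessionState(Enum):
    # bfd.SessionState values from [RFC5880]
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


def must_wait_for_bfd(bfd_enabled: bool,
                      bfd_strict_negotiated: bool,
                      session_state: BfdSessionState) -> bool:
    """True when the FSM has to hold in a BfdUpPending sub-state."""
    return (bfd_enabled
            and bfd_strict_negotiated
            and session_state not in (BfdSessionState.UP,
                                      BfdSessionState.ADMIN_DOWN))


print(must_wait_for_bfd(True, True, BfdSessionState.INIT))        # True
print(must_wait_for_bfd(True, True, BfdSessionState.UP))          # False
print(must_wait_for_bfd(True, True, BfdSessionState.ADMIN_DOWN))  # False
print(must_wait_for_bfd(True, False, BfdSessionState.DOWN))       # False
```

The same predicate applies to the revised handling in the Active and OpenSent states, which use the identical condition.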
The BfdHoldTimer is reset to zero and stopped for any transition to the Idle state.¶
In response to the BfdAdminDown event (Event TBD-X), the Bfd_Disabled event (Event TBD-X), or the BfdUp event (Event TBD-X), the local system checks to see if it is in the ActiveDelayOpenBfdUpPending sub-state. If the FSM is in the ActiveDelayOpenBfdUpPending sub-state, the local system:¶
if the HoldTimer initial value is non-zero,¶
If the FSM is not in the ActiveDelayOpenBfdUpPending sub-state, the local system:¶
The BfdDown event (Event TBD-X) is ignored while in the Active state.¶
A BFD session can transition to Down from the Init state, indicating the session has failed to come Up, or transition to Down from the AdminDown state as part of starting the BFD state machine.¶
In response to the BfdHoldTimer_Expires event (Event TBD-X), the local system:¶
In response to the BfdStrict_ConfigChanged event (Event TBD-X), the local system:¶
In the "Active State", the handling of Event 20 (an OPEN message is received while the DelayOpenTimer is running) is revised as follows:¶
Old Text:¶
if the HoldTimer initial value is non-zero,¶
else, if the HoldTimer initial value is zero,¶
If the value of the autonomous system field is the same as the local Autonomous System number, set the connection status to an internal connection; otherwise it will be "external".¶
New Text:¶
If BfdEnabled is TRUE, and BfdStrictNegotiated is TRUE, and bfd.SessionState is neither Up nor AdminDown,¶
else,¶
If the value of the autonomous system field is the same as the local Autonomous System number, set the connection status to an internal connection; otherwise it will be "external".¶
The BfdHoldTimer is reset to zero and stopped for any transition to the Idle state.¶
In response to the BfdAdminDown event (Event TBD-X), the Bfd_Disabled event (Event TBD-X), or the BfdUp event (Event TBD-X), if the FSM is in the OpenSentBfdUpPending sub-state, the local system:¶
If the FSM is not in the OpenSentBfdUpPending sub-state, the local system:¶
In response to the BfdDown event (Event TBD-X):¶
In response to the BfdHoldTimer_Expires event (Event TBD-X), the local system:¶
In response to the BfdStrict_ConfigChanged event (Event TBD-X), the local system:¶
Old Text:¶
When an OPEN message is received, all fields are checked for correctness. If there are no errors in the OPEN message (Event 19), the local system:¶
If the negotiated hold time value is zero, then the HoldTimer and KeepaliveTimer are not started. If the value of the Autonomous System field is the same as the local Autonomous System number, then the connection is an "internal" connection; otherwise, it is an "external" connection.¶
New Text:¶
When an OPEN message is received, all fields are checked for correctness. If there are no errors in the OPEN message (Event 19), the local system:¶
If BfdEnabled is TRUE, and BfdStrictNegotiated is TRUE, and bfd.SessionState is neither Up nor AdminDown,¶
else,¶
If the negotiated hold time value is zero, then the HoldTimer and KeepaliveTimer are not started. If the value of the Autonomous System field is the same as the local Autonomous System number, then the connection is an "internal" connection; otherwise, it is an "external" connection.¶
The BfdAdminDown, Bfd_Disabled, and BfdUp events are ignored in the OpenConfirm state.¶
In response to the BfdDown event (Event TBD-X):¶
In response to the BfdStrict_ConfigChanged event (Event TBD-X), the local system:¶
The BfdAdminDown, Bfd_Disabled, and BfdUp events are ignored in the Established state.¶
In response to the BfdDown event (Event TBD-X), the local system:¶
The BfdStrict_ConfigChanged event is ignored in the Established state.¶
The BfdHoldTimer_Expires event in the Established state is an FSM error, and is ignored.¶
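For reference, the events that the text above explicitly describes as ignored can be collected into a lookup table. This is a sketch for implementers who dispatch events by state; it lists only what the text states as ignored, and the table and function names are hypothetical.

```python
# Illustrative sketch only; lists events the text explicitly marks as ignored.
IGNORED_BFD_EVENTS = {
    "Idle": {"BfdAdminDown", "BfdDown", "BfdUp", "Bfd_Disabled",
             "BfdStrict_ConfigChanged", "BfdHoldTimer_Expires"},
    "Connect": {"BfdDown"},
    "Active": {"BfdDown"},
    "OpenSent": set(),  # no BFD event is described as ignored in OpenSent
    "OpenConfirm": {"BfdAdminDown", "Bfd_Disabled", "BfdUp"},
    "Established": {"BfdAdminDown", "Bfd_Disabled", "BfdUp",
                    "BfdStrict_ConfigChanged", "BfdHoldTimer_Expires"},
}


def is_ignored(state: str, event: str) -> bool:
    """Return True if the event is ignored in the given BGP FSM state."""
    return event in IGNORED_BFD_EVENTS[state]


print(is_ignored("Connect", "BfdDown"))      # True
print(is_ignored("Established", "BfdDown"))  # False
```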
When BGP sessions are closed according to the procedures in this document, the session SHOULD be terminated with a NOTIFICATION message with the Cease Code (6) and the "BFD Down" Subcode (10); see [RFC9384]. This informs the operator that interaction with BFD is the root cause of the BGP session being unable to move to the Established state.¶
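A NOTIFICATION message carrying this code and subcode could be built as sketched below. The example assumes the standard BGP message header from [RFC4271] (16-octet marker, 2-octet length, 1-octet type, with type 3 for NOTIFICATION) and sends no Data field; the names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical.
import struct

BGP_MARKER = b"\xff" * 16      # all-ones marker [RFC4271]
BGP_HEADER_LEN = 19            # marker + length + type
BGP_MSG_NOTIFICATION = 3       # NOTIFICATION message type [RFC4271]
ERROR_CODE_CEASE = 6
SUBCODE_BFD_DOWN = 10          # "BFD Down" subcode [RFC9384]


def build_bfd_down_notification() -> bytes:
    """Build a NOTIFICATION with Cease / BFD Down and an empty Data field."""
    body = bytes([ERROR_CODE_CEASE, SUBCODE_BFD_DOWN])
    length = BGP_HEADER_LEN + len(body)
    header = BGP_MARKER + struct.pack("!HB", length, BGP_MSG_NOTIFICATION)
    return header + body


msg = build_bfd_down_notification()
print(len(msg))        # 21
print(msg[16:].hex())  # 001503060a
```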
The use of BFD strict-mode along with mechanisms such as hold-down (a delay in the initial BGP Establishment state following BFD session establishment) and/or dampening (a delay in the BGP Establishment state following failure detected by BFD) may help reduce the frequency of BGP session flaps and therefore reduce the associated routing churn.¶
To avoid deadlock when utilizing both BFD hold-down and BFD strict-mode, when strict-mode is enabled for a peer, the BGP FSM MUST be enabled. That is, BFD hold-down procedures MUST NOT prevent BGP from establishing a connection with the remote BGP speaker.¶
If both the local and remote BGP speakers include the BFD Strict-Mode Capability, the BGP state machine is permitted to transition to the Established state from the OpenConfirm state after the locally configured BFD hold-down interval is observed. That is, the BFD session has been Up for the desired amount of time.¶
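One way to track whether the hold-down interval has been observed is to record when the BFD session last came Up. The sketch below is a minimal model under that assumption; timestamps are passed in explicitly (for example, from a monotonic clock), and the class and method names are hypothetical.

```python
# Illustrative sketch only; class and method names are hypothetical.
from typing import Optional


class BfdHoldDown:
    """Tracks how long the BFD session has been continuously Up."""

    def __init__(self, interval_s: float) -> None:
        self.interval_s = interval_s
        self.up_since: Optional[float] = None

    def on_bfd_up(self, now: float) -> None:
        if self.up_since is None:   # keep the earliest Up time
            self.up_since = now

    def on_bfd_down(self) -> None:
        self.up_since = None        # any Down restarts the interval

    def satisfied(self, now: float) -> bool:
        """True once BFD has been Up for at least the hold-down interval."""
        return (self.up_since is not None
                and now - self.up_since >= self.interval_s)


hold = BfdHoldDown(interval_s=30.0)
hold.on_bfd_up(now=100.0)
print(hold.satisfied(now=110.0))  # False
print(hold.satisfied(now=130.0))  # True
```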
It is RECOMMENDED that the BFD hold-down intervals used with BFD strict-mode, when configured, use similar values. Similarly, the negotiated BGP holdtime SHOULD be long enough to account for the time between the BGP FSM reaching the OpenConfirm state, the BFD hold-down interval, and any delay for the BFD session being initiated. Failure to do so can result in the BGP speaker that has transitioned to the Established state expiring its BGP holdtime and closing the connection. This is because the remote BGP speaker hasn't transitioned to Established and begun sending KEEPALIVE messages.¶
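The holdtime recommendation above can be expressed as a simple sanity check that an implementation or configuration tool might run. This sketch assumes all values are in seconds and treats a negotiated hold time of zero as "HoldTimer not started"; the function and parameter names are hypothetical.

```python
# Illustrative sketch only; function and parameter names are hypothetical.
def holdtime_covers_holddown(bgp_holdtime_s: float,
                             bfd_holddown_s: float,
                             bfd_setup_delay_s: float = 0.0) -> bool:
    """Check that the BGP holdtime outlasts BFD setup plus hold-down."""
    if bgp_holdtime_s == 0:
        return True  # hold time of zero: the HoldTimer is not started
    return bgp_holdtime_s > bfd_holddown_s + bfd_setup_delay_s


print(holdtime_covers_holddown(90, 30, 5))  # True
print(holdtime_covers_holddown(9, 30))      # False
```

A check like the second call flags a configuration in which the speaker that reaches Established first is likely to expire its holdtime before the peer starts sending KEEPALIVE messages.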
A BGP speaker SHOULD log a message if it closes its session due to hold timer expiration while waiting for the BFD hold-down interval.¶
The behavior of BGP speakers implementing BFD hold-down without negotiating the BFD strict-mode feature is out of scope of this document. However, the authors are aware that inconsistent behaviors in BGP implementations for BFD hold-down without BFD strict-mode may result in BGP session deadlock.¶
Auto-configuration is possible for enabling BFD strict-mode. However, the configuration automation is out of the scope of this document.¶
To simplify troubleshooting and avoid inconsistencies, it is RECOMMENDED that BFD strict-mode configuration be consistent for both BGP peers.¶
This document introduces sub-states in the existing BGP finite state machine for tracking BFD session status inputs for strict-mode operation. Implementations SHOULD provide visibility for these sub-states in their display of the BGP finite state machine.¶
The mechanism defined in this document interacts with the BGP finite state machine when so configured. The security considerations for BFD thus become BGP-4 considerations [RFC4271] when so used. Given that a BFD session is required for a BGP session, a Denial-of-Service (DoS) attack on BGP can now be mounted by preventing a BFD session between the BGP peers from reaching the Up state, or by interrupting an existing BFD session. The use of a BFD Authentication mechanism, some of which are defined in [RFC5880], is thus RECOMMENDED when BFD is used to protect BGP-4 [RFC4271].¶
This document defines the BFD Strict-Mode Capability. The Capability Code 74 has been assigned from the First-Come-First-Served range (64-238) of the Capability Codes registry.¶
The authors would like to acknowledge the review and inputs from Shyam Sethuram, Mohammed Mirza, Bruno Decraene, Carlos Pignataro, Enke Chen, Anup Kumar, and Ketan Talalukar.¶