BGP (Border Gateway Protocol) is the main dynamic routing protocol used across the Internet. Routers that run BGP exchange information about reachable networks. Along with each network, they pass a set of attributes that BGP uses to select the best path and to apply routing policies.
One of the main attributes sent with a route is the list of autonomous systems that the route advertisement has passed through. This list lets BGP determine in which autonomous system the network is located and prevent routing loops, and it can also be used when setting policies.
Routing is carried out step by step, from one autonomous system to another. BGP policies are mostly set relative to external (neighboring) autonomous systems; that is, they describe the rules for interacting with them. Because BGP operates on large volumes of data (the current IPv4 table holds more than 580,000 routes), the principles of its configuration and operation differ from those of interior gateway protocols (IGPs).
Interior Gateway Protocol (IGP) – a protocol used to exchange routing information within an autonomous system.
Exterior Routing Protocol – a protocol used to exchange routing information between autonomous systems.
Autonomous System (AS) – a set of routers with common routing rules, managed by one technical administration and running a single IGP (several IGPs can also be used for routing within an AS).
Transit autonomous system (transit AS) – an autonomous system through which traffic is sent to other autonomous systems.
Path – the sequence of autonomous system numbers that must be traversed to reach the destination network.
Path attributes (PA) – path characteristics that help to choose the best path.
BGP speaker – a router that runs BGP.
Neighbors, peers – any two routers between which a TCP connection for exchanging routing information is open.
Network Layer Reachability Information (NLRI) – an IP prefix and its prefix length.
BGP selects the best routes based on routing policies rather than on technical characteristics of the path (bandwidth, delay, and so on). In local networks, what matters most is convergence speed, that is, the reaction time to changes. When selecting a route, routers running interior dynamic routing protocols usually compare technical characteristics of the path, such as bandwidth. When choosing between two providers, however, it is the company's internal rules that matter rather than the quality of each provider's connections. In BGP, therefore, the best path is chosen based on policies, which are configured through prefix filters, the announcement of specific routes, and the manipulation of BGP attributes.
Like other dynamic routing protocols, BGP forwards traffic based only on the destination IP address. This means that BGP cannot apply routing rules based on parameters such as the packet's source address or the originating application. If routing must be based on criteria other than the destination address, Policy-Based Routing (PBR) must be used.
Main protocol characteristics
BGP is a path-vector protocol with the following common characteristics:
- Uses TCP (port 179) to transfer data, which provides reliable delivery of BGP updates
- Sends updates only when changes occur in the network (no periodic updates)
- Periodically sends keepalive messages to check the TCP connection
- The protocol's metric is called the path vector, or attributes
Autonomous System (AS) – a collection of IP networks and routers under the control of one or a few network operators that has a single, clearly defined routing policy (RFC 1930).
Autonomous systems number (ASN) range:
0-65535 (initially designed for the 16-bit ASNs)
65536-4294967295 (a new range for the 32-bit ASNs (RFC 4893))
0 and 65535 (reserved)
1-64495 (public range)
65552-4294967295 (public range; RFC 6996 later set aside 4200000000-4294967294 within it for private use)
64512-65534 (private range)
23456 (AS_TRANS: a reserved 16-bit ASN that stands in for a 32-bit ASN when talking to devices that support only 16-bit ASNs)
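The ranges above can be turned into a small lookup. The following is a minimal sketch, not an authoritative registry: the function name `classify_asn` and the returned labels are hypothetical, the private 32-bit block comes from RFC 6996 rather than from the original list, and numbers that fall into gaps the list does not describe are reported as "unlisted".

```python
def classify_asn(asn: int) -> str:
    """Classify an AS number using the ranges listed in the text."""
    if not 0 <= asn <= 4294967295:
        raise ValueError(f"not a valid 32-bit ASN: {asn}")
    if asn in (0, 65535):
        return "reserved"
    if asn == 23456:
        return "as_trans"  # placeholder ASN for 16-bit-only speakers
    if 64512 <= asn <= 65534:
        return "private"
    # Assumption beyond the original list: private 32-bit block (RFC 6996).
    if 4200000000 <= asn <= 4294967294:
        return "private"
    if 1 <= asn <= 64495 or 65552 <= asn <= 4294967295:
        return "public"
    return "unlisted"  # gaps that the list in the text does not describe
```

For example, `classify_asn(64512)` falls into the private range, while `classify_asn(3356)` is classified as public.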
The BGP operational algorithm
- Neighbor table – a list of all BGP neighbors
- BGP Table (forwarding database, topology database):
  - List of networks received from each neighbor
  - Might contain multiple paths to a destination network
  - Contains BGP attributes for each path
- Routing table – the list of the best routes to destination networks
By default, BGP sends keepalive messages every 60 seconds. If there are multiple paths to a destination, the router will advertise the best route from the BGP table, rather than all available options.
Internal BGP and external BGP
⦁ Internal BGP (iBGP) – BGP running within an autonomous system. iBGP neighbors do not necessarily have to be directly connected.
⦁ External BGP (eBGP) – BGP running between autonomous systems. By default, eBGP neighbors must be directly connected.
If iBGP routers operate in a transit AS, they should be connected in a full mesh. This follows from how the protocol works: when a router at the edge of the AS receives an update, it sends it to all its neighbors; neighbors inside the autonomous system do not forward that update to other iBGP routers, because they assume that all neighbors within the AS have already received it.
⦁ Keepalive Interval – time interval (in seconds) between sending keepalive messages. By default, it is set to 60 seconds.
⦁ Hold Time – time interval (in seconds) after which the neighbor is considered unreachable. By default, it is set to 180 seconds.
Types of BGP messages
All BGP messages have the following header format:
|<-------------------------- 32 bit --------------------------->|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                                                               |
+                                                               +
|                           Marker                              |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Length               |      Type     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
BGP message header fields:
- Marker – a field that is kept in the header for compatibility. The field is 16 bytes long, and every bit must be set to 1 (each byte is 0xFF).
- Length – the length of the entire message in bytes, including the header. The field value can be from 19 to 4096.
- Type – the type of message sent:
1 – OPEN
2 – UPDATE
3 – NOTIFICATION
4 – KEEPALIVE
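The header layout described above can be decoded with a few lines of code. This is a minimal sketch, assuming the message arrives as a complete byte string; the function name `parse_header` is hypothetical, and type code 1 (OPEN) is taken from RFC 4271.

```python
import struct

HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def parse_header(data: bytes) -> tuple[int, str]:
    """Return (message length, message type name) from a BGP header."""
    if len(data) < HEADER_LEN:
        raise ValueError("a BGP header needs at least 19 bytes")
    # "!" selects network byte order with no padding between fields.
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != b"\xff" * 16:
        raise ValueError("marker must consist of all ones")
    if not HEADER_LEN <= length <= 4096:
        raise ValueError(f"length out of range: {length}")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    return length, MESSAGE_TYPES[msg_type]
```

A real implementation would read `length` bytes from the TCP stream before handing the body to a type-specific parser.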
Open – used to establish the neighbor relationship and exchange basic parameters. It is sent immediately after the TCP connection is established.
Open message format:
|<-------------------------- 32 bit --------------------------->|
+-+-+-+-+-+-+-+-+
|    Version    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     My Autonomous System      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Hold Time           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         BGP Identifier                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Opt Parm Len  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
|             Optional Parameters (variable)                    |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
In addition to the standard BGP header, the Open message contains the following fields:
- Version – the BGP protocol version
- My Autonomous System – the Autonomous System Number (ASN) of the sender
- Hold Time – the maximum time in seconds that may elapse between the receipt of successive Keepalive or Update messages. The two neighbors use the smaller of the values they propose.
- BGP Identifier – the Router ID of the sender. It is also used to decide which connection to keep when two TCP connections are opened between the same pair of BGP neighbors at the same time.
- Optional Parameters Length – the total length of the Optional Parameters field in bytes. If it equals 0, the Optional Parameters field is empty.
- Optional Parameters – a list of optional parameters, each encoded as a type, a length and a value. In the original specification (RFC 1771) an authentication parameter carried here determined the contents of the Marker field; RFC 4271 deprecated this, and the Marker is now always all ones.
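The fixed part of the Open message is 10 bytes long, so it can be unpacked in one call. The sketch below assumes the 19-byte header has already been removed; `parse_open` and the dictionary keys are hypothetical names, and the optional parameters are returned undecoded.

```python
import ipaddress
import struct

OPEN_FIXED_LEN = 10  # version(1) + my AS(2) + hold time(2) + ID(4) + opt len(1)


def parse_open(body: bytes) -> dict:
    """Decode the body of an Open message (header already stripped)."""
    if len(body) < OPEN_FIXED_LEN:
        raise ValueError("Open body is shorter than 10 bytes")
    version, my_as, hold_time, bgp_id, opt_len = struct.unpack(
        "!BHH4sB", body[:OPEN_FIXED_LEN]
    )
    if len(body) != OPEN_FIXED_LEN + opt_len:
        raise ValueError("Optional Parameters Length does not match the body")
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        # The BGP Identifier is conventionally shown in dotted-quad form.
        "bgp_identifier": str(ipaddress.IPv4Address(bgp_id)),
        "optional_parameters": body[OPEN_FIXED_LEN:],
    }
```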
Update – used to exchange routing information.
Update message format:
+-----------------------------------------------------+
|   Unfeasible Routes Length (2 octets)               |
+-----------------------------------------------------+
|  Withdrawn Routes (variable)                        |
+-----------------------------------------------------+
|   Total Path Attribute Length (2 octets)            |
+-----------------------------------------------------+
|  Path Attributes (variable)                         |
+-----------------------------------------------------+
|   Network Layer Reachability Information (variable) |
+-----------------------------------------------------+
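The Update layout above can be split into its three parts as follows. This is a sketch under stated assumptions: it handles IPv4 only, it relies on the RFC 4271 prefix encoding (one byte with the prefix length in bits, followed by just enough bytes to hold the prefix), it leaves the path attributes undecoded, and the function names are hypothetical.

```python
import ipaddress
import struct


def parse_prefixes(data: bytes) -> list[str]:
    """Decode a run of (length-in-bits, prefix bytes) entries."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8  # bytes needed to hold `bits` bits
        raw = data[i + 1:i + 1 + nbytes]
        if bits > 32 or len(raw) != nbytes:
            raise ValueError("malformed prefix")
        # Pad the truncated prefix back to a full 4-byte address.
        address = ipaddress.IPv4Address(raw.ljust(4, b"\x00"))
        prefixes.append(f"{address}/{bits}")
        i += 1 + nbytes
    return prefixes


def parse_update(body: bytes) -> dict:
    """Split an Update body (header already stripped) into its parts."""
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    withdrawn_end = 2 + withdrawn_len
    (attrs_len,) = struct.unpack_from("!H", body, withdrawn_end)
    attrs_end = withdrawn_end + 2 + attrs_len
    if attrs_end > len(body):
        raise ValueError("attribute length exceeds message body")
    return {
        "withdrawn": parse_prefixes(body[2:withdrawn_end]),
        "path_attributes": body[withdrawn_end + 2:attrs_end],
        "nlri": parse_prefixes(body[attrs_end:]),
    }
```

Note that the NLRI field has no length of its own: it is whatever remains after the path attributes.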
Notification – used when BGP errors occur. The session with the neighbor is torn down after sending the message.
Notification message format:
|<-------------------------- 32 bit --------------------------->|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Error code    | Error subcode |             Data              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
In addition to the standard BGP header, the Notification message has the following fields:
- Error Code – the type of alert:
1 – Message Header Error
2 – OPEN Message Error
3 – UPDATE Message Error
4 – Hold Timer Expired
5 – Finite State Machine Error
6 – Cease
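A Notification body can be decoded along the same lines. The sketch below assumes the header has already been stripped; `parse_notification` is a hypothetical name, error code 1 (Message Header Error) is taken from RFC 4271, and subcodes are returned as plain numbers because the text does not list them.

```python
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}


def parse_notification(body: bytes) -> tuple[str, int, bytes]:
    """Return (error name, error subcode, diagnostic data)."""
    if len(body) < 2:
        raise ValueError("Notification body needs at least 2 bytes")
    code, subcode = body[0], body[1]
    name = ERROR_CODES.get(code, f"Unknown ({code})")
    # Everything after the two code bytes is variable-length data.
    return name, subcode, body[2:]
```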
Keepalive – used to maintain the BGP neighbor relationship and to detect inactive neighbors. A Keepalive message consists of the message header only (19 octets). If the keepalive interval is set to 0, keepalive messages are not sent.
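Because a Keepalive is just a header, building one is a good illustration of the framing. This is a minimal sketch; `build_message` and `build_keepalive` are hypothetical helper names.

```python
import struct

MARKER = b"\xff" * 16
KEEPALIVE = 4


def build_message(msg_type: int, body: bytes = b"") -> bytes:
    """Prefix a message body with the 19-byte BGP header."""
    length = 19 + len(body)
    if length > 4096:
        raise ValueError("BGP messages are limited to 4096 bytes")
    return struct.pack("!16sHB", MARKER, length, msg_type) + body


def build_keepalive() -> bytes:
    """A Keepalive carries no body, only the header."""
    return build_message(KEEPALIVE)
```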
BGP Neighbor Relationship
In order to establish a neighbor relationship in BGP, each neighbor must be configured manually. When a neighbor is specified on the local router, the neighbor's autonomous system (AS) must be specified as well. This information is used to identify the neighbor type:
- Internal BGP neighbor (iBGP neighbor) – a router that is in the same autonomous system as the local router. Internal BGP neighbors do not necessarily have to be directly connected.
- External BGP neighbor (eBGP neighbor) – a router that is in a different autonomous system from the local router. External BGP neighbors must be directly connected by default.
The type of BGP neighbor has little effect on how the neighbor relationship is established. More significant differences between neighbor types appear when BGP updates are sent and routes are added to the routing table.
During the neighbor relationship establishment, BGP performs the following checks:
- The router must receive a TCP connection request whose source address it finds in its list of neighbors (neighbor command).
- The AS number of the local router must match the AS number configured for it on the neighboring router with the neighbor remote-as command (this requirement does not apply to confederation configurations).
- The Router IDs (RID) of the two routers must not be the same.
- If authentication is configured, the BGP neighbors must pass it.
|For the first check, there is one particularity: it is enough for only one of the two routers to have the IP address it uses as its update source specified in the other router's neighbor command.|
BGP also checks the keepalive and hold timers, but a mismatch in these parameters does not prevent the neighbor relationship from being established. If the timers do not match, each router uses the smaller hold timer value.
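The timer rule can be written down directly. The sketch below makes two assumptions that go beyond the text: the keepalive interval is derived as one third of the negotiated hold time (the ratio suggested in RFC 4271, which also matches the 60/180 defaults), and a hold time of 1 or 2 seconds is rejected as RFC 4271 requires. The function name `negotiate_timers` is hypothetical.

```python
def negotiate_timers(local_hold: int, remote_hold: int) -> tuple[int, int]:
    """Return (hold time, keepalive interval) in seconds for a session."""
    hold = min(local_hold, remote_hold)  # the smaller proposal wins
    if hold == 0:
        return 0, 0  # hold time 0 disables keepalives altogether
    if hold < 3:
        raise ValueError("hold time must be 0 or at least 3 seconds")
    # Assumption: keepalive interval is one third of the hold time.
    return hold, hold // 3
```

With the defaults on one side and a 90-second hold time on the other, `negotiate_timers(180, 90)` gives a 90-second hold time and a 30-second keepalive interval.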
Peering states with BGP neighbors
- Idle
- Connect
- Active
- Open sent
- Open confirm
- Established
|State||TCP waiting?||TCP initialization?||TCP established?||Is Open sent?||Is Open received?||Is Neighbor Up?|
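The progression through the peering states can be sketched as a small transition table. This is a heavily simplified illustration, not the full finite state machine of RFC 4271: the six state names follow that RFC, the event names are hypothetical, timers are ignored, and any event that is not listed is treated as an error that drops the session back to Idle.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "Open sent"
    OPEN_CONFIRM = "Open confirm"
    ESTABLISHED = "Established"


# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    (State.IDLE, "start"): State.CONNECT,
    (State.CONNECT, "tcp_established"): State.OPEN_SENT,
    (State.CONNECT, "tcp_failed"): State.ACTIVE,
    (State.ACTIVE, "tcp_established"): State.OPEN_SENT,
    (State.OPEN_SENT, "open_received"): State.OPEN_CONFIRM,
    (State.OPEN_CONFIRM, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "keepalive_received"): State.ESTABLISHED,
    (State.ESTABLISHED, "update_received"): State.ESTABLISHED,
}


def next_state(state: State, event: str) -> State:
    """Follow the table; unlisted events reset the session to Idle."""
    return TRANSITIONS.get((state, event), State.IDLE)
```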
BGP path attributes are divided into four categories:
- Well-known mandatory – all routers running BGP should recognize them and they must be present in all updates.
- Well-known discretionary – all routers running BGP should recognize them. They may be present in the updates, but it is not mandatory.
- Optional transitive – attributes may not be recognized by all implementations of BGP. If the router does not recognize the attribute, it marks the update as partial and sends it on to its neighbors, keeping the unrecognized attribute.
- Optional non-transitive – attributes may not be recognized by all implementations of BGP. If the router does not recognize the attribute, the attribute is ignored and discarded when sending to peers.
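On the wire, the category is carried in the flags byte that precedes each attribute. The sketch below relies on the bit layout from RFC 4271 (0x80 optional, 0x40 transitive, 0x20 partial), which the text above does not spell out; the function names are hypothetical. The flags cannot distinguish mandatory from discretionary well-known attributes, so both are reported as "well-known".

```python
from typing import Optional

OPTIONAL = 0x80    # set: optional, clear: well-known
TRANSITIVE = 0x40  # set: passed on to other peers
PARTIAL = 0x20     # set: some router on the way did not recognize it


def category(flags: int) -> str:
    """Name the attribute category encoded in a flags byte."""
    if not flags & OPTIONAL:
        return "well-known"
    if flags & TRANSITIVE:
        return "optional transitive"
    return "optional non-transitive"


def handle_unrecognized(flags: int) -> Optional[int]:
    """Return the flags to forward with, or None to drop the attribute."""
    kind = category(flags)
    if kind == "well-known":
        raise ValueError("well-known attributes must be recognized")
    if kind == "optional transitive":
        return flags | PARTIAL  # keep the attribute, mark it as partial
    return None  # optional non-transitive: discard silently
```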
Examples of BGP attributes:
- Well-known mandatory:
  - Autonomous system path
  - Next hop
  - Origin
- Well-known discretionary:
  - Local preference
  - Atomic aggregate
- Optional transitive:
  - Aggregator
  - Community
- Optional non-transitive:
  - Multi-exit discriminator (MED)
  - Originator ID
  - Cluster list
A BGP route reflector makes it possible to:
- avoid the need for a full-mesh topology between all iBGP neighbors
- let all iBGP neighbors learn all the iBGP routes inside the AS
- prevent the formation of loops
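The saving from avoiding a full mesh is easy to quantify. The sketch below compares the number of iBGP sessions in a full mesh with a deliberately simple design in which one route reflector peers with every other router; both function names and the single-reflector layout are assumptions made for illustration.

```python
def full_mesh_sessions(routers: int) -> int:
    """Every router peers with every other router: n * (n - 1) / 2."""
    return routers * (routers - 1) // 2


def single_reflector_sessions(routers: int) -> int:
    """One reflector peers with each of the remaining routers: n - 1."""
    return max(routers - 1, 0)
```

For 10 routers this is 45 sessions in a full mesh against 9 with a single reflector, and the gap widens quadratically as routers are added.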
Autonomous system path
The BGP AS path attribute:
- Describes which autonomous systems must be passed through in order to reach the destination network.
- Is extended with the local AS number when an update is sent from one AS to an eBGP neighbor in another AS.
- Is used for:
  - loop detection
  - applying policies
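Both the prepending rule and loop detection fit in a few lines. This is a minimal sketch that models the AS path as a plain list of AS numbers, most recent first; the function names are hypothetical.

```python
def advertise_to_ebgp(as_path: list[int], local_as: int) -> list[int]:
    """Prepend the local AS number when the route leaves the AS."""
    return [local_as] + as_path


def accept_from_ebgp(as_path: list[int], local_as: int) -> bool:
    """Reject a route whose path already contains the local AS (a loop)."""
    return local_as not in as_path
```

A route that AS 65001 advertises with the path 65002 65003 goes out as 65001 65002 65003; if it ever comes back to AS 65002, that router finds its own number in the path and discards it.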
Each AS path attribute segment is represented as a TLV field (path segment type, path segment length, path segment value):
- path segment type – 1 byte field for which the following values are defined:
- 1 – AS_SET: unordered list of autonomous systems, through which the route passed in Update messages,
- 2 – AS_SEQUENCE: ordered list of autonomous systems, through which the route passed in Update messages.
- path segment length – 1-byte field. It specifies how many autonomous systems are in the path segment value field.
- path segment value – autonomous system numbers, each represented by a 2-byte field.
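The segment encoding described above can be decoded as follows. This is a sketch assuming 2-byte AS numbers, as in the text; sessions that negotiate 32-bit ASNs use 4-byte fields instead, which this code does not handle. The function name `parse_as_path` is hypothetical.

```python
import struct

SEGMENT_TYPES = {1: "AS_SET", 2: "AS_SEQUENCE"}


def parse_as_path(value: bytes) -> list[tuple[str, list[int]]]:
    """Decode an AS path attribute value into (type, [ASNs]) segments."""
    segments = []
    i = 0
    while i < len(value):
        if i + 2 > len(value):
            raise ValueError("truncated segment header")
        seg_type, count = value[i], value[i + 1]
        end = i + 2 + 2 * count  # the length counts ASNs, not bytes
        if seg_type not in SEGMENT_TYPES or end > len(value):
            raise ValueError("malformed path segment")
        asns = list(struct.unpack_from(f"!{count}H", value, i + 2))
        segments.append((SEGMENT_TYPES[seg_type], asns))
        i = end
    return segments
```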
Next hop
- The IP address used to reach the next AS on the way to the destination network.
- The IP address of the eBGP router through which the route to the destination network lies.
- The attribute changes when the prefix is sent to another AS.
Third party next hop:
Origin
The origin attribute indicates how the route was learned by the router that originated the update.
Possible attribute values:
- 0 – IGP: the NLRI was learned within the originating autonomous system;
- 1 – EGP: the NLRI was learned via the Exterior Gateway Protocol (EGP), the predecessor of BGP, which is no longer in use;
- 2 – Incomplete: the NLRI was learned by some other means.
Local preference
- Tells the routers within the AS which exit point to use to get beyond it.
- This attribute is transmitted only within one AS.
- The attribute value on Cisco routers is 100 by default.
- The exit point with a higher value of the attribute is preferred.
- If an external BGP neighbor receives an update with local preference set, it ignores this attribute.
Atomic aggregate
A flag indicating that the NLRI is a summary.
Aggregator
The Router ID (RID) and ASN of the router that created the summary NLRI.
Community
- Used to tag routes
- Some values are predefined
- The attribute is not sent to neighbors by default
- One possible application: the attribute is passed to a neighboring AS to control incoming traffic
Values from 0x00000000 to 0x0000FFFF and from 0xFFFF0000 to 0xFFFFFFFF are reserved.
Usually, community attributes are displayed in ASN:VALUE format. In this format, communities from 1:0 to 65534:65535 are available for use. The first part specifies the autonomous system number; the second part is the community value, which defines the routing policy.
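The ASN:VALUE notation is simply the 32-bit community split into two 16-bit halves. The following sketch converts between the two forms; the function names are hypothetical.

```python
def community_to_int(text: str) -> int:
    """Convert 'ASN:VALUE' into the 32-bit community number."""
    asn_text, value_text = text.split(":")
    asn, value = int(asn_text), int(value_text)
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError("both parts must fit in 16 bits")
    return (asn << 16) | value  # ASN in the high half, value in the low


def int_to_community(number: int) -> str:
    """Convert a 32-bit community number into 'ASN:VALUE'."""
    if not 0 <= number <= 0xFFFFFFFF:
        raise ValueError("community must fit in 32 bits")
    return f"{number >> 16}:{number & 0xFFFF}"
```

For example, `community_to_int("65001:100")` yields 0xFDE90064, and the reserved value 0xFFFFFF01 is displayed as 65535:65281.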
Some communities are predefined. RFC 1997 defines three such values. They must be recognized and processed in the same way by all BGP implementations that recognize the community attribute.
If a router receives a route where predefined communities are set, it performs specific, predetermined action based on an attribute value.
Predefined community values (Well-known Communities):
- no-export (0xFFFFFF01) – Routes carrying this community value must not be advertised to external BGP neighbors outside the confederation, but they are advertised to external neighbors within the confederation.
- no-advertise (0xFFFFFF02) – Routes carrying this community value must not be advertised to any other BGP neighbor.
- no-export-subconfed (0xFFFFFF03) – Routes carrying this community value must not be advertised to any external BGP neighbor (neither outside the confederation nor to external neighbors in other autonomous systems within the confederation). On Cisco, this value is also called local-as.
|Routers that do not support the community attribute pass it on, because it is a transitive attribute.|
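The predefined communities translate into a simple advertisement filter. This is a sketch following the RFC 1997 definitions; the function name and the three peer labels ("ibgp", "confed_ebgp" for an external neighbor in another member AS of the same confederation, and "ebgp" for a neighbor outside it) are hypothetical.

```python
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_EXPORT_SUBCONFED = 0xFFFFFF03

PEER_TYPES = ("ibgp", "confed_ebgp", "ebgp")


def may_advertise(communities: set[int], peer: str) -> bool:
    """Decide whether a route may be advertised to the given peer type."""
    if peer not in PEER_TYPES:
        raise ValueError(f"unknown peer type: {peer}")
    if NO_ADVERTISE in communities:
        return False  # not advertised to anyone
    if NO_EXPORT_SUBCONFED in communities and peer != "ibgp":
        return False  # stays inside the local (member) AS
    if NO_EXPORT in communities and peer == "ebgp":
        return False  # stays inside the confederation
    return True
```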
Multi exit discriminator (MED)
- It is used to inform external BGP neighbors about the preferred path into the autonomous system.
- The attribute is passed between autonomous systems.
- Routers in the neighboring autonomous system use this attribute, but as soon as the update leaves that AS, the MED attribute is discarded.
- The smaller the value of the attribute, the more preferred the entry point into the autonomous system.
Weight attribute (Cisco Proprietary)
- Makes it possible to assign a "weight" to different paths locally on the router.
- It is used when a router has multiple exit paths from the autonomous system (the router itself is the exit point).
- It is significant only on the local router.
- It is not transmitted in updates.
- The higher the attribute value, the more preferable the exit path.
Best Path Selection
Path selection procedure characteristics in BGP:
- BGP table contains all known routes, while the routing table contains the best ones.
- Paths are selected based on policies.
- Paths are not selected on the basis of bandwidth
First of all, the router checks:
- whether the next hop is reachable (Route Resolvability Condition)
For BGP to consider the next hop reachable, the routing table must contain an IGP route leading to it.
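The resolvability check amounts to looking for a route that covers the next-hop address. The sketch below models the IGP routes as a plain list of prefixes, which is an assumption made for illustration; `next_hop_resolvable` is a hypothetical name.

```python
import ipaddress


def next_hop_resolvable(next_hop: str, igp_routes: list[str]) -> bool:
    """Return True if any IGP route covers the next-hop address."""
    address = ipaddress.ip_address(next_hop)
    return any(
        address in ipaddress.ip_network(prefix) for prefix in igp_routes
    )
```

A path whose next hop fails this check is not considered in best path selection at all.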
|Only the best path is included in the routing table and advertised to BGP neighbors.|
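The selection can be sketched as a filter followed by an ordered comparison. The comparison order used here (weight, local preference, AS path length, origin, MED) is an assumption: it follows the commonly documented Cisco ordering, is not stated in the text above, and omits the later tie-breakers. The `Path` class and its field names are hypothetical, and MED is compared across all paths for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Path:
    next_hop_reachable: bool
    weight: int = 0          # higher is better, local to the router
    local_pref: int = 100    # higher is better
    as_path: list = field(default_factory=list)  # shorter is better
    origin: int = 0          # 0 IGP, 1 EGP, 2 Incomplete; lower is better
    med: int = 0             # lower is better
    neighbor: str = ""


def best_path(paths: list) -> Optional[Path]:
    """Pick the best path among those with a reachable next hop."""
    candidates = [p for p in paths if p.next_hop_reachable]
    if not candidates:
        return None
    # Negate the attributes for which a higher value is preferred.
    return min(
        candidates,
        key=lambda p: (-p.weight, -p.local_pref, len(p.as_path),
                       p.origin, p.med),
    )
```

A path with a higher local preference wins even if its AS path is longer, and a path with an unreachable next hop is never selected regardless of its weight.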