So, we have obtained our block of addresses and an autonomous system number; we no longer depend on anyone and have become a fully functional part of the Internet. What is the best way to manage these resources? How can we put our new capabilities to use?
To answer these questions, it is vital to understand how routing works between the component parts of the Internet and what BGP routing is.
Global routing is currently built on BGP (Border Gateway Protocol) version 4. As the name implies, the border gateway protocol is responsible for routing traffic beyond the border of our network. To make our address space known outside our network, we must agree on border routing with each of our uplinks by setting up so-called BGP sessions. For this purpose, we send the uplink's technicians the address of our edge router, the number of our autonomous system, and the list of networks that we will announce to the uplink; they, in turn, provide the address of their border router and their autonomous system number. It is generally accepted that both border routers should sit in the same /30 subnet to ensure reliable operation. This is not a strict rule, but ignoring it makes the configuration considerably more complex and invites a large number of non-obvious glitches.
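The /30 convention is easy to check before the session parameters are exchanged. The following is a minimal sketch, assuming IPv4 addressing; the function name `same_slash30` and the sample addresses (taken from the documentation range 192.0.2.0/24) are illustrative and do not come from any router software.

```python
import ipaddress


def same_slash30(local_addr: str, peer_addr: str) -> bool:
    """Return True if both IPv4 addresses are usable hosts of one /30 subnet."""
    local = ipaddress.IPv4Address(local_addr)
    peer = ipaddress.IPv4Address(peer_addr)
    # strict=False masks the host bits, giving the /30 that contains local_addr.
    subnet = ipaddress.IPv4Network(f"{local_addr}/30", strict=False)
    usable = set(subnet.hosts())  # a /30 has exactly two usable host addresses
    return local != peer and local in usable and peer in usable


if __name__ == "__main__":
    print(same_slash30("192.0.2.1", "192.0.2.2"))  # both hosts of 192.0.2.0/30
    print(same_slash30("192.0.2.1", "192.0.2.5"))  # different /30 subnets
```

The check also rejects the network and broadcast addresses of the /30, since neither can be assigned to a router interface in this setup.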
Now the session is up and our network starts to be advertised to the uplinks. What does this mean? It means that we have made our presence known: if someone needs to send a packet to one of our addresses, it can be sent here. The uplink, in turn, announces our network to its border neighbors, they announce it to their neighbors, and so on, across the entire Internet. Every border router that holds a full Internet BGP routing table will have an entry with our network and the path to it. Thus, if anyone needs to send us a data packet, the path is already known. Note that announcements travel in the opposite direction to the traffic: the uplink receives announcements from us and sends traffic to us.
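The way an announcement spreads from neighbor to neighbor can be sketched as a breadth-first flood over a graph of autonomous systems. This is a deliberately simplified model, not how a real BGP speaker is implemented: the topology, the AS numbers (from the documentation range 64496-64511), and the rule that each AS keeps the first path it hears are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical topology: AS 64500 is "our" network, AS 64501 is its uplink.
NEIGHBORS = {
    64500: [64501],
    64501: [64500, 64502, 64503],
    64502: [64501, 64504],
    64503: [64501],
    64504: [64502],
}


def propagate(origin_as, neighbors):
    """Flood an announcement from origin_as.

    Returns {asn: as_path}, where as_path is the tuple of AS numbers that
    the given AS sees on the way to origin_as (its own number excluded).
    """
    paths = {origin_as: ()}
    queue = deque([origin_as])
    while queue:
        current = queue.popleft()
        for peer in neighbors.get(current, []):
            if peer in paths:
                continue  # this AS already holds a path that is no longer
            # The announcing AS puts its own number in front of the path.
            paths[peer] = (current,) + paths[current]
            queue.append(peer)
    return paths


if __name__ == "__main__":
    for asn, path in sorted(propagate(64500, NEIGHBORS).items()):
        print(asn, path)
```

Traffic towards AS 64500 then follows each stored path from left to right, which is the reverse of the direction in which the announcement travelled.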
But what happens in a more complicated situation, when there is not one uplink but several, and our network is advertised to each of them? Where will the traffic come from in this case? Here the BGP routing protocol identifies the best path, plus backup paths in case the best one fails. Traffic will arrive through the uplink that lies on the best path to our network for the given traffic source. For instance, if we have three uplinks and users access different resources with equal probability, then ideally the traffic from all three uplinks will be approximately equal. Let's analyze what the best path means in terms of BGP routing. Unless administrative settings say otherwise, the best path is the one on which the packet crosses the fewest autonomous systems. BGP does not take into account possible delays, packet loss, channel bandwidth, or the number of routers within each autonomous system. This is why the traffic path in BGP routing is called the AS-path: it lists the autonomous systems located on the path to our network.
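The selection rule described above can be sketched in a few lines. This is a simplified illustration that looks only at AS-path length; real BGP implementations apply further criteria and tie-breakers that are left out here. The uplink names and AS numbers are hypothetical.

```python
# AS-paths towards our network (AS 64500) as seen by one remote router,
# keyed by the uplink of ours that each path enters through.
PATHS = {
    "uplink-c": (64503, 64511, 64509, 64500),
    "uplink-a": (64501, 64500),
    "uplink-b": (64502, 64510, 64500),
}


def rank_uplinks(as_paths):
    """Return uplink names ordered by AS-path length, shortest first.

    The first entry is the best path; the remaining ones are backups.
    """
    return sorted(as_paths, key=lambda name: len(as_paths[name]))


if __name__ == "__main__":
    best, *backups = rank_uplinks(PATHS)
    print("best:", best)
    print("backups:", backups)
```

Note that nothing in this ranking reflects latency or bandwidth: a two-AS path over a slow link still beats a three-AS path over a fast one.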
Obviously, in most cases this state of affairs cannot be considered beneficial. To rectify it, BGP offers mechanisms that bring administrative criteria into traffic management: local-pref and prepend. If we notice that one channel carries 70% of the traffic and the others only 15% each, it is clear that the situation needs to be corrected. First of all, it means that the uplink carrying more traffic is better connected to the Internet (i.e., the average BGP AS-path through it is shorter). To decrease the amount of traffic coming over this link, we place one or two prepends in the AS-path announced to it. The AS-path is thus artificially lengthened by 1-2 AS, making it look as if our network were announced by someone located 1-2 autonomous systems further away. As a result, some of the routers that previously considered the path through this uplink the best one will choose other paths. In this way it is possible to balance the load across the incoming links.
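The effect of prepending on the incoming load can be illustrated with a toy model. In the sketch below, each remote traffic source is described by the AS-path length it sees through each of our three uplinks; the figures are invented for illustration, and each source is assumed to pick purely by shortest effective path.

```python
# Hypothetical AS-path lengths seen by six remote sources via uplinks A, B, C.
SOURCES = [
    {"A": 2, "B": 3, "C": 5},
    {"A": 2, "B": 5, "C": 3},
    {"A": 2, "B": 5, "C": 6},
    {"A": 3, "B": 6, "C": 7},
    {"A": 3, "B": 2, "C": 5},
    {"A": 4, "B": 5, "C": 3},
]


def traffic_share(sources, prepends):
    """Count how many sources choose each uplink.

    prepends maps an uplink name to the number of extra copies of our AS
    number added to announcements sent to that uplink.
    """
    share = {}
    for lengths in sources:
        effective = {u: n + prepends.get(u, 0) for u, n in lengths.items()}
        best = min(effective, key=effective.get)  # shortest effective path
        share[best] = share.get(best, 0) + 1
    return share


if __name__ == "__main__":
    print(traffic_share(SOURCES, {}))        # uplink A dominates
    print(traffic_share(SOURCES, {"A": 2}))  # two prepends towards A
```

In this invented data set, two prepends towards uplink A move two of the six sources to the other uplinks and leave the load evenly split. Real traffic will not divide this neatly, so the number of prepends is normally tuned by observation.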
How do we make one uplink a pure backup when there is a main uplink (usually the cheapest) and a reserve one? The process is very similar. We add 3-5 prepends to the announcements sent to the reserve uplink, which artificially lengthens the path through it. Then, if the connection with the primary uplink suddenly disappears, the BGP session with it will be closed, the main uplink will stop receiving announcements of our networks, and accordingly it will stop advertising them further. The main route thus disappears from the BGP routing tables of the Internet's border routers, and the long path through the backup link becomes the only option, so it is selected as the best. All of this takes place automatically, without the intervention of a system administrator, a script, or anything else.
It is worth mentioning that, while the main link is up, this method only reduces the probability that the backup path will be chosen as the best; it does not rule it out. No number of prepends can guarantee that, because apart from the technical criteria for choosing a path there are also administrative ones, and the backup path may be selected regardless of its length. For example, many networks prefer the path through an exchange point over the one through an uplink, because it is cheaper and faster. Uplinks always choose the direct path to their own customers, even if other paths exist. Some countries prohibit by law the use of paths through another country if an internal path is available. As a rule, the backup link carries only 5-10% of the traffic that the main link carries; however, everything depends on how these links are connected to the global Internet.
If you are a customer of a major provider, another way to control traffic is to mark your announcements with special flags, called communities, that tell the provider's routers where and how to announce your networks. For instance: "to Telia with three prepends, to M9 normally". There is no standard set of community flags, which is why they have to be agreed with the administrators of your ISP. Usually, the flags a provider supports are listed in the description of its autonomous system, in the comments field of the RIPE database.
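A classic BGP community is a 32-bit value, conventionally written as two 16-bit numbers in the form ASN:VALUE. The sketch below parses that notation and looks the result up in a table of actions. The table is entirely hypothetical: the community values and their meanings are invented to mirror the example above, and a real provider publishes its own.

```python
def parse_community(text):
    """Parse 'ASN:VALUE' into a pair of integers, each in 0..65535."""
    high, sep, low = text.partition(":")
    if not sep:
        raise ValueError(f"not in ASN:VALUE form: {text!r}")
    asn, value = int(high), int(low)  # int() raises ValueError on non-numbers
    if not (0 <= asn <= 0xFFFF and 0 <= value <= 0xFFFF):
        raise ValueError(f"community part out of 16-bit range: {text!r}")
    return asn, value


# Hypothetical action table of a provider with AS 64501 (invented values).
ACTIONS = {
    (64501, 3003): "announce to Telia with three prepends",
    (64501, 1000): "announce to M9 normally",
    (64501, 9000): "do not announce to any peer",
}


def actions_for(communities):
    """Return the provider actions triggered by the given community strings."""
    parsed = (parse_community(c) for c in communities)
    return [ACTIONS[c] for c in parsed if c in ACTIONS]


if __name__ == "__main__":
    for action in actions_for(["64501:3003", "64501:1000"]):
        print(action)
```

Communities that the provider does not recognize are simply ignored in this sketch, which is also why the agreed list of values matters: a mistyped flag fails silently.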
Everything above concerns incoming traffic. Now let us turn to outgoing traffic and what it takes to manage it properly. Our routing table must be built from our own administrative criteria and from the global network announcements that we accept from our providers. The most straightforward way is to accept the full BGP routing table from all uplinks, that is, the list of all networks that exist in the Internet. Currently there are 590,000 routes, and their number is growing rapidly (see BGP Reports). Each such BGP routing table (and there have to be at least two of them) takes up hundreds of megabytes of RAM on the router. Obviously, not every router can handle that. What can be done about it? If it is acceptable for you to have one main provider strictly for outgoing traffic and the other strictly for backup, ask them to announce only one prefix, 0.0.0.0/0 or the default route, instead of hundreds of thousands of separate prefixes. You then accept the default route from one provider (the main one) with a higher local-preference and the default route from the other with a lower one, making it your backup. If any problem arises with the main provider, the default route from the backup provider becomes the best one. This configuration needs so little memory that it will work even on a Cisco 800 series router. It even leaves room to accept routes from local traffic exchange points (usually up to a thousand routes), the "free" networks of various providers, the networks hosting the providers' own resources, etc.
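The main/backup arrangement with two default routes comes down to picking the usable route with the highest local-preference. Here is a minimal sketch of that logic; the class, the provider names, and the preference values 200 and 100 are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DefaultRoute:
    """A 0.0.0.0/0 route learned from one provider."""
    provider: str
    local_pref: int
    session_up: bool = True


def best_default(routes):
    """Return the usable default route with the highest local_pref, or None."""
    usable = [r for r in routes if r.session_up]
    return max(usable, key=lambda r: r.local_pref, default=None)


if __name__ == "__main__":
    routes = [DefaultRoute("main", 200), DefaultRoute("backup", 100)]
    print(best_default(routes).provider)  # main provider while its session is up

    routes[0].session_up = False          # the main provider's session drops
    print(best_default(routes).provider)  # traffic now leaves via the backup
```

Only two routes are stored in total, which is what makes this setup feasible on routers with very little memory.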
You can also build peerings. A peering is a link built directly to a neighboring network, over which we receive the neighbor's routes and announce our own. Traffic exchange between networks is built this way. Participating in a traffic exchange point means that we announce routes to our networks and in return receive and accept direct routes to the other members' networks, thus building traffic exchange on the scale of a city or a country.
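Routes learned over a peering coexist with the default route from an uplink because forwarding always uses the most specific matching prefix. The sketch below shows that lookup for a tiny table; the prefixes come from documentation address ranges and the next-hop labels are hypothetical.

```python
import ipaddress

# Hypothetical routing table: a default route plus one route from a peer.
TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "main-uplink"),
    (ipaddress.ip_network("198.51.100.0/24"), "exchange-peer"),
]


def lookup(table, destination):
    """Return the next hop of the most specific prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if addr in net]
    if not matches:
        return None
    # The longest prefix (largest prefixlen) is the most specific match.
    return max(matches, key=lambda item: item[0].prefixlen)[1]


if __name__ == "__main__":
    print(lookup(TABLE, "198.51.100.7"))  # covered by the peer's /24
    print(lookup(TABLE, "203.0.113.9"))   # only the default route matches
```

Traffic for the peer's network goes over the direct link, and everything else continues to follow the default route to the uplink.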