We continue to try to understand whether BGP full view has more or less benefits than dangers, and this is part two (see Part 1).
Before we dive into discussion, let us make a little lyrical digression:
Aggregation vs detalization
If talking about dynamic routes exchange, there exist two opposite principles for working with routing information: detailing and aggregation. Detailing: lots of long prefixes, or aggregation: representation of several longer prefixes as a shorter one.
It is obvious that detailing deals a lot of damage: every pupil knows that the more information you have, the more difficult and expensive it is to store and operate it. In the days of software routers and IP youth, thesis about the need of routes’ number minimization was an indisputable axiom. Many interesting things were invented to prevent performance reduction, caused by RIB and FIB swelling. For example, countless and ruthless types of areas and LSAs in OSPF or such thing as MPLS (and yes, it was not invented for VPN or TE), Cisco Express Forwarding and other things of varying degrees of usefulness, including hardware forwarding.
Aggregation (summarization) – a science of its own, associated with skillful segmentation, selection of an address plan, ability to manage various IGP areas, sleight of hand in the writing of routing rules, etc. More information on this matter can be found in such books as “Advanced IP Networks Design” by A. Retana, D. Slice and R. White (Cisco Press).
Extreme case of aggregation – the default route: 0.0.0.0/0, meaning “everything, apart from what has explicit routes, leading to it”
Unfortunately, the brilliant art of aggregation is almost impossible to apply to Internet. Principles of geography independence, the lack of centralized management, administrative isolation of autonomous systems, minimization of the area of damage in case of failure and so on, lead to the fact that neighboring prefixes, for example 126.96.36.199/19 and 188.8.131.52/19, should not be represented in the form of aggregated prefix 184.108.40.206/18 with common parameters, unless they belong to the same autonomous system (and even then, it is not always possible). In general, such summarization may cause loops or “black holes”.
The obvious exception is the above-mentioned default route. It is used as a route to the Internet (and not only) in cases where there is no full view. It is practically the only possible alternative to full view: disregard the entire internal Internet structure. Let us see when this is possible, and what is the forwarding difference on the basis of a default and a full table.
Who needs full view and why?
Everything is clear with the default route: everything that has no specific (own) routes leading to it, is being sent to ISP (uplink) aka to the Internet. What does the full view here? It provides an opportunity to make decisions on choosing the link for sending packets, based on knowledge not about Internet as a whole, but of its individual resources. First example: x.x.x.x/x route is available via the links A and B, and the route y.y.y.y/y is available – only via the link B. If so, then the traffic to y.y.y.y/y can be sent only through B (reachability). Second – we will be able to influence the matching between destination prefixes and neighbors, through which we will send traffic. We will be sending traffic to Google through the provider A and to Yandex through B (routing).
Why it might be necessary?
The first indisputable case – the presence of higher, or equal upstream (uplink, IX), as well as downstream (customers) AS neighbors. In other words – when our autonomous system is transit, and it has connected clients that also announce something to us. In such situation, we cannot use the default by definition, because in this case we are not at the edge but in the “center” of the Internet. Presence of routes aggregation here can lead to loops. Therefore, service providers have to keep a complete table on all routers that transmit Internet traffic based on IP headers. It is possible to get creative here as well and invent other options, but they are out of scope of the topic, and are quite difficult to operate at ISP network level. Therefore, despite the fact that variations of this theme are not so rare in life, their exploitation in provider autonomous systems often brings more problems than benefits.
A much more interesting task for us is load balancing. It must be accomplished in a way that will allow traffic to go through the Internet via routes that are optimal from the point of view of the BGP (which, as we previously discussed, is not always accurate in selecting the route) or via the routes we choose (and customize). Both wishes can be completely legitimate, if you have a lot of outgoing traffic from your AS (gigabit, two or more), and if complex manipulations have economic feasibility. Usually it happens either in case of providers with transit autonomous systems that fall under the case described above anyway, or in large data centers, which, in their turn can also have clients with their own AS. Even if that is not the case, the individual knowledge of the routes for each prefix (which is given by the full view) will not hurt.
If your AS is not a transit infrastructure for sale, but a corporate network even with its small (by international standards) Data Center, outgoing traffic likely is quite low (tens to hundreds of megabits), and you probably don’t have to think about (if only for the sake of curiosity) sending traffic to Google through one route and Yandex through another. What you need for outgoing traffic is to balance it either uniformly or in any desired proportion between multiple Internet links. Contrary to popular opinion, full view here is unnecessary and even harmful.
The third – a little more interesting, and relatively non-obvious case – full view as information “help”. Often you want to know with what AS you are exchanging traffic the most. In some cases, including when full view is not needed for traffic, you want to get the statistics to assess various kinds of traffic trends. In these cases, full view can be used via such mechanisms as NetFlow to obtain more information about the traffic (incoming and outgoing). But it should be noted that the implementation of such monitoring requires some experience and knowledge of what the equipment should be capable of, where should the full view be stored, how all this works and how to interpret the statistics. To sum up, it is an advanced topic beyond the scope of the article. Moreover, if you do not have gigabits of traffic, then you probably do not need this. One more variant on the topic – console output, demonstrated above, is taken from the special routers that are not used for transit traffic. They keep full view only to be able to analyze it.
When you can do without full view?
A common misconception is that if you have your own AS and address space, then you must take a full BGP table. This is the misconception.
As it was mentioned above, the most doubtful usefulness of full view (also known as the most widespread and the richest in terms of inadequate implementations) – corporate network, where the AS is needed for possession of own address space, so, in turn, to not depend on the service providers to reserve Internet resources (such as public servers, VPN concentrators, etc.). The main task here is to facilitate the reception of incoming traffic from the Internet when one of the external connections is lost. Remember the rule, saying that traffic is sent towards the routing information. The question is why do we need full view when talking about the incoming traffic? The answer – we do not need it at all.
Well, if we abandon full view, how are we going to transmit outbound traffic? As usual! By writing routing rules, or even better – by negotiating with the provider (usually not a problem), not to strain the equipment when it is not necessary (ORF mechanism was invented namely for this), and receive only the default route from each provider. Further on, we can either use only one of the routes (keeping the second one as backup in case the first is down) and let all outbound traffic go through the same link, or configure load balancing to be uniform, or even establish a certain proportion for it by using BGP bandwidth community, if your equipment offers such option (refer to the the conversation about the difference between routers from L3 switches). The same goes if you have not two, but three, five, twenty, or a hundred and twenty providers.
The only disadvantage of this approach – is that with the “granular” routing we give up control over reachability: you will not get notified about connectivity to specific resources on the Internet. Individual full view prefixes – this is information directly from the resources you need to send traffic to. Default route is often generated by your ISP (or ISP of your provider, which is better), and it may happen that you get a default from the provider, but in fact no internet connection is present in its network (ISP). That is an illustration of a «black hole», which is a consequence of route aggregation.
However, it is not a very high risk: the degree of protection of your providers’ infrastructure must be much higher than yours in any case. If the situation described above occurs frequently, you should think about changing the provider. Apart from that, when you have two Internet links and the scheme of “main-backup” (remember, we are speaking only about outgoing traffic), you can protect yourself further by getting a couple of prefixes such as Google and Yandex from the main provider and writing a rule that creates an aggregated default only if the specified routes are fine.
Are you afraid that Facebook will only be available through one provider and Linkedin only through another? Is such a challenge business critical for you and you want to protect yourself from it? Are you ready to talk to management about it? Do they think the same and are ready to spend money? – Congratulations! (first of all to your hardware’s vendor). No? – Below you will find a concluding paragraph.
In cases when you need a connection not with an abstract internet but to specific resources, addresses of which (usually in the amount of tens-hundreds) that are well known, for example, in the case of termination of static VPN tunnels for branches, you can (and should) take routes from service providers to those resources (remote points) besides the default route. Outbound Route Filtering (ORF) would be particularly relevant here, but for some reason it is not too popular.
Harm, caused by full view
Why not to use full view? Why bother with all these arguments? What if we just take it?
Apart from that, one must consider economic reasons: a corporate network often cannot afford a high-end router. Therefore, if you need high performance (gigabits), you can use those same relatively cheap switches that can hold only a few thousand prefixes in the FIB.
However, corporate networks frequently use less productive, but much more feature rich software boxes for BGP-peering. The same box usually serves as a border router and edge stateful-firewall, does the NAT, all the internal routing, and sometimes even checks the traffic for viruses and filters the URL. All that stuff is stored in its memory, which is not unlimited, and is processed by the central processor that also has its limitations. It does not make any sense to add even a handful of full view feeds in this box.
So what if you really want full view?
In fact, full view has one important and often decisive advantage. It is one thing when the “show route summary” shows 7 routes, and quite another – when 700 thousand routes are shown. Which corporate network does not dream of becoming an operator?
Here are the answers (choose whatever you like more):
- You should not deny yourself the pleasure. Vendors and routers’ sellers will be very happy that you want that. Hopefully, the employer will support you too (financially).
- When implementing and administering a public AS and BGP-peering, you have to solve a more complex and interesting problem: balancing the incoming traffic and redundant links for it. How your address space is seen in the Internet (routing) and whether it is reachable, how to achieve redundancy, what will happen in case of AS internal connection issues, and a whole bunch of complex issues waiting for your firm and correct actions. It is not as easy as it seems.
- Finally, one can see how the full table looks like on any public route-server (by the way, you will have to do it when debugging your announcements).