Network Working Group E. Rosen
Request for Comments: 2547 Y. Rekhter
Category: Informational Cisco Systems, Inc.
March 1999
BGP/MPLS VPNs
Status of this Memo
This memo provides information for the Internet community. It does
not specify an Internet standard of any kind. Distribution of this
memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved.
Abstract
This document describes a method by which a Service Provider with an
IP backbone may provide VPNs (Virtual Private Networks) for its
customers. MPLS (Multiprotocol Label Switching) is used for
forwarding packets over the backbone, and BGP (Border Gateway
Protocol) is used for distributing routes over the backbone. The
primary goal of this method is to support the outsourcing of IP
backbone services for enterprise networks. It does so in a manner
which is simple for the enterprise, while still scalable and flexible
for the Service Provider, and while allowing the Service Provider to
add value. These techniques can also be used to provide a VPN which
itself provides IP service to customers.
Table of Contents
1 Introduction
1.1 Virtual Private Networks
1.2 Edge Devices
1.3 VPNs with Overlapping Address Spaces
1.4 VPNs with Different Routes to the Same System
1.5 Multiple Forwarding Tables in PEs
1.6 SP Backbone Routers
1.7 Security
2 Sites and CEs
3 Per-Site Forwarding Tables in the PEs
3.1 Virtual Sites
4 VPN Route Distribution via BGP
4.1 The VPN-IPv4 Address Family
4.2 Controlling Route Distribution
4.2.1 The Target VPN Attribute
4.2.2 Route Distribution Among PEs by BGP
4.2.3 The VPN of Origin Attribute
4.2.4 Building VPNs using Target and Origin Attributes
5 Forwarding Across the Backbone
6 How PEs Learn Routes from CEs
7 How CEs learn Routes from PEs
8 What if the CE Supports MPLS?
8.1 Virtual Sites
8.2 Representing an ISP VPN as a Stub VPN
9 Security
9.1 Point-to-Point Security Tunnels between CE Routers
9.2 Multi-Party Security Associations
10 Quality of Service
11 Scalability
12 Intellectual Property Considerations
13 Security Considerations
14 Acknowledgments
15 Authors' Addresses
16 References
17 Full Copyright Statement
1. Introduction
1.1. Virtual Private Networks
Consider a set of "sites" which are attached to a common network
which we may call the "backbone". Let's apply some policy to create a
number of subsets of that set, and let's impose the following rule:
two sites may have IP interconnectivity over that backbone only if at
least one of these subsets contains them both.
The subsets we have created are "Virtual Private Networks" (VPNs).
Two sites have IP connectivity over the common backbone only if there
is some VPN which contains them both. Two sites which have no VPN in
common have no connectivity over that backbone.
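The membership rule can be illustrated with a small sketch. The site
names, VPN names, and the function name below are hypothetical,
chosen only for illustration; they are not part of the mechanisms
described in this document.

```python
# Hypothetical VPNs, each modelled as a set of site names.
vpns = {
    "intranet": {"A", "B", "C"},
    "extranet": {"A", "B", "C", "D"},
    "other": {"E", "F"},
}


def have_connectivity(site_x, site_y, vpns):
    """Two sites may communicate over the backbone only if some VPN
    contains both of them."""
    return any(site_x in members and site_y in members
               for members in vpns.values())
```

Under these assumed memberships, sites A and D share the extranet
and so may communicate, while D and E have no VPN in common.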
If all the sites in a VPN are owned by the same enterprise, the VPN
is a corporate "intranet". If the various sites in a VPN are owned
by different enterprises, the VPN is an "extranet". A site can be in
more than one VPN; e.g., in an intranet and several extranets. We
regard both intranets and extranets as VPNs. In general, when we use
the term VPN we will not be distinguishing between intranets and
extranets.
We wish to consider the case in which the backbone is owned and
operated by one or more Service Providers (SPs). The owners of the
sites are the "customers" of the SPs. The policies that determine
whether a particular collection of sites is a VPN are the policies of
the customers. Some customers will want the implementation of these
policies to be entirely the responsibility of the SP. Other
customers may want to implement these policies themselves, or to
share with the SP the responsibility for implementing these policies.
In this document, we are primarily discussing mechanisms that may be
used to implement these policies. The mechanisms we describe are
general enough to allow these policies to be implemented either by
the SP alone, or by a VPN customer together with the SP. Most of the
discussion is focused on the former case, however.
The mechanisms discussed in this document allow the implementation of
a wide range of policies. For example, within a given VPN, we can
allow every site to have a direct route to every other site ("full
mesh"), or we can restrict certain pairs of sites from having direct
routes to each other ("partial mesh").
In this document, we are particularly interested in the case where
the common backbone offers an IP service. We are primarily concerned
with the case in which an enterprise is outsourcing its backbone to a
service provider, or perhaps to a set of service providers, with
which it maintains contractual relationships. We are not focused on
providing VPNs over the public Internet.
In the rest of this introduction, we specify some properties which
VPNs should have. The remainder of this document outlines a VPN
model which has all these properties. The VPN Model of this document
appears to be an instance of the framework described in [4].
1.2. Edge Devices
We suppose that at each site, there are one or more Customer Edge
(CE) devices, each of which is attached via some sort of data link
(e.g., PPP, ATM, ethernet, Frame Relay, GRE tunnel, etc.) to one or
more Provider Edge (PE) routers.
If a particular site has a single host, that host may be the CE
device. If a particular site has a single subnet, the CE device
may be a switch. In general, the CE device can be expected to be a
router, which we call the CE router.
We will say that a PE router is attached to a particular VPN if it is
attached to a CE device which is in that VPN. Similarly, we will say
that a PE router is attached to a particular site if it is attached
to a CE device which is in that site.
When the CE device is a router, it is a routing peer of the PE(s) to
which it is attached, but is not a routing peer of CE routers at
other sites. Routers at different sites do not directly exchange
routing information with each other; in fact, they do not even need
to know of each other at all (except in the case where this is
necessary for security purposes, see section 9). As a consequence,
very large VPNs (i.e., VPNs with a very large number of sites) are
easily supported, while the routing strategy for each individual site
is greatly simplified.
It is important to maintain clear administrative boundaries between
the SP and its customers (cf. [4]). The PE and P routers should be
administered solely by the SP, and the SP's customers should not have
any management access to them. The CE devices should be administered
solely by the customer (unless the customer has contracted the
management services out to the SP).
1.3. VPNs with Overlapping Address Spaces
We assume that any two non-intersecting VPNs (i.e., VPNs with no
sites in common) may have overlapping address spaces; the same
address may be reused, for different systems, in different VPNs. As
long as a given endsystem has an address which is unique within the
scope of the VPNs that it belongs to, the endsystem itself does not
need to know anything about VPNs.
In this model, the VPN owners do not have a backbone to administer,
not even a "virtual backbone". Nor do the SPs have to administer a
separate backbone or "virtual backbone" for each VPN. Site-to-site
routing in the backbone is optimal (within the constraints of the
policies used to form the VPNs), and is not constrained in any way by
an artificial "virtual topology" of tunnels.
1.4. VPNs with Different Routes to the Same System
Although a site may be in multiple VPNs, it is not necessarily the
case that the route to a given system at that site should be the same
in all the VPNs. Suppose, for example, we have an intranet
consisting of sites A, B, and C, and an extranet consisting of A, B,
C, and the "foreign" site D. Suppose that at site A there is a
server, and we want clients from B, C, or D to be able to use that
server. Suppose also that at site B there is a firewall. We want
all the traffic from site D to the server to pass through the
firewall, so that traffic from the extranet can be access controlled.
However, we don't want traffic from C to pass through the firewall on
the way to the server, since this is intranet traffic.
This means that it needs to be possible to set up two routes to the
server. One route, used by sites B and C, takes the traffic directly
to site A. The second route, used by site D, takes the traffic
instead to the firewall at site B. If the firewall allows the
traffic to pass, it then appears to be traffic coming from site B,
and follows the route to site A.
1.5. Multiple Forwarding Tables in PEs
Each PE router needs to maintain a number of separate forwarding
tables. Every site to which the PE is attached must be mapped to one
of those forwarding tables. When a packet is received from a
particular site, the forwarding table associated with that site is
consulted in order to determine how to route the packet. The
forwarding table associated with a particular site S is populated
only with routes that lead to other sites which have at least one VPN
in common with S. This prevents communication between sites which
have no VPN in common, and it allows two VPNs with no site in common
to use address spaces that overlap with each other.
1.6. SP Backbone Routers
The SP's backbone consists of the PE routers, as well as other
routers (P routers) which do not attach to CE devices.
If every router in an SP's backbone had to maintain routing
information for all the VPNs supported by the SP, this model would
have severe scalability problems; the number of sites that could be
supported would be limited by the amount of routing information that
could be held in a single router. It is important to require
therefore that the routing information about a particular VPN be
present ONLY in those PE routers which attach to that VPN. In
particular, the P routers should not need to have ANY per-VPN routing
information whatsoever.
VPNs may span multiple service providers. We assume though that when
the path between PE routers crosses a boundary between SP networks,
it does so via a private peering arrangement, at which there exists
mutual trust between the two providers. In particular, each provider
must trust the other to pass it only correct routing information, and
to pass it labeled (in the sense of MPLS [9]) packets only if those
packets have been labeled by trusted sources. We also assume that it
is possible for label switched paths to cross the boundary between
service providers.
1.7. Security
A VPN model should, even without the use of cryptographic security
measures, provide a level of security equivalent to that obtainable
when a level 2 backbone (e.g., Frame Relay) is used. That is, in the
absence of misconfiguration or deliberate interconnection of
different VPNs, it should not be possible for systems in one VPN to
gain access to systems in another VPN.
It should also be possible to deploy standard security procedures.
2. Sites and CEs
From the perspective of a particular backbone network, a set of IP
systems constitutes a site if those systems have mutual IP
interconnectivity, and communication between them occurs without use
of the backbone. In general, a site will consist of a set of systems
which are in geographic proximity. However, this is not universally
true; two geographic locations connected via a leased line, over
which OSPF is running, will constitute a single site, because
communication between the two locations does not involve the use of
the backbone.
A CE device is always regarded as being in a single site (though as
we shall see, a site may consist of multiple "virtual sites"). A
site, however, may belong to multiple VPNs.
A PE router may attach to CE devices in any number of different
sites, whether those CE devices are in the same or in different VPNs.
A CE device may, for robustness, attach to multiple PE routers, of
the same or of different service providers. If the CE device is a
router, the PE router and the CE router will appear as router
adjacencies to each other.
While the basic unit of interconnection is the site, the architecture
described herein allows a finer degree of granularity in the control
of interconnectivity. For example, certain systems at a site may be
members of an intranet as well as members of one or more extranets,
while other systems at the same site may be restricted to being
members of the intranet only.
3. Per-Site Forwarding Tables in the PEs
Each PE router maintains one or more "per-site forwarding tables".
Every site to which the PE router is attached is associated with one
of these tables. A particular packet's IP destination address is
looked up in a particular per-site forwarding table only if that
packet has arrived directly from a site which is associated with that
table.
How are the per-site forwarding tables populated?
As an example, let PE1, PE2, and PE3 be three PE routers, and let
CE1, CE2, and CE3 be three CE routers. Suppose that PE1 learns, from
CE1, the routes which are reachable at CE1's site. If PE2 and PE3
are attached respectively to CE2 and CE3, and there is some VPN V
containing CE1, CE2, and CE3, then PE1 uses BGP to distribute to PE2
and PE3 the routes which it has learned from CE1. PE2 and PE3 use
these routes to populate the forwarding tables which they associate
respectively with the sites of CE2 and CE3. Routes from sites which
are not in VPN V do not appear in these forwarding tables, which
means that packets from CE2 or CE3 cannot be sent to sites which are
not in VPN V.
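The population rule in this example can be sketched as follows. This
is a simplified model under stated assumptions: the prefixes, the
extra site CE4 (placed in a different VPN W), and the function name
are hypothetical, and BGP itself is not modelled.

```python
# VPN membership of each site. CE1, CE2, CE3 follow the example above;
# CE4 and VPN "W" are hypothetical additions for contrast.
vpn_members = {"V": {"CE1", "CE2", "CE3"}, "W": {"CE4"}}

# Routes each PE has learned from its directly attached CE (made-up prefixes).
learned = {
    "CE1": ["10.1.0.0/16"],
    "CE4": ["10.1.0.0/16", "10.4.0.0/16"],
}


def build_table(site):
    """Return (prefix, origin site) pairs for routes learned from every
    other site that shares at least one VPN with `site`."""
    table = []
    for origin, prefixes in learned.items():
        if origin == site:
            continue
        shares_vpn = any(site in members and origin in members
                         for members in vpn_members.values())
        if shares_vpn:
            table.extend((prefix, origin) for prefix in prefixes)
    return table
```

In this sketch the table for CE2's site receives only CE1's route,
even though CE4 advertises the same prefix from another VPN.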
If a site is in multiple VPNs, the forwarding table associated with
that site can contain routes from the full set of VPNs of which the
site is a member.
A PE generally maintains only one forwarding table per site, even if
it is multiply connected to that site. Also, different sites can
share the same forwarding table if they are meant to use exactly the
same set of routes.
Suppose a packet is received by a PE router from a particular
directly attached site, but the packet's destination address does not
match any entry in the forwarding table associated with that site.
If the SP is not providing Internet access for that site, then the
packet is discarded as undeliverable. If the SP is providing
Internet access for that site, then the PE's Internet forwarding
table will be consulted. This means that in general, only one
forwarding table per PE need ever contain routes from the Internet,
even if Internet access is provided.
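The lookup order described above might be sketched like this. The
table contents, next-hop names, and function names are assumptions
made for illustration; a real PE would use its own forwarding
structures.

```python
import ipaddress


def longest_match(table, dst):
    """Return the next hop of the most specific prefix covering dst,
    or None if no prefix matches."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


def forward(site_table, internet_table, dst, internet_access):
    """Consult the per-site table first; fall back to the Internet
    table only if the SP provides Internet access for the site."""
    hop = longest_match(site_table, dst)
    if hop is not None:
        return hop
    if internet_access:
        return longest_match(internet_table, dst)
    return None  # packet is discarded as undeliverable


# Hypothetical tables for one attached site.
site_table = {"10.1.0.0/16": "PE1"}
internet_table = {"0.0.0.0/0": "internet-gw"}
```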
To maintain proper isolation of one VPN from another, it is important
that no router in the backbone accept a labeled packet from any
adjacent non-backbone device unless (a) the label at the top of the
label stack was actually distributed by the backbone router to the
non-backbone device, and (b) the backbone router can determine that
use of that label will cause the packet to leave the backbone before
any labels lower in the stack will be inspected, and before the IP
header will be inspected. These restrictions are necessary in order
to prevent packets from entering a VPN where they do not belong.
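Conditions (a) and (b) can be expressed as a simple acceptance check.
This is only a sketch: how a router records which labels it has
distributed, and how it determines that a label leads out of the
backbone, is not specified here, so both are modelled as plain
lookups with hypothetical names.

```python
def accept_labeled_packet(top_label, neighbor, distributed, egress_labels):
    """Accept a labeled packet from a non-backbone neighbor only if
    (a) this router distributed the top label to that neighbor, and
    (b) the label is known to take the packet out of the backbone
        before lower labels or the IP header are inspected."""
    was_distributed = top_label in distributed.get(neighbor, set())
    leaves_backbone = top_label in egress_labels
    return was_distributed and leaves_backbone


# Hypothetical state: label 100 was given to CE1 and is an egress label.
distributed = {"CE1": {100}}
egress_labels = {100}
```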
The per-site forwarding tables in a PE are ONLY used for packets
which arrive from a site which is directly attached to the PE. They
are not used for routing packets which arrive from other routers that
belong to the SP backbone. As a result, there may be multiple
different routes to the same system, where the route followed by a
given packet is determined by the site from which the packet enters
the backbone. E.g., one may have one route to a given system for
packets from the extranet (where the route leads to a firewall), and
a different route to the same system for packets from the intranet
(including packets that have already passed through the firewall).
3.1. Virtual Sites
In some cases, a particular site may be divided by the customer into
several virtual sites, perhaps by the use of VLANs. Each virtual
site may be a member of a different set of VPNs. The PE then needs to
contain a separate forwarding table for each virtual site. For
example, if a CE supports VLANs, and wants each VLAN mapped to a
separate VPN, the packets sent between CE and PE could be contained
in the site's VLAN encapsulation, and this could be used by the PE,
along with the interface over which the packet is received, to assign
the packet to a particular virtual site.
Alternatively, one could divide the interface into multiple "sub-
interfaces" (particularly if the interface is Frame Relay or ATM),
and assign the packet to a VPN based on the sub-interface over which
it arrives. Or one could simply use a different interface for each
virtual site. In any case, only one CE router is ever needed per
site, even if there are multiple virtual sites. Of course, a
different CE router could be used for each virtual site, if that is
desired.
Note that in all these cases, the mechanisms, as well as the policy,
for controlling which traffic is in which VPN are in the hands of the
customer.
If it is desired to have a particular host be in multiple virtual
sites, then that host must determine, for each packet, which virtual
site the packet is associated with. It can do this, e.g., by sending
packets from different virtual sites on different VLANs, or out
different network interfaces.
These schemes do NOT require the CE to support MPLS. Section 8
contains a brief discussion of how the CE might support multiple
virtual sites if it does support MPLS.
4. VPN Route Distribution via BGP
PE routers use BGP to distribute VPN routes to each other (more
accurately, to cause VPN routes to be distributed to each other).
A BGP speaker can only install and distribute one route to a given
address prefix. Yet we allow each VPN to have its own address space,
which means that the same address can be used in any number of VPNs,
where in each VPN the address denotes a different system. It follows
that we need to allow BGP to install and distribute multiple routes
to a single IP address prefix. Further, we must ensure that POLICY
is used to determine which sites can use which routes; given that
several such routes are installed by BGP, only one of them may appear
in any particular per-site forwarding table.
We meet these goals by the use of a new address family, as specified
below.
4.1. The VPN-IPv4 Address Family
The BGP Multiprotocol Extensions [3] allow BGP to carry routes from
multiple "address families". We introduce the notion of the "VPN-
IPv4 address family". A VPN-IPv4 address is a 12-byte quantity,
beginning with an 8-byte "Route Distinguisher (RD)" and ending with a
4-byte IPv4 address. If two VPNs use the same IPv4 address prefix,
the PEs translate these into unique VPN-IPv4 address prefixes. This
ensures that if the same address is used in two different VPNs, it is
possible to install two completely different routes to that address,
one for each VPN.
The RD does not by itself impose any semantics; it contains no
information about the origin of the route or about the set of VPNs to
which the route is to be distributed. The purpose of the RD is
solely to allow one to create distinct routes to a common IPv4
address prefix. Other means are used to determine where to
redistribute the route (see section 4.2).
The RD can also be used to create multiple different routes to the
very same system. In section 3, we gave an example where the route
to a particular server had to be different for intranet traffic than
for extranet traffic. This can be achieved by creating two different
VPN-IPv4 routes that have the same IPv4 part, but different RDs.
This allows BGP to install multiple different routes to the same
system, and allows policy to be used (see section 4.2.3) to decide
which packets use which route.
The RDs are structured so that every service provider can administer
its own "numbering space" (i.e., can make its own assignments of
RDs), without conflicting with the RD assignments made by any other
service provider. An RD consists of a two-byte type field, an
administrator field, and an assigned number field. The value of the
type field determines the lengths of the other two fields, as well as
the semantics of the administrator field. The administrator field
identifies an assigned number authority, and the assigned number
field contains a number which has been assigned, by the identified
authority, for a particular purpose. For example, one could have an
RD whose administrator field contains an Autonomous System number
(ASN), and whose (4-byte) number field contains a number assigned by
the SP to whom IANA has assigned that ASN. RDs are given this
structure in order to ensure that an SP which provides VPN backbone
service can always create a unique RD when it needs to do so.
However, the structuring provides no semantics. When BGP compares two
such address prefixes, it ignores the structure entirely.
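The layout of a VPN-IPv4 address can be illustrated by packing one
by hand. This is a sketch under one loudly stated assumption: the
text gives the field layout but no numeric type codes, so the value
0 used below for "2-byte ASN plus 4-byte assigned number" is
hypothetical, as are the function names and sample values.

```python
import ipaddress
import struct

# Assumption: type code 0 means a 2-byte ASN administrator field
# followed by a 4-byte assigned number field.
RD_TYPE_ASN = 0


def make_rd(asn, assigned_number):
    """Pack an 8-byte RD: 2-byte type, 2-byte ASN, 4-byte number."""
    return struct.pack("!HHI", RD_TYPE_ASN, asn, assigned_number)


def make_vpn_ipv4(rd, ipv4):
    """Concatenate the 8-byte RD and the 4-byte IPv4 address."""
    return rd + ipaddress.IPv4Address(ipv4).packed


# The same IPv4 address under two different RDs yields two distinct
# 12-byte VPN-IPv4 addresses.
addr_a = make_vpn_ipv4(make_rd(65000, 1), "10.0.0.1")
addr_b = make_vpn_ipv4(make_rd(65000, 2), "10.0.0.1")
```

Comparing addr_a and addr_b as opaque byte strings mirrors the
statement that BGP ignores the RD's internal structure when
comparing prefixes.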
If the Administrator subfield and the Assigned Number subfield of a
VPN-IPv4 address are both set to all zeroes, the VPN-IPv4 address is
considered to have exactly the same meaning as the corresponding
globally unique IPv4 address. In particular, this VPN-IPv4 address
and the corresponding globally unique IPv4 address will be considered
comparable by BGP. In all other cases, a VPN-IPv4 address and its
corresponding globally unique IPv4 address will be considered
noncomparable by BGP.
A given per-site forwarding table will only have one VPN-IPv4 route
for any given IPv4 address prefix. When a packet's destination
address is matched against a VPN-IPv4 route, only the IPv4 part is
actually matched.
A PE needs to be configured to associate routes which lead to a
particular CE with a particular RD. The PE may be configured to
associate all routes leading to the same CE with the same RD, or it
may be configured to associate different routes with different RDs,
even if they lead to the same CE.
4.2. Controlling Route Distribution
In this section, we discuss the way in which the distribution of the
VPN-IPv4 routes is controlled.
4.2.1. The Target VPN Attribute
Every per-site forwarding table is associated with one or more
"Target VPN" attributes.
When a VPN-IPv4 route is created by a PE router, it is associated
with one or more "Target VPN" attributes. These are carried in BGP
as attributes of the route.
Any route associated with Target VPN T must be distributed to every
PE router that has a forwarding table associated with Target VPN T.
When such a route is received by a PE router, it is eligible to be
installed in each of the PE's per-site forwarding tables that is
associated with Target VPN T. (Whether it actually gets installed
depends on the outcome of the BGP decision process.)
In essence, a Target VPN attribute identifies a set of sites.
Associating a particular Target VPN attribute with a route allows
that route to be placed in the per-site forwarding tables that are
used for routing traffic which is received from the corresponding
sites.
There is a set of Target VPNs that a PE router attaches to a route
received from site S. And there is a set of Target VPNs that a PE
router uses to determine whether a route received from another PE
router could be placed in the forwarding table associated with site
S. The two sets are distinct, and need not be the same.
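The two sets can be pictured as separate per-site configuration
items. In the sketch below, the names "export_targets" and
"import_targets", the site names, and the target values are all
hypothetical labels introduced for illustration.

```python
# Target VPNs a PE attaches to routes received from each site.
export_targets = {"S": {"T1", "T2"}}

# Target VPNs a PE uses to decide whether a route received from
# another PE may be placed in each site's forwarding table.
import_targets = {"S": {"T1"}, "R": {"T3"}}


def tables_eligible_for(route_targets, import_targets):
    """Return the sites whose forwarding table may install a route
    carrying the given set of Target VPN attributes."""
    return sorted(site for site, wanted in import_targets.items()
                  if wanted & route_targets)
```

Eligibility is only the first step; whether the route is actually
installed still depends on the BGP decision process.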
The function performed by the Target VPN attribute is similar to that
performed by the BGP Communities Attribute. However, the format of
the latter is inadequate, since it allows only a two-byte numbering
space. It would be fairly straightforward to extend the BGP
Communities Attribute to provide a larger numbering space. It should
also be possible to structure the format, similar to what we have
described for RDs (see section 4.1), so that a type field defines the
length of an administrator field, and the remainder of the attribute
is a number from the specified administrator's numbering space.
When a BGP speaker has received two routes to the same VPN-IPv4
prefix, it chooses one, according to the BGP rules for route
preference.
Note that a route can only have one RD, but it can have multiple
Target VPNs. In BGP, scalability is improved if one has a single
route with multiple attributes, as opposed to multiple routes. One
could eliminate the Target VPN attribute by creating more routes
(i.e., using more RDs), but the scaling properties would be less
favorable.
How does a PE determine which Target VPN attributes to associate with
a given route? There are a number of different possible ways. The
PE might be configured to associate all routes that lead to a
particular site with a particular Target VPN. Or the PE might be
configured to associate certain routes leading to a particular site
with one Target VPN, and certain with another. Or the CE router,
when it distributes these routes to the PE (see section 6), might
specify one or more Target VPNs for each route. The latter method
shifts the control of the mechanisms used to implement the VPN
policies from the SP to the customer. If this method is used, it may
still be desirable to have the PE eliminate any Target VPNs that,
according to its own configuration, are not allowed, and/or to add in
some Target VPNs that according to its own configuration are
mandatory.
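When the CE specifies Target VPNs, the PE-side adjustment described
above amounts to a set operation. The following is a minimal sketch;
the function name and the sample target values are assumptions.

```python
def sanitize_targets(ce_targets, allowed, mandatory):
    """Drop any Target VPNs the PE's configuration does not allow,
    then add those its configuration marks as mandatory."""
    return (set(ce_targets) & set(allowed)) | set(mandatory)
```

For example, if the CE proposes T1 and T9, the PE allows only T1 and
T2, and T2 is mandatory, the route ends up carrying T1 and T2.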
It might be more accurate, if less suggestive, to call this attribute
the "Route Target" attribute instead of the "Target VPN" attribute.
It really identifies only a set of sites which will be able to use
the route, without prejudice to whether those sites constitute what
might intuitively be called a VPN.
4.2.2. Route Distribution Among PEs by BGP
If two sites of a VPN attach to PEs which are in the same Autonomous
System, the PEs can distribute VPN-IPv4 routes to each other by means
of an IBGP connection between them. Alternatively, each can have an
IBGP connection to a route reflector.
If two sites of a VPN are in different Autonomous Systems (e.g.,
because they are connected to different SPs), then a PE router will
need to use IBGP to redistribute VPN-IPv4 routes either to an
Autonomous System Border Router (ASBR), or to a route reflector of
which an ASBR is a client. The ASBR will then need to use EBGP to
redistribute those routes to an ASBR in another AS. This allows one
to connect different VPN sites to different Service Providers.
However, VPN-IPv4 routes should only be accepted on EBGP connections
at private peering points, as part of a trusted arrangement between
SPs. VPN-IPv4 routes should neither be distributed to nor accepted
from the public Internet.
If there are many VPNs having sites attached to different Autonomous
Systems, there does not need to be a single ASBR between those two
ASes which holds all the routes for all the VPNs; there can be
multiple ASBRs, each of which holds only the routes for a particular
subset of the VPNs.
When a PE router distributes a VPN-IPv4 route via BGP, it uses its
own address as the "BGP next hop". It also assigns and distributes
an MPLS label. (Essentially, PE routers distribute not VPN-IPv4
routes, but Labeled VPN-IPv4 routes. Cf. [8]) When the PE processes a
received packet that has this label at the top of the stack, the PE
will pop the stack, and send the packet directly to the site to
which the route leads. This will usually mean that it just sends the
packet to the CE router from which it learned the route. The label
may also determine the data link encapsulation.
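The label handling at the advertising PE might be sketched as below.
The class and method names, the route representation, and the
addresses are hypothetical. Label allocation here starts at 16
because lower MPLS label values are reserved; the actual allocation
policy is an implementation choice not covered by this document.

```python
import itertools


class EgressPE:
    """Toy model of a PE that advertises labeled VPN-IPv4 routes."""

    def __init__(self, address):
        self.address = address
        self._labels = itertools.count(16)  # values 0-15 are reserved
        self.label_to_ce = {}

    def advertise(self, vpn_ipv4_prefix, ce):
        """Assign a label to the route and use our own address as the
        BGP next hop."""
        label = next(self._labels)
        self.label_to_ce[label] = ce
        return {"prefix": vpn_ipv4_prefix,
                "next_hop": self.address,
                "label": label}

    def receive(self, label_stack, payload):
        """Pop the top label and hand the packet to the CE it maps to,
        without consulting any forwarding table."""
        top, *rest = label_stack
        return self.label_to_ce[top], rest, payload
```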
In most cases, the label assigned by a PE will cause the packet to be
sent directly to a CE, and the PE which receives the labeled packet
will not look up the packet's destination address in any forwarding
table. However, it is also possible for the PE to assign a label
which implicitly identifies a particular forwarding table. In this
case, the PE receiving a packet with that label would look up the
packet's destination address in one of its forwarding tables. While
this can be very useful in certain circumstances, we do not consider
it further in this paper.
Note that the MPLS label that is distributed in this way is only
usable if there is a label switched path between the router that
installs a route and the BGP next hop of that route. We do not make
any assumption about the procedure used to set up that label switched
path. It may be set up on a pre-established basis, or it may be set
up when a route which would need it is installed. It may be a "best
effort" route, or it may be a traffic engineered route. Between a
particular PE router and its BGP next hop for a particular route
there may be one LSP, or there may be several, perhaps with different
QoS characteristics. All that matters for the VPN architecture is
that some label switched path between the router and its BGP next hop
exists.
All the usual techniques for using route reflectors [2] to improve
scalability, e.g., route reflector hierarchies, are available. If
route reflectors are used, there is no need to have any one route
reflector know all the VPN-IPv4 routes for all the VPNs supported by
the backbone. One can have separate route reflectors, which do not
communicate with each other, each of which supports a subset of the
total set of VPNs.
If a given PE router is not attached to any of the Target VPNs of a
particular route, it should not receive that route; the other PE or
route reflector which is distributing routes to it should apply
outbound filtering to avoid sending it unnecessary routes. Of
course, if a PE router receives a route via BGP, and that PE is not
attached to any of the route's target VPNs, the PE should apply
inbound filtering to the route, neither installing nor redistributing
it.
A router which is not attached to any VPN, i.e., a P router, never
installs any VPN-IPv4 routes at all.
These distribution rules ensure that there is no one box which needs
to know all the VPN-IPv4 routes that are supported over the backbone.
As a result, the total number of such routes that can be supported
over the backbone is not bound by the capacity of any single device,
and therefore can increase virtually without bound.
4.2.3. The VPN of Origin Attribute
A VPN-IPv4 route may be optionally associated with a VPN of Origin
attribute. This attribute uniquely identifies a set of sites, and
identifies the corresponding route as having come from one of the
sites in that set. Typical uses of this attribute might be to
identify the enterprise which owns the site where the route leads, or
to identify the site's intranet. However, other uses are also
possible. This attribute could be encoded as an extended BGP
communities attribute.
In situations in which it is necessary to identify the source of a
route, it is this attribute, not the RD, which must be used. This
attribute may be used when "constructing" VPNs, as described below.
It might be more accurate, if less suggestive, to call this attribute
the "Route Origin" attribute instead of the "VPN of Origin"
attribute. It really identifies the route only as having come from
one of a particular set of sites, without prejudice as to whether
that particular set of sites really constitutes a VPN.
4.2.4. Building VPNs using Target and Origin Attributes
By setting up the Target VPN and VPN of Origin attributes properly,
one can construct different kinds of VPNs.
Suppose it is desired to create a Closed User Group (CUG) which
contains a particular set of sites. This can be done by creating a
particular Target VPN attribute value to represent the CUG. This
value then needs to be associated with the per-site forwarding tables
for each site in the CUG, and it needs to be associated with every
route learned from a site in the CUG. Any route which has this
Target VPN attribute will need to be redistributed so that it reaches
every PE router attached to one of the sites in the CUG.
Alternatively, suppose one desired, for whatever reason, to create a
"hub and spoke" kind of VPN. This could be done by the use of two
Target Attribute values, one meaning "Hub" and one meaning "Spoke".
Then routes from the spokes could be distributed to the hub, without
causing routes from the hub to be distributed to the spokes.
Suppose one has a number of sites which are in an intranet and an
extranet, as well as a number of sites which are in the intranet
only. Then there may be both intranet and extranet routes which have
a Target VPN identifying the entire set of sites. The sites which
are to have intranet routes only can filter out all routes with the
"wrong" VPN of Origin.
These two attributes allow great flexibility in allowing one to
control the distribution of routing information among various sets of
sites, which in turn provides great flexibility in constructing VPNs.
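As a rough sketch of the Closed User Group and "hub and spoke" constructions described in this section, the following models each per-site forwarding table as importing routes whose Target VPN values overlap the values associated with that table. The function build_tables, the site names, and the exact assignment of the "Hub" and "Spoke" values are assumptions made for illustration only.

```python
def build_tables(import_targets, routes):
    """Populate each site's forwarding table from Target VPN matches.

    import_targets: site name -> set of Target VPN values tied to its table
    routes: list of (prefix, origin_site, Target VPN values on the route)
    """
    tables = {site: [] for site in import_targets}
    for prefix, origin, targets in routes:
        for site, wanted in import_targets.items():
            # A site never needs its own routes back.
            if site != origin and targets & wanted:
                tables[site].append(prefix)
    return tables


# Closed User Group: one value on every table and on every route.
cug = build_tables(
    {"s1": {"cug"}, "s2": {"cug"}, "s3": {"cug"}},
    [
        ("10.1.0.0/16", "s1", {"cug"}),
        ("10.2.0.0/16", "s2", {"cug"}),
        ("10.3.0.0/16", "s3", {"cug"}),
    ],
)

# Hub and spoke (assumed assignment): spoke routes carry "Hub", the hub's
# table imports "Hub", and hub routes carry no value that spokes import.
hub_spoke = build_tables(
    {"hub": {"Hub"}, "spoke1": {"Spoke"}, "spoke2": {"Spoke"}},
    [
        ("10.0.0.0/16", "hub", set()),
        ("10.1.0.0/16", "spoke1", {"Hub"}),
        ("10.2.0.0/16", "spoke2", {"Hub"}),
    ],
)
```

In this sketch the hub learns both spoke prefixes while the spokes learn nothing from each other or from the hub. Attaching the "Spoke" value to the hub's routes would instead let the spokes learn the hub's routes.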
5. Forwarding Across the Backbone
If the intermediate routers in the backbone do not have any
information about the routes to the VPNs, how are packets forwarded
from one VPN site to another?
This is done by means of MPLS with a two-level label stack.
PE routers (and ASBRs which redistribute VPN-IPv4 addresses) need to
insert /32 address prefixes for themselves into the IGP routing
tables of the backbone. This enables MPLS, at each node in the
backbone network, to assign a label corresponding to the route to
each PE router. (Certain procedures for setting up label switched
paths in the backbone may not require the presence of the /32 address
prefixes.)
When a PE receives a packet from a CE device, it chooses a particular
per-site forwarding table in which to look up the packet's
destination address. Assume that a match is found.
If the packet is destined for a CE device attached to this same PE,
the packet is sent directly to that CE device.
If the packet is not destined for a CE device attached to this same
PE, the packet's "BGP Next Hop" is found, as well as the label which
that BGP next hop assigned for the packet's destination address. This
label is pushed onto the packet's label stack, and becomes the bottom
label. Then the PE looks up the IGP route to the BGP Next Hop, and
thus determines the IGP next hop, as well as the label assigned to
the address of the BGP next hop by the IGP next hop. This label gets
pushed on as the packet's top label, and the packet is then forwarded
to the IGP next hop. (If the BGP next hop is the same as the IGP
next hop, the second label may not need to be pushed on, however.)
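The ingress procedure just described might be sketched as follows. The table layouts, the names VpnEntry, IgpEntry and forward_from_ce, and the label values are all hypothetical. The sketch uses exact-match lookup where a real PE would use longest-prefix match, and it assumes that the top label is simply omitted whenever the BGP next hop is also the IGP next hop.

```python
from typing import NamedTuple, Optional


class VpnEntry(NamedTuple):
    local_ce: Optional[str] = None      # set when the destination CE is local
    bgp_next_hop: Optional[str] = None  # egress PE for remote destinations
    vpn_label: Optional[int] = None     # label assigned by the BGP next hop


class IgpEntry(NamedTuple):
    igp_next_hop: str   # directly connected backbone neighbor
    igp_label: int      # label that neighbor assigned for the BGP next hop


def forward_from_ce(dest, site_table, igp_table):
    """Return (next hop, label stack), with the top label listed first."""
    entry = site_table[dest]
    if entry.local_ce is not None:
        return entry.local_ce, []            # same PE: plain IP delivery
    stack = [entry.vpn_label]                # bottom label, from BGP
    igp = igp_table[entry.bgp_next_hop]
    if igp.igp_next_hop != entry.bgp_next_hop:
        stack.insert(0, igp.igp_label)       # top label, toward the egress PE
    return igp.igp_next_hop, stack


site_table = {
    "10.2.0.0/16": VpnEntry(bgp_next_hop="PE2", vpn_label=1001),
    "10.3.0.0/16": VpnEntry(local_ce="CE-local"),
    "10.4.0.0/16": VpnEntry(bgp_next_hop="PE3", vpn_label=1002),
}
igp_table = {"PE2": IgpEntry("P1", 17), "PE3": IgpEntry("PE3", 3)}

remote = forward_from_ce("10.2.0.0/16", site_table, igp_table)
local = forward_from_ce("10.3.0.0/16", site_table, igp_table)
adjacent = forward_from_ce("10.4.0.0/16", site_table, igp_table)
```

Only the ingress PE consults the per-site table; every later hop forwards on the top label alone.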
At this point, MPLS will carry the packet across the backbone and
into the appropriate CE device. That is, all forwarding decisions by
P routers and PE routers are now made by means of MPLS, and the
packet's IP header is not looked at again until the packet reaches
the CE device. The final PE router will pop the last label from the
MPLS label stack before sending the packet to the CE device; thus the
CE device will just see an ordinary IP packet. (Though see section 8
for some discussion of the case where the CE desires to receive
labeled packets.)
When a packet enters the backbone from a particular site via a
particular PE router, the packet's route is determined by the
contents of the forwarding table which that PE router associated with
that site. The forwarding tables of the PE router where the packet
leaves the backbone are not relevant. As a result, one may have
multiple routes to the same system, where the particular route chosen
for a particular packet is based on the site from which the packet
enters the backbone.
Note that it is the two-level labeling that makes it possible to keep
all the VPN routes out of the P routers, and this in turn is crucial
to ensuring the scalability of the model. The backbone does not even
need to have routes to the CEs, only to the PEs.
6. How PEs Learn Routes from CEs
The PE routers which attach to a particular VPN need to know, for
each of that VPN's sites, which addresses in that VPN are at each
site.
In the case where the CE device is a host or a switch, this set of
addresses will generally be configured into the PE router attaching
to that device. In the case where the CE device is a router, there
are a number of possible ways that a PE router can obtain this set of
addresses.
The PE translates these addresses into VPN-IPv4 addresses, using a
configured RD. The PE then treats these VPN-IPv4 routes as input to
BGP. In no case will routes from a site ever be leaked into the
backbone's IGP.
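To illustrate the translation step, the sketch below prepends a configured RD to an IPv4 address. It assumes the 12-byte VPN-IPv4 layout from section 4.1 (an 8-byte RD followed by a 4-byte IPv4 address) and an RD with type field 0, a 2-byte ASN and a 4-byte assigned number. The function names and the sample ASN are hypothetical.

```python
import ipaddress
import struct


def make_rd_type0(asn, assigned_number):
    """Build an 8-byte RD: 2-byte type (0), 2-byte ASN, 4-byte number."""
    return struct.pack("!HHI", 0, asn, assigned_number)


def to_vpn_ipv4(rd, ipv4):
    """Prefix a 4-byte IPv4 address with the 8-byte RD."""
    if len(rd) != 8:
        raise ValueError("RD must be exactly 8 bytes")
    return rd + ipaddress.IPv4Address(ipv4).packed


# Two customers using the same IPv4 address yield distinct VPN-IPv4 addresses.
addr_a = to_vpn_ipv4(make_rd_type0(65000, 1), "10.1.0.0")
addr_b = to_vpn_ipv4(make_rd_type0(65000, 2), "10.1.0.0")
```

The two results differ only in the RD portion, which is what lets BGP carry both routes even though the customers' address spaces overlap.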
Exactly which PE/CE route distribution techniques are possible
depends on whether a particular CE is in a "transit VPN" or not. A
"transit VPN" is one which contains a router that receives routes
from a "third party" (i.e., from a router which is not in the VPN,
and is not a PE router), and that redistributes those routes to a PE
router. A VPN which is not a transit VPN is a "stub VPN". The vast
majority of VPNs, including just about all corporate enterprise
networks, would be expected to be "stubs" in this sense.
The possible PE/CE distribution techniques are:
1. Static routing (i.e., configuration) may be used. (This is
likely to be useful only in stub VPNs.)
2. PE and CE routers may be RIP peers, and the CE may use RIP to
tell the PE router the set of address prefixes which are
reachable at the CE router's site. When RIP is configured in
the CE, care must be taken to ensure that address prefixes from
other sites (i.e., address prefixes learned by the CE router
from the PE router) are never advertised to the PE. More
precisely: if a PE router, say PE1, receives a VPN-IPv4 route
R1, and as a result distributes an IPv4 route R2 to a CE, then
R2 must not be distributed back from that CE's site to a PE
router, say PE2, (where PE1 and PE2 may be the same router or
different routers), unless PE2 maps R2 to a VPN-IPv4 route
which is different than (i.e., contains a different RD than)
R1.
3. The PE and CE routers may be OSPF peers. In this case, the
site should be a single OSPF area, the CE should be an ABR in
that area, and the PE should be an ABR which is not in that
area. Also, the PE should report no router links other than
those to the CEs which are at the same site. (This technique
should be used only in stub VPNs.)
4. The PE and CE routers may be BGP peers, and the CE router may
use BGP (in particular, EBGP) to tell the PE router the set of
address prefixes which are at the CE router's site. (This
technique can be used in stub VPNs or transit VPNs.)
From a purely technical perspective, this is by far the best
technique:
a) Unlike the IGP alternatives, this does not require the
PE to run multiple routing algorithm instances in order
to talk to multiple CEs
b) BGP is explicitly designed for just this function:
passing routing information between systems run by
different administrations
c) If the site contains "BGP backdoors", i.e., routers
with BGP connections to routers other than PE routers,
this procedure will work correctly in all
circumstances. The other procedures may or may not
work, depending on the precise circumstances.
d) Use of BGP makes it easy for the CE to pass attributes
of the routes to the PE. For example, the CE may
suggest a particular Target for each route, from among
the Target attributes that the PE is authorized to
attach to the route.
On the other hand, using BGP is likely to be something new for
the CE administrators, except in the case where the customer
itself is already an Internet Service Provider (ISP).
If a site is not in a transit VPN, note that it need not have
a unique Autonomous System Number (ASN). Every CE whose site is
not in a transit VPN can use the same ASN. This can
be chosen from the private ASN space, and it will be stripped
out by the PE. Routing loops are prevented by use of the Site
of Origin Attribute (see below).
If a set of sites constitute a transit VPN, it is convenient
to represent them as a BGP Confederation, so that the internal
structure of the VPN is hidden from any router which is not
within the VPN. In this case, each site in the VPN would need
two BGP connections to the backbone, one which is internal to
the confederation and one which is external to it. The usual
intra-confederation procedures would have to be slightly
modified in order to take account of the fact that the
backbone and the sites may have different policies. The
backbone is a member of the confederation on one of the
connections, but is not a member on the other. These
techniques may be useful if the customer for the VPN service
is an ISP. This technique allows a customer that is an ISP to
obtain VPN backbone service from one of its ISP peers.
(However, if a VPN customer is itself an ISP, and its CE
routers support MPLS, a much simpler technique can be used,
wherein the ISP is regarded as a stub VPN. See section 8.)
When we do not need to distinguish among the different ways in which
a PE can be informed of the address prefixes which exist at a given
site, we will simply say that the PE has "learned" the routes from
that site.
Before a PE can redistribute a VPN-IPv4 route learned from a site, it
must assign certain attributes to the route. There are three such
attributes:
- Site of Origin
This attribute uniquely identifies the site from which the PE
router learned the route. All routes learned from a particular
site must be assigned the same Site of Origin attribute, even if
a site is multiply connected to a single PE, or is connected to
multiple PEs. Distinct Site of Origin attributes must be used
for distinct sites. This attribute could be encoded as an
extended BGP communities attribute (section 4.2.1).
- VPN of Origin
See section 4.2.3.
- Target VPN
See section 4.2.1.
7. How CEs Learn Routes from PEs
In this section, we assume that the CE device is a router.
In general, a PE may distribute to a CE any route which the PE has
placed in the forwarding table which it uses to route packets from
that CE. There is one exception: if a route's Site of Origin
attribute identifies a particular site, that route must never be
redistributed to any CE at that site.
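The exception above amounts to a simple filter on the Site of Origin attribute. A minimal sketch follows, with a hypothetical routes_for_ce helper and routes represented as plain dictionaries.

```python
def routes_for_ce(forwarding_table, ce_site):
    """Routes a PE may distribute to a CE at the given site.

    A route whose Site of Origin names the CE's own site is withheld,
    so the site never relearns its own routes through the backbone.
    """
    return [r for r in forwarding_table if r["site_of_origin"] != ce_site]


table = [
    {"prefix": "10.1.0.0/16", "site_of_origin": "site-1"},
    {"prefix": "10.2.0.0/16", "site_of_origin": "site-2"},
    {"prefix": "10.1.8.0/24", "site_of_origin": "site-1"},
]

to_site_1 = routes_for_ce(table, "site-1")
to_site_2 = routes_for_ce(table, "site-2")
```

Because all routes learned from a site carry the same Site of Origin value, the check works even when the site is attached to several PEs.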
In most cases, however, it will be sufficient for the PE to simply
distribute the default route to the CE. (In some cases, it may even
be sufficient for the CE to be configured with a default route
pointing to the PE.) This will generally work at any site which does
not itself need to distribute the default route to other sites.
(E.g., if one site in a corporate VPN has the corporation's access to
the Internet, that site might need to have default distributed to the
other sites, but one could not distribute default to that site
itself.)
Whatever procedure is used to distribute routes from CE to PE will
also be used to distribute routes from PE to CE.
8. What if the CE Supports MPLS?
In the case where the CE supports MPLS, AND is willing to import the
complete set of routes from its VPNs, the PE can distribute to it a
label for each such route. When the PE receives a packet from the CE
with such a label, it (a) replaces that label with the corresponding
label that it learned via BGP, and (b) pushes on a label
corresponding to the BGP next hop for the corresponding route.
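Steps (a) and (b) could be sketched as a swap followed by a push. The mapping tables, the function name and the label values below are hypothetical; the stack is written with the top label first.

```python
def pe_handle_labeled_packet(stack, ce_label_map, next_hop_labels):
    """Rewrite the label stack of a packet received from an MPLS-capable CE.

    ce_label_map: label given to the CE -> (label learned via BGP, BGP next hop)
    next_hop_labels: BGP next hop -> label for the route to that next hop
    """
    bgp_label, bgp_next_hop = ce_label_map[stack[0]]
    swapped = [bgp_label] + stack[1:]                 # (a) replace the CE label
    return [next_hop_labels[bgp_next_hop]] + swapped  # (b) push next-hop label


ce_label_map = {200: (1001, "PE2")}
next_hop_labels = {"PE2": 17}

new_stack = pe_handle_labeled_packet([200], ce_label_map, next_hop_labels)
```

No IP address lookup appears anywhere in the function: the label from the CE alone selects the route.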
8.1. Virtual Sites
If the CE/PE route distribution is done via BGP, the CE can use MPLS
to support multiple virtual sites. The CE may itself contain a
separate forwarding table for each virtual site, which it populates
as indicated by the VPN of Origin and Target VPN attributes of the
routes it receives from the PE. If the CE receives the full set of
routes from the PE, the PE will not need to do any address lookup at
all on packets received from the CE. Alternatively, the PE may in
some cases be able to distribute to the CE a single (labeled) default
route for each VPN. Then when the PE receives a labeled packet from
the CE, it would know which forwarding table to look in; the label
placed on the packet by the CE would identify only the virtual site
from which the packet is coming.
8.2. Representing an ISP VPN as a Stub VPN
If a particular VPN is actually an ISP, but its CE routers support
MPLS, then the VPN can actually be treated as a stub VPN. The CE and
PE routers need only exchange routes which are internal to the VPN.
The PE router would distribute to the CE router a label for each of
these routes. Routers at different sites in the VPN can then become
BGP peers. When the CE router looks up a packet's destination
address, the routing lookup always resolves to an internal address,
usually the address of the packet's BGP next hop. The CE labels the
packet appropriately and sends the packet to the PE.
9. Security
Under the following conditions:
a) labeled packets are not accepted by backbone routers from
untrusted or unreliable sources, unless it is known that such
packets will leave the backbone before the IP header or any
labels lower in the stack will be inspected, and
b) labeled VPN-IPv4 routes are not accepted from untrusted or
unreliable sources,
the security provided by this architecture is virtually identical to
that provided to VPNs by Frame Relay or ATM backbones.
It is worth noting that the use of MPLS makes it much simpler to
provide this level of security than would be possible if one
attempted to use some form of IP-within-IP tunneling in place of
MPLS. It is a simple matter to refuse to accept a labeled packet
unless the first of the above conditions applies to it. It is rather
more difficult to configure a router to refuse to accept an IP
packet if that packet is an IP-within-IP tunnelled packet which is
going to a "wrong" place.
The use of MPLS also allows a VPN to span multiple SPs without
depending in any way on the inter-domain distribution of IPv4 routing
information.
It is also possible for a VPN user to provide himself with enhanced
security by making use of Tunnel Mode IPSEC [5]. This is discussed
in the remainder of this section.
9.1. Point-to-Point Security Tunnels between CE Routers
A security-conscious VPN user might want to ensure that some or all
of the packets which traverse the backbone are authenticated and/or
encrypted. The standard way to obtain this functionality today would
be to create a "security tunnel" between every pair of CE routers in
a VPN, using IPSEC Tunnel Mode.
However, the procedures described so far do not enable the CE router
transmitting a packet to determine the identity of the next CE router
that the packet will traverse. Yet that information is required in
order to use Tunnel Mode IPSEC. So we must extend those procedures
to make this information available.
A way to do this is suggested in [6]. Every VPN-IPv4 route can have
an attribute which identifies the next CE router that will be
traversed if that route is followed. If this information is provided
to all the CE routers in the VPN, standard IPSEC Tunnel Mode can be
used.
If the CE and PE are BGP peers, it is natural to present this
information as a BGP attribute.
Each CE that is to use IPSEC should also be configured with a set of
address prefixes, such that it is prohibited from sending insecure
traffic to any of those addresses. This prevents the CE from sending
insecure traffic if, for some reason, it fails to obtain the
necessary information.
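One way to picture the resulting CE behavior is the decision function below. It is a sketch under several assumptions: the "next CE" attribute has already been turned into a prefix-to-CE mapping, the first matching prefix is used where a real router would use longest-prefix match, and the names choose_action, next_ce_by_prefix and protected_prefixes are invented for illustration.

```python
import ipaddress


def choose_action(dest, next_ce_by_prefix, protected_prefixes):
    """Decide how a CE sends a packet addressed to dest.

    Returns ("tunnel", next_ce) when the next CE is known, ("drop", None)
    when it is unknown but the destination must not receive insecure
    traffic, and ("plain", None) otherwise.
    """
    addr = ipaddress.ip_address(dest)
    for prefix, next_ce in next_ce_by_prefix.items():
        if addr in ipaddress.ip_network(prefix):
            return ("tunnel", next_ce)
    if any(addr in ipaddress.ip_network(p) for p in protected_prefixes):
        return ("drop", None)   # fail closed rather than send in the clear
    return ("plain", None)


next_ce_by_prefix = {"10.2.0.0/16": "CE2"}
protected_prefixes = ["10.0.0.0/8"]

secured = choose_action("10.2.3.4", next_ce_by_prefix, protected_prefixes)
blocked = choose_action("10.9.9.9", next_ce_by_prefix, protected_prefixes)
clear = choose_action("192.0.2.1", next_ce_by_prefix, protected_prefixes)
```

The "drop" branch is the safeguard the configured prefix set provides: missing tunnel information results in lost traffic rather than exposed traffic.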
When MPLS is used to carry packets between the two endpoints of an
IPSEC tunnel, the IPSEC outer header does not really perform any
function. It might be beneficial to develop a form of IPSEC tunnel
mode which allows the outer header to be omitted when MPLS is used.
9.2. Multi-Party Security Associations
Instead of setting up a security tunnel between each pair of CE
routers, it may be advantageous to set up a single, multiparty
security association. In such a security association, all the CE
routers which are in a particular VPN would share the same security
parameters (e.g., same secret, same algorithm, etc.). Then the
ingress CE wouldn't have to know which CE is the next one to receive
the data; it would only have to know which VPN the data is going to.
A CE which is in multiple VPNs could use different security
parameters for each one, thus protecting, e.g., intranet packets from
being exposed to the extranet.
With such a scheme, standard Tunnel Mode IPSEC could not be used,
because there is no way to fill in the IP destination address field
of the "outer header". However, when MPLS is used for forwarding,
there is no real need for this outer header anyway; the PE router can
use MPLS to get a packet to a tunnel endpoint without even knowing
the IP address of that endpoint; it only needs to see the IP
destination address of the "inner header".
A significant advantage of a scheme like this is that it makes
routing changes (in particular, a change of egress CE for a
particular address prefix) transparent to the security mechanism.
This could be particularly important in the case of multi-provider
VPNs, where the need to distribute information about such routing
changes simply to support the security mechanisms could result in
scalability issues.
Another advantage is that it eliminates the need for the outer IP
header, since the MPLS encapsulation performs its role.
10. Quality of Service
Although not the focus of this paper, Quality of Service is a key
component of any VPN service. In MPLS/BGP VPNs, existing L3 QoS
capabilities can be applied to labeled packets through the use of the
"experimental" bits in the shim header [10], or, where ATM is used as
the backbone, through the use of ATM QoS capabilities. The traffic
engineering work discussed in [1] is also directly applicable to
MPLS/BGP VPNs. Traffic engineering could even be used to establish
LSPs with particular QoS characteristics between particular pairs of
sites, if that is desirable. Where an MPLS/BGP VPN spans multiple
SPs, the architecture described in [7] may be useful. An SP may
apply either intserv or diffserv capabilities to a particular VPN, as
appropriate.
11. Scalability
We have discussed scalability issues throughout this paper. In this
section, we briefly summarize the main characteristics of our model
with respect to scalability.
The Service Provider backbone network consists of (a) PE routers, (b)
BGP Route Reflectors, (c) P routers (which are neither PE routers nor
Route Reflectors), and, in the case of multi-provider VPNs, (d)
ASBRs.
P routers do not maintain any VPN routes. In order to properly
forward VPN traffic, the P routers need only maintain routes to the
PE routers and the ASBRs. The use of two levels of labeling is what
makes it possible to keep the VPN routes out of the P routers.
A PE router maintains VPN routes, but only for those VPNs to which
it is directly attached.
Route reflectors and ASBRs can be partitioned among VPNs so that each
partition carries routes for only a subset of the VPNs provided by
the Service Provider. Thus no single Route Reflector or ASBR is
required to maintain routes for all the VPNs.
As a result, no single component within the Service Provider network
has to maintain all the routes for all the VPNs. So the total
capacity of the network to support increasing numbers of VPNs is not
limited by the capacity of any individual component.
12. Intellectual Property Considerations
Cisco Systems may seek patent or other intellectual property
protection for some or all of the technologies disclosed in this
document. If any standards arising from this document are or become
protected by one or more patents assigned to Cisco Systems, Cisco
intends to disclose those patents and license them on reasonable and
non-discriminatory terms.
13. Security Considerations
Security issues are discussed throughout this memo.
14. Acknowledgments
Significant contributions to this work have been made by Ravi
Chandra, Dan Tappan and Bob Thomas.
15. Authors' Addresses
Eric C. Rosen
Cisco Systems, Inc.
250 Apollo Drive
Chelmsford, MA, 01824
EMail: erosen@cisco.com
Yakov Rekhter
Cisco Systems, Inc.
170 Tasman Drive
San Jose, CA, 95134
EMail: yakov@cisco.com
16. References
[1] Awduche, Berger, Gan, Li, Swallow, and Srinivasan, "Extensions
to RSVP for LSP Tunnels", Work in Progress.
[2] Bates, T. and R. Chandrasekaran, "BGP Route Reflection: An
alternative to full mesh IBGP", RFC1966, June 1996.
[3] Bates, T., Chandra, R., Katz, D. and Y. Rekhter, "Multiprotocol
Extensions for BGP4", RFC2283, February 1998.
[4] Gleeson, Heinanen, and Armitage, "A Framework for IP Based
Virtual Private Networks", Work in Progress.
[5] Kent and Atkinson, "Security Architecture for the Internet
Protocol", RFC2401, November 1998.
[6] Li, "CPE based VPNs using MPLS", October 1998, Work in Progress.
[7] Li, T. and Y. Rekhter, "A Provider Architecture for
Differentiated Services and Traffic Engineering (PASTE)", RFC
2430, October 1998.
[8] Rekhter and Rosen, "Carrying Label Information in BGP4", Work in
Progress.
[9] Rosen, Viswanathan, and Callon, "Multiprotocol Label Switching
Architecture", Work in Progress.
[10] Rosen, Rekhter, Tappan, Farinacci, Fedorkow, Li, and Conta, "MPLS
Label Stack Encoding", Work in Progress.
17. Full Copyright Statement
Copyright (C) The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.