\documentstyle[12pt,twoside]{article}
\def\TITLE{Tunnels over IP}
\input preamble
\begin{center}
\Large\bf Tunnels over IP in Linux-2.2
\end{center}


\begin{center}
{ \large Alexey~N.~Kuznetsov } \\
\em Institute for Nuclear Research, Moscow \\
\verb|kuznet@ms2.inr.ac.ru| \\
\rm March 17, 1999
\end{center}

\vspace{5mm}

\tableofcontents


\section{Instead of an introduction: micro-FAQ.}

\begin{itemize}

\item
Q: In linux-2.0.36 I used:
\begin{verbatim}
    ifconfig tunl1 10.0.0.1 pointopoint 193.233.7.65
\end{verbatim}
to create a tunnel. It does not work in 2.2.0!

A: You are right, it does not work. The command above is now split into two commands.
\begin{verbatim}
    ip tunnel add MY-TUNNEL mode ipip remote 193.233.7.65
\end{verbatim}
creates a tunnel device named \verb|MY-TUNNEL|. Now you may configure
it with:
\begin{verbatim}
    ifconfig MY-TUNNEL 10.0.0.1
\end{verbatim}
Certainly, if you prefer the name \verb|tunl1| to \verb|MY-TUNNEL|,
you may still use it.

\item
Q: In linux-2.0.36 I used:
\begin{verbatim}
    ifconfig tunl0 10.0.0.1
    route add -net 10.0.0.0 gw 193.233.7.65 dev tunl0
\end{verbatim}
to tunnel net 10.0.0.0 via router 193.233.7.65. It does not
work in 2.2.0! Moreover, \verb|route| prints a funny error, something like
``network unreachable'', and afterwards I found a strange direct route
to 10.0.0.0 via \verb|tunl0| in the routing table.

A: Yes, in 2.2 the rule that a {\em normal} gateway must reside on a directly
connected network has no exceptions. You may tell the kernel that
this particular route is {\em abnormal}:
\begin{verbatim}
  ifconfig tunl0 10.0.0.1 netmask 255.255.255.255
  ip route add 10.0.0.0/8 via 193.233.7.65 dev tunl0 onlink
\end{verbatim}
Note the keyword \verb|onlink|: it is the magic key that orders the kernel
not to check the gateway address for consistency.
Probably, after this explanation you have already guessed another method
to cheat the kernel:
\begin{verbatim}
  ifconfig tunl0 10.0.0.1 netmask 255.255.255.255
  route add -host 193.233.7.65 dev tunl0
  route add -net 10.0.0.0 netmask 255.0.0.0 gw 193.233.7.65
  route del -host 193.233.7.65 dev tunl0
\end{verbatim}
Well, if you like such tricks, nobody can forbid you to use them.
Only do not forget
that between \verb|route add| and \verb|route del| host 193.233.7.65 is
unreachable.

\item
Q: In 2.0.36 I used to load the \verb|tunnel| device module and the \verb|ipip| module.
I cannot find any \verb|tunnel| in 2.2!

A: Linux-2.2 has a single module, \verb|ipip|, for both directions of tunneling
and for all IPIP tunnel devices.

\item
Q: \verb|traceroute| does not work over a tunnel! Well, stop... It works,
     only it skips some number of hops.

A: Yes. By default the tunnel driver copies the \verb|ttl| value from
the inner packet to the outer one. It means that the path traversed by tunneled
packets to the other endpoint is not hidden. If you dislike this, or if you
are going to use some routing protocol expecting that packets
with ttl 1 will reach the peering host (e.g.\ RIP, OSPF or EBGP),
and you are not afraid of
tunnel loops, you may append the option \verb|ttl 64| when creating the tunnel
with \verb|ip tunnel add|.

\item
Q: ... Well, the list of things which 2.0 was able to do ends here.

\end{itemize}
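As a rough sketch of the last answer, the commands below create the tunnel
from the first question with a fixed TTL and then display it. The name
\verb|MY-TUNNEL| and the addresses are reused from the examples above; the
value 64 is only illustrative, and the commands are assumed to be run as root
on a host where the \verb|ipip| module is available.
\begin{verbatim}
    # create the tunnel with a fixed outer TTL instead of the inherited one
    ip tunnel add MY-TUNNEL mode ipip remote 193.233.7.65 ttl 64
    ifconfig MY-TUNNEL 10.0.0.1
    # check which parameters the kernel actually recorded
    ip tunnel ls MY-TUNNEL
\end{verbatim}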

\paragraph{Summary of differences between 2.2 and 2.0.}

\begin{itemize}

\item {\bf In 2.0} you could compile the tunnel device into the kernel
	and get a set of 4 devices \verb|tunl0| ... \verb|tunl3| or,
	alternatively, compile it as a module and load a new module
	for each new tunnel. Also, the module \verb|ipip| was necessary
	to receive tunneled packets.

      {\bf 2.2} has {\em one\/} module, \verb|ipip|. Loading it gives you the base
	tunnel device \verb|tunl0|; other tunnels may be created with the command
	\verb|ip tunnel add|. These new devices may have arbitrary names.


\item {\bf In 2.0} you set the remote tunnel endpoint address with
	the command \verb|ifconfig| ... \verb|pointopoint A|.

	{\bf In 2.2} this command has the same semantics on all
	interfaces: it sets not the tunnel endpoint
	but the address of the peering host, which is directly reachable
	via this tunnel
	rather than via the Internet. The actual tunnel endpoint address \verb|A|
	should be set with \verb|ip tunnel add ... remote A|.

\item {\bf In 2.0} you created tunnel routes with the command:
\begin{verbatim}
    route add -net 10.0.0.0 gw A dev tunl0
\end{verbatim}

	{\bf 2.2} interprets this command in the same way for all device
	kinds, and the gateway is required to be directly reachable via this tunnel
	rather than via the Internet. You may still use \verb|ip route add ... onlink|
	to override this behaviour.

\end{itemize}
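The second difference can be illustrated with a minimal sketch. It assumes a
hypothetical peer whose address inside the tunnel is 10.0.0.2 and whose real
(outer) address is 193.233.7.65; neither pairing comes from a real setup.
The outer endpoint goes to \verb|ip tunnel add|, while \verb|pointopoint|
names the inner peer:
\begin{verbatim}
    # outer endpoint: where the encapsulated packets are sent
    ip tunnel add MY-TUNNEL mode ipip remote 193.233.7.65
    # inner addresses: ours and the peer reachable through the tunnel
    ifconfig MY-TUNNEL 10.0.0.1 pointopoint 10.0.0.2
\end{verbatim}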


\section{Tunnel setup: basics}

The standard Linux-2.2 kernel supports three flavors of tunnels,
listed in the following table:
\vspace{2mm}

\begin{tabular}{lll}
\vrule depth 0.8ex width 0pt\relax
Mode & Description  & Base device \\
ipip & IP over IP & tunl0 \\
sit & IPv6 over IP & sit0 \\
gre & ANY over GRE over IP & gre0
\end{tabular}

\vspace{2mm}

\noindent All kinds of tunnels are created with one command:
\begin{verbatim}
  ip tunnel add <NAME> mode <MODE> [ local <S> ] [ remote <D> ]
\end{verbatim}

This command creates a new tunnel device with the name \verb|<NAME>|.
The \verb|<NAME>| is an arbitrary string. In particular,
it may even be \verb|eth0|. The rest of the parameters set
different tunnel characteristics.

\begin{itemize}

\item
\verb|mode <MODE>| sets the tunnel mode. Three modes are available now:
	\verb|ipip|, \verb|sit| and \verb|gre|.

\item
\verb|remote <D>| sets the remote endpoint of the tunnel to the IP
	address \verb|<D>|.
\item
\verb|local <S>| sets a fixed local address for tunneled
	packets. It must be an address on another interface of this host.

\end{itemize}

\let\thefootnote\oldthefootnote

Both \verb|remote| and \verb|local| may be omitted. In this case we
say that they are zero or wildcard. Two tunnels of one mode cannot
have the same \verb|remote| and \verb|local|. In particular, it means
that the base device or fallback tunnel cannot be replicated.\footnote{
This restriction is relaxed for keyed GRE tunnels.}

Tunnels are divided into two classes: {\bf pointopoint} tunnels, which
have a non-wildcard \verb|remote| address and deliver all packets
to this destination, and {\bf NBMA} (i.e.\ Non-Broadcast Multi-Access) tunnels,
which have no \verb|remote|. In particular, the base devices (e.g.\ \verb|tunl0|)
are NBMA, because they have neither \verb|remote| nor
\verb|local| addresses.
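
To make the two classes concrete, here is a small sketch that creates one
tunnel of each kind. The device names \verb|ptp1| and \verb|nbma1| and the
addresses 192.0.2.1 and 198.51.100.7 are hypothetical placeholders; 192.0.2.1
is assumed to be configured on some other interface of the host.
\begin{verbatim}
    # pointopoint: fixed remote, every packet goes to 198.51.100.7
    ip tunnel add ptp1 mode gre local 192.0.2.1 remote 198.51.100.7
    # NBMA: remote omitted (wildcard), endpoints are supplied by routes
    ip tunnel add nbma1 mode gre local 192.0.2.1
    # list what has been created, including the base device
    ip tunnel ls
\end{verbatim}
The two devices may coexist because their \verb|remote| values differ.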


After a tunnel device is created, you should configure it as you would
any other device. Certainly, the configuration of tunnels has
some peculiarities, related to the fact that they work over the existing Internet
routing infrastructure and simultaneously create new virtual links,
which change this infrastructure. The danger that a careless
tunnel setup will result in the formation of tunnel loops,
a collapse of routing, or flooding of the network with an exponentially
growing number of tunneled fragments is very real.


Protocol setup on pointopoint tunnels does not differ from the configuration
of other devices. You should set a protocol address with \verb|ifconfig|
and add routes with the \verb|route| utility.

NBMA tunnels are different. To route something via an NBMA tunnel
you have to explain to the driver where it should deliver packets.
The only way to do this is to create special routes with a gateway
address pointing to the desired endpoint. For example:
\begin{verbatim}
    ip route add 10.0.0.0/24 via <A> dev tunl0 onlink
\end{verbatim}
It is important to use the option \verb|onlink|; otherwise the
kernel will refuse the request to create a route via a gateway not directly
reachable over device \verb|tunl0|. With IPv6 the situation is much simpler:
when you start device \verb|sit0|, it automatically configures itself
with all IPv4 addresses mapped to IPv6 space, so that the whole IPv4
Internet is {\em really reachable} via \verb|sit0|! Excellent, the command
\begin{verbatim}
    ip route add 3FFE::/16 via ::193.233.7.65 dev sit0
\end{verbatim}
will route \verb|3FFE::/16| via \verb|sit0|, sending all packets
destined to this prefix to 193.233.7.65.
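
Since the endpoint of an NBMA tunnel is chosen per route, one device can
serve several destinations. The following is only a sketch: the prefixes and
the endpoint addresses 192.0.2.10 and 198.51.100.20 are invented for
illustration.
\begin{verbatim}
    ifconfig tunl0 10.0.0.1 netmask 255.255.255.255
    # each route names its own tunnel endpoint as the gateway
    ip route add 10.1.0.0/16 via 192.0.2.10 dev tunl0 onlink
    ip route add 10.2.0.0/16 via 198.51.100.20 dev tunl0 onlink
    # show the routes that now point into the tunnel
    ip route ls dev tunl0
\end{verbatim}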

\section{Tunnel setup: options}

The command \verb|ip tunnel add| has several additional options.
\begin{itemize}

\item \verb|ttl N| --- set a fixed TTL \verb|N| on tunneled packets.
	\verb|N| is a number in the range 1--255. 0 is a special value,
	meaning that packets inherit the TTL value.
		The default value is \verb|inherit|.

\item \verb|tos T| --- set a fixed TOS \verb|T| on tunneled packets.
		The default value is \verb|inherit|.

\item \verb|dev DEV| --- bind the tunnel to device \verb|DEV|, so that
	tunneled packets will be routed only via this device and will
	not be able to escape to another device when the route to the endpoint changes.

\item \verb|nopmtudisc| --- disable Path MTU Discovery on this tunnel.
	It is enabled by default. Note that a fixed ttl is incompatible
	with this option: tunnels with a fixed ttl always do pmtu discovery.

\end{itemize}
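
A hedged sketch of how these options combine is given below. The device
names, the endpoint addresses and the choice of \verb|eth0| are assumptions
made only for this example.
\begin{verbatim}
    # fixed TTL, tunnel pinned to eth0
    ip tunnel add fixed1 mode ipip remote 198.51.100.7 ttl 64 dev eth0
    # inherited TTL (the default), Path MTU Discovery switched off
    ip tunnel add nopmtu1 mode ipip remote 198.51.100.8 \
        ttl inherit nopmtudisc
\end{verbatim}
The two options are kept on separate tunnels on purpose, because
\verb|nopmtudisc| cannot be combined with a fixed ttl.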

\verb|ipip| and \verb|sit| tunnels have no more options. \verb|gre|
tunnels are more complicated:

\begin{itemize}

\item \verb|key K| --- use keyed GRE with key \verb|K|. \verb|K| is
	either a number or an IP address-like dotted quad.

\item \verb|csum| --- checksum tunneled packets.

\item \verb|seq| --- serialize packets.
\begin{NB}
	I think this option does not
	work. At least, I did not test it, did not debug it and
	do not even understand how it is supposed to work and for what
	purpose Cisco planned to use it.
\end{NB}

\end{itemize}


Actually, these GRE options can be set separately for the input and
output directions by prefixing the corresponding keywords with the letter
\verb|i| or \verb|o|. For example, \verb|icsum| orders the kernel to accept only
packets with a correct checksum, and \verb|ocsum| means that
our host will calculate and send the checksum.
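
A minimal sketch of the per-direction form follows. The device name
\verb|gre1|, the addresses and the key values are made up for illustration;
the peer is assumed to use the mirrored keys.
\begin{verbatim}
    # accept only key 1.2.3.4, send with key 4.3.2.1,
    # verify checksums on input and generate them on output
    ip tunnel add gre1 mode gre local 192.0.2.1 remote 198.51.100.7 \
        ikey 1.2.3.4 okey 4.3.2.1 icsum ocsum
\end{verbatim}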

The command \verb|ip tunnel add| is not the only operation
that can be performed on tunnels. Certainly, you may get a short help page
with:
\begin{verbatim}
    ip tunnel help
\end{verbatim}

Besides that, you may view the list of installed tunnels with the command:
\begin{verbatim}
    ip tunnel ls
\end{verbatim}
You may also look at statistics:
\begin{verbatim}
    ip -s tunnel ls Cisco
\end{verbatim}
where \verb|Cisco| is the name of a tunnel device. The command
\begin{verbatim}
    ip tunnel del Cisco
\end{verbatim}
destroys the tunnel \verb|Cisco|. And, finally,
\begin{verbatim}
    ip tunnel change Cisco mode sit local ME remote HE ttl 32
\end{verbatim}
changes its parameters.

\section{Differences between 2.2 and 2.0 tunnels revisited.}

Now we can discuss more subtle differences between tunneling in 2.0
and 2.2.

\begin{itemize}

\item In 2.0 all tunneled packets were received promiscuously
as soon as you loaded the module \verb|ipip|. 2.2 tries to select the best
tunnel device, and the packet looks as if it was received on that device.
For example, if the host
receives an \verb|ipip| packet from host \verb|D| destined to our
local address \verb|S|, the kernel searches for matching tunnels
in this order:

\begin{tabular}{ll}
1 & \verb|remote| is \verb|D| and \verb|local| is \verb|S| \\
2 & \verb|remote| is \verb|D| and \verb|local| is wildcard \\
3 & \verb|remote| is wildcard and \verb|local| is \verb|S| \\
4 & \verb|tunl0|
\end{tabular}

If a tunnel exists but is not in the \verb|UP| state, it is ignored.
Note that if \verb|tunl0| is \verb|UP|, it receives all the IPIP packets
not claimed by more specific tunnels.
Be careful: it means that without carefully installed firewall rules
anyone on the Internet may inject into your network any packets with
source addresses indistinguishable from local ones. It is not a bad idea
to design tunnels in a way that enforces maximal route symmetry
and to enable the reverse path filter (the \verb|rp_filter| sysctl option) on
tunnel devices.

\item In 2.2 you can monitor and debug tunnels with \verb|tcpdump|.
For example, \verb|tcpdump| \verb|-i Cisco| \verb|-nvv| will dump the packets
that the kernel sends via tunnel \verb|Cisco| and the packets received on it,
from the kernel's viewpoint.

\end{itemize}
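
The precautions from the first item might look as follows for a tunnel named
\verb|Cisco|. This is a sketch under the assumption that the fallback device
\verb|tunl0| is not needed on this host; it is not a complete firewall setup.
\begin{verbatim}
    # enable the reverse path filter on the tunnel device
    echo 1 > /proc/sys/net/ipv4/conf/Cisco/rp_filter
    # keep the fallback device down, so that it does not
    # accept IPIP packets unmatched by specific tunnels
    ip link set tunl0 down
    # watch what actually passes through the tunnel
    tcpdump -i Cisco -nvv
\end{verbatim}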


\section{Linux and Cisco IOS tunnels.}

Among other tunnels, Cisco IOS supports IPIP and GRE.
Essentially, the Cisco setup is a subset of the options available for Linux.
Let us consider the simplest example:

\begin{verbatim}
interface Tunnel0
 tunnel mode gre ip
 tunnel source 10.10.14.1
 tunnel destination 10.10.13.2
\end{verbatim}


This set of commands translates to:

\begin{verbatim}
    ip tunnel add Tunnel0 \
        mode gre \
        local 10.10.14.1 \
        remote 10.10.13.2
\end{verbatim}

Any questions? No questions.
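
Well, one remark: the tunnel device still has to be configured like any
other device before it carries traffic. A possible continuation is sketched
below; the inner address 10.10.15.1/30 is purely an assumption and does not
come from the Cisco fragment above.
\begin{verbatim}
    # assign an address inside the tunnel and activate the device
    ip addr add 10.10.15.1/30 dev Tunnel0
    ip link set Tunnel0 up
\end{verbatim}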

\section{Interaction of IPIP tunnels and DVMRP.}

DVMRP exploits IPIP tunnels to route multicasts via the Internet.
\verb|mrouted| automatically creates the
IPIP tunnels listed in its configuration file.
From the kernel and user viewpoints there are no differences between
tunnels created in this way and tunnels created by \verb|ip tunnel|.
I.e.\ if \verb|mrouted| created some tunnel, it may be used to
route unicast packets, provided appropriate routes are added.
And vice versa, if the administrator has already created a tunnel,
it will be reused by \verb|mrouted| if it requests a DVMRP
tunnel with the same local and remote addresses.

Do not be surprised if your manually configured tunnel is
destroyed when \verb|mrouted| exits.


\section{Broadcast GRE ``tunnels''.}

It is possible to set \verb|remote| for a GRE tunnel to a multicast
address. Such a tunnel becomes a {\bf broadcast} tunnel (though the word
tunnel is not quite appropriate in this case; it is rather a virtual network).
\begin{verbatim}
  ip tunnel add Universe mode gre local 193.233.7.65 \
                         remote 224.66.66.66 ttl 16
  ip addr add 10.0.0.1/16 dev Universe
  ip link set Universe up
\end{verbatim}
This tunnel is a true broadcast network, and broadcast packets are
sent to the multicast group 224.66.66.66. By default such a tunnel starts
to resolve both IP and IPv6 addresses via ARP/NDISC, so that
if multicast routing is supported in the surrounding network, all GRE nodes
will find one another automatically and will form a virtual Ethernet-like
broadcast network. If multicast routing does not work, it is an unpleasant
but not fatal flaw: the tunnel becomes an NBMA rather than a broadcast network.
You may disable dynamic ARPing with:
\begin{verbatim}
  echo 0 > /proc/sys/net/ipv4/neigh/Universe/mcast_solicit
\end{verbatim}
and add the required information to the ARP tables manually:
\begin{verbatim}
  ip neigh add 10.0.0.2 lladdr 128.6.190.2 dev Universe nud permanent
\end{verbatim}
In this case packets sent to 10.0.0.2 will be encapsulated in GRE
and sent to 128.6.190.2. It is possible to facilitate address resolution
using methods typical for other NBMA networks, e.g.\ to start the user-level
\verb|arpd| daemon, which will maintain a database of hosts attached
to the GRE virtual network, or to query a
dedicated ARP or NHRP server.
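
For completeness, here is a sketch of the matching setup on the second node,
the host 128.6.190.2 from the neighbour entry above. It assumes that this
host owns the inner address 10.0.0.2 and joins the same multicast group.
\begin{verbatim}
  ip tunnel add Universe mode gre local 128.6.190.2 \
                         remote 224.66.66.66 ttl 16
  ip addr add 10.0.0.2/16 dev Universe
  ip link set Universe up
  # inspect the neighbours resolved (or added) on the virtual network
  ip neigh ls dev Universe
\end{verbatim}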


Actually, such a setup is the most natural for tunneling;
it is really flexible, scalable and easily manageable, so
it is strongly recommended for GRE tunnels instead of the ugly
hack with NBMA mode and the \verb|onlink| modifier. Unfortunately,
for historical reasons broadcast mode is not supported by IPIP tunnels,
but this will probably change in the future.



\section{Traffic control issues.}

Tunnels are devices, hence all the power of Linux traffic control
applies to them. The simplest (and the most useful in practice)
example is limiting tunnel bandwidth. The following command:
\begin{verbatim}
    tc qdisc add dev tunl0 root tbf \
        rate 128Kbit burst 4K limit 10K
\end{verbatim}
will limit tunneled traffic to 128Kbit with a maximal burst size of 4K
and will queue not more than 10K.
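
To check that the limit is in effect, and to remove it again, something like
the following should do. It is a sketch that assumes the \verb|tbf| queueing
discipline was attached to \verb|tunl0| as the root qdisc.
\begin{verbatim}
    # show the qdisc together with its byte, packet and drop counters
    tc -s qdisc ls dev tunl0
    # detach it, returning the device to its default behaviour
    tc qdisc del dev tunl0 root
\end{verbatim}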

However, you should remember that tunnels are {\em virtual} devices
implemented in software, and true queue management is impossible for them
just because they have no queues. Instead, it is better to create classes
on real physical interfaces and to map tunneled packets to them.
In the general case of dynamic routing you should create such classes
on all outgoing interfaces or, alternatively,
use the option \verb|dev DEV| to bind the tunnel to a fixed physical device.
In the latter case packets will be routed only via the specified device
and you need to set up the corresponding classes only on it.
You have to pay for this convenience, though:
if routing changes, your tunnel will fail.

Suppose that CBQ class \verb|1:ABC| has been created on device \verb|eth0|
specially for tunnel \verb|Cisco| with endpoints \verb|S| and \verb|D|.
Now you can select IPIP packets with addresses \verb|S| and \verb|D|
with some classifier and map them to class \verb|1:ABC|. For example,
it is easy to do with the \verb|rsvp| classifier:
\begin{verbatim}
    tc filter add dev eth0 pref 100 proto ip rsvp \
        session D ipproto ipip filter S \
        classid 1:ABC
\end{verbatim}

If you want a more detailed classification of the sub-flows
transmitted via the tunnel, you can build a CBQ subtree
rooted at \verb|1:ABC| and attach to the subroot a set of rules parsing
IPIP packets more deeply.

\end{document}