1
Author: Soren Hansen <soren@ubuntu.com>
2
Description: Revert changes between 1.4.1.1-3 and 1.4.1.1-4, thus bringing back
6
Index: iptables-1.4.12/howtos/Makefile
7
===================================================================
8
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
9
+++ iptables-1.4.12/howtos/Makefile 2011-11-07 13:57:14.000000000 -0600
12
+ for i in *.sgml; do sgml2html $$i; done
15
+ for i in *.html; do install -D -m 0644 $$i ${DESTDIR}/howtos/$$i; done
20
+.PHONY: all clean install
21
Index: iptables-1.4.12/howtos/NAT-HOWTO.sgml
22
===================================================================
23
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
24
+++ iptables-1.4.12/howtos/NAT-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600
26
+<!doctype linuxdoc system>
28
+<!-- This is the Linux NAT HOWTO.
31
+<!-- $Id: NAT-HOWTO.sgml,v 1.18 2002/01/14 09:35:13 laforge Exp $ -->
35
+<!-- Title information -->
37
+<title>Linux 2.4 NAT HOWTO
38
+<author>Rusty Russell, mailing list <tt>netfilter@lists.samba.org</tt>
39
+<date>$Revision: 1.18 $ $Date: 2002/01/14 09:35:13 $
41
+This document describes how to do masquerading, transparent proxying,
42
+port forwarding, and other forms of Network Address Translations with
43
+the 2.4 Linux Kernels.
46
+<!-- Table of contents -->
49
+<!-- Begin the document -->
51
+<sect>Introduction<label id="intro">
54
+Welcome, gentle reader.
57
+You are about to delve into the fascinating (and sometimes horrid)
58
+world of NAT: Network Address Translation, and this HOWTO is going to
59
+be your somewhat accurate guide to the 2.4 Linux Kernel and beyond.
61
+<p>In Linux 2.4, an infrastructure for mangling packets was
62
+introduced, called `netfilter'. A layer on top of this provides NAT,
63
+completely reimplemented from previous kernels.
65
+<p>(C) 2000 Paul `Rusty' Russell. Licensed under the GNU GPL.
67
+<sect>Where is the official Web Site and List?
69
+<p>There are three official sites:
71
+<item>Thanks to <url url="http://netfilter.filewatcher.org/" name="Filewatcher">.
72
+<item>Thanks to <url url="http://netfilter.samba.org/" name="The Samba Team and SGI">.
73
+<item>Thanks to <url url="http://netfilter.gnumonks.org/" name="Harald Welte">.
76
+<p>You can reach all of them using round-robin DNS via
77
+<url url="http://www.netfilter.org/"> and <url url="http://www.iptables.org/">
79
+<p>For the official netfilter mailing list, see
80
+<url url="http://www.netfilter.org/contact.html#list" name="netfilter List">.
82
+<sect1>What is Network Address Translation?
85
+Normally, packets on a network travel from their source (such as your
86
+home computer) to their destination (such as www.gnumonks.org)
87
+through many different links: about 19 from where I am in Australia.
88
+None of these links really alter your packet: they just send it onward.
92
+If one of these links were to do NAT, then it would alter the source
+or destination of the packet as it passes through. As you can
94
+imagine, this is not how the system was designed to work, and hence
95
+NAT is always something of a crock. Usually the link doing NAT will
96
+remember how it mangled a packet, and when a reply packet passes
97
+through the other way, it will do the reverse mangling on that reply
98
+packet, so everything works.
100
+<sect1>Why Would I Want To Do NAT?
102
+<p>In a perfect world, you wouldn't. Meanwhile, the main reasons are:
105
+<tag/Modem Connections To The Internet/ Most ISPs give you a single IP
106
+address when you dial up to them. You can send out packets with any
107
+source address you want, but only replies to packets with this source
108
+IP address will return to you. If you want to use multiple different
109
+machines (such as a home network) to connect to the Internet through
110
+this one link, you'll need NAT.
112
+<p>This is by far the most common use of NAT today, commonly known as
113
+`masquerading' in the Linux world. I call this SNAT, because you
114
+change the <bf>source</bf> address of the first packet.
116
+<tag/Multiple Servers/ Sometimes you want to change where packets
117
+heading into your network will go. Frequently this is because (as
118
+above), you have only one IP address, but you want people to be able
119
+to get into the boxes behind the one with the `real' IP address. If
120
+you rewrite the destination of incoming packets, you can manage this.
121
+This type of NAT was called port-forwarding under previous versions of Linux.
124
+<p>A common variation of this is load-sharing, where the mapping
125
+ranges over a set of machines, fanning packets out to them. If you're
126
+doing this on a serious scale, you may want to look at
128
+<url url="http://linuxvirtualserver.org/" name="Linux Virtual Server">.
130
+<tag/Transparent Proxying/ Sometimes you want to pretend that each
131
+packet which passes through your Linux box is destined for a program
132
+on the Linux box itself. This is used to make transparent proxies: a
133
+proxy is a program which stands between your network and the outside
134
+world, shuffling communication between the two. The transparent part
135
+is because your network won't even know it's talking to a proxy,
136
+unless of course, the proxy doesn't work.
138
+<p>Squid can be configured to work this way, and it is called
139
+redirection or transparent proxying under previous Linux versions.
142
+<sect>The Two Types of NAT
144
+<p>I divide NAT into two different types: <bf>Source NAT</bf> (SNAT)
145
+and <bf>Destination NAT</bf> (DNAT).
147
+<p>Source NAT is when you alter the source address of the first
148
+packet: i.e. you are changing where the connection is coming from.
149
+Source NAT is always done post-routing, just before the packet goes
150
+out onto the wire. Masquerading is a specialized form of SNAT.
152
+<p>Destination NAT is when you alter the destination address of the
153
+first packet: i.e. you are changing where the connection is going to.
154
+Destination NAT is always done before routing, when the packet first
155
+comes off the wire. Port forwarding, load sharing, and transparent
156
+proxying are all forms of DNAT.
158
+<sect>Quick Translation From 2.0 and 2.2 Kernels
160
+<p>Sorry to those of you still shell-shocked from the 2.0 (ipfwadm) to
161
+2.2 (ipchains) transition. There's good and bad news.
163
+<p>Firstly, you can simply use ipchains and ipfwadm as before. To do
164
+this, you need to insmod the `ipchains.o' or `ipfwadm.o' kernel
165
+modules found in the latest netfilter distribution. These are
166
+mutually exclusive (you have been warned), and should not be combined
167
+with any other netfilter modules.
169
+<p>Once one of these modules is installed, you can use ipchains and
170
+ipfwadm as normal, with the following differences:
173
+<item> Setting the masquerading timeouts with ipchains -M -S, or
174
+ ipfwadm -M -s does nothing. Since the timeouts are longer for
175
+ the new NAT infrastructure, this should not matter.
177
+<item> The init_seq, delta and previous_delta fields in the verbose
178
+ masquerade listing are always zero.
180
+<item> Zeroing and listing the counters at the same time `-Z -L' does
181
+ not work any more: the counters will not be zeroed.
183
+<item> The backward compatibility layer doesn't scale very well for
184
+ large numbers of connections: don't use it for your corporate
188
+Hackers may also notice:
191
+<item> You can now bind to ports 61000-65095 even if you're
192
+ masquerading. The masquerading code used to assume anything
193
+ in this range was fair game, so programs couldn't use it.
195
+<item> The (undocumented) `getsockname' hack, which transparent proxy
196
+ programs could use to find out the real destinations of
197
+ connections no longer works.
199
+<item> The (undocumented) bind-to-foreign-address hack is also not
200
+ implemented; this was used to complete the illusion of
201
+ transparent proxying.
205
+<sect1> I just want masquerading! Help!
207
+<p>This is what most people want. If you have a dynamically allocated
208
+IP PPP dialup (if you don't know, this is you), you simply want to
209
+tell your box that all packets coming from your internal network
210
+should be made to look like they are coming from the PPP dialup box.
213
+# Load the NAT module (this pulls in all the others).
214
+modprobe iptable_nat
216
+# In the NAT table (-t nat), Append a rule (-A) after routing
217
+# (POSTROUTING) for all packets going out ppp0 (-o ppp0) which says to
218
+# MASQUERADE the connection (-j MASQUERADE).
219
+iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
221
+# Turn on IP forwarding
222
+echo 1 > /proc/sys/net/ipv4/ip_forward
225
+Note that you are not doing any packet filtering here: for that, see
226
+the Packet Filtering HOWTO: `Mixing NAT and Packet Filtering'.
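The three steps above can be wrapped up for any outgoing interface. This is only a minimal sketch: `masq_commands` is a hypothetical helper name, not part of iptables, and it merely prints the commands so that you can read them before running anything.

```shell
#!/bin/sh
# Hypothetical helper: print the masquerading setup commands for one
# outgoing interface. Nothing is executed here.
masq_commands() {
    iface=$1    # outgoing interface, e.g. ppp0
    echo "modprobe iptable_nat"
    echo "iptables -t nat -A POSTROUTING -o $iface -j MASQUERADE"
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"
}

masq_commands ppp0
```

Once the printed commands look right, something like `masq_commands ppp0 | sh` run as root should apply them.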
228
+<sect1> What about ipmasqadm?
230
+<p>This is a much more niche user base, so I didn't worry about
231
+backward compatibility as much. You can simply use `iptables -t nat'
232
+to do port forwarding. So for example, in Linux 2.2 you might have
237
+# Forward TCP packets going to port 8080 on 1.2.3.4 to 192.168.1.1's port 80
238
+ipmasqadm portfw -a -P tcp -L 1.2.3.4 8080 -R 192.168.1.1 80
245
+# Append a rule before routing (-A PREROUTING) to the NAT table (-t nat) that
246
+# TCP packets (-p tcp) going to 1.2.3.4 (-d 1.2.3.4) port 8080 (--dport 8080)
247
+# have their destination mapped (-j DNAT) to 192.168.1.1, port 80
248
+# (--to 192.168.1.1:80).
249
+iptables -A PREROUTING -t nat -p tcp -d 1.2.3.4 --dport 8080 \
+        -j DNAT --to 192.168.1.1:80
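If you have several old `ipmasqadm portfw' lines to convert, the mapping between the two syntaxes is mechanical. The sketch below assumes the five-argument form shown above (protocol, local address and port, remote address and port); `portfw_to_iptables` is a made-up name, and it spells out `--to-destination', the full form of `--to'.

```shell
#!/bin/sh
# Hypothetical helper: print the iptables rule equivalent to
#   ipmasqadm portfw -a -P PROTO -L LADDR LPORT -R RADDR RPORT
portfw_to_iptables() {
    proto=$1 laddr=$2 lport=$3 raddr=$4 rport=$5
    echo "iptables -t nat -A PREROUTING -p $proto -d $laddr --dport $lport -j DNAT --to-destination $raddr:$rport"
}

portfw_to_iptables tcp 1.2.3.4 8080 192.168.1.1 80
```

The helper only prints the rule; review it before running it as root.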
253
+<sect>Controlling What To NAT
255
+<p>You need to create NAT rules which tell the kernel what connections
256
+to change, and how to change them. To do this, we use the very
257
+versatile <tt>iptables</tt> tool, and tell it to alter the NAT table by
258
+specifying the `-t nat' option.
260
+<p>The table of NAT rules contains three lists called `chains': the
+rules in a chain are examined in order until one matches. The two
+chains that concern us are called PREROUTING (for Destination NAT, as
+packets first come in), and POSTROUTING (for Source NAT, as packets
+leave). The third (OUTPUT) will be ignored here.
266
+<p>The following diagram would illustrate it quite well if I had any
272
+      _____                                     _____
+     /     \                                   /     \
+   PREROUTING -->[Routing ]----------------->POSTROUTING----->
+     \D-NAT/     [Decision]                    \S-NAT/
+                     |                            ^
+                     |                            |
+                     |                            |
+                     --------> Local Process ------
284
+At each of the points above, when a packet passes we look up what
285
+connection it is associated with. If it's a new connection, we look
286
+up the corresponding chain in the NAT table to see what to do with it.
287
+The answer it gives will apply to all future packets on that connection.
290
+<sect1>Simple Selection using iptables
292
+<p><tt>iptables</tt> takes a number of standard options as listed
293
+below. All the double-dash options can be abbreviated, as long as
294
+<tt>iptables</tt> can still tell them apart from the other possible
295
+options. If your kernel has iptables support as a module, you'll need
296
+to load the ip_tables.o module first: `insmod ip_tables'.
298
+<p>The most important option here is the table selection option, `-t'.
299
+For all NAT operations, you will want to use `-t nat' for the NAT
300
+table. The second most important option to use is `-A' to append a
301
+new rule at the end of the chain (e.g. `-A POSTROUTING'), or `-I' to
302
+insert one at the beginning (e.g. `-I PREROUTING').
304
+<p>You can specify the source (`-s' or `--source') and destination
305
+(`-d' or `--destination') of the packets you want to NAT. These
306
+options can be followed by a single IP address (e.g. 192.168.1.1), a
307
+name (e.g. www.gnumonks.org), or a network address
308
+(e.g. 192.168.1.0/24 or 192.168.1.0/255.255.255.0).
310
+<p>You can specify the incoming (`-i' or `--in-interface') or outgoing
311
+(`-o' or `--out-interface') interface to match, but which you can
312
+specify depends on which chain you are putting the rule into: at
313
+PREROUTING you can only select incoming interface, and at POSTROUTING
314
+you can only select outgoing interface. If you use the
315
+wrong one, <tt>iptables</tt> will give an error.
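The interface rule above is easy to check before you hand a rule to <tt>iptables</tt>. This is a rough sketch covering only the two chains discussed here; the function name and return codes are inventions for illustration.

```shell
#!/bin/sh
# Hypothetical check: may this interface option be used in this nat chain?
# Returns 0 = allowed, 1 = not allowed, 2 = chain not covered by this sketch.
iface_opt_ok() {
    chain=$1 opt=$2
    case "$chain:$opt" in
        PREROUTING:-i|PREROUTING:--in-interface)    return 0 ;;
        POSTROUTING:-o|POSTROUTING:--out-interface) return 0 ;;
        PREROUTING:*|POSTROUTING:*)                 return 1 ;;
        *)                                          return 2 ;;
    esac
}

if iface_opt_ok POSTROUTING -o; then
    echo "POSTROUTING accepts -o"
fi
```

<tt>iptables</tt> itself remains the authority: the check only mirrors the rule as stated in the text.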
317
+<sect1>Finer Points Of Selecting What Packets To Mangle
319
+<p>I said above that you can specify a source and destination address.
320
+If you omit the source address option, then any source address will
321
+do. If you omit the destination address option, then any destination
+address will do.
324
+<p>You can also indicate a specific protocol (`-p' or `--protocol'),
325
+such as TCP or UDP; only packets of this protocol will match the rule.
326
+The main reason for doing this is that specifying a protocol of tcp or
327
+udp then allows extra options: specifically the `--source-port' and
328
+`--destination-port' options (abbreviated as `--sport' and `--dport').
330
+<p>These options allow you to specify that only packets with a certain
331
+source and destination port will match the rule. This is useful for
332
+redirecting web requests (TCP port 80 or 8080) and leaving other
+packets alone.
335
+<p>These options must follow the `-p' option (which has a side-effect
336
+of loading the shared library extension for that protocol). You can
337
+use port numbers, or a name from the /etc/services file.
339
+<p>All the different qualities you can select a packet by are detailed
340
+in painful detail in the manual page (<tt>man iptables</tt>).
342
+<sect>Saying How To Mangle The Packets
344
+<p>So now we know how to select the packets we want to mangle. To
345
+complete our rule, we need to tell the kernel exactly what we want it
346
+to do to the packets.
350
+<p>You want to do Source NAT; change the source address of connections
351
+to something different. This is done in the POSTROUTING chain, just
352
+before it is finally sent out; this is an important detail, since it
353
+means that anything else on the Linux box itself (routing, packet
354
+filtering) will see the packet unchanged. It also means that the `-o'
355
+(outgoing interface) option can be used.
357
+<p>Source NAT is specified using `-j SNAT', and the `--to-source'
358
+option specifies an IP address, a range of IP addresses, and an
359
+optional port or range of ports (for UDP and TCP protocols only).
362
+## Change source addresses to 1.2.3.4.
363
+# iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 1.2.3.4
365
+## Change source addresses to 1.2.3.4, 1.2.3.5 or 1.2.3.6
366
+# iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 1.2.3.4-1.2.3.6
368
+## Change source addresses to 1.2.3.4, ports 1-1023
369
+# iptables -t nat -A POSTROUTING -p tcp -o eth0 -j SNAT --to 1.2.3.4:1-1023
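The three forms above differ only in the argument given to `--to-source' (abbreviated `--to' in the examples). The sketch below assembles that argument from its parts; `snat_to` is a hypothetical helper, and it does no validation of the addresses or ports you pass in.

```shell
#!/bin/sh
# Hypothetical helper: build the --to-source argument.
#   snat_to FIRST_ADDR [LAST_ADDR] [PORT_OR_PORT_RANGE]
# Pass an empty string for LAST_ADDR to give ports with a single address.
snat_to() {
    spec=$1
    if [ -n "${2:-}" ]; then spec="$spec-$2"; fi   # address range
    if [ -n "${3:-}" ]; then spec="$spec:$3"; fi   # port or port range
    echo "--to-source $spec"
}

snat_to 1.2.3.4
snat_to 1.2.3.4 1.2.3.6
snat_to 1.2.3.4 "" 1-1023
```

Remember that a port specification only makes sense in a rule that also selects `-p tcp' or `-p udp'.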
374
+<p>There is a specialized case of Source NAT called masquerading: it
375
+should only be used for dynamically-assigned IP addresses, such as
376
+standard dialups (for static IP addresses, use SNAT above).
378
+<p>You don't need to put in the source address explicitly with
379
+masquerading: it will use the source address of the interface the
380
+packet is going out from. But more importantly, if the link goes
381
+down, the connections (which are now lost anyway) are forgotten,
382
+meaning fewer glitches when the connection comes back up with a new IP
+address.
386
+## Masquerade everything out ppp0.
387
+# iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
390
+<sect1>Destination NAT
392
+<p>This is done in the PREROUTING chain, just as the packet comes in;
393
+this means that anything else on the Linux box itself (routing, packet
394
+filtering) will see the packet going to its `real' destination. It
395
+also means that the `-i' (incoming interface) option can be used.
397
+<p>Destination NAT is specified using `-j DNAT', and the
398
+`--to-destination' option specifies an IP address, a range of IP
399
+addresses, and an optional port or range of ports (for UDP and TCP
+protocols only).
403
+## Change destination addresses to 5.6.7.8
404
+# iptables -t nat -A PREROUTING -i eth0 -j DNAT --to 5.6.7.8
406
+## Change destination addresses to 5.6.7.8, 5.6.7.9 or 5.6.7.10.
407
+# iptables -t nat -A PREROUTING -i eth0 -j DNAT --to 5.6.7.8-5.6.7.10
409
+## Change destination addresses of web traffic to 5.6.7.8, port 8080.
410
+# iptables -t nat -A PREROUTING -p tcp --dport 80 -i eth0 \
+        -j DNAT --to 5.6.7.8:8080
416
+<p>There is a specialized case of Destination NAT called redirection:
417
+it is a simple convenience which is exactly equivalent to doing DNAT
418
+to the address of the incoming interface.
421
+## Send incoming port-80 web traffic to our squid (transparent) proxy
422
+# iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 \
+        -j REDIRECT --to-port 3128
426
+Note that squid needs to be configured to know it's a transparent proxy!
428
+<sect1>Mappings In Depth
430
+<p>There are some subtleties to NAT which most people will never have
431
+to deal with. They are documented here for the curious.
433
+<sect2>Selection Of Multiple Addresses in a Range
435
+<p>If a range of IP addresses is given, the IP address to use is
436
+chosen based on the least currently used IP for connections the
437
+machine knows about. This gives primitive load-balancing.
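To make the idea of "least currently used" concrete, here is a toy illustration in shell. It is not the kernel's code: it simply assumes you have one `ADDRESS COUNT' pair per line (a made-up input format) and picks the address with the smallest count.

```shell
#!/bin/sh
# Toy illustration only: choose the address with the fewest connections.
# Input on stdin: one "ADDRESS COUNT" pair per line (hypothetical format).
least_used() {
    awk 'NR == 1 || $2 < min { min = $2; addr = $1 } END { print addr }'
}

printf '%s\n' '1.2.3.4 7' '1.2.3.5 2' '1.2.3.6 5' | least_used
```

Each new connection going to the least loaded address is what spreads the load roughly evenly across the range.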
439
+<sect2>Creating Null NAT Mappings
441
+<p>You can use the `-j ACCEPT' target to let a connection through
442
+without any NAT taking place.
444
+<sect2>Standard NAT Behavior
446
+<p>The default behavior is to alter the connection as little as
447
+possible, within the constraints of the rule given by the user. This
448
+means we won't remap ports unless we have to.
450
+<sect2>Implicit Source Port Mapping
452
+<p>Even when no NAT is requested for a connection, source port
453
+translation may occur implicitly, if another connection has been
454
+mapped over the new one. Consider the case of masquerading, which
458
+<item> A web connection is established by a box 192.1.1.1 from port
459
+ 1024 to www.netscape.com port 80.
461
+<item> This is masqueraded by the masquerading box to use its source
462
+ IP address (1.2.3.4).
464
+<item> The masquerading box tries to make a web connection to
465
+ www.netscape.com port 80 from 1.2.3.4 (its external interface
466
+ address) port 1024.
468
+<item> The NAT code will alter the source port of the second
469
+ connection to 1025, so that the two don't clash.
472
+<p>When this implicit source mapping occurs, ports are divided into
+three classes:
475
+<item> Ports below 512
476
+<item> Ports between 512 and 1023
477
+<item> Ports 1024 and above.
480
+A port will never be implicitly mapped into a different class.
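The class boundaries can be written down as a small shell sketch. The names `low', `mid' and `high' are labels invented here for illustration; only the boundaries come from the list above.

```shell
#!/bin/sh
# Classify a port using the boundaries listed above.
port_class() {
    if [ "$1" -lt 512 ]; then
        echo low        # ports below 512
    elif [ "$1" -le 1023 ]; then
        echo mid        # ports between 512 and 1023
    else
        echo high       # ports 1024 and above
    fi
}

# Succeeds if an implicit mapping from port $1 to port $2 stays in one class.
same_class() {
    [ "$(port_class "$1")" = "$(port_class "$2")" ]
}

same_class 1024 1025 && echo "1024 -> 1025 is a permissible implicit mapping"
```

So a clashing source port of 1024 may be moved to 1025, but never to a port below 1024.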
482
+<sect2>What Happens When NAT Fails
484
+<p>If there is no way to uniquely map a connection as the user
485
+requests, it will be dropped. This also applies to packets which
486
+could not be classified as part of any connection, because they are
487
+malformed, or the box is out of memory, etc.
489
+<sect2>Multiple Mappings, Overlap and Clashes
491
+<p>You can have NAT rules which map packets onto the same range; the
492
+NAT code is clever enough to avoid clashes. Hence having two rules
493
+which map the source address 192.168.1.1 and 192.168.1.2 respectively
494
+onto 1.2.3.4 is fine.
496
+<p>Furthermore, you can map over real, used IP addresses, as long as
497
+those addresses pass through the mapping box as well. So if you have
498
+an assigned network (1.2.3.0/24), but have one internal network using
499
+those addresses and one using the Private Internet Addresses
500
+192.168.1.0/24, you can simply NAT the 192.168.1.0/24 source addresses
501
+onto the 1.2.3.0 network, without fear of clashing:
504
+# iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth1 \
+        -j SNAT --to 1.2.3.0/24
508
+<p>The same logic applies to addresses used by the NAT box itself:
509
+this is how masquerading works (by sharing the interface address
510
+between masqueraded packets and `real' packets coming from the box
+itself).
513
+<p>Moreover, you can map the same packets onto many different targets,
514
+and they will be shared. For example, if you don't want to map
515
+anything over 1.2.3.5, you could do:
518
+# iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth1 \
+        -j SNAT --to 1.2.3.0-1.2.3.4 --to 1.2.3.6-1.2.3.254
522
+<sect2>Altering the Destination of Locally-Generated Connections
524
+<p>The NAT code allows you to insert DNAT rules in the OUTPUT chain,
525
+but this is not fully supported in 2.4 (it can be, but it requires a
526
+new configuration option, some testing, and a fair bit of coding, so
527
+unless someone contracts Rusty to write it, I wouldn't expect it
530
+<p>The current limitation is that you can only change the destination
531
+to the local machine (e.g. `-j DNAT --to 127.0.0.1'), not to any other
532
+machine, otherwise the replies won't be translated correctly.
534
+<sect>Special Protocols
536
+<p>Some protocols do not like being NAT'ed. For each of these
537
+protocols, two extensions must be written; one for the connection
538
+tracking of the protocol, and one for the actual NAT.
540
+<p>Inside the netfilter distribution, there are currently modules for
541
+ftp: ip_conntrack_ftp.o and ip_nat_ftp.o. If you insmod these into
542
+your kernel (or you compile them in permanently), then doing any kind
543
+of NAT on ftp connections should work. If you don't, then you can
544
+only use passive ftp, and even that might not work reliably if you're
545
+doing more than simple Source NAT.
547
+<sect>Caveats on NAT
549
+<p>If you are doing NAT on a connection, all packets passing
550
+<bf>both</bf> ways (in and out of the network) must pass through the
551
+NAT'ed box, otherwise it won't work reliably. In particular, the
552
+connection tracking code reassembles fragments, which means that not
553
+only will connection tracking not be reliable, but your packets may
554
+not get through at all, as fragments will be withheld.
556
+<sect>Source NAT and Routing
558
+<p>If you are doing SNAT, you will want to make sure that every
559
+machine the SNAT'ed packets go to will send replies back to the NAT
560
+box. For example, if you are mapping some outgoing packets onto the
561
+source address 1.2.3.4, then the outside router must know that it is
562
+to send reply packets (which will have <bf>destination</bf> 1.2.3.4)
563
+back to this box. This can be done in the following ways:
566
+<item> If you are doing SNAT onto the box's own address (for which
567
+        routing and everything already works), you don't need to do
+        anything.
570
+<item> If you are doing SNAT onto an unused address on the local LAN
571
+ (for example, you're mapping onto 1.2.3.99, a free IP on your
572
+ 1.2.3.0/24 network), your NAT box will need to respond to ARP
573
+ requests for that address as well as its own: the easiest way
574
+ to do this is create an IP alias, e.g.:
576
+# ip address add 1.2.3.99 dev eth0
579
+<item> If you are doing SNAT onto a completely different address, you
580
+ will have to ensure that the machines the SNAT packets will hit
581
+ will route this address back to the NAT box. This is already
582
+ achieved if the NAT box is their default gateway, otherwise you
583
+ will need to advertise a route (if running a routing protocol)
584
+ or manually add routes to each machine involved.
587
+<sect>Destination NAT Onto the Same Network
589
+<p>If you are doing port forwarding back onto the same network, you
590
+need to make sure that both future packets and reply packets pass
591
+through the NAT box (so they can be altered). The NAT code will now
592
+(since 2.4.0-test6) block the outgoing ICMP redirect which is
593
+produced when the NAT'ed packet heads out the same interface it came
594
+in on, but the receiving server will still try to reply directly to
595
+the client (which won't recognize the reply).
597
+<p>The classic case is that internal staff try to access your `public'
598
+web server, which is actually DNAT'ed from the public address
599
+(1.2.3.4) to an internal machine (192.168.1.1), like so:
602
+# iptables -t nat -A PREROUTING -d 1.2.3.4 \
+        -p tcp --dport 80 -j DNAT --to 192.168.1.1
606
+<p>One way is to run an internal DNS server which knows the real
607
+(internal) IP address of your public web site, and forwards all other
608
+requests to an external DNS server. This means that the logging on
609
+your web server will show the internal IP addresses correctly.
611
+<p>The other way is to have the NAT box also map the source IP address
612
+to its own for these connections, fooling the server into replying
613
+through it. In this example, we would do the following (assuming the
614
+internal IP address of the NAT box is 192.168.1.250):
617
+# iptables -t nat -A POSTROUTING -d 192.168.1.1 -s 192.168.1.0/24 \
+        -p tcp --dport 80 -j SNAT --to 192.168.1.250
621
+Because the <bf>PREROUTING</bf> rule gets run first, the packets will
622
+already be destined for the internal web server: we can tell which
623
+ones are internally sourced by the source IP addresses.
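The DNAT rule and the matching SNAT rule always travel as a pair, so it can help to generate them together. The following is a sketch under the assumptions of the example above (TCP only, one port); `hairpin_rules` is a hypothetical name, and the rules are printed rather than applied.

```shell
#!/bin/sh
# Hypothetical helper: print the DNAT + SNAT pair for port forwarding
# back onto the same network.
#   hairpin_rules PUBLIC_ADDR SERVER_ADDR LAN NAT_BOX_INTERNAL_ADDR PORT
hairpin_rules() {
    pub=$1 srv=$2 lan=$3 natip=$4 port=$5
    # Incoming connections to the public address go to the internal server.
    echo "iptables -t nat -A PREROUTING -d $pub -p tcp --dport $port -j DNAT --to-destination $srv"
    # Internally sourced connections are made to look like they come from
    # the NAT box, so the server replies through it.
    echo "iptables -t nat -A POSTROUTING -d $srv -s $lan -p tcp --dport $port -j SNAT --to-source $natip"
}

hairpin_rules 1.2.3.4 192.168.1.1 192.168.1.0/24 192.168.1.250 80
```

The cost of this approach, unlike the DNS one, is that the web server's logs will show the NAT box's address for internal clients.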
627
+<p>Thanks first to WatchGuard, and David Bonn, who believed in the
628
+netfilter idea enough to support me while I worked on it.
630
+<p>And to everyone else who put up with my ranting as I learnt about
631
+the ugliness of NAT, especially those who read my diary.
635
Index: iptables-1.4.12/howtos/netfilter-extensions-HOWTO.sgml
636
===================================================================
637
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
638
+++ iptables-1.4.12/howtos/netfilter-extensions-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600
640
+<!doctype linuxdoc system>
642
+<!-- This is the Netfilter Extensions HOWTO.
647
+<!-- Title information -->
649
+<title>Netfilter Extensions HOWTO</title>
650
+<author>Fabrice MARIE &lt;fabrice@netfilter.org&gt;, mailing list <tt>netfilter-devel@lists.samba.org</tt></author>
651
+<date>$Revision: 1.28 $</date>
653
+This document describes how to install and use current iptables extensions for netfilter.
656
+<!-- Table of contents -->
659
+<!-- Begin the document -->
661
+<sect>Introduction<label id="intro">
664
+Hello. This is a great opportunity for me to thank all the people
+who spend so much time developing, testing, reporting bugs in, and using netfilter.
+So, thanks to you all!
669
+This HOWTO assumes you have read and understood Rusty's
670
+<url url="http://www.netfilter.org/documentation/HOWTO/packet-filtering-HOWTO.html" name="Linux 2.4 Packet Filtering HOWTO">.
671
+It is assumed as well that you know how to compile and install a kernel properly.
674
+The <tt>iptables</tt> distribution contains extensions that regular users rarely need,
+that are still quite experimental, or that are pending inclusion in the kernel.
+These extensions are usually not compiled unless you ask for them.
679
+You should find the latest version of this document on the
680
+<url url="http://www.netfilter.org/documentation/index.html#HOWTO" name="netfilter documentation"> web page.
683
+The goal of this HOWTO is to help people get started with the netfilter extensions
684
+by explaining how to install them and the basics of using them.
687
+Finally, there is a complete, script-generated list of the patches available in patch-o-matic :
688
+<url url="http://www.netfilter.org/documentation/pomlist/pom-summary.html" name="Patch-O-Matic Listing - Summary">.
690
+<p>(C) 2001-2002 Fabrice MARIE. Licensed under the GNU GPL.
694
+<sect1>What is Patch-O-Matic ?
696
+Netfilter developers distribute a set of patches packaged
+for use with their `patch-o-matic' (or `p-o-m') system.
+p-o-m is a script that guides you through selecting
+the patches you want to apply, and automatically patches the kernel for you.
702
+First, you should get the latest CVS tree, to be sure that you are using the
703
+latest extensions. To do so, perform :
706
+# cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic login
708
+(When it asks you for a password type `cvs').
710
+# cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic co netfilter/userspace netfilter/patch-o-matic
714
+This will create the toplevel directory `netfilter/', and will
715
+check out all the files inside for you :
721
+drwxr-xr-x 2 root root 160 Nov 7 14:48 CVS/
722
+drwxr-xr-x 13 root root 488 Nov 7 14:54 patch-o-matic/
723
+drwxr-xr-x 9 root root 864 Nov 7 14:48 userspace/
728
+Make sure your kernel source is ready in `/usr/src/linux/'.
729
+If for whatever reason the kernel you want to patch is not
730
+in `/usr/src/linux/' then you can make the variable KERNEL_DIR
731
+point to the path where your kernel is :
734
+# export KERNEL_DIR=/the/path/linux
738
+Make sure the dependencies are made already. If unsure :
741
+# cd /usr/src/linux/
746
+Then you can go back to the netfilter directory, in the `patch-o-matic/' directory.
747
+You can now invoke p-o-m.
749
+<sect1>Running Patch-O-Matic
751
+While in the `patch-o-matic/' directory, let's run p-o-m :
756
+Welcome to Rusty's Patch-o-matic!
758
+Each patch is a new feature: many have minimal impact, some do not.
759
+Almost every one has bugs, so I don't recommend applying them all!
760
+-------------------------------------------------------
762
+Already applied: 2.4.1 2.4.4
763
+Testing... name_of_the_patch NOT APPLIED ( 2 missing files)
764
+The name_of_the_patch patch:
765
+ Here usually is the help text describing what
766
+ the patch is for, what you can expect from it,
767
+ and what you should not expect from it.
768
+Do you want to apply this patch [N/y/t/f/q/?]
772
+p-o-m will go through most of the patches. If they are already applied,
773
+you will see so on the `Already applied:' first line. If they are not applied
774
+yet, it will display the name of the patch with some explanations.
775
+p-o-m will tell you what is going on : `NOT APPLIED ( n missing files)' simply means
776
+the patch has not been applied yet, whereas `NOT APPLIED ( n rejects out of n hunks)'
777
+generally means that :
779
+<item>Either the patch cannot be applied cleanly...
780
+<item>...Or the patch has already been included in the kernel you are trying to patch.
784
+Finally it will prompt you to decide whether or not to patch it.
787
+<item>Simply press enter if you do not want to apply it.
788
+<item>Type `y' if you want p-o-m to test the patch and apply it.
+If the test fails, it will tell you so and prompt you again for confirmation.
+Otherwise the patch will be applied, and you will see the name of the patch
+on the `Already applied:' line.
792
+<item>Type `t' if you just want to test if the patch would apply normally.
793
+<item>Type `f' if you want to force p-o-m to apply the patch.
795
+<item>Finally type `q' if you want to quit p-o-m.
799
+A rule of thumb is to read carefully the little explanation text of each patch
800
+before actually applying it. As there are currently a LOT of official patches for patch-o-matic
801
+(and probably more unofficial ones), it is not recommended to apply them all !
802
+You should really consider applying only the ones you need, even if it means recompiling
803
+netfilter when you need more patches later on.
806
+Patch-o-matic is, in fact, mainly the `runme' shell script. If you run it without arguments, it will
807
+display its help message :
811
+Usage: ./runme [--batch] [--reverse] [--exclude suite/patch-file ...] suite|suite/patch-file
813
+ --batch batch mode, automatically applying patches
814
+ --reverse back out the selected patches
815
+ --exclude excludes the named patches
820
+The patches are contained in `patch-o-matic/pending/', `patch-o-matic/base', etc. Here, `pending' and `base'
+are two suite names. List the `patch-o-matic' directory to see all the suites. Examples of `runme' commands :
825
+./runme --batch pending
826
+./runme --batch userspace/ipt_REJECT-fake-source.patch
831
+The first command will attempt to apply all the patches from the submitted suite,
+then the pending suite (why two suites are involved is explained below). The second command
833
+will only apply the patch `ipt_REJECT-fake-source.patch' from the userspace suite.
836
+The most relevant patch `suites' or repositories are (in their order of application) :
846
+When you instruct `./runme' to apply patches from the `extra/' patch repository it will first
847
+present you with the patches from the `submitted/', `pending/', and `base/' directories.
848
+Each suite maintains a file named `SUITE' that tells p-o-m the order in which
849
+it should attempt to apply the patches. For example, what I explained above is written
850
+in the `userspace/' repository's `SUITE' file :
854
+# cat userspace/SUITE
855
+submitted pending base extra userspace
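+The ordering idea can be sketched in a few lines of shell. This is only an
+illustration of how a script might walk the suites named in a `SUITE' file in
+order; the function name `suite_order' is made up here and the code is not
+taken from the real `runme' script.

```shell
# Hypothetical helper: print the suites listed in a SUITE file, one per
# line, in the order in which they appear in the file.
suite_order() {
    suite_file=$1
    # Word-split the file contents on whitespace, preserving order.
    for suite in $(cat "$suite_file"); do
        echo "$suite"
    done
}

# Demo with a throwaway copy of the userspace/SUITE contents.
tmp=$(mktemp)
echo "submitted pending base extra userspace" > "$tmp"
suite_order "$tmp"
rm -f "$tmp"
```

+Each printed name would then be the next suite whose patches are offered.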
859
+<sect1>So what's next ?
862
+Once you have applied all the patches you wished to apply, the next step is to recompile
863
+your kernel and install it. This HOWTO will not explain how to do this. Instead, you
864
+can read the <url url="http://www.linuxdoc.org/HOWTO/Kernel-HOWTO.html" name="Linux Kernel HOWTO">.
867
+While configuring your kernel, you will see new options in
868
+``Networking Options -> Netfilter Configuration''. Choose the options
869
+you need, then recompile and install your new kernel.
872
+Once your new kernel is installed, you can go ahead and compile and install the ``iptables''
873
+package, from the `userspace/' directory as follows :
882
+That's it ! Your new shiny iptables package is installed ! Now it's time
883
+to use these brand new functionalities.
885
+<sect>New netfilter matches
888
+In this section, we will attempt to explain the usage of new netfilter matches.
889
+The patches will appear in alphabetical order. Additionally, we will not explain
890
+patches that break other patches. But this might come later.
893
+Generally speaking, for matches, you can get the help hints from a particular
898
+# iptables -m the_match_you_want --help
903
+This would display the normal iptables help message, plus the specific
904
+``the_match_you_want'' match help message at the end.
908
+This patch by Yon Uriarte <yon@astaro.de> adds 2 new matches :
911
+<item>``ah'' : lets you match an AH packet based on its Security Parameter Index (SPI).
912
+<item>``esp'' : lets you match an ESP packet based on its SPI.
916
+This patch can be quite useful for people using IPsec who want
917
+to discriminate connections based on their SPI.
920
+For example, we will drop all the AH packets that have an SPI equal to
924
+# iptables -A INPUT -p 51 -m ah --ahspi 500 -j DROP
927
+Chain INPUT (policy ACCEPT)
928
+target prot opt source destination
929
+DROP ipv6-auth-- anywhere anywhere ah spi:500
933
+Supported options for the ah match are :
936
+<tag>--ahspi [!] spi[:spi]</> -> match spi (range)
940
+The esp match works exactly the same :
943
+# iptables -A INPUT -p 50 -m esp --espspi 500 -j DROP
946
+Chain INPUT (policy ACCEPT)
947
+target prot opt source destination
948
+DROP ipv6-crypt-- anywhere anywhere esp spi:500
952
+Supported options for the esp match are :
955
+<tag>--espspi [!] spi[:spi]</> -> match spi (range)
959
+Do not forget to specify the proper protocol through ``-p 50'' or ``-p 51'' (for esp & ah respectively)
960
+when you use the ah or esp matches, or else the rule insertion will simply abort
961
+because each match only applies to packets of its own protocol.
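+One way to avoid mixing up the two protocol numbers is to derive `-p' from the
+match name. The sketch below only prints the rule instead of running it; the
+function names `ipsec_proto' and `spi_drop_rule' are hypothetical helpers, not
+part of iptables.

```shell
# Map an IPsec match name to its IP protocol number (AH = 51, ESP = 50).
ipsec_proto() {
    case $1 in
        ah)  echo 51 ;;
        esp) echo 50 ;;
        *)   echo "unknown match: $1" >&2; return 1 ;;
    esac
}

# Hypothetical wrapper: print (not run) a DROP rule for one SPI, so the
# protocol number and the match name can never disagree.
spi_drop_rule() {
    match=$1 spi=$2
    proto=$(ipsec_proto "$match") || return 1
    echo "iptables -A INPUT -p $proto -m $match --${match}spi $spi -j DROP"
}

spi_drop_rule ah 500
spi_drop_rule esp 500
```

+The two printed lines are the same commands used in the ah and esp examples.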
963
+<sect1>condition match
965
+This patch by Stephane Ouellette <ouellettes@videotron.ca> adds a new match that is used
966
+to enable or disable a set of rules using condition variables stored in `/proc' files.
972
+<item>The condition variables are stored in the `/proc/net/ipt_condition/' directory.
973
+<item>A condition variable can only be set to ``0'' (FALSE) or ``1'' (TRUE).
974
+<item>One or many rules can be affected by the state of a single condition variable.
975
+<item>A condition proc file is automatically created when a new condition is first referenced.
976
+<item>A condition proc file is automatically deleted when the last reference to it is removed.
980
+Supported options for the condition match are :
983
+<tag>--condition [!] conditionfile</> -> match on condition variable.
987
+For example, if you want to prohibit access to your web server while doing maintenance, you can use the
991
+# iptables -A FORWARD -p tcp -d 192.168.1.10 --dport http -m condition --condition webdown -j REJECT --reject-with tcp-reset
993
+# echo 1 > /proc/net/ipt_condition/webdown
997
+The rule above will match only if the ``webdown'' condition is set to ``1''.
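+Since a condition variable only accepts ``0'' or ``1'', the maintenance switch
+can be wrapped in a small helper that rejects anything else. This is a sketch
+under assumptions: `set_condition' and the `COND_DIR' override are names
+invented here so the helper can be tried outside `/proc'.

```shell
# Directory holding the condition files. COND_DIR is a made-up override so
# the helper can be exercised without the condition match loaded.
COND_DIR=${COND_DIR:-/proc/net/ipt_condition}

# Set a condition variable to 0 (FALSE) or 1 (TRUE), rejecting anything else.
set_condition() {
    name=$1 value=$2
    case $value in
        0|1) ;;
        *) echo "condition value must be 0 or 1" >&2; return 1 ;;
    esac
    # The file exists only while some rule references the condition, so
    # refuse to write to a name that no rule uses.
    [ -e "$COND_DIR/$name" ] || { echo "no such condition: $name" >&2; return 1; }
    echo "$value" > "$COND_DIR/$name"
}
```

+With the rule from the example loaded, `set_condition webdown 1' would start
+rejecting connections and `set_condition webdown 0' would let them through again.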
1000
+<sect1>conntrack patch
1002
+This patch by Marc Boucher <marc+nf@mbsi.ca> adds a new general conntrack match module
1003
+(a superset of the state match) that allows you to match on additional conntrack information.
1006
+For example, if you want to allow all the RELATED connections for the TCP protocol only,
1007
+then you can proceed as follows :
1010
+# iptables -A FORWARD -m conntrack --ctstate RELATED --ctproto tcp -j ACCEPT
1013
+Chain FORWARD (policy ACCEPT)
1014
+target prot opt source destination
1015
+ACCEPT all -- anywhere anywhere ctstate RELATED
1019
+Supported options for the conntrack match are :
1022
+<tag>[!] --ctstate [INVALID|ESTABLISHED|NEW|RELATED|SNAT|DNAT][,...]</>
1023
+-> State(s) to match. The "new" `SNAT' and `DNAT' states are virtual ones, matching if the original
1024
+source address differs from the reply destination, or if the original destination differs from the reply source.
1026
+<tag>[!] --ctproto proto</> -> Protocol to match; by number or name, e.g. `tcp'.
1028
+<tag>--ctorigsrc [!] address[/mask]</> -> Original source specification.
1030
+<tag>--ctorigdst [!] address[/mask]</> -> Original destination specification.
1032
+<tag>--ctreplsrc [!] address[/mask]</> -> Reply source specification.
1034
+<tag>--ctrepldst [!] address[/mask]</> -> Reply destination specification.
1036
+<tag>[!] --ctstatus [NONE|EXPECTED|SEEN_REPLY|ASSURED][,...]</>
1037
+-> Status(es) to match.
1039
+<tag>[!] --ctexpire time[:time]</> -> Match remaining lifetime in seconds against
1040
+value or range of values (inclusive).
1045
+This patch by Hime Aguiar e Oliveira Jr. <hime@engineer.com> adds a new module
1046
+which allows you to match packets according to a dynamic profile
1047
+implemented by means of a simple Fuzzy Logic Controller (FLC).
1050
+This match implements a TSK FLC (Takagi-Sugeno-Kang Fuzzy Logic
1051
+Controller). The basic idea is that the match is given two parameters
1052
+that tell it the desired filtering interval.
1055
+<item>When the packet rate is below `lower-limit' the rule will never match.
1056
+<item>Between `lower-limit' and `upper-limit', matching will occur at an
1057
+increasing (mean) rate.
1058
+<item>Finally, when the packet rate reaches `upper-limit', the
1059
+(mean) matching rate attains its maximum value, 99%.
1063
+Taking into account that the sampling interval is variable and is approximately 100ms
1064
+(on a busy machine), the author believes that the module presents good responsiveness,
1065
+adapting fast to changing traffic patterns.
1068
+For example, if you wish to avoid denial-of-service attacks, you could use the following rule:
1071
+# iptables -A INPUT -m fuzzy --lower-limit 100 --upper-limit 1000 -j REJECT
1075
+<item>Below the 100 pps (packets per second) rate, the filter is inactive.
1076
+<item>Between 100 and 1000 pps the mean acceptance rate drops
1077
+from 100% (when we are at 100 pps) to 1% (when we are at 1000 pps).
1078
+<item>Above 1000 pps the acceptance rate keeps constant at 1%.
1082
+Supported options for the fuzzy patch are :
1085
+<tag>--upper-limit n</> -> Desired upper bound for traffic rate matching.
1086
+<tag>--lower-limit n</> -> Lower bound over which the FLC starts to match.
1089
+<sect1>iplimit patch
1091
+This patch by Gerd Knorr <kraxel@bytesex.org> adds a new match that
1092
+will allow you to restrict the number of parallel TCP connections
1093
+from a particular host or network.
1096
+For example, let's limit the number of parallel HTTP connections made by a single
1100
+# iptables -A INPUT -p tcp --syn --dport http -m iplimit --iplimit-above 4 -j REJECT
1103
+Chain INPUT (policy ACCEPT)
1104
+target prot opt source destination
1105
+REJECT tcp -- anywhere anywhere tcp dpt:http flags:SYN,RST,ACK/SYN #conn/32 > 4 reject-with icmp-port-unreachable
1109
+Or you might want to limit the number of parallel connections made by a whole class A for example :
1112
+# iptables -A INPUT -p tcp --syn --dport http -m iplimit --iplimit-mask 8 --iplimit-above 4 -j REJECT
1115
+Chain INPUT (policy ACCEPT)
1116
+target prot opt source destination
1117
+REJECT tcp -- anywhere anywhere tcp dpt:http flags:SYN,RST,ACK/SYN #conn/8 > 4 reject-with icmp-port-unreachable
1121
+Supported options for the iplimit patch are :
1124
+<tag>[!] --iplimit-above n</> -> match if the number of existing tcp connections is (not) above n
1125
+<tag>--iplimit-mask n</> -> group hosts using mask
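+To see which hosts end up sharing one connection count, it can help to compute
+the group an address falls into. The sketch below assumes the mask is a prefix
+length, as in the `#conn/8' listing; `net_of' is a hypothetical helper and
+performs no iptables call.

```shell
# net_of ADDRESS PREFIX_LEN: print the network address that ADDRESS belongs
# to when hosts are grouped by the given prefix length.
net_of() {
    ip=$1 bits=$2
    # Split the dotted quad into the positional parameters.
    old_ifs=$IFS; IFS=.
    set -- $ip
    IFS=$old_ifs
    addr=$(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
}

net_of 10.1.2.3 8       # grouped with every other 10.x.x.x host
net_of 10.200.9.1 8     # same group as the line above
net_of 10.1.2.3 32      # the default: each host is its own group
```

+Two addresses that print the same network would be counted together by a rule
+using that mask.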
1128
+<sect1>ipv4options patch
1131
+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match
1132
+that allows you to match packets based on the IP options they have set.
1135
+For example, let's drop all packets that have the record-route or the timestamp
1139
+# iptables -A INPUT -m ipv4options --rr -j DROP
1140
+# iptables -A INPUT -m ipv4options --ts -j DROP
1143
+Chain INPUT (policy ACCEPT)
1144
+target prot opt source destination
1145
+DROP all -- anywhere anywhere IPV4OPTS RR
1146
+DROP all -- anywhere anywhere IPV4OPTS TS
1150
+Supported options for the ipv4options match are :
1153
+<tag>--ssrr</> -> match strict source routing flag.
1154
+<tag>--lsrr</> -> match loose source routing flag.
1155
+<tag>--no-srr</> -> match packets with no source routing.
1156
+<tag>[!] --rr</> -> match record route flag.
1157
+<tag>[!] --ts</> -> match timestamp flag.
1158
+<tag>[!] --ra</> -> match router-alert option.
1159
+<tag>[!] --any-opt</> -> Match a packet that has at least one IP option
1160
+(or that has no IP option at all if ! is chosen).
1163
+<sect1>length patch
1165
+This patch by James Morris <jmorris@intercode.com.au> adds a new match
1166
+that allows you to match a packet based on its length.
1169
+For example, let's drop all the pings with a packet size greater than
1173
+# iptables -A INPUT -p icmp --icmp-type echo-request -m length --length 86:0xffff -j DROP
1176
+Chain INPUT (policy ACCEPT)
1177
+target prot opt source destination
1178
+DROP icmp -- anywhere anywhere icmp echo-request length 86:65535
1182
+Supported options for the length match are :
1185
+<tag>[!] --length length[:length]</> -> Match packet length
1186
+against value or range of values (inclusive)
1190
+Values of the range not present will be implied. The implied value for minimum
1191
+is 0, and for maximum is 65535.
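+The implied bounds can be made explicit with a small helper. This is a sketch
+only: `length_range' is a made-up name, and treating a single value as an
+exact length is an assumption about the match rather than something stated above.

```shell
# Expand a --length argument into an explicit "min:max" pair.
length_range() {
    spec=$1
    case $spec in
        *:*) min=${spec%%:*}; max=${spec#*:} ;;
        *)   min=$spec; max=$spec ;;   # assumption: single value = exact length
    esac
    # Missing bounds default to 0 and 65535.
    [ -n "$min" ] || min=0
    [ -n "$max" ] || max=65535
    # $(( )) also converts hexadecimal input such as 0xffff to decimal.
    echo "$(( $min )):$(( $max ))"
}

length_range 86:0xffff
length_range :100
length_range 1400:
```

+The first call prints the same range that the listing shows for the ping rule.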
1195
+This patch by Andreas Ferber <af@devcon.net> adds a new match that allows
1196
+you to specify ports with a mix of port-ranges and single ports for UDP and TCP protocols.
1199
+For example, if you want to block ftp, ssh, telnet and http in one line, you can do as follows :
1202
+# iptables -A INPUT -p tcp -m mport --ports 20:23,80 -j DROP
1205
+Chain INPUT (policy ACCEPT)
1206
+target prot opt source destination
1207
+DROP tcp -- anywhere anywhere mport ports ftp-data:telnet,http
1211
+Supported options for the mport match are :
1214
+<tag>--source-ports port[,port:port,port...]</> -> match source port(s)
1215
+<tag>--sports port[,port:port,port...]</> -> match source port(s)
1216
+<tag>--destination-ports port[,port:port,port...]</> -> match destination port(s)
1217
+<tag>--dports port[,port:port,port...]</> -> match destination port(s)
1218
+<tag>--ports port[,port:port,port]</> -> match both source and destination port(s)
1223
+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match that allows
1224
+you to match a particular Nth packet received by the rule.
1227
+For example, if you want to drop every second ping packet, you can do as follows :
1230
+# iptables -A INPUT -p icmp --icmp-type echo-request -m nth --every 2 -j DROP
1233
+Chain INPUT (policy ACCEPT)
1234
+target prot opt source destination
1235
+DROP icmp -- anywhere anywhere icmp echo-request every 2th
1239
+Extensions by Richard Wagner <rwagner@cloudnet.com> allow
1240
+you to create an easy and quick method to produce load-balancing for both inbound and outbound
1244
+For example, if you want to balance the load to the 3 addresses 10.0.0.5, 10.0.0.6 and 10.0.0.7,
1245
+then you can do as follows :
1248
+# iptables -t nat -A POSTROUTING -o eth0 -m nth --counter 7 --every 3 --packet 0 -j SNAT --to-source 10.0.0.5
1249
+# iptables -t nat -A POSTROUTING -o eth0 -m nth --counter 7 --every 3 --packet 1 -j SNAT --to-source 10.0.0.6
1250
+# iptables -t nat -A POSTROUTING -o eth0 -m nth --counter 7 --every 3 --packet 2 -j SNAT --to-source 10.0.0.7
1252
+# iptables -t nat --list
1253
+Chain POSTROUTING (policy ACCEPT)
1254
+target prot opt source destination
1255
+SNAT all -- anywhere anywhere every 3th packet #0 to:10.0.0.5
1256
+SNAT all -- anywhere anywhere every 3th packet #1 to:10.0.0.6
1257
+SNAT all -- anywhere anywhere every 3th packet #2 to:10.0.0.7
1261
+Supported options for the nth match are :
1264
+<tag>--every Nth</> -> Match every Nth packet.
1265
+<tag>[--counter] num</> -> Use counter 0-15 (default:0).
1266
+<tag>[--start] num</> -> Initialize the counter at the number `num' instead of 0. Must be between 0 and (Nth-1).
1267
+<tag>[--packet] num</> -> Match on the `num' packet. Must be between 0 and Nth-1.
1268
+If `--packet' is used for a counter, then there must be Nth number of --packet rules, covering all values between 0 and
1269
+(Nth-1) inclusively.
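+Because a counter used with `--packet' needs one rule for every value from 0
+to Nth-1, generating the rules in a loop keeps the set complete. The sketch
+below prints the commands rather than running them; `nth_balance' is a
+hypothetical helper name.

```shell
# nth_balance COUNTER IFACE ADDR...: print (not run) one SNAT rule per
# address, so that --packet covers every value from 0 to N-1.
nth_balance() {
    counter=$1 iface=$2
    shift 2
    every=$#          # one rule per remaining argument
    packet=0
    for addr in "$@"; do
        echo "iptables -t nat -A POSTROUTING -o $iface -m nth --counter $counter --every $every --packet $packet -j SNAT --to-source $addr"
        packet=$((packet + 1))
    done
}

nth_balance 7 eth0 10.0.0.5 10.0.0.6 10.0.0.7
```

+Called with the three addresses from the example, it reproduces the three
+POSTROUTING commands shown there.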
1272
+<sect1>pkttype patch
1274
+This patch by Michal Ludvig <michal@logix.cz> adds a new match that allows
1275
+you to match a packet based on its type : host/broadcast/multicast.
1278
+For example, if you want to silently drop all broadcast packets :
1281
+# iptables -A INPUT -m pkttype --pkt-type broadcast -j DROP
1284
+Chain INPUT (policy ACCEPT)
1285
+target prot opt source destination
1286
+DROP all -- anywhere anywhere PKTTYPE = broadcast
1290
+Supported options for this match are :
1293
+<tag>--pkt-type [!] packettype</> -> match packet type where packet type is one of
1295
+<tag>host</> -> to us
1296
+<tag>broadcast</> -> to all
1297
+<tag>multicast</> -> to group
1303
+Patch by Patrick Schaaf <bof@bof.de>. Joakim Axelsson and Patrick are in the process
1304
+of re-writing it, therefore they will replace this section with the actual
1305
+explanations once it is written.
1309
+This patch by Dennis Koslowski <dkoslowski@astaro.de> adds a new match that will
1310
+attempt to detect port scans.
1313
+In its simplest form, psd match can be used as follows :
1316
+# iptables -A INPUT -m psd -j DROP
1319
+Chain INPUT (policy ACCEPT)
1320
+target prot opt source destination
1321
+DROP all -- anywhere anywhere psd weight-threshold: 21 delay-threshold: 300 lo-ports-weight: 3 hi-ports-weight: 1
1325
+Supported options for psd match are :
1328
+<tag>[--psd-weight-threshold threshold]</> -> Portscan detection weight threshold
1329
+<tag>[--psd-delay-threshold delay]</> -> Portscan detection delay threshold
1330
+<tag>[--psd-lo-ports-weight lo]</> -> Privileged ports weight
1331
+<tag>[--psd-hi-ports-weight hi]</> -> High ports weight
1336
+This patch by Sam Johnston <samj@samj.net> adds a new match that
1337
+allows you to set quotas. When the quota is reached, the rule doesn't
1341
+For example, if you want to put a quota of 50 megabytes on incoming http data
1342
+you can do as follows :
1345
+# iptables -A INPUT -p tcp --dport 80 -m quota --quota 52428800 -j ACCEPT
1346
+# iptables -A INPUT -p tcp --dport 80 -j DROP
1349
+Chain INPUT (policy ACCEPT)
1350
+target prot opt source destination
1351
+ACCEPT tcp -- anywhere anywhere tcp dpt:http quota: 52428800 bytes
1352
+DROP tcp -- anywhere anywhere tcp dpt:http
1356
+Supported options for quota match are :
1359
+<tag> --quota quota</> -> The quota you want to set.
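+The value 52428800 in the example is simply 50 megabytes expressed in bytes.
+A one-line helper (the name `megs_to_bytes' is made up here) avoids doing that
+arithmetic by hand.

```shell
# Convert megabytes (1 MB taken as 1024 * 1024 bytes, the convention the
# 50 MB example uses) into the byte count passed to --quota.
megs_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

megs_to_bytes 50
```

+It could be used inline, for instance `--quota $(megs_to_bytes 50)'.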
1362
+<sect1>random patch
1364
+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match that
1365
+allows you to match a packet randomly based on a given probability.
1368
+For example, if you want to drop 50% of the pings randomly, you can do as follows :
1371
+# iptables -A INPUT -p icmp --icmp-type echo-request -m random --average 50 -j DROP
1374
+Chain INPUT (policy ACCEPT)
1375
+target prot opt source destination
1376
+DROP icmp -- anywhere anywhere icmp echo-request random 50%
1380
+Supported options for random match are :
1383
+<tag>[--average percent]</> -> The probability in percentage of the match.
1384
+If omitted, a probability of 50% is set. Percentage must be within : 1 <= percent <= 99.
1389
+This patch by Sampsa Ranta <sampsa@netsonic.fi> adds a new match that allows you
1390
+to use the realm key from routing as a match criterion, similar to the one found in the packet
1394
+For example, to log all the outgoing packets with a realm of 10, you can do the following :
1397
+# iptables -A OUTPUT -m realm --realm 10 -j LOG
1400
+Chain OUTPUT (policy ACCEPT)
1401
+target prot opt source destination
1402
+LOG all -- anywhere anywhere REALM match 0xa LOG level warning
1406
+Supported options for the realm match are :
1409
+<tag>--realm [!] value[/mask]</> -> Match realm
1412
+<sect1>recent patch
1414
+This patch by Stephen Frost <sfrost@snowman.net> adds a new match that allows you
1415
+to dynamically create a list of IP addresses and then match against that list in a few
1419
+For example, you can create a `badguy' list out of people attempting to connect to port 139
1420
+on your firewall and then DROP all future packets from them without considering them.
1423
+# iptables -A FORWARD -m recent --name badguy --rcheck --seconds 60 -j DROP
1424
+# iptables -A FORWARD -p tcp -i eth0 --dport 139 -m recent --name badguy --set -j DROP
1427
+Chain FORWARD (policy ACCEPT)
1428
+target prot opt source destination
1429
+DROP all -- anywhere anywhere recent: CHECK seconds: 60
1430
+DROP tcp -- anywhere anywhere tcp dpt:netbios-ssn recent: SET
1434
+Supported options for the recent match are :
1437
+<tag>--name name</> -> Specify the list to use for the commands. If no name is given
1438
+then 'DEFAULT' will be used.
1440
+<tag>[!] --set</> -> This will add the source address of the packet to the list.
1441
+If the source address is already in the list, this will update the existing entry. This will
1442
+always return success, or failure if `!' is passed in.
1444
+<tag>[!] --rcheck</> -> This will check if the source address of the packet is currently
1445
+in the list and return true if it is, and false otherwise. Opposite is returned if `!' is passed in.
1447
+<tag>[!] --update</> -> This will check if the source address of the packet is currently
1448
+in the list. If it is then that entry will be updated and the rule will return true. If the source
1449
+address is not in the list then the rule will return false. Opposite is returned if `!' is passed in.
1451
+<tag>[!] --remove</> -> This will check if the source address of the packet is currently
1452
+in the list and if so that address will be removed from the list and the rule will return true.
1453
+If the address is not found, false is returned. Opposite is returned if `!' is passed in.
1455
+<tag>[!] --seconds seconds</> -> This option must be used in conjunction with one of `rcheck' or
1456
+`update'. When used, this will narrow the match to only happen when the address is in the list and was seen
1457
+within the last given number of seconds. Opposite is returned if `!' is passed in.
1459
+<tag>[!] --hitcount hits</> -> This option must be used in conjunction with one of `rcheck' or
1460
+`update'. When used, this will narrow the match to only happen when the address is in the list and packets
1461
+have been received at least the given number of times. This option may be used along with `seconds'
1462
+to create an even narrower match requiring a certain number of hits within a specific time frame.
1463
+Opposite is returned if `!' is passed in.
1465
+<tag>--rttl</> -> This option must be used in conjunction with one of `rcheck' or `update'.
1466
+When used, this will narrow the match to only happen when the address is in the list and the TTL of
1467
+the current packet matches that of the packet which hit the --set rule. This may be useful if you have
1468
+problems with people faking their source address in order to DoS you via this module by disallowing others
1469
+access to your site by sending bogus packets to you.
1472
+<sect1>record-rpc patch
1474
+This patch by Marcelo Barbosa Lima <marcelo.lima@dcc.unicamp.br> adds a new match that allows
1475
+you to match if the source of the packet has requested that port through the portmapper before,
1476
+or it is a new GET request to the portmapper, allowing effective RPC filtering.
1479
+To match RPC connection tracking information, simply do the following :
1482
+# iptables -A INPUT -m record_rpc -j ACCEPT
1485
+Chain INPUT (policy ACCEPT)
1486
+target prot opt source destination
1487
+ACCEPT all -- anywhere anywhere
1491
+The record_rpc match does not take any option.
1494
+Do not worry about the match information not being printed;
1495
+it's simply because the print() function of this match is empty :
1498
+/* Prints out the union ipt_matchinfo. */
1500
+print(const struct ipt_ip *ip,
1501
+ const struct ipt_entry_match *match,
1507
+<sect1>string patch
1509
+This patch by Emmanuel Roger <winfield@freegates.be> adds a new match that allows
1510
+you to match a string anywhere in the packet.
1513
+For example, to match packets containing the string ``cmd.exe'' anywhere
1514
+in the packet and queue them to a userland IDS, you could use :
1517
+# iptables -A INPUT -m string --string 'cmd.exe' -j QUEUE
1520
+Chain INPUT (policy ACCEPT)
1521
+target prot opt source destination
1522
+QUEUE all -- anywhere anywhere STRING match cmd.exe
1526
+Please do use this match with caution. A lot of people want to use
1527
+this match to stop worms, along with the DROP target. This is a major mistake.
1528
+It would be defeated by any IDS evasion method.
1531
+In a similar fashion, a lot of people have been using this match as a means
1532
+to stop particular functions in HTTP like POST or GET by dropping
1533
+any HTTP packet containing the string POST. Please understand that this job
1534
+is better done by a filtering proxy. Additionally, any HTML content with
1535
+the word POST would get dropped with the former method.
1536
+This match has been designed to be able to queue interesting packets to userland
1537
+for better analysis, that's all. Dropping packets based on this would be defeated
1538
+by any IDS evasion method.
1541
+Supported options for the string match are :
1544
+<tag>--string [!] string</> -> Match a string in a packet
1549
+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match that allows
1550
+you to match a packet based on its arrival or departure (for locally generated packets) timestamp.
1553
+For example, to accept packets that have an arrival time from 8:00H to 18:00H from Monday
1554
+to Friday you can do as follows :
1557
+# iptables -A INPUT -m time --timestart 8:00 --timestop 18:00 --days Mon,Tue,Wed,Thu,Fri -j ACCEPT
1560
+Chain INPUT (policy ACCEPT)
1561
+target prot opt source destination
1562
+ACCEPT all -- anywhere anywhere TIME from 8:0 to 18:0 on Mon,Tue,Wed,Thu,Fri
1566
+Supported options for the time match are :
1569
+<tag>--timestart value</> -> minimum HH:MM
1570
+<tag>--timestop value</> -> maximum HH:MM
1571
+<tag>--days listofdays</> -> a list of days to apply, from (case sensitive)
1585
+This patch by Harald Welte <laforge@gnumonks.org> adds a new match that allows you
1586
+to match a packet based on its TTL.
1589
+For example, if you want to log any packet that has a TTL less than 5, you can do as follows :
1592
+# iptables -A INPUT -m ttl --ttl-lt 5 -j LOG
1595
+Chain INPUT (policy ACCEPT)
1596
+target prot opt source destination
1597
+LOG all -- anywhere anywhere TTL match TTL < 5 LOG level warning
1601
+Options supported by the ttl match are :
1604
+<tag>--ttl-eq value</> -> Match time to live value
1605
+<tag>--ttl-lt value</> -> Match TTL < value
1606
+<tag>--ttl-gt value</> -> Match TTL > value
1609
+<sect>New netfilter targets
1611
+In this section, we will attempt to explain the usage of new netfilter targets.
1612
+The patches will appear in alphabetical order. Additionally, we will not explain
1613
+patches that break other patches. But this might come later.
1616
+Generally speaking, for targets, you can get the help hints from a particular
1621
+# iptables -j THE_TARGET_YOU_WANT --help
1626
+This would display the normal iptables help message, plus the specific
1627
+``THE_TARGET_YOU_WANT'' target help message at the end.
1631
+This patch by Matthew G. Marsh <mgm@paktronix.com> adds a new target that allows you
1632
+to set the TOS of packets to an arbitrary value.
1635
+For example, if you want to set the TOS of all the outgoing packets to be 15, you can do as follows :
1638
+# iptables -t mangle -A OUTPUT -j FTOS --set-ftos 15
1640
+# iptables -t mangle --list
1641
+Chain OUTPUT (policy ACCEPT)
1642
+target prot opt source destination
1643
+FTOS all -- anywhere anywhere TOS set 0x0f
1647
+Supported options for the FTOS target are :
1650
+<tag>--set-ftos value</> -> Set TOS field in packet header to value. This value can be in decimal (ex: <tt>32</tt>)
1651
+or in hex (ex: <tt>0x20</tt>)
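+The listing shows the value in hexadecimal even though the rule was entered in
+decimal. A tiny conversion helper (the name `tos_hex' is hypothetical) makes it
+easy to check that the two notations agree.

```shell
# Print a TOS value, given in decimal or hex, in two-digit hex notation.
tos_hex() {
    printf '0x%02x\n' "$(( $1 ))"
}

tos_hex 15      # the value used in the example rule
tos_hex 32      # decimal form of the 0x20 example
tos_hex 0x20
```

+The first call prints the same `0x0f' that appears in the `--list' output.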
1654
+<sect1>IPV4OPTSSTRIP patch
1656
+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new target that allows you
1657
+to strip all the IP options from an IPv4 packet.
1660
+It is simply loaded as follows :
1663
+# iptables -t mangle -A PREROUTING -j IPV4OPTSSTRIP
1665
+# iptables -t mangle --list
1666
+Chain PREROUTING (policy ACCEPT)
1667
+target prot opt source destination
1668
+IPV4OPTSSTRIP all -- anywhere anywhere
1672
+This target doesn't support any option.
1674
+<sect1>NETLINK patch
1676
+This patch by Gianni Tedesco <gianni@ecsc.co.uk> adds a new target that allows you to
1677
+send dropped packets to userspace via a netlink socket.
1680
+For example, if you want to drop all pings and send them to a userland netlink socket instead,
1681
+you can do as follows :
1684
+# iptables -A INPUT -p icmp --icmp-type echo-request -j NETLINK --nldrop
1687
+Chain INPUT (policy ACCEPT)
1688
+target prot opt source destination
1689
+NETLINK icmp -- anywhere anywhere icmp echo-request nldrop
1693
+Supported options for the NETLINK target are :
1696
+<tag>--nldrop</> -> Drop the packet too
1697
+<tag>--nlmark <number></> -> Mark the packet
1698
+<tag>--nlsize <bytes></> -> Limit packet size
1702
+For more information on netlink sockets, you can refer to the
1703
+<url url="http://www.skyfree.org/linux/kernel_network/netlink.html" name="Netlink Sockets Tour">.
1705
+<sect1>NETMAP patch
1707
+This patch by Svenning Soerensen <svenning@post5.tele.dk> adds a new target that allows you to
1708
+create a static 1:1 mapping of the network address, while keeping host addresses intact.
1711
+For example, if you want to alter the destination of incoming connections from
1712
+1.2.3.0/24 to 5.6.7.0/24, you can do as follows :
1715
+# iptables -t nat -A PREROUTING -d 1.2.3.0/24 -j NETMAP --to 5.6.7.0/24
1717
+# iptables -t nat --list
1718
+Chain PREROUTING (policy ACCEPT)
1719
+target prot opt source destination
1720
+NETMAP all -- anywhere 1.2.3.0/24 5.6.7.0/24
1724
+Supported options for NETMAP target are :
1727
+<tag>--to address[/mask]</> -> Network address to map to.
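+A 1:1 mapping replaces the network bits and keeps the host bits. The sketch
+below works through that arithmetic for a single address; the helper names are
+invented for illustration and nothing here calls iptables.

```shell
# Convert a dotted quad to an integer and back.
ip_to_int() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# netmap_addr ADDRESS NEW_NETWORK PREFIX_LEN: take the network part from
# NEW_NETWORK and the host part from ADDRESS.
netmap_addr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "$2")
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    int_to_ip $(( (net & mask) | (addr & ~mask & 0xFFFFFFFF) ))
}

netmap_addr 1.2.3.77 5.6.7.0 24
```

+With the rule from the example, a packet sent to 1.2.3.77 would therefore be
+expected to arrive at host 77 of the 5.6.7.0/24 network.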
1732
+This patch by Cédric de Launois <delaunois@info.ucl.ac.be> adds a new
1733
+target which allows you to set up unusual routes not supported by the
1734
+standard kernel routing table. The ROUTE target lets you route
1735
+a received packet through an interface or towards a host, even if the
1736
+regular destination of the packet is the router itself. The ROUTE target is
1737
+also able to change the incoming interface of a packet. Packets are
1738
+directly put on the wire and do not traverse any other table.
1741
+This target does not modify the packets and is a final target.
1742
+It has to be used inside the mangle table.
1745
+Whenever possible, you should use the MARK target together with
1746
+iproute2 instead of this ROUTE target. However, this target is useful
1747
+to force the use of an interface or a next hop and to change the
1748
+incoming interface of a packet. People also use it for convenience
1749
+and to simplify their rules (one rule to route a packet is easier
1750
+than one MARK rule + one iproute2 rule).
1753
+Options supported by the ROUTE target are :
1756
+<tag>--oif ifname</>
1757
+Send the packet out using `ifname' network interface. The destination
1758
+host must be on the same link or the interface must be a tunnel.
1759
+Otherwise, ARP resolution cannot be performed and the packet is dropped.
1760
+<tag>--iif ifname</>
1761
+Change the packet's incoming interface to `ifname'.
1763
+Route the packet via this gateway. The packet is routed as if
1764
+its destination IP address was this ip.
1769
+For example, assume that you want to redirect ssh packets towards a
1770
+server inside your network, without modifying those packets in any way
1771
+(this excludes the use of the standard port forwarding mechanism).
1772
+A solution is to use an ipip tunnel and the ROUTE target to reroute ssh
1773
+packets to the real ssh server, which has the same IP address as the router.
1774
+It is not possible to reroute those packets using the standard routing
1775
+mechanisms, because the kernel locally delivers a packet having
1776
+a destination address belonging to the router itself.
1779
+Time for ASCII art :
1781
+ eth0 +------+ 192.168.0.1 192.168.0.2 +----+
1782
+ ----------------|router|--------------------------------|host|
1783
+ IP: 150.150.0.1 +------+ +----+
1784
+ | | tunl1 IP: 150.150.0.1 | |
1785
+ | +------------------------------------+ |
1786
+ +----------------------------------------+
1791
+For the example above, you can do as follows :
1794
+# iptables -A PREROUTING -t mangle -i eth0 -p tcp --dport 22 -j ROUTE --oif tunl1
1795
+# iptables -A PREROUTING -t mangle -i tunl1 -j ROUTE --oif eth0
1797
+# iptables -L PREROUTING -t mangle
1798
+Chain PREROUTING (policy ACCEPT)
1799
+target prot opt source destination
1800
+ROUTE tcp -- anywhere anywhere tcp dpt:ssh ROUTE oif tunl1
1801
+ROUTE all -- anywhere anywhere ROUTE oif eth0
1805
+Another example : if you want to quickly and easily balance the load between two
1806
+gateways 10.0.0.1 and 10.0.0.2, then you can do as follows :
1809
+# iptables -A PREROUTING -t mangle -m random --average 50 -j ROUTE --gw 10.0.0.1
1810
+# iptables -A PREROUTING -t mangle -j ROUTE --gw 10.0.0.2
1812
+# iptables -L PREROUTING -t mangle
1813
+Chain PREROUTING (policy ACCEPT)
1814
+target prot opt source destination
1815
+ROUTE all -- anywhere anywhere random 50% ROUTE gw 10.0.0.1
1816
+ROUTE all -- anywhere anywhere ROUTE gw 10.0.0.2
1821
+This patch by Martin Josefsson <gandalf@wlug.westbo.se> adds a new target
1822
+which is similar to SNAT and gives a client the same address for each connection.
1825
+For example, if you want to modify the source address of the connections
1826
+to be 1.2.3.4-1.2.3.7 you can do as follows :
1829
+# iptables -t nat -A POSTROUTING -j SAME --to 1.2.3.4-1.2.3.7
1831
+# iptables -t nat --list
1832
+Chain POSTROUTING (policy ACCEPT)
1833
+target prot opt source destination
1834
+SAME all -- anywhere anywhere same:1.2.3.4-1.2.3.7
1838
+Options supported by the SAME target are :
1841
+<tag>--to <ipaddr>-<ipaddr></> -> Addresses to map source to.
1842
+May be specified more than once for multiple ranges.
1843
+<tag>--nodst</> -> Don't use destination-ip in source selection
1846
+<sect1>tcp-MSS patch
1848
+This patch by Marc Boucher <marc+nf@mbsi.ca> adds a new target that allows you to examine and
1849
+alter the MSS value of TCP SYN packets, to control the maximum size
1850
+for that connection.
1853
+As explained by Marc himself, THIS IS A HACK, used to overcome criminally
1854
+brain-dead ISPs or servers which block ICMP Fragmentation Needed
1858
+Typical usage would be :
1861
+# iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu
1864
+Chain FORWARD (policy ACCEPT)
1865
+target prot opt source destination
1866
+TCPMSS tcp -- anywhere anywhere tcp flags:SYN,RST/SYN TCPMSS clamp to PMTU
1870
+Options supported by the tcp-MSS target are (mutually-exclusive) :
1873
+<tag>--set-mss value</> explicitly set MSS option to specified value
1874
+<tag>--clamp-mss-to-pmtu</> automatically clamp MSS value to (path_MTU - 40)
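+The (path_MTU - 40) arithmetic can be sketched as follows. The 40 bytes are
+an IPv4 header plus a TCP header, both without options. The helper name
+clamp_mss is hypothetical, and the sketch assumes that clamping only ever
+lowers the advertised value.

```c
#include <assert.h>
#include <stdint.h>

/* 20-byte IPv4 header + 20-byte TCP header, neither carrying options. */
enum { IPV4_TCP_OVERHEAD = 40 };

/*
 * Hypothetical helper: return the MSS to write into a SYN packet so that
 * full-sized segments still fit within the path MTU. An MSS that is
 * already small enough is left untouched (assumption: clamp never raises).
 */
static uint16_t clamp_mss(uint16_t advertised_mss, uint16_t path_mtu)
{
    uint16_t limit;

    assert(path_mtu > IPV4_TCP_OVERHEAD);
    limit = (uint16_t)(path_mtu - IPV4_TCP_OVERHEAD);
    return advertised_mss > limit ? limit : advertised_mss;
}
```

+For instance, a 1460-byte MSS crossing a PPPoE link with a 1492-byte MTU
+would be lowered to 1452 under these assumptions.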
1879
+This patch by Harald Welte <laforge@gnumonks.org> adds a new target that
1880
+enables the user to set the TTL value of an IP packet or to increment/decrement it
1884
+For example, if you want to set the TTL of all outgoing connections
1885
+to 126, you can do as follows :
1888
+# iptables -t mangle -A OUTPUT -j TTL --ttl-set 126
1890
+# iptables -t mangle --list
1891
+Chain OUTPUT (policy ACCEPT)
1892
+target prot opt source destination
1893
+TTL all -- anywhere anywhere TTL set to 126
1897
+Supported options for the TTL target are :
1900
+<tag>--ttl-set value</> -> Set TTL to <value>
1901
+<tag>--ttl-dec value</> -> Decrement TTL by <value>
1902
+<tag>--ttl-inc value</> -> Increment TTL by <value>
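+The three TTL operations can be sketched like this. The names are
+hypothetical, and saturating at 0 and 255 instead of wrapping around is an
+assumption of this sketch, not something stated by the patch description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three operations listed above. */
enum ttl_mode { TTL_MODE_SET, TTL_MODE_DEC, TTL_MODE_INC };

/* Return the new TTL; results are kept inside the 8-bit range 0..255. */
static uint8_t apply_ttl(uint8_t ttl, enum ttl_mode mode, uint8_t value)
{
    int result = ttl;

    switch (mode) {
    case TTL_MODE_SET: result = value;       break;
    case TTL_MODE_DEC: result = ttl - value; break;
    case TTL_MODE_INC: result = ttl + value; break;
    default:           assert(0);            break; /* unknown mode */
    }
    if (result < 0)
        result = 0;      /* assumption: saturate, do not wrap */
    if (result > 255)
        result = 255;
    return (uint8_t)result;
}
```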
1907
+This patch by Harald Welte <laforge@gnumonks.org> adds a new target
1908
+which supplies a more advanced packet logging mechanism than the standard LOG target.
1909
+The `libipulog/' directory contains a library for receiving the ULOG messages.
1913
+<url url="http://www.gnumonks.org/projects/ulogd" name="web page"> containing the proper documentation
1914
+for ULOG, so there is no point in my explaining it here.
1916
+<sect>New connection tracking patches
1918
+In this section, we will show the available connection tracking/NAT patches.
1919
+To use them, simply load the corresponding modules (with options if needed)
1920
+for them to be in effect.
1922
+<sect1>amanda-conntrack-nat patch
1924
+This patch by Brian J. Murrell <netfilter@interlinx.bc.ca> adds support
1925
+for connection tracking and NAT of the Amanda backup tool protocol.
1927
+<sect1>eggdrop-conntrack patch
1929
+This patch by Magnus Sandin <magnus@sandin.cx> adds support
1930
+for connection tracking for eggdrop bot networks.
1932
+<sect1>h323-conntrack-nat patch
1934
+This patch by Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> adds
1935
+an H.323/NetMeeting support module for netfilter connection tracking and NAT.
1938
+H.323 uses/relies on the following data streams :
1941
+<item>port 389 -> Internet Locator Server (TCP).
1942
+<item>port 522 -> User Location Server (TCP).
1943
+<item>port 1503 -> T.120 Protocol (TCP).
1944
+<item>port 1720 -> H.323 (H.225 call setup, TCP)
1945
+<item>port 1731 -> Audio call control (TCP)
1946
+<item>Dynamic port -> H.245 call control (TCP)
1947
+<item>Dynamic port -> RTCP/RTP streaming (UDP)
1951
+The H.323 conntrack/NAT modules support the connection tracking/NATing of
1952
+the data streams requested on the dynamic ports. The helpers use the
1953
+search/replace hack from the ip_masq_h323.c module for the 2.2 kernel
1957
+At the very minimum, H.323/netmeeting (video/audio) is functional by letting
1958
+through port 1720 and loading these H.323 module(s).
1961
+The H.323 conntrack/NAT modules do not support :
1964
+<item>H.245 tunnelling
1965
+<item>H.225 RAS (gatekeepers)
1968
+<sect1>irc-conntrack-nat patch
1970
+This patch by Harald Welte <laforge@gnumonks.org> allows DCC to work through NAT and
1971
+connection tracking. By default, this module will track IRC connections on port 6667.
1972
+You can change this to another port with the `ports=xx' argument.
1974
+<sect1>mms-conntrack-nat patch
1976
+This patch by Filip Sneppe <filip.sneppe@cronos.be> adds support for
1977
+connection tracking of the Microsoft Streaming Media Services protocol.
1980
+This allows the client (Windows Media Player) and the server
1981
+to negotiate the protocol (UDP or TCP) and port for the media stream.
1982
+A partially reverse engineered protocol analysis is available
1983
+from <url url="http://get.to/sdp" name="here">, together with a link to a Linux client.
1986
+It is recommended to open UDP port 1755 to the server, as this port is used
1987
+for retransmission requests.
1990
+This helper has been tested in SNAT and DNAT setups.
1994
+This patch by Harald Welte <laforge@gnumonks.org> allows netfilter to track PPTP connections as well as to NAT them.
1996
+<sect1>quake3-conntrack patch
1998
+This patch by Filip Sneppe <filip.sneppe@cronos.be> adds support for
1999
+Quake III Arena connection tracking and NAT.
2003
+This patch by Ian Larry Latter <Ian.Latter@mq.edu.au> adds support for
2004
+RSH connection tracking.
2007
+An RSH connection tracker is required if the dynamic stderr "Server
2008
+to Client" connection is to occur during a normal RSH session. This
2009
+typically operates as follows :
2012
+ Client 0:1023 --> Server 514 (stream 1 - stdin/stdout)
2013
+ Client 0:1023 <-- Server 0:1023 (stream 2 - stderr)
2017
+The author of this patch warns that this module could be dangerous, and
2018
+that it is not "best practice" to use RSH, and you should use SSH in all instances.
2020
+<sect1>snmp-nat patch
2022
+This patch by James Morris <jmorris@intercode.com.au> allows netfilter to NAT basic SNMP.
2023
+This is the ``basic'' form of SNMP-ALG, as described in
2024
+<url url="http://www.faqs.org/rfcs/rfc2962.html" name="RFC 2962">;
2025
+it works by modifying IP addresses inside SNMP payloads
2026
+to match IP-layer NAT mapping.
2028
+<sect1>talk-conntrack-nat patch
2030
+This patch by Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> allows netfilter to track
2031
+talk connections, as well as to NAT them. By default both otalk (UDP port 517) and talk (UDP port 518) are
2032
+supported. otalk/talk support can be selectively enabled/disabled
2033
+by the module parameters of the ip_conntrack_talk and ip_nat_talk modules. The options are :
2036
+<item>otalk = 0 | 1
2041
+where `0' means `do not support' while `1' means `do support'
2042
+the given protocol flavor.
2044
+<sect1>tcp-window-tracking patch
2046
+This patch by Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> allows netfilter
2047
+to do TCP connection tracking according to the article
2048
+<url url="http://www.iae.nl/users/guido/papers/tcp_filtering.ps.gz" name="Real Stateful TCP Packet Filtering in IP Filter"> by
2049
+Guido van Rooij. It supports window scaling, and can now handle already established connections.
2053
+This patch by Magnus Boden <mb@ozaba.mine.nu> allows netfilter to track
2054
+tftp connections as well as to NAT them. By default, this module will track
2055
+tftp connections on port 69. You can change this to another port with the
2056
+`ports=xx' argument.
2058
+<sect>New IPv6 netfilter matches
2060
+In this section, we will attempt to explain the usage of new netfilter matches.
2061
+The patches will appear in alphabetical order. Additionally, we will not explain
2062
+patches that break other patches. But this might come later.
2065
+Generally speaking, for matches, you can get the help hints from a particular
2070
+# ip6tables -m the_match_you_want --help
2075
+This would display the normal ip6tables help message, plus the specific
2076
+``the_match_you_want'' match help message at the end.
2080
+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds 1 new match :
2083
+<item>``eui64'' : lets you match the IPv6 packet based on its addressing parameters.
2087
+This patch can be quite useful for people using the EUI-64 IPv6 addressing scheme
2088
+who are willing to check the packets based on the delivered address on a LAN.
2091
+For example, we will redirect the packets that have a correct EUI-64 address:
2094
+# ip6tables -N ipv6ok
2095
+# ip6tables -A INPUT -m eui64 -j ipv6ok
2096
+# ip6tables -A INPUT -s ! 3FFE:2F00:A0::/64 -j ipv6ok
2097
+# ip6tables -A INPUT -j LOG
2098
+# ip6tables -A ipv6ok -j ACCEPT
2101
+Chain INPUT (policy ACCEPT)
2102
+target prot opt source destination
2103
+ipv6ok all anywhere anywhere eui64
2104
+ipv6ok all !3ffe:2f00:a0::/64 anywhere
2105
+LOG all anywhere anywhere LOG level warning
2107
+Chain ipv6ok (2 references)
2108
+target prot opt source destination
2109
+ACCEPT all anywhere anywhere
2113
+This match has no options.
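+As background, the kind of check an EUI-64 match can perform is sketched
+below: the interface identifier (the low 64 bits of the source address) is
+rebuilt from the sender's MAC address using the modified EUI-64 rule and
+compared with the address actually used. The function names are
+hypothetical, and this is a user-space illustration, not the patch's code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Build a modified EUI-64 interface identifier from a 48-bit MAC address:
 * insert ff:fe between the two halves and flip the universal/local bit.
 */
static void mac_to_eui64(const uint8_t mac[6], uint8_t iid[8])
{
    assert(mac != NULL && iid != NULL);
    memcpy(iid, mac, 3);
    iid[3] = 0xff;
    iid[4] = 0xfe;
    memcpy(iid + 5, mac + 3, 3);
    iid[0] ^= 0x02;
}

/* Return 1 if the low 64 bits of src_addr were derived from src_mac. */
static int source_is_eui64(const uint8_t src_addr[16], const uint8_t src_mac[6])
{
    uint8_t iid[8];

    mac_to_eui64(src_mac, iid);
    return memcmp(src_addr + 8, iid, sizeof iid) == 0;
}
```

+For example, MAC 00:11:22:33:44:55 yields the identifier
+0211:22ff:fe33:4455.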
2115
+<sect1>ahesp6 patch
2117
+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match
2118
+that allows you to match a packet based on the content of its AH and ESP headers.
2119
+The names of the matches:
2121
+<item>``ah'' : lets you match the IPv6 packet based on its ah header.
2122
+<item>``esp'' : lets you match the IPv6 packet based on its esp header.
2126
+For example, we will drop all the AH packets that have a SPI equal to
2127
+500, and check the contents of the reserved field in the header :
2130
+# ip6tables -A INPUT -m ah --ahspi 500 --ahres -j DROP
2133
+Chain INPUT (policy ACCEPT)
2134
+target prot opt source destination
2135
+DROP all anywhere anywhere ah spi:500 reserved
2139
+Supported options for the ah match are :
2142
+<tag>--ahspi [!] spi[:spi]</> -> match spi (range)
2143
+<tag>--ahlen [!] length</> -> length of this header
2144
+<tag>--ahres </> -> checks the contents of the reserved field
2148
+The esp match works exactly the same as in IPv4 :
2151
+# ip6tables -A INPUT -m esp --espspi 500 -j DROP
2154
+Chain INPUT (policy ACCEPT)
2155
+target prot opt source destination
2156
+DROP all anywhere anywhere esp spi:500
2160
+Supported options for the esp match are :
2163
+<tag>--espspi [!] spi[:spi]</> -> match spi (range)
2166
+In IPv6 these matches can be concatenated:
2169
+# ip6tables -A INPUT -m ah --ahspi 500 --ahres --ahlen ! 40 -m esp --espspi 500 -j DROP
2172
+Chain INPUT (policy ACCEPT)
2173
+target prot opt source destination
2174
+DROP all anywhere anywhere ah spi:500 length:!40 reserved esp spi:500
2179
+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match
2180
+that allows you to match a packet based on the content of its fragmentation
2182
+The name of the match:
2184
+<item>``frag'' : lets you match the IPv6 packet based on its fragmentation
2189
+For example, we will drop all the packets that have an ID between 100 and 200,
2190
+and that are the first fragment :
2193
+# ip6tables -A INPUT -m frag --fragid 100:200 --fragfirst -j DROP
2196
+Chain INPUT (policy ACCEPT)
2197
+target prot opt source destination
2198
+DROP all anywhere anywhere frag ids:100:200 first
2202
+Supported options for the frag match are :
2205
+<tag>--fragid [!] id[:id]</> -> match the id (range) of the fragmentation
2206
+<tag>--fraglen [!] length</> -> match total length of this header
2207
+<tag>--fragres</> -> checks the contents of the reserved field
2208
+<tag>--fragfirst</> -> matches on the first fragment
2209
+<tag>--fragmore</> -> there are more fragments
2210
+<tag>--fraglast</> -> this is the last fragment
2213
+<sect1>ipv6header patch
2215
+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match
2216
+that allows you to match a packet based on its extension headers.
2217
+The name of the match:
2219
+<item>``ipv6header'' : lets you match the IPv6 packet based on its headers.
2223
+For example, let's drop the packets which have got hop-by-hop, ipv6-route
2224
+headers and a protocol payload:
2227
+# ip6tables -A INPUT -m ipv6header --header hop-by-hop,ipv6-route,protocol -j DROP
2230
+Chain INPUT (policy ACCEPT)
2231
+target prot opt source destination
2232
+DROP all anywhere anywhere ipv6header flags:hop-by-hop,ipv6-route,protocol
2236
+And now, let's drop the packets which have got an ipv6-route extension header:
2239
+# ip6tables -A INPUT -m ipv6header --header ipv6-route --soft -j DROP
2241
+# ip6tables --list
2242
+Chain INPUT (policy ACCEPT)
2243
+target prot opt source destination
2244
+DROP all anywhere anywhere ipv6header flags:ipv6-route soft
2248
+Supported options for the ipv6header match are :
2250
+<tag>[!] --header headers</> -> You can specify the relevant
2251
+headers with this option. Accepted formats:
2253
+<item>hop,dst,route,frag,auth,esp,none,proto
2254
+<item>hop-by-hop,ipv6-opts,ipv6-route,ipv6-frag,ah,esp,ipv6-nonxt,protocol
2255
+<item>0,60,43,44,51,50,59
2257
+<tag>--soft</> -> You can specify the soft mode: in this mode
2258
+the match checks the existence of the header, not the full match!
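+One plausible reading of the difference between the default (full) mode and
+--soft is sketched below as a bitmask comparison. The flag values and the
+function name are hypothetical, and inversion with `!' is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits, one per header name accepted by --header. */
enum {
    HDR_HOP   = 1 << 0,
    HDR_DST   = 1 << 1,
    HDR_ROUTE = 1 << 2,
    HDR_FRAG  = 1 << 3,
    HDR_AUTH  = 1 << 4,
    HDR_ESP   = 1 << 5,
    HDR_NONE  = 1 << 6,
    HDR_PROTO = 1 << 7
};

/*
 * present: headers found in the packet; wanted: headers given to --header.
 * Full mode requires exactly the wanted set. Soft mode (assumed reading)
 * only requires every wanted header to exist; extra headers are allowed.
 */
static int headers_match(uint8_t present, uint8_t wanted, int soft)
{
    assert(wanted != 0);
    if (soft)
        return (present & wanted) == wanted;
    return present == wanted;
}
```

+Under this reading, a packet with a routing header and a payload satisfies
+`--header ipv6-route --soft' but not `--header ipv6-route' alone.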
2261
+<sect1>ipv6-ports patch
2263
+This patch by Jan Rekorajski <baggins@pld.org.pl> adds 4 new matches :
2266
+<item>``limit'' : lets you restrict the number of parallel TCP connections from a particular host or network.
2267
+<item>``mac'' : lets you match a packet based on its MAC address.
2268
+<item>``multiport'' : lets you specify ports with a mix of port-ranges and single ports for UDP and TCP protocols.
2269
+<item>``owner'' : lets you match a packet based on its originator process' owner id.
2273
+These matches are the ports of the IPv4 versions. See the main documentation for the details!
2275
+<sect1>length patch
2277
+This patch by Imran Patel <ipatel@crosswinds.net> adds a new match
2278
+that allows you to match a packet based on its length. (This patch is a shameless adaptation of the
2279
+IPv4 match written by James Morris <jmorris@intercode.com.au>)
2282
+For example, let's drop all the pings with a packet size greater than
2286
+# ip6tables -A INPUT -p ipv6-icmp --icmpv6-type echo-request -m length --length 85:0xffff -j DROP
2288
+# ip6tables --list
2289
+Chain INPUT (policy ACCEPT)
2290
+target prot opt source destination
2291
+DROP ipv6-icmp -- anywhere anywhere ipv6-icmp echo-request length 85:65535
2295
+Supported options for the length match are :
2298
+<tag>[!] --length length[:length]</> -> Match packet length
2299
+against value or range of values (inclusive)
2303
+Values of the range not present will be implied. The implied value for minimum
2304
+is 0, and for maximum is 65535.
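+The implied bounds can be illustrated with a small parser for the
+`length[:length]' argument. This is a user-space sketch with hypothetical
+names, not the code used by ip6tables; it accepts decimal and 0x-prefixed
+values such as the 85:0xffff used above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct length_range { uint16_t min, max; };

/* Parse one bound of len characters; an empty bound yields the fallback. */
static int parse_bound(const char *s, size_t len, uint16_t fallback, uint16_t *out)
{
    char buf[16];
    char *end;
    unsigned long v;

    if (len == 0) {
        *out = fallback;
        return 0;
    }
    if (len >= sizeof buf)
        return -1;
    memcpy(buf, s, len);
    buf[len] = '\0';
    v = strtoul(buf, &end, 0);      /* base 0: decimal, octal or 0x hex */
    if (*end != '\0' || v > 65535)
        return -1;
    *out = (uint16_t)v;
    return 0;
}

/* Parse "N", "N:M", ":M" or "N:"; returns 0 on success, -1 on bad input. */
static int parse_length_range(const char *arg, struct length_range *r)
{
    const char *colon;

    assert(arg != NULL && r != NULL);
    colon = strchr(arg, ':');
    if (colon == NULL) {            /* single value: min == max */
        if (*arg == '\0' || parse_bound(arg, strlen(arg), 0, &r->min) != 0)
            return -1;
        r->max = r->min;
        return 0;
    }
    if (parse_bound(arg, (size_t)(colon - arg), 0, &r->min) != 0)
        return -1;                  /* missing minimum implies 0 */
    if (parse_bound(colon + 1, strlen(colon + 1), 65535, &r->max) != 0)
        return -1;                  /* missing maximum implies 65535 */
    return r->min <= r->max ? 0 : -1;
}

/* The range is inclusive at both ends. */
static int length_in_range(const struct length_range *r, uint16_t len)
{
    return len >= r->min && len <= r->max;
}
```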
2306
+<sect1>route6 patch
2308
+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match
2309
+that allows you to match a packet based on the content of its routing
2311
+The name of the match:
2313
+<item>``rt'' : lets you match the IPv6 packet based on its routing
2318
+For example, we will drop all the packets that have 0 routing type, the packet
2319
+is near the last hop (max 2 hops far), the routing path contains ::1 and ::2
2323
+# ip6tables -A INPUT -m rt --rt-type 0 --rt-segsleft :2 --rt-0-addrs ::1,::2 --rt-0-not-strict -j DROP
2326
+Chain INPUT (policy ACCEPT)
2327
+target prot opt source destination
2328
+DROP all anywhere anywhere rt type:0 segslefts:0:2 0-addrs ::1,::2 0-not-strict
2332
+Supported options for the rt match are :
2335
+<tag>--rt-type [!] type</> -> matches the type
2336
+<tag>--rt-segsleft [!] num[:num]</> -> matches the Segments Left field (range)
2337
+<tag>--rt-len [!] length</> -> total length of this header
2338
+<tag>--rt-0-res</> -> checks the contents of the reserved field
2339
+<tag>--rt-0-addrs ADDR[,ADDR...]</> -> Type=0 addresses (list, max: 16)
2340
+<tag>--rt-0-not-strict</> -> List of Type=0 addresses not a strict list
2343
+<sect>New IPv6 netfilter targets
2345
+In this section, we will attempt to explain the usage of new netfilter targets.
2346
+The patches will appear in alphabetical order. Additionally, we will not explain
2347
+patches that break other patches. But this might come later.
2350
+Generally speaking, for targets, you can get the help hints from a particular
2355
+# ip6tables -j THE_TARGET_YOU_WANT --help
2360
+This would display the normal ip6tables help message, plus the specific
2361
+``THE_TARGET_YOU_WANT'' target help message at the end.
2365
+This patch by Jan Rekorajski <baggins@pld.org.pl> adds a new target that allows you
2366
+to LOG the packets as in the IPv4 version of iptables.
2369
+The examples are the same as in iptables. See the man page for details!
2371
+<sect1>REJECT patch
2373
+This patch by Harald Welte <laforge@gnumonks.org> adds a new target that allows you
2374
+to REJECT the packets as in the IPv4 version of iptables.
2377
+The examples are the same as in iptables. See the man page for details!
2379
+<sect>New IPv6 connection tracking patches
2381
+Connection tracking is not supported yet.
2385
+<sect1>Contributing a new extension
2387
+The netfilter core team always welcomes new extensions/bug-fixes. In this section we will not focus
2388
+on how to package a new extension to ease its inclusion into patch-o-matic yet. But this might
2389
+come in a future version of this HOWTO.
2392
+First of all, you should be familiar with the
2393
+<url url="http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO.html" name="Netfilter Hacking HOWTO">.
2396
+Rusty has already written a guideline on how to make new patches for netfilter,
2400
+/path/to/netfiltercvs/netfilter/patch-o-matic/NEWPATCHES
2404
+Or read the latest version online at :
2405
+<url url="http://cvs.netfilter.org/cgi-bin/cvsweb/netfilter/patch-o-matic/NEWPATCHES" name="NEWPATCHES">.
2408
+Finally, it's a good idea to subscribe to the netfilter-devel mailing list.
2409
+More info on how to subscribe can be found on the netfilter homepage.
2411
+<sect1>Contributing to this HOWTO
2413
+You are most welcome to update this HOWTO. To do so, the preferred way
2414
+is to send a patch of the SGML master of this document to the
2415
+netfilter-devel mailing list.
2418
+Thanks for your help! Thanks to the developers who contributed the
2419
+netfilter-extensions-HOWTO parts related to their patches.
2421
Index: iptables-1.4.12/howtos/netfilter-hacking-HOWTO.sgml
2422
===================================================================
2423
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
2424
+++ iptables-1.4.12/howtos/netfilter-hacking-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600
2426
+<!doctype linuxdoc system>
2428
+<!-- This is the Linux Netfilter Hacking HOWTO.
2431
+<!-- $Id: netfilter-hacking-HOWTO.sgml,v 1.14 2002/07/02 04:07:19 fabrice Exp $ -->
2435
+<!-- Title information -->
2437
+<title>Linux netfilter Hacking HOWTO
2438
+<author>Rusty Russell and Harald Welte, mailing list <tt>netfilter@lists.samba.org</tt>
2439
+<date>$Revision: 1.14 $ $Date: 2002/07/02 04:07:19 $
2441
+This document describes the netfilter architecture for Linux, how to
2442
+hack it, and some of the major systems which sit on top of it, such as
2443
+packet filtering, connection tracking and Network Address Translation.
2446
+<!-- Table of contents -->
2449
+<!-- Begin the document -->
2451
+<sect>Introduction<label id="intro">
2457
+This document is a journey; some parts are well-traveled, and in
2458
+other areas you will find yourself almost alone. The best advice I
2459
+can give you is to grab a large, cozy mug of coffee or hot chocolate,
2460
+get into a comfortable chair, and absorb the contents before venturing
2461
+out into the sometimes dangerous world of network hacking.
2463
+<p>For more understanding of the use of the infrastructure on top of
2464
+the netfilter framework, I recommend reading the Packet Filtering
2465
+HOWTO and the NAT HOWTO. For information on kernel programming I
2466
+suggest Rusty's Unreliable Guide to Kernel Hacking and Rusty's
2467
+Unreliable Guide to Kernel Locking.
2469
+<p>(C) 2000 Paul `Rusty' Russell. Licensed under the GNU GPL.
2471
+<sect1>What is netfilter?
2474
+netfilter is a framework for packet mangling, outside the normal
2475
+Berkeley socket interface. It has four parts. Firstly, each protocol
2476
+defines "hooks" (IPv4 defines 5) which are well-defined points in a
2477
+packet's traversal of that protocol stack. At each of these points,
2478
+the protocol will call the netfilter framework with the packet and the
2482
+Secondly, parts of the kernel can register to listen to the different
2483
+hooks for each protocol. So when a packet is passed to the netfilter
2484
+framework, it checks to see if anyone has registered for that protocol
2485
+and hook; if so, they each get a chance to examine (and possibly
2486
+alter) the packet in order, then discard the packet
2487
+(<tt>NF_DROP</tt>), allow it to pass (<tt>NF_ACCEPT</tt>), tell
2488
+netfilter to forget about the packet (<tt>NF_STOLEN</tt>), or ask
2489
+netfilter to queue the packet for userspace (<tt>NF_QUEUE</tt>).
2492
+The third part is that packets that have been queued are collected (by
2493
+the ip_queue driver) for sending to userspace; these packets are
2494
+handled asynchronously.
2497
+The final part consists of cool comments in the code and
2498
+documentation. This is instrumental for any experimental project.
2499
+The netfilter motto is (stolen shamelessly from Cort Dougan):
2502
+ ``So... how is this better than KDE?''
2505
+<p>(This motto narrowly edged out `Whip me, beat me, make me use
2509
+In addition to this raw framework, various modules have been written
2510
+which provide functionality similar to previous (pre-netfilter)
2511
+kernels, in particular, an extensible NAT system, and an extensible
2512
+packet filtering system (iptables).
2514
+<sect1>What's wrong with what we had in 2.0 and 2.2?
2518
+<item>No infrastructure established for passing packets to userspace:
2520
+<item>Kernel coding is hard
2521
+<item>Kernel coding must be done in C/C++
2522
+<item>Dynamic filtering policies do not belong in kernel
2523
+<item> 2.2 introduced copying packets to userspace via netlink, but
2524
+ reinjecting packets is slow, and subject to `sanity' checks.
2525
+ For example, reinjecting a packet claiming to come from an
2526
+ existing interface is not possible.
2529
+<item>Transparent proxying is a crock:
2533
+<item> We look up <bf>every</bf> packet to see if there is a socket
2534
+bound to that address
2536
+<item> Root is allowed to bind to foreign addresses
2538
+<item> Can't redirect locally-generated packets
2540
+<item> REDIRECT doesn't handle UDP replies: redirecting UDP named
2541
+packets to 1153 doesn't work because some clients don't like replies
2542
+coming from anything other than port 53.
2544
+<item> REDIRECT doesn't coordinate with tcp/udp port allocation: a
2545
+user may get a port shadowed by a REDIRECT rule.
2547
+<item>Has been broken at least twice during 2.1 series.
2549
+<item>Code is extremely intrusive. Consider the stats on the number
2550
+of #ifdef CONFIG_IP_TRANSPARENT_PROXY in 2.2.1: 34 occurrences in 11
2551
+files. Compare this with CONFIG_IP_FIREWALL, which has 10 occurrences
2555
+<item>Creating packet filter rules independent of interface addresses
2559
+<item>Must know local interface addresses to distinguish
2560
+locally-generated or locally-terminating packets from through
2563
+<item>Even that is not enough in cases of redirection or
2566
+<item>Forward chain only has information on outgoing interface,
2567
+meaning you have to figure where a packet came from using knowledge of
2568
+the network topography.
2571
+<item>Masquerading is tacked onto packet filtering:<p>
2572
+ Interactions between packet filtering and masquerading make firewalling
2575
+<item>At input filtering, reply packets appear to be destined for box itself
2576
+<item>At forward filtering, demasqueraded packets are not seen at all
2577
+<item>At output filtering, packets appear to come from local box
2580
+<item>TOS manipulation, redirect, ICMP unreachable and mark (which can
2581
+affect port forwarding, routing, and QoS) are tacked onto packet
2582
+filter code as well.
2584
+<item>ipchains code is neither modular, nor extensible (eg. MAC
2585
+address filtering, options filtering, etc).
2587
+<item>Lack of sufficient infrastructure has led to a profusion of
2588
+different techniques:
2590
+<item>Masquerading, plus per-protocol modules
2591
+<item>Fast static NAT by routing code (doesn't have per-protocol handling)
2592
+<item>Port forwarding, redirect, auto forwarding
2593
+<item>The Linux NAT and Virtual Server Projects.
2596
+<item>Incompatibility between CONFIG_NET_FASTROUTE and packet filtering:
2598
+<item>Forwarded packets traverse three chains anyway
2599
+<item>No way to tell if these chains can be bypassed
2602
+<item>Inspection of packets dropped due to routing protection
2603
+(eg. Source Address Verification) not possible.
2605
+<item>No way of atomically reading counters on packet filter rules.
2607
+<item>CONFIG_IP_ALWAYS_DEFRAG is a compile-time option, making life
2608
+difficult for distributions who want one general-purpose kernel.
2612
+<sect1>Who are you?
2615
+I'm the only one foolish enough to do this. As ipchains co-author and
2616
+current Linux Kernel IP Firewall maintainer, I see many of the
2617
+problems that people have with the current system, as well as getting
2618
+exposure to what they are trying to do.
2620
+<sect1>Why does it crash?
2623
+Woah! You should have seen it <bf>last</bf> week!
2626
+Because I'm not as great a programmer as we might all wish, and I
2627
+certainly haven't tested all scenarios, because of lack of time,
2628
+equipment and/or inspiration. I do have a testsuite, which I
2629
+encourage you to contribute to.
2631
+<sect>Where Can I Get The Latest?
2633
+<p>There is a CVS server on netfilter.org which contains the latest
2634
+HOWTOs, userspace tools and testsuite. For casual browsing, you
2636
+<url url="http://cvs.netfilter.org/" name="Web Interface">.
2638
+To grab the latest sources, you can do the following:
2641
+<item> Log in to the netfilter CVS server anonymously:
2643
+cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic login
2645
+<item> When it asks you for a password type `cvs'.
2646
+<item> Check out the code using:
2648
+# cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic co netfilter/userspace
2650
+<item> To update to the latest version, use
2656
+<sect>Netfilter Architecture
2658
+<p>Netfilter is merely a series of hooks in various points in a
2659
+protocol stack (at this stage, IPv4, IPv6 and DECnet). The
2660
+(idealized) IPv4 traversal diagram looks like the following:
2663
+A Packet Traversing the Netfilter System:
2665
+ --->[1]--->[ROUTE]--->[3]--->[4]--->
2674
+</verb></tscreen><label id="netfilter-traversal">
2676
+On the left is where packets come in: having passed the simple sanity
2677
+checks (i.e., not truncated, IP checksum OK, not a promiscuous receive),
2678
+they are passed to the netfilter framework's NF_IP_PRE_ROUTING [1] hook.
2681
+Next they enter the routing code, which decides whether the packet is
2682
+destined for another interface, or a local process. The routing code
2683
+may drop packets that are unroutable.
2686
+If it's destined for the box itself, the netfilter framework is called
2687
+again for the NF_IP_LOCAL_IN [2] hook, before being passed to the
2691
+If it's destined to pass to another interface instead, the netfilter
2692
+framework is called for the NF_IP_FORWARD [3] hook.
2695
+The packet then passes a final netfilter hook, the NF_IP_POST_ROUTING
2696
+[4] hook, before being put on the wire again.
2699
+The NF_IP_LOCAL_OUT [5] hook is called for packets that are created
2700
+locally. Here you can see that routing occurs after this hook is
2701
+called: in fact, the routing code is called first (to figure out the
2702
+source IP address and some IP options): if you want to alter the
2703
+routing, you must alter the `skb->dst' field yourself, as is done in
2706
+<sect1>Netfilter Base
2708
+Now that we have an example of netfilter for IPv4, you can see when each
2709
+hook is activated. This is the essence of netfilter.
2712
+Kernel modules can register to listen at any of these hooks. A module
2713
+that registers a function must specify the priority of the function
2714
+within the hook; then when that netfilter hook is called from the core
2715
+networking code, each module registered at that point is called in the
2716
+order of priorities, and is free to manipulate the packet. The
2717
+module can then tell netfilter to do one of five things:
2720
+<item> NF_ACCEPT: continue traversal as normal.
2721
+<item> NF_DROP: drop the packet; don't continue traversal.
2722
+<item> NF_STOLEN: I've taken over the packet; don't continue traversal.
2723
+<item> NF_QUEUE: queue the packet (usually for userspace handling).
2724
+<item> NF_REPEAT: call this hook again.
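+The priority ordering and the five verdicts above can be mimicked in user
+space as follows. Everything here (the SIM_ names, struct sim_packet, the
+repeat limit) is a hypothetical simulation for illustration; it is not the
+kernel's API, and the two sample hooks exist only to exercise the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simulated verdicts; names chosen to avoid confusion with the kernel's. */
enum sim_verdict { SIM_DROP, SIM_ACCEPT, SIM_STOLEN, SIM_QUEUE, SIM_REPEAT };

struct sim_packet { int mark; };

typedef enum sim_verdict (*sim_hook_fn)(struct sim_packet *pkt);

struct sim_hook { int priority; sim_hook_fn fn; };

enum { SIM_MAX_REPEATS = 8 };   /* assumption: guard against endless repeats */

static int by_priority(const void *a, const void *b)
{
    int pa = ((const struct sim_hook *)a)->priority;
    int pb = ((const struct sim_hook *)b)->priority;
    return (pa > pb) - (pa < pb);
}

/* Call each registered function in priority order (lowest value first). */
static enum sim_verdict run_hooks(struct sim_hook *hooks, size_t n,
                                  struct sim_packet *pkt)
{
    size_t i;

    assert(hooks != NULL && pkt != NULL && n > 0);
    qsort(hooks, n, sizeof *hooks, by_priority);
    for (i = 0; i < n; i++) {
        int repeats = 0;
        enum sim_verdict v;

        for (;;) {
            v = hooks[i].fn(pkt);
            if (v != SIM_REPEAT)
                break;                   /* anything else ends the retry loop */
            if (++repeats >= SIM_MAX_REPEATS)
                return SIM_DROP;
        }
        if (v != SIM_ACCEPT)
            return v;                    /* drop, stolen, queue: stop traversal */
    }
    return SIM_ACCEPT;
}

/* Sample hooks: one marks the packet, the other drops marked packets. */
static enum sim_verdict mark_packet(struct sim_packet *pkt)
{
    pkt->mark = 1;
    return SIM_ACCEPT;
}

static enum sim_verdict drop_marked(struct sim_packet *pkt)
{
    return pkt->mark ? SIM_DROP : SIM_ACCEPT;
}
```

+Registering drop_marked at priority 10 and mark_packet at priority -5 means
+the marking function runs first, so the packet is dropped.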
2728
+The other parts of netfilter (handling queued packets, cool comments)
2729
+will be covered in the kernel section later.
2732
+Upon this foundation, we can build fairly complex packet
2733
+manipulations, as shown in the next two sections.
2735
+<sect1>Packet Selection: IP Tables
2737
+A packet selection system called IP Tables has been built over the
2738
+netfilter framework. It is a direct descendant of ipchains (that came
2739
+from ipfwadm, that came from BSD's ipfw IIRC), with extensibility.
2740
+Kernel modules can register a new table, and ask for a packet to
2741
+traverse a given table. This packet selection method is used for
2742
+packet filtering (the `filter' table), Network Address Translation
2743
+(the `nat' table) and general pre-route packet mangling (the `mangle'
2746
+<p>The hooks that are registered with netfilter are as follows (with
2747
+the functions in each hook in the order that they are actually
2752
+ --->PRE------>[ROUTE]--->FWD---------->POST------>
2753
+ Conntrack | Mangle ^ Mangle
2754
+ Mangle | Filter | NAT (Src)
2755
+ NAT (Dst) | | Conntrack
2758
+ IN Filter OUT Conntrack
2759
+ | Conntrack ^ Mangle
2760
+ | Mangle | NAT (Dst)
2764
+<sect2>Packet Filtering
2767
+This table, `filter', should never alter packets: only filter them.
2770
+One of the advantages of iptables filter over ipchains is that it is
2771
+small and fast, and it hooks into netfilter at the NF_IP_LOCAL_IN,
2772
+NF_IP_FORWARD and NF_IP_LOCAL_OUT points. This means that for any
2773
+given packet, there is one (and only one) possible place to filter it.
2774
+This makes things much simpler for users than ipchains was. Also, the
2775
+fact that the netfilter framework provides both the input and output
2776
+interfaces for the NF_IP_FORWARD hook means that many kinds of
2777
+filtering are far simpler.
2780
+Note: I have ported the kernel portions of both ipchains and ipfwadm
2781
+as modules on top of netfilter, enabling the use of the old ipfwadm
2782
+and ipchains userspace tools without requiring an upgrade.
2787
+This is the realm of the `nat' table, which is fed packets from two
2788
+netfilter hooks: for non-local packets, the NF_IP_PRE_ROUTING and
2789
+NF_IP_POST_ROUTING hooks are perfect for destination and source
2790
+alterations respectively. If CONFIG_IP_NF_NAT_LOCAL is defined, the
2791
+hooks NF_IP_LOCAL_OUT and NF_IP_LOCAL_IN are used for altering the
2792
+destination of local packets.
2795
+This table is slightly different from the `filter' table, in that only
2796
+the first packet of a new connection will traverse the table: the
2797
+result of this traversal is then applied to all future packets in the
2800
+<sect3>Masquerading, Port Forwarding, Transparent Proxying
2802
+<p>I divide NAT into Source NAT (where the first packet has its source
2803
+altered), and Destination NAT (the first packet has its destination
2806
+<p>Masquerading is a special form of Source NAT: port forwarding and
2807
+transparent proxying are special forms of Destination NAT. These are
2808
+now all done using the NAT framework, rather than being independent
2811
+<sect2>Packet Mangling
2813
+<p>The packet mangling table (the `mangle' table) is used for actual
2814
+changing of packet information. Example applications are the TOS and
2815
+TCPMSS targets. The mangle table hooks into all five netfilter hooks.
2816
+(Please note that this changed with kernel 2.4.18. Previous kernels didn't
2817
+have mangle attached to all hooks.)
2819
+<sect1>Connection Tracking
2821
+Connection tracking is fundamental to NAT, but it is implemented as a
2822
+separate module; this allows an extension to the packet filtering code
2823
+to simply and cleanly use connection tracking (the `state' module).
2825
+<sect1>Other Additions
2827
+<p>The new flexibility provides both the opportunity to do really
2828
+funky things, and for people to write enhancements or complete
2829
+replacements that can be mixed and matched.
2831
+<sect>Information for Programmers
2833
+<p>I'll let you in on a secret: my pet hamster did all the coding. I
2834
+was just a channel, a `front' if you will, in my pet's grand plan.
2835
+So, don't blame me if there are bugs. Blame the cute, furry one.
2837
+<sect1>Understanding ip_tables
2839
+<p>iptables simply provides a named array of rules in memory (hence
2840
+the name `iptables'), and such information as where packets from each
2841
+hook should begin traversal. After a table is registered, userspace
2842
+can read and replace its contents using getsockopt() and setsockopt().
2844
+<p>iptables does not register with any netfilter hooks: it relies on
2845
+other modules to do that and feed it the packets as appropriate; a
2846
+module must register the netfilter hooks and ip_tables separately, and
2847
+provide the mechanism to call ip_tables when the hook is reached.
2849
+<sect2> ip_tables Data Structures
2851
+<p>For convenience, the same data structure is used to represent a
2852
+rule by userspace and within the kernel, although a few fields are
2853
+only used inside the kernel.
2855
+<p>Each rule consists of the following parts:
2857
+<item> A `struct ipt_entry'.
2858
+<item> Zero or more `struct ipt_entry_match' structures, each with a
2859
+ variable amount (0 or more bytes) of data appended to it.
2860
+<item> A `struct ipt_entry_target' structure, with a variable amount
2861
+ (0 or more bytes) of data appended to it.
2864
+The variable nature of the rule gives a huge amount of flexibility for
2865
+extensions, as we'll see, especially as each match or target can carry
2866
+an arbitrary amount of data. This does create a few traps, however:
2867
+we have to watch out for alignment. We do this by ensuring that the
2868
+`ipt_entry', `ipt_entry_match' and `ipt_entry_target' structures are
2869
+conveniently sized, and that all data is rounded up to the maximal
2870
+alignment of the machine using the IPT_ALIGN() macro.
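+<p>As a rough illustration of the rounding described above, here is a
+minimal userspace sketch. The names `SKETCH_MAX_ALIGN' and
+`sketch_align()' are hypothetical, and the alignment of 8 bytes is an
+assumption: the real IPT_ALIGN() takes its value from the kernel
+headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the machine's maximal alignment; the real
 * IPT_ALIGN() derives its value from the kernel headers. */
#define SKETCH_MAX_ALIGN 8u

/* Round a size up to the next multiple of SKETCH_MAX_ALIGN.
 * The mask trick only works because the alignment is a power of two. */
static size_t sketch_align(size_t size)
{
    assert((SKETCH_MAX_ALIGN & (SKETCH_MAX_ALIGN - 1)) == 0);
    return (size + (SKETCH_MAX_ALIGN - 1)) & ~(size_t)(SKETCH_MAX_ALIGN - 1);
}
```

+<p>Under that assumption a 13-byte match payload would occupy 16 bytes
+in the rule, so whatever follows it starts on an aligned boundary.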
2873
+The `struct ipt_entry' has the following fields:
2875
+<item> A `struct ipt_ip' part, containing the specifications for the
2876
+IP header that it is to match.
2878
+<item> An `nf_cache' bitfield showing what parts of the packet this
2881
+<item> A `target_offset' field indicating the offset from the
2882
+beginning of this rule where the ipt_entry_target structure begins.
2883
+This should always be aligned correctly (with the IPT_ALIGN macro).
2885
+<item> A `next_offset' field indicating the total size of this rule,
2886
+including the matches and target. This should also be aligned
2887
+correctly using the IPT_ALIGN macro.
2889
+<item> A `comefrom' field used by the kernel to track packet
2892
+<item> A `struct ipt_counters' field containing the packet and byte
2893
+counters for packets which matched this rule.
2897
+The `struct ipt_entry_match' and `struct ipt_entry_target' are very
2898
+similar, in that they contain a total (IPT_ALIGN'ed) length field
2899
+(`match_size' and `target_size' respectively) and a union holding the
2900
+name of the match or target (for userspace), and a pointer (for the
2904
+Because of the tricky nature of the rule data structure, some helper
2905
+routines are provided:
2908
+<tag>ipt_get_target()</tag> This inline function returns a pointer to
2909
+the target of a rule.
2911
+<tag>IPT_MATCH_ITERATE()</tag> This macro calls the given function for
2912
+every match in the given rule. The function's first argument is the
2913
+`struct ipt_entry_match', and other arguments (if any) are those
2914
+supplied to the IPT_MATCH_ITERATE() macro. The function must return
2915
+either zero for the iteration to continue, or a non-zero value to
2918
+<tag>IPT_ENTRY_ITERATE()</tag> This macro takes a pointer to an
2919
+entry, the total size of the table of entries, and a function to call.
2920
+The function's first argument is the `struct ipt_entry', and other
2921
+arguments (if any) are those supplied to the IPT_ENTRY_ITERATE()
2922
+macro. The function must return either zero for the iteration to
2923
+continue, or a non-zero value to stop.
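+<p>The iteration contract described above (walk the rules using each
+rule's total size, stop as soon as the callback returns non-zero) can
+be sketched in plain userspace C. Everything here is hypothetical:
+`struct sketch_entry' is a cut-down stand-in for `struct ipt_entry',
+and `sketch_entry_iterate()' is a function, not the real macro.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down rule header: only what is needed to walk the
 * table. The real struct ipt_entry has many more fields. */
struct sketch_entry {
    unsigned short next_offset; /* total size of this rule in bytes */
    unsigned short id;          /* illustrative payload only */
};

typedef int (*sketch_entry_fn)(const struct sketch_entry *e, void *arg);

/* Walk `size' bytes of back-to-back rules, calling fn on each one.
 * Returns fn's value as soon as it is non-zero, otherwise 0. */
static int sketch_entry_iterate(const unsigned char *table, size_t size,
                                sketch_entry_fn fn, void *arg)
{
    size_t off = 0;

    while (off + sizeof(struct sketch_entry) <= size) {
        struct sketch_entry e;
        int ret;

        /* memcpy sidesteps alignment concerns in this toy buffer */
        memcpy(&e, table + off, sizeof(e));
        if (e.next_offset < sizeof(e))
            return -1; /* malformed rule: we would never advance */
        ret = fn(&e, arg);
        if (ret != 0)
            return ret;
        off += e.next_offset;
    }
    return 0;
}

/* Append a toy rule of rule_size bytes at off; returns the new offset. */
static size_t sketch_put(unsigned char *table, size_t off,
                         unsigned short rule_size, unsigned short id)
{
    struct sketch_entry e = { rule_size, id };

    assert(rule_size >= sizeof(e));
    memcpy(table + off, &e, sizeof(e));
    return off + rule_size;
}

/* Callback: count every rule and keep going. */
static int sketch_count(const struct sketch_entry *e, void *arg)
{
    (void)e;
    ++*(int *)arg;
    return 0;
}

/* Callback: stop (return 1) at the rule whose id equals *arg. */
static int sketch_find_id(const struct sketch_entry *e, void *arg)
{
    return e->id == *(unsigned short *)arg;
}
```

+<p>The callbacks show both halves of the contract: `sketch_count()'
+always returns zero and so visits every rule, while `sketch_find_id()'
+ends the walk early by returning non-zero.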
2926
+<sect2>ip_tables From Userspace
2928
+<p>Userspace has four operations: it can read the current table, read
2929
+the info (hook positions and size of table), replace the table (and
2930
+grab the old counters), and add in new counters.
2932
+<p>This allows any atomic operation to be simulated by userspace: this
2933
+is done by the libiptc library, which provides convenience
2934
+"add/delete/replace" semantics for programs.
2936
+<p>Because these tables are transferred into kernel space, alignment
2937
+becomes an issue for machines which have different userspace and
2938
+kernelspace type rules (eg. Sparc64 with 32-bit userland). These
2939
+cases are handled by overriding the definition of IPT_ALIGN for these
2940
+platforms in `libiptc.h'.
2942
+<sect2> ip_tables Use And Traversal
2944
+<p>The kernel starts traversing at the location indicated by the
2945
+particular hook. That rule is examined: if the `struct ipt_ip'
2946
+elements match, each `struct ipt_entry_match' is checked in turn (the
2947
+match function associated with that match is called). If the match
2948
+function returns 0, iteration stops on that rule. If it sets the
2949
+`hotdrop' parameter to 1, the packet will also be immediately dropped
2950
+(this is used for some suspicious packets, such as in the tcp match
2953
+<p>If the iteration continues to the end, the counters are
2954
+incremented, the `struct ipt_entry_target' is examined: if it's a
2955
+standard target, the `verdict' field is read (negative means a packet
2956
+verdict, positive means an offset to jump to). If the answer is
2957
+positive and the offset is not that of the next rule, the `back'
2958
+variable is set, and the previous `back' value is placed in that
2959
+rule's `comefrom' field.
2961
+<p>For non-standard targets, the target function is called: it returns
2962
+a verdict (non-standard targets can't jump, as this would break the
2963
+static loop-detection code). The verdict can be IPT_CONTINUE, to
2964
+continue on to the next rule.
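+<p>To make the sign convention of the `verdict' field concrete, here
+is a small userspace sketch. The names are hypothetical, and the
+exact encoding of a packet verdict as `-verdict - 1' is my
+recollection of the kernel source rather than something stated in
+this text, so treat it as an assumption.

```c
#include <assert.h>

/* Netfilter verdict numbers, as I recall them from <linux/netfilter.h>. */
enum { SK_NF_DROP = 0, SK_NF_ACCEPT = 1 };

enum sketch_kind { SKETCH_VERDICT, SKETCH_JUMP };

struct sketch_result {
    enum sketch_kind kind;
    unsigned int value; /* netfilter verdict, or offset to jump to */
};

/* Decode the `verdict' field of a standard target.
 * Assumption: packet verdicts are stored as -(verdict) - 1, so that
 * they are always negative and cannot be confused with an offset. */
static struct sketch_result sketch_decode(int v)
{
    struct sketch_result r;

    if (v < 0) {
        r.kind = SKETCH_VERDICT;
        /* -(v + 1) equals -v - 1 but cannot overflow for INT_MIN */
        r.value = (unsigned int)(-(v + 1));
    } else {
        r.kind = SKETCH_JUMP;
        r.value = (unsigned int)v;
    }
    assert(r.kind == SKETCH_VERDICT || v >= 0);
    return r;
}
```

+<p>With this encoding a stored value of -1 decodes to SK_NF_DROP and
+-2 to SK_NF_ACCEPT, while any non-negative value is taken as an
+offset into the table.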
2966
+<sect1>Extending iptables
2968
+<p>Because I'm lazy, <tt>iptables</tt> is fairly extensible. This is
2969
+basically a scam to palm off work onto other people, which is what
2970
+Open Source is all about (cf. Free Software, which as RMS would say,
2971
+is about freedom, and I was sitting in one of his talks when I wrote
2974
+<p>Extending <tt>iptables</tt> potentially involves two parts:
2975
+extending the kernel, by writing a new module, and possibly extending
2976
+the userspace program <tt>iptables</tt>, by writing a new shared
2981
+<p>Writing a kernel module itself is fairly simple, as you can see
2982
+from the examples. One thing to be aware of is that your code must be
2983
+re-entrant: there can be one packet coming in from userspace, while
2984
+another arrives on an interrupt. In fact in SMP there can be one
2985
+packet on an interrupt per CPU in 2.3.4 and above.
2988
+The functions you need to know about are:
2991
+<tag>init_module()</tag> This is the entry-point of the module. It
2992
+returns a negative error number, or 0 if it successfully registers
2993
+itself with netfilter.
2995
+<tag>cleanup_module()</tag> This is the exit point of the module; it
2996
+should unregister itself with netfilter.
2998
+<tag>ipt_register_match()</tag> This is used to register a new match
2999
+type. You hand it a `struct ipt_match', which is usually declared as
3000
+a static (file-scope) variable.
3002
+<tag>ipt_register_target()</tag> This is used to register a new target
3003
+type. You hand it a `struct ipt_target', which is usually declared as
3004
+a static (file-scope) variable.
3006
+<tag>ipt_unregister_target()</tag> Used to unregister your target.
3008
+<tag>ipt_unregister_match()</tag> Used to unregister your match.
3011
+<p>One warning about doing tricky things (such as providing counters)
3012
+in the extra space in your new match or target. On SMP machines, the
3013
+entire table is duplicated using memcpy for each CPU: if you really
3014
+want to keep central information, you should see the method used in
3017
+<sect3>New Match Functions
3019
+<p>New match functions are usually written as a standalone module.
3020
+It's possible to have these modules extensible in turn, although it's
3021
+usually not necessary. One way would be to use the netfilter
3022
+framework's `nf_register_sockopt' function to allow users to talk to
3023
+your module directly. Another way would be to export symbols for
3024
+other modules to register themselves, the same way netfilter and
3027
+<p>The core of your new match function is the struct ipt_match which
3028
+it passes to `ipt_register_match()'. This structure has the following
3032
+<tag>list</tag> This field is set to any junk, say `{ NULL, NULL }'.
3034
+<tag>name</tag> This field is the name of the match function, as
3035
+referred to by userspace. The name should match the name of the
3036
+module (i.e., if the name is "mac", the module must be "ipt_mac.o") for
3037
+auto-loading to work.
3039
+<tag>match</tag> This field is a pointer to a match function, which
3040
+takes the skb, the in and out device pointers (one of which may be
3041
+NULL, depending on the hook), a pointer to the match data in the rule
3042
+that is worked on (the structure that was prepared in userspace), the
3043
+IP offset (non-zero means
3044
+a non-head fragment), a pointer to the protocol header (i.e., just
3045
+past the IP header), the length of the data (ie. the packet length
3046
+minus the IP header length) and finally a pointer to a `hotdrop'
3047
+variable. It should return non-zero if the packet matches, and can
3048
+set `hotdrop' to 1 if it returns 0, to indicate that the packet must
3049
+be dropped immediately.
3051
+<tag>checkentry</tag> This field is a pointer to a function which
3052
+checks the specifications for a rule; if this returns 0, then the rule
3053
+will not be accepted from the user. For example, the "tcp" match type
3054
+will only accept tcp packets, and so if the `struct ipt_ip' part of
3055
+the rule does not specify that the protocol must be tcp, a zero is
3056
+returned. The tablename argument allows your match to control what
3057
+tables it can be used in, and the `hook_mask' is a bitmask of hooks
3058
+this rule may be called from: if your match does not make sense from
3059
+some netfilter hooks, you can avoid that here.
3061
+<tag>destroy</tag> This field is a pointer to a function which is
3062
+called when an entry using this match is deleted. This allows you to
3063
+dynamically allocate resources in checkentry and clean them up here.
3065
+<tag>me</tag> This field is set to `THIS_MODULE', which gives a
3066
+pointer to your module. It causes the usage-count to go up and down
3067
+as rules of that type are created and destroyed. This prevents a user
3068
+removing the module (and hence cleanup_module() being called) if a
3074
+<p>If your target alters the packet (ie. the headers or the body), it
3075
+must call skb_unshare() to copy the packet in case it is cloned:
3076
+otherwise any raw sockets which have a clone of the skbuff will see
3077
+the alterations (ie. people will see weird stuff happening in
3080
+<p>New targets are also usually written as a standalone module. The
3081
+discussions under the above section on `New Match Functions' apply
3084
+<p>The core of your new target is the struct ipt_target that it
3085
+passes to ipt_register_target(). This structure has the following
3089
+ <tag>list</tag> This field is set to any junk, say `{ NULL, NULL }'.
3091
+ <tag>name</tag> This field is the name of the target function, as
3092
+ referred to by userspace. The name should match the name of the
3093
+ module (i.e., if the name is "REJECT", the module must be
3094
+ "ipt_REJECT.o") for auto-loading to work.
3096
+ <tag>target</tag> This is a pointer to the target function, which
3097
+ takes the skbuff, the hook number, the input and output device
3098
+ pointers (either of which may be NULL), a pointer to the target data,
3099
+ and the position of the rule in the table. The target function may
3100
+ return either IPT_CONTINUE (-1) if traversing should continue, or a
3101
+ netfilter verdict (NF_DROP, NF_ACCEPT, NF_STOLEN etc.).
3103
+ <tag>checkentry</tag> This field is a pointer to a function which
3104
+ checks the specifications for a rule; if this returns 0, then the
3105
+ rule will not be accepted from the user.
3107
+ <tag>destroy</tag> This field is a pointer to a function which is
3108
+ called when an entry using this target is deleted. This allows you
3109
+ to dynamically allocate resources in checkentry and clean them up
3112
+ <tag>me</tag> This field is set to `THIS_MODULE', which gives a
3113
+ pointer to your module. It causes the usage-count to go up and down
3114
+ as rules with this as a target are created and destroyed. This
3115
+ prevents a user removing the module (and hence cleanup_module() being
3116
+ called) if a rule refers to it.
3121
+<p>You can create a new table for your specific purpose if you wish.
3122
+To do this, you call `ipt_register_table()', with a `struct
3123
+ipt_table', which has the following fields:
3126
+ <tag>list</tag> This field is set to any junk, say `{ NULL, NULL }'.
3128
+ <tag>name</tag> This field is the name of the table function, as
3129
+ referred to by userspace. The name should match the name of the
3130
+ module (i.e., if the name is "nat", the module must be
3131
+ "iptable_nat.o") for auto-loading to work.
3133
+ <tag>table</tag> This is a fully-populated `struct ipt_replace', as
3134
+ used by userspace to replace a table. The `counters' pointer should
3135
+ be set to NULL. This data structure can be declared `__initdata' so
3136
+ it is discarded after boot.
3138
+ <tag>valid_hooks</tag> This is a bitmask of the IPv4 netfilter hooks
3139
+ you will enter the table with: this is used to check that those entry
3140
+ points are valid, and to calculate the possible hooks for ipt_match
3141
+ and ipt_target `checkentry()' functions.
3143
+ <tag>lock</tag> This is the read-write spinlock for the entire table;
3144
+ initialize it to RW_LOCK_UNLOCKED.
3146
+ <tag>private</tag> This is used internally by the ip_tables code.
3149
+<sect2>Userspace Tool
3151
+<p>Now you've written your nice shiny kernel module, you may want to
3152
+control the options on it from userspace. Rather than have a branched
3153
+version of <tt>iptables</tt> for each extension, I use the very latest
3154
+90's technology: furbies. Sorry, I mean shared libraries.
3156
+<p>New tables generally don't require any extension to
3157
+<tt>iptables</tt>: the user just uses the `-t' option to make it use
3160
+<p>The shared library should have an `_init()' function, which will
3161
+automatically be called upon loading: the moral equivalent of the
3162
+kernel module's `init_module()' function. This should call
3163
+`register_match()' or `register_target()', depending on whether your
3164
+shared library provides a new match or a new target.
3166
+<p>You need to provide a shared library: this can be used to
3167
+initialize part of the structure, or provide additional options. I
3168
+now insist on a shared library even if it doesn't do anything, to
3169
+reduce problem reports where the shared libraries are missing.
3171
+<p>There are useful functions described in the `iptables.h' header,
3174
+<tag>check_inverse()</tag> checks if an argument is actually a `!',
3175
+and if so, sets the `invert' flag if not already set. If it returns
3176
+true, you should increment optind, as done in the examples.
3178
+<tag>string_to_number()</tag> converts a string into a number in the
3179
+given range, returning -1 if it is malformed or out of range.
3180
+`string_to_number' relies on `strtol' (see the manpage), meaning
3181
+that a leading "0x" would make the number be in Hexadecimal base, a leading
3182
+"0" would make it be in Octal base.
3184
+<tag>exit_error()</tag> should be called if an error is found.
3185
+Usually the first argument is `PARAMETER_PROBLEM', meaning the user
3186
+didn't use the command line correctly.
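+<p>The base-detection behaviour mentioned for `string_to_number()'
+comes from calling `strtol()' with a base of 0. The following is a
+minimal sketch of a range-checked parser built that way;
+`sketch_parse_number()' is a hypothetical helper written for this
+illustration, not the function from `iptables.h', and its signature
+is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not the real string_to_number(): parse s with
 * strtol() in base 0, so "0x" selects hexadecimal and a leading "0"
 * selects octal. Returns the value if it lies in [min, max], or -1 if
 * the string is malformed or out of range. */
static long sketch_parse_number(const char *s, long min, long max)
{
    char *end;
    long value;

    /* min must be non-negative so that -1 is unambiguous as an error */
    assert(min >= 0 && min <= max);

    errno = 0;
    value = strtol(s, &end, 0);
    if (end == s || *end != '\0' || errno == ERANGE)
        return -1; /* empty, trailing junk, or overflow */
    if (value < min || value > max)
        return -1;
    return value;
}
```

+<p>A port parser, for instance, might call
+`sketch_parse_number(optarg, 0, 65535)' and hand a negative result to
+`exit_error()'.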
3189
+<sect3>New Match Functions
3191
+<p>Your shared library's _init() function hands `register_match()' a
3192
+pointer to a static `struct iptables_match', which has the following
3196
+<tag>next</tag> This pointer is used to make a linked list of matches
3197
+(such as used for listing rules). It should be set to NULL initially.
3199
+<tag>name</tag> The name of the match function. This should match the
3200
+library name (eg "tcp" for `libipt_tcp.so').
3202
+<tag>version</tag> Usually set to the IPTABLES_VERSION macro: this is
3203
+used to ensure that the <tt>iptables</tt> binary doesn't pick up the
3204
+wrong shared libraries by mistake.
3206
+<tag>size</tag> The size of the match data for this match; you should
3207
+use the IPT_ALIGN() macro to ensure it is correctly aligned.
3209
+<tag>userspacesize</tag> For some matches, the kernel changes some
3210
+fields internally (the `limit' match is a case of this). This means
3211
+that a simple `memcmp()' is insufficient to compare two rules
3212
+(required for delete-matching-rule functionality). If this is the
3213
+case, place all the fields which do not change at the start of the
3214
+structure, and put the size of the unchanging fields here. Usually,
3215
+however, this will be identical to the `size' field.
3217
+<tag>help</tag> A function which prints out the option synopsis.
3219
+<tag>init</tag> This can be used to initialize the extra space (if
3220
+any) in the ipt_entry_match structure, and set any nfcache bits; if
3221
+you are examining something not expressible using the contents of
3222
+`linux/include/netfilter_ipv4.h', then simply OR in the NFC_UNKNOWN
3223
+bit. It will be called before `parse()'.
3225
+<tag>parse</tag> This is called when an unrecognized option is seen on
3226
+the command line: it should return non-zero if the option was indeed
3227
+for your library. `invert' is true if a `!' has already been seen.
3228
+The `flags' pointer is for the exclusive use of your match library,
3229
+and is usually used to store a bitmask of options which have been
3230
+specified. Make sure you adjust the nfcache field. You may extend
3231
+the size of the `ipt_entry_match' structure by reallocating if
3232
+necessary, but then you must ensure that the size is passed through
3233
+the IPT_ALIGN macro.
3235
+<tag>final_check</tag> This is called after the command line has been
3236
+parsed, and is handed the `flags' integer reserved for your library.
3237
+This gives you a chance to check that any compulsory options have been
3238
+specified, for example: call `exit_error()' if this is not the case.
3240
+<tag>print</tag> This is used by the chain listing code to print (to
3241
+standard output) the extra match information (if any) for a rule. The
3242
+numeric flag is set if the user specified the `-n' flag.
3244
+<tag>save</tag> This is the reverse of parse: it is used by
3245
+`iptables-save' to reproduce the options which created the rule.
3247
+<tag>extra_opts</tag> This is a NULL-terminated list of extra options
3248
+which your library offers. This is merged with the current options
3249
+and handed to getopt_long; see the man page for details. The return
3250
+code for getopt_long becomes the first argument (`c') to your
3251
+`parse()' function.
3254
+There are extra elements at the end of this structure for use
3255
+internally by <tt>iptables</tt>: you don't need to set them.
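+<p>The `userspacesize' idea is easiest to see with a concrete layout.
+The sketch below is hypothetical throughout: `struct
+sketch_limit_info' is an invented structure, not the real data of any
+match. It places the user-supplied fields first and the
+kernel-updated state last, then compares only the leading part.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical match data: user-settable fields first, kernel-updated
 * state last, as the `userspacesize' description recommends. */
struct sketch_limit_info {
    unsigned int avg;     /* set by the user */
    unsigned int burst;   /* set by the user */
    unsigned long credit; /* changed by the kernel while running */
};

/* Size of the part that never changes after the rule is created. */
#define SKETCH_USERSPACESIZE offsetof(struct sketch_limit_info, credit)

/* Returns 1 if both rules were created with the same options,
 * ignoring whatever the kernel has since written into `credit'. */
static int sketch_same_rule(const struct sketch_limit_info *a,
                            const struct sketch_limit_info *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, SKETCH_USERSPACESIZE) == 0;
}
```

+<p>Comparing the full `sizeof(struct sketch_limit_info)' would
+wrongly treat two identical rules as different once the kernel had
+touched `credit' in one of them.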
3259
+<p>Your shared library's _init() function hands `register_target()'
3260
+a pointer to a static `struct iptables_target', which has similar
3261
+fields to the iptables_match structure detailed above.
3263
+<sect2>Using `libiptc'
3265
+<p><tt>libiptc</tt> is the iptables control library, designed for
3266
+listing and manipulating rules in the iptables kernel module. While
3267
+its current use is for the iptables program, it makes writing other
3268
+tools fairly easy. You need to be root to use these functions.
3270
+<p>The kernel tables themselves are simply a table of rules, and a set
3271
+of numbers representing entry points. Chain names ("INPUT", etc) are
3272
+provided as an abstraction by the library. User defined chains are
3273
+labelled by inserting an error node before the head of the
3274
+user-defined chain, which contains the chain name in the extra data
3275
+section of the target (the builtin chain positions are defined by the
3276
+three table entry points).
3278
+<p>The following standard targets are supported: ACCEPT, DROP, QUEUE
3279
+(which are translated to NF_ACCEPT, NF_DROP, and NF_QUEUE,
3280
+respectively), RETURN (which is translated to a special IPT_RETURN
3281
+value handled by ip_tables), and JUMP (which is translated from the
3282
+chain name to an actual offset within the table).
3284
+<p>When `iptc_init()' is called, the table, including the counters, is
3285
+read. This table is manipulated by the `iptc_insert_entry()',
3286
+`iptc_replace_entry()', `iptc_append_entry()', `iptc_delete_entry()',
3287
+`iptc_delete_num_entry()', `iptc_flush_entries()',
3288
+`iptc_zero_entries()', `iptc_create_chain()', `iptc_delete_chain()',
3289
+and `iptc_set_policy()' functions.
3291
+<p>The table changes are not written back until the `iptc_commit()'
3292
+function is called. This means it is possible for two library users
3293
+operating on the same chain to race each other; locking would be
3294
+required to prevent this, and it is not currently done.
3296
+<p>There is no race with counters, however; counters are added back in
3297
+to the kernel in such a way that counter increments between the
3298
+reading and writing of the table still show up in the new table.
3300
+<p>There are various helper functions:
3303
+<tag>iptc_first_chain()</tag> This function returns the first chain
3306
+<tag>iptc_next_chain()</tag> This function returns the next chain name
3307
+in the table: NULL means no more chains.
3309
+<tag>iptc_builtin()</tag> Returns true if the given chain name is the
3310
+name of a builtin chain.
3312
+<tag>iptc_first_rule()</tag> This returns a pointer to the first rule
3313
+in the given chain name: NULL for an empty chain.
3315
+<tag>iptc_next_rule()</tag> This returns a pointer to the next rule in
3316
+the chain: NULL means the end of the chain.
3318
+<tag>iptc_get_target()</tag> This gets the target of the given rule. If
3319
+it's an extended target, the name of that target is returned. If it's
3320
+a jump to another chain, the name of that chain is returned. If it's
3321
+a verdict (eg. DROP), that name is returned. If it has no target (an
3322
+accounting-style rule), then the empty string is returned.
3324
+<p>Note that this function should be used instead of using the value
3325
+of the `verdict' field of the ipt_entry structure directly, as it
3326
+offers the above further interpretations of the standard verdict.
3328
+<tag>iptc_get_policy()</tag> This gets the policy of a builtin chain,
3329
+and fills in the `counters' argument with the hit statistics on that
3332
+<tag>iptc_strerror()</tag> This function returns a more meaningful
3333
+explanation of a failure code in the iptc library. If a function
3334
+fails, it will always set errno: this value can be passed to
3335
+iptc_strerror() to yield an error message.
3338
+<sect1>Understanding NAT
3340
+<p>Welcome to Network Address Translation in the kernel. Note that
3341
+the infrastructure offered is designed more for completeness than raw
3342
+efficiency, and that future tweaks may increase the efficiency
3343
+markedly. For the moment I'm happy that it works at all.
3345
+<p>NAT is separated into connection tracking (which doesn't manipulate
3346
+packets at all), and the NAT code itself. Connection tracking is also
3347
+designed to be used by an iptables module, so it makes subtle
3348
+distinctions in states which NAT doesn't care about.
3350
+<sect2>Connection Tracking
3352
+<p>Connection tracking hooks into high-priority NF_IP_LOCAL_OUT and
3353
+NF_IP_PRE_ROUTING hooks, in order to see packets before they enter the
3356
+<p>The nfct field in the skb is a pointer to inside the struct
3357
+ip_conntrack, at one element of the infos[] array. Hence we can tell the
3358
+state of the skb by which element in this array it is pointing to:
3359
+this pointer encodes both the state structure and the relationship of
3360
+this skb to that state.
3362
+<p>The best way to extract the `nfct' field is to call
3363
+`ip_conntrack_get()', which returns NULL if it's not set, or the
3364
+connection pointer, and fills in ctinfo which describes the
3365
+relationship of the packet to that connection. This enumerated type
3366
+has several values:
3370
+<tag>IP_CT_ESTABLISHED</tag> The packet is part of an established
3371
+connection, in the original direction.
3373
+<tag>IP_CT_RELATED</tag> The packet is related to the connection, and
3374
+is passing in the original direction.
3376
+<tag>IP_CT_NEW</tag> The packet is trying to create a new connection
3377
+(obviously, it is in the original direction).
3379
+<tag>IP_CT_ESTABLISHED + IP_CT_IS_REPLY</tag> The packet is part of an
3380
+established connection, in the reply direction.
3382
+<tag>IP_CT_RELATED + IP_CT_IS_REPLY</tag> The packet is related to the
3383
+connection, and is passing in the reply direction.
3386
+Hence a reply packet can be identified by testing for >=
3389
+<sect1>Extending Connection Tracking/NAT
3391
+<p>These frameworks are designed to accommodate any number of protocols
3392
+and different mapping types. Some of these mapping types might be
3393
+quite specific, such as a load-balancing/fail-over mapping type.
3395
+<p>Internally, connection tracking converts a packet to a "tuple",
3396
+representing the interesting parts of the packet, before searching for
3397
+bindings or rules which match it. This tuple has a manipulatable
3398
+part, and a non-manipulatable part; called "src" and "dst", as this is
3399
+the view for the first packet in the Source NAT world (it'd be a reply
3400
+packet in the Destination NAT world). The tuple for every packet in
3401
+the same packet stream in that direction is the same.
3403
+<p>For example, a TCP packet's tuple contains the manipulatable part:
3404
+source IP and source port, the non-manipulatable part: destination IP
3405
+and the destination port. The manipulatable and non-manipulatable
3406
+parts do not need to be the same type though; for example, an ICMP
3407
+packet's tuple contains the manipulatable part: source IP and the ICMP
3408
+id, and the non-manipulatable part: the destination IP and the ICMP
3411
+<p>Every tuple has an inverse, which is the tuple of the reply packets
3412
+in the stream. For example, the inverse of an ICMP ping packet, icmp
3413
+id 12345, from 192.168.1.1 to 1.2.3.4, is a ping-reply packet, icmp id
3414
+12345, from 1.2.3.4 to 192.168.1.1.
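+<p>The ping example above can be written out as a small userspace
+sketch. The structure and function names are hypothetical and much
+simpler than `struct ip_conntrack_tuple'; in particular only the ICMP
+type is kept in the non-manipulatable part, which is a simplification
+made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* ICMP type numbers for echo request and echo reply. */
enum { SK_ICMP_ECHOREPLY = 0, SK_ICMP_ECHO = 8 };

/* Hypothetical, simplified ICMP tuple. */
struct sketch_tuple {
    struct { uint32_t ip; uint16_t icmp_id; } src; /* manipulatable */
    struct { uint32_t ip; uint8_t icmp_type; } dst; /* non-manipulatable */
};

/* Build a host-order IPv4 address from its four octets. */
static uint32_t sketch_ip(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Fill *inv with the tuple of the reply to *orig.
 * Returns 1 on success, 0 if this sketch knows no inverse for the type. */
static int sketch_invert(struct sketch_tuple *inv,
                         const struct sketch_tuple *orig)
{
    assert(inv != orig);

    if (orig->dst.icmp_type == SK_ICMP_ECHO)
        inv->dst.icmp_type = SK_ICMP_ECHOREPLY;
    else if (orig->dst.icmp_type == SK_ICMP_ECHOREPLY)
        inv->dst.icmp_type = SK_ICMP_ECHO;
    else
        return 0;

    /* addresses swap ends; the ICMP id is the same in both directions */
    inv->src.ip = orig->dst.ip;
    inv->dst.ip = orig->src.ip;
    inv->src.icmp_id = orig->src.icmp_id;
    return 1;
}
```

+<p>Inverting the result a second time gives back the original tuple,
+which is the property that lets both directions of a stream be found
+from either packet.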
3416
+<p>These tuples, represented by the `struct ip_conntrack_tuple', are used
3417
+widely. In fact, together with the hook the packet came in on (which
3418
+has an effect on the type of manipulation expected), and the device
3419
+involved, this is the complete information on the packet.
3421
+<p>Most tuples are contained within a `struct
3422
+ip_conntrack_tuple_hash', which adds a doubly linked list entry, and a
3423
+pointer to the connection that the tuple belongs to.
3425
+<p>A connection is represented by the `struct ip_conntrack': it has
3426
+two `struct ip_conntrack_tuple_hash' fields: one referring to the
3427
+direction of the original packet (tuplehash[IP_CT_DIR_ORIGINAL]), and
3428
+one referring to packets in the reply direction
3429
+(tuplehash[IP_CT_DIR_REPLY]).
3431
+<p>Anyway, the first thing the NAT code does is to see if the
3432
+connection tracking code managed to extract a tuple and find an
3433
+existing connection, by looking at the skbuff's nfct field; this tells
3434
+us if it's an attempt on a new connection, or if not, which direction
3435
+it is in; in the latter case, then the manipulations determined
3436
+previously for that connection are done.
3438
+<p>If it was the start of a new connection, we look for a rule for that
3439
+tuple, using the standard iptables traversal mechanism, on the `nat'
3440
+table. If a rule matches, it is used to initialize the manipulations
3441
+for both that direction and the reply; the connection-tracking code is
3442
+told that the reply it should expect has changed. Then, it's
3443
+manipulated as above.
3445
+<p>If there is no rule, a `null' binding is created: this usually does
3446
+not map the packet, but exists to ensure we don't map another stream
3447
+over an existing one. Sometimes, the null binding cannot be created,
3448
+because we have already mapped an existing stream over it, in which
3449
+case the per-protocol manipulation may try to remap it, even though
3450
+it's nominally a `null' binding.
3452
+<sect2>Standard NAT Targets
3454
+<p>NAT targets are like any other iptables target extensions, except
3455
+they insist on being used only in the `nat' table. Both the SNAT and
3456
+DNAT targets take a `struct ip_nat_multi_range' as their extra data;
3457
+this is used to specify the range of addresses a mapping is allowed to
3458
+bind into. A range element, `struct ip_nat_range' consists of an
3459
+inclusive minimum and maximum IP address, and an inclusive maximum and
3460
+minimum protocol-specific value (eg. TCP ports). There is also room
3461
+for flags, which say whether the IP address can be mapped (sometimes
3462
+we only want to map the protocol-specific part of a tuple, not the
3463
+IP), and another to say that the protocol-specific part of the range
3466
+<p>A multi-range is an array of these `struct ip_nat_range' elements;
3467
+this means that a range could be "1.1.1.1-1.1.1.2 ports 50-55 AND
3468
+1.1.1.3 port 80". Each range element adds to the range (a union, for
3469
+those who like set theory).
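+<p>The union semantics can be illustrated with a membership test over
+an array of ranges. This is a userspace sketch with hypothetical
+names; `struct sketch_range' keeps only the inclusive bounds and
+leaves out the flags that the real `struct ip_nat_range' carries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range element: inclusive IP and port bounds only. */
struct sketch_range {
    uint32_t min_ip, max_ip;     /* host byte order */
    uint16_t min_port, max_port;
};

/* Build a host-order IPv4 address from its four octets. */
static uint32_t sketch_ip(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Returns 1 if (ip, port) falls inside any element of the multi-range:
 * each element adds to the allowed set, so one hit is enough. */
static int sketch_in_multi_range(const struct sketch_range *r, size_t n,
                                 uint32_t ip, uint16_t port)
{
    size_t i;

    assert(r != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (ip >= r[i].min_ip && ip <= r[i].max_ip &&
            port >= r[i].min_port && port <= r[i].max_port)
            return 1;
    }
    return 0;
}
```

+<p>For the example range in the text, 1.1.1.2 port 53 and 1.1.1.3
+port 80 are both inside the union, whereas 1.1.1.3 port 53 is not,
+because no single element covers that combination.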
3471
+<sect2>New Protocols
3473
+<sect3> Inside The Kernel
3475
+<p>Implementing a new protocol first means deciding what the
3476
+manipulatable and non-manipulatable parts of the tuple should be.
3477
+Everything in the tuple has the property that it identifies the stream
3478
+uniquely. The manipulatable part of the tuple is the part you can do
3479
+NAT with: for TCP this is the source port, for ICMP it's the icmp ID;
3480
+something to use as a "stream identifier". The non-manipulatable part
3481
+is the rest of the packet that uniquely identifies the stream, but we
3482
+can't play with (eg. TCP destination port, ICMP type).
3484
+<p>Once you've decided this, you can write an extension to the
3485
+connection-tracking code in the directory, and go about populating the
3486
+`ip_conntrack_protocol' structure which you need to pass to
3487
+`ip_conntrack_register_protocol()'.
3489
+<p>The fields of `struct ip_conntrack_protocol' are:
3492
+<tag>list</tag> Set it to `{ NULL, NULL }'; used to sew you into the list.
3494
+<tag>proto</tag> Your protocol number; see `/etc/protocols'.
3496
+<tag>name</tag> The name of your protocol. This is the name the user
3497
+will see; it's usually best if it's the canonical name in
3500
+<tag>pkt_to_tuple</tag> The function which fills out the protocol
3501
+specific parts of the tuple, given the packet. The `datah' pointer
3502
+points to the start of your header (just past the IP header), and the
3503
+datalen is the length of the packet. If the packet isn't long enough
3504
+to contain the header information, return 0; datalen will always be
3505
+at least 8 bytes though (enforced by framework).
3507
+<tag>invert_tuple</tag> This function is simply used to change the
3508
+protocol-specific part of the tuple into the way a reply to that
3511
+<tag>print_tuple</tag> This function is used to print out the
3512
+protocol-specific part of a tuple; usually it's sprintf()'d into the
3513
+buffer provided. The number of buffer characters used is returned.
3514
+This is used to print the states for the /proc entry.
3516
+<tag>print_conntrack</tag> This function is used to print the private
3517
+part of the conntrack structure, if any, also used for printing the
3520
+<tag>packet</tag> This function is called when a packet is seen which
3521
+is part of an established connection. You get a pointer to the
3522
+conntrack structure, the IP header, the length, and the ctinfo. You
3523
+return a verdict for the packet (usually NF_ACCEPT), or -1 if the
3524
+packet is not a valid part of the connection. You can delete the
3525
+connection inside this function if you wish, but you must use the
3526
+following idiom to avoid races (see ip_conntrack_proto_icmp.c):
3529
+if (del_timer(&ct->timeout))
3530
+ ct->timeout.function((unsigned long)ct);
3533
+<tag>new</tag> This function is called when a packet creates a
3534
+connection for the first time; there is no ctinfo arg, since the first
3535
+packet is of ctinfo IP_CT_NEW by definition. It returns 0 to fail to
3536
+create the connection, or a connection timeout in jiffies.
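+<p>The two tuple callbacks are easier to picture with a small model.
+The following is a userspace sketch under stated assumptions, not the
+kernel interface: `demo_proto_tuple', `demo_pkt_to_tuple()' and
+`demo_invert_tuple()' are hypothetical names, and the protocol is
+assumed to carry two 16-bit big-endian ports at the start of its
+header, as UDP and TCP do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-protocol tuple part for a port-based
   protocol: one manipulatable field (src) and one fixed field (dst). */
struct demo_proto_tuple {
    uint16_t src_port;  /* manipulatable part */
    uint16_t dst_port;  /* non-manipulatable part */
};

/* Model of pkt_to_tuple: `datah' points just past the IP header.
   Returns 1 on success, 0 if the data is too short for our header. */
static int demo_pkt_to_tuple(const unsigned char *datah, size_t datalen,
                             struct demo_proto_tuple *t)
{
    if (datalen < 4)
        return 0;
    /* Ports are the first two 16-bit big-endian fields. */
    t->src_port = (uint16_t)((datah[0] << 8) | datah[1]);
    t->dst_port = (uint16_t)((datah[2] << 8) | datah[3]);
    return 1;
}

/* Model of invert_tuple: a reply travels the other way, so the
   protocol-specific parts swap places. */
static void demo_invert_tuple(struct demo_proto_tuple *inv,
                              const struct demo_proto_tuple *orig)
{
    inv->src_port = orig->dst_port;
    inv->dst_port = orig->src_port;
}
```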
3539
+Once you've written and tested that you can track your new protocol,
3540
+it's time to teach NAT how to translate it. This means writing a new
3541
+module (an extension to the NAT code) and populating the
3542
+`ip_nat_protocol' structure which you need to pass to
3543
+`ip_nat_protocol_register()'.
3546
+<tag>list</tag> Set it to '{ NULL, NULL }'; used to sew you into the list.
3548
+<tag>name</tag> The name of your protocol. This is the name the user
3549
+will see; it's best if it's the canonical name in `/etc/protocols' for
3550
+userspace auto-loading, as we'll see later.
3552
+<tag>protonum</tag> Your protocol number; see `/etc/protocols'.
3554
+<tag>manip_pkt</tag> This is the other half of connection tracking's
3555
+pkt_to_tuple function: you can think of it as "tuple_to_pkt". There
3556
+are some differences though: you get a pointer to the start of the IP
3557
+header, and the total packet length. This is because some protocols
3558
+(UDP, TCP) need to know the IP header. You're given the
3559
+ip_nat_tuple_manip field from the tuple (i.e., the "src" field), rather
3560
+than the entire tuple, and the type of manipulation you are to
3563
+<tag>in_range</tag> This function is used to tell if the manipulatable
3564
+part of the given tuple is in the given range. This function is a bit
3565
+tricky: we're given the manipulation type which has been applied to
3566
+the tuple, which tells us how to interpret the range (is it a source
3567
+range or a destination range we're aiming for?).
3569
+<p>This function is used to check if an existing mapping puts us in
3570
+the right range, and also to check if no manipulation is necessary at
3573
+<tag>unique_tuple</tag> This function is the core of NAT: given a
3574
+tuple and a range, we're to alter the per-protocol part of the tuple
3575
+to place it within the range, and make it unique. If we can't find an
3576
+unused tuple in the range, return 0. We also get a pointer to the
3577
+conntrack structure, which is required for ip_nat_used_tuple().
3579
+<p>The usual approach is to simply iterate the per-protocol part of
3580
+the tuple through the range, checking `ip_nat_used_tuple()' on it,
3581
+until one returns false.
3583
+<p>Note that the null-mapping case has already been checked: it's
3584
+either outside the range given, or already taken.
3586
+<p>If IP_NAT_RANGE_PROTO_SPECIFIED isn't set, it means that the user
3587
+is doing NAT, not NAPT: do something sensible with the range. If no
3588
+mapping is desirable (for example, within TCP, a destination mapping
3589
+should not change the TCP port unless ordered to), return 0.
3591
+<tag>print</tag> Given a character buffer, a match tuple and a mask,
3592
+write out the per-protocol parts and return the length of the buffer
3595
+<tag>print_range</tag> Given a character buffer and a range, write out
3596
+the per-protocol part of the range, and return the length of the
3597
+buffer used. This won't be called if the IP_NAT_RANGE_PROTO_SPECIFIED
3598
+flag wasn't set for the range.
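+<p>The "iterate through the range until one is unused" approach
+described for unique_tuple can be sketched in userspace as follows.
+This is only a model: `demo_used_port()' stands in for
+`ip_nat_used_tuple()' and simply consults a table supplied by the
+caller, and all `demo_' names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the set of tuples already in use: here a
   port counts as "used" if it appears in a caller-supplied table. */
struct demo_used {
    const uint16_t *ports;
    int count;
};

static bool demo_used_port(const struct demo_used *u, uint16_t port)
{
    int i;

    for (i = 0; i < u->count; i++)
        if (u->ports[i] == port)
            return true;
    return false;
}

/* Model of unique_tuple for a port-based protocol: walk the range
   [min, max] until an unused port turns up.  Returns 1 and stores the
   port on success, 0 if every port in the range is taken. */
static int demo_unique_port(const struct demo_used *u,
                            uint16_t min, uint16_t max, uint16_t *port)
{
    uint32_t p;  /* wider than uint16_t so that max == 65535 terminates */

    for (p = min; p <= max; p++) {
        if (!demo_used_port(u, (uint16_t)p)) {
            *port = (uint16_t)p;
            return 1;
        }
    }
    return 0;
}
```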
3601
+<sect2>New NAT Targets
3603
+<p>This is the really interesting part. You can write new NAT targets
3604
+which provide a new mapping type: two extra targets are provided in
3605
+the default package: MASQUERADE and REDIRECT. These are fairly simple
3606
+to illustrate the potential and power of writing a new NAT target.
3608
+<p>These are written just like any other iptables targets, but
3609
+internally they will extract the connection and call
3610
+`ip_nat_setup_info()'.
3612
+<sect2>Protocol Helpers
3614
+<p>Protocol helpers for connection tracking allow the connection
3615
+tracking code to understand protocols which use multiple network
3616
+connections (eg. FTP) and mark the `child' connections as being
3617
+related to the initial connection, usually by reading the related
3618
+address out of the data stream.
3620
+<p>Protocol helpers for NAT do two things: firstly, they allow the NAT
+code to manipulate the data stream to change the address contained
+within it, and secondly, they perform NAT on the related connection when it
3623
+comes in, based on the original connection.
3625
+<sect2>Connection Tracking Helper Modules
3629
+The duty of a connection tracking module is to specify which packets
3630
+belong to an already established connection. The module has the
3631
+following means to do that:
3634
+<item>Tell netfilter which packets our module is interested in (most
3635
+helpers operate on a particular port).
3637
+<item>Register a function with netfilter. This function is called for
3638
+every packet which matches the criteria above.
3640
+<item>An `ip_conntrack_expect_related()' function which can be called
3641
+from there to tell netfilter to expect related connections.</item>
3645
+If there is some additional work to be done at the time the first packet
3646
+of the expected connection arrives, the module can register a callback
3647
+function which is called at that time.
3649
+<sect3>Structures and Functions Available
3651
+<p>Your kernel module's init function has to call
3652
+`ip_conntrack_helper_register()' with a pointer to a
3653
+`struct ip_conntrack_helper'. This struct has the following fields:
3656
+<tag>list</tag>This is the header for the linked list. Netfilter
3657
+handles this list internally. Just initialize it with `{ NULL, NULL }'.
3659
+<tag>name</tag>This is a pointer to a string constant specifying the
3660
+name of the protocol. ("ftp", "irc", ...)
3662
+<tag>flags</tag>A set of one or more of the following flags:
3664
+<item>IP_CT_HELPER_F_REUSE_EXPECT : Reuse expectations if the limit (see
3665
+`max_expected' below) is reached.</item>
3668
+<tag>me</tag>A pointer to the module structure of the helper. Initialize this with the `THIS_MODULE' macro.
3670
+<tag>max_expected</tag>Maximum number of unconfirmed (outstanding) expectations.
3672
+<tag>timeout</tag>Timeout (in seconds) for each unconfirmed expectation. An expectation is deleted `timeout' seconds after the expectation was issued with the `ip_conntrack_expect_related()' function.
3674
+<tag>tuple</tag>This is a `struct ip_conntrack_tuple' which specifies
3675
+the packets our conntrack helper module is interested in.
3677
+<tag>mask</tag>Again a `struct ip_conntrack_tuple'. This mask
3678
+specifies which bits of <tt>tuple</tt> are valid.
3680
+<tag>help</tag>The function which netfilter should call for each
3681
+packet matching tuple+mask.
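+<p>What "matching tuple+mask" means can be shown with a short
+userspace sketch. The `demo_tuple' structure is a hypothetical,
+flattened stand-in for `struct ip_conntrack_tuple' (which has more
+members), and `demo_tuple_mask_match()' is an invented name: a packet
+matches when it agrees with the tuple on every bit that is set in the
+mask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of the fields a helper matches on; the
   real `struct ip_conntrack_tuple' has more members and nesting. */
struct demo_tuple {
    uint16_t protonum;
    uint16_t dst_port;  /* host byte order in this sketch */
};

/* A packet matches when it agrees with `tuple' on every bit set in
   `mask'; bits cleared in the mask are "don't care". */
static bool demo_tuple_mask_match(const struct demo_tuple *pkt,
                                  const struct demo_tuple *tuple,
                                  const struct demo_tuple *mask)
{
    return ((pkt->protonum ^ tuple->protonum) & mask->protonum) == 0 &&
           ((pkt->dst_port ^ tuple->dst_port) & mask->dst_port) == 0;
}
```

+<p>A mask of all ones on both fields selects exactly one protocol and
+port; clearing the port mask makes the helper see every port of that
+protocol.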
3684
+<sect3>Example skeleton of a conntrack helper module
3687
+#define FOO_PORT 111
3689
+static int foo_expectfn(struct ip_conntrack *new)
3691
+ /* called when the first packet of an expected
3692
+ connection arrives */
3697
+static int foo_help(const struct iphdr *iph, size_t len,
3698
+ struct ip_conntrack *ct,
3699
+ enum ip_conntrack_info ctinfo)
3701
+ /* analyze the data passed on this connection and
3702
+ decide what related packets will look like */
3704
+ /* update per master-connection private data
3705
+ (session state, ...) */
3706
+ ct->help.ct_foo_info = ...
3708
+ if (there_will_be_new_packets_related_to_this_connection)
3710
+ struct ip_conntrack_expect exp;
3712
+ memset(&exp, 0, sizeof(exp));
3713
+ exp.t = tuple_specifying_related_packets;
3714
+ exp.mask = mask_for_above_tuple;
3715
+ exp.expectfn = foo_expectfn;
3716
+ exp.seq = tcp_sequence_number_of_expectation_cause;
3718
+ /* per slave-connection private data */
3719
+ exp.help.exp_foo_info = ...
3721
+ ip_conntrack_expect_related(ct, &exp);
3726
+static struct ip_conntrack_helper foo;
3728
+static int __init init(void)
3730
+ memset(&foo, 0, sizeof(struct ip_conntrack_helper));
3733
+ foo.flags = IP_CT_HELPER_F_REUSE_EXPECT;
3734
+ foo.me = THIS_MODULE;
3735
+ foo.max_expected = 1; /* one expectation at a time */
3736
+ foo.timeout = 0; /* expectation never expires */
3738
+ /* we are interested in all TCP packets with destport 111 */
3739
+ foo.tuple.dst.protonum = IPPROTO_TCP;
3740
+ foo.tuple.dst.u.tcp.port = htons(FOO_PORT);
3741
+ foo.mask.dst.protonum = 0xFFFF;
3742
+ foo.mask.dst.u.tcp.port = 0xFFFF;
3743
+ foo.help = foo_help;
3745
+ return ip_conntrack_helper_register(&foo);
3748
+static void __exit fini(void)
3750
+ ip_conntrack_helper_unregister(&foo);
3755
+<sect2>NAT helper modules
3759
+NAT helper modules do some application specific NAT handling. Usually
3760
+this includes on-the-fly manipulation of data: think about the PORT
3761
+command in FTP, where the client tells the server which IP/port to
3762
+connect to. Therefore an FTP helper module must replace the IP/port
3763
+after the PORT command in the FTP control connection.
3766
+If we are dealing with TCP, things get slightly more complicated. The
3767
+reason is a possible change of the packet size (FTP example: the
3768
+length of the string representing an IP/port tuple after the PORT
3769
+command has changed). If we change the packet size, the sequence and
+acknowledgement numbers differ between the left and right side of the
+NAT box (i.e. if we had extended one packet by 4 octets, we have to add
+this offset to the TCP sequence number of each following packet).
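+<p>The sequence number bookkeeping can be sketched in userspace like
+this. It is a simplified model under stated assumptions, not the
+kernel's implementation: `demo_seq_adjust', `demo_seq_after()' and
+`demo_adjust_seq()' are hypothetical names, only one resize per
+direction is tracked, and the matching correction of acknowledgement
+numbers in the reverse direction is left out.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one direction of a NATed TCP stream. */
struct demo_seq_adjust {
    uint32_t correction_pos;  /* sequence number where the payload was resized */
    int32_t  offset;          /* bytes added (positive) or removed (negative) */
};

/* True if sequence number a lies after b, allowing for 32-bit wraparound. */
static int demo_seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) > 0;
}

/* Rewrite a sequence number for a packet following the resized one;
   data sent up to the resize point is left untouched. */
static uint32_t demo_adjust_seq(const struct demo_seq_adjust *adj, uint32_t seq)
{
    if (demo_seq_after(seq, adj->correction_pos))
        return seq + (uint32_t)adj->offset;
    return seq;
}
```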
3775
+Special NAT handling of all related packets is required, too. Take
+FTP as an example again: all incoming packets of the DATA connection
3777
+have to be NATed to the IP/port given by the client with the PORT
3778
+command on the control connection, rather than going through the
3779
+normal table lookup.
3782
+<item>callback for the packet causing the related connection (foo_help)
3783
+<item>callback for all related packets (foo_nat_expected)
3786
+<sect3>Structures and Functions Available
3788
+<p>Your NAT helper module's `init()' function calls
3789
+`ip_nat_helper_register()' with a pointer to a `struct
3790
+ip_nat_helper'. This struct has the following members:
3793
+<tag>list</tag>Again, just the list header for netfilter's internal use.
3794
+Initialize this with { NULL, NULL }.
3796
+<tag>name</tag>A pointer to a string constant with the protocol's name.
3798
+<tag>flags</tag>A set of zero or more of the following flags:
3800
+<item>IP_NAT_HELPER_F_ALWAYS : Call the NAT helper for every packet,
3801
+not only for packets where conntrack has detected an expectation-cause.</item>
3802
+<item>IP_NAT_HELPER_F_STANDALONE : Tell the NAT core that this protocol
3803
+doesn't have a conntrack helper, only a NAT helper.</item>
3806
+<tag>me</tag>A pointer to the module structure of the helper. Initialize
3807
+this using the `THIS_MODULE' macro.
3809
+<tag>tuple</tag>A `struct ip_conntrack_tuple' describing which packets
3810
+our NAT helper is interested in.
3812
+<tag>mask</tag>A `struct ip_conntrack_tuple', telling netfilter which
3813
+bits of <tt>tuple</tt> are valid.
3815
+<tag>help</tag>The help function which is called for each packet
3816
+matching tuple+mask.
3818
+<tag>expect</tag>The expect function which is called for every first
3819
+packet of an expected connection.
3823
+This is very similar to writing a connection tracking helper.
3825
+<sect3>Example NAT helper module
3828
+#define FOO_PORT 111
3830
+static int foo_nat_expected(struct sk_buff **pksb,
3831
+ unsigned int hooknum,
3832
+ struct ip_conntrack *ct,
3833
+ struct ip_nat_info *info)
3834
+/* called whenever the first packet of a related connection arrives.
3835
+ params: pksb packet buffer
3836
+ hooknum HOOK the call comes from (POST_ROUTING, PRE_ROUTING)
3837
+ ct information about this (the related) connection
3838
+ info &ct->nat.info
3839
+ return value: Verdict (NF_ACCEPT, ...)
3841
+ /* Change ip/port of the packet to the masqueraded
3842
+ values (read from master->tuplehash), to map it the same way,
3843
+ call ip_nat_setup_info, return NF_ACCEPT. */
3847
+static int foo_help(struct ip_conntrack *ct,
3848
+ struct ip_conntrack_expect *exp,
3849
+ struct ip_nat_info *info,
3850
+ enum ip_conntrack_info ctinfo,
3851
+ unsigned int hooknum,
3852
+ struct sk_buff **pksb)
3853
+/* called for every packet where conntrack detected an expectation-cause
3854
+ params: ct struct ip_conntrack of the master connection
3855
+ exp struct ip_conntrack_expect of the expectation
3856
+ caused by the conntrack helper for this protocol
3857
+ info (STATE: related, new, established, ... )
3858
+ hooknum HOOK the call comes from (POST_ROUTING, PRE_ROUTING)
3859
+ pksb packet buffer
3863
+ /* extract information about future related packets (you can
3864
+ share information with the connection tracking's foo_help).
3865
+ Exchange address/port with masqueraded values, insert tuple
3866
+ about related packets */
3869
+static struct ip_nat_helper hlpr;
3871
+static int __init init(void)
3875
+ memset(&hlpr, 0, sizeof(struct ip_nat_helper));
3876
+ /* hlpr.list is already { NULL, NULL } after the memset() above */
3877
+ hlpr.tuple.dst.protonum = IPPROTO_TCP;
3878
+ hlpr.tuple.dst.u.tcp.port = htons(FOO_PORT);
3879
+ hlpr.mask.dst.protonum = 0xFFFF;
3880
+ hlpr.mask.dst.u.tcp.port = 0xFFFF;
3881
+ hlpr.help = foo_help;
3882
+ hlpr.expect = foo_nat_expected;
3884
+ ret = ip_nat_helper_register(&hlpr);
3889
+static void __exit fini(void)
3891
+ ip_nat_helper_unregister(&hlpr);
3895
+<sect1>Understanding Netfilter
3897
+<p>Netfilter is pretty simple, and is described fairly thoroughly in
3898
+the previous sections. However, sometimes it's necessary to go
3899
+beyond what the NAT or ip_tables infrastructure offers, or you may
3900
+want to replace them entirely.
3902
+<p>One important issue for netfilter (well, in the future) is caching.
3903
+Each skb has an `nfcache' field: a bitmask of what fields in the
3904
+header were examined, and whether the packet was altered or not. The
3905
+idea is that each hook off netfilter OR's in the bits relevant to it,
3906
+so that we can later write a cache system which will be clever enough
3907
+to realize when packets do not need to be passed through netfilter at
3910
+<p>The most important bits are NFC_ALTERED, meaning the packet was
3911
+altered (this is already used for IPv4's NF_IP_LOCAL_OUT hook, to
3912
+reroute altered packets), and NFC_UNKNOWN, which means caching should
3913
+not be done because some property which cannot be expressed was
3914
+examined. If in doubt, simply set the NFC_UNKNOWN flag on the skb's
3915
+nfcache field inside your hook.
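+<p>As a rough illustration of how the `nfcache' bits are meant to be
+combined, consider the userspace sketch below. The DEMO_ names and
+their numeric values are placeholders chosen for this example only
+(take the real NFC_ definitions from the kernel's netfilter headers),
+and `demo_cacheable()' is a guess at what a future cache might ask,
+not an existing function.

```c
#include <stdint.h>

/* Placeholder bit values for this sketch only; the real NFC_*
   definitions live in the kernel's netfilter headers. */
#define DEMO_NFC_SRC     0x0001u  /* hypothetical: "source address examined" */
#define DEMO_NFC_UNKNOWN 0x4000u
#define DEMO_NFC_ALTERED 0x8000u

/* Each hook ORs in the bits describing what it looked at or changed. */
static uint32_t demo_note(uint32_t nfcache, uint32_t bits)
{
    return nfcache | bits;
}

/* In this sketch a result may be cached unless some hook flagged
   UNKNOWN, i.e. examined something the bitmask cannot express. */
static int demo_cacheable(uint32_t nfcache)
{
    return (nfcache & DEMO_NFC_UNKNOWN) == 0;
}
```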
3917
+<sect1>Writing New Netfilter Modules
3919
+<sect2> Plugging Into Netfilter Hooks
3921
+<p> To receive/mangle packets inside the kernel, you can simply write
3922
+a module which registers a "netfilter hook". This is basically an
3923
+expression of interest at some given point; the actual points are
3924
+protocol-specific, and defined in protocol-specific netfilter headers,
3925
+such as "netfilter_ipv4.h".
3927
+<p> To register and unregister netfilter hooks, you use the functions
3928
+`nf_register_hook' and `nf_unregister_hook'. These each take a
3929
+pointer to a `struct nf_hook_ops', which you populate as follows:
3932
+<tag>list</tag> Used to sew you into the linked list: set to '{ NULL,
3935
+<tag>hook</tag> The function which is called when a packet hits this
3936
+hook point. Your function must return NF_ACCEPT, NF_DROP or NF_QUEUE.
3937
+If NF_ACCEPT, the next hook attached to that point will be called. If
3938
+NF_DROP, the packet is dropped. If NF_QUEUE, it's queued. You
3939
+receive a pointer to an skb pointer, so you can entirely replace the
3942
+<tag>flush</tag> Currently unused: designed to pass on packet hits
3943
+when the cache is flushed. May never be implemented: set it to NULL.
3945
+<tag>pf</tag> The protocol family, eg, `PF_INET' for IPv4.
3947
+<tag>hooknum</tag> The number of the hook you are interested in, eg
3951
+<sect2> Processing Queued Packets
3953
+<p>This interface is currently used by ip_queue; you can register to
3954
+handle queued packets for a given protocol. This has similar semantics
3955
+to registering for a hook, except you can block processing the packet,
3956
+and you only see packets for which a hook has replied `NF_QUEUE'.
3958
+<p>The two functions used to register interest in queued packets are
3959
+`nf_register_queue_handler()' and `nf_unregister_queue_handler()'. The
3960
+function you register will be called with the `void *' pointer you
3961
+handed to `nf_register_queue_handler()'.
3964
+If no-one is registered to handle a protocol, then returning NF_QUEUE
3965
+is equivalent to returning NF_DROP.
3968
+Once you have registered interest in queued packets, they begin
3969
+queueing. You can do whatever you want with them, but you must call
3970
+`nf_reinject()' when you are finished with them (don't simply
3971
+kfree_skb() them). When you reinject an skb, you hand it the skb, the
3972
+`struct nf_info' which your queue handler was given, and a verdict:
3973
+NF_DROP causes them to be dropped, NF_ACCEPT causes them to continue
3974
+to iterate through the hooks, NF_QUEUE causes them to be queued again,
3975
+and NF_REPEAT causes the hook which queued the packet to be consulted
3976
+again (beware infinite loops).
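+<p>The verdict handling described above can be modelled in a few lines
+of userspace C. This is a sketch of the semantics only, with invented
+`demo_' names and arbitrary enum values; it does not call or mirror
+the signatures of `nf_reinject()' or any other kernel function.

```c
#include <stddef.h>

/* Verdict names follow the text; the numeric values are arbitrary here. */
enum demo_verdict { DEMO_DROP, DEMO_ACCEPT, DEMO_QUEUE, DEMO_REPEAT };

typedef enum demo_verdict (*demo_hook_fn)(int *pkt);

/* Hypothetical result of walking a hook chain. */
struct demo_result {
    enum demo_verdict verdict;  /* final verdict of the walk */
    size_t stopped_at;          /* index of the hook that did not accept */
};

/* Call hooks[start..n-1] in order until one does not accept. */
static struct demo_result demo_run_hooks(const demo_hook_fn *hooks, size_t n,
                                         size_t start, int *pkt)
{
    struct demo_result r = { DEMO_ACCEPT, 0 };
    size_t i;

    for (i = start; i < n; i++) {
        enum demo_verdict v = hooks[i](pkt);
        if (v != DEMO_ACCEPT) {
            r.verdict = v;
            r.stopped_at = i;
            break;
        }
    }
    return r;
}

/* Model of reinjection: ACCEPT resumes after the queueing hook, REPEAT
   consults that hook again, DROP and QUEUE are taken as they are. */
static struct demo_result demo_reinject(const demo_hook_fn *hooks, size_t n,
                                        struct demo_result queued,
                                        enum demo_verdict v, int *pkt)
{
    if (v == DEMO_ACCEPT)
        return demo_run_hooks(hooks, n, queued.stopped_at + 1, pkt);
    if (v == DEMO_REPEAT)
        return demo_run_hooks(hooks, n, queued.stopped_at, pkt);
    queued.verdict = v;
    return queued;
}

/* Toy hook: queue the packet the first time it is seen, accept afterwards. */
static enum demo_verdict demo_queue_once(int *pkt)
{
    if (*pkt == 0) {
        *pkt = 1;
        return DEMO_QUEUE;
    }
    return DEMO_ACCEPT;
}

/* Toy hook: drop everything. */
static enum demo_verdict demo_drop_all(int *pkt)
{
    (void)pkt;
    return DEMO_DROP;
}
```

+<p>Note how a hook that answers REPEAT handling with QUEUE every time
+would loop forever in such a model, which is the infinite loop the
+text warns about.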
3978
+<p>You can look inside the `struct nf_info' to get auxiliary
3979
+information about the packet, such as the interfaces and hook it was
3982
+<sect2> Receiving Commands From Userspace
3984
+<p>It is common for netfilter components to want to interact with
3985
+userspace. The method for doing this is by using the setsockopt
3986
+mechanism. Note that each protocol must be modified to call
3987
+nf_setsockopt() for setsockopt numbers it doesn't understand (and
3988
+nf_getsockopt() for getsockopt numbers), and so far only IPv4, IPv6
3989
+and DECnet have been modified.
3991
+<p>Using a now-familiar technique, we register a `struct
3992
+nf_sockopt_ops' using the nf_register_sockopt() call. The fields of
3993
+this structure are as follows:
3996
+<tag>list</tag> Used to sew it into the linked list: set to '{ NULL,
3999
+<tag>pf</tag> The protocol family you handle, eg. PF_INET.
4001
+<tag>set_optmin</tag> and
4002
+<tag>set_optmax</tag>
4004
+These specify the (exclusive) range of setsockopt numbers handled.
4005
+Hence using 0 and 0 means you have no setsockopt numbers.
4007
+<tag>set</tag> This is the function called when the user calls one of
4008
+your setsockopts. You should check that they have NET_ADMIN
4009
+capability within this function.
4011
+<tag>get_optmin</tag> and
4012
+<tag>get_optmax</tag>
4014
+These specify the (exclusive) range of getsockopt numbers handled.
4015
+Hence using 0 and 0 means you have no getsockopt numbers.
4017
+<tag>get</tag> This is the function called when the user calls one of
4018
+your getsockopts. You should check that they have NET_ADMIN
4019
+capability within this function.
4022
+<p>The final two fields are used internally.
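+<p>One way to read the "exclusive" optmin/optmax range is as a
+half-open interval: the minimum is handled, the maximum is not. That
+reading is an assumption here, although it is consistent with 0 and 0
+meaning "no numbers at all". A userspace sketch with the invented
+helper `demo_handles_opt()':

```c
#include <stdbool.h>

/* Hypothetical model of the range test: optmin is the first number
   handled and optmax is one past the last, so 0 and 0 is empty. */
static bool demo_handles_opt(int optmin, int optmax, int optnum)
{
    return optnum >= optmin && optnum < optmax;
}
```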
4024
+<sect1>Packet Handling in Userspace
4026
+<p>Using the libipq library and the `ip_queue' module, almost anything
4027
+which can be done inside the kernel can now be done in userspace.
4028
+This means that, with some speed penalty, you can develop your code
4029
+entirely in userspace. Unless you are trying to filter large
4030
+bandwidths, you should find this approach superior to in-kernel packet
4033
+<p>In the very early days of netfilter, I proved this by porting an
4034
+embryonic version of iptables to userspace. Netfilter opens the doors
4035
+for more people to write their own, fairly efficient netmangling
4036
+modules, in whatever language they want.
4038
+<sect>Translating 2.0 and 2.2 Packet Filter Modules
4040
+<p>Look at the ip_fw_compat.c file for a simple layer which should
4041
+make porting quite simple.
4043
+<sect>Netfilter Hooks for Tunnel Writers
4045
+<p>Authors of tunnel (or encapsulation) drivers should follow two
4046
+simple rules for the 2.4 kernel (as do the drivers inside the kernel,
4047
+like net/ipv4/ipip.c):
4051
+Release skb->nfct if you're going to make the packet unrecognisable
4052
+(ie. decapsulating/encapsulating). You don't need to do this if you
4053
+unwrap it into a *new* skb, but if you're going to do it in place, you
4056
+<p>Otherwise: the NAT code will use the old connection tracking
4057
+information to mangle the packet, with bad consequences.
4059
+<item>Make sure the encapsulated packets go through the LOCAL_OUT
4060
+hook, and decapsulated packets go through the PRE_ROUTING hook (most
4061
+tunnels use ip_rcv(), which does this for you).
4063
+<p>Otherwise: the user will not be able to filter as they expect to with
4067
+<p>The canonical way to do the first is to insert code like the
4068
+following before you wrap or unwrap the packet:
4071
+ /* Tell the netfilter framework that this packet is not the
4072
+ same as the one before! */
4073
+#ifdef CONFIG_NETFILTER
4074
+ nf_conntrack_put(skb->nfct);
4076
+#ifdef CONFIG_NETFILTER_DEBUG
4077
+ skb->nf_debug = 0;
4082
+<p>Usually, all you need to do for the second is to find where the
4083
+newly encapsulated packet goes into "ip_send()", and replace it with
4087
+ /* Send "new" packet from local host */
4088
+ NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev, ip_send);
4091
+<p> Following these rules means that the person setting up the packet
4092
+filtering rules on the tunnel box will see something like the
4093
+following sequence for a packet being tunnelled:
4096
+<item> FORWARD hook: normal packet (from eth0 -> tunl0)
4097
+<item> LOCAL_OUT hook: encapsulated packet (to eth1).
4100
+And for the reply packet:
4102
+<item> LOCAL_IN hook: encapsulated reply packet (from eth1)
4103
+<item> FORWARD hook: reply packet (from eth1 -> eth0).
4106
+<sect>The Test Suite
4108
+<p>Within the CVS repository lives a test suite: the more the test
4109
+suite covers, the greater confidence you can have that changes to the
4110
+code haven't quietly broken something. Trivial tests are at least as
4111
+important as tricky tests: it's the trivial tests which simplify the
4112
+complex tests (since you know the basics work fine before the complex
4115
+<p>The tests are simple: they are just shell scripts under the
4116
+testsuite/ subdirectory which are supposed to succeed. The scripts
4117
+are run in alphabetical order, so `01test' is run before `02test'.
4118
+Currently there are 5 test directories:
4121
+<tag>00netfilter/</tag> General netfilter framework tests.
4122
+<tag>01iptables/</tag> iptables tests.
4123
+<tag>02conntrack/</tag> connection tracking tests.
4124
+<tag>03NAT/</tag> NAT tests
4125
+<tag>04ipchains-compat/</tag> ipchains/ipfwadm compatibility tests
4128
+Inside the testsuite/ directory is a script called `test.sh'. It
4129
+configures two dummy interfaces (tap0 and tap1), turns forwarding on,
4130
+and removes all netfilter modules. Then it runs through the
4131
+directories above and runs each of their test.sh scripts until one
4132
+fails. This script takes two optional arguments: `-v' meaning to
4133
+print out each test as it proceeds, and an optional test name: if this
4134
+is given, it will skip over all tests until this one is found.
4136
+<sect1>Writing a Test
4138
+<p>Create a new file in the appropriate directory: try to number your
4139
+test so that it gets run at the right time. For example, in order to
4140
+test ICMP reply tracking (02conntrack/02reply.sh), we need to first
4141
+check that outgoing ICMPs are tracked properly
4142
+(02conntrack/01simple.sh).
4144
+<p>It's usually better to create many small files, each of which
4145
+covers one area, because it helps to isolate problems immediately for
4146
+people running the testsuite.
4148
+<p>If something goes wrong in the test, simply do an `exit 1', which
4149
+causes failure; if it's something you expect may fail, you should
4150
+print a unique message. Your test should end with `exit 0' if
4151
+everything goes OK. You should check the success of <bf>every</bf>
4152
+command, either using `set -e' at the top of the script, or
4153
+appending `|| exit 1' to the end of each command.
4155
+<p>The helper functions `load_module' and `remove_module' can be used
4156
+to load modules: you should never rely on autoloading in the testsuite
4157
+unless that is what you are specifically testing.
4159
+<sect1>Variables And Environment
4161
+<p>You have two play interfaces: tap0 and tap1. Their interface
4162
+addresses are in variables <tt>$TAP0</tt> and <tt>$TAP1</tt>
4163
+respectively. They both have netmasks of 255.255.255.0; their
4164
+networks are in $TAP0NET and $TAP1NET respectively.
4166
+<p>There is an empty temporary file in $TMPFILE. It is deleted at the
4169
+<p>Your script will be run from the testsuite/ directory, wherever it
4170
+is. Hence you should access tools (such as iptables) using a path
4171
+starting with `../userspace'.
4173
+<p>Your script can print out more information if $VERBOSE is set
4174
+(meaning that the user specified `-v' on the command line).
4176
+<sect1>Useful Tools
4179
+There are several useful testsuite tools in the "tools" subdirectory:
4180
+each one exits with a non-zero exit status if there is a problem.
4184
+<p>You can generate IP packets using `gen_ip', which outputs an IP
4185
+packet to standard output. You can feed packets into tap0 and tap1
4186
+by sending standard output to /dev/tap0 and /dev/tap1 (these are
4187
+created upon first running the testsuite if they don't exist).
4189
+<p>gen_ip is a simplistic program which is currently very fussy about
4190
+its argument order. First are the general optional arguments:
4194
+<tag>FRAG=offset,length</tag> Generate the packet, then turn it into a
4195
+ fragment at the following offset and length.
4197
+<tag>MF</tag> Set the `More Fragments' bit on the packet.
4199
+<tag>MAC=xx:xx:xx:xx:xx:xx</tag> Set the source MAC address on the
4202
+<tag>TOS=tos</tag> Set the TOS field on the packet (0 to 255).
4206
+Next come the compulsory arguments:
4209
+<tag>source ip</tag> Source IP address of the packet.
4211
+<tag>dest ip</tag> Destination IP address of the packet.
4213
+<tag>length</tag> Total length of the packet, including headers.
4215
+<tag>protocol</tag> Protocol number of the packet, eg 17 = UDP.
4219
+Then the arguments depend on the protocol: for UDP (17), they are the
4220
+source and destination port numbers. For ICMP (1), they are the type
4221
+and code of the ICMP message: if the type is 0 or 8 (ping-reply or
4222
+ping), then two additional arguments (the ID and sequence fields) are
4223
+required. For TCP, the source and destination ports, and flags
4224
+("SYN", "SYN/ACK", "ACK", "RST" or "FIN") are required. There are
4225
+three optional arguments: "OPT=" followed by a comma-separated list of
4226
+options, "SYN=" followed by a sequence number, and "ACK=" followed by
4227
+a sequence number. Finally, the optional argument "DATA" indicates
4228
+that the payload of the TCP packet is to be filled with the contents
4233
+<p>You can see IP packets using `rcv_ip', which prints out the command
4234
+line as close as possible to the original value fed to gen_ip
4235
+(fragments are the exception).
4237
+<p>This is extremely useful for analyzing packets. It takes two
4238
+compulsory arguments:
4241
+<tag>wait time</tag> The maximum time in seconds to wait for a packet
4242
+ from standard input.
4244
+<tag>iterations</tag> The number of packets to receive.
4247
+There is one optional argument, "DATA", which causes the payload of a
4248
+TCP packet to be printed on standard output after the packet header.
4250
+<p>The standard way to use `rcv_ip' in a shell script is as follows:
4253
+# Set up job control, so we can use & in shell scripts.
4256
+# Wait two seconds for one packet from tap0
4257
+../tools/rcv_ip 2 1 < /dev/tap0 > $TMPFILE &
4259
+# Make sure that rcv_ip has started running.
4262
+# Send a ping packet
4263
+../tools/gen_ip $TAP1NET.2 $TAP0NET.2 100 1 8 0 55 57 > /dev/tap1 || exit 1
4266
+if wait %../tools/rcv_ip; then :
4268
+ echo rcv_ip failed:
4276
+<p>This program takes a packet (as generated by gen_ip, for example)
4277
+on standard input, and turns it into an ICMP error.
4279
+<p>It takes three arguments: a source IP address, a type and a code.
4280
+The destination IP address will be set to the source IP address of the
4281
+packet fed in standard input.
4285
+<p>This takes a packet from standard input and injects it into the
4286
+system from a raw socket. This gives the appearance of a
4287
+locally-generated packet (as separate from feeding a packet in one of
4288
+the ethertap devices, which looks like a remotely-generated packet).
4290
+<sect1>Random Advice
4292
+<p>All the tools assume they can do everything in one read or write:
4293
+this is true for the ethertap devices, but might not be true if you're
4294
+doing something tricky with pipes.
4296
+<p>dd can be used to cut packets: dd has an obs (output block size)
4297
+option which can be used to make it output the packet in a single
4300
+<p>Test for success first: eg. testing that packets are successfully
4301
+blocked. First test that packets pass through normally, <bf>then</bf>
4302
+test that some packets are blocked. Otherwise an unrelated failure
4303
+could be stopping the packets...
4305
+<p>Try to write exact tests, not `throw random stuff and see what
4306
+happens' tests. If an exact test goes wrong, it's a useful thing to
4307
+know. If a random test goes wrong once, it doesn't help much.
4309
+<p>If a test fails without a message, you can add `-x' to the top line
4310
+of the script (ie. `#! /bin/sh -x') to see what commands it's running.
4312
+<p>If a test fails randomly, check for random network traffic
4313
+interfering (try downing all your external interfaces). Sitting on
4314
+the same network as Andrew Tridgell, I tend to get plagued by Windows
4315
+broadcasts, for example.
4319
+<p>As I was developing ipchains, I realized (in one of those
4320
+blinding-flash-while-waiting-for-entree moments in a Chinese
4321
+restaurant in Sydney) that packet filtering was being done in the
4322
+wrong place. I can't find it now, but I remember sending mail to Alan
4323
+Cox, who kind of said `why don't you finish what you're doing, first,
4324
+even though you're probably right'. In the short term, pragmatism was
4325
+to win over The Right Thing.
4327
+<p>After I finished ipchains, which was initially going to be a minor
4328
+modification of the kernel part of ipfwadm, and turned into a larger
4329
+rewrite, and wrote the HOWTO, I became aware of just how much
4330
+confusion there is in the wider Linux community about issues like
4331
+packet filtering, masquerading, port forwarding and the like.
4333
+<p>This is the joy of doing your own support: you get a closer feel
4334
+for what the users are trying to do, and what they are struggling
4335
+with. Free software is most rewarding when it's in the hands of the
4336
+most users (that's the point, right?), and that means making it easy.
4337
+The architecture, not the documentation, was the key flaw.
4339
+<p>So I had the experience, with the ipchains code, and a good idea of
4340
+what people out there were doing. There were only two problems.
4342
+<p>Firstly, I didn't want to get back into security. Being a security
4343
+consultant is a constant moral tug-of-war between your conscience and
4344
+your wallet. At a fundamental level, you are selling the feeling of
4345
+security, which is at odds with actual security. Maybe working in a
4346
+military setting, where they understand security, it'd be different.
4348
+<p>The second problem is that newbie users aren't the only concern; an
4349
+increasing number of large companies and ISPs are using this stuff. I
4350
+needed reliable input from that class of users if it was to scale to
4351
+tomorrow's home users.
4353
+<p>These problems were resolved when I ran into David Bonn, of
4354
+WatchGuard fame, at Usenix in July 1998. They were looking for a
4355
+Linux kernel coder; in the end we agreed that I'd head across to their
4356
+Seattle offices for a month and we'd see if we could hammer out an
4357
+agreement whereby they'd sponsor my new code, and my current support
4358
+efforts. The rate we agreed on was more than I asked, so I didn't
4359
+take a pay cut. This means I don't have to even think about external
4360
+conslutting for a while.
4362
+<p>Exposure to WatchGuard gave me exposure to the large clients I
4363
+need, and being independent from them allowed me to support all users
4364
+(eg. WatchGuard competitors) equally.
4366
+<p>So I could have simply written netfilter, ported ipchains over the
4367
+top, and been done with it. Unfortunately, that would leave all the
4368
+masquerading code in the kernel: making masquerading independent from
+filtering is one of the major wins of moving the packet filtering
+points, but to do that masquerading also needed to be moved over to
+the netfilter framework.
4373
+<p>Also, my experience with ipfwadm's `interface-address' feature (the
4374
+one I removed in ipchains) had taught me that there was no hope of
4375
+simply ripping out the masquerading code and expecting someone who
4376
+needed it to do the work of porting it onto netfilter for me.
4378
+<p>So I needed to have at least as many features as the current code;
4379
+preferably a few more, to encourage niche users to become early
4380
+adopters. This means replacing transparent proxying (gladly!),
4381
+masquerading and port forwarding. In other words, a complete NAT layer.
4383
+<p>Even if I had decided to port the existing masquerading layer,
4384
+instead of writing a generic NAT system, the masquerading code was
4385
+showing its age, and lack of maintenance. See, there was no
4386
+masquerading maintainer, and it shows. It seems that serious users
4387
+generally don't use masquerading, and there aren't many home users up
4388
+to the task of doing maintenance. Brave people like Juan Ciarlante
4389
+were doing fixes, but it had reached the stage (being extended over
4390
+and over) that a rewrite was needed.
4392
+<p>Please note that I wasn't the person to do a NAT rewrite: I didn't
4393
+use masquerading any more, and I'd not studied the existing code at
4394
+the time. That's probably why it took me longer than it should have.
4395
+But the result is fairly good, in my opinion, and I sure as hell
4396
+learned a lot. No doubt the second version will be even better, once
4397
+we see how people use it.
4401
+<p>Thanks to those who helped, especially Harald Welte for writing the
4402
+Protocol Helpers section.
4404
Index: iptables-1.4.12/howtos/packet-filtering-HOWTO.sgml
4405
===================================================================
4406
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
4407
+++ iptables-1.4.12/howtos/packet-filtering-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600
4409
+<!doctype linuxdoc system>
4411
+<!-- This is the Linux Packet Filtering HOWTO. -->
4414
+<!-- $Id: packet-filtering-HOWTO.sgml,v 1.26 2002/01/24 13:42:53 laforge Exp $ -->
4418
+<!-- Title information -->
4420
+<title>Linux 2.4 Packet Filtering HOWTO
4421
+<author>Rusty Russell, mailing list <tt>netfilter@lists.samba.org</tt>
4422
+<date>$Revision: 1.26 $ $Date: 2002/01/24 13:42:53 $
4424
+This document describes how to use iptables to filter out bad packets
4425
+for the 2.4 Linux kernels.
4428
+<!-- Table of contents -->
4431
+<!-- Begin the document -->
4433
+<sect>Introduction<label id="intro">
4436
+Welcome, gentle reader.
4439
+It is assumed you know what an IP address, a network address, a
4440
+netmask, routing and DNS are. If not, I recommend that you read the
4441
+Network Concepts HOWTO.
4444
+This HOWTO flips between a gentle introduction (which will leave you
4445
+feeling warm and fuzzy now, but unprotected in the Real World) and raw
4446
+full-disclosure (which would leave all but the hardiest souls
4447
+confused, paranoid and seeking heavy weaponry).
4450
+Your network is not <bf>secure</bf>. The problem of allowing rapid,
4451
+convenient communication while restricting its use to good, and not
4452
+evil intents is congruent to other intractable problems such as
4453
+allowing free speech while disallowing a call of ``Fire!'' in a
4454
+crowded theater. It will not be solved in the space of this HOWTO.
4457
+So only you can decide where the compromise will be. I will try to
4458
+instruct you in the use of some of the tools available and some
4459
+vulnerabilities to be aware of, in the hope that you will use them for
4460
+good, and not evil purposes. Another equivalent problem.
4462
+<p>(C) 2000 Paul `Rusty' Russell. Licensed under the GNU GPL.
4464
+<sect>Where is the official Web Site? Is there a Mailing List?
4466
+<p>There are three official sites:
4468
+<item>Thanks to <url url="http://netfilter.filewatcher.org/" name="Filewatcher">.
4469
+<item>Thanks to <url url="http://netfilter.samba.org/" name="The Samba Team and SGI">.
4470
+<item>Thanks to <url url="http://netfilter.gnumonks.org/" name="Harald Welte">.
4472
+<p> You can reach all of them using round-robin DNS via
4473
+<url url="http://www.netfilter.org/"> and <url url="http://www.iptables.org/">
4475
+<p>For the official netfilter mailing list, see
4476
+<url url="http://www.netfilter.org/contact.html#list" name="netfilter List">.
4478
+<sect>So What's A Packet Filter?
4481
+A packet filter is a piece of software which looks at the
4482
+<em>header</em> of packets as they pass through, and decides the fate
4483
+of the entire packet. It might decide to <bf>DROP</bf> the packet
4484
+(i.e., discard the packet as if it had never received it),
4485
+<bf>ACCEPT</bf> the packet (i.e., let the packet go through), or
4486
+something more complicated.
4489
+Under Linux, packet filtering is built into the kernel (as a kernel
4490
+module, or built right in), and there are a few trickier things we can
4491
+do with packets, but the general principle of looking at the headers
4492
+and deciding the fate of the packet is still there.
4494
+<sect1>Why Would I Want to Packet Filter?
4497
+Control. Security. Watchfulness.
4501
+<tag/Control:/ when you are using a Linux box to connect your internal
4502
+network to another network (say, the Internet) you have an opportunity
4503
+to allow certain types of traffic, and disallow others. For example,
4504
+the header of a packet contains the destination address of the packet,
4505
+so you can prevent packets going to a certain part of the outside
4506
+network. As another example, I use Netscape to access the Dilbert
4507
+archives. There are advertisements from doubleclick.net on the page,
4508
+and Netscape wastes my time by cheerfully downloading them.
4509
+Telling the packet filter not to allow any packets to or from the
4510
+addresses owned by doubleclick.net solves that problem (there are
4511
+better ways of doing this though: see Junkbuster).
4513
+<tag/Security:/ when your Linux box is the only thing between the
4514
+chaos of the Internet and your nice, orderly network, it's nice to
4515
+know you can restrict what comes tromping in your door. For example,
4516
+you might allow anything to go out from your network, but you might be
4517
+worried about the well-known `Ping of Death' coming in from malicious
4518
+outsiders. As another example, you might not want outsiders
4519
+telnetting to your Linux box, even though all your accounts have
4520
+passwords. Maybe you want (like most people) to be an observer on the
4521
+Internet, and not a server (willing or otherwise). Simply don't let
4522
+anyone connect in, by having the packet filter reject incoming packets
4523
+used to set up connections.
4525
+<tag/Watchfulness:/ sometimes a badly configured machine on the local
4526
+network will decide to spew packets to the outside world. It's nice
4527
+to tell the packet filter to let you know if anything abnormal occurs;
4528
+maybe you can do something about it, or maybe you're just curious by nature.
4532
+<sect1>How Do I Packet Filter Under Linux?<label id="filter-linux">
4534
+<p>Linux kernels have had packet filtering since the 1.1 series. The
4535
+first generation, based on ipfw from BSD, was ported by Alan Cox in
4536
+late 1994. This was enhanced by Jos Vos and others for Linux 2.0; the
4537
+userspace tool `ipfwadm' controlled the kernel filtering rules. In
4538
+mid-1998, for Linux 2.2, I reworked the kernel quite heavily, with the
4539
+help of Michael Neuling, and introduced the userspace tool `ipchains'.
4540
+Finally, the fourth-generation tool, `iptables', and another kernel
4541
+rewrite occurred in mid-1999 for Linux 2.4. It is this iptables which
4542
+this HOWTO concentrates on.
4545
+You need a kernel which has the netfilter infrastructure in it:
4546
+netfilter is a general framework inside the Linux kernel which other
4547
+things (such as the iptables module) can plug into. This means you
4548
+need kernel 2.3.15 or beyond, and must answer `Y' to CONFIG_NETFILTER in
4549
+the kernel configuration.
4552
+The tool <tt>iptables</tt> talks to the kernel and tells it what
4553
+packets to filter. Unless you are a programmer, or overly curious,
4554
+this is how you will control the packet filtering.
4559
+The <tt>iptables</tt> tool inserts and deletes rules from the kernel's
4560
+packet filtering table. This means that whatever you set up, it will
4561
+be lost upon reboot; see <ref id="permanent" name="Making Rules
4562
+Permanent"> for how to make sure they are restored the next time Linux is booted.
4566
+<tt>iptables</tt> is a replacement for <tt>ipfwadm</tt> and
4567
+<tt>ipchains</tt>: see
4568
+<ref id="oldstyle" name="Using ipchains and ipfwadm"> for how to painlessly
4569
+avoid using iptables if you're using one of those tools.
4571
+<sect2> Making Rules Permanent<label id="permanent">
4573
+<p>Your current firewall setup is stored in the kernel, and thus will
+be lost on reboot. You can try the iptables-save and iptables-restore
+scripts to save your rules to, and restore them from, a file.
4577
+<p>The other way is to put the commands required to set up your rules
4578
+in an initialization script. Make sure you do something intelligent
4579
+if one of the commands should fail (usually `exec /sbin/sulogin').
4581
+<sect>Who the hell are you, and why are you playing with my kernel?
4584
+I'm Rusty Russell, the Linux IP Firewall maintainer and just another
4585
+working coder who happened to be in the right place at the right time.
4586
+I wrote ipchains (see <ref id="filter-linux" name="How Do I Packet
4587
+Filter Under Linux?"> above for due credit to the people who did the
4588
+actual work), and learnt enough to get packet filtering right this time.
4592
+<url url="http://www.watchguard.com" name="WatchGuard">, an excellent
4593
+firewall company who sell the really nice plug-in Firebox, offered to
4594
+pay me to do nothing, so I could spend all my time writing this stuff,
4595
+and maintaining my previous stuff. I predicted 6 months, and it took
4596
+12, but I felt by the end that it had been done Right. Many rewrites,
4597
+a hard-drive crash, a laptop being stolen, a couple of corrupted
4598
+filesystems and one broken screen later, here it is.
4601
+While I'm here, I want to clear up some people's misconceptions: I am
4602
+no kernel guru. I know this, because my kernel work has brought me
4603
+into contact with some of them: David S. Miller, Alexey Kuznetsov,
4604
+Andi Kleen, Alan Cox. However, they're all busy doing the deep magic,
4605
+leaving me to wade in the shallow end where it's safe.
4607
+<!-- This is probably no longer true; somewhere in writing all this
4608
+kernel code and documentation I seem to have picked up a fair number
4609
+of kernel tricks. But I'm still nowhere near as clever as I think I am. -->
4612
+<sect> Rusty's Really Quick Guide To Packet Filtering
4615
+Most people just have a single PPP connection to the Internet, and
4616
+don't want anyone coming back into their network, or the firewall:
4619
+## Insert connection-tracking modules (not needed if built into kernel).
4620
+# insmod ip_conntrack
4621
+# insmod ip_conntrack_ftp
4623
+## Create chain which blocks new connections, except if coming from inside.
4624
+# iptables -N block
4625
+# iptables -A block -m state --state ESTABLISHED,RELATED -j ACCEPT
4626
+# iptables -A block -m state --state NEW -i ! ppp0 -j ACCEPT
4627
+# iptables -A block -j DROP
4629
+## Jump to that chain from INPUT and FORWARD chains.
4630
+# iptables -A INPUT -j block
4631
+# iptables -A FORWARD -j block
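+<p>To make the logic of the `block' chain concrete, here is a minimal
+Python sketch of the decision it encodes. It is only a model under
+stated assumptions, not how the kernel implements it: the function
+name <tt>block_chain</tt> and its arguments are hypothetical, and the
+connection state is simply handed in rather than tracked.

```python
def block_chain(state: str, in_iface: str) -> str:
    """Model of the three rules appended to the `block' chain above.

    state and in_iface are illustrative stand-ins for what connection
    tracking and the kernel would normally supply.
    """
    # -m state --state ESTABLISHED,RELATED -j ACCEPT
    if state in ("ESTABLISHED", "RELATED"):
        return "ACCEPT"
    # -m state --state NEW -i ! ppp0 -j ACCEPT
    if state == "NEW" and in_iface != "ppp0":
        return "ACCEPT"
    # -j DROP (catch-all at the end of the chain)
    return "DROP"


if __name__ == "__main__":
    for state, iface in [("NEW", "ppp0"), ("NEW", "eth0"), ("ESTABLISHED", "ppp0")]:
        print(state, iface, "->", block_chain(state, iface))
```

+<p>New connections arriving on ppp0 fall through to the final DROP;
+everything started from the inside, and replies to it, is let through.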
4634
+<sect> How Packets Traverse The Filters
4637
+The kernel starts with three lists of rules in the `filter' table;
4638
+these lists are called <bf>firewall chains</bf> or just
4639
+<bf>chains</bf>. The three chains are called <bf>INPUT</bf>,
4640
+<bf>OUTPUT</bf> and <bf>FORWARD</bf>.
4643
+For ASCII-art fans, the chains are arranged like so: <bf>(Note: this
4644
+is a very different arrangement from the 2.0 and 2.2 kernels!)</bf>
4648
+                          _____
+Incoming                 /     \         Outgoing
+       -->[Routing ]--->|FORWARD|------->
+          [Decision]     \_____/        ^
+               |                        |
+               v                       ____
+              ___                     /    \
+             /   \                   |OUTPUT|
+            |INPUT|                   \____/
+             \___/                      ^
+               |                        |
+                ----> Local Process ----
4661
+<p>The three circles represent the three chains mentioned above. When
4662
+a packet reaches a circle in the diagram, that chain is examined to
4663
+decide the fate of the packet. If the chain says to DROP the packet,
4664
+it is killed there, but if the chain says to ACCEPT the packet, it
4665
+continues traversing the diagram.
4668
+A chain is a checklist of <bf>rules</bf>. Each rule says `if the packet
4669
+header looks like this, then here's what to do with the packet'. If
4670
+the rule doesn't match the packet, then the next rule in the chain is
4671
+consulted. Finally, if there are no more rules to consult, then the
4672
+kernel looks at the chain <bf>policy</bf> to decide what to do. In a
4673
+security-conscious system, this policy usually tells the kernel to DROP the packet.
4678
+<item>When a packet comes in (say, through the Ethernet card) the kernel
4679
+first looks at the destination of the packet: this is called `routing'.
4682
+<item>If it's destined for this box, the packet passes downwards
4683
+in the diagram, to the INPUT chain. If it passes this, any processes
4684
+waiting for that packet will receive it.
4686
+<item>Otherwise, if the kernel does not have forwarding enabled, or it
4687
+doesn't know how to forward the packet, the packet is dropped. If
4688
+forwarding is enabled, and the packet is destined for another network
4689
+interface (if you have another one), then the packet goes rightwards
4690
+on our diagram to the FORWARD chain. If it is ACCEPTed, it will be sent out.
4693
+<item>Finally, a program running on the box can send network packets.
4694
+These packets pass through the OUTPUT chain immediately: if it says
4695
+ACCEPT, then the packet continues out to whatever interface it is destined for.
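+<p>The traversal described above can be sketched in a few lines of
+Python. This is a toy model, not kernel code: the names
+<tt>chain_for_incoming</tt> and <tt>run_chain</tt> are made up for
+illustration, a packet is just a dictionary, only ACCEPT and DROP
+targets are modelled, and the `doesn't know how to forward' case is
+left out.

```python
from typing import Callable, List, Optional, Set, Tuple

Packet = dict
Rule = Tuple[Callable[[Packet], bool], str]  # (match function, target)


def chain_for_incoming(dst: str, local_addrs: Set[str], forwarding: bool) -> Optional[str]:
    """Routing decision: which chain an incoming packet is handed to."""
    if dst in local_addrs:
        return "INPUT"
    if not forwarding:
        return None  # dropped before reaching any chain
    return "FORWARD"


def run_chain(rules: List[Rule], policy: str, packet: Packet) -> str:
    """Consult the rules in order; fall back to the policy if none match."""
    for matches, target in rules:
        if matches(packet):
            return target
    return policy


if __name__ == "__main__":
    input_rules = [(lambda p: p["proto"] == "icmp", "DROP")]
    packet = {"proto": "icmp", "dst": "192.168.1.1"}
    chain = chain_for_incoming(packet["dst"], {"192.168.1.1"}, forwarding=False)
    print(chain, run_chain(input_rules, "ACCEPT", packet))
```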
4699
+<sect>Using iptables
4702
+iptables has a fairly detailed manual page (<tt>man iptables</tt>),
+which you should consult if you need more detail on particulars. Those
+of you familiar with ipchains may simply want to look at
+<ref id="Appendix-A" name="Differences Between iptables and ipchains">;
+they are very similar.
4709
+There are several different things you can do with <tt>iptables</tt>.
4710
+You start with three built-in chains <tt>INPUT</tt>, <tt>OUTPUT</tt>
4711
+and <tt>FORWARD</tt> which you can't delete. Let's look at the
4712
+operations to manage whole chains:
4715
+<item> Create a new chain (-N).
4716
+<item> Delete an empty chain (-X).
4717
+<item> Change the policy for a built-in chain (-P).
4718
+<item> List the rules in a chain (-L).
4719
+<item> Flush the rules out of a chain (-F).
4720
+<item> Zero the packet and byte counters on all rules in a chain (-Z).
4723
+There are several ways to manipulate rules inside a chain:
4726
+<item> Append a new rule to a chain (-A).
4727
+<item> Insert a new rule at some position in a chain (-I).
4728
+<item> Replace a rule at some position in a chain (-R).
4729
+<item> Delete a rule at some position in a chain, or the first that matches (-D).
4732
+<sect1> What You'll See When Your Computer Starts Up
4735
+iptables may be a module (called `iptable_filter.o'), which should be
+automatically loaded when you first run <tt>iptables</tt>. It can
+also be built into the kernel permanently.
4739
+<p>Before any iptables commands have been run (be careful: some
4740
+distributions will run iptables in their initialization scripts),
4741
+there will be no rules in any of the built-in chains (`INPUT',
4742
+`FORWARD' and `OUTPUT'), and all the chains will have a policy of ACCEPT.
4743
+You can alter the default policy of the FORWARD chain by providing the
4744
+`forward=0' option to the iptable_filter module.
4746
+<sect1> Operations on a Single Rule
4749
+This is the bread-and-butter of packet filtering: manipulating rules.
4750
+Most commonly, you will probably use the append (-A) and delete (-D)
4751
+commands. The others (-I for insert and -R for replace) are simple
4752
+extensions of these concepts.
4755
+Each rule specifies a set of conditions the packet must meet, and what
4756
+to do if it meets them (a `target'). For example, you might want to
4757
+drop all ICMP packets coming from the IP address 127.0.0.1. So in
4758
+this case our conditions are that the protocol must be ICMP and that
4759
+the source address must be 127.0.0.1. Our target is `DROP'.
4762
+127.0.0.1 is the address of the `loopback' interface, which you will have even if you
4763
+have no real network connection. You can use the `ping' program to
4764
+generate such packets (it simply sends an ICMP type 8 (echo request)
4765
+which all cooperative hosts should obligingly respond to with an ICMP
4766
+type 0 (echo reply) packet). This makes it useful for testing.
4769
+# ping -c 1 127.0.0.1
4770
+PING 127.0.0.1 (127.0.0.1): 56 data bytes
4771
+64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.2 ms
4773
+--- 127.0.0.1 ping statistics ---
4774
+1 packets transmitted, 1 packets received, 0% packet loss
4775
+round-trip min/avg/max = 0.2/0.2/0.2 ms
4776
+# iptables -A INPUT -s 127.0.0.1 -p icmp -j DROP
4777
+# ping -c 1 127.0.0.1
4778
+PING 127.0.0.1 (127.0.0.1): 56 data bytes
4780
+--- 127.0.0.1 ping statistics ---
4781
+1 packets transmitted, 0 packets received, 100% packet loss
4785
+You can see here that the first ping succeeds (the `-c 1' tells ping
4786
+to only send a single packet).
4789
+Then we append (-A) to the `INPUT' chain, a rule specifying that for
4790
+packets from 127.0.0.1 (`-s 127.0.0.1') with protocol ICMP (`-p icmp')
4791
+we should jump to DROP (`-j DROP').
4794
+Then we test our rule, using the second ping. There will be a pause
4795
+before the program gives up waiting for a response that will never come.
4799
+We can delete the rule in one of two ways. Firstly, since we know
4800
+that it is the only rule in the input chain, we can use a numbered delete, as in:
4803
+ # iptables -D INPUT 1
4806
+To delete rule number 1 in the INPUT chain.
4809
+The second way is to mirror the -A command, but replacing the -A with
4810
+-D. This is useful when you have a complex chain of rules and you
4811
+don't want to have to count them to figure out that it's rule 37 that
4812
+you want to get rid of. In this case, we would use:
4814
+ # iptables -D INPUT -s 127.0.0.1 -p icmp -j DROP
4817
+The syntax of -D must have exactly the same options as the -A (or -I
4818
+or -R) command. If there are multiple identical rules in the same
4819
+chain, only the first will be deleted.
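+<p>The two styles of delete can be modelled on a plain Python list.
+This is a sketch only: <tt>delete_rule</tt> and
+<tt>delete_rule_number</tt> are hypothetical helper names, and a rule
+is represented as a tuple of its options rather than anything the
+kernel would recognise.

```python
def delete_rule(chain: list, spec: tuple) -> bool:
    """Remove the first rule equal to spec; report whether one was found."""
    for index, rule in enumerate(chain):
        if rule == spec:
            del chain[index]
            return True
    return False


def delete_rule_number(chain: list, number: int) -> None:
    """Remove a rule by its 1-based position, as in `-D INPUT 1'."""
    del chain[number - 1]


if __name__ == "__main__":
    drop_icmp = ("-s", "127.0.0.1", "-p", "icmp", "-j", "DROP")
    input_chain = [drop_icmp, ("-p", "tcp", "-j", "ACCEPT"), drop_icmp]
    delete_rule(input_chain, drop_icmp)   # only the first copy goes
    print(len(input_chain))
```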
4821
+<sect1>Filtering Specifications
4824
+We have seen the use of `-p' to specify protocol, and `-s' to specify
4825
+source address, but there are other options we can use to specify
4826
+packet characteristics. What follows is an exhaustive compendium.
4828
+<sect2>Specifying Source and Destination IP Addresses
4831
+Source (`-s', `--source' or `--src') and destination (`-d',
4832
+`--destination' or `--dst') IP addresses can be specified in four
4833
+ways. The most common way is to use the full name, such as
4834
+`localhost' or `www.linuxhq.com'. The second way is to specify the IP
4835
+address such as `127.0.0.1'.
4838
+The third and fourth ways allow specification of a group of IP
4839
+addresses, such as `199.95.207.0/24' or `199.95.207.0/255.255.255.0'.
4840
+These both specify any IP address from 199.95.207.0 to 199.95.207.255
4841
+inclusive; the digits after the `/' tell which parts of the IP address
4842
+are significant. `/32' or `/255.255.255.255' is the default (match
4843
+all of the IP address). To specify any IP address at all `/0' can be used, like so:
4846
+ [ NOTE: `-s 0/0' is redundant here. ]
4847
+ # iptables -A INPUT -s 0/0 -j DROP
4851
+This is rarely used, as the effect above is the same as not specifying
4852
+the `-s' option at all.
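+<p>If the prefix and netmask notations are unfamiliar, Python's
+standard <tt>ipaddress</tt> module can be used to check what a given
+specification covers. This is a sketch for experimenting, not what
+iptables itself runs: <tt>source_matches</tt> is a made-up name,
+hostnames are not resolved, and the `0/0' shorthand has to be spelled
+out as `0.0.0.0/0' because <tt>ipaddress</tt> does not accept it.

```python
import ipaddress


def source_matches(spec: str, addr: str) -> bool:
    """True if addr falls inside the address or network given by spec."""
    # strict=False tolerates host bits being set in the network part.
    network = ipaddress.ip_network(spec, strict=False)
    return ipaddress.ip_address(addr) in network


if __name__ == "__main__":
    for spec in ("199.95.207.0/24", "199.95.207.0/255.255.255.0", "127.0.0.1", "0.0.0.0/0"):
        print(spec, source_matches(spec, "199.95.207.42"))
```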
4854
+<sect2>Specifying Inversion
4857
+Many flags, including the `-s' (or `--source') and `-d'
4858
+(`--destination') flags can have their arguments preceded by `!'
4859
+(pronounced `not') to match addresses NOT equal to the ones given.
4860
+For example, `-s ! localhost' matches any packet <bf>not</bf> coming from localhost.
4863
+<sect2>Specifying Protocol
4866
+The protocol can be specified with the `-p' (or `--protocol') flag.
4867
+Protocol can be a number (if you know the numeric protocol values for
4868
+IP) or a name for the special cases of `TCP', `UDP' or `ICMP'. Case
4869
+doesn't matter, so `tcp' works as well as `TCP'.
4872
+The protocol name can be prefixed by a `!', to invert it, such as `-p
4873
+! TCP' to specify packets which are <bf>not</bf> TCP.
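+<p>As a rough illustration of how a protocol specification is
+interpreted, here is a small Python sketch. The function name
+<tt>protocol_matches</tt> is hypothetical, and the table only lists
+the three named special cases with their IP protocol numbers.

```python
# IP protocol numbers for the three named special cases.
PROTOCOLS = {"icmp": 1, "tcp": 6, "udp": 17}


def protocol_matches(spec: str, proto_number: int, invert: bool = False) -> bool:
    """Model `-p spec' (or `-p ! spec' when invert is True)."""
    # A spec is either a number or a case-insensitive name.
    wanted = int(spec) if spec.isdigit() else PROTOCOLS[spec.lower()]
    return (proto_number == wanted) != invert


if __name__ == "__main__":
    print(protocol_matches("TCP", 6), protocol_matches("tcp", 6, invert=True))
```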
4875
+<sect2>Specifying an Interface
4878
+The `-i' (or `--in-interface') and `-o' (or `--out-interface') options
4879
+specify the name of an <bf>interface</bf> to match. An interface is
4880
+the physical device the packet came in on (`-i') or is going out on
4881
+(`-o'). You can use the <tt>ifconfig</tt> command to list the
4882
+interfaces which are `up' (i.e., working at the moment).
4885
+Packets traversing the <tt>INPUT</tt> chain don't have an output
4886
+interface, so any rule using `-o' in this chain will never match.
4887
+Similarly, packets traversing the <tt>OUTPUT</tt> chain don't have an
4888
+input interface, so any rule using `-i' in this chain will never match.
4890
+<p>Only packets traversing the <tt>FORWARD</tt> chain have both an
4891
+input and output interface.
4894
+It is perfectly legal to specify an interface that currently does not
4895
+exist; the rule will not match anything until the interface comes up.
4896
+This is extremely useful for dial-up PPP links (usually interface
4897
+<tt>ppp0</tt>) and the like.
4900
+As a special case, an interface name ending with a `+' will match all
4901
+interfaces (whether they currently exist or not) which begin with that
4902
+string. For example, to specify a rule which matches all PPP
4903
+interfaces, the <tt>-i ppp+</tt> option would be used.
4906
+The interface name can be preceded by a `!' with spaces around it, to
4907
+match a packet which does <bf>not</bf> match the specified
4908
+interface(s), eg <tt>-i ! ppp+</tt>.
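+<p>The `+' wildcard and the `!' inversion boil down to a prefix
+comparison. A minimal Python sketch, assuming nothing beyond the
+rules stated above (<tt>iface_matches</tt> is an illustrative name,
+not part of iptables):

```python
def iface_matches(pattern: str, iface: str, invert: bool = False) -> bool:
    """Model `-i pattern' (or `-i ! pattern' when invert is True)."""
    if pattern.endswith("+"):
        # Trailing `+': match any interface name starting with the prefix.
        hit = iface.startswith(pattern[:-1])
    else:
        hit = iface == pattern
    return hit != invert


if __name__ == "__main__":
    for name in ("ppp0", "ppp1", "eth0"):
        print(name, iface_matches("ppp+", name), iface_matches("ppp+", name, invert=True))
```

+<p>Note that the interface does not need to exist for the comparison
+to be made, which is why such rules can be set up before a dial-up
+link comes up.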
4910
+<sect2>Specifying Fragments
4913
+Sometimes a packet is too large to fit down a wire all at once. When
4914
+this happens, the packet is divided into <bf>fragments</bf>, and sent
4915
+as multiple packets. The other end reassembles these fragments to
4916
+reconstruct the whole packet.
4919
+The problem with fragments is that the initial fragment has the
4920
+complete header fields (IP + TCP, UDP and ICMP) to examine, but
4921
+subsequent packets only have a subset of the headers (IP without the
4922
+additional protocol fields). Thus looking inside subsequent fragments
4923
+for protocol headers (such as is done by the TCP, UDP and ICMP
4924
+extensions) is not possible.
4927
+If you are doing connection tracking or NAT, then all fragments will
4928
+get merged back together before they reach the packet filtering code,
4929
+so you need never worry about fragments.
4932
+Please also note that the INPUT chain of the filter table (or any other
+table hooking into the NF_IP_LOCAL_IN hook) is traversed after
+defragmentation by the core IP stack.
4937
+Otherwise, it is important to understand how fragments get treated by
4938
+the filtering rules. Any filtering rule that asks for information we
4939
+don't have will <em>not</em> match. This means that the first fragment is
4940
+treated like any other packet. Second and further fragments won't be.
4941
+Thus a rule <tt>-p TCP --sport www</tt> (specifying a source port of
4942
+`www') will never match a fragment (other than the first fragment).
4943
+Neither will the opposite rule <tt>-p TCP --sport ! www</tt>.
4946
+However, you can specify a rule specifically for second and further
4947
+fragments, using the `-f' (or `--fragment') flag. It is also legal to
4948
+specify that a rule does <em>not</em> apply to second and further
4949
+fragments, by preceding the `-f' with ` ! '.
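+<p>The `asks for information we don't have' rule is easy to get wrong,
+so here is a small Python model of it. It assumes no connection
+tracking or NAT (so fragments are not reassembled first), represents a
+packet as a dictionary with a hypothetical <tt>frag_offset</tt> field,
+and uses made-up function names.

```python
def is_later_fragment(packet: dict) -> bool:
    """Second and further fragments have a non-zero fragment offset."""
    return packet["frag_offset"] > 0


def sport_rule_matches(packet: dict, port: int, invert: bool = False) -> bool:
    """Model `-p TCP --sport port' (or `--sport ! port')."""
    if is_later_fragment(packet):
        # No TCP header to look at: neither the rule nor its inverse matches.
        return False
    return (packet["sport"] == port) != invert


def fragment_flag_matches(packet: dict, invert: bool = False) -> bool:
    """Model `-f' (or `! -f' when invert is True)."""
    return is_later_fragment(packet) != invert


if __name__ == "__main__":
    later = {"frag_offset": 185, "sport": None}
    print(sport_rule_matches(later, 80), sport_rule_matches(later, 80, invert=True))
    print(fragment_flag_matches(later))
```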
4952
+Usually it is regarded as safe to let second and further fragments
4953
+through, since filtering will affect the first fragment, and thus
4954
+prevent reassembly on the target host; however, bugs have been known
4955
+to allow crashing of machines simply by sending fragments. Your call.
4958
+Note for network-heads: malformed packets (TCP, UDP and ICMP packets
4959
+too short for the firewalling code to read the ports or ICMP code and
4960
+type) are dropped when such examinations are attempted. So are TCP
4961
+fragments starting at position 8.
4964
+As an example, the following rule will drop any fragments going to 192.168.1.1:
4968
+# iptables -A OUTPUT -f -d 192.168.1.1 -j DROP
4972
+<sect2>Extensions to iptables: New Matches
4974
+<p><tt>iptables</tt> is <bf>extensible</bf>, meaning that both the
4975
+kernel and the iptables tool can be extended to provide new features.
4977
+<p>Some of these extensions are standard, and others are more exotic.
4978
+Extensions can be made by other people and distributed separately for
4981
+<p>Kernel extensions normally live in the kernel module subdirectory,
4982
+such as /lib/modules/2.4.0-test10/kernel/net/ipv4/netfilter. They are demand loaded if your
4983
+kernel was compiled with CONFIG_KMOD set, so you should not need to
4984
+manually insert them.
4986
+<p>Extensions to the iptables program are shared libraries which
4987
+usually live in /usr/local/lib/, although a distribution
4988
+would put them in /lib/iptables or /usr/lib/iptables.
4990
+<p>Extensions come in two types: new targets, and new matches (we'll
4991
+talk about new targets a little later). Some protocols automatically
4992
+offer new tests: currently these are TCP, UDP and ICMP as shown below.
4994
+<p>For these you will be able to specify the new tests on the command
4995
+line after the `-p' option, which will load the extension. For
4996
+explicit new tests, use the `-m' option to load the extension, after
4997
+which the extended options will be available.
4999
+<p>To get help on an extension, use the option to load it (`-p', `-j' or
5000
+`-m') followed by `-h' or `--help', eg:
5002
+# iptables -p tcp --help
5006
+<sect3>TCP Extensions
5009
+The TCP extensions are automatically loaded if `-p tcp' is specified.
5010
+They provide the following options (none of which match fragments).
5014
+<tag>--tcp-flags</tag> Followed by an optional `!', then two strings
5015
+of flags, allows you to filter on specific TCP flags. The first
5016
+string of flags is the mask: a list of flags you want to examine. The
5017
+second string of flags tells which one(s) should be set. For example,
5020
+# iptables -A INPUT --protocol tcp --tcp-flags ALL SYN,ACK -j DROP
5023
+This indicates that all flags should be examined (`ALL' is synonymous
5024
+with `SYN,ACK,FIN,RST,URG,PSH'), but only SYN and ACK should be set.
5025
+There is also an argument `NONE' meaning no flags.
5027
+<tag>--syn</tag> Optionally preceded by a `!', this is shorthand
5028
+ for `--tcp-flags SYN,RST,ACK SYN'.
5030
+<tag>--source-port</tag> followed by an optional `!', then either a
5031
+single TCP port, or a range of ports. Ports can be port names, as
5032
+listed in /etc/services, or numeric. Ranges are either two port names
5033
+separated by a `:', or (to specify greater than or equal to a given
5034
+port) a port with a `:' appended, or (to specify less than or equal to
5035
+a given port), a port preceded by a `:'.
5037
+<tag>--sport</tag> is synonymous with `--source-port'.
5039
+<tag>--destination-port</tag> and <tag>--dport</tag> are the same as
5040
+above, only they specify the destination, rather than source, port to match.
5043
+<tag>--tcp-option</tag> followed by an optional `!' and a number,
5044
+matches a packet with a TCP option equaling that number. A packet
5045
+which does not have a complete TCP header is dropped automatically if
5046
+an attempt is made to examine its TCP options.
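+<p>The port range notation used by `--source-port' and
+`--destination-port' can be sketched as a small parser. This is an
+illustration only: <tt>parse_port_range</tt> is a hypothetical name,
+and the <tt>services</tt> dictionary stands in for a lookup in
+/etc/services.

```python
from typing import Dict, Optional, Tuple


def parse_port_range(spec: str, services: Optional[Dict[str, int]] = None) -> Tuple[int, int]:
    """Turn `80', `www', `1024:', `:1023' or `6000:6010' into (low, high)."""
    services = services or {}

    def to_port(token: str) -> int:
        # Numeric ports are used as-is; names are looked up.
        return int(token) if token.isdigit() else services[token]

    low, colon, high = spec.partition(":")
    if not colon:
        port = to_port(low)
        return (port, port)
    # An omitted end of the range defaults to the extreme value.
    return (to_port(low) if low else 0, to_port(high) if high else 65535)


if __name__ == "__main__":
    for spec in ("80", "www", "1024:", ":1023", "6000:6010"):
        print(spec, parse_port_range(spec, {"www": 80}))
```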
5049
+<sect4>An Explanation of TCP Flags
5052
+It is sometimes useful to allow TCP connections in one direction, but
5053
+not the other. For example, you might want to allow connections to an
5054
+external WWW server, but not connections from that server.
5057
+The naive approach would be to block TCP packets coming from the
5058
+server. Unfortunately, TCP connections require packets going in both
5059
+directions to work at all.
5062
+The solution is to block only the packets used to request a
5063
+connection. These packets are called <bf>SYN</bf> packets (ok,
5064
+technically they're packets with the SYN flag set, and the RST and ACK
5065
+flags cleared, but we call them SYN packets for short). By
5066
+disallowing only these packets, we can stop attempted connections in their tracks.
5070
+The `--syn' flag is used for this: it is only valid for rules which
5071
+specify TCP as their protocol. For example, to specify TCP connection
5072
+attempts from 192.168.1.1:
5074
+-p TCP -s 192.168.1.1 --syn
5078
+This flag can be inverted by preceding it with a `!', which means
5079
+every packet other than the connection initiation.
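+<p>In terms of bits, `--tcp-flags' is a mask-and-compare, and `--syn'
+is one particular mask and comparison. The following Python sketch
+shows the idea; the flag bit values are those of the TCP header, while
+the function names are invented for this example.

```python
# Bit positions of the flags in the TCP header.
FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_flags(spec: str) -> int:
    """Turn `SYN,ACK', `ALL' or `NONE' into a bit mask."""
    if spec == "ALL":
        return 0x3F
    if spec == "NONE":
        return 0
    bits = 0
    for name in spec.split(","):
        bits |= FLAGS[name]
    return bits


def tcp_flags_match(mask: str, comp: str, packet_flags: int, invert: bool = False) -> bool:
    """Model `--tcp-flags mask comp': examine mask, require exactly comp set."""
    hit = (packet_flags & parse_flags(mask)) == parse_flags(comp)
    return hit != invert


def syn_match(packet_flags: int, invert: bool = False) -> bool:
    """Model `--syn', shorthand for `--tcp-flags SYN,RST,ACK SYN'."""
    return tcp_flags_match("SYN,RST,ACK", "SYN", packet_flags, invert)


if __name__ == "__main__":
    syn = FLAGS["SYN"]
    syn_ack = FLAGS["SYN"] | FLAGS["ACK"]
    print(syn_match(syn), syn_match(syn_ack), syn_match(syn_ack, invert=True))
```

+<p>A connection request (SYN only) matches, while the reply to one
+(SYN and ACK) does not, which is what lets connections be blocked in
+one direction only.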
5081
+<sect3>UDP Extensions
5084
+These extensions are automatically loaded if `-p udp' is specified.
5085
+They provide the options `--source-port', `--sport',
5086
+`--destination-port' and `--dport' as detailed for TCP above.
5088
+<sect3>ICMP Extensions
5091
+This extension is automatically loaded if `-p icmp' is specified. It
5092
+provides only one new option:
5096
+<tag>--icmp-type</tag> followed by an optional `!', then either an
5097
+icmp type name (eg `host-unreachable'), or a numeric type (eg. `3'),
5098
+or a numeric type and code separated by a `/' (eg. `3/3'). A list
5099
+of available icmp type names is given using `-p icmp --help'.
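+<p>The three forms accepted by `--icmp-type' can be sketched like
+this in Python. The names table below is a tiny hand-picked sample,
+not the full list that `-p icmp --help' prints, and the function names
+are hypothetical.

```python
from typing import Optional, Tuple

# A few well-known names, as (type, code); code None means "any code".
ICMP_NAMES = {
    "echo-reply": (0, None),
    "host-unreachable": (3, 1),
    "echo-request": (8, None),
}


def parse_icmp_type(spec: str) -> Tuple[int, Optional[int]]:
    """Turn a name, `3' or `3/3' into (type, code-or-None)."""
    if spec in ICMP_NAMES:
        return ICMP_NAMES[spec]
    icmp_type, slash, code = spec.partition("/")
    return (int(icmp_type), int(code) if slash else None)


def icmp_matches(spec: str, icmp_type: int, icmp_code: int, invert: bool = False) -> bool:
    """Model `--icmp-type spec' (or `--icmp-type ! spec')."""
    want_type, want_code = parse_icmp_type(spec)
    hit = icmp_type == want_type and (want_code is None or icmp_code == want_code)
    return hit != invert


if __name__ == "__main__":
    print(parse_icmp_type("3"), parse_icmp_type("3/3"), parse_icmp_type("host-unreachable"))
```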
+
+<sect3>Other Match Extensions
+
+The other extensions in the netfilter package are demonstration
+extensions, which (if installed) can be invoked with the `-m' option.
+
+<tag>mac</tag> This module must be explicitly specified with `-m mac'
+or `--match mac'. It is used for matching an incoming packet's source
+Ethernet (MAC) address, and thus only useful for packets traversing
+the PREROUTING, FORWARD and INPUT chains. It provides only one option:
+
+ <tag>--mac-source</tag> followed by an optional `!', then an
+ ethernet address in colon-separated hexbyte notation, eg.
+ `--mac-source 00:60:08:91:CC:B7'.
+
+<tag>limit</tag> This module must be explicitly specified with `-m
+limit' or `--match limit'. It is used to restrict the rate of
+matches, such as for suppressing log messages. It will only match a
+given number of times per second (by default 3 matches per hour,
+with a burst of 5). It takes two optional arguments:
+
+ <tag>--limit</tag> followed by a number; specifies the maximum
+ average number of matches to allow per second. The number can
+ specify units explicitly, using `/second', `/minute', `/hour' or
+ `/day', or parts of them (so `5/second' is the same as `5/s').
+
+ <tag>--limit-burst</tag> followed by a number, indicating the
+ maximum burst before the above limit kicks in.
+
+This match can often be used with the LOG target to do rate-limited
+logging. To understand how it works, let's look at the following
+rule, which logs packets with the default limit parameters:
+
+# iptables -A FORWARD -m limit -j LOG
+
+The first time this rule is reached, the packet will be logged; in
+fact, since the default burst is 5, the first five packets will be
+logged. After this, it will be twenty minutes before a packet will be
+logged from this rule, regardless of how many packets reach it. Also,
+for every twenty minutes that pass without matching a packet, one of
+the burst will be regained; if no packets hit the rule for 100
+minutes, the burst will be fully recharged; back where we started.
+
+<p>Note: you cannot currently create a rule with a recharge time
+greater than about 59 hours, so if you set an average rate of one per
+day, then your burst rate must be less than 3.
+
+<p>You can also use this module to avoid various denial of service
+attacks (DoS) with a faster rate to increase responsiveness.
+
+<p>Syn-flood protection:
+
+# iptables -A FORWARD -p tcp --syn -m limit --limit 1/s -j ACCEPT
+
+Furtive port scanner:
+
+# iptables -A FORWARD -p tcp --tcp-flags SYN,ACK,FIN,RST RST -m limit --limit 1/s -j ACCEPT
+
+# iptables -A FORWARD -p icmp --icmp-type echo-request -m limit --limit 1/s -j ACCEPT
+
+This module works like a "hysteresis door", as shown in the graph
+below.
+
+[Graph: packet rate against time (s). The rule matches until the rate
+climbs to the `Edge of DoS' level, stops matching until the rate falls
+back to the `End of DoS' level (= limit), and then matches again.]
+
+Say we match one packet per second with a five packet
+burst, but packets start coming in at four per second, for three
+seconds, then start again in another three seconds.
+
+[Graph: total packets against time (0 to 12 seconds), showing the
+line rate and two floods, Flood 1 and Flood 2. Key: Y marks a packet
+that matched the rule, N one that didn't match.]
+
+You can see that the first five packets are allowed to exceed the one
+packet per second, then the limiting kicks in. If there is a pause,
+another burst is allowed but not past the maximum rate set by the
+rule (1 packet per second after the burst is used).
+
+This module attempts to match various characteristics of the packet
+creator, for locally-generated packets. It is only valid in the
+OUTPUT chain, and even then some packets (such as ICMP ping responses)
+may have no owner, and hence never match.
+
+ <tag>--uid-owner userid</tag>
+Matches if the packet was created by a process with the given
+effective (numerical) user id.
+ <tag>--gid-owner groupid</tag>
+Matches if the packet was created by a process with the given
+effective (numerical) group id.
+ <tag>--pid-owner processid</tag>
+Matches if the packet was created by a process with the given
+process id.
+ <tag>--sid-owner sessionid</tag>
+Matches if the packet was created by a process in the given session
+group.
+
+<tag>unclean</tag> This experimental module must be explicitly
+specified with `-m unclean' or `--match unclean'. It does various
+random sanity checks on packets. This module has not been audited,
+and should not be used as a security device (it probably makes things
+worse, since it may well have bugs itself). It provides no options.
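+<p>A rough sketch of some of the match extensions above in complete
+rules, together with the arithmetic behind the limit match: a burst of
+N is fully regained after N divided by the rate. The <tt>IPTABLES</tt>
+variable, the function names and the user id 1000 are assumptions made
+for illustration; by default the script only prints the commands.

```shell
# Dry run by default: print each rule instead of installing it.
# Run with IPTABLES=iptables (as root) to apply the rules for real.
IPTABLES="${IPTABLES:-echo iptables}"

# Minutes without a match needed to regain a full burst.
# usage: recharge_minutes BURST MATCHES_PER_HOUR
recharge_minutes() {
    echo $(( $1 * 60 / $2 ))
}

match_examples() {
    # mac: accept packets arriving from one Ethernet address
    $IPTABLES -A INPUT -m mac --mac-source 00:60:08:91:CC:B7 -j ACCEPT
    # limit: the default parameters (3 per hour, burst of 5) spelled out
    $IPTABLES -A FORWARD -m limit --limit 3/hour --limit-burst 5 -j LOG
    # owner: accept packets created by the (hypothetical) user id 1000
    $IPTABLES -A OUTPUT -m owner --uid-owner 1000 -j ACCEPT
}

recharge_minutes 5 3    # default limit: prints 100
match_examples
```

+The 100 printed for the default limit is the same 100 minutes worked
+out by hand in the description of the limit match.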
+
+<sect3>The State Match
+
+<p>The most useful match criterion is supplied by the `state'
+extension, which interprets the connection-tracking analysis of the
+`ip_conntrack' module. This is highly recommended.
+
+<p>Specifying `-m state' allows an additional `--state' option, which
+is a comma-separated list of states to match (the `!' flag indicates
+<bf>not</bf> to match those states). These states are:
+
+<tag>NEW</tag> A packet which creates a new connection.
+
+<tag>ESTABLISHED</tag> A packet which belongs to an existing
+connection (i.e., a reply packet, or outgoing packet on a connection
+which has seen replies).
+
+<tag>RELATED</tag> A packet which is related to, but not part of, an
+existing connection, such as an ICMP error, or (with the FTP module
+inserted), a packet establishing an ftp data connection.
+
+<tag>INVALID</tag> A packet which could not be identified for some
+reason: this includes running out of memory and ICMP errors which
+don't correspond to any known connection. Generally these packets
+should be dropped.
+
+An example of this powerful match extension would be:
+
+# iptables -A FORWARD -i ppp0 -m state ! --state NEW -j DROP
+
+<sect1>Target Specifications
+
+<p>Now that we know what examinations we can do on a packet, we need a
+way of saying what to do to the packets which match our tests. This is
+called a rule's <bf>target</bf>.
+
+<p>There are two very simple built-in targets: DROP and ACCEPT. We've
+already met them. If a rule matches a packet and its target is one of
+these two, no further rules are consulted: the packet's fate has been
+decided.
+
+<p>There are two types of targets other than the built-in ones:
+extensions and user-defined chains.
+
+<sect2>User-defined chains
+
+One powerful feature which <tt>iptables</tt> inherits from
+<tt>ipchains</tt> is the ability for the user to create new chains, in
+addition to the three built-in ones (INPUT, FORWARD and OUTPUT). By
+convention, user-defined chains are lower-case to distinguish them
+(we'll describe how to create new user-defined chains below in <ref
+id="chain-ops" name="Operations on an Entire Chain">).
+
+When a packet matches a rule whose target is a user-defined chain, the
+packet begins traversing the rules in that user-defined chain. If
+that chain doesn't decide the fate of the packet, then once traversal
+on that chain has finished, traversal resumes on the next rule in the
+previous chain.
+
+Time for more ASCII art. Consider two (silly) chains: <tt>INPUT</tt> (the
+built-in chain) and <tt>test</tt> (a user-defined chain).
+
+  `INPUT'                         `test'
+ ----------------------------    ----------------------------
+ | Rule1: -p ICMP -j DROP   |    | Rule1: -s 192.168.1.1    |
+ |--------------------------|    |--------------------------|
+ | Rule2: -p TCP -j test    |    | Rule2: -d 192.168.1.1    |
+ |--------------------------|    ----------------------------
+ | Rule3: -p UDP -j DROP    |
+ ----------------------------
+
+Consider a TCP packet coming from 192.168.1.1, going to 1.2.3.4. It
+enters the <tt>INPUT</tt> chain, and gets tested against Rule1 - no match.
+Rule2 matches, and its target is <tt>test</tt>, so the next rule examined
+is the start of <tt>test</tt>. Rule1 in <tt>test</tt> matches, but doesn't
+specify a target, so the next rule is examined, Rule2. This doesn't
+match, so we have reached the end of the chain. We return to the
+<tt>INPUT</tt> chain, where we had just examined Rule2, so we now examine
+Rule3, which doesn't match either.
+
+So the packet path is:
+
+                         v    __________________________
+  `INPUT'                |   /    `test'                v
+ ------------------------|--/    -----------------------|----
+ | Rule1                 | /|    | Rule1                |   |
+ |-----------------------|/-|    |----------------------|---|
+ | Rule2                 /  |    | Rule2                |   |
+ |--------------------------|    -----------------------v----
+ | Rule3                 /--+___________________________/
+ ------------------------|---
+
+<p>User-defined chains can jump to other user-defined chains (but
+don't make loops: your packets will be dropped if they're found to
+be in a loop).
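+<p>The two silly chains above could be built roughly like this. It is
+only a sketch: the <tt>IPTABLES</tt> variable and the function name are
+made up for illustration, and by default the script prints the
+commands rather than running them.

```shell
# Dry run by default: print each rule instead of installing it.
# Run with IPTABLES=iptables (as root) to apply the rules for real.
IPTABLES="${IPTABLES:-echo iptables}"

build_test_chain() {
    # Create the user-defined chain and give it the two rules from the
    # diagram. Neither rule has a target, so a match only updates the
    # rule's counters and traversal carries on with the next rule.
    $IPTABLES -N test
    $IPTABLES -A test -s 192.168.1.1
    $IPTABLES -A test -d 192.168.1.1

    # The three INPUT rules; Rule2 jumps to the new chain.
    $IPTABLES -A INPUT -p icmp -j DROP
    $IPTABLES -A INPUT -p tcp -j test
    $IPTABLES -A INPUT -p udp -j DROP
}

build_test_chain
```

+The chain has to exist before any rule can jump to it, which is why
+`-N test' comes first.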
+
+<sect2>Extensions to iptables: New Targets
+
+<p>The other type of extension is a target. A target extension
+consists of a kernel module, and an optional extension to
+<tt>iptables</tt> to provide new command line options. There are
+several extensions in the default netfilter distribution:
+
+<tag>LOG</tag> This module provides kernel logging of matching
+packets. It provides these additional options:
+
+ <tag>--log-level</tag> Followed by a level number or name. Valid
+ names are (case-insensitive) `debug', `info', `notice', `warning',
+ `err', `crit', `alert' and `emerg', corresponding to numbers 7
+ through 0. See the man page for syslog.conf for an explanation of
+ these levels. The default is `warning'.
+
+ <tag>--log-prefix</tag> Followed by a string of up to 29 characters,
+ this message is sent at the start of the log message, to allow it to
+ be uniquely identified.
+
+ This module is most useful after a limit match, so you don't flood
+ your logs.
+
+<tag>REJECT</tag> This module has the same effect as `DROP', except
+that the sender is sent an ICMP `port unreachable' error message.
+Note that the ICMP error message is not sent if (see RFC 1122):
+
+<item> The packet being filtered was an ICMP error message in the
+first place, or some unknown ICMP type.
+
+<item> The packet being filtered was a non-head fragment.
+
+<item> We've sent too many ICMP error messages to that destination
+recently (see /proc/sys/net/ipv4/icmp_ratelimit).
+
+REJECT also takes a `--reject-with' optional argument which alters the
+reply packet used: see the manual page.
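+<p>As a sketch of the two targets working together, the rules below
+log incoming telnet connection attempts at a limited rate and then
+refuse them. The port (23), the log prefix, the <tt>IPTABLES</tt>
+variable and the function name are all chosen for illustration; by
+default the script only prints the commands.

```shell
# Dry run by default: print each rule instead of installing it.
# Run with IPTABLES=iptables (as root) to apply the rules for real.
IPTABLES="${IPTABLES:-echo iptables}"

log_and_reject_telnet() {
    # Log at most three attempts per hour (after the default burst of 5).
    $IPTABLES -A INPUT -p tcp --dport 23 --syn -m limit --limit 3/hour \
        -j LOG --log-level info --log-prefix "telnet attempt: "
    # Refuse every attempt with a TCP reset instead of the default
    # ICMP "port unreachable" message.
    $IPTABLES -A INPUT -p tcp --dport 23 --syn -j REJECT --reject-with tcp-reset
}

log_and_reject_telnet
```

+LOG does not decide the packet's fate, so the packet carries on to the
+REJECT rule after being logged.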
+
+<sect2>Special Built-In Targets
+
+<p>There are two special built-in targets: <tt>RETURN</tt> and
+<tt>QUEUE</tt>.
+
+<p><tt>RETURN</tt> has the same effect as falling off the end of a
+chain: for a rule in a built-in chain, the policy of the chain is
+executed. For a rule in a user-defined chain, the traversal continues
+at the previous chain, just after the rule which jumped to this chain.
+
+<p><tt>QUEUE</tt> is a special target, which queues the packet for
+userspace processing. For this to be useful, two further components are
+required:
+
+<item>a "queue handler", which deals with the actual mechanics of
+passing packets between the kernel and userspace; and
+<item>a userspace application to receive, possibly manipulate, and
+issue verdicts on packets.
+
+The standard queue handler for IPv4 iptables is the ip_queue module,
+which is distributed with the kernel and marked as experimental.
+
+The following is a quick example of how to use iptables to queue packets
+for userspace processing:
+
+# modprobe iptable_filter
+# modprobe ip_queue
+# iptables -A OUTPUT -p icmp -j QUEUE
+
+With this rule, locally generated outgoing ICMP packets (as created with,
+say, ping) are passed to the ip_queue module, which then attempts to deliver
+the packets to a userspace application. If no userspace application is
+waiting, the packets are dropped.
+
+<p>To write a userspace application, use the libipq API. This is
+distributed with iptables. Example code may be found in the testsuite
+tools (e.g. redirect.c) in CVS.
+
+<p>The status of ip_queue may be checked via:
+
+The maximum length of the queue (i.e. the number of packets delivered
+to userspace with no verdict issued back) may be controlled via:
+
+/proc/sys/net/ipv4/ip_queue_maxlen
+
+The default value for the maximum queue length is 1024. Once this limit
+is reached, new packets will be dropped until the length of the queue falls
+below the limit again. Nice protocols such as TCP interpret dropped packets
+as congestion, and will hopefully back off when the queue fills up. However,
+it may take some experimenting to determine an ideal maximum queue length
+for a given situation if the default value is too small.
+
+<sect1>Operations on an Entire Chain<label id="chain-ops">
+
+A very useful feature of <tt>iptables</tt> is the ability to group
+related rules into chains. You can call the chains whatever you want,
+but I recommend using lower-case letters to avoid confusion with the
+built-in chains and targets. Chain names can be up to 31 letters
+long.
+
+<sect2>Creating a New Chain
+
+Let's create a new chain. Because I am such an imaginative fellow,
+I'll call it <tt>test</tt>. We use the `-N' or `--new-chain' options:
+
+# iptables -N test
+
+It's that simple. Now you can put rules in it as detailed above.
+
+<sect2>Deleting a Chain
+
+Deleting a chain is simple as well, using the `-X' or `--delete-chain'
+options. Why `-X'? Well, all the good letters were taken.
+
+# iptables -X test
+
+There are a couple of restrictions to deleting chains: they must be
+empty (see <ref id="flushing" name="Flushing a Chain"> below) and they
+must not be the target of any rule. You can't delete any of the three
+built-in chains.
+
+If you don't specify a chain, then <em>all</em> user-defined chains
+will be deleted, if possible.
+
+<sect2> Flushing a Chain<label id="flushing">
+
+There is a simple way of emptying all rules out of a chain, using the
+`-F' (or `--flush') command.
+
+# iptables -F FORWARD
+
+If you don't specify a chain, then <em>all</em> chains will be flushed.
+
+<sect2>Listing a Chain
+
+You can list all the rules in a chain by using the `-L' (or `--list')
+command.
+
+The `refcnt' listed for each user-defined chain is the number of rules
+which have that chain as their target. This must be zero (and the
+chain be empty) before this chain can be deleted.
+
+If the chain name is omitted, all chains are listed, even empty ones.
+
+There are three options which can accompany `-L'. The `-n' (numeric)
+option is very useful as it prevents <tt>iptables</tt> from trying to
+look up the IP addresses, which (if you are using DNS like most people)
+will cause large delays if your DNS is not set up properly, or you
+have filtered out DNS requests. It also causes TCP and UDP ports to
+be printed out as numbers rather than names.
+
+The `-v' option shows you all the details of the rules, such as the
+packet and byte counters, the TOS comparisons, and the interfaces.
+Otherwise these values are omitted.
+
+Note that the packet and byte counters are printed out using the
+suffixes `K', `M' or `G' for 1000, 1,000,000 and 1,000,000,000
+respectively. Using the `-x' (expand numbers) flag as well prints the
+full numbers, no matter how large they are.
+
+<sect2>Resetting (Zeroing) Counters
+
+It is useful to be able to reset the counters. This can be done with
+the `-Z' (or `--zero') option.
+
+Consider the following:
+
+# iptables -L FORWARD
+# iptables -Z FORWARD
+
+In the above example, some packets could pass through between the `-L'
+and `-Z' commands. For this reason, you can use the `-L' and `-Z'
+<em>together</em>, to reset the counters while reading them.
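+<p>The chain operations described so far can be strung together into
+one life cycle. This is a sketch only: the <tt>IPTABLES</tt> variable
+and the function name are invented for illustration, and by default
+the commands are printed rather than run.

```shell
# Dry run by default: print each command instead of running it.
# Run with IPTABLES=iptables (as root) to execute the commands for real.
IPTABLES="${IPTABLES:-echo iptables}"

chain_lifecycle() {
    $IPTABLES -N test                   # create the chain
    $IPTABLES -A test -p icmp -j DROP   # put a rule in it
    $IPTABLES -L test -n -v -x          # list: numeric, verbose, full counters
    $IPTABLES -L test -n -v -Z          # list and zero the counters in one go
    $IPTABLES -F test                   # flush: the chain must be empty ...
    $IPTABLES -X test                   # ... before it can be deleted
}

chain_lifecycle
```

+The final `-X' only succeeds because no rule in another chain uses
+<tt>test</tt> as its target.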
+
+<sect2>Setting Policy<label id="policy">
+
+We glossed over what happens when a packet hits the end of a built-in
+chain when we discussed how a packet walks through chains earlier. In
+this case, the <bf>policy</bf> of the chain determines the fate of the
+packet. Only built-in chains (<tt>INPUT</tt>, <tt>OUTPUT</tt> and
+<tt>FORWARD</tt>) have policies, because if a packet falls off the end
+of a user-defined chain, traversal resumes at the previous chain.
+
+The policy can be either <tt>ACCEPT</tt> or <tt>DROP</tt>, for
+example:
+
+# iptables -P FORWARD DROP
+
+<sect> Using ipchains and ipfwadm<label id="oldstyle">
+
+<p> There are modules in the netfilter distribution called ipchains.o
+and ipfwadm.o. Insert one of these in your kernel (NOTE: they are
+incompatible with ip_tables.o!). Then you can use ipchains or ipfwadm
+just like the good old days.
+
+<p> This will be supported for some time yet. I think a reasonable
+formula is 2 * [notice of replacement - initial stable release],
+beyond the date that a stable release of the replacement is available.
+This means that support will probably be dropped in Linux 2.6 or 2.8.
+
+<sect> Mixing NAT and Packet Filtering
+
+It's common to want to do Network Address Translation (see the NAT
+HOWTO) and packet filtering. The good news is that they mix extremely
+well.
+
+<p>You design your packet filtering completely ignoring any NAT you
+are doing. The sources and destinations seen by the packet filter
+will be the `real' sources and destinations. For example, if you are
+doing DNAT to send any connections to 1.2.3.4 port 80 through to
+10.1.1.1 port 8080, the packet filter would see packets going to
+10.1.1.1 port 8080 (the real destination), not 1.2.3.4 port 80.
+Similarly, you can ignore masquerading: packets will seem to come from
+their real internal IP addresses (say 10.1.1.1), and replies will seem
+to go back there.
+
+<p>You can use the `state' match extension without making the packet
+filter do any extra work, since NAT requires connection tracking
+anyway. To enhance the simple masquerading example in the NAT HOWTO
+to disallow any new connections from coming in the ppp0 interface, you
+would do this:
+
+# Masquerade out ppp0
+iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE
+
+# Disallow NEW and INVALID incoming or forwarded packets from ppp0.
+iptables -A INPUT -i ppp0 -m state --state NEW,INVALID -j DROP
+iptables -A FORWARD -i ppp0 -m state --state NEW,INVALID -j DROP
+
+# Turn on IP forwarding
+echo 1 > /proc/sys/net/ipv4/ip_forward
+
+<sect> Differences Between iptables and ipchains<label id="Appendix-A">
+
+<item> Firstly, the names of the built-in chains have changed from
+lower case to UPPER case, because the INPUT and OUTPUT chains now only
+get locally-destined and locally-generated packets. They used to see
+all incoming and all outgoing packets respectively.
+
+<item> The `-i' flag now means the incoming interface, and only works
+in the INPUT and FORWARD chains. Rules in the FORWARD or OUTPUT
+chains that used `-i' should be changed to `-o'.
+
+<item> TCP and UDP ports now need to be spelled out with the
+--source-port or --sport (or --destination-port/--dport) options, and
+must be placed after the `-p tcp' or `-p udp' options, as this loads
+the TCP or UDP extensions respectively.
+
+<item> The TCP -y flag is now --syn, and must be after `-p tcp'.
+
+<item> The DENY target is now DROP, finally.
+
+<item> Zeroing single chains while listing them works.
+
+<item> Zeroing built-in chains also clears policy counters.
+
+<item> Listing chains gives you the counters as an atomic snapshot.
+
+<item> REJECT and LOG are now extended targets, meaning they are
+separate kernel modules.
+
+<item> Chain names can be up to 31 characters.
+
+<item> MASQ is now MASQUERADE and uses a different syntax. REDIRECT,
+while keeping the same name, has also undergone a syntax change. See
+the NAT-HOWTO for more information on how to configure both of these.
+
+<item> The -o option is no longer used to direct packets to the userspace
+device (see -i above). Packets are now sent to userspace via the QUEUE
+target.
+
+<item> Probably heaps of other things I forgot.
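+<p>To make a few of these differences concrete, here is a sketch of
+two rules converted by hand. The ipchains lines in the comments are
+written from memory of the old syntax and should be treated as
+approximate; the addresses, the port, the <tt>IPTABLES</tt> variable
+and the function name are illustrative. By default the script only
+prints the new commands.

```shell
# Dry run by default: print each rule instead of installing it.
# Run with IPTABLES=iptables (as root) to apply the rules for real.
IPTABLES="${IPTABLES:-echo iptables}"

converted_rules() {
    # old (approximate): ipchains -A input -p tcp -y -s 192.168.1.1 -j DENY
    # Chain name in upper case, -y becomes --syn, DENY becomes DROP.
    $IPTABLES -A INPUT -p tcp -s 192.168.1.1 --syn -j DROP

    # old (approximate): ipchains -A output -i ppp0 -p udp -d 0/0 53 -j ACCEPT
    # Outgoing interface is now -o, and the port needs --dport after -p udp.
    $IPTABLES -A OUTPUT -o ppp0 -p udp --dport 53 -j ACCEPT
}

converted_rules
```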
+
+<sect> Advice on Packet Filter Design
+
+Common wisdom in the computer security arena is to block everything,
+then open up holes as necessary. This is usually phrased `that which
+is not explicitly allowed is prohibited'. I recommend this approach
+if security is your maximal concern.
+
+<p>Do not run any services you do not need to, even if you think you
+have blocked access to them.
+
+<p>If you are creating a dedicated firewall, start by running nothing,
+and blocking all packets, then add services and let packets through as
+required.
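+<p>A minimal sketch of `block everything, then open up holes', assuming
+a box that should accept nothing but loopback traffic, replies to its
+own connections and one service. The choice of ssh (port 22), the
+<tt>IPTABLES</tt> variable and the function name are hypothetical; by
+default the script only prints the commands.

```shell
# Dry run by default: print each rule instead of installing it.
# Run with IPTABLES=iptables (as root) to apply the rules for real.
IPTABLES="${IPTABLES:-echo iptables}"

default_deny() {
    # That which is not explicitly allowed is prohibited.
    $IPTABLES -P INPUT DROP
    $IPTABLES -P FORWARD DROP

    # Open holes one at a time: loopback traffic, replies on
    # connections this box started, and new connections to one service.
    $IPTABLES -A INPUT -i lo -j ACCEPT
    $IPTABLES -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    $IPTABLES -A INPUT -p tcp --dport 22 --syn -j ACCEPT
}

default_deny
```

+If you try this over a remote login, remember that the DROP policy
+takes effect before the ACCEPT rules are added.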
+
+<p>I recommend security in depth: combine tcp-wrappers (for
+connections to the packet filter itself), proxies (for connections
+passing through the packet filter), route verification and packet
+filtering. Route verification is where a packet which comes from an
+unexpected interface is dropped: for example, if your internal network
+has addresses 10.1.1.0/24, and a packet with that source address comes
+in your external interface, it will be dropped. This can be enabled
+for one interface (ppp0) like so:
+
+# echo 1 > /proc/sys/net/ipv4/conf/ppp0/rp_filter
+
+Or for all existing and future interfaces like this:
+
+# for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
+#     echo 1 > $f
+# done
+
+Debian does this by default where possible. If you have asymmetric
+routing (ie. you expect packets coming in from strange directions),
+you will want to disable this filtering on those interfaces.
+
+<p>Logging is useful when setting up a firewall if something isn't
+working, but on a production firewall, always combine it with the
+`limit' match, to prevent someone from flooding your logs.
+
+<p>I highly recommend connection tracking for secure systems: it
+introduces some overhead, as all connections are tracked, but is very
+useful for controlling access to your networks. You may need to load
+the `ip_conntrack.o' module if your kernel does not load modules
+automatically, and it's not built into the kernel. If you want to
+accurately track complex protocols, you'll need to load the
+appropriate helper module (eg. `ip_conntrack_ftp.o').
+
+# iptables -N no-conns-from-ppp0
+# iptables -A no-conns-from-ppp0 -m state --state ESTABLISHED,RELATED -j ACCEPT
+# iptables -A no-conns-from-ppp0 -m state --state NEW -i ! ppp0 -j ACCEPT
+# iptables -A no-conns-from-ppp0 -i ppp0 -m limit -j LOG --log-prefix "Bad packet from ppp0:"
+# iptables -A no-conns-from-ppp0 -i ! ppp0 -m limit -j LOG --log-prefix "Bad packet not from ppp0:"
+# iptables -A no-conns-from-ppp0 -j DROP
+
+# iptables -A INPUT -j no-conns-from-ppp0
+# iptables -A FORWARD -j no-conns-from-ppp0
+
+<p>Building a good firewall is beyond the scope of this HOWTO, but my
+advice is `always be minimalist'. See the Security HOWTO for more
+information on testing and probing your box.
Index: iptables-1.4.12/Makefile.am
===================================================================
--- iptables-1.4.12.orig/Makefile.am 2011-11-07 13:57:20.000000000 -0600
+++ iptables-1.4.12/Makefile.am 2011-11-07 13:58:55.000000000 -0600
ACLOCAL_AMFLAGS = -I m4
AUTOMAKE_OPTIONS = foreign subdir-objects
-SUBDIRS = libiptc libxtables
+SUBDIRS = libiptc libxtables howtos