~ubuntu-branches/ubuntu/saucy/iptables/saucy-proposed : revision 25

1

Author: Soren Hansen <soren@ubuntu.com>

2

Description: Revert changes between 1.4.1.1-3 and 1.4.1.1-4, thus bringing back

3

the howtos.

4

Forwarded: no

5

6

Index: iptables-1.4.12/howtos/Makefile

7

===================================================================

8

--- /dev/null 1970-01-01 00:00:00.000000000 +0000

9

+++ iptables-1.4.12/howtos/Makefile 2011-11-07 13:57:14.000000000 -0600

10

@@ -0,0 +1,10 @@

11

+all:

12

+	for i in *.sgml; do sgml2html $$i; done

13

+

14

+install:

15

+	for i in *.html; do install -D -m 0644 $$i ${DESTDIR}/howtos/$$i; done

16

+

17

+clean:

18

+	-rm *.html

19

+

20

+.PHONY: all clean install

21

Index: iptables-1.4.12/howtos/NAT-HOWTO.sgml

22

===================================================================

23

--- /dev/null 1970-01-01 00:00:00.000000000 +0000

24

+++ iptables-1.4.12/howtos/NAT-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600

25

@@ -0,0 +1,609 @@

26

+<!doctype linuxdoc system>

27

+

28

+<!-- This is the Linux NAT HOWTO.

29

+ -->

30

+

31

+

32

+

33

+<article>

34

+

35

+

36

+

37

+<title>Linux 2.4 NAT HOWTO

38

+<author>Rusty Russell, mailing list <tt>netfilter@lists.samba.org</tt>

39

+<date>$Revision: 1.18 $ $Date: 2002/01/14 09:35:13 $

40

+<abstract>

41

+This document describes how to do masquerading, transparent proxying,

42

+port forwarding, and other forms of Network Address Translations with

43

+the 2.4 Linux Kernels.

44

+</abstract>

45

+

46

+

47

+<toc>

48

+

49

+

50

+

51

+<sect>Introduction<label id="intro">

52

+

53

+

54

+Welcome, gentle reader.

55

+

56

+

57

+You are about to delve into the fascinating (and sometimes horrid)

58

+world of NAT: Network Address Translation, and this HOWTO is going to

59

+be your somewhat accurate guide to the 2.4 Linux Kernel and beyond.

60

+

61

+In Linux 2.4, an infrastructure for mangling packets was

62

+introduced, called `netfilter'. A layer on top of this provides NAT,

63

+completely reimplemented from previous kernels.

64

+

65

66

+

67

+<sect>Where is the official Web Site and List?

68

+

69

+There are three official sites:

70

+<itemize>

71

+<item>Thanks to <url url="http://netfilter.filewatcher.org/" name="Filewatcher">.

72

+<item>Thanks to <url url="http://netfilter.samba.org/" name="The Samba Team and SGI">.

73

+<item>Thanks to <url url="http://netfilter.gnumonks.org/" name="Harald Welte">.

74

+</itemize>

75

+

76

+You can reach all of them using round-robin DNS via

77

+<url url="http://www.netfilter.org/"> and <url url="http://www.iptables.org/">.

78

+

79

+For the official netfilter mailing list, see

80

+<url url="http://www.netfilter.org/contact.html#list" name="netfilter List">.

81

+

82

+<sect1>What is Network Address Translation?

83

+

84

+

85

+Normally, packets on a network travel from their source (such as your

86

+home computer) to their destination (such as www.gnumonks.org)

87

+through many different links: about 19 from where I am in Australia.

88

+None of these links really alter your packet: they just send it

89

+onward.

90

+

91

+

92

+If one of these links were to do NAT, then it would alter the source
+or destination of the packet as it passes through. As you can

94

+imagine, this is not how the system was designed to work, and hence

95

+NAT is always something of a crock. Usually the link doing NAT will

96

+remember how it mangled a packet, and when a reply packet passes

97

+through the other way, it will do the reverse mangling on that reply

98

+packet, so everything works.

99

+

100

+<sect1>Why Would I Want To Do NAT?

101

+

102

+In a perfect world, you wouldn't. Meanwhile, the main reasons are:

103

+

104

+<descrip>

105

+<tag/Modem Connections To The Internet/ Most ISPs give you a single IP

106

+address when you dial up to them. You can send out packets with any

107

+source address you want, but only replies to packets with this source

108

+IP address will return to you. If you want to use multiple different

109

+machines (such as a home network) to connect to the Internet through

110

+this one link, you'll need NAT.

111

+

112

+This is by far the most common use of NAT today, commonly known as

113

+`masquerading' in the Linux world. I call this SNAT, because you

114

+change the <bf>source</bf> address of the first packet.

115

+

116

+<tag/Multiple Servers/ Sometimes you want to change where packets

117

+heading into your network will go. Frequently this is because (as

118

+above), you have only one IP address, but you want people to be able

119

+to get into the boxes behind the one with the `real' IP address. If

120

+you rewrite the destination of incoming packets, you can manage this.

121

+This type of NAT was called port-forwarding under previous versions of

122

+Linux.

123

+

124

+A common variation of this is load-sharing, where the mapping

125

+ranges over a set of machines, fanning packets out to them. If you're

126

+doing this on a serious scale, you may want to look at

127

+

128

+<url url="http://linuxvirtualserver.org/" name="Linux Virtual Server">.

129

+

130

+<tag/Transparent Proxying/ Sometimes you want to pretend that each

131

+packet which passes through your Linux box is destined for a program

132

+on the Linux box itself. This is used to make transparent proxies: a

133

+proxy is a program which stands between your network and the outside

134

+world, shuffling communication between the two. The transparent part

135

+is because your network won't even know it's talking to a proxy,

136

+unless of course, the proxy doesn't work.

137

+

138

+Squid can be configured to work this way, and it was called

139

+redirection or transparent proxying under previous Linux versions.

140

+</descrip>

141

+

142

+<sect>The Two Types of NAT

143

+

144

+I divide NAT into two different types: <bf>Source NAT</bf> (SNAT)

145

+and <bf>Destination NAT</bf> (DNAT).

146

+

147

+Source NAT is when you alter the source address of the first

148

+packet: i.e. you are changing where the connection is coming from.

149

+Source NAT is always done post-routing, just before the packet goes

150

+out onto the wire. Masquerading is a specialized form of SNAT.

151

+

152

+Destination NAT is when you alter the destination address of the

153

+first packet: i.e. you are changing where the connection is going to.

154

+Destination NAT is always done before routing, when the packet first

155

+comes off the wire. Port forwarding, load sharing, and transparent

156

+proxying are all forms of DNAT.

157

+

158

+<sect>Quick Translation From 2.0 and 2.2 Kernels

159

+

160

+Sorry to those of you still shell-shocked from the 2.0 (ipfwadm) to

161

+2.2 (ipchains) transition. There's good and bad news.

162

+

163

+Firstly, you can simply use ipchains and ipfwadm as before. To do

164

+this, you need to insmod the `ipchains.o' or `ipfwadm.o' kernel

165

+modules found in the latest netfilter distribution. These are

166

+mutually exclusive (you have been warned), and should not be combined

167

+with any other netfilter modules.

168

+

169

+Once one of these modules is installed, you can use ipchains and

170

+ipfwadm as normal, with the following differences:

171

+

172

+<itemize>

173

+<item> Setting the masquerading timeouts with ipchains -M -S, or

174

+ ipfwadm -M -s does nothing. Since the timeouts are longer for

175

+ the new NAT infrastructure, this should not matter.

176

+

177

+<item> The init_seq, delta and previous_delta fields in the verbose

178

+ masquerade listing are always zero.

179

+

180

+<item> Zeroing and listing the counters at the same time `-Z -L' does

181

+ not work any more: the counters will not be zeroed.

182

+

183

+<item> The backward compatibility layer doesn't scale very well for

184

+ large numbers of connections: don't use it for your corporate

185

+ gateway!

186

+</itemize>

187

+

188

+Hackers may also notice:

189

+

190

+<itemize>

191

+<item> You can now bind to ports 61000-65095 even if you're

192

+ masquerading. The masquerading code used to assume anything

193

+ in this range was fair game, so programs couldn't use it.

194

+

195

+<item> The (undocumented) `getsockname' hack, which transparent proxy

196

+ programs could use to find out the real destinations of

197

+ connections no longer works.

198

+

199

+<item> The (undocumented) bind-to-foreign-address hack is also not

200

+ implemented; this was used to complete the illusion of

201

+ transparent proxying.

202

+

203

+</itemize>

204

+

205

+<sect1> I just want masquerading! Help!

206

+

207

+This is what most people want. If you have a dynamically allocated

208

+IP PPP dialup (if you don't know, this is you), you simply want to

209

+tell your box that all packets coming from your internal network

210

+should be made to look like they are coming from the PPP dialup box.

211

+

212

+<tscreen><verb>

213

+# Load the NAT module (this pulls in all the others).

214

+modprobe iptable_nat

215

+

216

+# In the NAT table (-t nat), Append a rule (-A) after routing

217

+# (POSTROUTING) for all packets going out ppp0 (-o ppp0) which says to

218

+# MASQUERADE the connection (-j MASQUERADE).

219

+iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE

220

+

221

+# Turn on IP forwarding

222

+echo 1 > /proc/sys/net/ipv4/ip_forward

223

+</verb></tscreen>

224

+

225

+Note that you are not doing any packet filtering here: for that, see

226

+the Packet Filtering HOWTO: `Mixing NAT and Packet Filtering'.

227

+

228

+<sect1> What about ipmasqadm?

229

+

230

+This is a much more niche user base, so I didn't worry about

231

+backward compatibility as much. You can simply use `iptables -t nat'

232

+to do port forwarding. So for example, in Linux 2.2 you might have

233

+done:

234

+

235

+<tscreen><verb>

236

+# Linux 2.2

237

+# Forward TCP packets going to port 8080 on 1.2.3.4 to 192.168.1.1's port 80

238

+ipmasqadm portfw -a -P tcp -L 1.2.3.4 8080 -R 192.168.1.1 80

239

+</verb></tscreen>

240

+

241

+Now you would do:

242

+

243

+<tscreen><verb>

244

+# Linux 2.4

245

+# Append a rule before routing (-A PREROUTING) to the NAT table (-t nat) that

246

+# TCP packets (-p tcp) going to 1.2.3.4 (-d 1.2.3.4) port 8080 (--dport 8080)

247

+# have their destination mapped (-j DNAT) to 192.168.1.1, port 80

248

+# (--to 192.168.1.1:80).

249

+iptables -A PREROUTING -t nat -p tcp -d 1.2.3.4 --dport 8080 \

250

+ -j DNAT --to 192.168.1.1:80

251

+</verb></tscreen>
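The translation from the 2.2 command to the 2.4 rule is mechanical, so it can be sketched as a small shell helper. This is only an illustration: the function name `portfw_to_iptables` and its argument order are assumptions, not part of iptables or ipmasqadm, and the helper only prints the rule rather than installing it. It uses the full option name `--to-destination`, which the examples in this HOWTO abbreviate to `--to`.

```shell
#!/bin/sh
# Hypothetical helper: print the iptables (Linux 2.4) equivalent of
#   ipmasqadm portfw -a -P PROTO -L LADDR LPORT -R RADDR RPORT
# Nothing is executed; the rule is only written to stdout.
portfw_to_iptables() {
    proto=$1 laddr=$2 lport=$3 raddr=$4 rport=$5
    echo "iptables -t nat -A PREROUTING -p $proto -d $laddr --dport $lport -j DNAT --to-destination $raddr:$rport"
}

# The example from the text: 1.2.3.4 port 8080 is forwarded to 192.168.1.1 port 80.
portfw_to_iptables tcp 1.2.3.4 8080 192.168.1.1 80
```

Printing instead of executing keeps the sketch safe to run as an ordinary user; the printed line can be reviewed and then run as root.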

252

+

253

+<sect>Controlling What To NAT

254

+

255

+You need to create NAT rules which tell the kernel what connections

256

+to change, and how to change them. To do this, we use the very

257

+versatile <tt>iptables</tt> tool, and tell it to alter the NAT table by

258

+specifying the `-t nat' option.

259

+

260

+The table of NAT rules contains three lists called `chains': the rules
+in a chain are examined in order until one matches. The two main chains are

262

+called PREROUTING (for Destination NAT, as packets first come in), and

263

+POSTROUTING (for Source NAT, as packets leave). The third (OUTPUT)

264

+will be ignored here.

265

+

266

+The following diagram would illustrate it quite well if I had any

267

+artistic talent:

268

+

269

+<tscreen><verb>

270

+      _____                                     _____
+     /     \                                   /     \
+   PREROUTING -->[Routing ]----------------->POSTROUTING----->
+     \D-NAT/     [Decision]                    \S-NAT/
+                     |                            ^
+                     |                            |
+                     |                            |
+                     |                            |
+                     |                            |
+                     |                            |
+                     |                            |
+                     --------> Local Process ------

282

+</verb></tscreen>

283

+

284

+At each of the points above, when a packet passes we look up what

285

+connection it is associated with. If it's a new connection, we look

286

+up the corresponding chain in the NAT table to see what to do with it.

287

+The answer it gives will apply to all future packets on that

288

+connection.

289

+

290

+<sect1>Simple Selection using iptables

291

+

292

+<tt>iptables</tt> takes a number of standard options as listed

293

+below. All the double-dash options can be abbreviated, as long as

294

+<tt>iptables</tt> can still tell them apart from the other possible

295

+options. If your kernel has iptables support as a module, you'll need

296

+to load the ip_tables.o module first: `insmod ip_tables'.

297

+

298

+The most important option here is the table selection option, `-t'.

299

+For all NAT operations, you will want to use `-t nat' for the NAT

300

+table. The second most important option to use is `-A' to append a

301

+new rule at the end of the chain (e.g. `-A POSTROUTING'), or `-I' to

302

+insert one at the beginning (e.g. `-I PREROUTING').

303

+

304

+You can specify the source (`-s' or `--source') and destination

305

+(`-d' or `--destination') of the packets you want to NAT. These

306

+options can be followed by a single IP address (e.g. 192.168.1.1), a

307

+name (e.g. www.gnumonks.org), or a network address

308

+(e.g. 192.168.1.0/24 or 192.168.1.0/255.255.255.0).

309

+

310

+You can specify the incoming (`-i' or `--in-interface') or outgoing

311

+(`-o' or `--out-interface') interface to match, but which you can

312

+specify depends on which chain you are putting the rule into: at

313

+PREROUTING you can only select the incoming interface, and at POSTROUTING
+you can only select the outgoing interface. If you use the

315

+wrong one, <tt>iptables</tt> will give an error.

316

+

317

+<sect1>Finer Points Of Selecting What Packets To Mangle

318

+

319

+I said above that you can specify a source and destination address.

320

+If you omit the source address option, then any source address will

321

+do. If you omit the destination address option, then any destination

322

+address will do.

323

+

324

+You can also indicate a specific protocol (`-p' or `--protocol'),

325

+such as TCP or UDP; only packets of this protocol will match the rule.

326

+The main reason for doing this is that specifying a protocol of tcp or

327

+udp then allows extra options: specifically the `--source-port' and

328

+`--destination-port' options (abbreviated as `--sport' and `--dport').

329

+

330

+These options allow you to specify that only packets with a certain

331

+source and destination port will match the rule. This is useful for

332

+redirecting web requests (TCP port 80 or 8080) and leaving other

333

+packets alone.

334

+

335

+These options must follow the `-p' option (which has a side-effect

336

+of loading the shared library extension for that protocol). You can

337

+use port numbers, or a name from the /etc/services file.

338

+

339

+All the different qualities you can select a packet by are detailed

340

+in painful detail in the manual page (<tt>man iptables</tt>).

341

+

342

+<sect>Saying How To Mangle The Packets

343

+

344

+So now we know how to select the packets we want to mangle. To

345

+complete our rule, we need to tell the kernel exactly what we want it

346

+to do to the packets.

347

+

348

+<sect1>Source NAT

349

+

350

+You want to do Source NAT; change the source address of connections

351

+to something different. This is done in the POSTROUTING chain, just

352

+before the packet is finally sent out; this is an important detail, since it

353

+means that anything else on the Linux box itself (routing, packet

354

+filtering) will see the packet unchanged. It also means that the `-o'

355

+(outgoing interface) option can be used.

356

+

357

+Source NAT is specified using `-j SNAT', and the `--to-source' option
+(abbreviated to `--to' in the examples) specifies an IP address or a range of IP
+addresses, and optionally a port or range of ports (for UDP and TCP only).

360

+

361

+<tscreen><verb>

362

+## Change source addresses to 1.2.3.4.

363

+# iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 1.2.3.4

364

+

365

+## Change source addresses to 1.2.3.4, 1.2.3.5 or 1.2.3.6

366

+# iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 1.2.3.4-1.2.3.6

367

+

368

+## Change source addresses to 1.2.3.4, ports 1-1023

369

+# iptables -t nat -A POSTROUTING -p tcp -o eth0 -j SNAT --to 1.2.3.4:1-1023

370

+</verb></tscreen>

371

+

372

+<sect2>Masquerading

373

+

374

+There is a specialized case of Source NAT called masquerading: it

375

+should only be used for dynamically-assigned IP addresses, such as

376

+standard dialups (for static IP addresses, use SNAT above).

377

+

378

+You don't need to put in the source address explicitly with

379

+masquerading: it will use the source address of the interface the

380

+packet is going out from. But more importantly, if the link goes

381

+down, the connections (which are now lost anyway) are forgotten,

382

+meaning fewer glitches when the connection comes back up with a new IP

383

+address.

384

+

385

+<tscreen><verb>

386

+## Masquerade everything out ppp0.

387

+# iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE

388

+</verb></tscreen>
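The choice between SNAT and masquerading described above comes down to whether the outgoing interface has a static address. A minimal sketch of that decision follows; the helper name `snat_rule` and its arguments are hypothetical, and it only prints the rule it would suggest.

```shell
#!/bin/sh
# Hypothetical helper: print a source-NAT rule for an outgoing interface.
# With a static address as second argument, suggest SNAT; with no address
# (a dynamically assigned link such as a dialup), suggest MASQUERADE.
snat_rule() {
    iface=$1 addr=${2:-}
    if [ -n "$addr" ]; then
        echo "iptables -t nat -A POSTROUTING -o $iface -j SNAT --to-source $addr"
    else
        echo "iptables -t nat -A POSTROUTING -o $iface -j MASQUERADE"
    fi
}

snat_rule eth0 1.2.3.4   # static address: SNAT
snat_rule ppp0           # dynamic dialup: MASQUERADE
```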

389

+

390

+<sect1>Destination NAT

391

+

392

+This is done in the PREROUTING chain, just as the packet comes in;

393

+this means that anything else on the Linux box itself (routing, packet

394

+filtering) will see the packet going to its `real' destination. It

395

+also means that the `-i' (incoming interface) option can be used.

396

+

397

+Destination NAT is specified using `-j DNAT', and the
+`--to-destination' option (abbreviated to `--to' in the examples) specifies an
+IP address or a range of IP addresses, and optionally a port or range of ports
+(for UDP and TCP protocols only).

401

+

402

+<tscreen><verb>

403

+## Change destination addresses to 5.6.7.8

404

+# iptables -t nat -A PREROUTING -i eth0 -j DNAT --to 5.6.7.8

405

+

406

+## Change destination addresses to 5.6.7.8, 5.6.7.9 or 5.6.7.10.

407

+# iptables -t nat -A PREROUTING -i eth0 -j DNAT --to 5.6.7.8-5.6.7.10

408

+

409

+## Change destination addresses of web traffic to 5.6.7.8, port 8080.

410

+# iptables -t nat -A PREROUTING -p tcp --dport 80 -i eth0 \

411

+ -j DNAT --to 5.6.7.8:8080

412

+</verb></tscreen>

413

+

414

+<sect2>Redirection

415

+

416

+There is a specialized case of Destination NAT called redirection:

417

+it is a simple convenience which is exactly equivalent to doing DNAT

418

+to the address of the incoming interface.

419

+

420

+<tscreen><verb>

421

+## Send incoming port-80 web traffic to our squid (transparent) proxy

422

+# iptables -t nat -A PREROUTING -i eth1 -p tcp --dport 80 \

423

+ -j REDIRECT --to-port 3128

424

+</verb></tscreen>

425

+

426

+Note that squid needs to be configured to know it's a transparent proxy!
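To make the equivalence between REDIRECT and DNAT concrete, the sketch below prints both forms of the same rule side by side. The helper name `redirect_equivalents` is hypothetical, and the interface address 192.168.1.250 used in the call is an assumed example, not something the HOWTO specifies for eth1. The REDIRECT option is spelled out as `--to-ports`, which the example above abbreviates.

```shell
#!/bin/sh
# Hypothetical helper: print a REDIRECT rule and the DNAT rule it stands for.
# Arguments: incoming interface, that interface's address, original
# destination port, local port to redirect to.
redirect_equivalents() {
    iface=$1 ifaddr=$2 dport=$3 toport=$4
    echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport $dport -j REDIRECT --to-ports $toport"
    echo "iptables -t nat -A PREROUTING -i $iface -p tcp --dport $dport -j DNAT --to-destination $ifaddr:$toport"
}

# Web traffic arriving on eth1 (assumed address 192.168.1.250) goes to a
# proxy listening on local port 3128.
redirect_equivalents eth1 192.168.1.250 80 3128
```

Only one of the two printed rules would be installed in practice; REDIRECT saves you from having to know the interface address.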

427

+

428

+<sect1>Mappings In Depth

429

+

430

+There are some subtleties to NAT which most people will never have

431

+to deal with. They are documented here for the curious.

432

+

433

+<sect2>Selection Of Multiple Addresses in a Range

434

+

435

+If a range of IP addresses is given, the IP address to use is

436

+chosen based on the least currently used IP for connections the

437

+machine knows about. This gives primitive load-balancing.

438

+

439

+<sect2>Creating Null NAT Mappings

440

+

441

+You can use the `-j ACCEPT' target to let a connection through

442

+without any NAT taking place.

443

+

444

+<sect2>Standard NAT Behavior

445

+

446

+The default behavior is to alter the connection as little as

447

+possible, within the constraints of the rule given by the user. This

448

+means we won't remap ports unless we have to.

449

+

450

+<sect2>Implicit Source Port Mapping

451

+

452

+Even when no NAT is requested for a connection, source port

453

+translation may occur implicitly, if another connection has been

454

+mapped over the new one. Consider the case of masquerading, which

455

+is rather common:

456

+

457

+<enum>

458

+<item> A web connection is established by a box 192.1.1.1 from port

459

+ 1024 to www.netscape.com port 80.

460

+

461

+<item> This is masqueraded by the masquerading box to use its source

462

+ IP address (1.2.3.4).

463

+

464

+<item> The masquerading box tries to make a web connection to

465

+ www.netscape.com port 80 from 1.2.3.4 (its external interface

466

+ address) port 1024.

467

+

468

+<item> The NAT code will alter the source port of the second

469

+ connection to 1025, so that the two don't clash.

470

+</enum>

471

+

472

+When this implicit source mapping occurs, ports are divided into

473

+three classes:

474

+<itemize>

475

+<item> Ports below 512

476

+<item> Ports between 512 and 1023

477

+<item> Ports 1024 and above.

478

+</itemize>

479

+

480

+A port will never be implicitly mapped into a different class.
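The three port classes can be expressed as a small check. This is a sketch of the rule as stated above, not of the kernel code: the function names `port_class` and `same_class` and the class labels are illustrative assumptions.

```shell
#!/bin/sh
# Print the class of a source port, using the three ranges listed above.
# The labels are an illustrative choice, not kernel output.
port_class() {
    port=$1
    if [ "$port" -lt 512 ]; then
        echo "below-512"
    elif [ "$port" -le 1023 ]; then
        echo "512-1023"
    else
        echo "1024-and-above"
    fi
}

# An implicit remapping keeps a port inside its class, so a candidate
# replacement port is acceptable only if both ports report the same class.
same_class() {
    [ "$(port_class "$1")" = "$(port_class "$2")" ]
}

same_class 1024 1025 && echo "1024 -> 1025 stays in class"
same_class 1023 1024 || echo "1023 -> 1024 would change class"
```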

481

+

482

+<sect2>What Happens When NAT Fails

483

+

484

+If there is no way to uniquely map a connection as the user

485

+requests, it will be dropped. This also applies to packets which

486

+could not be classified as part of any connection, because they are

487

+malformed, or the box is out of memory, etc.

488

+

489

+<sect2>Multiple Mappings, Overlap and Clashes

490

+

491

+You can have NAT rules which map packets onto the same range; the

492

+NAT code is clever enough to avoid clashes. Hence having two rules

493

+which map the source address 192.168.1.1 and 192.168.1.2 respectively

494

+onto 1.2.3.4 is fine.

495

+

496

+Furthermore, you can map over real, used IP addresses, as long as

497

+those addresses pass through the mapping box as well. So if you have

498

+an assigned network (1.2.3.0/24), but have one internal network using

499

+those addresses and one using the Private Internet Addresses

500

+192.168.1.0/24, you can simply NAT the 192.168.1.0/24 source addresses

501

+onto the 1.2.3.0 network, without fear of clashing:

502

+

503

+<tscreen><verb>

504

+# iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth1 \

505

+ -j SNAT --to 1.2.3.0/24

506

+</verb></tscreen>

507

+

508

+The same logic applies to addresses used by the NAT box itself:

509

+this is how masquerading works (by sharing the interface address

510

+between masqueraded packets and `real' packets coming from the box

511

+itself).

512

+

513

+Moreover, you can map the same packets onto many different targets,

514

+and they will be shared. For example, if you don't want to map

515

+anything over 1.2.3.5, you could do:

516

+

517

+<tscreen><verb>

518

+# iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth1 \

519

+ -j SNAT --to 1.2.3.0-1.2.3.4 --to 1.2.3.6-1.2.3.254

520

+</verb></tscreen>

521

+

522

+<sect2>Altering the Destination of Locally-Generated Connections

523

+

524

+The NAT code allows you to insert DNAT rules in the OUTPUT chain,

525

+but this is not fully supported in 2.4 (it can be, but it requires a

526

+new configuration option, some testing, and a fair bit of coding, so

527

+unless someone contracts Rusty to write it, I wouldn't expect it

528

+soon).

529

+

530

+The current limitation is that you can only change the destination

531

+to the local machine (e.g. `-j DNAT --to 127.0.0.1'), not to any other

532

+machine, otherwise the replies won't be translated correctly.

533

+

534

+<sect>Special Protocols

535

+

536

+Some protocols do not like being NAT'ed. For each of these

537

+protocols, two extensions must be written; one for the connection

538

+tracking of the protocol, and one for the actual NAT.

539

+

540

+Inside the netfilter distribution, there are currently modules for

541

+ftp: ip_conntrack_ftp.o and ip_nat_ftp.o. If you insmod these into

542

+your kernel (or you compile them in permanently), then doing any kind

543

+of NAT on ftp connections should work. If you don't, then you can

544

+only use passive ftp, and even that might not work reliably if you're

545

+doing more than simple Source NAT.

546

+

547

+<sect>Caveats on NAT

548

+

549

+If you are doing NAT on a connection, all packets passing

550

+<bf>both</bf> ways (in and out of the network) must pass through the

551

+NAT'ed box, otherwise it won't work reliably. In particular, the

552

+connection tracking code reassembles fragments, which means that not

553

+only will connection tracking not be reliable, but your packets may

554

+not get through at all, as fragments will be withheld.

555

+

556

+<sect>Source NAT and Routing

557

+

558

+If you are doing SNAT, you will want to make sure that every

559

+machine the SNAT'ed packets go to will send replies back to the NAT

560

+box. For example, if you are mapping some outgoing packets onto the

561

+source address 1.2.3.4, then the outside router must know that it is

562

+to send reply packets (which will have <bf>destination</bf> 1.2.3.4)

563

+back to this box. This can be done in the following ways:

564

+

565

+<enum>

566

+<item> If you are doing SNAT onto the box's own address (for which

567

+ routing and everything already works), you don't need to do

568

+ anything.

569

+

570

+<item> If you are doing SNAT onto an unused address on the local LAN

571

+ (for example, you're mapping onto 1.2.3.99, a free IP on your

572

+ 1.2.3.0/24 network), your NAT box will need to respond to ARP

573

+ requests for that address as well as its own: the easiest way

574

+ to do this is create an IP alias, e.g.:

575

+<tscreen><verb>

576

+# ip address add 1.2.3.99 dev eth0

577

+</verb></tscreen>

578

+

579

+<item> If you are doing SNAT onto a completely different address, you

580

+ will have to ensure that the machines the SNAT packets will hit

581

+ will route this address back to the NAT box. This is already

582

+ achieved if the NAT box is their default gateway, otherwise you

583

+ will need to advertise a route (if running a routing protocol)

584

+ or manually add routes to each machine involved.

585

+</enum>

586

+

587

+<sect>Destination NAT Onto the Same Network

588

+

589

+If you are doing port forwarding back onto the same network, you

590

+need to make sure that both future packets and reply packets pass

591

+through the NAT box (so they can be altered). The NAT code will now

592

+(since 2.4.0-test6) block the outgoing ICMP redirect which is

593

+produced when the NAT'ed packet heads out the same interface it came

594

+in on, but the receiving server will still try to reply directly to

595

+the client (which won't recognize the reply).

596

+

597

+The classic case is that internal staff try to access your `public'

598

+web server, which is actually DNAT'ed from the public address

599

+(1.2.3.4) to an internal machine (192.168.1.1), like so:

600

+

601

+<tscreen><verb>

602

+# iptables -t nat -A PREROUTING -d 1.2.3.4 \

603

+ -p tcp --dport 80 -j DNAT --to 192.168.1.1

604

+</verb></tscreen>

605

+

606

+One way around this is to run an internal DNS server which knows the real
+(internal) IP address of your public web site, and forwards all other

608

+requests to an external DNS server. This means that the logging on

609

+your web server will show the internal IP addresses correctly.

610

+

611

+The other way is to have the NAT box also map the source IP address

612

+to its own for these connections, fooling the server into replying

613

+through it. In this example, we would do the following (assuming the

614

+internal IP address of the NAT box is 192.168.1.250):

615

+

616

+<tscreen><verb>

617

+# iptables -t nat -A POSTROUTING -d 192.168.1.1 -s 192.168.1.0/24 \

618

+ -p tcp --dport 80 -j SNAT --to 192.168.1.250

619

+</verb></tscreen>

620

+

621

+Because the <bf>PREROUTING</bf> rule gets run first, the packets will

622

+already be destined for the internal web server: we can tell which

623

+ones are internally sourced by the source IP addresses.
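The two rules from this section belong together, so the sketch below generates them as a pair from four values. The helper name `hairpin_rules` and its parameter names are hypothetical; the addresses in the call are the ones used in the text. The rules are printed, not installed, and the full option names `--to-destination` and `--to-source` are used in place of the abbreviated `--to`.

```shell
#!/bin/sh
# Hypothetical helper: print the DNAT + SNAT rule pair for port forwarding
# back onto the same network. Arguments (illustrative names):
#   public  - public address that clients connect to
#   server  - internal address of the real web server
#   lan     - internal network, in address/prefix form
#   natbox  - internal address of the NAT box
hairpin_rules() {
    public=$1 server=$2 lan=$3 natbox=$4
    # Rewrite the destination as the packet comes in.
    echo "iptables -t nat -A PREROUTING -d $public -p tcp --dport 80 -j DNAT --to-destination $server"
    # Rewrite the source of internally originated connections on the way
    # out, so that the server replies through the NAT box.
    echo "iptables -t nat -A POSTROUTING -d $server -s $lan -p tcp --dport 80 -j SNAT --to-source $natbox"
}

hairpin_rules 1.2.3.4 192.168.1.1 192.168.1.0/24 192.168.1.250
```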

624

+

625

+<sect>Thanks

626

+

627

+Thanks first to WatchGuard, and David Bonn, who believed in the

628

+netfilter idea enough to support me while I worked on it.

629

+

630

+And to everyone else who put up with my ranting as I learnt about

631

+the ugliness of NAT, especially those who read my diary.

632

+

633

+Rusty.

634

+</article>

635

Index: iptables-1.4.12/howtos/netfilter-extensions-HOWTO.sgml

636

===================================================================

637

--- /dev/null 1970-01-01 00:00:00.000000000 +0000

638

+++ iptables-1.4.12/howtos/netfilter-extensions-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600

639

@@ -0,0 +1,1781 @@

640

+<!doctype linuxdoc system>

641

+

642

+<!-- This is the Netfilter Extensions HOWTO.

643

+ -->

644

+

645

+<article>

646

+

647

+

648

+

649

+<title>Netfilter Extensions HOWTO</title>

650

+<author>Fabrice MARIE &lt;fabrice@netfilter.org&gt;, mailing list <tt>netfilter-devel@lists.samba.org</tt></author>

651

+<date>$Revision: 1.28 $</date>

652

+<abstract>

653

+This document describes how to install and use current iptables extensions for netfilter.

654

+</abstract>

655

+

656

+

657

+<toc>

658

+

659

+

660

+

661

+<sect>Introduction<label id="intro">

662

+

663

+

664

+Hello. This is a great opportunity for me to thank all the people

665

+spending a lot of time developing, testing, reporting bugs in, and using netfilter.

666

+So, thanks to you all !!

667

+

668

+

669

+This HOWTO assumes you have read and understood Rusty's

670

+<url url="http://www.netfilter.org/documentation/HOWTO/packet-filtering-HOWTO.html" name="Linux 2.4 Packet Filtering HOWTO">.

671

+It is assumed as well that you know how to compile and install a kernel properly.

672

+

673

+

674

+The <tt>iptables</tt> distribution contains extensions that are not used by regular users,
+that are still quite experimental, or that are pending kernel inclusion.
+These extensions are usually not compiled unless you ask for them.

677

+

678

+

679

+You should find the latest version of this document on the

680

+<url url="http://www.netfilter.org/documentation/index.html#HOWTO" name="netfilter documentation"> web page.

681

+

682

+

683

+The goal of this HOWTO is to help people get started with the netfilter extensions

684

+by explaining how to install them and the basics of how to use them.

685

+

686

+

687

+Finally, there is a complete, script-generated list of the patches available in patch-o-matic:

688

+<url url="http://www.netfilter.org/documentation/pomlist/pom-summary.html" name="Patch-O-Matic Listing - Summary">.

689

+

690

691

+

692

+<sect>Patch-O-Matic

693

+

694

+<sect1>What is Patch-O-Matic ?

695

+

696

+Netfilter developers distribute a set of patches, packaged
+so that they can be used by their `patch-o-matic' (or `p-o-m') system.
+p-o-m is a script that guides you through the process of selecting
+the patches you want to apply, and automatically patches the kernel for you.

700

+

701

+

702

+First, get the latest CVS tree, to be sure that you are using the
+latest extensions. To do so, run:

704

+

705

+<tscreen><verb>

706

+# cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic login

707

+

708

+(When it asks you for a password type `cvs').

709

+

710

+# cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic co netfilter/userspace netfilter/patch-o-matic

711

+</verb></tscreen>

712

+

713

+

714

+This will create the toplevel directory `netfilter/', and will

715

+check out all the files inside for you :

716

+

717

+<tscreen>

718

+<verb>

719

+# ls -l netfilter/

720

+total 3

721

+drwxr-xr-x 2 root root 160 Nov 7 14:48 CVS/

722

+drwxr-xr-x 13 root root 488 Nov 7 14:54 patch-o-matic/

723

+drwxr-xr-x 9 root root 864 Nov 7 14:48 userspace/

724

+</verb>

725

+</tscreen>

726

+

727

+

728

+Make sure your kernel source is ready in `/usr/src/linux/'.

729

+If for whatever reason the kernel you want to patch is not

730

+in `/usr/src/linux/' then you can make the variable KERNEL_DIR

731

+point to the path where your kernel is :

732

+

733

+<tscreen><verb>

734

+# export KERNEL_DIR=/the/path/linux

735

+</verb></tscreen>
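As a minimal sketch of the fallback described above, the helper below prints the kernel tree that would be patched: the value of KERNEL_DIR if it is set, and `/usr/src/linux' otherwise. The function name `resolve_kernel_dir' is hypothetical and is not part of patch-o-matic; the `runme' script may resolve the directory differently.

```shell
# Hypothetical helper (not part of patch-o-matic): print the kernel tree
# that would be patched, preferring $KERNEL_DIR over the default location.
resolve_kernel_dir() {
    printf '%s\n' "${KERNEL_DIR:-/usr/src/linux}"
}
```

Calling `resolve_kernel_dir' before starting p-o-m is a quick way to check which tree is about to be modified.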

736

+

737

+

738

+Make sure the dependencies are made already. If unsure :

739

+

740

+<tscreen><verb>

741

+# cd /usr/src/linux/

742

+# make dep

743

+</verb></tscreen>

744

+

745

+

746

+Then you can go back to the netfilter directory, in the `patch-o-matic/' directory.

747

+You can now invoke p-o-m.

748

+

749

+<sect1>Running Patch-O-Matic

750

+

751

+While in the `patch-o-matic/' directory, let's run p-o-m :

752

+

753

+<tscreen><verb>

754

+# ./runme extra

755

+

756

+Welcome to Rusty's Patch-o-matic!

757

+

758

+Each patch is a new feature: many have minimal impact, some do not.

759

+Almost every one has bugs, so I don't recommend applying them all!

760

+-------------------------------------------------------

761

+

762

+Already applied: 2.4.1 2.4.4

763

+Testing... name_of_the_patch NOT APPLIED ( 2 missing files)

764

+The name_of_the_patch patch:

765

+ Here usually is the help text describing what

766

+ the patch is for, what you can expect from it,

767

+ and what you should not expect from it.

768

+Do you want to apply this patch [N/y/t/f/q/?]

769

+</verb></tscreen>

770

+

771

+

772

+p-o-m will go through most of the patches. If they are already applied,

773

+you will see so on the `Already applied:' first line. If they are not applied

774

+yet, it will display the name of the patch with some explanations.

775

+p-o-m will tell you what is going on : `NOT APPLIED ( n missing files)' simply means

776

+the patch has not been applied yet, whereas `NOT APPLIED ( n rejects out of n hunks)'

777

+generally means that :

778

+<enum>

779

+<item>Either the patch cannot be applied cleanly...

780

+<item>...Or the patch has already been included in the kernel you are trying to patch.

781

+</enum>

782

+

783

+

784

+Finally it will prompt you to decide whether or not to patch it.

785

+

786

+<itemize>

787

+<item>Simply press enter if you do not want to apply it.

788

+<item>Type `y' if you want p-o-m to test the patch and apply it,

789

+if the attempt fails then it will tell you so and prompt you again for confirmation.

790

+If not, the patch will be applied, and you will see the name of the patch

791

+on the `Already Applied' line.

792

+<item>Type `t' if you just want to test if the patch would apply normally.

793

+<item>Type `f' if you

794

+want to force p-o-m to apply the patch.

795

+<item>Finally type `q' if you want to quit p-o-m.

796

+</itemize>

797

+

798

+

799

+A rule of thumb is to read carefully the little explanation text of each patch

800

+before actually applying it. As there are currently a LOT of official patches for patch-o-matic

801

+(and probably more unofficial ones), it is not recommended to apply them all !

802

+You should really consider applying only the ones you need, even if it means recompiling

803

+netfilter when you need more patches later on.

804

+

805

+

806

+Patch-o-matic in fact, is mainly the `runme' shell script. If you run it without arguments, it will

807

+display its help message :

808

+

809

+<tscreen>

810

+<verb>

811

+Usage: ./runme [--batch] [--reverse] [--exclude suite/patch-file ...] suite|suite/patch-file

812

+

813

+ --batch batch mode, automatically applying patches

814

+ --reverse back out the selected patches

815

+ --exclude excludes the named patches

816

+</verb>

817

+</tscreen>

818

+

819

+

820

+The patches are contained in `patch-o-matic/pending/', `patch-o-matic/base', etc.. Here, `pending' and `base'

821

+are two suite names. ls the `patch-o-matic' directory to see all the suites. Example of `runme' commands :

822

+

823

+<tscreen>

824

+<verb>

825

+./runme --batch pending

826

+./runme --batch userspace/ipt_REJECT-fake-source.patch

827

+</verb>

828

+</tscreen>

829

+

830

+

831

+The first command will attempt to apply all the patches from the submitted suite,

832

+then the pending suite (we explain further why two suites). The second command

833

+will only apply the patch `ipt_REJECT-fake-source.patch' from the userspace suite.

834

+

835

+

836

+The most relevant patch `suites' or repositories are (in their order of application) :

837

+<itemize>

838

+<item>submitted

839

+<item>pending

840

+<item>base

841

+<item>extra

842

+<item>userspace

843

+</itemize>

844

+

845

+

846

+When you instruct `./runme' to apply patches from the `extra/' patch repository it will first

847

+present you with the patches from the `submitted/', `pending/', and `base/' directories.

848

+Each suite maintains a file named `SUITE' that tells p-o-m the order in which

849

+it should attempt to apply the patches. For example, what I explained above is written

850

+in the `userspace/' repository's `SUITE' file :

851

+

852

+<tscreen>

853

+<verb>

854

+# cat userspace/SUITE

855

+submitted pending base extra userspace

856

+</verb>

857

+</tscreen>
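To make the ordering concrete, here is a small sketch that reads a suite's `SUITE' file and prints the suites one per line, in the order listed. The function name `suite_order' and its two arguments are assumptions made for illustration; this is not how `runme' itself is written.

```shell
# Hypothetical helper: list, one per line, the suites named in a
# suite's SUITE file, in the order they appear there.
# $1 = path to the patch-o-matic directory, $2 = suite name.
suite_order() {
    # The unquoted expansion splits the file's contents on whitespace.
    for s in $(cat "$1/$2/SUITE"); do
        printf '%s\n' "$s"
    done
}
```

For instance, `suite_order . userspace', run from the `patch-o-matic/' directory, would list the suites that are presented before `userspace' itself.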

858

+

859

+<sect1>So what's next ?

860

+

861

+

862

+Once you have applied all the patches you wished to apply, the next step is to recompile

863

+your kernel and install it. This HOWTO will not explain how to do this. Instead, you

864

+can read the <url url="http://www.linuxdoc.org/HOWTO/Kernel-HOWTO.html" name="Linux Kernel HOWTO">.

865

+

866

+

867

+While configuring your kernel, you will see new options in

868

+``Networking Options -> Netfilter Configuration''. Choose the options

869

+you need, recompile & install your new kernel.

870

+

871

+

872

+Once your new kernel is installed, you can go ahead and compile and install the ``iptables''

873

+package, from the `userspace/' directory as follows :

874

+

875

+<tscreen>

876

+<verb>

877

+# make all install

878

+</verb>

879

+</tscreen>

880

+

881

+

882

+That's it ! Your new shiny iptables package is installed ! Now it's time

883

+to use these brand new functionalities.

884

+

885

+<sect>New netfilter matches

886

+

887

+

888

+In this section, we will attempt to explain the usage of new netfilter matches.

889

+The patches will appear in alphabetical order. Additionally, we will not explain

890

+patches that break other patches. But this might come later.

891

+

892

+

893

+Generally speaking, for matches, you can get the help hints from a particular

894

+module by typing :

895

+

896

+<tscreen>

897

+<verb>

898

+# iptables -m the_match_you_want --help

899

+</verb>

900

+</tscreen>

901

+

902

+

903

+This would display the normal iptables help message, plus the specific

904

+``the_match_you_want'' match help message at the end.

905

+

906

+<sect1>ah-esp patch

907

+

908

+This patch by Yon Uriarte <yon@astaro.de> adds 2 new matches :

909

+

910

+<itemize>

911

+<item>``ah'' : lets you match an AH packet based on its Security Parameter Index (SPI).

912

+<item>``esp'' : lets you match an ESP packet based on its SPI.

913

+</itemize>

914

+

915

+

916

+This patch can be quite useful for people using IPSEC who are willing

917

+to discriminate connections based on their SPI.

918

+

919

+

920

+For example, we will drop all the AH packets that have a SPI equal to

921

+500 :

922

+

923

+<tscreen><verb>

924

+# iptables -A INPUT -p 51 -m ah --ahspi 500 -j DROP

925

+

926

+# iptables --list

927

+Chain INPUT (policy ACCEPT)

928

+target prot opt source destination

929

+DROP ipv6-auth-- anywhere anywhere ah spi:500

930

+</verb></tscreen>

931

+

932

+

933

+Supported options for the ah match are :

934

+

935

+<descrip>

936

+<tag>--ahspi [!] spi[:spi]</> -> match spi (range)

937

+</descrip>

938

+

939

+

940

+The esp match works exactly the same :

941

+

942

+<tscreen><verb>

943

+# iptables -A INPUT -p 50 -m esp --espspi 500 -j DROP

944

+

945

+# iptables --list

946

+Chain INPUT (policy ACCEPT)

947

+target prot opt source destination

948

+DROP ipv6-crypt-- anywhere anywhere esp spi:500

949

+</verb></tscreen>

950

+

951

+

952

+Supported options for the esp match are :

953

+

954

+<descrip>

955

+<tag>--espspi [!] spi[:spi]</> -> match spi (range)

956

+</descrip>

957

+

958

+

959

+Do not forget to specify the proper protocol through ``-p 50'' or ``-p 51'' (for esp & ah respectively)

960

+when you use the ah or esp matches, or else the rule insertion will simply abort

961

+for obvious reasons.
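Since the protocol number and the match name must agree, one way to avoid mixing them up is to generate the rule from the match name. The sketch below only prints the command and does not run it; `spi_drop_rule' is a hypothetical name, and the rule layout simply mirrors the two examples above.

```shell
# Hypothetical helper: print (do not run) a DROP rule for an IPsec SPI,
# pairing each match with its IP protocol number (ah = 51, esp = 50).
# $1 = match name (ah or esp), $2 = SPI or SPI range.
spi_drop_rule() {
    case "$1" in
        ah)  proto=51 ;;
        esp) proto=50 ;;
        *)   echo "unknown match: $1" >&2; return 1 ;;
    esac
    echo "iptables -A INPUT -p $proto -m $1 --${1}spi $2 -j DROP"
}
```

The printed line can be reviewed first and then pasted into a root shell.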

962

+

963

+<sect1>condition match

964

+

965

+This patch by Stephane Ouellette <ouellettes@videotron.ca> adds a new match that is used

966

+to enable or disable a set of rules using condition variables stored in `/proc' files.

967

+

968

+

969

+Notes:

970

+

971

+<itemize>

972

+<item>The condition variables are stored in the `/proc/net/ipt_condition/' directory.

973

+<item>A condition variable can only be set to ``0'' (FALSE) or ``1'' (TRUE).

974

+<item>One or many rules can be affected by the state of a single condition variable.

975

+<item>A condition proc file is automatically created when a new condition is first referenced.

976

+<item>A condition proc file is automatically deleted when the last reference to it is removed.

977

+</itemize>

978

+

979

+

980

+Supported options for the condition match are :

981

+

982

+<descrip>

983

+<tag>--condition [!] conditionfile</> -> match on condition variable.

984

+</descrip>

985

+

986

+

987

+For example, if you want to prohibit access to your web server while doing maintenance, you can use the

988

+following :

989

+

990

+<tscreen><verb>

991

+# iptables -A FORWARD -p tcp -d 192.168.1.10 --dport http -m condition --condition webdown -j REJECT --reject-with tcp-reset

992

+

993

+# echo 1 > /proc/net/ipt_condition/webdown

994

+</verb></tscreen>

995

+

996

+

997

+The rule above will match only if the ``webdown'' condition is set to ``1''.
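Because a condition variable accepts only ``0'' or ``1'', a small wrapper can guard against typos when toggling it. This is a sketch under assumptions: `set_condition' and the COND_DIR override are hypothetical names introduced here (the override exists only so the function can be tried outside a patched kernel).

```shell
# Hypothetical helper: set a condition variable to 1 (TRUE) or 0 (FALSE).
# COND_DIR is an assumption made for testing; on a patched kernel the
# files live in /proc/net/ipt_condition/.
# $1 = condition name, $2 = new value.
set_condition() {
    dir="${COND_DIR:-/proc/net/ipt_condition}"
    case "$2" in
        0|1) echo "$2" > "$dir/$1" ;;
        *)   echo "value must be 0 or 1" >&2; return 1 ;;
    esac
}
```

With this in place, `set_condition webdown 1' would close the web server to forwarded traffic and `set_condition webdown 0' would reopen it.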

998

+

999

+

1000

+<sect1>conntrack patch

1001

+

1002

+This patch by Marc Boucher <marc+nf@mbsi.ca> adds a new general conntrack match module

1003

+(a superset of the state match) that allows you to match on additional conntrack information.

1004

+

1005

+

1006

+For example, if you want to allow all the RELATED connections for TCP protocols only,

1007

+then you can proceed as follows :

1008

+

1009

+<tscreen><verb>

1010

+# iptables -A FORWARD -m conntrack --ctstate RELATED --ctproto tcp -j ACCEPT

1011

+

1012

+# iptables --list

1013

+Chain FORWARD (policy ACCEPT)

1014

+target prot opt source destination

1015

+ACCEPT all -- anywhere anywhere ctstate RELATED

1016

+</verb></tscreen>

1017

+

1018

+

1019

+Supported options for the conntrack match are :

1020

+

1021

+<descrip>

1022

1023

+-> State(s) to match. The "new" `SNAT' and `DNAT' states are virtual ones, matching if the original

1024

+source address differs from the reply destination, or if the original destination differs from the reply source.

1025

+

1026

+<tag>[!] --ctproto proto</> -&gt Protocol to match; by number or name, eg. `tcp'.

1027

+

1028

+<tag>--ctorigsrc [!] address[/mask]</> -&gt Original source specification.

1029

+

1030

+<tag>--ctorigdst [!] address[/mask]</> -&gt Original destination specification.

1031

+

1032

+<tag>--ctreplsrc [!] address[/mask]</> -&gt Reply source specification.

1033

+

1034

+<tag>--ctrepldst [!] address[/mask]</> -&gt Reply destination specification.

1035

+

1036

+<tag>[!] --ctstatus [NONE|EXPECTED|SEEN_REPLY|ASSURED][,...]</>

1037

+-> Status(es) to match.

1038

+

1039

+<tag>[!] --ctexpire time[:time]</> -&gt Match remaining lifetime in seconds against

1040

+value or range of values (inclusive).

1041

+</descrip>

1042

+

1043

+<sect1>fuzzy patch

1044

+

1045

+This patch by Hime Aguiar e Oliveira Jr. <hime@engineer.com> adds a new module

1046

+which allows you to match packets according to a dynamic profile

1047

+implemented by means of a simple Fuzzy Logic Controller (FLC).

1048

+

1049

+

1050

+This match implements a TSK FLC (Takagi-Sugeno-Kang Fuzzy Logic

1051

+Controller). The basic idea is that the match is given two parameters

1052

+that tell it the desired filtering interval.

1053

+

1054

+<itemize>

1055

+<item>When the packet rate is below `lower-limit' the rule will never match.

1056

+<item>Between `lower-limit' and `upper-limit', matching will occur according to an

1057

+increasing (mean) rate.

1058

+<item>Finally, when the packet rate comes to `upper-limit',

1059

+the (mean) matching rate attains its maximum value, 99%.

1060

+</itemize>

1061

+

1062

+

1063

+Taking into account that the sampling rate is variable and is approximately 100ms

1064

+(on a busy machine), the author believes that the module presents good responsiveness,

1065

+adapting fast to changing traffic patterns.

1066

+

1067

+

1068

+For example, if you wish to avoid Denials Of Service, you could use the following rule:

1069

+

1070

+<tscreen><verb>

1071

+# iptables -A INPUT -m fuzzy --lower-limit 100 --upper-limit 1000 -j REJECT

1072

+</verb></tscreen>

1073

+

1074

+<itemize>

1075

+<item>Below the 100 pps (packets per second) rate, the filter is inactive.

1076

+<item>Between 100 and 1000 pps the mean acceptance rate drops

1077

+from 100% (when we are at 100 pps) to 1% (when we are at 1000 pps).

1078

+<item>Above 1000 pps the acceptance rate keeps constant at 1%.

1079

+</itemize>
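The three cases above can be approximated with a short calculation. The sketch below assumes a straight-line ramp between the two limits, which is only a rough model: the real match is a TSK fuzzy controller and its curve need not be linear. The name `mean_acceptance' is hypothetical.

```shell
# Rough model of the mean acceptance rate (in percent) for a packet rate.
# The linear ramp is an assumption; the real controller is a TSK FLC
# and need not be linear. Integer arithmetic truncates the result.
# $1 = packet rate (pps), $2 = lower limit, $3 = upper limit.
mean_acceptance() {
    if [ "$1" -le "$2" ]; then
        echo 100
    elif [ "$1" -ge "$3" ]; then
        echo 1
    else
        echo $(( 100 - ($1 - $2) * 99 / ($3 - $2) ))
    fi
}
```

Under this model, `mean_acceptance 550 100 1000' gives a figure about halfway between the two extremes.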

1080

+

1081

+

1082

+Supported options for the fuzzy patch are :

1083

+

1084

+<descrip>

1085

+<tag>--upper-limit n</> -> Desired upper bound for traffic rate matching.

1086

+<tag>--lower-limit n</> -> Lower bound over which the FLC starts to match.

1087

+</descrip>

1088

+

1089

+<sect1>iplimit patch

1090

+

1091

+This patch by Gerd Knorr <kraxel@bytesex.org> adds a new match that

1092

+will allow you to restrict the number of parallel TCP connections

1093

+from a particular host or network.

1094

+

1095

+

1096

+For example, let's limit the number of parallel HTTP connections made by a single

1097

+IP address to 4 :

1098

+

1099

+<tscreen><verb>

1100

+# iptables -A INPUT -p tcp --syn --dport http -m iplimit --iplimit-above 4 -j REJECT

1101

+

1102

+# iptables --list

1103

+Chain INPUT (policy ACCEPT)

1104

+target prot opt source destination

1105

+REJECT tcp -- anywhere anywhere tcp dpt:http flags:SYN,RST,ACK/SYN #conn/32 > 4 reject-with icmp-port-unreachable

1106

+</verb></tscreen>

1107

+

1108

+

1109

+Or you might want to limit the number of parallel connections made by a whole class A for example :

1110

+

1111

+<tscreen><verb>

1112

+# iptables -A INPUT -p tcp --syn --dport http -m iplimit --iplimit-mask 8 --iplimit-above 4 -j REJECT

1113

+

1114

+# iptables --list

1115

+Chain INPUT (policy ACCEPT)

1116

+target prot opt source destination

1117

+REJECT tcp -- anywhere anywhere tcp dpt:http flags:SYN,RST,ACK/SYN #conn/8 > 4 reject-with icmp-port-unreachable

1118

+</verb></tscreen>

1119

+

1120

+

1121

+Supported options for the iplimit patch are :

1122

+

1123

+<descrip>

1124

+<tag>[!] --iplimit-above n</> -> match if the number of existing tcp connections is (not) above n

1125

+<tag>--iplimit-mask n</> -> group hosts using mask

1126

+</descrip>

1127

+

1128

+<sect1>ipv4options patch

1129

+

1130

+

1131

+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match

1132

+that allows you to match packets based on the IP options they have set.

1133

+

1134

+

1135

+For example, let's drop all packets that have the record-route or the timestamp

1136

+IP option set :

1137

+

1138

+<tscreen><verb>

1139

+# iptables -A INPUT -m ipv4options --rr -j DROP

1140

+# iptables -A INPUT -m ipv4options --ts -j DROP

1141

+

1142

+# iptables --list

1143

+Chain INPUT (policy ACCEPT)

1144

+target prot opt source destination

1145

+DROP all -- anywhere anywhere IPV4OPTS RR

1146

+DROP all -- anywhere anywhere IPV4OPTS TS

1147

+</verb></tscreen>

1148

+

1149

+

1150

+Supported options for the ipv4options match are :

1151

+

1152

+<descrip>

1153

+<tag>--ssrr</> -> match strict source routing flag.

1154

+<tag>--lsrr</> -> match loose source routing flag.

1155

+<tag>--no-srr</> -> match packets with no source routing.

1156

+<tag>[!] --rr</> -> match record route flag.

1157

+<tag>[!] --ts</> -> match timestamp flag.

1158

+<tag>[!] --ra</> -> match router-alert option.

1159

+<tag>[!] --any-opt</> -> Match a packet that has at least one IP option

1160

+(or that has no IP option at all if ! is chosen).

1161

+</descrip>

1162

+

1163

+<sect1>length patch

1164

+

1165

+This patch by James Morris <jmorris@intercode.com.au> adds a new match

1166

+that allows you to match a packet based on its length.

1167

+

1168

+

1169

+For example, let's drop all the pings with a packet size greater than

1170

+85 bytes :

1171

+

1172

+<tscreen><verb>

1173

+# iptables -A INPUT -p icmp --icmp-type echo-request -m length --length 86:0xffff -j DROP

1174

+

1175

+# iptables --list

1176

+Chain INPUT (policy ACCEPT)

1177

+target prot opt source destination

1178

+DROP icmp -- anywhere anywhere icmp echo-request length 86:65535

1179

+</verb></tscreen>

1180

+

1181

+

1182

+Supported options for the length match are :

1183

+

1184

+<descrip>

1185

+<tag>[!] --length length[:length]</> -> Match packet length

1186

+against value or range of values (inclusive)

1187

+</descrip>

1188

+

1189

+

1190

+Values of the range not present will be implied. The implied value for minimum

1191

+is 0, and for maximum is 65535.
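The implied bounds can be illustrated with a few lines of shell. This sketch expands a `--length' argument into an explicit pair; the function name `expand_length' is hypothetical, and writing a single value as ``n:n'' is a convention chosen here for illustration, not something iptables prints.

```shell
# Hypothetical helper: expand a --length argument into an explicit
# "min:max" pair, filling in the implied bounds (0 and 65535).
# A single value is shown as "n:n"; that notation is ours, not iptables'.
expand_length() {
    case "$1" in
        *:*) min="${1%%:*}"; max="${1#*:}" ;;
        *)   min="$1"; max="$1" ;;
    esac
    echo "${min:-0}:${max:-65535}"
}
```

So the rule shown earlier could equally have been written with `--length 86:', since the missing maximum is implied.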

1192

+

1193

+<sect1>mport patch

1194

+

1195

+This patch by Andreas Ferber <af@devcon.net> adds a new match that allows

1196

+you to specify ports with a mix of port-ranges and single ports for UDP and TCP protocols.

1197

+

1198

+

1199

+For example, if you want to block ftp, ssh, telnet and http in one line, you can :

1200

+

1201

+<tscreen><verb>

1202

+# iptables -A INPUT -p tcp -m mport --ports 20:23,80 -j DROP

1203

+

1204

+# iptables --list

1205

+Chain INPUT (policy ACCEPT)

1206

+target prot opt source destination

1207

+DROP tcp -- anywhere anywhere mport ports ftp-data:telnet,http

1208

+</verb></tscreen>

1209

+

1210

+

1211

+Supported options for the mport match are :

1212

+

1213

+<descrip>

1214

+<tag>--source-ports port[,port:port,port...]</> -> match source port(s)

1215

+<tag>--sports port[,port:port,port...]</> -> match source port(s)

1216

+<tag>--destination-ports port[,port:port,port...]</> -> match destination port(s)

1217

+<tag>--dports port[,port:port,port...]</> -> match destination port(s)

1218

+<tag>--ports port[,port:port,port]</> -> match both source and destination port(s)

1219

+</descrip>

1220

+

1221

+<sect1>nth patch

1222

+

1223

+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match that allows

1224

+you to match a particular Nth packet received by the rule.

1225

+

1226

+

1227

+For example, if you want to drop every second ping packet, you can do as follows :

1228

+

1229

+<tscreen><verb>

1230

+# iptables -A INPUT -p icmp --icmp-type echo-request -m nth --every 2 -j DROP

1231

+

1232

+# iptables --list

1233

+Chain INPUT (policy ACCEPT)

1234

+target prot opt source destination

1235

+DROP icmp -- anywhere anywhere icmp echo-request every 2th

1236

+</verb></tscreen>

1237

+

1238

+

1239

+Extensions by Richard Wagner <rwagner@cloudnet.com> allow

1240

+you to create an easy and quick method to produce load-balancing for both inbound and outbound

1241

+connections.

1242

+

1243

+

1244

+For example, if you want to balance the load to the 3 addresses 10.0.0.5, 10.0.0.6 and 10.0.0.7,

1245

+then you can do as follows :

1246

+

1247

+<tscreen><verb>

1248

+# iptables -t nat -A POSTROUTING -o eth0 -m nth --counter 7 --every 3 --packet 0 -j SNAT --to-source 10.0.0.5

1249

+# iptables -t nat -A POSTROUTING -o eth0 -m nth --counter 7 --every 3 --packet 1 -j SNAT --to-source 10.0.0.6

1250

+# iptables -t nat -A POSTROUTING -o eth0 -m nth --counter 7 --every 3 --packet 2 -j SNAT --to-source 10.0.0.7

1251

+

1252

+# iptables -t nat --list

1253

+Chain POSTROUTING (policy ACCEPT)

1254

+target prot opt source destination

1255

+SNAT all -- anywhere anywhere every 3th packet #0 to:10.0.0.5

1256

+SNAT all -- anywhere anywhere every 3th packet #1 to:10.0.0.6

1257

+SNAT all -- anywhere anywhere every 3th packet #2 to:10.0.0.7

1258

+</verb></tscreen>
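The effect of the three rules above is a round-robin over the three addresses. The following sketch models that selection outside the kernel, assuming the shared counter starts at 0 and advances by one for each packet the rules see; `nth_source' is a hypothetical name. Note that rules in the nat table are normally consulted only for the first packet of a connection, so in practice the rotation happens per connection.

```shell
# Hypothetical model of the three rules above: given the 0-based position
# of a packet seen on the shared counter, print the source address that
# the matching --packet rule would select.
nth_source() {
    case $(( $1 % 3 )) in
        0) echo 10.0.0.5 ;;
        1) echo 10.0.0.6 ;;
        2) echo 10.0.0.7 ;;
    esac
}
```

This also shows why every value from 0 to (Nth-1) needs its own `--packet' rule: a missing value would leave one slot of the rotation with no address.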

1259

+

1260

+

1261

+Supported options for the nth match are :

1262

+

1263

+<descrip>

1264

+<tag>--every Nth</> -> Match every Nth packet.

1265

+<tag>[--counter] num</> -> Use counter 0-15 (default:0).

1266

+<tag>[--start] num</> -> Initialize the counter at the number `num' instead of 0. Must be between 0 and (Nth-1).

1267

+<tag>[--packet] num</> -> Match on the `num' packet. Must be between 0 and Nth-1.

1268

+If `--packet' is used for a counter, then there must be Nth number of --packet rules, covering all values between 0 and

1269

+(Nth-1) inclusively.

1270

+</descrip>

1271

+

1272

+<sect1>pkttype patch

1273

+

1274

+This patch by Michal Ludvig <michal@logix.cz> adds a new match that allows

1275

+you to match a packet based on its type : host/broadcast/multicast.

1276

+

1277

+

1278

+For example, if you want to silently drop all broadcast packets :

1279

+

1280

+<tscreen><verb>

1281

+# iptables -A INPUT -m pkttype --pkt-type broadcast -j DROP

1282

+

1283

+# iptables --list

1284

+Chain INPUT (policy ACCEPT)

1285

+target prot opt source destination

1286

+DROP all -- anywhere anywhere PKTTYPE = broadcast

1287

+</verb></tscreen>

1288

+

1289

+

1290

+Supported options for this match are :

1291

+

1292

+<descrip>

1293

+<tag>--pkt-type [!] packettype</> -&gt match packet type where packet type is one of

1294

+<descrip>

1295

+<tag>host</> -> to us

1296

+<tag>broadcast</> -> to all

1297

+<tag>multicast</> -> to group

1298

+</descrip>

1299

+</descrip>

1300

+

1301

+<sect1>pool patch

1302

+

1303

+Patch by Patrick Schaaf <bof@bof.de>. Joakim Axelsson and Patrick are in the process

1304

+of re-writing it, therefore they will replace this section with the actual

1305

+explanations once it's written.

1306

+

1307

+<sect1>psd patch

1308

+

1309

+This patch by Dennis Koslowski <dkoslowski@astaro.de> adds a new match that will

1310

+attempt to detect port scans.

1311

+

1312

+

1313

+In its simplest form, psd match can be used as follows :

1314

+

1315

+<tscreen><verb>

1316

+# iptables -A INPUT -m psd -j DROP

1317

+

1318

+# iptables --list

1319

+Chain INPUT (policy ACCEPT)

1320

+target prot opt source destination

1321

+DROP all -- anywhere anywhere psd weight-threshold: 21 delay-threshold: 300 lo-ports-weight: 3 hi-ports-weight: 1

1322

+</verb></tscreen>

1323

+

1324

+

1325

+Supported options for psd match are :

1326

+

1327

+<descrip>

1328

+<tag>[--psd-weight-threshold threshold]</> -> Portscan detection weight threshold

1329

+<tag>[--psd-delay-threshold delay]</> -> Portscan detection delay threshold

1330

+<tag>[--psd-lo-ports-weight lo]</> -> Privileged ports weight

1331

+<tag>[--psd-hi-ports-weight hi]</> -> High ports weight

1332

+</descrip>

1333

+

1334

+<sect1>quota patch

1335

+

1336

+This patch by Sam Johnston <samj@samj.net> adds a new match that

1337

+allows you to set quotas. When the quota is reached, the rule doesn't

1338

+match any more.

1339

+

1340

+

1341

+For example, if you want to put a quota of 50 MB on incoming http data

1342

+you can do as follows :

1343

+

1344

+<tscreen><verb>

1345

+# iptables -A INPUT -p tcp --dport 80 -m quota --quota 52428800 -j ACCEPT

1346

+# iptables -A INPUT -p tcp --dport 80 -j DROP

1347

+

1348

+# iptables --list

1349

+Chain INPUT (policy ACCEPT)

1350

+target prot opt source destination

1351

+ACCEPT tcp -- anywhere anywhere tcp dpt:http quota: 52428800 bytes

1352

+DROP tcp -- anywhere anywhere tcp dpt:http

1353

+</verb></tscreen>
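The listing reports the quota in bytes, so the value passed to `--quota' in the example is 50 x 1024 x 1024. A tiny helper, hypothetically named `mib_to_bytes', saves doing that arithmetic by hand:

```shell
# Convert a size in mebibytes (1024 * 1024 bytes) to a byte count
# suitable for the --quota option.
# $1 = size in MiB.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}
```

It could be used inline, for example `--quota $(mib_to_bytes 50)'.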

1354

+

1355

+

1356

+Supported options for quota match are :

1357

+

1358

+<descrip>

1359

+<tag> --quota quota</> -> The quota you want to set.

1360

+</descrip>

1361

+

1362

+<sect1>random patch

1363

+

1364

+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match that

1365

+allows you to match a packet randomly based on a given probability.

1366

+

1367

+

1368

+For example, if you want to drop 50% of the pings randomly, you can do as follows :

1369

+

1370

+<tscreen><verb>

1371

+# iptables -A INPUT -p icmp --icmp-type echo-request -m random --average 50 -j DROP

1372

+

1373

+# iptables --list

1374

+Chain INPUT (policy ACCEPT)

1375

+target prot opt source destination

1376

+DROP icmp -- anywhere anywhere icmp echo-request random 50%

1377

+</verb></tscreen>

1378

+

1379

+

1380

+Supported options for random match are :

1381

+

1382

+<descrip>

1383

+<tag>[--average percent]</> -> The probability in percentage of the match.

1384

+If omitted, a probability of 50% is set. Percentage must be within : 1 <= percent <= 99.

1385

+</descrip>

1386

+

1387

+<sect1>realm patch

1388

+

1389

+This patch by Sampsa Ranta <sampsa@netsonic.fi> adds a new match that allows you

1390

+to use the realm key from routing as a match criterion, similar to the one found in the packet

1391

+classifier.

1392

+

1393

+

1394

+For example, to log all the outgoing packets with a realm of 10, you can do the following :

1395

+

1396

+<tscreen><verb>

1397

+# iptables -A OUTPUT -m realm --realm 10 -j LOG

1398

+

1399

+# iptables --list

1400

+Chain OUTPUT (policy ACCEPT)

1401

+target prot opt source destination

1402

+LOG all -- anywhere anywhere REALM match 0xa LOG level warning

1403

+</verb></tscreen>

1404

+

1405

+

1406

+Supported options for the realm match are :

1407

+

1408

+<descrip>

1409

+<tag>--realm [!] value[/mask]</> -> Match realm

1410

+</descrip>

1411

+

1412

+<sect1>recent patch

1413

+

1414

+This patch by Stephen Frost <sfrost@snowman.net> adds a new match that allows you

1415

+to dynamically create a list of IP addresses and then match against that list in a few

1416

+different ways.

1417

+

1418

+

1419

+For example, you can create a `badguy' list out of people attempting to connect to port 139

1420

+on your firewall and then DROP all future packets from them without considering them.

1421

+

1422

+<tscreen><verb>

1423

+# iptables -A FORWARD -m recent --name badguy --rcheck --seconds 60 -j DROP

1424

+# iptables -A FORWARD -p tcp -i eth0 --dport 139 -m recent --name badguy --set -j DROP

1425

+

1426

+# iptables --list

1427

+Chain FORWARD (policy ACCEPT)

1428

+target prot opt source destination

1429

+DROP all -- anywhere anywhere recent: CHECK seconds: 60

1430

+DROP tcp -- anywhere anywhere tcp dpt:netbios-ssn recent: SET

1431

+</verb></tscreen>

1432

+

1433

+

1434

+Supported options for the recent match are :

1435

+

1436

+<descrip>

1437

+<tag>--name name</> -> Specify the list to use for the commands. If no name is given

1438

+then 'DEFAULT' will be used.

1439

+

1440

+<tag>[!] --set</> -> This will add the source address of the packet to the list.

1441

+If the source address is already in the list, this will update the existing entry. This will

1442

+always return success, or failure if `!' is passed in.

1443

+

1444

+<tag>[!] --rcheck</> -> This will check if the source address of the packet is currently

1445

+in the list and return true if it is, and false otherwise. Opposite is returned if `!' is passed in.

1446

+

1447

+<tag>[!] --update</> -> This will check if the source address of the packet is currently

1448

+in the list. If it is then that entry will be updated and the rule will return true. If the source

1449

+address is not in the list then the rule will return false. Opposite is returned if `!' is passed in.

1450

+

1451

+<tag>[!] --remove</> -> This will check if the source address of the packet is currently

1452

+in the list and if so that address will be removed from the list and the rule will return true.

1453

+If the address is not found, false is returned. Opposite is returned if `!' is passed in.

1454

+

1455

+<tag>[!] --seconds seconds</> -> This option must be used in conjunction with one of `rcheck' or

1456

+`update'. When used, this will narrow the match to only happen when the address is in the list and was seen

1457

+within the last given number of seconds. Opposite is returned if `!' is passed in.

1458

+

1459

+<tag>[!] --hitcount hits</> -> This option must be used in conjunction with one of `rcheck' or

1460

+`update'. When used, this will narrow the match to only happen when the address is in the list and packets

1461

+had been received greater than or equal to the given value. This option may be used along with `seconds'

1462

+to create an even narrower match requiring a certain number of hits within a specific time frame.

1463

+Opposite returned if `!' passed in.

1464

+

1465

+<tag>--rttl</> -> This option must be used in conjunction with one of `rcheck' or `update'.

1466

+When used, this will narrow the match to only happen when the address is in the list and the TTL of

1467

+the current packet matches that of the packet which hit the --set rule. This may be useful if you have

1468

+problems with people faking their source address in order to DoS you via this module by disallowing others

1469

+access to your site by sending bogus packets to you.

1470

+</descrip>

1471

+

1472

+<sect1>record-rpc patch

1473

+

1474

+This patch by Marcelo Barbosa Lima <marcelo.lima@dcc.unicamp.br> adds a new match that allows

1475

+you to match if the source of the packet has requested that port through the portmapper before,

1476

+or it is a new GET request to the portmapper, allowing effective RPC filtering.

1477

+

1478

+

1479

+To match RPC connection tracking information, simply do the following :

1480

+

1481

+<tscreen><verb>

1482

+# iptables -A INPUT -m record_rpc -j ACCEPT

1483

+

1484

+# iptables --list

1485

+Chain INPUT (policy ACCEPT)

1486

+target prot opt source destination

1487

+ACCEPT all -- anywhere anywhere

1488

+</verb></tscreen>

1489

+

1490

+

1491

+The record_rpc match does not take any option.

1492

+

1493

+

1494

+Do not worry about the match information not being printed;

1495

+it's simply because the print() function of this match is empty :

1496

+

1497

+<tscreen><verb>

1498

+/* Prints out the union ipt_matchinfo. */

1499

+static void

1500

+print(const struct ipt_ip *ip,

1501

+ const struct ipt_entry_match *match,

1502

+ int numeric)

1503

+{

1504

+}

1505

+</verb></tscreen>

1506

+

1507

+<sect1>string patch

1508

+

1509

+This patch by Emmanuel Roger <winfield@freegates.be> adds a new match that allows

1510

+you to match a string anywhere in the packet.

1511

+

1512

+

1513

+For example, to match packets containing the string ``cmd.exe'' anywhere

1514

+in the packet and queue them to a userland IDS, you could use :

1515

+

1516

+<tscreen><verb>

1517

+# iptables -A INPUT -m string --string 'cmd.exe' -j QUEUE

1518

+

1519

+# iptables --list

1520

+Chain INPUT (policy ACCEPT)

1521

+target prot opt source destination

1522

+QUEUE all -- anywhere anywhere STRING match cmd.exe

1523

+</verb></tscreen>

1524

+

1525

+

1526

+Please do use this match with caution. A lot of people want to use

1527

+this match to stop worms, along with the DROP target. This is a major mistake.

1528

+It would be defeated by any IDS evasion method.

1529

+

1530

+

1531

+In a similar fashion, a lot of people have been using this match as a means

1532

+to stop particular functions in HTTP like POST or GET by dropping

1533

+any HTTP packet containing the string POST. Please understand that this job

1534

+is better done by a filtering proxy. Additionally, any HTML content with

1535

+the word POST would get dropped with the former method.

1536

+This match has been designed to be able to queue to userland interesting packets

1537

+for better analysis, that's all. Dropping packets based on this would be defeated

1538

+by any IDS evasion method.
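
+The fragility described above can be illustrated with a small Python sketch.
+It is not part of the patch: <tt>string_match</tt> is a hypothetical helper that
+models the match as a plain substring test applied to each packet on its own,
+which is an assumption about the behaviour rather than the module's actual code.

```python
def string_match(payload: bytes, pattern: bytes) -> bool:
    """Model of a per-packet substring test: each packet is examined alone."""
    return pattern in payload


pattern = b"cmd.exe"

# The whole request in one packet: the pattern is found.
single_packet = [b"GET /scripts/cmd.exe HTTP/1.0\r\n"]

# The same bytes split across two TCP segments: no single packet
# contains the full pattern, so a per-packet test never fires.
split_packets = [b"GET /scripts/cm", b"d.exe HTTP/1.0\r\n"]

hits_single = [string_match(p, pattern) for p in single_packet]
hits_split = [string_match(p, pattern) for p in split_packets]
```

+Under this model the sender only has to split the request differently to slip
+past a DROP rule, which is why queueing to a userland IDS is the safer use.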

1539

+

1540

+

1541

+Supported options for the string match are :

1542

+

1543

+<descrip>

1544

+<tag>--string [!] string</> -> Match a string in a packet

1545

+</descrip>

1546

+

1547

+<sect1>time patch

1548

+

1549

+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new match that allows

1550

+you to match a packet based on its arrival or departure (for locally generated packets) timestamp.

1551

+

1552

+

1553

+For example, to accept packets that have an arrival time from 8:00H to 18:00H from Monday

1554

+to Friday you can do as follows :

1555

+

1556

+<tscreen><verb>

1557

+# iptables -A INPUT -m time --timestart 8:00 --timestop 18:00 --days Mon,Tue,Wed,Thu,Fri -j ACCEPT

1558

+

1559

+# iptables --list

1560

+Chain INPUT (policy ACCEPT)

1561

+target prot opt source destination

1562

+ACCEPT all -- anywhere anywhere TIME from 8:0 to 18:0 on Mon,Tue,Wed,Thu,Fri

1563

+</verb></tscreen>

1564

+

1565

+

1566

+Supported options for the time match are :

1567

+

1568

+<descrip>

1569

+<tag>--timestart value</> -> minimum HH:MM

1570

+<tag>--timestop value</> -> maximum HH:MM

1571

+<tag>--days listofdays</> -> a list of days to apply, from (case sensitive)

1572

+<itemize>

1573

+<item>Mon

1574

+<item>Tue

1575

+<item>Wed

1576

+<item>Thu

1577

+<item>Fri

1578

+<item>Sat

1579

+<item>Sun

1580

+</itemize>

1581

+</descrip>
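
+As a rough illustration of how the three options combine, here is a Python
+sketch. The helper <tt>time_match</tt> is hypothetical, and two details are
+assumptions: that both bounds are inclusive, and that the comparison is made at
+one-minute resolution.

```python
from datetime import datetime, time

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]  # case sensitive


def time_match(when: datetime, timestart: str, timestop: str, days: str) -> bool:
    """Return True if `when` falls on a listed day within [timestart, timestop]."""
    def parse(hhmm: str) -> time:
        hours, minutes = hhmm.split(":")
        return time(int(hours), int(minutes))

    allowed = days.split(",")
    day_name = DAYS[when.weekday()]  # datetime.weekday(): Monday is 0
    # Assumption: seconds are ignored, so 18:00:30 still counts as 18:00.
    now = when.time().replace(second=0, microsecond=0)
    return day_name in allowed and parse(timestart) <= now <= parse(timestop)
```

+With the rule shown earlier, a packet arriving on a Monday at 9:00 would match,
+while one arriving on a Saturday, or at 19:00, would not.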

1582

+

1583

+<sect1>ttl patch

1584

+

1585

+This patch by Harald Welte <laforge@gnumonks.org> adds a new match that allows you

1586

+to match a packet based on its TTL.

1587

+

1588

+

1589

+For example, if you want to log any packet that has a TTL less than 5, you can do as follows :

1590

+

1591

+<tscreen><verb>

1592

+# iptables -A INPUT -m ttl --ttl-lt 5 -j LOG

1593

+

1594

+# iptables --list

1595

+Chain INPUT (policy ACCEPT)

1596

+target prot opt source destination

1597

+LOG all -- anywhere anywhere TTL match TTL < 5 LOG level warning

1598

+</verb></tscreen>

1599

+

1600

+

1601

+Options supported by the ttl match are :

1602

+

1603

+<descrip>

1604

+<tag>--ttl-eq value</> -> Match time to live value

1605

+<tag>--ttl-lt value</> -> Match TTL < value

1606

+<tag>--ttl-gt value</> -> Match TTL > value

1607

+</descrip>

1608

+

1609

+<sect>New netfilter targets

1610

+

1611

+In this section, we will attempt to explain the usage of new netfilter targets.

1612

+The patches will appear in alphabetical order. Additionally, we will not explain

1613

+patches that break other patches. But this might come later.

1614

+

1615

+

1616

+Generally speaking, for targets, you can get the help hints from a particular

1617

+module by typing :

1618

+

1619

+<tscreen>

1620

+<verb>

1621

+# iptables -j THE_TARGET_YOU_WANT --help

1622

+</verb>

1623

+</tscreen>

1624

+

1625

+

1626

+This would display the normal iptables help message, plus the specific

1627

+``THE_TARGET_YOU_WANT'' target help message at the end.

1628

+

1629

+<sect1>ftos patch

1630

+

1631

+This patch by Matthew G. Marsh <mgm@paktronix.com> adds a new target that allows you

1632

+to set the TOS of packets to an arbitrary value.

1633

+

1634

+

1635

+For example, if you want to set the TOS of all the outgoing packets to be 15, you can do as follows :

1636

+

1637

+<tscreen><verb>

1638

+# iptables -t mangle -A OUTPUT -j FTOS --set-ftos 15

1639

+

1640

+# iptables -t mangle --list

1641

+Chain OUTPUT (policy ACCEPT)

1642

+target prot opt source destination

1643

+FTOS all -- anywhere anywhere TOS set 0x0f

1644

+</verb></tscreen>

1645

+

1646

+

1647

+Supported options for the FTOS target are :

1648

+

1649

+<descrip>

1650

+<tag>--set-ftos value</> -> Set TOS field in packet header to value. This value can be in decimal (ex: <tt>32</tt>)

1651

+or in hex (ex: <tt>0x20</tt>)

1652

+</descrip>
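
+The two accepted notations describe the same byte. The following Python sketch
+(the helper names are hypothetical, and the one-byte range check is an
+assumption based on the size of the TOS field) parses either form and prints
+the value the way the listing above does.

```python
def parse_tos(value: str) -> int:
    """Parse a TOS value given in decimal ("32") or hex ("0x20")."""
    tos = int(value, 0)  # base 0: the prefix decides between decimal and hex
    if not 0 <= tos <= 255:
        raise ValueError(f"TOS must fit in one byte, got {value!r}")
    return tos


def show_tos(tos: int) -> str:
    """Format the value as two hex digits, as in the listing above."""
    return f"0x{tos:02x}"
```

+This is why a rule created with <tt>--set-ftos 15</tt> is listed as
+<tt>TOS set 0x0f</tt>.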

1653

+

1654

+<sect1>IPV4OPTSSTRIP patch

1655

+

1656

+This patch by Fabrice MARIE <fabrice@netfilter.org> adds a new target that allows you

1657

+to strip all the IP options from an IPv4 packet.

1658

+

1659

+

1660

+It is simply loaded as follows :

1661

+

1662

+<tscreen><verb>

1663

+# iptables -t mangle -A PREROUTING -j IPV4OPTSSTRIP

1664

+

1665

+# iptables -t mangle --list

1666

+Chain PREROUTING (policy ACCEPT)

1667

+target prot opt source destination

1668

+IPV4OPTSSTRIP all -- anywhere anywhere

1669

+</verb></tscreen>

1670

+

1671

+

1672

+This target doesn't support any option.

1673

+

1674

+<sect1>NETLINK patch

1675

+

1676

+This patch by Gianni Tedesco <gianni@ecsc.co.uk> adds a new target that allows you to

1677

+send dropped packets to userspace via a netlink socket.

1678

+

1679

+

1680

+For example, if you want to drop all pings and send them to a userland netlink socket instead,

1681

+you can do as follows :

1682

+

1683

+<tscreen><verb>

1684

+# iptables -A INPUT -p icmp --icmp-type echo-request -j NETLINK --nldrop

1685

+

1686

+# iptables --list

1687

+Chain INPUT (policy ACCEPT)

1688

+target prot opt source destination

1689

+NETLINK icmp -- anywhere anywhere icmp echo-request nldrop

1690

+</verb></tscreen>

1691

+

1692

+

1693

+Supported options for the NETLINK target are :

1694

+

1695

+<descrip>

1696

+<tag>--nldrop</> -> Drop the packet too

1697

+<tag>--nlmark <number></> -> Mark the packet

1698

+<tag>--nlsize <bytes></> -> Limit packet size

1699

+</descrip>

1700

+

1701

+

1702

+For more information on netlink sockets, you can refer to the

1703

+<url url="http://www.skyfree.org/linux/kernel_network/netlink.html" name="Netlink Sockets Tour">.

1704

+

1705

+<sect1>NETMAP patch

1706

+

1707

+This patch by Svenning Soerensen <svenning@post5.tele.dk> adds a new target that allows you

1708

+to create a static 1:1 mapping of the network address, while keeping host addresses intact.

1709

+

1710

+

1711

+For example, if you want to alter the destination of incoming connections from

1712

+1.2.3.0/24 to 5.6.7.0/24, you can do as follows :

1713

+

1714

+<tscreen><verb>

1715

+# iptables -t nat -A PREROUTING -d 1.2.3.0/24 -j NETMAP --to 5.6.7.0/24

1716

+

1717

+# iptables -t nat --list

1718

+Chain PREROUTING (policy ACCEPT)

1719

+target prot opt source destination

1720

+NETMAP all -- anywhere 1.2.3.0/24 5.6.7.0/24

1721

+</verb></tscreen>

1722

+

1723

+

1724

+Supported options for NETMAP target are :

1725

+

1726

+<descrip>

1727

+<tag>--to address[/mask]</> -> Network address to map to.

1728

+</descrip>
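
+The 1:1 mapping can be pictured as replacing the network bits and leaving the
+host bits alone. The Python sketch below models that arithmetic only; the
+function <tt>netmap</tt> is a hypothetical helper, not the kernel code.

```python
import ipaddress


def netmap(address: str, target_network: str) -> str:
    """Replace the network bits of `address` with those of `target_network`,
    keeping the host bits unchanged."""
    addr = int(ipaddress.IPv4Address(address))
    net = ipaddress.IPv4Network(target_network)
    host_bits = addr & int(net.hostmask)
    return str(ipaddress.IPv4Address(int(net.network_address) | host_bits))
```

+With the rule above, a connection to 1.2.3.4 would be sent to 5.6.7.4, and one
+to 1.2.3.200 to 5.6.7.200.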

1729

+

1730

+<sect1>ROUTE patch

1731

+

1732

+This patch by Cédric de Launois <delaunois@info.ucl.ac.be> adds a new

1733

+target which allows you to set up unusual routes not supported by the

1734

+standard kernel routing table. The ROUTE target lets you route

1735

+a received packet through an interface or towards a host, even if the

1736

+regular destination of the packet is the router itself. The ROUTE target is

1737

+also able to change the incoming interface of a packet. Packets are

1738

+directly put on the wire and do not traverse any other table.

1739

+

1740

+

1741

+This target does not modify the packets and is a final target.

1742

+It has to be used inside the mangle table.

1743

+

1744

+

1745

+Whenever possible, you should use the MARK target together with

1746

+iproute2 instead of this ROUTE target. However, this target is useful

1747

+to force the use of an interface or a next hop and to change the

1748

+incoming interface of a packet. People also use it for convenience

1749

+and to simplify their rules (one rule to route a packet is easier

1750

+than one MARK rule + one iproute2 rule).

1751

+

1752

+

1753

+Options supported by the ROUTE target are :

1754

+

1755

+<descrip>

1756

+<tag>--oif ifname</>

1757

+Send the packet out using `ifname' network interface. The destination

1758

+host must be on the same link or the interface must be a tunnel.

1759

+Otherwise, arp resolution cannot be performed and the packet is dropped.

1760

+<tag>--iif ifname</>

1761

+Change the packet's incoming interface to `ifname'.

1762

+<tag>--gw ip</>

1763

+Route the packet via this gateway. The packet is routed as if

1764

+its destination IP address was this ip.

1765

+</descrip>

1766

+

1767

+

1768

+

1769

+For example, assume that you want to redirect ssh packets towards a

1770

+server inside your network, without modifying those packets in any way

1771

+(this excludes the use of the standard port forwarding mechanism).

1772

+A solution is to use an ipip tunnel and the ROUTE target to reroute ssh

1773

+packets to the real ssh server, which has the same IP address as the router.

1774

+It is not possible to reroute those packets using the standard routing

1775

+mechanisms, because the kernel locally delivers a packet having

1776

+a destination address belonging to the router itself.

1777

+

1778

+

1779

+Time for ASCII art :

1780

+<verb>

1781

+ eth0 +------+ 192.168.0.1 192.168.0.2 +----+

1782

+ ----------------|router|--------------------------------|host|

1783

+ IP: 150.150.0.1 +------+ +----+

1784

+ | | tunl1 IP: 150.150.0.1 | |

1785

+ | +------------------------------------+ |

1786

+ +----------------------------------------+

1787

+ IPIP tunnel

1788

+</verb>

1789

+

1790

+

1791

+For the example above, you can do as follows :

1792

+

1793

+<tscreen><verb>

1794

+# iptables -A PREROUTING -t mangle -i eth0 -p tcp --dport 22 -j ROUTE --oif tunl1

1795

+# iptables -A PREROUTING -t mangle -i tunl1 -j ROUTE --oif eth0

1796

+

1797

+# iptables -L PREROUTING -t mangle

1798

+Chain PREROUTING (policy ACCEPT)

1799

+target prot opt source destination

1800

+ROUTE tcp -- anywhere anywhere tcp dpt:ssh ROUTE oif tunl1

1801

+ROUTE all -- anywhere anywhere ROUTE oif eth0

1802

+</verb></tscreen>

1803

+

1804

+

1805

+Another example : if you want to quickly and easily balance the load between two

1806

+gateways 10.0.0.1 and 10.0.0.2, then you can do as follows :

1807

+

1808

+<tscreen><verb>

1809

+# iptables -A PREROUTING -t mangle -m random --average 50 -j ROUTE --gw 10.0.0.1

1810

+# iptables -A PREROUTING -t mangle -j ROUTE --gw 10.0.0.2

1811

+

1812

+# iptables -L PREROUTING -t mangle

1813

+Chain PREROUTING (policy ACCEPT)

1814

+target prot opt source destination

1815

+ROUTE all -- anywhere anywhere random 50% ROUTE gw 10.0.0.1

1816

+ROUTE all -- anywhere anywhere ROUTE gw 10.0.0.2

1817

+</verb></tscreen>
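
+To get a feel for what the two rules above do, here is a small Python
+simulation. It assumes that the first rule matches each packet independently
+with a probability of 50% and that everything else falls through to the second
+rule; <tt>pick_gateway</tt> is a hypothetical helper.

```python
import random
from collections import Counter


def pick_gateway(rng: random.Random, average: int = 50) -> str:
    """First rule matches `average` percent of packets; the rest fall through."""
    if rng.random() * 100 < average:
        return "10.0.0.1"
    return "10.0.0.2"


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = Counter(pick_gateway(rng) for _ in range(10_000))
```

+Under this assumption the split is decided per packet, not per connection, so
+packets belonging to one connection may leave through different gateways.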

1818

+

1819

+<sect1>SAME patch

1820

+

1821

+This patch by Martin Josefsson <gandalf@wlug.westbo.se> adds a new target

1822

+which is similar to SNAT and gives a client the same address for each connection.

1823

+

1824

+

1825

+For example, if you want to modify the source address of the connections

1826

+to be 1.2.3.4-1.2.3.7 you can do as follows :

1827

+

1828

+<tscreen><verb>

1829

+# iptables -t nat -A POSTROUTING -j SAME --to 1.2.3.4-1.2.3.7

1830

+

1831

+# iptables -t nat --list

1832

+Chain POSTROUTING (policy ACCEPT)

1833

+target prot opt source destination

1834

+SAME all -- anywhere anywhere same:1.2.3.4-1.2.3.7

1835

+</verb></tscreen>

1836

+

1837

+

1838

+Options supported by the SAME target are :

1839

+

1840

+<descrip>

1841

+<tag>--to <ipaddr>-<ipaddr></> -> Addresses to map source to.

1842

+May be specified more than once for multiple ranges.

1843

+<tag>--nodst</> -> Don't use destination-ip in source selection

1844

+</descrip>
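
+One way to picture "the same address for each connection" is a choice that
+depends only on the client address. The Python sketch below is an illustration
+under that assumption and is not the module's algorithm; it corresponds to the
+<tt>--nodst</tt> case, where the destination plays no part in the selection.
+<tt>same_source</tt> is a hypothetical helper.

```python
import ipaddress


def same_source(client: str, first: str, last: str) -> str:
    """Pick one address from [first, last] as a pure function of the client."""
    low = int(ipaddress.IPv4Address(first))
    high = int(ipaddress.IPv4Address(last))
    pool_size = high - low + 1
    index = int(ipaddress.IPv4Address(client)) % pool_size
    return str(ipaddress.IPv4Address(low + index))
```

+Because the result depends only on the client, repeated connections from
+192.168.0.10 would all be mapped to the same address in 1.2.3.4-1.2.3.7.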

1845

+

1846

+<sect1>tcp-MSS patch

1847

+

1848

+This patch by Marc Boucher <marc+nf@mbsi.ca> adds a new target that allows you to examine and

1849

+alter the MSS value of TCP SYN packets, to control the maximum size

1850

+for that connection.

1851

+

1852

+

1853

+As explained by Marc himself, THIS IS A HACK, used to overcome criminally

1854

+brain-dead ISPs or servers which block ICMP Fragmentation Needed

1855

+packets.

1856

+

1857

+

1858

+Typical usage would be :

1859

+

1860

+<tscreen><verb>

1861

+# iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

1862

+

1863

+# iptables --list

1864

+Chain FORWARD (policy ACCEPT)

1865

+target prot opt source destination

1866

+TCPMSS tcp -- anywhere anywhere tcp flags:SYN,RST/SYN TCPMSS clamp to PMTU

1867

+</verb></tscreen>

1868

+

1869

+

1870

+Options supported by the tcp-MSS target are (mutually-exclusive) :

1871

+

1872

+<descrip>

1873

+<tag>--set-mss value</> explicitly set MSS option to specified value

1874

+<tag>--clamp-mss-to-pmtu</> automatically clamp MSS value to (path_MTU - 40)

1875

+</descrip>
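
+The arithmetic behind <tt>--clamp-mss-to-pmtu</tt> is short enough to write out.
+The Python sketch below assumes an IPv4 header and a TCP header of 20 bytes each
+(hence the 40), and assumes that clamping only ever lowers the advertised value;
+<tt>clamp_mss</tt> is a hypothetical helper.

```python
IP_AND_TCP_HEADERS = 40  # 20-byte IPv4 header + 20-byte TCP header, no options


def clamp_mss(advertised_mss: int, path_mtu: int) -> int:
    """Lower the MSS announced in a SYN so that segments fit the path MTU."""
    return min(advertised_mss, path_mtu - IP_AND_TCP_HEADERS)
```

+For example, on a link with a path MTU of 1492 bytes, a SYN advertising an MSS
+of 1460 would leave with an MSS of 1452.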

1876

+

1877

+<sect1>TTL patch

1878

+

1879

+This patch by Harald Welte <laforge@gnumonks.org> adds a new target that

1880

+enables the user to set the TTL value of an IP packet or to increment/decrement it

1881

+by a given value.

1882

+

1883

+

1884

+For example, if you want to set the TTL of all outgoing connections

1885

+to 126, you can do as follows :

1886

+

1887

+<tscreen><verb>

1888

+# iptables -t mangle -A OUTPUT -j TTL --ttl-set 126

1889

+

1890

+# iptables -t mangle --list

1891

+Chain OUTPUT (policy ACCEPT)

1892

+target prot opt source destination

1893

+TTL all -- anywhere anywhere TTL set to 126

1894

+</verb></tscreen>

1895

+

1896

+

1897

+Supported options for the TTL target are :

1898

+

1899

+<descrip>

1900

+<tag>--ttl-set value</> -> Set TTL to <value>

1901

+<tag>--ttl-dec value</> -> Decrement TTL by <value>

1902

+<tag>--ttl-inc value</> -> Increment TTL by <value>

1903

+</descrip>
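
+The three options can be summarised in a few lines of Python. This is a sketch
+only: <tt>apply_ttl</tt> is a hypothetical helper, and keeping the result within
+0 to 255 is an assumption based on the TTL field being 8 bits wide.

```python
def apply_ttl(ttl: int, mode: str, value: int) -> int:
    """Return the new TTL after --ttl-set, --ttl-dec or --ttl-inc."""
    if mode == "set":
        new_ttl = value
    elif mode == "dec":
        new_ttl = ttl - value
    elif mode == "inc":
        new_ttl = ttl + value
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Assumption: the result is kept inside the 8-bit TTL field.
    return max(0, min(255, new_ttl))
```

+Be careful with <tt>--ttl-inc</tt>: a router that increments the TTL of the
+packets it forwards can keep a routing loop alive indefinitely.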

1904

+

1905

+<sect1>ulog patch

1906

+

1907

+This patch by Harald Welte <laforge@gnumonks.org> adds a new target

1908

+which supplies a more advanced packet logging mechanism than the standard LOG target.

1909

+The `libipulog/' directory contains a library for receiving the ULOG messages.

1910

+

1911

+

1912

+Harald maintains a

1913

+<url url="http://www.gnumonks.org/projects/ulogd" name="web page"> containing the proper documentation

1914

+for ULOG, so there is no point in explaining it here.

1915

+

1916

+<sect>New connection tracking patches

1917

+

1918

+In this section, we will show the available connection tracking/nat patches.

1919

+To use them, simply load the corresponding modules (with options if needed)

1920

+for them to be in effect.

1921

+

1922

+<sect1>amanda-conntrack-nat patch

1923

+

1924

+This patch by Brian J. Murrell <netfilter@interlinx.bc.ca> adds support

1925

+for connection tracking and nat of the Amanda backup tool protocol.

1926

+

1927

+<sect1>eggdrop-conntrack patch

1928

+

1929

+This patch by Magnus Sandin <magnus@sandin.cx> adds support

1930

+for connection tracking for eggdrop bot networks.

1931

+

1932

+<sect1>h323-conntrack-nat patch

1933

+

1934

+This patch by Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> adds

1935

+H.323/netmeeting support module for netfilter connection tracking and NAT.

1936

+

1937

+

1938

+H.323 uses/relies on the following data streams :

1939

+

1940

+<itemize>

1941

+<item>port 389 -> Internet Locator Server (TCP).

1942

+<item>port 522 -> User Location Server (TCP).

1943

+<item>port 1503 -> T.120 Protocol (TCP).

1944

+<item>port 1720 -> H.323 (H.225 call setup, TCP)

1945

+<item>port 1731 -> Audio call control (TCP)

1946

+<item>Dynamic port -> H.245 call control (TCP)

1947

+<item>Dynamic port -> RTCP/RTP streaming (UDP)

1948

+</itemize>

1949

+

1950

+

1951

+The H.323 conntrack/NAT modules support the connection tracking/NATing of

1952

+the data streams requested on the dynamic ports. The helpers use the

1953

+search/replace hack from the ip_masq_h323.c module for the 2.2 kernel

1954

+series.

1955

+

1956

+

1957

+At the very minimum, H.323/netmeeting (video/audio) is functional by letting

1958

+through port 1720 and loading these H.323 module(s).

1959

+

1960

+

1961

+The H.323 conntrack/NAT modules do not support :

1962

+

1963

+<itemize>

1964

+<item>H.245 tunnelling

1965

+<item>H.225 RAS (gatekeepers)

1966

+</itemize>

1967

+

1968

+<sect1>irc-conntrack-nat patch

1969

+

1970

+This patch by Harald Welte <laforge@gnumonks.org> allows DCC to work through NAT and

+connection tracking. By default, this module will track IRC connections on port 6667.

1972

+But you can change this for another port with the `ports=xx' argument.

1973

+

1974

+<sect1>mms-conntrack-nat patch

1975

+

1976

+This patch by Filip Sneppe <filip.sneppe@cronos.be> adds support for

1977

+connection tracking of Microsoft Streaming Media Services protocol.

1978

+

1979

+

1980

+This allows client (Windows Media Player) and server

1981

+to negotiate protocol (UDP, TCP) and port for the media stream.

1982

+A partially reverse engineered protocol analysis is available

1983

+from <url url="http://get.to/sdp" name="here">, together with a link to a Linux client.

1984

+

1985

+

1986

+It is recommended to open UDP port 1755 to the server, as this port is used

1987

+for retransmission requests.

1988

+

1989

+

1990

+This helper has been tested in SNAT and DNAT setups.

1991

+

1992

+<sect1>pptp patch

1993

+

1994

+This patch by Harald Welte <laforge@gnumonks.org> allows netfilter to track PPTP connections as well as to NAT them.

1995

+

1996

+<sect1>quake3-conntrack patch

1997

+

1998

+This patch by Filip Sneppe <filip.sneppe@cronos.be> adds support for

1999

+Quake III Arena connection tracking and nat.

2000

+

2001

+<sect1>rsh patch

2002

+

2003

+This patch by Ian Larry Latter <Ian.Latter@mq.edu.au> adds support for

2004

+RSH connection tracking.

2005

+

2006

+

2007

+An RSH connection tracker is required if the dynamic stderr "Server

2008

+to Client" connection is to occur during a normal RSH session. This

2009

+typically operates as follows :

2010

+

2011

+<tscreen><verb>

2012

+ Client 0:1023 --> Server 514 (stream 1 - stdin/stdout)

2013

+ Client 0:1023 <-- Server 0:1023 (stream 2 - stderr)

2014

+</verb></tscreen>

2015

+

2016

+

2017

+The author of this patch is warning you that this module could be dangerous, and

2018

+that it is not "best practice" to use RSH, and you should use SSH in all instances.

2019

+

2020

+<sect1>snmp-nat patch

2021

+

2022

+This patch by James Morris <jmorris@intercode.com.au> allows netfilter to NAT basic SNMP.

2023

+This is the ``basic'' form of SNMP-ALG, as described in

2024

+<url url="http://www.faqs.org/rfcs/rfc2962.html" name="RFC 2962">.

+It works by modifying IP addresses inside SNMP payloads

2026

+to match IP-layer NAT mapping.

2027

+

2028

+<sect1>talk-conntrack-nat patch

2029

+

2030

+This patch by Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> allows netfilter to track

2031

+talk connections, as well as to NAT them. By default both otalk (UDP port 517) and talk (UDP port 518) are

2032

+supported. otalk/talk support can be selectively enabled/disabled

2033

+by the module parameters of the ip_conntrack_talk and ip_nat_talk modules. The options are :

2034

+

2035

+<itemize>

2036

+<item>otalk = 0 | 1

2037

+<item>talk = 0 | 1

2038

+</itemize>

2039

+

2040

+

2041

+where `0' means `do not support' while `1' means `do support'

2042

+the given protocol flavor.

2043

+

2044

+<sect1>tcp-window-tracking patch

2045

+

2046

+This patch by Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> allows netfilter

2047

+to do TCP connection tracking according to the article

2048

+<url url="http://www.iae.nl/users/guido/papers/tcp_filtering.ps.gz" name="Real Stateful TCP Packet Filtering in IP Filter"> by

2049

+Guido van Rooij. It supports window scaling, and can now handle already established connections.

2050

+

2051

+<sect1>tftp patch

2052

+

2053

+This patch by Magnus Boden <mb@ozaba.mine.nu> allows netfilter to track

2054

+tftp connections as well as to NAT them. By default, this module will track

2055

+tftp connections on port 69. But you can change this for another port with the

2056

+`ports=xx' argument.

2057

+

2058

+<sect>New IPv6 netfilter matches

2059

+

2060

+In this section, we will attempt to explain the usage of new netfilter matches.

2061

+The patches will appear in alphabetical order. Additionally, we will not explain

2062

+patches that break other patches. But this might come later.

2063

+

2064

+

2065

+Generally speaking, for matches, you can get the help hints from a particular

2066

+module by typing :

2067

+

2068

+<tscreen>

2069

+<verb>

2070

+# ip6tables -m the_match_you_want --help

2071

+</verb>

2072

+</tscreen>

2073

+

2074

+

2075

+This would display the normal ip6tables help message, plus the specific

2076

+``the_match_you_want'' match help message at the end.

2077

+

2078

+<sect1>agr patch

2079

+

2080

+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds 1 new match :

2081

+

2082

+<itemize>

2083

+<item>``eui64'' : lets you match the IPv6 packet based on its addressing parameters.

2084

+</itemize>

2085

+

2086

+

2087

+This patch can be quite useful for people using the EUI-64 IPv6 addressing scheme

2088

+who want to check the packets based on the delivered address on a LAN.

2089

+

2090

+

2091

+For example, we will redirect the packets that have a correct EUI-64 address:

2092

+

2093

+<tscreen><verb>

2094

+# ip6tables -N ipv6ok

2095

+# ip6tables -A INPUT -m eui64 -j ipv6ok

2096

+# ip6tables -A INPUT -s ! 3FFE:2F00:A0::/64 -j ipv6ok

2097

+# ip6tables -A INPUT -j LOG

2098

+# ip6tables -A ipv6ok -j ACCEPT

2099

+

2100

+# ip6tables --list

2101

+Chain INPUT (policy ACCEPT)

2102

+target prot opt source destination

2103

+ipv6ok all anywhere anywhere eui64

2104

+ipv6ok all !3ffe:2f00:a0::/64 anywhere

2105

+LOG all anywhere anywhere LOG level warning

2106

+

2107

+Chain ipv6ok (2 references)

2108

+target prot opt source destination

2109

+ACCEPT all anywhere anywhere

2110

+</verb></tscreen>

2111

+

2112

+

2113

+This match hasn't got any option.
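
+What counts as a "correct EUI-64 address" can be sketched in Python. The code
+below assumes the usual modified EUI-64 construction (flip the universal/local
+bit of the MAC address and insert ff:fe in the middle) and assumes that the
+match compares the result with the low 64 bits of the source address. Both
+function names are hypothetical.

```python
import ipaddress


def eui64_interface_id(mac: str) -> int:
    """Build the 64-bit interface identifier from a 48-bit MAC address."""
    octets = [int(part, 16) for part in mac.split(":")]
    octets[0] ^= 0x02  # flip the universal/local bit
    eui = octets[:3] + [0xFF, 0xFE] + octets[3:]  # insert ff:fe in the middle
    return int.from_bytes(bytes(eui), "big")


def has_correct_eui64(source: str, mac: str) -> bool:
    """True if the low 64 bits of the IPv6 source were derived from `mac`."""
    low64 = int(ipaddress.IPv6Address(source)) & ((1 << 64) - 1)
    return low64 == eui64_interface_id(mac)
```

+For a frame sent from MAC address 00:11:22:33:44:55, the source address
+3ffe:2f00:a0::211:22ff:fe33:4455 would pass this check, while
+3ffe:2f00:a0::1 would not.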

2114

+

2115

+<sect1>ahesp6 patch

2116

+

2117

+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match

2118

+that allows you to match a packet based on its ah and esp headers' content.

2119

+The name of the matches:

2120

+<itemize>

2121

+<item>``ah'' : lets you match the IPv6 packet based on its ah header.

2122

+<item>``esp'' : lets you match the IPv6 packet based on its esp header.

2123

+</itemize>

2124

+

2125

+

2126

+For example, we will drop all the AH packets that have a SPI equal to

2127

+500, and check the contents of the reserved field in the header :

2128

+

2129

+<tscreen><verb>

2130

+# ip6tables -A INPUT -m ah --ahspi 500 --ahres -j DROP

2131

+

2132

+# ip6tables --list

2133

+Chain INPUT (policy ACCEPT)

2134

+target prot opt source destination

2135

+DROP all anywhere anywhere ah spi:500 reserved

2136

+</verb></tscreen>

2137

+

2138

+

2139

+Supported options for the ah match are :

2140

+

2141

+<descrip>

2142

+<tag>--ahspi [!] spi[:spi]</> -> match spi (range)

2143

+<tag>--ahlen [!] length</> -> length of this header

2144

+<tag>--ahres </> -> checks the contents of the reserved field

2145

+</descrip>

2146

+

2147

+

2148

+The esp match works exactly the same as in IPv4 :

2149

+

2150

+<tscreen><verb>

2151

+# ip6tables -A INPUT -m esp --espspi 500 -j DROP

2152

+

2153

+# ip6tables --list

2154

+Chain INPUT (policy ACCEPT)

2155

+target prot opt source destination

2156

+DROP all anywhere anywhere esp spi:500

2157

+</verb></tscreen>

2158

+

2159

+

2160

+Supported options for the esp match are :

2161

+

2162

+<descrip>

2163

+<tag>--espspi [!] spi[:spi]</> -> match spi (range)

2164

+</descrip>

2165

+

2166

+In IPv6 these matches can be concatenated:

2167

+

2168

+<tscreen><verb>

2169

+# ip6tables -A INPUT -m ah --ahspi 500 --ahres --ahlen ! 40 -m esp --espspi 500 -j DROP

2170

+

2171

+# ip6tables --list

2172

+Chain INPUT (policy ACCEPT)

2173

+target prot opt source destination

2174

+DROP all anywhere anywhere ah spi:500 length:!40 reserved esp spi:500

2175

+</verb></tscreen>

2176

+

2177

+<sect1>frag6 patch

2178

+

2179

+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match

2180

+that allows you to match a packet based on the content of its fragmentation

2181

+header.

2182

+The name of the match:

2183

+<itemize>

2184

+<item>``frag'' : lets you match the IPv6 packet based on its fragmentation

2185

+header.

2186

+</itemize>

2187

+

2188

+

2189

+For example, we will drop all the packets that have an ID between 100 and 200,

2190

+and that are the first fragment :

2191

+

2192

+<tscreen><verb>

2193

+# ip6tables -A INPUT -m frag --fragid 100:200 --fragfirst -j DROP

2194

+

2195

+# ip6tables --list

2196

+Chain INPUT (policy ACCEPT)

2197

+target prot opt source destination

2198

+DROP all anywhere anywhere frag ids:100:200 first

2199

+</verb></tscreen>

2200

+

2201

+

2202

+Supported options for the frag match are :

2203

+

2204

+<descrip>

2205

+<tag>--fragid [!] id[:id]</> -> match the id (range) of the fragmentation

2206

+<tag>--fraglen [!] length</> -> match total length of this header

2207

+<tag>--fragres</> -> checks the contents of the reserved field

2208

+<tag>--fragfirst</> -> matches on the first fragment

2209

+<tag>--fragmore</> -> there are more fragments

2210

+<tag>--fraglast</> -> this is the last fragment

2211

+</descrip>

2212

+

2213

+<sect1>ipv6header patch

2214

+

2215

+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match

2216

+that allows you to match a packet based on its extension headers.

2217

+The name of the match:

2218

+<itemize>

2219

+<item>``ipv6header'' : lets you match the IPv6 packet based on its headers.

2220

+</itemize>

2221

+

2222

+

2223

+For example, let's drop the packets which have got hop-by-hop, ipv6-route

2224

+headers and a protocol payload:

2225

+

2226

+<tscreen><verb>

2227

+# ip6tables -A INPUT -m ipv6header --header hop-by-hop,ipv6-route,protocol -j DROP

2228

+

2229

+# ip6tables --list

2230

+Chain INPUT (policy ACCEPT)

2231

+target prot opt source destination

2232

+DROP all anywhere anywhere ipv6header flags:hop-by-hop,ipv6-route,protocol

2233

+</verb></tscreen>

2234

+

2235

+

2236

+And now, let's drop the packets which have got an ipv6-route extension header:

2237

+

2238

+<tscreen><verb>

2239

+# ip6tables -A INPUT -m ipv6header --header ipv6-route --soft -j DROP

2240

+

2241

+# ip6tables --list

2242

+Chain INPUT (policy ACCEPT)

2243

+target prot opt source destination

2244

+DROP all anywhere anywhere ipv6header flags:ipv6-route soft

2245

+</verb></tscreen>

2246

+

2247

+

2248

+Supported options for the ipv6header match are :

2249

+<descrip>

2250

+<tag>[!] --header headers</> -> Specifies the headers of interest.

+Accepted formats:

2252

+<itemize>

2253

+<item>hop,dst,route,frag,auth,esp,none,proto

2254

+<item>hop-by-hop,ipv6-opts,ipv6-route,ipv6-frag,ah,esp,ipv6-nonxt,protocol

2255

+<item>0,60,43,44,51,50,59

2256

+</itemize>

2257

+<tag>--soft</> -> You can specify the soft mode: in this mode

2258

+the match checks the existence of the header, not the full match!

2259

+</descrip>
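
+The difference between the default mode and <tt>--soft</tt> can be sketched in
+Python. The alias table pairs the three accepted formats position by position,
+which is an assumption drawn from the lists above, and the reading of a "full
+match" as "exactly the listed headers" is also an assumption. The helper names
+are hypothetical.

```python
ALIASES = {
    "hop": "hop-by-hop", "0": "hop-by-hop",
    "dst": "ipv6-opts", "60": "ipv6-opts",
    "route": "ipv6-route", "43": "ipv6-route",
    "frag": "ipv6-frag", "44": "ipv6-frag",
    "auth": "ah", "51": "ah",
    "50": "esp",
    "none": "ipv6-nonxt", "59": "ipv6-nonxt",
    "proto": "protocol",
}


def parse_headers(spec: str) -> set:
    """Normalise a comma-separated --header list to the long names."""
    return {ALIASES.get(name, name) for name in spec.split(",")}


def header_match(present: set, spec: str, soft: bool = False) -> bool:
    """Default mode wants exactly the listed headers; soft mode only
    checks that every listed header is present."""
    wanted = parse_headers(spec)
    return wanted <= present if soft else wanted == present
```

+Under these assumptions, a packet carrying hop-by-hop and ipv6-route headers
+plus a protocol payload matches <tt>--header ipv6-route --soft</tt>, but not
+<tt>--header ipv6-route</tt> on its own.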

2260

+

2261

+<sect1>ipv6-ports patch

2262

+

2263

+This patch by Jan Rekorajski <baggins@pld.org.pl> adds 4 new matches :

2264

+

2265

+<itemize>

2266

+<item>``limit'' : lets you restrict the number of parallel TCP connections from a particular host or network.

2267

+<item>``mac'' : lets you match a packet based on its MAC address.

2268

+<item>``multiport'' : lets you specify ports with a mix of port-ranges and single ports for UDP and TCP protocols.

2269

+<item>``owner'' : lets you match a packet based on its originator process' owner id.

2270

+</itemize>

2271

+

2272

+

2273

+These matches are the ports of the IPv4 versions. See the main documentation for the details!

2274

+

2275

+<sect1>length patch

2276

+

2277

+This patch by Imran Patel <ipatel@crosswinds.net> adds a new match

2278

+that allows you to match a packet based on its length. (This patch is a shameless adaptation of the

2279

+IPv4 match written by James Morris <jmorris@intercode.com.au>)

2280

+

2281

+

2282

+For example, let's drop all the pings with a packet size of at least

2283

+85 bytes :

2284

+

2285

+<tscreen><verb>

2286

+# ip6tables -A INPUT -p ipv6-icmp --icmpv6-type echo-request -m length --length 85:0xffff -j DROP

2287

+

2288

+# ip6tables --list

2289

+Chain INPUT (policy ACCEPT)

2290

+target prot opt source destination

2291

+DROP ipv6-icmp -- anywhere anywhere ipv6-icmp echo-request length 85:65535

2292

+</verb></tscreen>

2293

+

2294

+

2295

+Supported options for the length match are :

2296

+

2297

+<descrip>

2298

+<tag>[!] --length length[:length]</> -> Match packet length

2299

+against value or range of values (inclusive)

2300

+</descrip>

2301

+

2302

+

2303

+Values of the range not present will be implied. The implied value for minimum

2304

+is 0, and for maximum is 65535.
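
+The range syntax and its implied bounds can be sketched in Python as follows.
+The helper names are hypothetical; the defaults of 0 and 65535 and the
+inclusive comparison are taken from the description above.

```python
def parse_length(spec: str) -> tuple:
    """Parse "min:max", filling in 0 and 65535 for a missing bound."""
    if ":" not in spec:
        value = int(spec, 0)
        return (value, value)
    low, high = spec.split(":", 1)
    return (int(low, 0) if low else 0, int(high, 0) if high else 65535)


def length_match(packet_length: int, spec: str, invert: bool = False) -> bool:
    """Inclusive range test, optionally negated as with `!`."""
    low, high = parse_length(spec)
    return (low <= packet_length <= high) != invert
```

+So <tt>85:0xffff</tt> and <tt>85:</tt> describe the same range, and because the
+bounds are inclusive, a packet of exactly 85 bytes is matched as well.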

2305

+

2306

+<sect1>route6 patch

2307

+

2308

+This patch by Andras Kis-Szabo <kisza@sch.bme.hu> adds a new match

2309

+that allows you to match a packet based on the content of its routing

2310

+header.

2311

+The name of the match:

2312

+<itemize>

2313

+<item>``rt'' : lets you match the IPv6 packet based on its routing

2314

+header.

2315

+</itemize>

2316

+

2317

+

2318

+For example, we will drop all the packets that have routing type 0, are near

+the last hop (at most 2 segments left), and whose routing path contains ::1 and ::2

+(but not as a strict list) :

2321

+

2322

+<tscreen><verb>

2323

+# ip6tables -A INPUT -m rt --rt-type 0 --rt-segsleft :2 --rt-0-addrs ::1,::2 --rt-0-not-strict -j DROP

2324

+

2325

+# ip6tables --list

2326

+Chain INPUT (policy ACCEPT)

2327

+target prot opt source destination

2328

+DROP all anywhere anywhere rt type:0 segslefts:0:2 0-addrs ::1,::2 0-not-strict

2329

+</verb></tscreen>

2330

+

2331

+

2332

+Supported options for the rt match are :

2333

+

2334

+<descrip>

2335

+<tag>--rt-type [!] type</> -> matches the type

2336

+<tag>--rt-segsleft [!] num[:num]</> -> matches the Segments Left field (range)

2337

+<tag>--rt-len [!] length</> -> total length of this header

2338

+<tag>--rt-0-res</> -> checks the contents of the reserved field

2339

+<tag>--rt-0-addrs ADDR[,ADDR...]</> -> Type=0 addresses (list, max: 16)

2340

+<tag>--rt-0-not-strict</> -> List of Type=0 addresses not a strict list

2341

+</descrip>

2342

+

2343

+<sect>New IPv6 netfilter targets

2344

+

2345

+In this section, we will attempt to explain the usage of new netfilter targets.

2346

+The patches will appear in alphabetical order. Additionally, we will not explain

2347

+patches that break other patches. But this might come later.

2348

+

2349

+

2350

+Generally speaking, for targets, you can get the help hints from a particular

2351

+module by typing :

2352

+

2353

+<tscreen>

2354

+<verb>

2355

+# ip6tables -j THE_TARGET_YOU_WANT --help

2356

+</verb>

2357

+</tscreen>

2358

+

2359

+

2360

+This would display the normal iptables help message, plus the specific

2361

+``THE_TARGET_YOU_WANT'' target help message at the end.

2362

+

2363

+<sect1>LOG patch

2364

+

2365

+This patch by Jan Rekorajski <baggins@pld.org.pl> adds a new target that allows you

2366

+to LOG the packets as in the IPv4 version of iptables.

2367

+

2368

+

2369

+The examples are the same as in iptables. See the man page for details!

2370

+

2371

+<sect1>REJECT patch

2372

+

2373

+This patch by Harald Welte <laforge@gnumonks.org> adds a new target that allows you

2374

+to REJECT the packets as in the IPv4 version of iptables.

2375

+

2376

+

2377

+The examples are the same as in iptables. See the man page for details!

2378

+

2379

+<sect>New IPv6 connection tracking patches

2380

+

2381

+Connection tracking is not supported yet.

2382

+

2383

+<sect>Contributing

2384

+

2385

+<sect1>Contributing a new extension

2386

+

2387

+The netfilter core team always welcomes new extensions/bug-fixes. In this section we will not focus

2388

+on how to package a new extension to ease its inclusion into patch-o-matic yet. But this might

2389

+come in a future version of this HOWTO.

2390

+

2391

+

2392

+First of all, you should be familiar with the

2393

+<url url="http://www.netfilter.org/documentation/HOWTO/netfilter-hacking-HOWTO.html" name="Netfilter Hacking HOWTO">.

2394

+

2395

+

2396

+Rusty has already written a guideline on how to make new patches for netfilter,

2397

+it's in :

2398

+

2399

+<tscreen><verb>

2400

+/path/to/netfiltercvs/netfilter/patch-o-matic/NEWPATCHES

2401

+</verb></tscreen>

2402

+

2403

+

2404

+Or read the latest version online at :

2405

+<url url="http://cvs.netfilter.org/cgi-bin/cvsweb/netfilter/patch-o-matic/NEWPATCHES" name="NEWPATCHES">.

2406

+

2407

+

2408

+Finally, it's a good idea to subscribe to the netfilter-devel mailing list.

2409

+More info on how to subscribe can be found on the netfilter homepage.

2410

+

2411

+<sect1>Contributing to this HOWTO

2412

+

2413

+You are most welcome to update this HOWTO. To do so, the preferred way

2414

+is to send a patch of the SGML master of this document to the

2415

+netfilter-devel mailing list.

2416

+

2417

+

2418

+Thanks for your help! Thanks to the developers who contributed the

2419

+netfilter-extensions-HOWTO parts related to their patches.

2420

+</article>

2421

Index: iptables-1.4.12/howtos/netfilter-hacking-HOWTO.sgml

2422

===================================================================

2423

--- /dev/null 1970-01-01 00:00:00.000000000 +0000

2424

+++ iptables-1.4.12/howtos/netfilter-hacking-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600

2425

@@ -0,0 +1,1978 @@

2426

+<!doctype linuxdoc system>

2427

+

2428

+<!-- This is the Linux Netfilter Hacking HOWTO.

2429

+ -->

2430

+

2431

+

2432

+

2433

+<article>

2434

+

2435

+

2436

+

2437

+<title>Linux netfilter Hacking HOWTO

2438

+<author>Rusty Russell and Harald Welte, mailing list <tt>netfilter@lists.samba.org</tt>

2439

+<date>$Revision: 1.14 $ $Date: 2002/07/02 04:07:19 $

2440

+<abstract>

2441

+This document describes the netfilter architecture for Linux, how to

2442

+hack it, and some of the major systems which sit on top of it, such as

2443

+packet filtering, connection tracking and Network Address Translation.

2444

+</abstract>

2445

+

2446

+

2447

+<toc>

2448

+

2449

+

2450

+

2451

+<sect>Introduction<label id="intro">

2452

+

2453

+

2454

+Hi guys.

2455

+

2456

+

2457

+This document is a journey; some parts are well-traveled, and in

2458

+other areas you will find yourself almost alone. The best advice I

2459

+can give you is to grab a large, cozy mug of coffee or hot chocolate,

2460

+get into a comfortable chair, and absorb the contents before venturing

2461

+out into the sometimes dangerous world of network hacking.

2462

+

2463

+For more understanding of the use of the infrastructure on top of

2464

+the netfilter framework, I recommend reading the Packet Filtering

2465

+HOWTO and the NAT HOWTO. For information on kernel programming I

2466

+suggest Rusty's Unreliable Guide to Kernel Hacking and Rusty's

2467

+Unreliable Guide to Kernel Locking.

2468

+

2469

2470

+

2471

+<sect1>What is netfilter?

2472

+

2473

+

2474

+netfilter is a framework for packet mangling, outside the normal

2475

+Berkeley socket interface. It has four parts. Firstly, each protocol

2476

+defines "hooks" (IPv4 defines 5) which are well-defined points in a

2477

+packet's traversal of that protocol stack. At each of these points,

2478

+the protocol will call the netfilter framework with the packet and the

2479

+hook number.

2480

+

2481

+

2482

+Secondly, parts of the kernel can register to listen to the different

2483

+hooks for each protocol. So when a packet is passed to the netfilter

2484

+framework, it checks to see if anyone has registered for that protocol

2485

+and hook; if so, they each get a chance to examine (and possibly

2486

+alter) the packet in order, then discard the packet

2487

+(<tt>NF_DROP</tt>), allow it to pass (<tt>NF_ACCEPT</tt>), tell

2488

+netfilter to forget about the packet (<tt>NF_STOLEN</tt>), or ask

2489

+netfilter to queue the packet for userspace (<tt>NF_QUEUE</tt>).

2490

+

2491

+

2492

+The third part is that packets that have been queued are collected (by

2493

+the ip_queue driver) for sending to userspace; these packets are

2494

+handled asynchronously.

2495

+

2496

+

2497

+The final part consists of cool comments in the code and

2498

+documentation. This is instrumental for any experimental project.

2499

+The netfilter motto is (stolen shamelessly from Cort Dougan):

2500

+

2501

+<tscreen><verb>

2502

+ ``So... how is this better than KDE?''

2503

+</verb></tscreen>

2504

+

2505

+(This motto narrowly edged out `Whip me, beat me, make me use

2506

+ipchains').

2507

+

2508

+

2509

+In addition to this raw framework, various modules have been written

2510

+which provide functionality similar to previous (pre-netfilter)

2511

+kernels, in particular, an extensible NAT system, and an extensible

2512

+packet filtering system (iptables).

2513

+

2514

+<sect1>What's wrong with what we had in 2.0 and 2.2?

2515

+

2516

+

2517

+<enum>

2518

+<item>No infrastructure established for passing packets to userspace:

2519

+<itemize>

2520

+<item>Kernel coding is hard

2521

+<item>Kernel coding must be done in C/C++

2522

+<item>Dynamic filtering policies do not belong in kernel

2523

+<item> 2.2 introduced copying packets to userspace via netlink, but

2524

+ reinjecting packets is slow, and subject to `sanity' checks.

2525

+ For example, reinjecting a packet claiming to come from an

2526

+ existing interface is not possible.

2527

+</itemize>

2528

+

2529

+<item>Transparent proxying is a crock:

2530

+

2531

+<itemize>

2532

+

2533

+<item> We look up <bf>every</bf> packet to see if there is a socket

2534

+bound to that address

2535

+

2536

+<item> Root is allowed to bind to foreign addresses

2537

+

2538

+<item> Can't redirect locally-generated packets

2539

+

2540

+<item> REDIRECT doesn't handle UDP replies: redirecting UDP named

2541

+packets to 1153 doesn't work because some clients don't like replies

2542

+coming from anything other than port 53.

2543

+

2544

+<item> REDIRECT doesn't coordinate with tcp/udp port allocation: a

2545

+user may get a port shadowed by a REDIRECT rule.

2546

+

2547

+<item>Has been broken at least twice during 2.1 series.

2548

+

2549

+<item>Code is extremely intrusive. Consider the stats on the number

2550

+of #ifdef CONFIG_IP_TRANSPARENT_PROXY in 2.2.1: 34 occurrences in 11

2551

+files. Compare this with CONFIG_IP_FIREWALL, which has 10 occurrences

2552

+in 5 files.

2553

+</itemize>

2554

+

2555

+<item>Creating packet filter rules independent of interface addresses

2556

+ is not possible:

2557

+

2558

+<itemize>

2559

+<item>Must know local interface addresses to distinguish

2560

+locally-generated or locally-terminating packets from through

2561

+packets.

2562

+

2563

+<item>Even that is not enough in cases of redirection or

2564

+masquerading.

2565

+

2566

+<item>Forward chain only has information on outgoing interface,

2567

+meaning you have to figure where a packet came from using knowledge of

2568

+the network topology.

2569

+</itemize>

2570

+

2571

+<item>Masquerading is tacked onto packet filtering:

2572

+ Interactions between packet filtering and masquerading make firewalling

2573

+ complex:

2574

+<itemize>

2575

+<item>At input filtering, reply packets appear to be destined for box itself

2576

+<item>At forward filtering, demasqueraded packets are not seen at all

2577

+<item>At output filtering, packets appear to come from local box

2578

+</itemize>

2579

+

2580

+<item>TOS manipulation, redirect, ICMP unreachable and mark (which can

2581

+affect port forwarding, routing, and QoS) are tacked onto packet

2582

+filter code as well.

2583

+

2584

+<item>ipchains code is neither modular, nor extensible (eg. MAC

2585

+address filtering, options filtering, etc).

2586

+

2587

+<item>Lack of sufficient infrastructure has led to a profusion of

2588

+different techniques:

2589

+<itemize>

2590

+<item>Masquerading, plus per-protocol modules

2591

+<item>Fast static NAT by routing code (doesn't have per-protocol handling)

2592

+<item>Port forwarding, redirect, auto forwarding

2593

+<item>The Linux NAT and Virtual Server Projects.

2594

+</itemize>

2595

+

2596

+<item>Incompatibility between CONFIG_NET_FASTROUTE and packet filtering:

2597

+<itemize>

2598

+<item>Forwarded packets traverse three chains anyway

2599

+<item>No way to tell if these chains can be bypassed

2600

+</itemize>

2601

+

2602

+<item>Inspection of packets dropped due to routing protection

2603

+(eg. Source Address Verification) not possible.

2604

+

2605

+<item>No way of atomically reading counters on packet filter rules.

2606

+

2607

+<item>CONFIG_IP_ALWAYS_DEFRAG is a compile-time option, making life

2608

+difficult for distributions who want one general-purpose kernel.

2609

+

2610

+</enum>

2611

+

2612

+<sect1>Who are you?

2613

+

2614

+

2615

+I'm the only one foolish enough to do this. As ipchains co-author and

2616

+current Linux Kernel IP Firewall maintainer, I see many of the

2617

+problems that people have with the current system, as well as getting

2618

+exposure to what they are trying to do.

2619

+

2620

+<sect1>Why does it crash?

2621

+

2622

+

2623

+Woah! You should have seen it <bf>last</bf> week!

2624

+

2625

+

2626

+Because I'm not as great a programmer as we might all wish, and I

2627

+certainly haven't tested all scenarios, because of lack of time,

2628

+equipment and/or inspiration. I do have a testsuite, which I

2629

+encourage you to contribute to.

2630

+

2631

+<sect>Where Can I Get The Latest?

2632

+

2633

+There is a CVS server on netfilter.org which contains the latest

2634

+HOWTOs, userspace tools and testsuite. For casual browsing, you

2635

+can use the

2636

+<url url="http://cvs.netfilter.org/" name="Web Interface">.

2637

+

2638

+To grab the latest sources, you can do the following:

2639

+

2640

+<enum>

2641

+<item> Log in to the netfilter CVS server anonymously:

2642

+<tscreen><verb>

2643

+cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic login

2644

+</verb></tscreen>

2645

+<item> When it asks you for a password type `cvs'.

2646

+<item> Check out the code using:

2647

+<tscreen><verb>

2648

+# cvs -d :pserver:cvs@pserver.netfilter.org:/cvspublic co netfilter/userspace

2649

+</verb></tscreen>

2650

+<item> To update to the latest version, use

2651

+<tscreen><verb>

2652

+cvs update -d -P

2653

+</verb></tscreen>

2654

+</enum>

2655

+

2656

+<sect>Netfilter Architecture

2657

+

2658

+Netfilter is merely a series of hooks in various points in a

2659

+protocol stack (at this stage, IPv4, IPv6 and DECnet). The

2660

+(idealized) IPv4 traversal diagram looks like the following:

2661

+

2662

+<tscreen><verb>

2663

+A Packet Traversing the Netfilter System:

2664

+

2665

+ --->[1]--->[ROUTE]--->[3]--->[4]--->

2666

+ | ^

2667

+ | |

2668

+ | [ROUTE]

2669

+ v |

2670

+ [2] [5]

2671

+ | ^

2672

+ | |

2673

+ v |

2674

+</verb></tscreen><label id="netfilter-traversal">

2675

+

2676

+On the left is where packets come in: having passed the simple sanity

2677

+checks (i.e., not truncated, IP checksum OK, not a promiscuous receive),

2678

+they are passed to the netfilter framework's NF_IP_PRE_ROUTING [1] hook.

2679

+

2680

+

2681

+Next they enter the routing code, which decides whether the packet is

2682

+destined for another interface, or a local process. The routing code

2683

+may drop packets that are unroutable.

2684

+

2685

+

2686

+If it's destined for the box itself, the netfilter framework is called

2687

+again for the NF_IP_LOCAL_IN [2] hook, before being passed to the

2688

+process (if any).

2689

+

2690

+

2691

+If it's destined to pass to another interface instead, the netfilter

2692

+framework is called for the NF_IP_FORWARD [3] hook.

2693

+

2694

+

2695

+The packet then passes a final netfilter hook, the NF_IP_POST_ROUTING

2696

+[4] hook, before being put on the wire again.

2697

+

2698

+

2699

+The NF_IP_LOCAL_OUT [5] hook is called for packets that are created

2700

+locally. Here you can see that routing occurs after this hook is

2701

+called: in fact, the routing code is called first (to figure out the

2702

+source IP address and some IP options): if you want to alter the

2703

+routing, you must alter the `skb->dst' field yourself, as is done in

2704

+the NAT code.

2705

+

2706

+<sect1>Netfilter Base

2707

+

2708

+Now that we have an example of netfilter for IPv4, you can see when each

2709

+hook is activated. This is the essence of netfilter.

2710

+

2711

+

2712

+Kernel modules can register to listen at any of these hooks. A module

2713

+that registers a function must specify the priority of the function

2714

+within the hook; then when that netfilter hook is called from the core

2715

+networking code, each module registered at that point is called in the

2716

+order of priorities, and is free to manipulate the packet. The

2717

+module can then tell netfilter to do one of five things:

2718

+

2719

+<enum>

2720

+<item> NF_ACCEPT: continue traversal as normal.

2721

+<item> NF_DROP: drop the packet; don't continue traversal.

2722

+<item> NF_STOLEN: I've taken over the packet; don't continue traversal.

2723

+<item> NF_QUEUE: queue the packet (usually for userspace handling).

2724

+<item> NF_REPEAT: call this hook again.

2725

+</enum>
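
The dispatch loop described above can be mimicked in plain userspace C. This is a sketch under stated assumptions, not kernel code: `struct packet`, `struct hook_list`, `register_hook`, `run_hooks` and the two example hooks are hypothetical names, the numeric verdict values are chosen for this sketch, and the sketch assumes that lower priority numbers run first.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict names follow the list above; numeric values are an assumption. */
enum verdict { NF_DROP, NF_ACCEPT, NF_STOLEN, NF_QUEUE, NF_REPEAT };

struct packet { int mark; };                /* stand-in for an skb */

typedef enum verdict (*hook_fn)(struct packet *pkt);

struct hook_entry {
    hook_fn fn;
    int priority;                           /* lower runs first (assumed) */
};

#define MAX_HOOKS 8

struct hook_list {
    struct hook_entry entries[MAX_HOOKS];
    size_t count;
};

/* Insert a function, keeping the list sorted by ascending priority. */
int register_hook(struct hook_list *l, hook_fn fn, int priority)
{
    size_t i;

    assert(l != NULL && fn != NULL);
    if (l->count == MAX_HOOKS)
        return -1;
    i = l->count;
    while (i > 0 && l->entries[i - 1].priority > priority) {
        l->entries[i] = l->entries[i - 1];
        i--;
    }
    l->entries[i].fn = fn;
    l->entries[i].priority = priority;
    l->count++;
    return 0;
}

/* Call each registered function in order until one ends the traversal. */
enum verdict run_hooks(struct hook_list *l, struct packet *pkt)
{
    size_t i = 0;

    while (i < l->count) {
        enum verdict v = l->entries[i].fn(pkt);

        if (v == NF_REPEAT)
            continue;                       /* call the same function again */
        if (v != NF_ACCEPT)
            return v;                       /* DROP, STOLEN or QUEUE: stop */
        i++;
    }
    return NF_ACCEPT;
}

/* Example hook: tags the packet and lets it continue. */
enum verdict set_mark(struct packet *pkt)
{
    pkt->mark = 1;
    return NF_ACCEPT;
}

/* Example hook: drops any packet that an earlier hook tagged. */
enum verdict drop_marked(struct packet *pkt)
{
    return pkt->mark ? NF_DROP : NF_ACCEPT;
}
```

Registering `drop_marked` at priority 10 and `set_mark` at priority -5 makes `set_mark` run first, so the packet is dropped. A function that always returns NF_REPEAT would loop forever in this sketch.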

2726

+

2727

+

2728

+The other parts of netfilter (handling queued packets, cool comments)

2729

+will be covered in the kernel section later.

2730

+

2731

+

2732

+Upon this foundation, we can build fairly complex packet

2733

+manipulations, as shown in the next two sections.

2734

+

2735

+<sect1>Packet Selection: IP Tables

2736

+

2737

+A packet selection system called IP Tables has been built over the

2738

+netfilter framework. It is a direct descendant of ipchains (that came

2739

+from ipfwadm, that came from BSD's ipfw IIRC), with extensibility.

2740

+Kernel modules can register a new table, and ask for a packet to

2741

+traverse a given table. This packet selection method is used for

2742

+packet filtering (the `filter' table), Network Address Translation

2743

+(the `nat' table) and general pre-route packet mangling (the `mangle'

2744

+table).

2745

+

2746

+The hooks that are registered with netfilter are as follows (with

2747

+the functions in each hook in the order that they are actually

2748

+called):

2749

+

2750

+<tscreen><verb>

2751

+

2752

+ --->PRE------>[ROUTE]--->FWD---------->POST------>

2753

+ Conntrack | Mangle ^ Mangle

2754

+ Mangle | Filter | NAT (Src)

2755

+ NAT (Dst) | | Conntrack

2756

+ (QDisc) | [ROUTE]

2757

+ v |

2758

+ IN Filter OUT Conntrack

2759

+ | Conntrack ^ Mangle

2760

+ | Mangle | NAT (Dst)

2761

+ v | Filter

2762

+</verb></tscreen>

2763

+

2764

+<sect2>Packet Filtering

2765

+

2766

+

2767

+This table, `filter', should never alter packets: only filter them.

2768

+

2769

+

2770

+One of the advantages of iptables filter over ipchains is that it is

2771

+small and fast, and it hooks into netfilter at the NF_IP_LOCAL_IN,

2772

+NF_IP_FORWARD and NF_IP_LOCAL_OUT points. This means that for any

2773

+given packet, there is one (and only one) possible place to filter it.

2774

+This makes things much simpler for users than ipchains was. Also, the

2775

+fact that the netfilter framework provides both the input and output

2776

+interfaces for the NF_IP_FORWARD hook means that many kinds of

2777

+filtering are far simpler.

2778

+

2779

+

2780

+Note: I have ported the kernel portions of both ipchains and ipfwadm

2781

+as modules on top of netfilter, enabling the use of the old ipfwadm

2782

+and ipchains userspace tools without requiring an upgrade.

2783

+

2784

+<sect2>NAT

2785

+

2786

+

2787

+This is the realm of the `nat' table, which is fed packets from two

2788

+netfilter hooks: for non-local packets, the NF_IP_PRE_ROUTING and

2789

+NF_IP_POST_ROUTING hooks are perfect for destination and source

2790

+alterations respectively. If CONFIG_IP_NF_NAT_LOCAL is defined, the

2791

+hooks NF_IP_LOCAL_OUT and NF_IP_LOCAL_IN are used for altering the

2792

+destination of local packets.

2793

+

2794

+

2795

+This table is slightly different from the `filter' table, in that only

2796

+the first packet of a new connection will traverse the table: the

2797

+result of this traversal is then applied to all future packets in the

2798

+same connection.

2799

+

2800

+<sect3>Masquerading, Port Forwarding, Transparent Proxying

2801

+

2802

+I divide NAT into Source NAT (where the first packet has its source

2803

+altered), and Destination NAT (the first packet has its destination

2804

+altered).

2805

+

2806

+Masquerading is a special form of Source NAT: port forwarding and

2807

+transparent proxying are special forms of Destination NAT. These are

2808

+now all done using the NAT framework, rather than being independent

2809

+entities.

2810

+

2811

+<sect2>Packet Mangling

2812

+

2813

+The packet mangling table (the `mangle' table) is used for actual

2814

+changing of packet information. Example applications are the TOS and

2815

+TCPMSS targets. The mangle table hooks into all five netfilter hooks.

2816

+(Please note that this changed with kernel 2.4.18: previous kernels didn't

2817

+have mangle attached to all hooks.)

2818

+

2819

+<sect1>Connection Tracking

2820

+

2821

+Connection tracking is fundamental to NAT, but it is implemented as a

2822

+separate module; this allows an extension to the packet filtering code

2823

+to simply and cleanly use connection tracking (the `state' module).

2824

+

2825

+<sect1>Other Additions

2826

+

2827

+The new flexibility provides both the opportunity to do really

2828

+funky things, and for people to write enhancements or complete

2829

+replacements that can be mixed and matched.

2830

+

2831

+<sect>Information for Programmers

2832

+

2833

+I'll let you in on a secret: my pet hamster did all the coding. I

2834

+was just a channel, a `front' if you will, in my pet's grand plan.

2835

+So, don't blame me if there are bugs. Blame the cute, furry one.

2836

+

2837

+<sect1>Understanding ip_tables

2838

+

2839

+iptables simply provides a named array of rules in memory (hence

2840

+the name `iptables'), and such information as where packets from each

2841

+hook should begin traversal. After a table is registered, userspace

2842

+can read and replace its contents using getsockopt() and setsockopt().

2843

+

2844

+iptables does not register with any netfilter hooks: it relies on

2845

+other modules to do that and feed it the packets as appropriate; a

2846

+module must register the netfilter hooks and ip_tables separately, and

2847

+provide the mechanism to call ip_tables when the hook is reached.

2848

+

2849

+<sect2> ip_tables Data Structures

2850

+

2851

+For convenience, the same data structure is used to represent a

2852

+rule by userspace and within the kernel, although a few fields are

2853

+only used inside the kernel.

2854

+

2855

+Each rule consists of the following parts:

2856

+<enum>

2857

+<item> A `struct ipt_entry'.

2858

+<item> Zero or more `struct ipt_entry_match' structures, each with a

2859

+ variable amount (0 or more bytes) of data appended to it.

2860

+<item> A `struct ipt_entry_target' structure, with a variable amount

2861

+ (0 or more bytes) of data appended to it.

2862

+</enum>

2863

+

2864

+The variable nature of the rule gives a huge amount of flexibility for

2865

+extensions, as we'll see, especially as each match or target can carry

2866

+an arbitrary amount of data. This does create a few traps, however:

2867

+we have to watch out for alignment. We do this by ensuring that the

2868

+`ipt_entry', `ipt_entry_match' and `ipt_entry_target' structures are

2869

+conveniently sized, and that all data is rounded up to the maximal

2870

+alignment of the machine using the IPT_ALIGN() macro.

2871

+

2872

+

2873

+The `struct ipt_entry' has the following fields:

2874

+<enum>

2875

+<item> A `struct ipt_ip' part, containing the specifications for the

2876

+IP header that it is to match.

2877

+

2878

+<item> An `nf_cache' bitfield showing what parts of the packet this

2879

+rule examined.

2880

+

2881

+<item> A `target_offset' field indicating the offset from the

2882

+beginning of this rule where the ipt_entry_target structure begins.

2883

+This should always be aligned correctly (with the IPT_ALIGN macro).

2884

+

2885

+<item> A `next_offset' field indicating the total size of this rule,

2886

+including the matches and target. This should also be aligned

2887

+correctly using the IPT_ALIGN macro.

2888

+

2889

+<item> A `comefrom' field used by the kernel to track packet

2890

+traversal.

2891

+

2892

+<item> A `struct ipt_counters' field containing the packet and byte

2893

+counters for packets which matched this rule.

2894

+</enum>
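
To make the two offset fields concrete, here is a small userspace sketch of how they could be computed for one rule. It is an assumption-laden illustration: `align_up` stands in for the rounding that IPT_ALIGN() performs, `layout_rule` and `struct rule_layout` are hypothetical, and the sizes passed in are each structure's total size (header plus appended data).

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two).
 * Hypothetical stand-in for the rounding done by IPT_ALIGN(). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* The two offsets stored in each rule, as described above. */
struct rule_layout {
    size_t target_offset;   /* where the target structure begins */
    size_t next_offset;     /* total size of the rule */
};

/* Lay out one rule: entry header, then each match, then the target. */
struct rule_layout layout_rule(size_t entry_size,
                               const size_t *match_sizes, size_t n_matches,
                               size_t target_size, size_t align)
{
    struct rule_layout out;
    size_t off = align_up(entry_size, align);
    size_t i;

    for (i = 0; i < n_matches; i++)
        off += align_up(match_sizes[i], align);

    out.target_offset = off;
    out.next_offset = off + align_up(target_size, align);
    return out;
}
```

For a 112-byte entry header, matches of 13 and 8 bytes, a 36-byte target and 8-byte alignment, this gives a target offset of 136 and a rule size of 176.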

2895

+

2896

+

2897

+The `struct ipt_entry_match' and `struct ipt_entry_target' are very

2898

+similar, in that they contain a total (IPT_ALIGN'ed) length field

2899

+(`match_size' and `target_size' respectively) and a union holding the

2900

+name of the match or target (for userspace), and a pointer (for the

2901

+kernel).

2902

+

2903

+

2904

+Because of the tricky nature of the rule data structure, some helper

2905

+routines are provided:

2906

+

2907

+<descrip>

2908

+<tag>ipt_get_target()</tag> This inline function returns a pointer to

2909

+the target of a rule.

2910

+

2911

+<tag>IPT_MATCH_ITERATE()</tag> This macro calls the given function for

2912

+every match in the given rule. The function's first argument is the

2913

+`struct ipt_entry_match', and other arguments (if any) are those

2914

+supplied to the IPT_MATCH_ITERATE() macro. The function must return

2915

+either zero for the iteration to continue, or a non-zero value to

2916

+stop.

2917

+

2918

+<tag>IPT_ENTRY_ITERATE()</tag> This macro takes a pointer to an

2919

+entry, the total size of the table of entries, and a function to call.

2920

+The function's first argument is the `struct ipt_entry', and other

2921

+arguments (if any) are those supplied to the IPT_ENTRY_ITERATE()

2922

+macro. The function must return either zero for the iteration to

2923

+continue, or a non-zero value to stop.

2924

+</descrip>
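
The iteration contract of these helpers (walk variable-sized records, call a function on each, stop on a non-zero return) can be sketched in userspace as follows. The real helpers are macros operating on `struct ipt_entry'; everything here (`struct sketch_entry`, `iterate_entries`, `append_entry` and the callbacks) is hypothetical and only mirrors the idea of stepping by a `next_offset' field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record header: each record states its own total size. */
struct sketch_entry {
    unsigned short next_offset;   /* size of this record in bytes */
    int id;
};

typedef int (*entry_fn)(const struct sketch_entry *e, void *arg);

/* Walk the blob; returns 0 at the end, the callback's non-zero value if it
 * stopped the walk, or -1 if a record's size field is inconsistent. */
int iterate_entries(const unsigned char *blob, size_t size,
                    entry_fn fn, void *arg)
{
    size_t off = 0;

    assert(blob != NULL && fn != NULL);
    while (off + sizeof(struct sketch_entry) <= size) {
        struct sketch_entry e;
        int ret;

        memcpy(&e, blob + off, sizeof(e));   /* avoids unaligned reads */
        if (e.next_offset < sizeof(e) || off + e.next_offset > size)
            return -1;
        ret = fn(&e, arg);
        if (ret != 0)
            return ret;
        off += e.next_offset;
    }
    return 0;
}

/* Append a record with `payload` extra bytes; returns the new end offset. */
size_t append_entry(unsigned char *blob, size_t off, int id,
                    unsigned short payload)
{
    struct sketch_entry e;

    memset(&e, 0, sizeof(e));
    e.next_offset = (unsigned short)(sizeof(e) + payload);
    e.id = id;
    memcpy(blob + off, &e, sizeof(e));
    memset(blob + off + sizeof(e), 0, payload);
    return off + e.next_offset;
}

/* Callback: count every record, never stop. */
int count_entry(const struct sketch_entry *e, void *arg)
{
    (void)e;
    ++*(int *)arg;
    return 0;
}

/* Callback: stop (return 1) at the record whose id equals *arg. */
int stop_at_id(const struct sketch_entry *e, void *arg)
{
    return e->id == *(int *)arg;
}
```

Unlike the real macros, this sketch copies each header before handing it to the callback, which keeps the example safe on machines with strict alignment rules.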

2925

+

2926

+<sect2>ip_tables From Userspace

2927

+

2928

+Userspace has four operations: it can read the current table, read

2929

+the info (hook positions and size of table), replace the table (and

2930

+grab the old counters), and add in new counters.

2931

+

2932

+This allows any atomic operation to be simulated by userspace: this

2933

+is done by the libiptc library, which provides convenience

2934

+"add/delete/replace" semantics for programs.

2935

+

2936

+Because these tables are transferred into kernel space, alignment

2937

+becomes an issue for machines which have different userspace and

2938

+kernelspace type rules (eg. Sparc64 with 32-bit userland). These

2939

+cases are handled by overriding the definition of IPT_ALIGN for these

2940

+platforms in `libiptc.h'.

2941

+

2942

+<sect2> ip_tables Use And Traversal

2943

+

2944

+The kernel starts traversing at the location indicated by the

2945

+particular hook. That rule is examined: if the `struct ipt_ip'

2946

+elements match, each `struct ipt_entry_match' is checked in turn (the

2947

+match function associated with that match is called). If the match

2948

+function returns 0, iteration stops on that rule. If it sets the

2949

+`hotdrop' parameter to 1, the packet will also be immediately dropped

2950

+(this is used for some suspicious packets, such as in the tcp match

2951

+function).

2952

+

2953

+If the iteration continues to the end, the counters are

2954

+incremented, the `struct ipt_entry_target' is examined: if it's a

2955

+standard target, the `verdict' field is read (negative means a packet

2956

+verdict, positive means an offset to jump to). If the answer is

2957

+positive and the offset is not that of the next rule, the `back'

2958

+variable is set, and the previous `back' value is placed in that

2959

+rule's `comefrom' field.

2960

+

2961

+For non-standard targets, the target function is called: it returns

2962

+a verdict (non-standard targets can't jump, as this would break the

2963

+static loop-detection code). The verdict can be IPT_CONTINUE, to

2964

+continue on to the next rule.
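
The sign convention of the standard target's `verdict' field can be illustrated with a short sketch. The text above only states that negative values are packet verdicts and positive values are jump offsets; the exact encoding used here (a netfilter verdict `v` stored as `-v - 1`) is an assumption of this sketch, as are the names `decoded_verdict`, `encode_packet_verdict` and `decode_standard_verdict`.

```c
#include <assert.h>

/* Result of interpreting a standard target's `verdict' field. */
struct decoded_verdict {
    int is_jump;          /* non-zero: value is an offset to jump to */
    unsigned int value;   /* jump offset, or netfilter verdict number */
};

/* Assumed encoding for this sketch: verdict v is stored as -v - 1,
 * so every packet verdict becomes a negative number. */
int encode_packet_verdict(unsigned int nf_verdict)
{
    assert(nf_verdict < 0x7fffffffu);
    return -(int)nf_verdict - 1;
}

/* Negative: packet verdict. Zero or positive: offset into the table. */
struct decoded_verdict decode_standard_verdict(int verdict)
{
    struct decoded_verdict d;

    if (verdict < 0) {
        d.is_jump = 0;
        d.value = (unsigned int)(-(verdict + 1));
    } else {
        d.is_jump = 1;
        d.value = (unsigned int)verdict;
    }
    return d;
}
```

Under this assumed encoding, a stored value of -1 decodes to verdict number 0, and a stored value of 240 decodes to a jump to offset 240.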

2965

+

2966

+<sect1>Extending iptables

2967

+

2968

+Because I'm lazy, <tt>iptables</tt> is fairly extensible. This is

2969

+basically a scam to palm off work onto other people, which is what

2970

+Open Source is all about (cf. Free Software, which as RMS would say,

2971

+is about freedom, and I was sitting in one of his talks when I wrote

2972

+this).

2973

+

2974

+Extending <tt>iptables</tt> potentially involves two parts:

2975

+extending the kernel, by writing a new module, and possibly extending

2976

+the userspace program <tt>iptables</tt>, by writing a new shared

2977

+library.

2978

+

2979

+<sect2>The Kernel

2980

+

2981

+Writing a kernel module itself is fairly simple, as you can see

2982

+from the examples. One thing to be aware of is that your code must be

2983

+re-entrant: there can be one packet coming in from userspace, while

2984

+another arrives on an interrupt. In fact in SMP there can be one

2985

+packet on an interrupt per CPU in 2.3.4 and above.

2986

+

2987

+

2988

+The functions you need to know about are:

2989

+

2990

+<descrip>

2991

+<tag>init_module()</tag> This is the entry-point of the module. It

2992

+returns a negative error number, or 0 if it successfully registers

2993

+itself with netfilter.

2994

+

2995

+<tag>cleanup_module()</tag> This is the exit point of the module; it

2996

+should unregister itself with netfilter.

2997

+

2998

+<tag>ipt_register_match()</tag> This is used to register a new match

2999

+type. You hand it a `struct ipt_match', which is usually declared as

3000

+a static (file-scope) variable.

3001

+

3002

+<tag>ipt_register_target()</tag> This is used to register a new target

3003

+type. You hand it a `struct ipt_target', which is usually declared as

3004

+a static (file-scope) variable.

3005

+

3006

+<tag>ipt_unregister_target()</tag> Used to unregister your target.

3007

+

3008

+<tag>ipt_unregister_match()</tag> Used to unregister your match.

3009

+</descrip>

3010

+

3011

+One warning about doing tricky things (such as providing counters)

3012

+in the extra space in your new match or target. On SMP machines, the

3013

+entire table is duplicated using memcpy for each CPU: if you really

3014

+want to keep central information, you should see the method used in

3015

+the `limit' match.

3016

+

3017

+<sect3>New Match Functions

3018

+

3019

+New match functions are usually written as a standalone module.

3020

+It's possible to have these modules extensible in turn, although it's

3021

+usually not necessary. One way would be to use the netfilter

3022

+framework's `nf_register_sockopt' function to allow users to talk to

3023

+your module directly. Another way would be to export symbols for

3024

+other modules to register themselves, the same way netfilter and

3025

+ip_tables do.

3026

+

3027

+The core of your new match function is the struct ipt_match which

3028

+it passes to `ipt_register_match()'. This structure has the following

3029

+fields:

3030

+

3031

+<descrip>

3032

+<tag>list</tag> This field is set to any junk, say `{ NULL, NULL }'.

3033

+

3034

+<tag>name</tag> This field is the name of the match function, as

3035

+referred to by userspace. The name should match the name of the

3036

+module (i.e., if the name is "mac", the module must be "ipt_mac.o") for

3037

+auto-loading to work.

3038

+

3039

+<tag>match</tag> This field is a pointer to a match function, which

3040

+takes the skb, the in and out device pointers (one of which may be

3041

+NULL, depending on the hook), a pointer to the match data in the rule

3042

+that is worked on (the structure that was prepared in userspace), the

3043

+IP offset (non-zero means

3044

+a non-head fragment), a pointer to the protocol header (i.e., just

3045

+past the IP header), the length of the data (ie. the packet length

3046

+minus the IP header length) and finally a pointer to a `hotdrop'

3047

+variable. It should return non-zero if the packet matches, and can

3048

+set `hotdrop' to 1 if it returns 0, to indicate that the packet must

3049

+be dropped immediately.

3050

+

3051

+<tag>checkentry</tag> This field is a pointer to a function which

3052

+checks the specifications for a rule; if this returns 0, then the rule

3053

+will not be accepted from the user. For example, the "tcp" match type

3054

+will only accept tcp packets, and so if the `struct ipt_ip' part of

3055

+the rule does not specify that the protocol must be tcp, a zero is

3056

+returned. The tablename argument allows your match to control what

3057

+tables it can be used in, and the `hook_mask' is a bitmask of hooks

3058

+this rule may be called from: if your match does not make sense from

3059

+some netfilter hooks, you can avoid that here.

3060

+

3061

+<tag>destroy</tag> This field is a pointer to a function which is

3062

+called when an entry using this match is deleted. This allows you to

3063

+dynamically allocate resources in checkentry and clean them up here.

3064

+

3065

+<tag>me</tag> This field is set to `THIS_MODULE', which gives a

3066

+pointer to your module. It causes the usage-count to go up and down

3067

+as rules of that type are created and destroyed. This prevents a user

3068

+removing the module (and hence cleanup_module() being called) if a

3069

+rule refers to it.

3070

+</descrip>
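To make the `match' contract concrete, here is a minimal userspace sketch. Everything prefixed `demo_' is hypothetical: the types are simplified stand-ins, the real function also receives the skb and the device pointers, and ports are assumed to be in host byte order already.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a protocol header. */
struct demo_tcp_hdr {
    unsigned short source;
    unsigned short dest;
};

/* Hypothetical match data, as it would be prepared in userspace. */
struct demo_port_match {
    unsigned short min_port;
    unsigned short max_port;
};

/* Returns non-zero if the packet matches.  When returning 0 it may set
 * *hotdrop to 1 to ask for the packet to be dropped immediately. */
int demo_port_match_fn(const void *matchinfo, int offset,
                       const void *hdr, size_t datalen, int *hotdrop)
{
    const struct demo_port_match *info = matchinfo;
    const struct demo_tcp_hdr *tcp = hdr;

    if (offset != 0)
        return 0;            /* non-head fragment: no header to inspect */

    if (datalen < sizeof(*tcp)) {
        *hotdrop = 1;        /* truncated header: drop it right away */
        return 0;
    }

    return tcp->dest >= info->min_port && tcp->dest <= info->max_port;
}
```

Note that `hotdrop' is only touched on the path that returns 0, which is the only case in which it is meaningful.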

3071

+

3072

+<sect3>New Targets

3073

+

3074

+If your target alters the packet (ie. the headers or the body), it

3075

+must call skb_unshare() to copy the packet in case it is cloned:

3076

+otherwise any raw sockets which have a clone of the skbuff will see

3077

+the alterations (ie. people will see weird stuff happening in

3078

+tcpdump).

3079

+

3080

+New targets are also usually written as a standalone module. The

3081

+discussions under the above section on `New Match Functions' apply

3082

+equally here.

3083

+

3084

+The core of your new target is the struct ipt_target that it

3085

+passes to ipt_register_target(). This structure has the following

3086

+fields:

3087

+

3088

+ <descrip>

3089

+ <tag>list</tag> This field is set to any junk, say `{ NULL, NULL }'.

3090

+

3091

+ <tag>name</tag> This field is the name of the target function, as

3092

+ referred to by userspace. The name should match the name of the

3093

+ module (i.e., if the name is "REJECT", the module must be

3094

+ "ipt_REJECT.o") for auto-loading to work.

3095

+

3096

+ <tag>target</tag> This is a pointer to the target function, which

3097

+ takes the skbuff, the hook number, the input and output device

3098

+ pointers (either of which may be NULL), a pointer to the target data,

3099

+ and the position of the rule in the table. The target function may

3100

+ return either IPT_CONTINUE (-1) if traversing should continue, or a

3101

+ netfilter verdict (NF_DROP, NF_ACCEPT, NF_STOLEN etc.).

3102

+

3103

+ <tag>checkentry</tag> This field is a pointer to a function which

3104

+ checks the specifications for a rule; if this returns 0, then the

3105

+ rule will not be accepted from the user.

3106

+

3107

+ <tag>destroy</tag> This field is a pointer to a function which is

3108

+ called when an entry using this target is deleted. This allows you

3109

+ to dynamically allocate resources in checkentry and clean them up

3110

+ here.

3111

+

3112

+ <tag>me</tag> This field is set to `THIS_MODULE', which gives a

3113

+ pointer to your module. It causes the usage-count to go up and down

3114

+ as rules with this as a target are created and destroyed. This

3115

+ prevents a user removing the module (and hence cleanup_module() being

3116

+ called) if a rule refers to it.

3117

+ </descrip>
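The return-value convention of the `target' function can be sketched in userspace as follows. This is an illustration only: the `demo_' names are hypothetical, the constants are assumed to mirror the 2.4 headers (check your own tree), and the traversal loop is a toy model of what ip_tables does with the result.

```c
/* Assumed to mirror the 2.4 headers; verify against your tree. */
#define DEMO_NF_DROP       0u
#define DEMO_NF_ACCEPT     1u
#define DEMO_IPT_CONTINUE  0xFFFFFFFFu   /* i.e. -1 as an unsigned int */

/* Hypothetical target data: drop anything longer than max_len. */
struct demo_target_info {
    unsigned int max_len;
};

typedef unsigned int (*demo_target_fn)(unsigned int pkt_len,
                                       const void *targinfo);

/* A target either returns a netfilter verdict or asks to continue. */
unsigned int demo_length_target(unsigned int pkt_len, const void *targinfo)
{
    const struct demo_target_info *info = targinfo;

    if (pkt_len > info->max_len)
        return DEMO_NF_DROP;       /* final verdict: traversal stops */
    return DEMO_IPT_CONTINUE;      /* no verdict: try the next rule */
}

struct demo_rule {
    demo_target_fn target;
    const void *targinfo;
};

/* Toy traversal: the first real verdict wins, else the chain policy. */
unsigned int demo_traverse(const struct demo_rule *rules, int n,
                           unsigned int pkt_len, unsigned int policy)
{
    int i;

    for (i = 0; i < n; i++) {
        unsigned int verdict = rules[i].target(pkt_len, rules[i].targinfo);

        if (verdict != DEMO_IPT_CONTINUE)
            return verdict;
    }
    return policy;
}
```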

3118

+

3119

+<sect3>New Tables

3120

+

3121

+You can create a new table for your specific purpose if you wish.

3122

+To do this, you call `ipt_register_table()', with a `struct

3123

+ipt_table', which has the following fields:

3124

+

3125

+ <descrip>

3126

+ <tag>list</tag> This field is set to any junk, say `{ NULL, NULL }'.

3127

+

3128

+ <tag>name</tag> This field is the name of the table function, as

3129

+ referred to by userspace. The name should match the name of the

3130

+ module (i.e., if the name is "nat", the module must be

3131

+ "iptable_nat.o") for auto-loading to work.

3132

+

3133

+ <tag>table</tag> This is a fully-populated `struct ipt_replace', as

3134

+ used by userspace to replace a table. The `counters' pointer should

3135

+ be set to NULL. This data structure can be declared `__initdata' so

3136

+ it is discarded after boot.

3137

+

3138

+ <tag>valid_hooks</tag> This is a bitmask of the IPv4 netfilter hooks

3139

+ you will enter the table with: this is used to check that those entry

3140

+ points are valid, and to calculate the possible hooks for ipt_match

3141

+ and ipt_target `checkentry()' functions.

3142

+

3143

+ <tag>lock</tag> This is the read-write spinlock for the entire table;

3144

+ initialize it to RW_LOCK_UNLOCKED.

3145

+

3146

+ <tag>private</tag> This is used internally by the ip_tables code.

3147

+ </descrip>

3148

+

3149

+<sect2>Userspace Tool

3150

+

3151

+Now you've written your nice shiny kernel module, you may want to

3152

+control the options on it from userspace. Rather than have a branched

3153

+version of <tt>iptables</tt> for each extension, I use the very latest

3154

+90's technology: furbies. Sorry, I mean shared libraries.

3155

+

3156

+New tables generally don't require any extension to

3157

+<tt>iptables</tt>: the user just uses the `-t' option to make it use

3158

+the new table.

3159

+

3160

+The shared library should have an `_init()' function, which will

3161

+automatically be called upon loading: the moral equivalent of the

3162

+kernel module's `init_module()' function. This should call

3163

+`register_match()' or `register_target()', depending on whether your

3164

+shared library provides a new match or a new target.

3165

+

3166

+You need to provide a shared library: this can be used to

3167

+initialize part of the structure, or provide additional options. I

3168

+now insist on a shared library even if it doesn't do anything, to

3169

+reduce problem reports where the shared libraries are missing.

3170

+

3171

+There are useful functions described in the `iptables.h' header,

3172

+especially:

3173

+<descrip>

3174

+<tag>check_inverse()</tag> checks if an argument is actually a `!',

3175

+and if so, sets the `invert' flag if not already set. If it returns

3176

+true, you should increment optind, as done in the examples.

3177

+

3178

+<tag>string_to_number()</tag> converts a string into a number in the

3179

+given range, returning -1 if it is malformed or out of range.

3180

+`string_to_number' relies on `strtol' (see the manpage), meaning

3181

+that a leading "0x" would make the number be in Hexadecimal base, a leading

3182

+"0" would make it be in Octal base.

3183

+

3184

+<tag>exit_error()</tag> should be called if an error is found.

3185

+Usually the first argument is `PARAMETER_PROBLEM', meaning the user

3186

+didn't use the command line correctly.

3187

+</descrip>
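The base handling described for `string_to_number()' comes straight from calling `strtol' with a base of 0. The helper below is a hypothetical sketch of that behaviour, not the iptables function itself (its name and signature are my own); it assumes `min' is non-negative so that -1 is unambiguous as an error value.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns the parsed value, or -1 if the string is
 * malformed or outside [min, max].  Assumes min >= 0. */
long demo_string_to_number(const char *s, long min, long max)
{
    char *end;
    long value;

    errno = 0;
    /* Base 0: a leading "0x" selects hex, a leading "0" selects octal. */
    value = strtol(s, &end, 0);

    if (end == s || *end != '\0')
        return -1;                 /* nothing parsed, or trailing junk */
    if (errno == ERANGE || value < min || value > max)
        return -1;                 /* overflowed or out of range */
    return value;
}
```

So a user typing `010' for a port gets port 8, not port 10, which is worth remembering when writing the help text for your options.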

3188

+

3189

+<sect3>New Match Functions

3190

+

3191

+Your shared library's _init() function hands `register_match()' a

3192

+pointer to a static `struct iptables_match', which has the following

3193

+fields:

3194

+

3195

+<descrip>

3196

+<tag>next</tag> This pointer is used to make a linked list of matches

3197

+(such as used for listing rules). It should be set to NULL initially.

3198

+

3199

+<tag>name</tag> The name of the match function. This should match the

3200

+library name (eg "tcp" for `libipt_tcp.so').

3201

+

3202

+<tag>version</tag> Usually set to the IPTABLES_VERSION macro: this is

3203

+used to ensure that the <tt>iptables</tt> binary doesn't pick up the

3204

+wrong shared libraries by mistake.

3205

+

3206

+<tag>size</tag> The size of the match data for this match; you should

3207

+use the IPT_ALIGN() macro to ensure it is correctly aligned.

3208

+

3209

+<tag>userspacesize</tag> For some matches, the kernel changes some

3210

+fields internally (the `limit' match is a case of this). This means

3211

+that a simple `memcmp()' is insufficient to compare two rules

3212

+(required for delete-matching-rule functionality). If this is the

3213

+case, place all the fields which do not change at the start of the

3214

+structure, and put the size of the unchanging fields here. Usually,

3215

+however, this will be identical to the `size' field.

3216

+

3217

+<tag>help</tag> A function which prints out the option synopsis.

3218

+

3219

+<tag>init</tag> This can be used to initialize the extra space (if

3220

+any) in the ipt_entry_match structure, and set any nfcache bits; if

3221

+you are examining something not expressible using the contents of

3222

+`linux/include/netfilter_ipv4.h', then simply OR in the NFC_UNKNOWN

3223

+bit. It will be called before `parse()'.

3224

+

3225

+<tag>parse</tag> This is called when an unrecognized option is seen on

3226

+the command line: it should return non-zero if the option was indeed

3227

+for your library. `invert' is true if a `!' has already been seen.

3228

+The `flags' pointer is for the exclusive use of your match library,

3229

+and is usually used to store a bitmask of options which have been

3230

+specified. Make sure you adjust the nfcache field. You may extend

3231

+the size of the `ipt_entry_match' structure by reallocating if

3232

+necessary, but then you must ensure that the size is passed through

3233

+the IPT_ALIGN macro.

3234

+

3235

+<tag>final_check</tag> This is called after the command line has been

3236

+parsed, and is handed the `flags' integer reserved for your library.

3237

+This gives you a chance to check that any compulsory options have been

3238

+specified, for example: call `exit_error()' if they are missing.

3239

+

3240

+<tag>print</tag> This is used by the chain listing code to print (to

3241

+standard output) the extra match information (if any) for a rule. The

3242

+numeric flag is set if the user specified the `-n' flag.

3243

+

3244

+<tag>save</tag> This is the reverse of parse: it is used by

3245

+`iptables-save' to reproduce the options which created the rule.

3246

+

3247

+<tag>extra_opts</tag> This is a NULL-terminated list of extra options

3248

+which your library offers. This is merged with the current options

3249

+and handed to getopt_long; see the man page for details. The return

3250

+code for getopt_long becomes the first argument (`c') to your

3251

+`parse()' function.

3252

+</descrip>

3253

+

3254

+There are extra elements at the end of this structure for use

3255

+internally by <tt>iptables</tt>: you don't need to set them.
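The `userspacesize' idea is easiest to see with a structure laid out as the text suggests. The sketch below is illustrative only: `struct demo_limit_info' and its fields are hypothetical, loosely modelled on a rate-limiting match, and the comparison function is a toy version of what rule deletion needs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical match data.  The first two fields come from the command
 * line; the kernel rewrites the last two while the rule is in use. */
struct demo_limit_info {
    unsigned int avg;       /* set by userspace, never changes */
    unsigned int burst;     /* set by userspace, never changes */
    unsigned long prev;     /* kernel-modified */
    unsigned int credit;    /* kernel-modified */
};

/* Candidate value for `userspacesize': only the unchanging prefix. */
#define DEMO_USERSPACESIZE offsetof(struct demo_limit_info, prev)

/* Compare two rules' match data, ignoring kernel-modified fields. */
int demo_same_match(const struct demo_limit_info *a,
                    const struct demo_limit_info *b)
{
    return memcmp(a, b, DEMO_USERSPACESIZE) == 0;
}
```

A plain `memcmp()' over the whole structure would fail to find the rule as soon as the kernel had updated `prev' or `credit', which is exactly the problem this field exists to avoid.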

3256

+

3257

+<sect3>New Targets

3258

+

3259

+Your shared library's _init() function hands `register_target()'

3260

+a pointer to a static `struct iptables_target', which has similar

3261

+fields to the iptables_match structure detailed above.

3262

+

3263

+<sect2>Using `libiptc'

3264

+

3265

+<tt>libiptc</tt> is the iptables control library, designed for

3266

+listing and manipulating rules in the iptables kernel module. While

3267

+its current use is for the iptables program, it makes writing other

3268

+tools fairly easy. You need to be root to use these functions.

3269

+

3270

+The kernel tables themselves are simply a table of rules, and a set

3271

+of numbers representing entry points. Chain names ("INPUT", etc) are

3272

+provided as an abstraction by the library. User defined chains are

3273

+labelled by inserting an error node before the head of the

3274

+user-defined chain, which contains the chain name in the extra data

3275

+section of the target (the builtin chain positions are defined by the

3276

+three table entry points).

3277

+

3278

+The following standard targets are supported: ACCEPT, DROP, QUEUE

3279

+(which are translated to NF_ACCEPT, NF_DROP, and NF_QUEUE,

3280

+respectively), RETURN (which is translated to a special IPT_RETURN

3281

+value handled by ip_tables), and JUMP (which is translated from the

3282

+chain name to an actual offset within the table).

3283

+

3284

+When `iptc_init()' is called, the table, including the counters, is

3285

+read. This table is manipulated by the `iptc_insert_entry()',

3286

+`iptc_replace_entry()', `iptc_append_entry()', `iptc_delete_entry()',

3287

+`iptc_delete_num_entry()', `iptc_flush_entries()',

3288

+`iptc_zero_entries()', `iptc_create_chain()', `iptc_delete_chain()',

3289

+and `iptc_set_policy()' functions.

3290

+

3291

+The table changes are not written back until the `iptc_commit()'

3292

+function is called. This means it is possible for two library users

3293

+operating on the same chain to race each other; locking would be

3294

+required to prevent this, and it is not currently done.

3295

+

3296

+There is no race with counters, however; counters are added back in

3297

+to the kernel in such a way that counter increments between the

3298

+reading and writing of the table still show up in the new table.

3299

+

3300

+There are various helper functions:

3301

+

3302

+<descrip>

3303

+<tag>iptc_first_chain()</tag> This function returns the first chain

3304

+name in the table.

3305

+

3306

+<tag>iptc_next_chain()</tag> This function returns the next chain name

3307

+in the table: NULL means no more chains.

3308

+

3309

+<tag>iptc_builtin()</tag> Returns true if the given chain name is the

3310

+name of a builtin chain.

3311

+

3312

+<tag>iptc_first_rule()</tag> This returns a pointer to the first rule

3313

+in the given chain name: NULL for an empty chain.

3314

+

3315

+<tag>iptc_next_rule()</tag> This returns a pointer to the next rule in

3316

+the chain: NULL means the end of the chain.

3317

+

3318

+<tag>iptc_get_target()</tag> This gets the target of the given rule. If

3319

+it's an extended target, the name of that target is returned. If it's

3320

+a jump to another chain, the name of that chain is returned. If it's

3321

+a verdict (eg. DROP), that name is returned. If it has no target (an

3322

+accounting-style rule), then the empty string is returned.

3323

+

3324

+Note that this function should be used instead of using the value

3325

+of the `verdict' field of the ipt_entry structure directly, as it

3326

+offers the above further interpretations of the standard verdict.

3327

+

3328

+<tag>iptc_get_policy()</tag> This gets the policy of a builtin chain,

3329

+and fills in the `counters' argument with the hit statistics on that

3330

+policy.

3331

+

3332

+<tag>iptc_strerror()</tag> This function returns a more meaningful

3333

+explanation of a failure code in the iptc library. If a function

3334

+fails, it will always set errno: this value can be passed to

3335

+iptc_strerror() to yield an error message.

3336

+</descrip>

3337

+

3338

+<sect1>Understanding NAT

3339

+

3340

+Welcome to Network Address Translation in the kernel. Note that

3341

+the infrastructure offered is designed more for completeness than raw

3342

+efficiency, and that future tweaks may increase the efficiency

3343

+markedly. For the moment I'm happy that it works at all.

3344

+

3345

+NAT is separated into connection tracking (which doesn't manipulate

3346

+packets at all), and the NAT code itself. Connection tracking is also

3347

+designed to be used by an iptables module, so it makes subtle

3348

+distinctions in states which NAT doesn't care about.

3349

+

3350

+<sect2>Connection Tracking

3351

+

3352

+Connection tracking hooks into high-priority NF_IP_LOCAL_OUT and

3353

+NF_IP_PRE_ROUTING hooks, in order to see packets before they enter the

3354

+system.

3355

+

3356

+The nfct field in the skb is a pointer into the struct

3357

+ip_conntrack, at one element of the infos[] array. Hence we can tell the

3358

+state of the skb by which element in this array it is pointing to:

3359

+this pointer encodes both the state structure and the relationship of

3360

+this skb to that state.

3361

+

3362

+The best way to extract the `nfct' field is to call

3363

+`ip_conntrack_get()', which returns NULL if it's not set, or the

3364

+connection pointer, and fills in ctinfo which describes the

3365

+relationship of the packet to that connection. This enumerated type

3366

+has several values:

3367

+

3368

+<descrip>

3369

+

3370

+<tag>IP_CT_ESTABLISHED</tag> The packet is part of an established

3371

+connection, in the original direction.

3372

+

3373

+<tag>IP_CT_RELATED</tag> The packet is related to the connection, and

3374

+is passing in the original direction.

3375

+

3376

+<tag>IP_CT_NEW</tag> The packet is trying to create a new connection

3377

+(obviously, it is in the original direction).

3378

+

3379

+<tag>IP_CT_ESTABLISHED + IP_CT_IS_REPLY</tag> The packet is part of an

3380

+established connection, in the reply direction.

3381

+

3382

+<tag>IP_CT_RELATED + IP_CT_IS_REPLY</tag> The packet is related to the

3383

+connection, and is passing in the reply direction.

3384

+</descrip>

3385

+

3386

+Hence a reply packet can be identified by testing for >=

3387

+IP_CT_IS_REPLY.
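The ordering trick behind that test can be sketched as below. The `demo_' names are hypothetical and the numeric values are assumed to mirror the enum in the 2.4 ip_conntrack header; treat them as an illustration and check your own tree before relying on them.

```c
/* Assumed to mirror the 2.4 enum ip_conntrack_info. */
enum demo_ctinfo {
    DEMO_CT_ESTABLISHED,    /* 0 */
    DEMO_CT_RELATED,        /* 1 */
    DEMO_CT_NEW,            /* 2 */
    DEMO_CT_IS_REPLY        /* 3: added to the first two for replies */
};

/* Every reply-direction value is at least DEMO_CT_IS_REPLY. */
int demo_is_reply(int ctinfo)
{
    return ctinfo >= DEMO_CT_IS_REPLY;
}

/* Strip the direction to recover the underlying state. */
int demo_state(int ctinfo)
{
    return demo_is_reply(ctinfo) ? ctinfo - DEMO_CT_IS_REPLY : ctinfo;
}
```

Because the three original-direction values sit below the reply offset, a single comparison is enough to tell the direction.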

3388

+

3389

+<sect1>Extending Connection Tracking/NAT

3390

+

3391

+These frameworks are designed to accommodate any number of protocols

3392

+and different mapping types. Some of these mapping types might be

3393

+quite specific, such as a load-balancing/fail-over mapping type.

3394

+

3395

+Internally, connection tracking converts a packet to a "tuple",

3396

+representing the interesting parts of the packet, before searching for

3397

+bindings or rules which match it. This tuple has a manipulatable

3398

+part, and a non-manipulatable part; called "src" and "dst", as this is

3399

+the view for the first packet in the Source NAT world (it'd be a reply

3400

+packet in the Destination NAT world). The tuple for every packet in

3401

+the same packet stream in that direction is the same.

3402

+

3403

+For example, a TCP packet's tuple contains the manipulatable part:

3404

+source IP and source port, the non-manipulatable part: destination IP

3405

+and the destination port. The manipulatable and non-manipulatable

3406

+parts do not need to be the same type though; for example, an ICMP

3407

+packet's tuple contains the manipulatable part: source IP and the ICMP

3408

+id, and the non-manipulatable part: the destination IP and the ICMP

3409

+type and code.

3410

+

3411

+Every tuple has an inverse, which is the tuple of the reply packets

3412

+in the stream. For example, the inverse of an ICMP ping packet, icmp

3413

+id 12345, from 192.168.1.1 to 1.2.3.4, is a ping-reply packet, icmp id

3414

+12345, from 1.2.3.4 to 192.168.1.1.
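The ping example can be written out as a small sketch. The flattened `struct demo_icmp_tuple' is hypothetical (the real tuple uses unions to cover other protocols), addresses are kept in host byte order for readability, and only the echo/echo-reply pair is handled.

```c
#include <stdint.h>

#define DEMO_ICMP_ECHO       8   /* ICMP echo request type */
#define DEMO_ICMP_ECHOREPLY  0   /* ICMP echo reply type */

/* Build a host-order IPv4 address from its four octets. */
#define DEMO_IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical flattened ICMP tuple. */
struct demo_icmp_tuple {
    uint32_t src_ip;    /* manipulatable part */
    uint16_t id;        /* manipulatable part */
    uint32_t dst_ip;    /* non-manipulatable part */
    uint8_t type;       /* non-manipulatable part */
    uint8_t code;       /* non-manipulatable part */
};

/* Fill in the tuple a reply would carry.  Returns 1 on success, 0 for
 * types this sketch does not handle (anything but an echo request). */
int demo_invert_icmp(struct demo_icmp_tuple *inv,
                     const struct demo_icmp_tuple *orig)
{
    if (orig->type != DEMO_ICMP_ECHO)
        return 0;

    inv->src_ip = orig->dst_ip;     /* addresses swap places */
    inv->dst_ip = orig->src_ip;
    inv->id = orig->id;             /* the id is echoed back unchanged */
    inv->type = DEMO_ICMP_ECHOREPLY;
    inv->code = orig->code;
    return 1;
}
```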

3415

+

3416

+These tuples, represented by the `struct ip_conntrack_tuple', are used

3417

+widely. In fact, together with the hook the packet came in on (which

3418

+has an effect on the type of manipulation expected), and the device

3419

+involved, this is the complete information on the packet.

3420

+

3421

+Most tuples are contained within a `struct

3422

+ip_conntrack_tuple_hash', which adds a doubly linked list entry, and a

3423

+pointer to the connection that the tuple belongs to.

3424

+

3425

+A connection is represented by the `struct ip_conntrack': it has

3426

+two `struct ip_conntrack_tuple_hash' fields: one referring to the

3427

+direction of the original packet (tuplehash[IP_CT_DIR_ORIGINAL]), and

3428

+one referring to packets in the reply direction

3429

+(tuplehash[IP_CT_DIR_REPLY]).

3430

+

3431

+Anyway, the first thing the NAT code does is to see if the

3432

+connection tracking code managed to extract a tuple and find an

3433

+existing connection, by looking at the skbuff's nfct field; this tells

3434

+us if it's an attempt on a new connection, or if not, which direction

3435

+it is in; in the latter case, the manipulations determined

3436

+previously for that connection are done.

3437

+

3438

+If it was the start of a new connection, we look for a rule for that

3439

+tuple, using the standard iptables traversal mechanism, on the `nat'

3440

+table. If a rule matches, it is used to initialize the manipulations

3441

+for both that direction and the reply; the connection-tracking code is

3442

+told that the reply it should expect has changed. Then, it's

3443

+manipulated as above.

3444

+

3445

+If there is no rule, a `null' binding is created: this usually does

3446

+not map the packet, but exists to ensure we don't map another stream

3447

+over an existing one. Sometimes, the null binding cannot be created,

3448

+because we have already mapped an existing stream over it, in which

3449

+case the per-protocol manipulation may try to remap it, even though

3450

+it's nominally a `null' binding.

3451

+

3452

+<sect2>Standard NAT Targets

3453

+

3454

+NAT targets are like any other iptables target extensions, except

3455

+they insist on being used only in the `nat' table. Both the SNAT and

3456

+DNAT targets take a `struct ip_nat_multi_range' as their extra data;

3457

+this is used to specify the range of addresses a mapping is allowed to

3458

+bind into. A range element, `struct ip_nat_range' consists of an

3459

+inclusive minimum and maximum IP address, and an inclusive maximum and

3460

+minimum protocol-specific value (eg. TCP ports). There is also room

3461

+for flags, which say whether the IP address can be mapped (sometimes

3462

+we only want to map the protocol-specific part of a tuple, not the

3463

+IP), and another to say that the protocol-specific part of the range

3464

+is valid.

3465

+

3466

+A multi-range is an array of these `struct ip_nat_range' elements;

3467

+this means that a range could be "1.1.1.1-1.1.1.2 ports 50-55 AND

3468

+1.1.1.3 port 80". Each range element adds to the range (a union, for

3469

+those who like set theory).
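As a sketch of the union semantics, the following checks whether an address and port fall inside any element of a multi-range. The types are hypothetical simplifications (the real `struct ip_nat_range' also carries flags and a protocol-specific union), and addresses are in host byte order so that the comparisons read naturally.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a host-order IPv4 address from its four octets. */
#define DEMO_IP(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical range element: inclusive bounds on both IP and port. */
struct demo_range {
    uint32_t min_ip, max_ip;
    uint16_t min_port, max_port;
};

/* A multi-range is the union of its elements: one hit is enough. */
int demo_in_multi_range(const struct demo_range *ranges, size_t n,
                        uint32_t ip, uint16_t port)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (ip >= ranges[i].min_ip && ip <= ranges[i].max_ip &&
            port >= ranges[i].min_port && port <= ranges[i].max_port)
            return 1;
    }
    return 0;
}
```

With the two elements from the example above, 1.1.1.2 port 53 is inside the multi-range, while 1.1.1.3 port 53 is not: each element pairs its own addresses with its own ports.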

3470

+

3471

+<sect2>New Protocols

3472

+

3473

+<sect3> Inside The Kernel

3474

+

3475

+Implementing a new protocol first means deciding what the

3476

+manipulatable and non-manipulatable parts of the tuple should be.

3477

+Everything in the tuple has the property that it identifies the stream

3478

+uniquely. The manipulatable part of the tuple is the part you can do

3479

+NAT with: for TCP this is the source port, for ICMP it's the icmp ID;

3480

+something to use as a "stream identifier". The non-manipulatable part

3481

+is the rest of the packet that uniquely identifies the stream, but we

3482

+can't play with (eg. TCP destination port, ICMP type).

3483

+

3484

+Once you've decided this, you can write an extension to the

3485

+connection-tracking code in the directory, and go about populating the

3486

+`ip_conntrack_protocol' structure which you need to pass to

3487

+`ip_conntrack_register_protocol()'.

3488

+

3489

+The fields of `struct ip_conntrack_protocol' are:

3490

+

3491

+<descrip>

3492

+<tag>list</tag> Set it to '{ NULL, NULL }'; used to sew you into the list.

3493

+

3494

+<tag>proto</tag> Your protocol number; see `/etc/protocols'.

3495

+

3496

+<tag>name</tag> The name of your protocol. This is the name the user

3497

+will see; it's usually best if it's the canonical name in

3498

+`/etc/protocols'.

3499

+

3500

+<tag>pkt_to_tuple</tag> The function which fills out the protocol

3501

+specific parts of the tuple, given the packet. The `datah' pointer

3502

+points to the start of your header (just past the IP header), and the

3503

+datalen is the length of the packet. If the packet isn't long enough

3504

+to contain the header information, return 0; datalen will always be

3505

+at least 8 bytes though (enforced by framework).

3506

+

3507

+<tag>invert_tuple</tag> This function is simply used to change the

3508

+protocol-specific part of the tuple into the way a reply to that

3509

+packet would look.

3510

+

3511

+<tag>print_tuple</tag> This function is used to print out the

3512

+protocol-specific part of a tuple; usually it's sprintf()'d into the

3513

+buffer provided. The number of buffer characters used is returned.

3514

+This is used to print the states for the /proc entry.

3515

+

3516

+<tag>print_conntrack</tag> This function is used to print the private

3517

+part of the conntrack structure, if any, also used for printing the

3518

+states in /proc.

3519

+

3520

+<tag>packet</tag> This function is called when a packet is seen which

3521

+is part of an established connection. You get a pointer to the

3522

+conntrack structure, the IP header, the length, and the ctinfo. You

3523

+return a verdict for the packet (usually NF_ACCEPT), or -1 if the

3524

+packet is not a valid part of the connection. You can delete the

3525

+connection inside this function if you wish, but you must use the

3526

+following idiom to avoid races (see ip_conntrack_proto_icmp.c):

3527

+

3528

+<tscreen><verb>

3529

+if (del_timer(&ct->timeout))

3530

+ ct->timeout.function((unsigned long)ct);

3531

+</verb></tscreen>

3532

+

3533

+<tag>new</tag> This function is called when a packet creates a

3534

+connection for the first time; there is no ctinfo arg, since the first

3535

+packet is of ctinfo IP_CT_NEW by definition. It returns 0 to fail to

3536

+create the connection, or a connection timeout in jiffies.

3537

+</descrip>

3538

+

3539

+Once you've written and tested that you can track your new protocol,

3540

+it's time to teach NAT how to translate it. This means writing a new

3541

+module, an extension to the NAT code, and populating the

3542

+`ip_nat_protocol' structure which you need to pass to

3543

+`ip_nat_protocol_register()'.

3544

+

3545

+<descrip>

3546

+<tag>list</tag> Set it to '{ NULL, NULL }'; used to sew you into the list.

3547

+

3548

+<tag>name</tag> The name of your protocol. This is the name the user

3549

+will see; it's best if it's the canonical name in `/etc/protocols' for

3550

+userspace auto-loading, as we'll see later.

3551

+

3552

+<tag>protonum</tag> Your protocol number; see `/etc/protocols'.

3553

+

3554

+<tag>manip_pkt</tag> This is the other half of connection tracking's

3555

+pkt_to_tuple function: you can think of it as "tuple_to_pkt". There

3556

+are some differences though: you get a pointer to the start of the IP

3557

+header, and the total packet length. This is because some protocols

3558

+(UDP, TCP) need to know the IP header. You're given the

3559

+ip_nat_tuple_manip field from the tuple (i.e., the "src" field), rather

3560

+than the entire tuple, and the type of manipulation you are to

3561

+perform.

3562

+

3563

+<tag>in_range</tag> This function is used to tell if the manipulatable

3564

+part of the given tuple is in the given range. This function is a bit

3565

+tricky: we're given the manipulation type which has been applied to

3566

+the tuple, which tells us how to interpret the range (is it a source

3567

+range or a destination range we're aiming for?).

3568

+

3569

+This function is used to check if an existing mapping puts us in

3570

+the right range, and also to check if no manipulation is necessary at

3571

+all.

3572

+

3573

+<tag>unique_tuple</tag> This function is the core of NAT: given a

3574

+tuple and a range, we're to alter the per-protocol part of the tuple

3575

+to place it within the range, and make it unique. If we can't find an

3576

+unused tuple in the range, return 0. We also get a pointer to the

3577

+conntrack structure, which is required for ip_nat_used_tuple().

3578

+

3579

+The usual approach is to simply iterate the per-protocol part of

3580

+the tuple through the range, checking `ip_nat_used_tuple()' on it,

3581

+until one returns false.

3582

+

3583

+Note that the null-mapping case has already been checked: it's

3584

+either outside the range given, or already taken.

3585

+

3586

+If IP_NAT_RANGE_PROTO_SPECIFIED isn't set, it means that the user

3587

+is doing NAT, not NAPT: do something sensible with the range. If no

3588

+mapping is desirable (for example, within TCP, a destination mapping

3589

+should not change the TCP port unless ordered to), return 0.

3590

+

3591

+<tag>print</tag> Given a character buffer, a match tuple and a mask,

3592

+write out the per-protocol parts and return the length of the buffer

3593

+used.

3594

+

3595

+<tag>print_range</tag> Given a character buffer and a range, write out

3596

+the per-protocol part of the range, and return the length of the

3597

+buffer used. This won't be called if the IP_NAT_RANGE_PROTO_SPECIFIED

3598

+flag wasn't set for the range.

3599

+</descrip>
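The "iterate through the range" approach described for `unique_tuple' can be sketched as follows. Both functions are hypothetical: `demo_port_in_use()' merely stands in for `ip_nat_used_tuple()' and pretends that ports 1024 to 1026 are taken, and only a port is varied rather than a full tuple.

```c
#include <stdint.h>

/* Hypothetical stand-in for ip_nat_used_tuple(): here we simply
 * pretend that ports 1024-1026 are already taken. */
int demo_port_in_use(uint16_t port)
{
    return port >= 1024 && port <= 1026;
}

/* Walk the inclusive range until a free port turns up.  Returns 1 and
 * stores the port in *port on success; returns 0, leaving *port
 * untouched, if every port in the range is in use. */
int demo_unique_port(uint16_t *port, uint16_t min, uint16_t max)
{
    uint32_t p;   /* wider than 16 bits so the loop ends when max is 65535 */

    for (p = min; p <= max; p++) {
        if (!demo_port_in_use((uint16_t)p)) {
            *port = (uint16_t)p;
            return 1;
        }
    }
    return 0;
}
```

The wider loop counter is a small but real detail: a 16-bit counter would wrap around and never terminate for a range ending at 65535.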

3600

+

3601

+<sect2>New NAT Targets

3602

+

3603

+This is the really interesting part. You can write new NAT targets

3604

+which provide a new mapping type: two extra targets are provided in

3605

+the default package: MASQUERADE and REDIRECT. These are fairly simple

3606

+and illustrate the potential and power of writing a new NAT target.

3607

+

3608

+These are written just like any other iptables targets, but

3609

+internally they will extract the connection and call

3610

+`ip_nat_setup_info()'.

3611

+

3612

+<sect2>Protocol Helpers

3613

+

3614

+Protocol helpers for connection tracking allow the connection

3615

+tracking code to understand protocols which use multiple network

3616

+connections (eg. FTP) and mark the `child' connections as being

3617

+related to the initial connection, usually by reading the related

3618

+address out of the data stream.

3619

+

3620

+Protocol helpers for NAT do two things: firstly allow the NAT code

3621

+to manipulate the data stream to change the address contained within

3622

+it, and secondly to perform NAT on the related connection when it

3623

+comes in, based on the original connection.

3624

+

3625

+<sect2>Connection Tracking Helper Modules

3626

+

3627

+<sect3>Description

3628

+

3629

+The duty of a connection tracking module is to specify which packets

3630

+belong to an already established connection. The module has the

3631

+following means to do that:

3632

+

3633

+<itemize>

3634

+<item>Tell netfilter which packets our module is interested in (most

3635

+helpers operate on a particular port).

3636

+

3637

+<item>Register a function with netfilter. This function is called for

3638

+every packet which matches the criteria above.

3639

+

3640

+<item>An `ip_conntrack_expect_related()' function which can be called

3641

+from there to tell netfilter to expect related connections.</item>

3642

+</itemize>

3643

+

3644

+

3645

+If there is some additional work to be done at the time the first packet

3646

+of the expected connection arrives, the module can register a callback

3647

+function which is called at that time.

3648

+

3649

+<sect3>Structures and Functions Available

3650

+

3651

+Your kernel module's init function has to call

3652

+`ip_conntrack_helper_register()' with a pointer to a

3653

+`struct ip_conntrack_helper'. This struct has the following fields:

3654

+

3655

+<descrip>

3656

+<tag>list</tag>This is the header for the linked list. Netfilter

3657

+handles this list internally. Just initialize it with `{ NULL, NULL }'.

3658

+

3659

+<tag>name</tag>This is a pointer to a string constant specifying the

3660

+name of the protocol. ("ftp", "irc", ...)

3661

+

3662

+<tag>flags</tag>A set of one or more of the following flags:

3663

+<itemize>

3664

+<item>IP_CT_HELPER_F_REUSE_EXPECT : Reuse expectations if the limit (see

3665

+`max_expected' below) is reached.</item>

3666

+</itemize>

3667

+

3668

+<tag>me</tag>A pointer to the module structure of the helper. Initialize this with the `THIS_MODULE' macro.

3669

+

3670

+<tag>max_expected</tag>Maximum number of unconfirmed (outstanding) expectations.

3671

+

3672

+<tag>timeout</tag>Timeout (in seconds) for each unconfirmed expectation. An expectation is deleted `timeout' seconds after the expectation was issued with the `ip_conntrack_expect_related()' function.

3673

+

3674

+<tag>tuple</tag>This is a `struct ip_conntrack_tuple' which specifies

3675

+the packets our conntrack helper module is interested in.

3676

+

3677

+<tag>mask</tag>Again a `struct ip_conntrack_tuple'. This mask

3678

+specifies which bits of <tt>tuple</tt> are valid.

3679

+

3680

+<tag>help</tag>The function which netfilter should call for each

3681

+packet matching tuple+mask.

3682

+</descrip>
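How `tuple' and `mask' combine to select packets can be sketched like this. The three-field `struct demo_tuple' is a hypothetical simplification of `struct ip_conntrack_tuple', and the comparison is my own illustration of masked matching rather than the kernel's code.

```c
#include <stdint.h>

/* Hypothetical, flattened subset of a conntrack tuple. */
struct demo_tuple {
    uint16_t protonum;
    uint16_t dst_port;
    uint32_t src_ip;
};

/* A packet matches if it agrees with `tuple' on every bit set in
 * `mask'; bits cleared in the mask are "don't care". */
int demo_tuple_mask_cmp(const struct demo_tuple *pkt,
                        const struct demo_tuple *tuple,
                        const struct demo_tuple *mask)
{
    return ((pkt->protonum ^ tuple->protonum) & mask->protonum) == 0 &&
           ((pkt->dst_port ^ tuple->dst_port) & mask->dst_port) == 0 &&
           ((pkt->src_ip ^ tuple->src_ip) & mask->src_ip) == 0;
}
```

A helper interested in "all TCP packets to port 111" therefore sets the protocol and destination port in both structures and leaves every other mask field at zero.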

3683

+

3684

+<sect3>Example skeleton of a conntrack helper module

3685

+

3686

+<tscreen><code>
+#define FOO_PORT	111
+
+static int foo_expectfn(struct ip_conntrack *new)
+{
+	/* called when the first packet of an expected
+	   connection arrives */
+
+	return 0;
+}
+
+static int foo_help(const struct iphdr *iph, size_t len,
+		struct ip_conntrack *ct,
+		enum ip_conntrack_info ctinfo)
+{
+	/* analyze the data passed on this connection and
+	   decide what related packets will look like */
+
+	/* update per master-connection private data
+	   (session state, ...) */
+	ct->help.ct_foo_info = ...
+
+	if (there_will_be_new_packets_related_to_this_connection)
+	{
+		struct ip_conntrack_expect exp;
+
+		memset(&exp, 0, sizeof(exp));
+		exp.t = tuple_specifying_related_packets;
+		exp.mask = mask_for_above_tuple;
+		exp.expectfn = foo_expectfn;
+		exp.seq = tcp_sequence_number_of_expectation_cause;
+
+		/* per slave-connection private data */
+		exp.help.exp_foo_info = ...
+
+		ip_conntrack_expect_related(ct, &exp);
+	}
+	return NF_ACCEPT;
+}
+
+static struct ip_conntrack_helper foo;
+
+static int __init init(void)
+{
+	/* zeroing the struct also leaves foo.list as { NULL, NULL } */
+	memset(&foo, 0, sizeof(struct ip_conntrack_helper));
+
+	foo.name = "foo";
+	foo.flags = IP_CT_HELPER_F_REUSE_EXPECT;
+	foo.me = THIS_MODULE;
+	foo.max_expected = 1;	/* one expectation at a time */
+	foo.timeout = 0;	/* expectation never expires */
+
+	/* we are interested in all TCP packets with destport 111 */
+	foo.tuple.dst.protonum = IPPROTO_TCP;
+	foo.tuple.dst.u.tcp.port = htons(FOO_PORT);
+	foo.mask.dst.protonum = 0xFFFF;
+	foo.mask.dst.u.tcp.port = 0xFFFF;
+	foo.help = foo_help;
+
+	return ip_conntrack_helper_register(&foo);
+}
+
+static void __exit fini(void)
+{
+	ip_conntrack_helper_unregister(&foo);
+}
+
+module_init(init);
+module_exit(fini);
+</code></tscreen>

3753

+

3754

+

3755

+<sect2>NAT helper modules

3756

+

3757

+<sect3>Description

3758

+

3759

+NAT helper modules do some application specific NAT handling. Usually

3760

+this includes on-the-fly manipulation of data: think about the PORT

3761

+command in FTP, where the client tells the server which IP/port to

3762

+connect to. Therefore an FTP helper module must replace the IP/port

3763

+after the PORT command in the FTP control connection.

3764

+

3765

+

3766

+If we are dealing with TCP, things get slightly more complicated. The

3767

+reason is a possible change of the packet size (FTP example: the

3768

+length of the string representing an IP/port tuple after the PORT

3769

+command has changed). If we change the packet size, the sequence and
+acknowledgement numbers differ between the left and right sides of the
+NAT box (i.e. if we extended one packet by 4 octets, we have to add
+this offset to the TCP sequence number of each following packet).
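+The offset arithmetic described above can be sketched in plain userspace C.
+This is only an illustration under stated assumptions: `struct demo_seq_adjust'
+and `demo_adjust_seq()' are hypothetical names, and the sketch covers one
+direction of one connection with a single resized packet.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one direction of a connection. */
struct demo_seq_adjust {
    uint32_t correction_pos;  /* sequence number of the resized packet */
    int32_t  offset;          /* octets added (positive) or removed (negative) */
};

/* Sequence numbers after the resized packet are shifted by the offset;
   earlier ones (for example retransmissions) are left alone. Taking the
   difference as a signed 32-bit value keeps the comparison correct when
   the sequence space wraps around. */
static uint32_t demo_adjust_seq(uint32_t seq, const struct demo_seq_adjust *adj)
{
    if ((int32_t)(seq - adj->correction_pos) > 0)
        return seq + (uint32_t)adj->offset;
    return seq;
}
```

+In this sketch a packet that grew by 4 octets at sequence number 1000 causes a
+later sequence number of 1500 to be rewritten to 1504. The acknowledgement
+numbers flowing the other way would need the opposite correction, which the
+sketch leaves out.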

3773

+

3774

+

3775

+Special NAT handling of all related packets is required, too. Take as

3776

+example again FTP, where all incoming packets of the DATA connection

3777

+have to be NATed to the IP/port given by the client with the PORT

3778

+command on the control connection, rather than going through the

3779

+normal table lookup.

3780

+

3781

+A NAT helper therefore provides two callbacks:
+<itemize>

3782

+<item>callback for the packet causing the related connection (foo_help)

3783

+<item>callback for all related packets (foo_nat_expected)

3784

+</itemize>

3785

+

3786

+<sect3>Structures and Functions Available

3787

+

3788

+Your NAT helper module's `init()' function calls

3789

+`ip_nat_helper_register()' with a pointer to a `struct

3790

+ip_nat_helper'. This struct has the following members:

3791

+

3792

+<descrip>

3793

+<tag>list</tag>Just again the list header for netfilter's internal use.

3794

+Initialize this with { NULL, NULL }.

3795

+

3796

+<tag>name</tag>A pointer to a string constant with the protocol's name.

3797

+

3798

+<tag>flags</tag>A set of zero or more of the following flags:

3799

+<itemize>

3800

+<item>IP_NAT_HELPER_F_ALWAYS : Call the NAT helper for every packet,

3801

+not only for packets where conntrack has detected an expectation-cause.</item>

3802

+<item>IP_NAT_HELPER_F_STANDALONE : Tell the NAT core that this protocol

3803

+doesn't have a conntrack helper, only a NAT helper.</item>

3804

+</itemize>

3805

+

3806

+<tag>me</tag>A pointer to the module structure of the helper. Initialize

3807

+this using the `THIS_MODULE' macro.

3808

+

3809

+<tag>tuple</tag>A `struct ip_conntrack_tuple' describing which packets

3810

+our NAT helper is interested in.

3811

+

3812

+<tag>mask</tag>A `struct ip_conntrack_tuple', telling netfilter which

3813

+bits of <tt>tuple</tt> are valid.

3814

+

3815

+<tag>help</tag>The help function which is called for each packet

3816

+matching tuple+mask.

3817

+

3818

+<tag>expect</tag>The expect function which is called for every first

3819

+packet of an expected connection.

3820

+

3821

+</descrip>

3822

+

3823

+This is very similar to writing a connection tracking helper.

3824

+

3825

+<sect3>Example NAT helper module

3826

+

3827

+<tscreen><code>
+#define FOO_PORT	111
+
+static int foo_nat_expected(struct sk_buff **pksb,
+			unsigned int hooknum,
+			struct ip_conntrack *ct,
+			struct ip_nat_info *info)
+/* called whenever the first packet of a related connection arrives.
+   params:	pksb	packet buffer
+		hooknum	HOOK the call comes from (POST_ROUTING, PRE_ROUTING)
+		ct	information about this (the related) connection
+		info	&ct->nat.info
+   return value: Verdict (NF_ACCEPT, ...)
+*/
+{
+	/* Change ip/port of the packet to the masqueraded
+	   values (read from master->tuplehash), to map it the same way,
+	   call ip_nat_setup_info, return NF_ACCEPT. */
+
+	return NF_ACCEPT;
+}
+
+static int foo_help(struct ip_conntrack *ct,
+		struct ip_conntrack_expect *exp,
+		struct ip_nat_info *info,
+		enum ip_conntrack_info ctinfo,
+		unsigned int hooknum,
+		struct sk_buff **pksb)
+/* called for every packet where conntrack detected an expectation-cause
+   params:	ct	struct ip_conntrack of the master connection
+		exp	struct ip_conntrack_expect of the expectation
+			caused by the conntrack helper for this protocol
+		info	&ct->nat.info
+		ctinfo	(STATE: related, new, established, ... )
+		hooknum	HOOK the call comes from (POST_ROUTING, PRE_ROUTING)
+		pksb	packet buffer
+*/
+{
+
+	/* extract information about future related packets (you can
+	   share information with the connection tracking's foo_help).
+	   Exchange address/port with masqueraded values, insert tuple
+	   about related packets */
+
+	return NF_ACCEPT;
+}
+
+static struct ip_nat_helper hlpr;
+
+static int __init init(void)
+{
+	int ret;
+
+	/* zeroing the struct also leaves hlpr.list as { NULL, NULL } */
+	memset(&hlpr, 0, sizeof(struct ip_nat_helper));
+	hlpr.tuple.dst.protonum = IPPROTO_TCP;
+	hlpr.tuple.dst.u.tcp.port = htons(FOO_PORT);
+	hlpr.mask.dst.protonum = 0xFFFF;
+	hlpr.mask.dst.u.tcp.port = 0xFFFF;
+	hlpr.help = foo_help;
+	hlpr.expect = foo_nat_expected;
+
+	ret = ip_nat_helper_register(&hlpr);
+
+	return ret;
+}
+
+static void __exit fini(void)
+{
+	ip_nat_helper_unregister(&hlpr);
+}
+
+module_init(init);
+module_exit(fini);
+</code></tscreen>

3894

+

3895

+<sect1>Understanding Netfilter

3896

+

3897

+Netfilter is pretty simple, and is described fairly thoroughly in

3898

+the previous sections. However, sometimes it's necessary to go

3899

+beyond what the NAT or ip_tables infrastructure offers, or you may

3900

+want to replace them entirely.

3901

+

3902

+One important issue for netfilter (well, in the future) is caching.

3903

+Each skb has an `nfcache' field: a bitmask of what fields in the

3904

+header were examined, and whether the packet was altered or not. The

3905

+idea is that each hook off netfilter OR's in the bits relevant to it,

3906

+so that we can later write a cache system which will be clever enough

3907

+to realize when packets do not need to be passed through netfilter at

3908

+all.

3909

+

3910

+The most important bits are NFC_ALTERED, meaning the packet was

3911

+altered (this is already used for IPv4's NF_IP_LOCAL_OUT hook, to

3912

+reroute altered packets), and NFC_UNKNOWN, which means caching should

3913

+not be done because some property which cannot be expressed was

3914

+examined. If in doubt, simply set the NFC_UNKNOWN flag on the skb's

3915

+nfcache field inside your hook.

3916

+

3917

+<sect1>Writing New Netfilter Modules

3918

+

3919

+<sect2> Plugging Into Netfilter Hooks

3920

+

3921

+ To receive/mangle packets inside the kernel, you can simply write

3922

+a module which registers a "netfilter hook". This is basically an

3923

+expression of interest at some given point; the actual points are

3924

+protocol-specific, and defined in protocol-specific netfilter headers,

3925

+such as "netfilter_ipv4.h".

3926

+

3927

+ To register and unregister netfilter hooks, you use the functions

3928

+`nf_register_hook' and `nf_unregister_hook'. These each take a

3929

+pointer to a `struct nf_hook_ops', which you populate as follows:

3930

+

3931

+<descrip>

3932

+<tag>list</tag> Used to sew you into the linked list: set to '{ NULL,

3933

+NULL }'

3934

+

3935

+<tag>hook</tag> The function which is called when a packet hits this

3936

+hook point. Your function must return NF_ACCEPT, NF_DROP or NF_QUEUE.

3937

+If NF_ACCEPT, the next hook attached to that point will be called. If

3938

+NF_DROP, the packet is dropped. If NF_QUEUE, it's queued. You

3939

+receive a pointer to an skb pointer, so you can entirely replace the

3940

+skb if you wish.

3941

+

3942

+<tag>flush</tag> Currently unused: designed to pass on packet hits

3943

+when the cache is flushed. May never be implemented: set it to NULL.

3944

+

3945

+<tag>pf</tag> The protocol family, eg, `PF_INET' for IPv4.

3946

+

3947

+<tag>hooknum</tag> The number of the hook you are interested in, eg

3948

+`NF_IP_LOCAL_OUT'.

3949

+</descrip>

3950

+

3951

+<sect2> Processing Queued Packets

3952

+

3953

+This interface is currently used by ip_queue; you can register to

3954

+handle queued packets for a given protocol. This has similar semantics

3955

+to registering for a hook, except you can block processing the packet,

3956

+and you only see packets for which a hook has replied `NF_QUEUE'.

3957

+

3958

+The two functions used to register interest in queued packets are

3959

+`nf_register_queue_handler()' and `nf_unregister_queue_handler()'. The

3960

+function you register will be called with the `void *' pointer you

3961

+handed to `nf_register_queue_handler()'.

3962

+

3963

+

3964

+If no-one is registered to handle a protocol, then returning NF_QUEUE

3965

+is equivalent to returning NF_DROP.

3966

+

3967

+

3968

+Once you have registered interest in queued packets, they begin

3969

+queueing. You can do whatever you want with them, but you must call

3970

+`nf_reinject()' when you are finished with them (don't simply

3971

+kfree_skb() them). When you reinject an skb, you hand it the skb, the

3972

+`struct nf_info' which your queue handler was given, and a verdict:

3973

+NF_DROP causes them to be dropped, NF_ACCEPT causes them to continue

3974

+to iterate through the hooks, NF_QUEUE causes them to be queued again,

3975

+and NF_REPEAT causes the hook which queued the packet to be consulted

3976

+again (beware infinite loops).

3977

+

3978

+You can look inside the `struct nf_info' to get auxiliary

3979

+information about the packet, such as the interfaces and hook it was

3980

+on.

3981

+

3982

+<sect2> Receiving Commands From Userspace

3983

+

3984

+It is common for netfilter components to want to interact with

3985

+userspace. The method for doing this is by using the setsockopt

3986

+mechanism. Note that each protocol must be modified to call

3987

+nf_setsockopt() for setsockopt numbers it doesn't understand (and

3988

+nf_getsockopt() for getsockopt numbers), and so far only IPv4, IPv6

3989

+and DECnet have been modified.

3990

+

3991

+Using a now-familiar technique, we register a `struct

3992

+nf_sockopt_ops' using the nf_register_sockopt() call. The fields of

3993

+this structure are as follows:

3994

+

3995

+<descrip>

3996

+<tag>list</tag> Used to sew it into the linked list: set to '{ NULL,

3997

+NULL }'.

3998

+

3999

+<tag>pf</tag> The protocol family you handle, eg. PF_INET.

4000

+

4001

+<tag>set_optmin</tag> and

4002

+<tag>set_optmax</tag>

4003

+

4004

+These specify the (exclusive) range of setsockopt numbers handled.

4005

+Hence using 0 and 0 means you have no setsockopt numbers.

4006

+

4007

+<tag>set</tag> This is the function called when the user calls one of

4008

+your setsockopts. You should check that they have NET_ADMIN

4009

+capability within this function.

4010

+

4011

+<tag>get_optmin</tag> and

4012

+<tag>get_optmax</tag>

4013

+

4014

+These specify the (exclusive) range of getsockopt numbers handled.

4015

+Hence using 0 and 0 means you have no getsockopt numbers.

4016

+

4017

+<tag>get</tag> This is the function called when the user calls one of

4018

+your getsockopts. You should check that they have NET_ADMIN

4019

+capability within this function.

4020

+</descrip>
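+As a rough illustration of the option ranges, the following userspace C sketch
+assumes that "exclusive" means the upper bound itself is not handled, so the
+range is half-open. `struct demo_sockopt_range' and `demo_handles_optval()' are
+hypothetical names, not part of the netfilter API.

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for the set_optmin/set_optmax
   (or get_optmin/get_optmax) pair of struct nf_sockopt_ops. */
struct demo_sockopt_range {
    int optmin;  /* first option number handled */
    int optmax;  /* one past the last option number handled (assumption) */
};

/* An option number is handled when optmin <= optval < optmax. */
static bool demo_handles_optval(const struct demo_sockopt_range *r, int optval)
{
    return optval >= r->optmin && optval < r->optmax;
}
```

+Under this reading a range of 64 and 67 covers option numbers 64, 65 and 66,
+while 0 and 0 covers nothing, which agrees with the text above.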

4021

+

4022

+The final two fields are used internally.

4023

+

4024

+<sect1>Packet Handling in Userspace

4025

+

4026

+Using the libipq library and the `ip_queue' module, almost anything

4027

+which can be done inside the kernel can now be done in userspace.

4028

+This means that, with some speed penalty, you can develop your code

4029

+entirely in userspace. Unless you are trying to filter large

4030

+bandwidths, you should find this approach superior to in-kernel packet

4031

+mangling.

4032

+

4033

+In the very early days of netfilter, I proved this by porting an

4034

+embryonic version of iptables to userspace. Netfilter opens the doors

4035

+for more people to write their own, fairly efficient netmangling

4036

+modules, in whatever language they want.

4037

+

4038

+<sect>Translating 2.0 and 2.2 Packet Filter Modules

4039

+

4040

+Look at the ip_fw_compat.c file for a simple layer which should

4041

+make porting quite simple.

4042

+

4043

+<sect>Netfilter Hooks for Tunnel Writers

4044

+

4045

+Authors of tunnel (or encapsulation) drivers should follow two

4046

+simple rules for the 2.4 kernel (as do the drivers inside the kernel,

4047

+like net/ipv4/ipip.c):

4048

+

4049

+<itemize>

4050

+<item>

4051

+Release skb->nfct if you're going to make the packet unrecognisable

4052

+(ie. decapsulating/encapsulating). You don't need to do this if you

4053

+unwrap it into a *new* skb, but if you're going to do it in place, you

4054

+must do this.

4055

+

4056

+Otherwise: the NAT code will use the old connection tracking

4057

+information to mangle the packet, with bad consequences.

4058

+

4059

+<item>Make sure the encapsulated packets go through the LOCAL_OUT

4060

+hook, and decapsulated packets go through the PRE_ROUTING hook (most

4061

+tunnels use ip_rcv(), which does this for you).

4062

+

4063

+Otherwise: the user will not be able to filter as they expect to with

4064

+tunnels.

4065

+</itemize>

4066

+

4067

+The canonical way to do the first is to insert code like the

4068

+following before you wrap or unwrap the packet:

4069

+

4070

+<tscreen><verb>
+	/* Tell the netfilter framework that this packet is not the
+	   same as the one before! */
+#ifdef CONFIG_NETFILTER
+	nf_conntrack_put(skb->nfct);
+	skb->nfct = NULL;
+#ifdef CONFIG_NETFILTER_DEBUG
+	skb->nf_debug = 0;
+#endif
+#endif
+</verb></tscreen>

4081

+

4082

+Usually, all you need to do for the second, is to find where the

4083

+newly encapsulated packet goes into "ip_send()", and replace it with

4084

+something like:

4085

+

4086

+<tscreen><verb>
+	/* Send "new" packet from local host */
+	NF_HOOK(PF_INET, NF_IP_LOCAL_OUT, skb, NULL, rt->u.dst.dev, ip_send);
+</verb></tscreen>

4090

+

4091

+ Following these rules means that the person setting up the packet

4092

+filtering rules on the tunnel box will see something like the

4093

+following sequence for a packet being tunnelled:

4094

+

4095

+<enum>

4096

+<item> FORWARD hook: normal packet (from eth0 -> tunl0)

4097

+<item> LOCAL_OUT hook: encapsulated packet (to eth1).

4098

+</enum>

4099

+

4100

+And for the reply packet:

4101

+<enum>

4102

+<item> LOCAL_IN hook: encapsulated reply packet (from eth1)

4103

+<item> FORWARD hook: reply packet (from eth1 -> eth0).

4104

+</enum>

4105

+

4106

+<sect>The Test Suite

4107

+

4108

+Within the CVS repository lives a test suite: the more the test

4109

+suite covers, the greater confidence you can have that changes to the

4110

+code haven't quietly broken something. Trivial tests are at least as

4111

+important as tricky tests: it's the trivial tests which simplify the

4112

+complex tests (since you know the basics work fine before the complex

4113

+test gets run).

4114

+

4115

+The tests are simple: they are just shell scripts under the

4116

+testsuite/ subdirectory which are supposed to succeed. The scripts

4117

+are run in alphabetical order, so `01test' is run before `02test'.

4118

+Currently there are 5 test directories:

4119

+

4120

+<descrip>

4121

+<tag>00netfilter/</tag> General netfilter framework tests.

4122

+<tag>01iptables/</tag> iptables tests.

4123

+<tag>02conntrack/</tag> connection tracking tests.

4124

+<tag>03NAT/</tag> NAT tests

4125

+<tag>04ipchains-compat/</tag> ipchains/ipfwadm compatibility tests

4126

+</descrip>

4127

+

4128

+Inside the testsuite/ directory is a script called `test.sh'. It

4129

+configures two dummy interfaces (tap0 and tap1), turns forwarding on,

4130

+and removes all netfilter modules. Then it runs through the

4131

+directories above and runs each of their test.sh scripts until one

4132

+fails. This script takes two optional arguments: `-v' meaning to

4133

+print out each test as it proceeds, and an optional test name: if this

4134

+is given, it will skip over all tests until this one is found.

4135

+

4136

+<sect1>Writing a Test

4137

+

4138

+Create a new file in the appropriate directory: try to number your

4139

+test so that it gets run at the right time. For example, in order to

4140

+test ICMP reply tracking (02conntrack/02reply.sh), we need to first

4141

+check that outgoing ICMPs are tracked properly

4142

+(02conntrack/01simple.sh).

4143

+

4144

+It's usually better to create many small files, each of which

4145

+covers one area, because it helps to isolate problems immediately for

4146

+people running the testsuite.

4147

+

4148

+If something goes wrong in the test, simply do an `exit 1', which

4149

+causes failure; if it's something you expect may fail, you should

4150

+print a unique message. Your test should end with `exit 0' if

4151

+everything goes OK. You should check the success of <bf>every</bf>

4152

+command, either using `set -e' at the top of the script, or

4153

+appending `|| exit 1' to the end of each command.

4154

+

4155

+The helper functions `load_module' and `remove_module' can be used

4156

+to load modules: you should never rely on autoloading in the testsuite

4157

+unless that is what you are specifically testing.

4158

+

4159

+<sect1>Variables And Environment

4160

+

4161

+You have two play interfaces: tap0 and tap1. Their interface

4162

+addresses are in variables <tt>$TAP0</tt> and <tt>$TAP1</tt>

4163

+respectively. They both have netmasks of 255.255.255.0; their

4164

+networks are in $TAP0NET and $TAP1NET respectively.

4165

+

4166

+There is an empty temporary file in $TMPFILE. It is deleted at the

4167

+end of your test.

4168

+

4169

+Your script will be run from the testsuite/ directory, wherever it

4170

+is. Hence you should access tools (such as iptables) using a path

4171

+starting with `../userspace'.

4172

+

4173

+Your script can print out more information if $VERBOSE is set

4174

+(meaning that the user specified `-v' on the command line).

4175

+

4176

+<sect1>Useful Tools

4177

+

4178

+

4179

+There are several useful testsuite tools in the "tools" subdirectory:

4180

+each one exits with a non-zero exit status if there is a problem.

4181

+

4182

+<sect2>gen_ip

4183

+

4184

+You can generate IP packets using `gen_ip', which outputs an IP

4185

+packet to standard output. You can feed packets into tap0 and tap1

4186

+by sending standard output to /dev/tap0 and /dev/tap1 (these are

4187

+created upon first running the testsuite if they don't exist).

4188

+

4189

+gen_ip is a simplistic program which is currently very fussy about

4190

+its argument order. First are the general optional arguments:

4191

+

4192

+<descrip>

4193

+

4194

+<tag>FRAG=offset,length</tag> Generate the packet, then turn it into a

4195

+ fragment at the following offset and length.

4196

+

4197

+<tag>MF</tag> Set the `More Fragments' bit on the packet.

4198

+

4199

+<tag>MAC=xx:xx:xx:xx:xx:xx</tag> Set the source MAC address on the

4200

+ packet.

4201

+

4202

+<tag>TOS=tos</tag> Set the TOS field on the packet (0 to 255).

4203

+

4204

+</descrip>

4205

+

4206

+Next come the compulsory arguments:

4207

+

4208

+<descrip>

4209

+<tag>source ip</tag> Source IP address of the packet.

4210

+

4211

+<tag>dest ip</tag> Destination IP address of the packet.

4212

+

4213

+<tag>length</tag> Total length of the packet, including headers.

4214

+

4215

+<tag>protocol</tag> Protocol number of the packet, eg 17 = UDP.

4216

+

4217

+</descrip>

4218

+

4219

+Then the arguments depend on the protocol: for UDP (17), they are the

4220

+source and destination port numbers. For ICMP (1), they are the type

4221

+and code of the ICMP message: if the type is 0 or 8 (ping-reply or

4222

+ping), then two additional arguments (the ID and sequence fields) are

4223

+required. For TCP, the source and destination ports, and flags

4224

+("SYN", "SYN/ACK", "ACK", "RST" or "FIN") are required. There are

4225

+three optional arguments: "OPT=" followed by a comma-separated list of

4226

+options, "SYN=" followed by a sequence number, and "ACK=" followed by

4227

+a sequence number. Finally, the optional argument "DATA" indicates

4228

+that the payload of the TCP packet is to be filled with the contents

4229

+of standard input.

4230

+

4231

+<sect2>rcv_ip

4232

+

4233

+You can see IP packets using `rcv_ip', which prints out the command

4234

+line as close as possible to the original value fed to gen_ip

4235

+(fragments are the exception).

4236

+

4237

+This is extremely useful for analyzing packets. It takes two

4238

+compulsory arguments:

4239

+

4240

+<descrip>

4241

+<tag>wait time</tag> The maximum time in seconds to wait for a packet

4242

+ from standard input.

4243

+

4244

+<tag>iterations</tag> The number of packets to receive.

4245

+</descrip>

4246

+

4247

+There is one optional argument, "DATA", which causes the payload of a

4248

+TCP packet to be printed on standard output after the packet header.

4249

+

4250

+The standard way to use `rcv_ip' in a shell script is as follows:

4251

+

4252

+<verb>
+# Set up job control, so we can use & in shell scripts.
+set -m
+
+# Wait two seconds for one packet from tap0
+../tools/rcv_ip 2 1 < /dev/tap0 > $TMPFILE &
+
+# Make sure that rcv_ip has started running.
+sleep 1
+
+# Send a ping packet
+../tools/gen_ip $TAP1NET.2 $TAP0NET.2 100 1 8 0 55 57 > /dev/tap1 || exit 1
+
+# Wait for rcv_ip,
+if wait %../tools/rcv_ip; then :
+else
+    echo rcv_ip failed:
+    cat $TMPFILE
+    exit 1
+fi
+</verb>

4273

+

4274

+<sect2>gen_err

4275

+

4276

+This program takes a packet (as generated by gen_ip, for example)

4277

+on standard input, and turns it into an ICMP error.

4278

+

4279

+It takes three arguments: a source IP address, a type and a code.

4280

+The destination IP address will be set to the source IP address of the

4281

+packet fed in standard input.

4282

+

4283

+<sect2>local_ip

4284

+

4285

+This takes a packet from standard input and injects it into the

4286

+system from a raw socket. This gives the appearance of a

4287

+locally-generated packet (as separate from feeding a packet in one of

4288

+the ethertap devices, which looks like a remotely-generated packet).

4289

+

4290

+<sect1>Random Advice

4291

+

4292

+All the tools assume they can do everything in one read or write:

4293

+this is true for the ethertap devices, but might not be true if you're

4294

+doing something tricky with pipes.

4295

+

4296

+dd can be used to cut packets: dd has an obs (output block size)

4297

+option which can be used to make it output the packet in a single

4298

+write.

4299

+

4300

+Test for success first. For example, when testing that packets are
+successfully blocked, first test that packets pass through normally,
+<bf>then</bf> test that some packets are blocked. Otherwise an unrelated
+failure could be stopping the packets...

4304

+

4305

+Try to write exact tests, not `throw random stuff and see what

4306

+happens' tests. If an exact test goes wrong, it's a useful thing to

4307

+know. If a random test goes wrong once, it doesn't help much.

4308

+

4309

+If a test fails without a message, you can add `-x' to the top line

4310

+of the script (ie. `#! /bin/sh -x') to see what commands it's running.

4311

+

4312

+If a test fails randomly, check for random network traffic

4313

+interfering (try downing all your external interfaces). Sitting on

4314

+the same network as Andrew Tridgell, I tend to get plagued by Windows

4315

+broadcasts, for example.

4316

+

4317

+<sect>Motivation

4318

+

4319

+As I was developing ipchains, I realized (in one of those

4320

+blinding-flash-while-waiting-for-entree moments in a Chinese

4321

+restaurant in Sydney) that packet filtering was being done in the

4322

+wrong place. I can't find it now, but I remember sending mail to Alan

4323

+Cox, who kind of said `why don't you finish what you're doing, first,

4324

+even though you're probably right'. In the short term, pragmatism was

4325

+to win over The Right Thing.

4326

+

4327

+After I finished ipchains, which was initially going to be a minor

4328

+modification of the kernel part of ipfwadm, and turned into a larger

4329

+rewrite, and wrote the HOWTO, I became aware of just how much

4330

+confusion there is in the wider Linux community about issues like

4331

+packet filtering, masquerading, port forwarding and the like.

4332

+

4333

+This is the joy of doing your own support: you get a closer feel

4334

+for what the users are trying to do, and what they are struggling

4335

+with. Free software is most rewarding when it's in the hands of the

4336

+most users (that's the point, right?), and that means making it easy.

4337

+The architecture, not the documentation, was the key flaw.

4338

+

4339

+So I had the experience, with the ipchains code, and a good idea of

4340

+what people out there were doing. There were only two problems.

4341

+

4342

+Firstly, I didn't want to get back into security. Being a security

4343

+consultant is a constant moral tug-of-war between your conscience and

4344

+your wallet. At a fundamental level, you are selling the feeling of

4345

+security, which is at odds with actual security. Maybe working in a

4346

+military setting, where they understand security, it'd be different.

4347

+

4348

+The second problem is that newbie users aren't the only concern; an

4349

+increasing number of large companies and ISPs are using this stuff. I

4350

+needed reliable input from that class of users if it was to scale to

4351

+tomorrow's home users.

4352

+

4353

+These problems were resolved, when I ran into David Bonn, of

4354

+WatchGuard fame, at Usenix in July 1998. They were looking for a

4355

+Linux kernel coder; in the end we agreed that I'd head across to their

4356

+Seattle offices for a month and we'd see if we could hammer out an

4357

+agreement whereby they'd sponsor my new code, and my current support

4358

+efforts. The rate we agreed on was more than I asked, so I didn't

4359

+take a pay cut. This means I don't have to even think about external

4360

+conslutting for a while.

4361

+

4362

+Exposure to WatchGuard gave me exposure to the large clients I

4363

+need, and being independent from them allowed me to support all users

4364

+(eg. WatchGuard competitors) equally.

4365

+

4366

+So I could have simply written netfilter, ported ipchains over the

4367

+top, and been done with it. Unfortunately, that would leave all the

4368

+masquerading code in the kernel: making masquerading independent from

4369

+filtering is one of the major wins of moving the packet

4370

+filtering points, but to do that masquerading also needed to be moved

4371

+over to the netfilter framework as well.

4372

+

4373

+Also, my experience with ipfwadm's `interface-address' feature (the

4374

+one I removed in ipchains) had taught me that there was no hope of

4375

+simply ripping out the masquerading code and expecting someone who

4376

+needed it to do the work of porting it onto netfilter for me.

4377

+

4378

+So I needed to have at least as many features as the current code;

4379

+preferably a few more, to encourage niche users to become early

4380

+adopters. This means replacing transparent proxying (gladly!),

4381

+masquerading and port forwarding. In other words, a complete NAT layer.

4382

+

4383

+Even if I had decided to port the existing masquerading layer,

4384

+instead of writing a generic NAT system, the masquerading code was

4385

+showing its age, and lack of maintenance. See, there was no

4386

+masquerading maintainer, and it shows. It seems that serious users

4387

+generally don't use masquerading, and there aren't many home users up

4388

+to the task of doing maintenance. Brave people like Juan Ciarlante

4389

+were doing fixes, but it had reached the stage (being extended over

4390

+and over) that a rewrite was needed.

4391

+

4392

+Please note that I wasn't the person to do a NAT rewrite: I didn't

4393

+use masquerading any more, and I'd not studied the existing code at

4394

+the time. That's probably why it took me longer than it should have.

4395

+But the result is fairly good, in my opinion, and I sure as hell

4396

+learned a lot. No doubt the second version will be even better, once

4397

+we see how people use it.

4398

+

4399

+<sect>Thanks

4400

+

4401

+Thanks to those who helped, especially Harald Welte for writing the

4402

+Protocol Helpers section.

4403

+</article>

4404

Index: iptables-1.4.12/howtos/packet-filtering-HOWTO.sgml

4405

===================================================================

4406

--- /dev/null 1970-01-01 00:00:00.000000000 +0000

4407

+++ iptables-1.4.12/howtos/packet-filtering-HOWTO.sgml 2011-11-07 13:57:14.000000000 -0600

4408

@@ -0,0 +1,1339 @@

4409

+<!doctype linuxdoc system>

4410

+

4411

+<!-- This is the Linux Packet Filtering HOWTO.

4412

+ -->

4413

+

4414

+

4415

+

4416

+<article>

4417

+

4418

+

4419

+

4420

+<title>Linux 2.4 Packet Filtering HOWTO

4421

+<author>Rusty Russell, mailing list <tt>netfilter@lists.samba.org</tt>

4422

+<date>$Revision: 1.26 $ $Date: 2002/01/24 13:42:53 $

4423

+<abstract>

4424

+This document describes how to use iptables to filter out bad packets

4425

+for the 2.4 Linux kernels.

4426

+</abstract>

4427

+

4428

+

4429

+<toc>

4430

+

4431

+

4432

+

4433

+<sect>Introduction<label id="intro">

4434

+

4435

+

4436

+Welcome, gentle reader.

4437

+

4438

+

4439

+It is assumed you know what an IP address, a network address, a

4440

+netmask, routing and DNS are. If not, I recommend that you read the

4441

+Network Concepts HOWTO.

4442

+

4443

+

4444

+This HOWTO flips between a gentle introduction (which will leave you

4445

+feeling warm and fuzzy now, but unprotected in the Real World) and raw

4446

+full-disclosure (which would leave all but the hardiest souls

4447

+confused, paranoid and seeking heavy weaponry).

4448

+

4449

+

4450

+Your network is not <bf>secure</bf>. The problem of allowing rapid,

4451

+convenient communication while restricting its use to good, and not

4452

+evil intents is congruent to other intractable problems such as

4453

+allowing free speech while disallowing a call of ``Fire!'' in a

4454

+crowded theater. It will not be solved in the space of this HOWTO.

4455

+

4456

+

4457

+So only you can decide where the compromise will be. I will try to

4458

+instruct you in the use of some of the tools available and some

4459

+vulnerabilities to be aware of, in the hope that you will use them for

4460

+good, and not evil purposes. Another equivalent problem.

4461

+

4462

4463

+

4464

+<sect>Where is the official Web Site? Is there a Mailing List?

4465

+

4466

+There are three official sites:

4467

+<itemize>

4468

+<item>Thanks to <url url="http://netfilter.filewatcher.org/" name="Filewatcher">.

4469

+<item>Thanks to <url url="http://netfilter.samba.org/" name="The Samba Team and SGI">.

4470

+<item>Thanks to <url url="http://netfilter.gnumonks.org/" name="Harald Welte">.

4471

+</itemize>

4472

+ You can reach all of them using round-robin DNS via

4473

+<url url="http://www.netfilter.org/"> and <url url="http://www.iptables.org/">

4474

+

4475

+For the official netfilter mailing list, see

4476

+<url url="http://www.netfilter.org/contact.html#list" name="netfilter List">.

4477

+

4478

+<sect>So What's A Packet Filter?

4479

+

4480

+

4481

+A packet filter is a piece of software which looks at the

4482

+header of packets as they pass through, and decides the fate

4483

+of the entire packet. It might decide to <bf>DROP</bf> the packet

4484

+(i.e., discard the packet as if it had never received it),

4485

+<bf>ACCEPT</bf> the packet (i.e., let the packet go through), or

4486

+something more complicated.

4487

+

4488

+

4489

+Under Linux, packet filtering is built into the kernel (as a kernel

4490

+module, or built right in), and there are a few trickier things we can

4491

+do with packets, but the general principle of looking at the headers

4492

+and deciding the fate of the packet is still there.

4493

+

4494

+<sect1>Why Would I Want to Packet Filter?

4495

+

4496

+

4497

+Control. Security. Watchfulness.

4498

+

4499

+

4500

+<descrip>

4501

+<tag/Control:/ when you are using a Linux box to connect your internal

4502

+network to another network (say, the Internet) you have an opportunity

4503

+to allow certain types of traffic, and disallow others. For example,

4504

+the header of a packet contains the destination address of the packet,

4505

+so you can prevent packets going to a certain part of the outside

4506

+network. As another example, I use Netscape to access the Dilbert

4507

+archives. There are advertisements from doubleclick.net on the page,

4508

+and Netscape wastes my time by cheerfully downloading them.

4509

+Telling the packet filter not to allow any packets to or from the

4510

+addresses owned by doubleclick.net solves that problem (there are

4511

+better ways of doing this though: see Junkbuster).

4512

+

4513

+<tag/Security:/ when your Linux box is the only thing between the

4514

+chaos of the Internet and your nice, orderly network, it's nice to

4515

+know you can restrict what comes tromping in your door. For example,

4516

+you might allow anything to go out from your network, but you might be

4517

+worried about the well-known `Ping of Death' coming in from malicious

4518

+outsiders. As another example, you might not want outsiders

4519

+telnetting to your Linux box, even though all your accounts have

4520

+passwords. Maybe you want (like most people) to be an observer on the

4521

+Internet, and not a server (willing or otherwise). Simply don't let

4522

+anyone connect in, by having the packet filter reject incoming packets

4523

+used to set up connections.

4524

+

4525

+<tag/Watchfulness:/ sometimes a badly configured machine on the local

4526

+network will decide to spew packets to the outside world. It's nice

4527

+to tell the packet filter to let you know if anything abnormal occurs;

4528

+maybe you can do something about it, or maybe you're just curious by

4529

+nature.

4530

+</descrip>

4531

+

4532

+<sect1>How Do I Packet Filter Under Linux?<label id="filter-linux">

4533

+

4534

+Linux kernels have had packet filtering since the 1.1 series. The

4535

+first generation, based on ipfw from BSD, was ported by Alan Cox in

4536

+late 1994. This was enhanced by Jos Vos and others for Linux 2.0; the

4537

+userspace tool `ipfwadm' controlled the kernel filtering rules. In

4538

+mid-1998, for Linux 2.2, I reworked the kernel quite heavily, with the

4539

+help of Michael Neuling, and introduced the userspace tool `ipchains'.

4540

+Finally, the fourth-generation tool, `iptables', and another kernel

4541

+rewrite occurred in mid-1999 for Linux 2.4. It is this iptables which

4542

+this HOWTO concentrates on.

4543

+

4544

+

4545

+You need a kernel which has the netfilter infrastructure in it:

4546

+netfilter is a general framework inside the Linux kernel which other

4547

+things (such as the iptables module) can plug into. This means you

4548

+need kernel 2.3.15 or beyond, and answer `Y' to CONFIG_NETFILTER in

4549

+the kernel configuration.

4550

+

4551

+

4552

+The tool <tt>iptables</tt> talks to the kernel and tells it what

4553

+packets to filter. Unless you are a programmer, or overly curious,

4554

+this is how you will control the packet filtering.

4555

+

4556

+<sect2> iptables

4557

+

4558

+

4559

+The <tt>iptables</tt> tool inserts and deletes rules from the kernel's

4560

+packet filtering table. This means that whatever you set up, it will

4561

+be lost upon reboot; see <ref id="permanent" name="Making Rules

4562

+Permanent"> for how to make sure they are restored the next time Linux

4563

+is booted.

4564

+

4565

+

4566

+<tt>iptables</tt> is a replacement for <tt>ipfwadm</tt> and

4567

+<tt>ipchains</tt>: see

4568

+<ref id="oldstyle" name="Using ipchains and ipfwadm"> for how to painlessly

4569

+avoid using iptables if you're using one of those tools.

4570

+

4571

+<sect2> Making Rules Permanent<label id="permanent">

4572

+

4573

+Your current firewall setup is stored in the kernel, and thus will

4574

+be lost on reboot. You can try the iptables-save and iptables-restore

4575

+scripts to save them to, and restore them from a file.

4576

+

4577

+The other way is to put the commands required to set up your rules

4578

+in an initialization script. Make sure you do something intelligent

4579

+if one of the commands should fail (usually `exec /sbin/sulogin').

4580

+

4581

+<sect>Who the hell are you, and why are you playing with my kernel?

4582

+

4583

+

4584

+I'm Rusty Russell; the Linux IP Firewall maintainer and just another

4585

+working coder who happened to be in the right place at the right time.

4586

+I wrote ipchains (see <ref id="filter-linux" name="How Do I Packet

4587

+Filter Under Linux?"> above for due credit to the people who did the

4588

+actual work), and learnt enough to get packet filtering right this

4589

+time. I hope.

4590

+

4591

+

4592

+<url url="http://www.watchguard.com" name="WatchGuard">, an excellent

4593

+firewall company who sell the really nice plug-in Firebox, offered to

4594

+pay me to do nothing, so I could spend all my time writing this stuff,

4595

+and maintaining my previous stuff. I predicted 6 months, and it took

4596

+12, but I felt by the end that it had been done Right. Many rewrites,

4597

+a hard-drive crash, a laptop being stolen, a couple of corrupted

4598

+filesystems and one broken screen later, here it is.

4599

+

4600

+

4601

+While I'm here, I want to clear up some people's misconceptions: I am

4602

+no kernel guru. I know this, because my kernel work has brought me

4603

+into contact with some of them: David S. Miller, Alexey Kuznetsov,

4604

+Andi Kleen, Alan Cox. However, they're all busy doing the deep magic,

4605

+leaving me to wade in the shallow end where it's safe.

4606

+

4607

+<!-- This is probably no longer true; somewhere in writing all this

4608

+kernel code and documentation I seem to have picked up a fair number

4609

+of kernel tricks. But I'm still nowhere near as clever as I think I

4610

+am. -->

4611

+

4612

+<sect> Rusty's Really Quick Guide To Packet Filtering

4613

+

4614

+

4615

+Most people just have a single PPP connection to the Internet, and

4616

+don't want anyone coming back into their network, or the firewall:

4617

+

4618

+<tscreen><verb>

4619

+## Insert connection-tracking modules (not needed if built into kernel).

4620

+# insmod ip_conntrack

4621

+# insmod ip_conntrack_ftp

4622

+

4623

+## Create chain which blocks new connections, except if coming from inside.

4624

+# iptables -N block

4625

+# iptables -A block -m state --state ESTABLISHED,RELATED -j ACCEPT

4626

+# iptables -A block -m state --state NEW -i ! ppp0 -j ACCEPT

4627

+# iptables -A block -j DROP

4628

+

4629

+## Jump to that chain from INPUT and FORWARD chains.

4630

+# iptables -A INPUT -j block

4631

+# iptables -A FORWARD -j block

4632

+</verb></tscreen>

4633

+

4634

+<sect> How Packets Traverse The Filters

4635

+

4636

+

4637

+The kernel starts with three lists of rules in the `filter' table;

4638

+these lists are called <bf>firewall chains</bf> or just

4639

+<bf>chains</bf>. The three chains are called <bf>INPUT</bf>,

4640

+<bf>OUTPUT</bf> and <bf>FORWARD</bf>.

4641

+

4642

+

4643

+For ASCII-art fans, the chains are arranged like so: <bf>(Note: this

4644

+is a very different arrangement from the 2.0 and 2.2 kernels!)</bf>

4645

+

4646

+<verb>

4647

+ _____

4648

+Incoming / \ Outgoing

4649

+ -->[Routing ]--->|FORWARD|------->

4650

+ [Decision] \_____/ ^

4651

+ | |

4652

+ v ____

4653

+ ___ / \

4654

+ / \ |OUTPUT|

4655

+ |INPUT| \____/

4656

+ \___/ ^

4657

+ | |

4658

+ ----> Local Process ----

4659

+</verb>

4660

+

4661

+The three circles represent the three chains mentioned above. When

4662

+a packet reaches a circle in the diagram, that chain is examined to

4663

+decide the fate of the packet. If the chain says to DROP the packet,

4664

+it is killed there, but if the chain says to ACCEPT the packet, it

4665

+continues traversing the diagram.

4666

+

4667

+

4668

+A chain is a checklist of <bf>rules</bf>. Each rule says `if the packet

4669

+header looks like this, then here's what to do with the packet'. If

4670

+the rule doesn't match the packet, then the next rule in the chain is

4671

+consulted. Finally, if there are no more rules to consult, then the

4672

+kernel looks at the chain <bf>policy</bf> to decide what to do. In a

4673

+security-conscious system, this policy usually tells the kernel to

4674

+DROP the packet.
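The checklist behaviour described above can be sketched as a small Python model. This is only an illustration, not the kernel's implementation: the names `Rule`, `Chain` and `verdict` are hypothetical, packets are plain dictionaries, and only terminating targets (ACCEPT and DROP) are modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Packet = Dict[str, str]


@dataclass
class Rule:
    matches: Callable[[Packet], bool]  # "if the packet header looks like this..."
    target: str                        # "...then here's what to do": ACCEPT or DROP


@dataclass
class Chain:
    rules: List[Rule]
    policy: str = "DROP"  # consulted only when no rule matches


def verdict(chain: Chain, packet: Packet) -> str:
    """Return the fate of a packet: first matching rule wins, else the policy."""
    for rule in chain.rules:
        if rule.matches(packet):
            return rule.target
    return chain.policy


input_chain = Chain(
    rules=[
        Rule(lambda p: p["proto"] == "icmp" and p["src"] == "127.0.0.1", "DROP"),
        Rule(lambda p: p["proto"] == "tcp", "ACCEPT"),
    ],
    policy="DROP",
)

print(verdict(input_chain, {"proto": "icmp", "src": "127.0.0.1"}))  # DROP (first rule)
print(verdict(input_chain, {"proto": "tcp", "src": "10.0.0.5"}))    # ACCEPT (second rule)
print(verdict(input_chain, {"proto": "udp", "src": "10.0.0.5"}))    # DROP (chain policy)
```

The third packet matches neither rule, so its fate is decided by the chain policy alone.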

4675

+

4676

+

4677

+<enum>

4678

+<item>When a packet comes in (say, through the Ethernet card) the kernel

4679

+first looks at the destination of the packet: this is called

4680

+`routing'.

4681

+

4682

+<item>If it's destined for this box, the packet passes downwards

4683

+in the diagram, to the INPUT chain. If it passes this, any processes

4684

+waiting for that packet will receive it.

4685

+

4686

+<item>Otherwise, if the kernel does not have forwarding enabled, or it

4687

+doesn't know how to forward the packet, the packet is dropped. If

4688

+forwarding is enabled, and the packet is destined for another network

4689

+interface (if you have another one), then the packet goes rightwards

4690

+on our diagram to the FORWARD chain. If it is ACCEPTed, it will be

4691

+sent out.

4692

+

4693

+<item>Finally, a program running on the box can send network packets.

4694

+These packets pass through the OUTPUT chain immediately: if it says

4695

+ACCEPT, then the packet continues out to whatever interface it is

4696

+destined for.

4697

+</enum>
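The four steps above amount to a small decision about which chain a packet visits. The following Python sketch restates them under simplifying assumptions: the function names and parameters are hypothetical, addresses are compared as plain strings, and "knows how to forward" is reduced to a boolean.

```python
from typing import Optional, Set


def chain_for_incoming(dst: str, local_addrs: Set[str],
                       forwarding_enabled: bool, route_known: bool) -> Optional[str]:
    """Pick the chain an incoming packet traverses, or None if it is dropped
    before reaching any chain."""
    if dst in local_addrs:
        return "INPUT"        # destined for this box
    if not forwarding_enabled or not route_known:
        return None           # cannot be forwarded: dropped
    return "FORWARD"          # destined for another network interface


def chain_for_local_output() -> str:
    """Packets sent by a local program always pass through OUTPUT."""
    return "OUTPUT"


local = {"192.168.1.1", "127.0.0.1"}
print(chain_for_incoming("192.168.1.1", local, True, True))   # INPUT
print(chain_for_incoming("10.0.0.7", local, True, True))      # FORWARD
print(chain_for_incoming("10.0.0.7", local, False, True))     # None
print(chain_for_local_output())                               # OUTPUT
```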

4698

+

4699

+<sect>Using iptables

4700

+

4701

+

4702

+iptables has a fairly detailed manual page (<tt>man iptables</tt>),

4703

+which you should consult if you need more detail on particulars. Those of you familiar

4704

+with ipchains may simply want to look at <ref id="Appendix-A"

4705

+name="Differences Between iptables and ipchains">; they are very

4706

+similar.

4707

+

4708

+

4709

+There are several different things you can do with <tt>iptables</tt>.

4710

+You start with three built-in chains <tt>INPUT</tt>, <tt>OUTPUT</tt>

4711

+and <tt>FORWARD</tt> which you can't delete. Let's look at the

4712

+operations to manage whole chains:

4713

+

4714

+<enum>

4715

+<item> Create a new chain (-N).

4716

+<item> Delete an empty chain (-X).

4717

+<item> Change the policy for a built-in chain. (-P).

4718

+<item> List the rules in a chain (-L).

4719

+<item> Flush the rules out of a chain (-F).

4720

+<item> Zero the packet and byte counters on all rules in a chain (-Z).

4721

+</enum>

4722

+

4723

+There are several ways to manipulate rules inside a chain:

4724

+

4725

+<enum>

4726

+<item> Append a new rule to a chain (-A).

4727

+<item> Insert a new rule at some position in a chain (-I).

4728

+<item> Replace a rule at some position in a chain (-R).

4729

+<item> Delete a rule at some position in a chain, or the first that matches (-D).

4730

+</enum>

4731

+

4732

+<sect1> What You'll See When Your Computer Starts Up

4733

+

4734

+

4735

+iptables may be a module, called `iptable_filter.o', which should be

4736

+automatically loaded when you first run <tt>iptables</tt>. It can

4737

+also be built into the kernel permanently.

4738

+

4739

+Before any iptables commands have been run (be careful: some

4740

+distributions will run iptables in their initialization scripts),

4741

+there will be no rules in any of the built-in chains (`INPUT',

4742

+`FORWARD' and `OUTPUT'), and all the chains will have a policy of ACCEPT.

4743

+You can alter the default policy of the FORWARD chain by providing the

4744

+`forward=0' option to the iptable_filter module.

4745

+

4746

+<sect1> Operations on a Single Rule

4747

+

4748

+

4749

+This is the bread-and-butter of packet filtering; manipulating rules.

4750

+Most commonly, you will probably use the append (-A) and delete (-D)

4751

+commands. The others (-I for insert and -R for replace) are simple

4752

+extensions of these concepts.

4753

+

4754

+

4755

+Each rule specifies a set of conditions the packet must meet, and what

4756

+to do if it meets them (a `target'). For example, you might want to

4757

+drop all ICMP packets coming from the IP address 127.0.0.1. So in

4758

+this case our conditions are that the protocol must be ICMP and that

4759

+the source address must be 127.0.0.1. Our target is `DROP'.

4760

+

4761

+

4762

+127.0.0.1 is the `loopback' interface, which you will have even if you

4763

+have no real network connection. You can use the `ping' program to

4764

+generate such packets (it simply sends an ICMP type 8 (echo request)

4765

+which all cooperative hosts should obligingly respond to with an ICMP

4766

+type 0 (echo reply) packet). This makes it useful for testing.

4767

+

4768

+<tscreen><verb>

4769

+# ping -c 1 127.0.0.1

4770

+PING 127.0.0.1 (127.0.0.1): 56 data bytes

4771

+64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.2 ms

4772

+

4773

+--- 127.0.0.1 ping statistics ---

4774

+1 packets transmitted, 1 packets received, 0% packet loss

4775

+round-trip min/avg/max = 0.2/0.2/0.2 ms

4776

+# iptables -A INPUT -s 127.0.0.1 -p icmp -j DROP

4777

+# ping -c 1 127.0.0.1

4778

+PING 127.0.0.1 (127.0.0.1): 56 data bytes

4779

+

4780

+--- 127.0.0.1 ping statistics ---

4781

+1 packets transmitted, 0 packets received, 100% packet loss

4782

+#

4783

+</verb></tscreen>

4784

+

4785

+You can see here that the first ping succeeds (the `-c 1' tells ping

4786

+to only send a single packet).

4787

+

4788

+

4789

+Then we append (-A) to the `INPUT' chain, a rule specifying that for

4790

+packets from 127.0.0.1 (`-s 127.0.0.1') with protocol ICMP (`-p icmp')

4791

+we should jump to DROP (`-j DROP').

4792

+

4793

+

4794

+Then we test our rule, using the second ping. There will be a pause

4795

+before the program gives up waiting for a response that will never

4796

+come.

4797

+

4798

+

4799

+We can delete the rule in one of two ways. Firstly, since we know

4800

+that it is the only rule in the input chain, we can use a numbered

4801

+delete, as in:

4802

+<tscreen><verb>

4803

+ # iptables -D INPUT 1

4804

+ #

4805

+</verb></tscreen>

4806

+This deletes rule number 1 in the INPUT chain.

4807

+

4808

+

4809

+The second way is to mirror the -A command, but replacing the -A with

4810

+-D. This is useful when you have a complex chain of rules and you

4811

+don't want to have to count them to figure out that it's rule 37 that

4812

+you want to get rid of. In this case, we would use:

4813

+<tscreen><verb>

4814

+ # iptables -D INPUT -s 127.0.0.1 -p icmp -j DROP

4815

+ #

4816

+</verb></tscreen>

4817

+The syntax of -D must have exactly the same options as the -A (or -I

4818

+or -R) command. If there are multiple identical rules in the same

4819

+chain, only the first will be deleted.

4820

+

4821

+<sect1>Filtering Specifications

4822

+

4823

+

4824

+We have seen the use of `-p' to specify protocol, and `-s' to specify

4825

+source address, but there are other options we can use to specify

4826

+packet characteristics. What follows is an exhaustive compendium.

4827

+

4828

+<sect2>Specifying Source and Destination IP Addresses

4829

+

4830

+

4831

+Source (`-s', `--source' or `--src') and destination (`-d',

4832

+`--destination' or `--dst') IP addresses can be specified in four

4833

+ways. The most common way is to use the full name, such as

4834

+`localhost' or `www.linuxhq.com'. The second way is to specify the IP

4835

+address such as `127.0.0.1'.

4836

+

4837

+

4838

+The third and fourth ways allow specification of a group of IP

4839

+addresses, such as `199.95.207.0/24' or `199.95.207.0/255.255.255.0'.

4840

+These both specify any IP address from 199.95.207.0 to 199.95.207.255

4841

+inclusive; the digits after the `/' tell which parts of the IP address

4842

+are significant. `/32' or `/255.255.255.255' is the default (match

4843

+all of the IP address). To specify any IP address at all `/0' can be

4844

+used, like so:

4845

+<tscreen><verb>

4846

+ [ NOTE: `-s 0/0' is redundant here. ]

4847

+ # iptables -A INPUT -s 0/0 -j DROP

4848

+ #

4849

+</verb></tscreen>

4850

+

4851

+This is rarely used, as the effect above is the same as not specifying

4852

+the `-s' option at all.
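To see that the prefix-length and netmask forms really describe the same set of addresses, here is a short check using Python's standard `ipaddress` module. It only illustrates the address arithmetic; it does not call iptables, and the sample addresses tested for membership are arbitrary.

```python
import ipaddress

# The third and fourth ways of writing the same group of addresses.
prefix_form = ipaddress.ip_network("199.95.207.0/24")
mask_form = ipaddress.ip_network("199.95.207.0/255.255.255.0")

print(prefix_form == mask_form)            # True
print(prefix_form[0], prefix_form[-1])     # 199.95.207.0 199.95.207.255
print(ipaddress.ip_address("199.95.207.42") in prefix_form)  # True
print(ipaddress.ip_address("199.95.208.1") in prefix_form)   # False

# A bare address is treated as /32: every bit is significant.
single = ipaddress.ip_network("127.0.0.1")
print(single.prefixlen)                    # 32

# /0 means no bits are significant, so any address matches.
anything = ipaddress.ip_network("0.0.0.0/0")
print(ipaddress.ip_address("8.8.8.8") in anything)           # True
```

Note that Python wants the full `0.0.0.0/0`, whereas iptables also accepts the shorthand `0/0` used above.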

4853

+

4854

+<sect2>Specifying Inversion

4855

+

4856

+

4857

+Many flags, including the `-s' (or `--source') and `-d'

4858

+(`--destination') flags can have their arguments preceded by `!'

4859

+(pronounced `not') to match addresses NOT equal to the ones given.

4860

+For example, `-s ! localhost' matches any packet <bf>not</bf> coming

4861

+from localhost.

4862

+

4863

+<sect2>Specifying Protocol

4864

+

4865

+

4866

+The protocol can be specified with the `-p' (or `--protocol') flag.

4867

+Protocol can be a number (if you know the numeric protocol values for

4868

+IP) or a name for the special cases of `TCP', `UDP' or `ICMP'. Case

4869

+doesn't matter, so `tcp' works as well as `TCP'.

4870

+

4871

+

4872

+The protocol name can be prefixed by a `!', to invert it, such as `-p

4873

+! TCP' to specify packets which are <bf>not</bf> TCP.

4874

+

4875

+<sect2>Specifying an Interface

4876

+

4877

+

4878

+The `-i' (or `--in-interface') and `-o' (or `--out-interface') options

4879

+specify the name of an <bf>interface</bf> to match. An interface is

4880

+the physical device the packet came in on (`-i') or is going out on

4881

+(`-o'). You can use the <tt>ifconfig</tt> command to list the

4882

+interfaces which are `up' (i.e., working at the moment).

4883

+

4884

+

4885

+Packets traversing the <tt>INPUT</tt> chain don't have an output

4886

+interface, so any rule using `-o' in this chain will never match.

4887

+Similarly, packets traversing the <tt>OUTPUT</tt> chain don't have an

4888

+input interface, so any rule using `-i' in this chain will never match.

4889

+

4890

+Only packets traversing the <tt>FORWARD</tt> chain have both an

4891

+input and output interface.

4892

+

4893

+

4894

+It is perfectly legal to specify an interface that currently does not

4895

+exist; the rule will not match anything until the interface comes up.

4896

+This is extremely useful for dial-up PPP links (usually interface

4897

+<tt>ppp0</tt>) and the like.

4898

+

4899

+

4900

+As a special case, an interface name ending with a `+' will match all

4901

+interfaces (whether they currently exist or not) which begin with that

4902

+string. For example, to specify a rule which matches all PPP

4903

+interfaces, the <tt>-i ppp+</tt> option would be used.

4904

+

4905

+

4906

+The interface name can be preceded by a `!' with spaces around it, to

4907

+match a packet which does <bf>not</bf> match the specified

4908

+interface(s), eg <tt>-i ! ppp+</tt>.
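The interface matching rules above (exact name, trailing `+` as a prefix wildcard, and `!` for inversion) can be summarised in a few lines of Python. The function `iface_matches` is a hypothetical helper written for this illustration, not part of iptables.

```python
def iface_matches(spec: str, name: str, invert: bool = False) -> bool:
    """Return True if interface `name` satisfies the rule's interface `spec`.

    A spec ending in '+' matches every interface whose name begins with the
    text before the '+'; otherwise the names must be identical. `invert`
    models a preceding '!'.
    """
    if spec.endswith("+"):
        hit = name.startswith(spec[:-1])
    else:
        hit = name == spec
    return hit != invert


print(iface_matches("ppp+", "ppp0"))               # True
print(iface_matches("ppp+", "eth0"))               # False
print(iface_matches("ppp+", "eth0", invert=True))  # True, as with `-i ! ppp+'
print(iface_matches("ppp0", "ppp1"))               # False
```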

4909

+

4910

+<sect2>Specifying Fragments

4911

+

4912

+

4913

+Sometimes a packet is too large to fit down a wire all at once. When

4914

+this happens, the packet is divided into <bf>fragments</bf>, and sent

4915

+as multiple packets. The other end reassembles these fragments to

4916

+reconstruct the whole packet.

4917

+

4918

+

4919

+The problem with fragments is that the initial fragment has the

4920

+complete header fields (IP + TCP, UDP and ICMP) to examine, but

4921

+subsequent packets only have a subset of the headers (IP without the

4922

+additional protocol fields). Thus looking inside subsequent fragments

4923

+for protocol headers (such as is done by the TCP, UDP and ICMP

4924

+extensions) is not possible.

4925

+

4926

+

4927

+If you are doing connection tracking or NAT, then all fragments will

4928

+get merged back together before they reach the packet filtering code,

4929

+so you need never worry about fragments.

4930

+

4931

+

4932

+Please also note that the INPUT chain of the filter table (or any other

4933

+table hooking into the NF_IP_LOCAL_IN hook) is traversed after

4934

+defragmentation by the core IP stack.

4935

+

4936

+

4937

+Otherwise, it is important to understand how fragments get treated by

4938

+the filtering rules. Any filtering rule that asks for information we

4939

+don't have will not match. This means that the first fragment is

4940

+treated like any other packet. Second and further fragments won't be.

4941

+Thus a rule <tt>-p TCP --sport www</tt> (specifying a source port of

4942

+`www') will never match a fragment (other than the first fragment).

4943

+Neither will the opposite rule <tt>-p TCP --sport ! www</tt>.
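The point that both a port rule and its inverse fail to match later fragments is easy to get wrong, so here is a minimal Python model of it. It assumes no connection tracking or NAT is in use (otherwise fragments are reassembled first), and `sport_rule_matches` is a hypothetical name used only for this sketch.

```python
from typing import Optional


def sport_rule_matches(frag_offset: int, sport: Optional[int],
                       wanted: int, invert: bool = False) -> bool:
    """Model of `-p TCP --sport [!] <wanted>` applied to one IP packet.

    Second and further fragments (non-zero fragment offset) carry no TCP
    header, so the rule asks for information we don't have and cannot
    match, whether inverted or not.
    """
    if frag_offset != 0 or sport is None:
        return False
    return (sport == wanted) != invert


WWW = 80
print(sport_rule_matches(0, 80, WWW))                     # True: first fragment, port www
print(sport_rule_matches(0, 1234, WWW, invert=True))      # True: first fragment, not www
print(sport_rule_matches(185, None, WWW))                 # False: later fragment
print(sport_rule_matches(185, None, WWW, invert=True))    # False: later fragment, inverted
```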

4944

+

4945

+

4946

+However, you can specify a rule specifically for second and further

4947

+fragments, using the `-f' (or `--fragment') flag. It is also legal to

4948

+specify that a rule does not apply to second and further

4949

+fragments, by preceding the `-f' with ` ! '.

4950

+

4951

+

4952

+Usually it is regarded as safe to let second and further fragments

4953

+through, since filtering will affect the first fragment, and thus

4954

+prevent reassembly on the target host; however, bugs have been known

4955

+to allow crashing of machines simply by sending fragments. Your call.

4956

+

4957

+

4958

+Note for network-heads: malformed packets (TCP, UDP and ICMP packets

4959

+too short for the firewalling code to read the ports or ICMP code and

4960

+type) are dropped when such examinations are attempted. So are TCP

4961

+fragments starting at position 8.

4962

+

4963

+

4964

+As an example, the following rule will drop any fragments going to

4965

+192.168.1.1:

4966

+

4967

+<tscreen><verb>

4968

+# iptables -A OUTPUT -f -d 192.168.1.1 -j DROP

4969

+#

4970

+</verb></tscreen>

4971

+

4972

+<sect2>Extensions to iptables: New Matches

4973

+

4974

+<tt>iptables</tt> is <bf>extensible</bf>, meaning that both the

4975

+kernel and the iptables tool can be extended to provide new features.

4976

+

4977

+Some of these extensions are standard, and others are more exotic.

4978

+Extensions can be made by other people and distributed separately for

4979

+niche users.

4980

+

4981

+Kernel extensions normally live in the kernel module subdirectory,

4982

+such as /lib/modules/2.4.0-test10/kernel/net/ipv4/netfilter. They are demand loaded if your

4983

+kernel was compiled with CONFIG_KMOD set, so you should not need to

4984

+manually insert them.

4985

+

4986

+Extensions to the iptables program are shared libraries which

4987

+usually live in /usr/local/lib/, although a distribution

4988

+would put them in /lib/iptables or /usr/lib/iptables.

4989

+

4990

+Extensions come in two types: new targets, and new matches (we'll

4991

+talk about new targets a little later). Some protocols automatically

4992

+offer new tests: currently these are TCP, UDP and ICMP as shown below.

4993

+

4994

+For these you will be able to specify the new tests on the command

4995

+line after the `-p' option, which will load the extension. For

4996

+explicit new tests, use the `-m' option to load the extension, after

4997

+which the extended options will be available.

4998

+

4999

+To get help on an extension, use the option to load it (`-p', `-j' or

5000

+`-m') followed by `-h' or `--help', eg:

5001

+<tscreen><verb>

5002

+# iptables -p tcp --help

5003

+#

5004

+</verb></tscreen>

5005

+

5006

+<sect3>TCP Extensions

5007

+

5008

+

5009

+The TCP extensions are automatically loaded if `-p tcp' is specified.

5010

+They provide the following options (none of which match fragments).

5011

+

5012

+

5013

+<descrip>

5014

+<tag>--tcp-flags</tag> Followed by an optional `!', then two strings

5015

+of flags, allows you to filter on specific TCP flags. The first

5016

+string of flags is the mask: a list of flags you want to examine. The

5017

+second string of flags tells which one(s) should be set. For example,

5018

+

5019

+<tscreen><verb>

5020

+# iptables -A INPUT --protocol tcp --tcp-flags ALL SYN,ACK -j DROP

5021

+</verb></tscreen>

5022

+

5023

+This indicates that all flags should be examined (`ALL' is synonymous

5024

+with `SYN,ACK,FIN,RST,URG,PSH'), but only SYN and ACK should be set.

5025

+There is also an argument `NONE' meaning no flags.

5026

+

5027

+<tag>--syn</tag> Optionally preceded by a `!', this is shorthand

5028

+ for `--tcp-flags SYN,RST,ACK SYN'.

5029

+

5030

+<tag>--source-port</tag> followed by an optional `!', then either a

5031

+single TCP port, or a range of ports. Ports can be port names, as

5032

+listed in /etc/services, or numeric. Ranges are either two port names

5033

+separated by a `:', or (to specify greater than or equal to a given

5034

+port) a port with a `:' appended, or (to specify less than or equal to

5035

+a given port), a port preceded by a `:'.

5036

+

5037

+<tag>--sport</tag> is synonymous with `--source-port'.

5038

+

5039

+<tag>--destination-port</tag> and <tag>--dport</tag> are the same as

5040

+above, only they specify the destination, rather than source, port to

5041

+match.

5042

+

5043

+<tag>--tcp-option</tag> followed by an optional `!' and a number,

5044

+matches a packet with a TCP option equaling that number. A packet

5045

+which does not have a complete TCP header is dropped automatically if

5046

+an attempt is made to examine its TCP options.

5047

+</descrip>
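The port range syntax for `--source-port' and `--destination-port' described above can be sketched in Python as follows. This is an illustration only: it handles numeric ports but not names from /etc/services, and the helper names `parse_port_range` and `port_matches` are hypothetical.

```python
from typing import Tuple


def parse_port_range(spec: str) -> Tuple[int, int]:
    """Turn 'N', 'LOW:HIGH', 'LOW:' or ':HIGH' into an inclusive (low, high) pair."""
    if ":" in spec:
        low, high = spec.split(":", 1)
        # A missing bound defaults to the smallest or largest possible port.
        return (int(low) if low else 0, int(high) if high else 65535)
    port = int(spec)
    return (port, port)


def port_matches(spec: str, port: int, invert: bool = False) -> bool:
    """Return True if `port` falls in the range; `invert` models a leading '!'."""
    low, high = parse_port_range(spec)
    return (low <= port <= high) != invert


print(parse_port_range("80"))         # (80, 80)
print(parse_port_range("6000:6010"))  # (6000, 6010)
print(parse_port_range("1024:"))      # (1024, 65535)
print(parse_port_range(":1023"))      # (0, 1023)
print(port_matches("1024:", 22))      # False
```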

5048

+

5049

+<sect4>An Explanation of TCP Flags

5050

+

5051

+

5052

+It is sometimes useful to allow TCP connections in one direction, but

5053

+not the other. For example, you might want to allow connections to an

5054

+external WWW server, but not connections from that server.

5055

+

5056

+

5057

+The naive approach would be to block TCP packets coming from the

5058

+server. Unfortunately, TCP connections require packets going in both

5059

+directions to work at all.

5060

+

5061

+

5062

+The solution is to block only the packets used to request a

5063

+connection. These packets are called <bf>SYN</bf> packets (ok,

5064

+technically they're packets with the SYN flag set, and the RST and ACK

5065

+flags cleared, but we call them SYN packets for short). By

5066

+disallowing only these packets, we can stop attempted connections in

5067

+their tracks.

5068

+

5069

+

5070

+The `--syn' flag is used for this: it is only valid for rules which

5071

+specify TCP as their protocol. For example, to specify TCP connection

5072

+attempts from 192.168.1.1:

5073

+<tscreen><verb>

5074

+-p TCP -s 192.168.1.1 --syn

5075

+</verb></tscreen>

5076

+

5077

+

5078

+This flag can be inverted by preceding it with a `!', which means

5079

+every packet other than the connection initiation.
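The mask-and-compare logic behind `--tcp-flags' and `--syn' can be written out in a few lines of Python. This is a sketch of the semantics described above, not the kernel code; the function names are hypothetical, while the bit values are the standard TCP header flag bits.

```python
# Standard TCP header flag bits.
FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def flag_bits(names: str) -> int:
    """Convert a comma-separated flag list (or ALL / NONE) into a bit mask."""
    if names == "ALL":
        return 0x3F
    if names == "NONE":
        return 0
    bits = 0
    for name in names.split(","):
        bits |= FLAGS[name]
    return bits


def tcp_flags_match(mask: str, comp: str, packet_flags: str, invert: bool = False) -> bool:
    """Examine only the flags in `mask`; exactly those in `comp` must be set."""
    hit = (flag_bits(packet_flags) & flag_bits(mask)) == flag_bits(comp)
    return hit != invert


def syn_match(packet_flags: str, invert: bool = False) -> bool:
    """`--syn' is shorthand for `--tcp-flags SYN,RST,ACK SYN'."""
    return tcp_flags_match("SYN,RST,ACK", "SYN", packet_flags, invert)


print(syn_match("SYN"))                               # True: connection request
print(syn_match("SYN,ACK"))                           # False: reply to a request
print(syn_match("ACK", invert=True))                  # True: `! --syn' matches the rest
print(tcp_flags_match("ALL", "SYN,ACK", "SYN,ACK"))   # True
```

Because the mask for `--syn' leaves out FIN, URG and PSH, those flags do not influence whether a packet counts as a connection request.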

5080

+

5081

+<sect3>UDP Extensions

5082

+

5083

+

5084

+These extensions are automatically loaded if `-p udp' is specified.

5085

+They provide the options `--source-port', `--sport',

5086

+`--destination-port' and `--dport' as detailed for TCP above.

5087

+

5088

+<sect3>ICMP Extensions

5089

+

5090

+

5091

+This extension is automatically loaded if `-p icmp' is specified. It

5092

+provides only one new option:

5093

+

5094

+

5095

+<descrip>

5096

+<tag>--icmp-type</tag> followed by an optional `!', then either an

5097

+icmp type name (eg `host-unreachable'), or a numeric type (eg. `3'),

5098

+or a numeric type and code separated by a `/' (eg. `3/3'). A list

5099

+of available icmp type names is given using `-p icmp --help'.

5100

+</descrip>

5101

+

5102

+<sect3>Other Match Extensions

5103

+

5104

+

5105

+The other extensions in the netfilter package are demonstration

5106

+extensions, which (if installed) can be invoked with the `-m' option.

5107

+

5108

+<descrip>

5109

+<tag>mac</tag> This module must be explicitly specified with `-m mac'

5110

+or `--match mac'. It is used for matching an incoming packet's source

5111

+Ethernet (MAC) address, and thus only useful for packets traversing

5112

+the PREROUTING and INPUT chains. It provides only one option:

5113

+

5114

+ <descrip>

5115

+ <tag>--mac-source</tag> followed by an optional `!', then an

5116

+ ethernet address in colon-separated hexbyte notation, eg

5117

+ `--mac-source 00:60:08:91:CC:B7'.

5118

+ </descrip>

5119

+

5120

+<tag>limit</tag> This module must be explicitly specified with `-m

5121

+limit' or `--match limit'. It is used to restrict the rate of

5122

+matches, such as for suppressing log messages. It will only match a

5123

+given number of times per second (by default 3 matches per hour,

5124

+with a burst of 5). It takes two optional arguments:

5125

+

5126

+ <descrip>

5127

+ <tag>--limit</tag> followed by a number; specifies the maximum

5128

+ average number of matches to allow per second. The number can

5129

+ specify units explicitly, using `/second', `/minute', `/hour' or

5130

+ `/day', or parts of them (so `5/second' is the same as `5/s').

5131

+

5132

+ <tag>--limit-burst</tag> followed by a number, indicating the

5133

+ maximum burst before the above limit kicks in.

5134

+ </descrip>

5135

+

5136

+This match can often be used with the LOG target to do rate-limited

5137

+logging. To understand how it works, let's look at the following

5138

+rule, which logs packets with the default limit parameters:

5139

+

5140

+<tscreen><verb>

5141

+# iptables -A FORWARD -m limit -j LOG

5142

+</verb></tscreen>

5143

+

5144

+The first time this rule is reached, the packet will be logged; in

5145

+fact, since the default burst is 5, the first five packets will be

5146

+logged. After this, it will be twenty minutes before a packet will be

5147

+logged from this rule, regardless of how many packets reach it. Also,

5148

+every twenty minutes which passes without matching a packet, one of

5149

+the burst will be regained; if no packets hit the rule for 100

5150

+minutes, the burst will be fully recharged; back where we started.
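The twenty and 100 minute figures follow directly from the default rate and burst. A minimal sketch of the arithmetic, assuming a rate given in matches per hour (the helper name `recharge_minutes' is invented for this example):

```shell
#!/bin/sh
# recharge_minutes RATE_PER_HOUR BURST
# Prints two numbers: the minutes needed to regain one match, and the
# minutes needed to regain the whole burst. Integer arithmetic, so a
# rate that does not divide 60 evenly is rounded down.
recharge_minutes() {
    per_match=$((60 / $1))
    echo "$per_match $((per_match * $2))"
}

recharge_minutes 3 5    # the defaults: 3 per hour with a burst of 5
```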

5151

+

5152

+Note: you cannot currently create a rule with a recharge time

5153

+greater than about 59 hours, so if you set an average rate of one per

5154

+day, then your burst rate must be less than 3.

5155

+

5156

+You can also use this module to avoid various denial of service

5157

+attacks (DoS) with a faster rate to increase responsiveness.

5158

+

5159

+Syn-flood protection:

5160

+<tscreen><verb>

5161

+# iptables -A FORWARD -p tcp --syn -m limit --limit 1/s -j ACCEPT

5162

+</verb></tscreen>

5163

+

5164

+Furtive port scanner:

5165

+<tscreen><verb>

5166

+# iptables -A FORWARD -p tcp --tcp-flags SYN,ACK,FIN,RST RST -m limit --limit 1/s -j ACCEPT

5167

+</verb></tscreen>

5168

+

5169

+Ping of death:

5170

+<tscreen><verb>

5171

+# iptables -A FORWARD -p icmp --icmp-type echo-request -m limit --limit 1/s -j ACCEPT

5172

+</verb></tscreen>

5173

+

5174

+This module works like a "hysteresis door", as shown in the graph

5175

+below.

5176

+

5177

+<tscreen><verb>

5178

+ rate (pkt/s)

5179

+ ^ .---.

5180

+ | / DoS \

5181

+ | / \

5182

+Edge of DoS -|.....:.........\.......................

5183

+ = (limit * | /: \

5184

+limit-burst) | / : \ .-.

5185

+ | / : \ / \

5186

+ | / : \ / \

5187

+End of DoS -|/....:..............:.../.......\..../.

5188

+ = limit | : :`-' `--'

5189

+-------------+-----+--------------+------------------> time (s)

5190

+ LOGIC => Match | Didn't Match | Match

5191

+</verb></tscreen>

5192

+

5193

+Say we match one packet per second with a five packet

5194

+burst, but packets start coming in at four per second, for three

5195

+seconds, then start again in another three seconds.

5196

+<tscreen><verb>

5197

+

5198

+

5199

+ <--Flood 1--> <---Flood 2--->

5200

+

5201

+Total ^ Line __-- YNNN

5202

+Packets| Rate __-- YNNN

5203

+ | mum __-- YNNN

5204

+ 10 | Maxi __-- Y

5205

+ | __-- Y

5206

+ | __-- Y

5207

+ | __-- YNNN

5208

+ |- YNNN

5209

+ 5 | Y

5210

+ | Y Key: Y -> Matched Rule

5211

+ | Y N -> Didn't Match Rule

5212

+ | Y

5213

+ |Y

5214

+ 0 +--------------------------------------------------> Time (seconds)

5215

+ 0 1 2 3 4 5 6 7 8 9 10 11 12

5216

+</verb></tscreen>

5217

+

5218

+You can see that the first five packets are allowed to exceed the one

5219

+packet per second, then the limiting kicks in. If there is a pause,

5220

+another burst is allowed but not past the maximum rate set by the

5221

+rule (1 packet per second after the burst is used).
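The behaviour can be approximated with a small token-bucket model in shell. This is a simplified sketch, not the kernel's code: it assumes the bucket starts full, counts time in milliseconds, and the function name `simulate_limit' and its argument layout are hypothetical.

```shell
#!/bin/sh
# simulate_limit COST_MS BURST TIME_MS...
# COST_MS is the time one match costs (1000 for 1/second), BURST is
# the bucket size in matches, and each TIME_MS is a packet arrival
# time. Prints one letter per packet: Y for a match, N otherwise.
simulate_limit() {
    cost=$1
    burst=$2
    shift 2
    cap=$((cost * burst))
    credit=$cap              # assumption: start with a full burst
    prev=0
    out=""
    for t in "$@"; do
        # Credit trickles back in as time passes, up to the cap.
        credit=$((credit + t - prev))
        if [ "$credit" -gt "$cap" ]; then
            credit=$cap
        fi
        prev=$t
        # A match spends one packet's worth of credit.
        if [ "$credit" -ge "$cost" ]; then
            credit=$((credit - cost))
            out="${out}Y"
        else
            out="${out}N"
        fi
    done
    echo "$out"
}

# Four packets per second for three seconds, then a single packet
# after a three second pause.
simulate_limit 1000 5 0 250 500 750 1000 1250 1500 1750 2000 2250 2500 2750 6000
```

In this model credit keeps trickling in while the burst is being spent, so the exact position of each Y and N can differ slightly from a hand-drawn graph; the overall shape (a burst, then roughly one match per second, then a fresh allowance after a pause) is the same.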

5222

+

5223

+<tag>owner</tag>

5224

+This module attempts to match various characteristics of the packet

5225

+creator, for locally-generated packets. It is only valid in the

5226

+OUTPUT chain, and even then some packets (such as ICMP ping responses)

5227

+may have no owner, and hence never match.

5228

+

5229

+<descrip>

5230

+ <tag>--uid-owner userid</tag>

5231

+Matches if the packet was created by a process with the given

5232

+effective (numerical) user id.

5233

+ <tag>--gid-owner groupid</tag>

5234

+Matches if the packet was created by a process with the given

5235

+effective (numerical) group id.

5236

+ <tag>--pid-owner processid</tag>

5237

+Matches if the packet was created by a process with the given

5238

+process id.

5239

+ <tag>--sid-owner sessionid</tag>

5240

+Matches if the packet was created by a process in the given session

5241

+group.

5242

+</descrip>

5243

+

5244

+<tag>unclean</tag> This experimental module must be explicitly

5245

+specified with `-m unclean' or `--match unclean'. It does various

5246

+random sanity checks on packets. This module has not been audited,

5247

+and should not be used as a security device (it probably makes things

5248

+worse, since it may well have bugs itself). It provides no options.

5249

+</descrip>

5250

+

5251

+<sect3>The State Match

5252

+

5253

+The most useful match criterion is supplied by the `state'

5254

+extension, which interprets the connection-tracking analysis of the

5255

+`ip_conntrack' module. This is highly recommended.

5256

+

5257

+Specifying `-m state' allows an additional `--state' option, which

5258

+is a comma-separated list of states to match (the `!' flag indicates

5259

+<bf>not</bf> to match those states). These states are:

5260

+

5261

+<descrip>

5262

+<tag>NEW</tag> A packet which creates a new connection.

5263

+

5264

+<tag>ESTABLISHED</tag> A packet which belongs to an existing

5265

+connection (i.e., a reply packet, or outgoing packet on a connection

5266

+which has seen replies).

5267

+

5268

+<tag>RELATED</tag> A packet which is related to, but not part of, an

5269

+existing connection, such as an ICMP error, or (with the FTP module

5270

+inserted), a packet establishing an ftp data connection.

5271

+

5272

+<tag>INVALID</tag> A packet which could not be identified for some

5273

+reason: this includes running out of memory and ICMP errors which

5274

+don't correspond to any known connection. Generally these packets

5275

+should be dropped.

5276

+</descrip>

5277

+

5278

+An example of this powerful match extension would be:

5279

+<tscreen><verb>

5280

+# iptables -A FORWARD -i ppp0 -m state ! --state NEW -j DROP

5281

+</verb></tscreen>

5282

+

5283

+<sect1>Target Specifications

5284

+

5285

+Now we know what examinations we can do on a packet, we need a way

5286

+of saying what to do to the packets which match our tests. This is

5287

+called a rule's <bf>target</bf>.

5288

+

5289

+There are two very simple built-in targets: DROP and ACCEPT. We've

5290

+already met them. If a rule matches a packet and its target is one of

5291

+these two, no further rules are consulted: the packet's fate has been

5292

+decided.

5293

+

5294

+There are two types of targets other than the built-in ones:

5295

+extensions and user-defined chains.

5296

+

5297

+<sect2>User-defined chains

5298

+

5299

+

5300

+One powerful feature which <tt>iptables</tt> inherits from

5301

+<tt>ipchains</tt> is the ability for the user to create new chains, in

5302

+addition to the three built-in ones (INPUT, FORWARD and OUTPUT). By

5303

+convention, user-defined chains are lower-case to distinguish them

5304

+(we'll describe how to create new user-defined chains below in <ref

5305

+id="chain-ops" name="Operations on an Entire Chain">).

5306

+

5307

+

5308

+When a packet matches a rule whose target is a user-defined chain, the

5309

+packet begins traversing the rules in that user-defined chain. If

5310

+that chain doesn't decide the fate of the packet, then once traversal

5311

+on that chain has finished, traversal resumes on the next rule in the

5312

+current chain.

5313

+

5314

+

5315

+Time for more ASCII art. Consider two (silly) chains: <tt>INPUT</tt> (the

5316

+built-in chain) and <tt>test</tt> (a user-defined chain).

5317

+

5318

+<tscreen><verb>

5319

+ `INPUT' `test'

5320

+ ---------------------------- ----------------------------

5321

+ | Rule1: -p ICMP -j DROP | | Rule1: -s 192.168.1.1 |

5322

+ |--------------------------| |--------------------------|

5323

+ | Rule2: -p TCP -j test | | Rule2: -d 192.168.1.1 |

5324

+ |--------------------------| ----------------------------

5325

+ | Rule3: -p UDP -j DROP |

5326

+ ----------------------------

5327

+</verb></tscreen>

5328

+

5329

+

5330

+Consider a TCP packet coming from 192.168.1.1, going to 1.2.3.4. It

5331

+enters the <tt>INPUT</tt> chain, and gets tested against Rule1 - no match.

5332

+Rule2 matches, and its target is <tt>test</tt>, so the next rule examined

5333

+is the start of <tt>test</tt>. Rule1 in <tt>test</tt> matches, but doesn't

5334

+specify a target, so the next rule is examined, Rule2. This doesn't

5335

+match, so we have reached the end of the chain. We return to the

5336

+<tt>INPUT</tt> chain, where we had just examined Rule2, so we now examine

5337

+Rule3, which doesn't match either.

5338

+

5339

+

5340

+So the packet path is:

5341

+<tscreen><verb>

5342

+ v __________________________

5343

+ `INPUT' | / `test' v

5344

+ ------------------------|--/ -----------------------|----

5345

+ | Rule1 | /| | Rule1 | |

5346

+ |-----------------------|/-| |----------------------|---|

5347

+ | Rule2 / | | Rule2 | |

5348

+ |--------------------------| -----------------------v----

5349

+ | Rule3 /--+___________________________/

5350

+ ------------------------|---

5351

+ v

5352

+</verb></tscreen>
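The walk through the two chains can be mimicked with a few lines of shell. This is only a sketch of the traversal order for the `INPUT' and `test' chains described above, with the rules hard-coded; the function name `walk' and its output format are invented for the example.

```shell
#!/bin/sh
# walk PROTO SRC DST: print the rules examined, in order, followed by
# the packet's fate. A `+' after a rule in `test' marks a match.
walk() {
    proto=$1
    src=$2
    dst=$3
    path="INPUT/1"
    if [ "$proto" = icmp ]; then
        echo "$path DROP"
        return
    fi
    path="$path INPUT/2"
    if [ "$proto" = tcp ]; then
        # Jump to `test'. Its rules have no target, so a match
        # decides nothing and traversal carries on.
        if [ "$src" = 192.168.1.1 ]; then
            path="$path test/1+"
        else
            path="$path test/1"
        fi
        if [ "$dst" = 192.168.1.1 ]; then
            path="$path test/2+"
        else
            path="$path test/2"
        fi
    fi
    path="$path INPUT/3"
    if [ "$proto" = udp ]; then
        echo "$path DROP"
        return
    fi
    # Fell off the end of the built-in chain: its policy decides.
    echo "$path policy"
}

walk tcp 192.168.1.1 1.2.3.4
```

For the TCP packet in the text this prints the same order as the walk-through: both `INPUT' rules up to the jump, both rules of `test', then Rule3 of `INPUT'.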

5353

+

5354

+User-defined chains can jump to other user-defined chains (but

5355

+don't make loops: your packets will be dropped if they're found to

5356

+be in a loop).

5357

+

5358

+<sect2>Extensions to iptables: New Targets

5359

+

5360

+The other type of extension is a target. A target extension

5361

+consists of a kernel module, and an optional extension to

5362

+<tt>iptables</tt> to provide new command line options. There are

5363

+several extensions in the default netfilter distribution:

5364

+

5365

+<descrip>

5366

+<tag>LOG</tag> This module provides kernel logging of matching

5367

+packets. It provides these additional options:

5368

+ <descrip>

5369

+ <tag>--log-level</tag> Followed by a level number or name. Valid

5370

+ names are (case-insensitive) `debug', `info', `notice', `warning',

5371

+ `err', `crit', `alert' and `emerg', corresponding to numbers 7

5372

+ through 0. See the man page for syslog.conf for an explanation of

5373

+ these levels. The default is `warning'.

5374

+

5375

+ <tag>--log-prefix</tag> Followed by a string of up to 29 characters,

5376

+ this message is sent at the start of the log message, to allow it to

5377

+ be uniquely identified.

5378

+ </descrip>

5379

+

5380

+ This module is most useful after a limit match, so you don't flood

5381

+ your logs.

5382

+

5383

+<tag>REJECT</tag> This module has the same effect as `DROP', except

5384

+that the sender is sent an ICMP `port unreachable' error message.

5385

+Note that the ICMP error message is not sent if (see RFC 1122):

5386

+

5387

+<itemize>

5388

+<item> The packet being filtered was an ICMP error message in the

5389

+first place, or some unknown ICMP type.

5390

+

5391

+<item> The packet being filtered was a non-head fragment.

5392

+

5393

+<item> We've sent too many ICMP error messages to that destination

5394

+recently (see /proc/sys/net/ipv4/icmp_ratelimit).

5395

+</itemize>

5396

+

5397

+REJECT also takes a `--reject-with' optional argument which alters the

5398

+reply packet used: see the manual page.

5399

+</descrip>
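Since `--log-prefix' is limited to 29 characters, a script that generates LOG rules may want to check the prefix first. A minimal sketch, assuming a hypothetical helper `log_rule' that prints the command instead of running it (so it can be tried without root):

```shell
#!/bin/sh
# log_rule CHAIN PREFIX: print (not run) an iptables command that
# appends a rate-limited LOG rule to CHAIN. Fails if PREFIX is longer
# than the 29 characters mentioned above.
log_rule() {
    if [ "${#2}" -gt 29 ]; then
        echo "log prefix too long: ${#2} characters" >&2
        return 1
    fi
    echo "iptables -A $1 -m limit -j LOG --log-prefix \"$2\""
}

log_rule FORWARD "fwd: "
```

Putting `-m limit' in front of the LOG target is the combination recommended above for keeping logs from being flooded.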

5400

+

5401

+<sect2>Special Built-In Targets

5402

+

5403

+There are two special built-in targets: <tt>RETURN</tt> and

5404

+<tt>QUEUE</tt>.

5405

+

5406

+<tt>RETURN</tt> has the same effect as falling off the end of a

5407

+chain: for a rule in a built-in chain, the policy of the chain is

5408

+executed. For a rule in a user-defined chain, the traversal continues

5409

+at the previous chain, just after the rule which jumped to this chain.

5410

+

5411

+<tt>QUEUE</tt> is a special target, which queues the packet for

5412

+userspace processing. For this to be useful, two further components are

5413

+required:

5414

+

5415

+<itemize>

5416

+<item>a "queue handler", which deals with the actual mechanics of

5417

+passing packets between the kernel and userspace; and

5418

+<item>a userspace application to receive, possibly manipulate, and

5419

+issue verdicts on packets.

5420

+</itemize>

5421

+The standard queue handler for IPv4 iptables is the ip_queue module,

5422

+which is distributed with the kernel and marked as experimental.

5423

+

5424

+The following is a quick example of how to use iptables to queue packets

5425

+for userspace processing:

5426

+<tscreen><verb>

5427

+# modprobe iptable_filter

5428

+# modprobe ip_queue

5429

+# iptables -A OUTPUT -p icmp -j QUEUE

5430

+</verb></tscreen>

5431

+With this rule, locally generated outgoing ICMP packets (as created with,

5432

+say, ping) are passed to the ip_queue module, which then attempts to deliver

5433

+the packets to a userspace application. If no userspace application is

5434

+waiting, the packets are dropped.

5435

+

5436

+To write a userspace application, use the libipq API. This is

5437

+distributed with iptables. Example code may be found in the testsuite

5438

+tools (e.g. redirect.c) in CVS.

5439

+

5440

+The status of ip_queue may be checked via:

5441

+<tscreen><verb>

5442

+/proc/net/ip_queue

5443

+</verb></tscreen>

5444

+The maximum length of the queue (i.e. the number of packets delivered

5445

+to userspace with no verdict issued back) may be controlled via:

5446

+<tscreen><verb>

5447

+/proc/sys/net/ipv4/ip_queue_maxlen

5448

+</verb></tscreen>

5449

+The default value for the maximum queue length is 1024. Once this limit

5450

+is reached, new packets will be dropped until the length of the queue falls

5451

+below the limit again. Nice protocols such as TCP interpret dropped packets

5452

+as congestion, and will hopefully back off when the queue fills up. However,

5453

+it may take some experimenting to determine an ideal maximum queue length

5454

+for a given situation if the default value is too small.

5455

+

5456

+<sect1>Operations on an Entire Chain<label id="chain-ops">

5457

+

5458

+

5459

+A very useful feature of <tt>iptables</tt> is the ability to group

5460

+related rules into chains. You can call the chains whatever you want,

5461

+but I recommend using lower-case letters to avoid confusion with the

5462

+built-in chains and targets. Chain names can be up to 31 letters

5463

+long.

5464

+

5465

+<sect2>Creating a New Chain

5466

+

5467

+

5468

+Let's create a new chain. Because I am such an imaginative fellow,

5469

+I'll call it <tt>test</tt>. We use the `-N' or `--new-chain' options:

5470

+

5471

+<tscreen><verb>

5472

+# iptables -N test

5473

+#

5474

+</verb></tscreen>

5475

+

5476

+

5477

+It's that simple. Now you can put rules in it as detailed above.

5478

+

5479

+<sect2>Deleting a Chain

5480

+

5481

+

5482

+Deleting a chain is simple as well, using the `-X' or `--delete-chain'

5483

+options. Why `-X'? Well, all the good letters were taken.

5484

+

5485

+<tscreen><verb>

5486

+# iptables -X test

5487

+#

5488

+</verb></tscreen>

5489

+

5490

+

5491

+There are a couple of restrictions to deleting chains: they must be

5492

+empty (see <ref id="flushing" name="Flushing a Chain"> below) and they

5493

+must not be the target of any rule. You can't delete any of the three

5494

+built-in chains.

5495

+

5496

+

5497

+If you don't specify a chain, then all user-defined chains

5498

+will be deleted, if possible.

5499

+

5500

+<sect2> Flushing a Chain<label id="flushing">

5501

+

5502

+

5503

+There is a simple way of emptying all rules out of a chain, using the

5504

+`-F' (or `--flush') command.

5505

+

5506

+<tscreen><verb>

5507

+# iptables -F FORWARD

5508

+#

5509

+</verb></tscreen>

5510

+

5511

+

5512

+If you don't specify a chain, then all chains will be flushed.

5513

+

5514

+<sect2>Listing a Chain

5515

+

5516

+

5517

+You can list all the rules in a chain by using the `-L' (or `--list')

5518

+command.

5519

+

5520

+

5521

+The `refcnt' listed for each user-defined chain is the number of rules

5522

+which have that chain as their target. This must be zero (and the

5523

+chain be empty) before this chain can be deleted.

5524

+

5525

+

5526

+If the chain name is omitted, all chains are listed, even empty ones.

5527

+

5528

+

5529

+There are three options which can accompany `-L'. The `-n' (numeric)

5530

+option is very useful as it prevents <tt>iptables</tt> from trying to

5531

+look up the IP addresses, which (if you are using DNS like most people)

5532

+will cause large delays if your DNS is not set up properly, or you

5533

+have filtered out DNS requests. It also causes TCP and UDP ports to

5534

+be printed out as numbers rather than names.

5535

+

5536

+

5537

+The `-v' option shows you all the details of the rules, such as the

5538

+packet and byte counters, the TOS comparisons, and the interfaces.

5539

+Otherwise these values are omitted.

5540

+

5541

+

5542

+Note that the packet and byte counters are printed out using the

5543

+suffixes `K', `M' or `G' for 1000, 1,000,000 and 1,000,000,000

5544

+respectively. Using the `-x' (expand numbers) flag as well prints the

5545

+full numbers, no matter how large they are.
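When post-processing a listing made without `-x', the suffixes can be expanded with a small helper. This is a sketch using the decimal multipliers given above; the name `expand_counter' is made up, and because abbreviated counters have already lost their low-order digits the result is only approximate.

```shell
#!/bin/sh
# expand_counter VALUE: turn an abbreviated counter such as "12K" into
# a plain number (K = 1000, M = 1,000,000, G = 1,000,000,000).
# Values without a suffix are printed unchanged.
expand_counter() {
    case $1 in
        *K) echo "$(( ${1%K} * 1000 ))" ;;
        *M) echo "$(( ${1%M} * 1000000 ))" ;;
        *G) echo "$(( ${1%G} * 1000000000 ))" ;;
        *)  echo "$1" ;;
    esac
}

expand_counter 12K
expand_counter 3M
```

If exact figures matter, listing with `-x' in the first place avoids the problem.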

5546

+

5547

+<sect2>Resetting (Zeroing) Counters

5548

+

5549

+

5550

+It is useful to be able to reset the counters. This can be done with

5551

+the `-Z' (or `--zero') option.

5552

+

5553

+

5554

+Consider the following:

5555

+

5556

+<tscreen><verb>

5557

+# iptables -L FORWARD

5558

+# iptables -Z FORWARD

5559

+#

5560

+</verb></tscreen>

5561

+

5562

+In the above example, some packets could pass through between the `-L'

5563

+and `-Z' commands. For this reason, you can use the `-L' and `-Z'

5564

+together, to reset the counters while reading them.

5565

+

5566

+<sect2>Setting Policy<label id="policy">

5567

+

5568

+

5569

+We glossed over what happens when a packet hits the end of a built-in

5570

+chain when we discussed how a packet walks through chains earlier. In

5571

+this case, the <bf>policy</bf> of the chain determines the fate of the

5572

+packet. Only built-in chains (<tt>INPUT</tt>, <tt>OUTPUT</tt> and

5573

+<tt>FORWARD</tt>) have policies, because if a packet falls off the end

5574

+of a user-defined chain, traversal resumes at the previous chain.

5575

+

5576

+

5577

+The policy can be either <tt>ACCEPT</tt> or <tt>DROP</tt>, for

5578

+example:

5579

+

5580

+<tscreen><verb>

5581

+# iptables -P FORWARD DROP

5582

+#

5583

+</verb></tscreen>

5584

+

5585

+<sect> Using ipchains and ipfwadm<label id="oldstyle">

5586

+

5587

+ There are modules in the netfilter distribution called ipchains.o

5588

+and ipfwadm.o. Insert one of these in your kernel (NOTE: they are

5589

+incompatible with ip_tables.o!). Then you can use ipchains or ipfwadm

5590

+just like the good old days.

5591

+

5592

+ This will be supported for some time yet. I think a reasonable

5593

+formula is 2 * [notice of replacement - initial stable release],

5594

+beyond the date that a stable release of the replacement is available.

5595

+This means that support will probably be dropped in Linux 2.6 or 2.8.

5596

+

5597

+<sect> Mixing NAT and Packet Filtering

5598

+

5599

+

5600

+It's common to want to do Network Address Translation (see the NAT

5601

+HOWTO) and packet filtering. The good news is that they mix extremely

5602

+well.

5603

+

5604

+You design your packet filtering completely ignoring any NAT you

5605

+are doing. The sources and destinations seen by the packet filter

5606

+will be the `real' sources and destinations. For example, if you are

5607

+doing DNAT to send any connections to 1.2.3.4 port 80 through to

5608

+10.1.1.1 port 8080, the packet filter would see packets going to

5609

+10.1.1.1 port 8080 (the real destination), not 1.2.3.4 port 80.

5610

+Similarly, you can ignore masquerading: packets will seem to come from

5611

+their real internal IP addresses (say 10.1.1.1), and replies will seem

5612

+to go back there.

5613

+

5614

+You can use the `state' match extension without making the packet

5615

+filter do any extra work, since NAT requires connection tracking

5616

+anyway. To enhance the simple masquerading example in the NAT HOWTO

5617

+to disallow any new connections from coming in the ppp0 interface, you

5618

+would do this:

5619

+

5620

+<tscreen><verb>

5621

+# Masquerade out ppp0

5622

+iptables -t nat -A POSTROUTING -o ppp0 -j MASQUERADE

5623

+

5624

+# Disallow NEW and INVALID incoming or forwarded packets from ppp0.

5625

+iptables -A INPUT -i ppp0 -m state --state NEW,INVALID -j DROP

5626

+iptables -A FORWARD -i ppp0 -m state --state NEW,INVALID -j DROP

5627

+

5628

+# Turn on IP forwarding

5629

+echo 1 > /proc/sys/net/ipv4/ip_forward

5630

+</verb></tscreen>

5631

+

5632

+<sect> Differences Between iptables and ipchains<label id="Appendix-A">

5633

+

5634

+

5635

+<itemize>

5636

+<item> Firstly, the names of the built-in chains have changed from

5637

+lower case to UPPER case, because the INPUT and OUTPUT chains now only

5638

+get locally-destined and locally-generated packets. They used to see

5639

+all incoming and all outgoing packets respectively.

5640

+

5641

+<item> The `-i' flag now means the incoming interface, and only works

5642

+in the INPUT and FORWARD chains. Rules in the FORWARD or OUTPUT

5643

+chains that used `-i' should be changed to `-o'.

5644

+

5645

+<item> TCP and UDP ports now need to be spelled out with the

5646

+--source-port or --sport (or --destination-port/--dport) options, and

5647

+must be placed after the `-p tcp' or `-p udp' options, as this loads

5648

+the TCP or UDP extensions respectively.

5649

+

5650

+<item> The TCP -y flag is now --syn, and must be after `-p tcp'.

5651

+

5652

+<item> The DENY target is now DROP, finally.

5653

+

5654

+<item> Zeroing single chains while listing them works.

5655

+

5656

+<item> Zeroing built-in chains also clears policy counters.

5657

+

5658

+<item> Listing chains gives you the counters as an atomic snapshot.

5659

+

5660

+<item> REJECT and LOG are now extended targets, meaning they are

5661

+separate kernel modules.

5662

+

5663

+<item> Chain names can be up to 31 characters.

5664

+

5665

+<item> MASQ is now MASQUERADE and uses a different syntax. REDIRECT,

5666

+while keeping the same name, has also undergone a syntax change. See

5667

+the NAT-HOWTO for more information on how to configure both of these.

5668

+

5669

+<item> The -o option is no longer used to direct packets to the userspace

5670

+device (see -i above). Packets are now sent to userspace via the QUEUE

5671

+target.

5672

+

5673

+<item> Probably heaps of other things I forgot.

5674

+</itemize>

5675

+

5676

+<sect> Advice on Packet Filter Design

5677

+

5678

+

5679

+Common wisdom in the computer security arena is to block everything,

5680

+then open up holes as necessary. This is usually phrased `that which

5681

+is not explicitly allowed is prohibited'. I recommend this approach

5682

+if security is your maximal concern.

5683

+

5684

+Do not run any services you do not need to, even if you think you

5685

+have blocked access to them.

5686

+

5687

+If you are creating a dedicated firewall, start by running nothing,

5688

+and blocking all packets, then add services and let packets through as

5689

+required.

5690

+

5691

+I recommend security in depth: combine tcp-wrappers (for

5692

+connections to the packet filter itself), proxies (for connections

5693

+passing through the packet filter), route verification and packet

5694

+filtering. Route verification is where a packet which comes from an

5695

+unexpected interface is dropped: for example, if your internal network

5696

+has addresses 10.1.1.0/24, and a packet with that source address comes

5697

+in your external interface, it will be dropped. This can be enabled

5698

+for one interface (ppp0) like so:

5699

+

5700

+<tscreen><verb>

5701

+# echo 1 > /proc/sys/net/ipv4/conf/ppp0/rp_filter

5702

+#

5703

+</verb></tscreen>

5704

+

5705

+Or for all existing and future interfaces like this:

5706

+

5707

+<tscreen><verb>

5708

+# for f in /proc/sys/net/ipv4/conf/*/rp_filter; do

5709

+# echo 1 > $f

5710

+# done

5711

+#

5712

+</verb></tscreen>
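The same loop can be wrapped in a function for use from a boot script. A minimal sketch, assuming a hypothetical helper `set_rp_filter' whose optional second argument lets it be tried against a scratch directory instead of /proc:

```shell
#!/bin/sh
# set_rp_filter VALUE [CONF_DIR]: write VALUE (1 to enable, 0 to
# disable) into the rp_filter file of every interface directory below
# CONF_DIR, which defaults to /proc/sys/net/ipv4/conf.
set_rp_filter() {
    value=$1
    conf_dir=${2:-/proc/sys/net/ipv4/conf}
    for f in "$conf_dir"/*/rp_filter; do
        # Skip the unexpanded pattern if nothing matched.
        [ -e "$f" ] || continue
        echo "$value" > "$f"
    done
}

# Typical use, as root:
#   set_rp_filter 1
```

Passing 0 instead of 1 gives the reverse operation, which is handy on machines with asymmetric routing.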

5713

+

5714

+Debian does this by default where possible. If you have asymmetric

5715

+routing (ie. you expect packets coming in from strange directions),

5716

+you will want to disable this filtering on those interfaces.

5717

+

5718

+Logging is useful when setting up a firewall if something isn't

5719

+working, but on a production firewall, always combine it with the

5720

+`limit' match, to prevent someone from flooding your logs.

5721

+

5722

+I highly recommend connection tracking for secure systems: it

5723

+introduces some overhead, as all connections are tracked, but is very

5724

+useful for controlling access to your networks. You may need to load

5725

+the `ip_conntrack.o' module if your kernel does not load modules

5726

+automatically, and it's not built into the kernel. If you want to

5727

+accurately track complex protocols, you'll need to load the

5728

+appropriate helper module (eg. `ip_conntrack_ftp.o').

5729

+

5730

+<tscreen><verb>

5731

+# iptables -N no-conns-from-ppp0

5732

+# iptables -A no-conns-from-ppp0 -m state --state ESTABLISHED,RELATED -j ACCEPT

5733

+# iptables -A no-conns-from-ppp0 -m state --state NEW -i ! ppp0 -j ACCEPT

5734

+# iptables -A no-conns-from-ppp0 -i ppp0 -m limit -j LOG --log-prefix "Bad packet from ppp0:"

5735

+# iptables -A no-conns-from-ppp0 -i ! ppp0 -m limit -j LOG --log-prefix "Bad packet not from ppp0:"

5736

+# iptables -A no-conns-from-ppp0 -j DROP

5737

+

5738

+# iptables -A INPUT -j no-conns-from-ppp0

5739

+# iptables -A FORWARD -j no-conns-from-ppp0

5740

+</verb></tscreen>

5741

+

5742

+Building a good firewall is beyond the scope of this HOWTO, but my

5743

+advice is `always be minimalist'. See the Security HOWTO for more

5744

+information on testing and probing your box.

5745

+

5746

+</article>

5747

+

5748

Index: iptables-1.4.12/Makefile.am

5749

===================================================================

5750

--- iptables-1.4.12.orig/Makefile.am 2011-11-07 13:57:20.000000000 -0600

5751

+++ iptables-1.4.12/Makefile.am 2011-11-07 13:58:55.000000000 -0600

5752

@@ -3,7 +3,7 @@

5753

ACLOCAL_AMFLAGS = -I m4

5754

AUTOMAKE_OPTIONS = foreign subdir-objects

5755

5756

-SUBDIRS = libiptc libxtables

5757

+SUBDIRS = libiptc libxtables howtos

5758

if ENABLE_DEVEL

5759

SUBDIRS += include

5760

endif