eaPVNe2ccXLfEegoda4xU2TezbGfbSEGoU1qolyQYLX674sNA2Ni6l6/CEKYYh
The ClamAV project distributes a number of CVD files, including
\emph{main.cvd} and \emph{daily.cvd}.

\section{Debug information from libclamav}
To create efficient signatures for ClamAV, it is important
to understand how the engine handles input files. The best way
to see how it works is to look at the debug information from
libclamav, which you can obtain by calling \verb+clamscan+ with the
\verb+--debug+ and \verb+--leave-temps+ flags. The first switch
makes clamscan display all the interesting information from
libclamav, and the second one prevents temporary files from being
deleted so they can be analyzed further. The important part of the
output is:
\begin{verbatim}
$ clamscan --debug attachment.exe
LibClamAV debug: Recognized MS-EXE/DLL file
LibClamAV debug: Matched signature for file type PE
LibClamAV debug: File type: Executable
\end{verbatim}
The engine recognized a Windows executable.
\begin{verbatim}
LibClamAV debug: Machine type: 80386
LibClamAV debug: NumberOfSections: 3
LibClamAV debug: TimeDateStamp: Fri Jan 10 04:57:55 2003
LibClamAV debug: SizeOfOptionalHeader: e0
LibClamAV debug: File format: PE
LibClamAV debug: MajorLinkerVersion: 6
LibClamAV debug: MinorLinkerVersion: 0
LibClamAV debug: SizeOfCode: 0x9000
LibClamAV debug: SizeOfInitializedData: 0x1000
LibClamAV debug: SizeOfUninitializedData: 0x1e000
LibClamAV debug: AddressOfEntryPoint: 0x27070
LibClamAV debug: BaseOfCode: 0x1f000
LibClamAV debug: SectionAlignment: 0x1000
LibClamAV debug: FileAlignment: 0x200
LibClamAV debug: MajorSubsystemVersion: 4
LibClamAV debug: MinorSubsystemVersion: 0
LibClamAV debug: SizeOfImage: 0x29000
LibClamAV debug: SizeOfHeaders: 0x400
LibClamAV debug: NumberOfRvaAndSizes: 16
LibClamAV debug: Subsystem: Win32 GUI
LibClamAV debug: ------------------------------------
LibClamAV debug: Section 0
LibClamAV debug: Section name: UPX0
LibClamAV debug: Section data (from headers - in memory)
LibClamAV debug: VirtualSize: 0x1e000 0x1e000
LibClamAV debug: VirtualAddress: 0x1000 0x1000
LibClamAV debug: SizeOfRawData: 0x0 0x0
LibClamAV debug: PointerToRawData: 0x400 0x400
LibClamAV debug: Section's memory is executable
LibClamAV debug: Section's memory is writeable
LibClamAV debug: ------------------------------------
LibClamAV debug: Section 1
LibClamAV debug: Section name: UPX1
LibClamAV debug: Section data (from headers - in memory)
LibClamAV debug: VirtualSize: 0x9000 0x9000
LibClamAV debug: VirtualAddress: 0x1f000 0x1f000
LibClamAV debug: SizeOfRawData: 0x8200 0x8200
LibClamAV debug: PointerToRawData: 0x400 0x400
LibClamAV debug: Section's memory is executable
LibClamAV debug: Section's memory is writeable
LibClamAV debug: ------------------------------------
LibClamAV debug: Section 2
LibClamAV debug: Section name: UPX2
LibClamAV debug: Section data (from headers - in memory)
LibClamAV debug: VirtualSize: 0x1000 0x1000
LibClamAV debug: VirtualAddress: 0x28000 0x28000
LibClamAV debug: SizeOfRawData: 0x200 0x1ff
LibClamAV debug: PointerToRawData: 0x8600 0x8600
LibClamAV debug: Section's memory is writeable
LibClamAV debug: ------------------------------------
LibClamAV debug: EntryPoint offset: 0x8470 (33904)
\end{verbatim}
The section structure displayed above suggests the executable is
packed with UPX.
\begin{verbatim}
LibClamAV debug: ------------------------------------
LibClamAV debug: EntryPoint offset: 0x8470 (33904)
LibClamAV debug: UPX/FSG/MEW: empty section found - assuming
LibClamAV debug: UPX: bad magic - scanning for imports
LibClamAV debug: UPX: PE structure rebuilt from compressed file
LibClamAV debug: UPX: Successfully decompressed with NRV2B
LibClamAV debug: UPX/FSG: Decompressed data saved in
/tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede
LibClamAV debug: ***** Scanning decompressed file *****
LibClamAV debug: Recognized MS-EXE/DLL file
LibClamAV debug: Matched signature for file type PE
\end{verbatim}
Indeed, libclamav recognizes the UPX data and saves the decompressed
(and rebuilt) executable into \verb+/tmp/clamav-90d2d25c9dca42bae6fa9a764a4bcede+.
Then it continues by scanning this new file:
\begin{verbatim}
LibClamAV debug: File type: Executable
LibClamAV debug: Machine type: 80386
LibClamAV debug: NumberOfSections: 3
LibClamAV debug: TimeDateStamp: Thu Jan 27 11:43:15 2011
LibClamAV debug: SizeOfOptionalHeader: e0
LibClamAV debug: File format: PE
LibClamAV debug: MajorLinkerVersion: 6
LibClamAV debug: MinorLinkerVersion: 0
LibClamAV debug: SizeOfCode: 0xc000
LibClamAV debug: SizeOfInitializedData: 0x19000
LibClamAV debug: SizeOfUninitializedData: 0x0
LibClamAV debug: AddressOfEntryPoint: 0x7b9f
LibClamAV debug: BaseOfCode: 0x1000
LibClamAV debug: SectionAlignment: 0x1000
LibClamAV debug: FileAlignment: 0x1000
LibClamAV debug: MajorSubsystemVersion: 4
LibClamAV debug: MinorSubsystemVersion: 0
LibClamAV debug: SizeOfImage: 0x26000
LibClamAV debug: SizeOfHeaders: 0x1000
LibClamAV debug: NumberOfRvaAndSizes: 16
LibClamAV debug: Subsystem: Win32 GUI
LibClamAV debug: ------------------------------------
LibClamAV debug: Section 0
LibClamAV debug: Section name: .text
LibClamAV debug: Section data (from headers - in memory)
LibClamAV debug: VirtualSize: 0xc000 0xc000
LibClamAV debug: VirtualAddress: 0x1000 0x1000
LibClamAV debug: SizeOfRawData: 0xc000 0xc000
LibClamAV debug: PointerToRawData: 0x1000 0x1000
LibClamAV debug: Section contains executable code
LibClamAV debug: Section's memory is executable
LibClamAV debug: ------------------------------------
LibClamAV debug: Section 1
LibClamAV debug: Section name: .rdata
LibClamAV debug: Section data (from headers - in memory)
LibClamAV debug: VirtualSize: 0x2000 0x2000
LibClamAV debug: VirtualAddress: 0xd000 0xd000
LibClamAV debug: SizeOfRawData: 0x2000 0x2000
LibClamAV debug: PointerToRawData: 0xd000 0xd000
LibClamAV debug: ------------------------------------
LibClamAV debug: Section 2
LibClamAV debug: Section name: .data
LibClamAV debug: Section data (from headers - in memory)
LibClamAV debug: VirtualSize: 0x17000 0x17000
LibClamAV debug: VirtualAddress: 0xf000 0xf000
LibClamAV debug: SizeOfRawData: 0x17000 0x17000
LibClamAV debug: PointerToRawData: 0xf000 0xf000
LibClamAV debug: Section's memory is writeable
LibClamAV debug: ------------------------------------
LibClamAV debug: EntryPoint offset: 0x7b9f (31647)
LibClamAV debug: Bytecode executing hook id 257 (0 hooks)
\end{verbatim}
No additional files get created by libclamav. A signature written
for the decompressed file has a better chance of detecting the
target data when it gets compressed with another packer.
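
The \verb+EntryPoint offset+ values in the two debug listings above can be
reproduced by hand from the section tables. The following Python sketch
applies the usual PE mapping from a relative virtual address to a file
offset (offset = RVA - VirtualAddress + PointerToRawData, for the section
that contains the RVA). It is only an illustration: \verb+Section+ and
\verb+rva_to_offset+ are hypothetical names, not part of any ClamAV API,
and libclamav's own code may treat alignment and malformed headers
differently.
\begin{verbatim}
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int  # VirtualAddress from the debug output
    virtual_size: int     # VirtualSize
    raw_pointer: int      # PointerToRawData


def rva_to_offset(rva: int, sections: Sequence[Section]) -> Optional[int]:
    """Map an RVA to a file offset; None if no section contains it."""
    for sec in sections:
        end = sec.virtual_address + sec.virtual_size
        if sec.virtual_address <= rva < end:
            return rva - sec.virtual_address + sec.raw_pointer
    return None


# Section values copied from the two debug listings.
packed = [
    Section("UPX0", 0x1000, 0x1e000, 0x400),
    Section("UPX1", 0x1f000, 0x9000, 0x400),
    Section("UPX2", 0x28000, 0x1000, 0x8600),
]
unpacked = [
    Section(".text", 0x1000, 0xc000, 0x1000),
    Section(".rdata", 0xd000, 0x2000, 0xd000),
    Section(".data", 0xf000, 0x17000, 0xf000),
]

print(hex(rva_to_offset(0x27070, packed)))   # 0x8470
print(hex(rva_to_offset(0x7b9f, unpacked)))  # 0x7b9f
\end{verbatim}
Both results agree with the \verb+EntryPoint offset+ lines printed by
libclamav for the packed and the decompressed file.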

This method should be applied to all files for which you want
to create signatures. By analyzing the debug information you
can quickly see how the engine recognizes and preprocesses
the data and what additional files get created. Signatures
created for bottom-level temporary files are usually more
generic and should help detect the same malware in
\section{Signature formats}
(63|64)61706528;S+50:68efa311c3b9963cb1ee8e586d32aeb9043e;f9c58d
cf43987e4f519d629b103375;SL+550:6300680065005c0046006900

\subsection{Signatures based on archive metadata}
Signatures based on metadata inside archive files can provide effective
protection against malware that spreads via encrypted ZIP or RAR
archives. The format of a metadata signature is:
\begin{verbatim}
virname:encrypted:filename:normal size:csize:crc32:cmethod:fileno:max depth
\end{verbatim}

ClamAV 0.96 introduced support for special macro subsignatures in
the following format: \verb+${min-max}MACROID$+, where \verb+MACROID+
points to a group of signatures and \verb+{min-max}+ specifies the
offset range at which one of the group signatures should match.
The range is calculated relative to the match offset of the previous
subsignature. The subsignature preceding the macro is considered
a match only if both of them match. For more
information and examples, please see
\url{https://wwws.clamav.net/bugzilla/show_bug.cgi?id=164}.
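
As a rough illustration of the format described above, the following
Python sketch splits a macro subsignature into its offset range and
group identifier. The function name \verb+parse_macro+ and the sample
string are made up for this example, and treating \verb+MACROID+ as an
opaque token is an assumption; the sketch does not reproduce how
libclamav evaluates the range.
\begin{verbatim}
import re
from typing import Tuple

# ${min-max}MACROID$ ; MACROID is kept as an opaque token (assumption).
MACRO_RE = re.compile(r"\$\{(\d+)-(\d+)\}([^$]+)\$")


def parse_macro(subsig: str) -> Tuple[int, int, str]:
    """Return (min, max, macro_id) for a macro subsignature string."""
    match = MACRO_RE.fullmatch(subsig)
    if match is None:
        raise ValueError(f"not a macro subsignature: {subsig!r}")
    low, high = int(match.group(1)), int(match.group(2))
    if low > high:
        raise ValueError("min must not exceed max")
    return low, high, match.group(3)


# Hypothetical subsignature, not taken from a real database.
print(parse_macro("${6-7}12$"))  # (6, 7, '12')
\end{verbatim}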

\subsection{Icon signatures for PE files}
ClamAV 0.96 includes an approximate/fuzzy icon matcher to help
detect malicious executables disguising themselves as
innocent-looking image files, office documents and the like.

Icon matching is only triggered via .ldb signatures using the special
attribute tokens \verb+IconGroup1+ or \verb+IconGroup2+. These identify
two (optional) groups of icons defined in a .idb database file. The
format of the .idb file is:
\begin{verbatim}
ICON_NAME:GROUP1:GROUP2:ICON_HASH
\end{verbatim}
where:
\begin{itemize}
    \item \verb+ICON_NAME+ is a unique string identifier for a specific
    icon,
    \item \verb+GROUP1+ is a string identifier for the first group of
    icons (\verb+IconGroup1+),
    \item \verb+GROUP2+ is a string identifier for the second group of
    icons (\verb+IconGroup2+),
    \item \verb+ICON_HASH+ is a fuzzy hash of the icon image.
\end{itemize}
The \verb+ICON_HASH+ field can be obtained from the debug output of
libclamav. For example:
\begin{verbatim}
LibClamAV debug: ICO SIGNATURE:
ICON_NAME:GROUP1:GROUP2:18e2e0304ce60a0cc3a09053a30000414100057e
000afe0000e 80006e510078b0a08910d11ad04105e0811510f084e01040c080
a1d0b0021000a39002a41
\end{verbatim}
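
The four-field layout of an .idb entry can be checked with a few lines
of Python. This is a minimal sketch, assuming one entry per line and no
colons inside the fields; \verb+IconEntry+ and \verb+parse_idb_line+ are
hypothetical helper names, and the hash in the sample line is a
placeholder rather than a real icon hash.
\begin{verbatim}
from typing import NamedTuple


class IconEntry(NamedTuple):
    icon_name: str
    group1: str
    group2: str
    icon_hash: str


def parse_idb_line(line: str) -> IconEntry:
    """Split ICON_NAME:GROUP1:GROUP2:ICON_HASH into named fields."""
    fields = line.strip().split(":")
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}")
    return IconEntry(*fields)


# Placeholder hash for illustration only; real hashes come from
# the libclamav debug output.
entry = parse_idb_line("ICON_NAME:GROUP1:GROUP2:0123456789abcdef")
print(entry.group1, entry.icon_hash)
\end{verbatim}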

\subsection{Signatures for Version Information metadata in PE files}
Starting with ClamAV 0.96 it is possible to easily match certain
information built into PE files (executables and dynamic link libraries).
Whenever you look up the properties of a PE executable file in Windows,
you are presented with a number of details about the file itself.

This information is stored in a special area of the file resources which goes
under the name of \verb+VS_VERSION_INFORMATION+ (or versioninfo for short).
It is divided into two parts. The first part (which is rather uninteresting)
is really a set of numbers and flags indicating the product and file
version. It was originally intended for use with installers which, after
parsing it, should be able to determine whether a certain executable or
library is to be upgraded/overwritten or is already up to date. Suffice
it to say, this approach never really worked and is generally never used.

The second block is much more interesting: it is a simple list of key/value
strings, intended for user information and completely ignored by the OS.
For example, if you look at ping.exe you can see the company being \emph{``Microsoft
Corporation''}, the description \emph{``TCP/IP Ping command''}, the internal name
\emph{``ping.exe''} and so on. Depending on the OS version, some keys may be given
peculiar visibility in the file properties dialog, however they are internally

To match a versioninfo key/value pair, the special file offset anchor \verb+VI+ was
introduced. This is similar to the other anchors (like \verb+EP+ and \verb+SL+)
except that, instead of matching the hex pattern against a single offset, it checks
it against each and every key/value pair in the file. The \verb+VI+ token neither
needs nor accepts a \verb#+/-# offset like e.g. \verb#EP+1#. As for the hex signature
itself, it is just the UTF-16 dump of the key and value. Only the \verb+??+ and
\verb+(aa|bb)+ wildcards are allowed in the signature. Usually, you don't need to
bother figuring it out: each key/value pair together with the corresponding VI-based
signature is printed by \verb+clamscan+ when the \verb+--debug+ option is given.

For example, \verb+clamscan --debug freecell.exe+ produces:
\begin{verbatim}
Recognized MS-EXE/DLL file
versioninfo_cb: type: 10, name: 1, lang: 410, rva: 9608
cli_peheader: parsing version info @ rva 9608 (1/1)
VersionInfo (d2de): 'CompanyName'='Microsoft Corporation' -
VI:43006f006d00700061006e0079004e0061006d006500000000004d006900
630072006f0073006f0066007400200043006f00720070006f0072006100740
VersionInfo (d32a): 'FileDescription'='Entertainment Pack
FreeCell Game' - VI:460069006c006500440065007300630072006900700
0740069006f006e000000000045006e007400650072007400610069006e006d
0065006e00740020005000610063006b0020004600720065006500430065006
c006c002000470061006d0065000000
VersionInfo (d396): 'FileVersion'='5.1.2600.0 (xpclient.010817
-1148)' - VI:460069006c006500560065007200730069006f006e00000000
0035002e0031002e0032003600300030002e003000200028007800700063006
c00690065006e0074002e003000310030003800310037002d00310031003400
VersionInfo (d3fa): 'InternalName'='freecell' - VI:49006e007400
650072006e0061006c004e0061006d006500000066007200650065006300650
VersionInfo (d4ba): 'OriginalFilename'='freecell' - VI:4f007200
6900670069006e0061006c00460069006c0065006e0061006d0065000000660
0720065006500630065006c006c000000
VersionInfo (d4f6): 'ProductName'='Sistema operativo Microsoft
Windows' - VI:500072006f0064007500630074004e0061006d00650000000
000530069007300740065006d00610020006f00700065007200610074006900
76006f0020004d006900630072006f0073006f0066007400ae0020005700690
06e0064006f0077007300ae000000
VersionInfo (d562): 'ProductVersion'='5.1.2600.0' - VI:50007200
6f006400750063007400560065007200730069006f006e00000035002e00310
02e0032003600300030002e0030000000
\end{verbatim}
Although VI-based signatures are intended for use in logical signatures, you can test them
using ordinary \verb+.ndb+ files. For example:
\begin{verbatim}
my_test_vi_sig:1:VI:paste_your_hex_sig_here
\end{verbatim}
Final note: if you want to decode a VI-based signature into a human-readable form, you can use:
\begin{verbatim}
echo hex_string | xxd -r -p | strings -el
\end{verbatim}
For example:
\begin{verbatim}
$ echo 460069006c0065004400650073006300720069007000740069006f006e\
000000000045006e007400650072007400610069006e006d0065006e007400200\
05000610063006b0020004600720065006500430065006c006c00200047006100\
6d0065000000 | xxd -r -p | strings -el
Entertainment Pack FreeCell Game
\end{verbatim}
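
Where \verb+xxd+ or \verb+strings+ is not available, the same decoding
can be sketched in Python. The helper name \verb+decode_vi+ is
hypothetical, and the sketch assumes a signature without \verb+??+ or
\verb+(aa|bb)+ wildcards, since those cannot be converted from hex.
\begin{verbatim}
from typing import List


def decode_vi(hex_sig: str) -> List[str]:
    """Decode a wildcard-free VI hex signature into its strings."""
    text = bytes.fromhex(hex_sig).decode("utf-16-le")
    # Key and value are NUL-terminated; padding yields empty pieces.
    return [piece for piece in text.split("\x00") if piece]


# ProductVersion signature taken from the freecell.exe output above.
sig = (
    "500072006f006400750063007400560065007200730069006f006e000000"
    "35002e0031002e0032003600300030002e0030000000"
)
print(decode_vi(sig))  # ['ProductVersion', '5.1.2600.0']
\end{verbatim}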

\subsection{Signatures based on container metadata}
ClamAV 0.96 allows creating generic signatures matching files stored
inside different container types which meet specific conditions.
The signature format is
\begin{verbatim}
VirusName:ContainerType:ContainerSize:FileNameREGEX:
FileSizeInContainer:FileSizeReal:IsEncrypted:FilePos:
Res1:Res2[:MinFL[:MaxFL]]
\end{verbatim}
where the corresponding fields are:
\begin{itemize}
    \item \verb+VirusName+: virus name to be displayed when signature matches
    \item \verb+ContainerType+: one of \verb+CL_TYPE_ZIP+, \verb+CL_TYPE_RAR+,
    \verb+CL_TYPE_ARJ+,\\
    \verb+CL_TYPE_CAB+, \verb+CL_TYPE_7Z+, \verb+CL_TYPE_MAIL+, \verb+CL_TYPE_(POSIX|OLD)_TAR+,\\
    \verb+CL_TYPE_CPIO_(OLD|ODC|NEWC|CRC)+ or \verb+*+ to match
    any of the container types listed here
    \item \verb+ContainerSize+: size of the container file itself (e.g. size of
    the zip archive) specified in bytes as absolute value or range \verb+x-y+
    \item \verb+FileNameREGEX+: regular expression describing name of the target file
    \item \verb+FileSizeInContainer+: usually compressed size; for MAIL, TAR and CPIO ==
    \verb+FileSizeReal+; specified in bytes as absolute value or range
    \item \verb+FileSizeReal+: usually uncompressed size; for MAIL, TAR and CPIO ==
    \verb+FileSizeInContainer+; absolute value or range
    \item \verb+IsEncrypted+: 1 if the target file is encrypted, 0 if it's not and
    \item \verb+FilePos+: file position in container (counting from 1); absolute value
    \item \verb+Res1+: when \verb+ContainerType+ is \verb+CL_TYPE_ZIP+ or
    \verb+CL_TYPE_RAR+ this field is treated as a CRC sum of the target file
    specified in hexadecimal format; for other container types it's ignored
    \item \verb+Res2+: not used as of ClamAV 0.96
\end{itemize}
The signatures for container files are stored inside \verb+.cdb+ files.
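
To make the field order concrete, the following Python sketch splits a
.cdb line into named fields and interprets a size given as an absolute
value or an \verb+x-y+ range. It is a naive illustration, not libclamav's
parser: it assumes the regular expression contains no colon, the helper
names \verb+parse_cdb_line+ and \verb+parse_size+ are hypothetical, and
the sample signature (including its CRC and the value used for
\verb+Res2+) is made up.
\begin{verbatim}
from typing import Dict, Tuple

FIELDS = (
    "VirusName", "ContainerType", "ContainerSize", "FileNameREGEX",
    "FileSizeInContainer", "FileSizeReal", "IsEncrypted", "FilePos",
    "Res1", "Res2", "MinFL", "MaxFL",
)


def parse_cdb_line(line: str) -> Dict[str, str]:
    """Split a .cdb line into named fields (no ':' in the regex)."""
    parts = line.strip().split(":")
    if not 10 <= len(parts) <= 12:
        raise ValueError(f"expected 10 to 12 fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


def parse_size(spec: str) -> Tuple[int, int]:
    """Turn an absolute value or x-y range into a (low, high) pair."""
    low, _, high = spec.partition("-")
    return (int(low), int(high)) if high else (int(low), int(low))


# Hypothetical signature, made up for this example.
sig = parse_cdb_line(
    r"Example.Test:CL_TYPE_ZIP:1000-2000:\.exe$"
    r":100:200:0:1:1a2b3c4d:0"
)
print(sig["ContainerType"], parse_size(sig["ContainerSize"]))
\end{verbatim}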

\subsection{Signatures based on ZIP/RAR metadata (obsolete)}
The (now obsolete) archive metadata signatures can only be applied
to ZIP and RAR files and have the following format:
\begin{verbatim}
virname:encrypted:filename:normal size:csize:crc32:cmethod:fileno:max depth
\end{verbatim}
where the corresponding fields are: