94
94
// Instrument a basic block. Must be a true function, ie. the same
95
95
// input always results in the same output, because basic blocks
96
// can be retranslated. Unless you're doing something really
97
// strange... Note that orig_addr_noredir is not necessarily the
98
// same as the address of the first instruction in the IR, due to
99
// function redirection.
100
IRBB*(*instrument)(VgCallbackClosure*,
102
VexGuestLayout*, VexGuestExtents*,
103
IRType gWordTy, IRType hWordTy),
96
// can be retranslated, unless you're doing something really
97
// strange. Anyway, the arguments. Mostly they are straightforward
98
// except for the distinction between redirected and non-redirected
99
// guest code addresses, which is important to understand.
101
// VgCallbackClosure* closure contains extra arguments passed
102
// from Valgrind to the instrumenter, which Vex doesn't know about.
103
// You are free to look inside this structure.
105
// * closure->tid is the ThreadId of the thread requesting the
106
// translation. Not sure why this is here; perhaps callgrind
109
// * closure->nraddr is the non-redirected guest address of the
110
// start of the translation. In other words, the translation is
111
// being constructed because the guest program jumped to
112
// closure->nraddr but no translation of it was found.
114
// * closure->readdr is the redirected guest address, from which
115
// the translation was really made.
117
// To clarify this, consider what happens when, in Memcheck, the
118
// first call to malloc() happens. The guest program will be
119
// trying to jump to malloc() in libc; hence ->nraddr will contain
120
// that address. However, Memcheck intercepts and replaces
121
// malloc, hence ->readdr will be the address of Memcheck's
122
// malloc replacement in
123
// coregrind/m_replacemalloc/vg_replacemalloc.c. It follows
124
// that the first IMark in the translation will be labelled as
125
// from ->readdr rather than ->nraddr.
127
// Since most functions are not redirected, the majority of the
128
// time ->nraddr will be the same as ->readdr. However, you
129
// cannot assume this: if your tool has metadata associated
130
// with code addresses it will get into deep trouble if it does
131
// make this assumption.
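The point about not assuming ->nraddr equals ->readdr can be sketched as follows. This is a minimal illustration, not Valgrind code: ClosureSketch is a hypothetical stand-in that mirrors only the two address fields described above, and the fixed-size table and its helper names are invented for the example. The tool picks one of the two addresses as its key (here the non-redirected one) and uses it consistently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two address fields of the closure. */
typedef struct {
    uint64_t nraddr;  /* address the guest tried to jump to */
    uint64_t readdr;  /* address the translation was really made from */
} ClosureSketch;

/* Hypothetical per-translation metadata table (fixed size, linear scan). */
#define MAX_ENTRIES 16
typedef struct { uint64_t key; int in_use; unsigned long hits; } Entry;
static Entry table[MAX_ENTRIES];

/* Find the entry for 'key', creating it if absent; NULL if the table is full. */
static Entry* lookup_or_add(uint64_t key) {
    Entry* free_slot = NULL;
    for (size_t i = 0; i < MAX_ENTRIES; i++) {
        if (table[i].in_use && table[i].key == key) return &table[i];
        if (!table[i].in_use && free_slot == NULL) free_slot = &table[i];
    }
    if (free_slot != NULL) {
        free_slot->key = key;
        free_slot->in_use = 1;
        free_slot->hits = 0;
    }
    return free_slot;
}

/* Record metadata keyed explicitly on the non-redirected address.
   Keying on readdr instead would be equally valid, as long as every
   lookup elsewhere in the tool uses the same choice. */
static Entry* record_translation(const ClosureSketch* c) {
    assert(c != NULL);
    Entry* e = lookup_or_add(c->nraddr);
    if (e != NULL) e->hits++;
    return e;
}
```

With a redirected function such as the malloc() case above, the two addresses differ, so a lookup by ->readdr would not find an entry stored under ->nraddr.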
133
// IRSB* sb_in is the incoming superblock to be instrumented,
136
// VexGuestLayout* layout contains limited info on the layout of
137
// the guest state: where the stack pointer and program counter
138
// are, and which fields should be regarded as 'always defined'.
139
// Memcheck uses this.
141
// VexGuestExtents* vge points to a structure which states the
142
// precise byte ranges of original code from which this translation
143
// was made (there may be up to three different ranges involved).
144
// Note again that these are the real addresses from which the code
145
// came. And so it should be the case that closure->readdr is the
146
// same as vge->base[0]; indeed Cachegrind contains this assertion.
148
// Tools which associate shadow data with code addresses
149
// (cachegrind, callgrind) need to be particularly clear about
150
// whether they are making the association with redirected or
151
// non-redirected code addresses. Both approaches are viable
152
// but you do need to understand what's going on. See comments
153
// below on discard_basic_block_info().
155
// IRType gWordTy and IRType hWordTy contain the types of native
156
// words on the guest (simulated) and host (real) CPUs. They will
157
// be either Ity_I32 or Ity_I64. So far we have never built a
158
// cross-architecture Valgrind so they should always be the same.
160
/* --- Further comments about the IR that your --- */
161
/* --- instrumentation function will receive. --- */
163
In the incoming IRSB, the IR for each instruction begins with an
164
IRStmt_IMark, which states the address and length of the
165
instruction from which this IR came. This makes it easy for
166
profiling-style tools to know precisely which guest code
167
addresses are being executed.
169
However, before the first IRStmt_IMark, there may be other IR
170
statements -- a preamble. In most cases this preamble is empty,
171
but when it isn't, what it contains is some supporting IR that
172
the JIT uses to ensure control flow works correctly. This
173
preamble does not modify any architecturally defined guest state
174
(registers or memory) and so does not contain anything that will
175
be of interest to your tool.
179
(1) copy any IR preceding the first IMark verbatim to the start
182
(2) not try to instrument it or modify it in any way.
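The two rules above can be sketched as a loop that copies everything before the first IMark and then stops. This is an illustration only: Stmt, StmtTag and copy_preamble are hypothetical stand-ins for the real IR statement types and are not part of any Valgrind or VEX interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for IR statements: only the tag matters here. */
typedef enum { STMT_OTHER, STMT_IMARK } StmtTag;
typedef struct { StmtTag tag; int id; } Stmt;

/* Copy every statement that precedes the first IMark from in[] to out[]
   without modifying it. Returns the index in in[] of the first IMark
   (or n_in if there is none), which is where instrumentation starts. */
static size_t copy_preamble(const Stmt* in, size_t n_in,
                            Stmt* out, size_t out_cap, size_t* n_out) {
    size_t i = 0;
    while (i < n_in && in[i].tag != STMT_IMARK) {
        assert(*n_out < out_cap);   /* output must have room */
        out[(*n_out)++] = in[i];    /* verbatim copy, no instrumentation */
        i++;
    }
    return i;
}
```

An instrumenter built this way would call copy_preamble once and then run its normal per-statement logic from the returned index onwards.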
184
For the record, stuff that may be in the preamble at
187
- A self-modifying-code check has been requested for this block.
188
The preamble will contain instructions to checksum the block,
189
compare against the expected value, and exit the dispatcher
190
requesting a discard (hence forcing a retranslation) if they
193
- This block is known to be the entry point of a wrapper of some
194
function F. In this case the preamble contains code to write
195
the address of the original F (the fn being wrapped) into a
196
'hidden' guest state register _NRADDR. The wrapper can later
197
read this register using a client request and make a
198
non-redirected call to it using another client-request-like
201
- For platforms that use the AIX ABI (including ppc64-linux), it
202
is necessary to have a preamble even for replacement functions
203
(not just for wrappers), because it is necessary to switch the
204
R2 register (constant-pool pointer) to a different value when
205
swizzling the program counter.
207
Hence the preamble pushes both R2 and LR (the return address)
208
on a small 16-entry stack in the guest state and sets R2 to an
209
appropriate value for the wrapper/replacement fn. LR is then
210
set so that the wrapper/replacement fn returns to a magic IR
211
stub which restores R2 and LR and returns.
213
It's all hugely ugly and fragile. And it places a stringent
214
requirement on m_debuginfo to find out the correct R2 (toc
215
pointer) value for the wrapper/replacement function. So much
216
so that m_redir will refuse to honour a redirect-to-me request
217
if it cannot find (by asking m_debuginfo) a plausible R2 value
220
Because this mechanism maintains a shadow stack of (R2,LR)
221
pairs in the guest state, it will fail if the
222
wrapper/redirection function, or anything it calls, longjumps
223
out past the wrapper, because then the magic return stub will
224
not be run and so the shadow stack will not be popped. So it
225
will quickly fill up. Fortunately none of this applies to
226
{x86,amd64,ppc32}-linux; on those platforms, wrappers can
227
longjump and recurse arbitrarily and everything should work
230
Note that copying the preamble verbatim may cause complications
231
for your instrumenter if you shadow IR temporaries. See big
232
comment in MC_(instrument) in memcheck/mc_translate.c for
235
IRSB*(*instrument)(VgCallbackClosure* closure,
237
VexGuestLayout* layout,
238
VexGuestExtents* vge,
105
242
// Finish up, print out any results, etc. `exitcode' is program's exit
106
243
// code. The shadow can be found with VG_(get_exit_status_shadow)().
208
348
.so unloading, or otherwise at the discretion of m_transtab, eg
209
349
when the table becomes too full) to avoid stale information being
210
350
reused for new translations. */
211
extern void VG_(needs_basic_block_discards) (
351
extern void VG_(needs_superblock_discards) (
212
352
// Discard any information that pertains to specific translations
213
353
// or instructions within the address range given. There are two
214
354
// possible approaches.
215
355
// - If info is being stored at a per-translation level, use orig_addr
216
356
// to identify which translation is being discarded. Each translation
217
357
// will be discarded exactly once.
218
// This orig_addr will match the orig_addr which was passed to
219
// to instrument() when this translation was made. Note that orig_addr
220
// won't necessarily be the same as the first address in "extents".
358
// This orig_addr will match the closure->nraddr which was passed to
359
// instrument() (see extensive comments above) when this
360
// translation was made. Note that orig_addr won't necessarily be
361
// the same as the first address in "extents".
221
362
// - If info is being stored at a per-instruction level, you can get
222
363
// the address range(s) being discarded by stepping through "extents".
223
364
// Note that any single instruction may belong to more than one
224
365
// translation, and so could be covered by the "extents" of more than
225
366
// one call to this function.
226
367
// Doing it the first way (as eg. Cachegrind does) is probably easier.
227
void (*discard_basic_block_info)(Addr64 orig_addr, VexGuestExtents extents)
368
void (*discard_superblock_info)(Addr64 orig_addr, VexGuestExtents extents)
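The second, per-instruction approach can be sketched as below. ExtentsSketch is a hypothetical stand-in for VexGuestExtents: the base[] array and the limit of three ranges come from the comments above, while the len[] and n_used field names are assumptions made for this example. InsnInfo and both functions are likewise invented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the extents structure: up to three ranges. */
typedef struct {
    uint64_t base[3];   /* start address of each range */
    uint16_t len[3];    /* length in bytes of each range (assumed field) */
    uint16_t n_used;    /* number of valid ranges (assumed field) */
} ExtentsSketch;

/* Hypothetical per-instruction record kept by a tool. */
typedef struct { uint64_t addr; int valid; } InsnInfo;

/* True if address 'a' lies inside any of the ranges in 'e'. */
static int in_extents(const ExtentsSketch* e, uint64_t a) {
    for (unsigned i = 0; i < e->n_used; i++) {
        if (a >= e->base[i] && a - e->base[i] < e->len[i]) return 1;
    }
    return 0;
}

/* Invalidate every record whose address is covered by the extents.
   Returns how many records were dropped. Records already invalid are
   skipped, so overlapping discards from several translations are safe. */
static size_t discard_insn_info(InsnInfo* infos, size_t n,
                                const ExtentsSketch* e) {
    size_t dropped = 0;
    for (size_t i = 0; i < n; i++) {
        if (infos[i].valid && in_extents(e, infos[i].addr)) {
            infos[i].valid = 0;
            dropped++;
        }
    }
    return dropped;
}
```

Because one instruction may be covered by several translations, the sketch treats a repeated discard of the same address as a no-op rather than an error.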
230
371
/* Tool defines its own command line options? */
231
372
extern void VG_(needs_command_line_options) (
232
373
// Return True if option was recognised. Presumably sets some state to
233
// record the option as well.
374
// record the option as well. Nb: tools can assume that the argv will
375
// never disappear. So they can, for example, store a pointer to a string
376
// within an option, rather than having to make a copy.
234
377
Bool (*process_cmd_line_option)(Char* argv),
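The note about argv lifetime can be sketched as an option handler that keeps a pointer into the option string instead of copying it. The option name --sketch-label= and the variable sketch_label are hypothetical, and plain C types are used in place of Bool and Char.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical option state: points into the option string itself. */
static const char* sketch_label = NULL;

/* Return 1 if the option was recognised, 0 otherwise. On a match, the
   value is not copied; sketch_label points at the text after the '='.
   This is only safe because the option string is assumed to stay alive. */
static int process_option_sketch(const char* arg) {
    static const char prefix[] = "--sketch-label=";
    size_t n = sizeof(prefix) - 1;
    if (strncmp(arg, prefix, n) == 0) {
        sketch_label = arg + n;
        return 1;
    }
    return 0;
}
```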
236
379
// Print out command line usage for options for normal tool operation.
401
547
/* Scheduler events (not exhaustive) */
402
void VG_(track_thread_run)(void(*f)(ThreadId tid));
549
/* Called when 'tid' starts or stops running client code blocks.
550
Gives the total dispatched block count at that event. Note, this
551
is not the same as 'tid' holding the BigLock (the lock that ensures
552
that only one thread runs at a time): a thread can hold the lock
553
for other purposes (making translations, etc) yet not be running
554
client blocks. Obviously though, a thread must hold the lock in
555
order to run client code blocks, so the times bracketed by
556
'start_client_code'..'stop_client_code' are a subset of the times
557
when thread 'tid' holds the BigLock.
559
void VG_(track_start_client_code)(
560
void(*f)(ThreadId tid, ULong blocks_dispatched)
562
void VG_(track_stop_client_code)(
563
void(*f)(ThreadId tid, ULong blocks_dispatched)
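One possible use of this pair of callbacks is to attribute dispatched blocks to threads by taking the difference between the stop and start counts. The sketch below assumes, as the comment states, that blocks_dispatched is a running total at each event; the callback names, the fixed thread limit and the plain C types are all hypothetical.

```c
#include <assert.h>

#define MAX_TIDS 8   /* hypothetical limit, for the sketch only */

static unsigned long long started_at[MAX_TIDS];  /* total at last start */
static unsigned long long blocks_run[MAX_TIDS];  /* blocks per thread */

/* Would be registered as the start_client_code callback. */
static void on_start_client_code(unsigned tid, unsigned long long total) {
    assert(tid < MAX_TIDS);
    started_at[tid] = total;
}

/* Would be registered as the stop_client_code callback. The blocks run
   in this interval are the growth of the total since the matching start. */
static void on_stop_client_code(unsigned tid, unsigned long long total) {
    assert(tid < MAX_TIDS);
    assert(total >= started_at[tid]);
    blocks_run[tid] += total - started_at[tid];
}
```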
405
567
/* Thread events (not exhaustive)
407
Called during thread create, before the new thread has run any
408
instructions (or touched any memory).
410
void VG_(track_post_thread_create)(void(*f)(ThreadId tid, ThreadId child));
411
void VG_(track_post_thread_join) (void(*f)(ThreadId joiner, ThreadId joinee));
413
/* Mutex events (not exhaustive)
414
"void *mutex" is really a pthread_mutex *
416
Called before a thread can block while waiting for a mutex (called
417
regardless of whether the thread will block or not). */
418
void VG_(track_pre_mutex_lock)(void(*f)(ThreadId tid, void* mutex));
420
/* Called once the thread actually holds the mutex (always paired with
422
void VG_(track_post_mutex_lock)(void(*f)(ThreadId tid, void* mutex));
424
/* Called after a thread has released a mutex (no need for a corresponding
425
pre_mutex_unlock, because unlocking can't block). */
426
void VG_(track_post_mutex_unlock)(void(*f)(ThreadId tid, void* mutex));
569
ll_create: low level thread creation. Called before the new thread
570
has run any instructions (or touched any memory). In fact, called
571
immediately before the new thread has come into existence; the new
572
thread can be assumed to exist when notified by this call.
574
ll_exit: low level thread exit. Called after the exiting thread
575
has run its last instruction.
577
The _ll_ part makes it clear these events are not to do with
578
pthread_create or pthread_exit/pthread_join (etc), which are a
579
higher level abstraction synthesised by libpthread. What you can
580
be sure of from _ll_create/_ll_exit is the absolute limits of each
581
thread's lifetime, and hence be assured that all memory references
582
made by the thread fall inside the _ll_create/_ll_exit pair. This
583
is important for tools that need a 100% accurate account of which
584
thread is responsible for every memory reference in the process.
586
pthread_create/join/exit do not give this property. Calls/returns
587
to/from them happen arbitrarily far away from the relevant
588
low-level thread create/quit event, typically by a few hundred
589
instructions; hence a few hundred(ish) memory references could get
590
misclassified each time.
592
pre_thread_first_insn: is called when the thread is all set up and
593
ready to go (stack in place, etc) but has not executed its first
594
instruction yet. Gives threading tools a chance to ask questions
595
about the thread (eg, what is its initial client stack pointer)
596
that are not easily answered at pre_thread_ll_create time.
598
For a given thread, the call sequence is:
599
ll_create (in the parent's context)
600
first_insn (in the child's context)
601
ll_exit (in the child's context)
603
void VG_(track_pre_thread_ll_create) (void(*f)(ThreadId tid, ThreadId child));
604
void VG_(track_pre_thread_first_insn)(void(*f)(ThreadId tid));
605
void VG_(track_pre_thread_ll_exit) (void(*f)(ThreadId tid));
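The call sequence ll_create, first_insn, ll_exit can be sketched as a small per-thread state machine. The callback names, the Phase enum and the thread limit are hypothetical, and an unsigned integer stands in for ThreadId. The assertions encode the ordering described above.

```c
#include <assert.h>

#define MAX_SKETCH_THREADS 8   /* hypothetical limit, for the sketch only */

typedef enum { T_UNUSED, T_CREATED, T_RUNNING, T_EXITED } Phase;
static Phase phase[MAX_SKETCH_THREADS];

/* Runs in the parent's context, before the child executes anything. */
static void on_ll_create(unsigned parent, unsigned child) {
    (void)parent;
    assert(child < MAX_SKETCH_THREADS);
    assert(phase[child] == T_UNUSED || phase[child] == T_EXITED);
    phase[child] = T_CREATED;
}

/* Runs in the child's context, before its first instruction. */
static void on_first_insn(unsigned tid) {
    assert(tid < MAX_SKETCH_THREADS);
    assert(phase[tid] == T_CREATED);
    phase[tid] = T_RUNNING;
}

/* Runs in the child's context, after its last instruction. */
static void on_ll_exit(unsigned tid) {
    assert(tid < MAX_SKETCH_THREADS);
    assert(phase[tid] == T_RUNNING);
    phase[tid] = T_EXITED;
}

/* A memory reference by 'tid' is only expected between ll_create and
   ll_exit; a tool could use this to check its attribution. */
static int may_reference_memory(unsigned tid) {
    return phase[tid] == T_CREATED || phase[tid] == T_RUNNING;
}
```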
428
608
/* Signal events (not exhaustive)