=head1 NAME

libeio - truly asynchronous POSIX I/O

=head1 DESCRIPTION

The newest version of this document is also available as an html-formatted
web page you might find easier to navigate when reading it for the first
time: L<http://pod.tst.eu/http://cvs.schmorp.de/libeio/eio.pod>.

Note that this library is a by-product of the C<IO::AIO> perl
module, and many of the subtler points regarding request lifetime
and so on are currently only documented in that module's documentation:
L<http://pod.tst.eu/http://cvs.schmorp.de/IO-AIO/AIO.pm>.

This library provides fully asynchronous versions of most POSIX functions
dealing with I/O. Unlike most asynchronous libraries, this not only
includes C<read> and C<write>, but also C<open>, C<stat>, C<unlink> and
similar functions, as well as less commonly used ones such as C<mknod>
and C<futime>.

It also offers wrappers around C<sendfile> (Solaris, Linux, HP-UX and
FreeBSD, with emulation on other platforms) and C<readahead> (Linux, with
emulation elsewhere).

The goal is to enable you to write fully non-blocking programs. For
example, in a game server, you would not want to freeze for a few seconds
just because the server is running a backup and you happen to call
a blocking filesystem function.

=head2 TIME REPRESENTATION

Libeio represents time as a single floating point number, representing the
(fractional) number of seconds since the (POSIX) epoch (somewhere near
the beginning of 1970, details are complicated, don't ask). This type is
called C<eio_tstamp>, but it is guaranteed to be of type C<double> (or
better), so you can freely use C<double> yourself.

Unlike the name component C<stamp> might indicate, it is also used for
time differences throughout libeio.

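Because C<eio_tstamp> is just a C<double>, converting from other clock
sources is plain arithmetic. The following is a minimal sketch and not part
of libeio: the type name C<demo_tstamp> and both helper functions are made
up for illustration.

```c
#include <assert.h>
#include <sys/time.h>

/* stand-in for eio_tstamp, which the text says is (at least) a double */
typedef double demo_tstamp;

/* convert a struct timeval into fractional seconds since the epoch */
demo_tstamp
tv_to_tstamp (const struct timeval *tv)
{
  assert (tv != 0);
  return (demo_tstamp)tv->tv_sec + (demo_tstamp)tv->tv_usec / 1e6;
}

/* time differences use the same type: later minus earlier, in seconds */
demo_tstamp
tstamp_diff (demo_tstamp later, demo_tstamp earlier)
{
  return later - earlier;
}
```

A value produced this way could presumably be passed wherever an
C<eio_tstamp> is expected, for example as a delay or a file time.
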
=head2 FORK SUPPORT

Usage of pthreads in a program changes the semantics of fork
considerably. Specifically, only async-signal-safe functions can be called
after fork. Libeio uses pthreads, so this applies, and makes using fork
hard for anything but relatively simple fork + exec uses.

This library only works in the process that initialised it: Forking is
fully supported, but using libeio in any other process than the one that
called C<eio_init> is not.

You might get around this by not I<using> libeio before (or after) forking
in the parent, and using it in the child afterwards. You could also try to
call the L<eio_init> function again in the child, which will brutally
reinitialise all data structures. Note that this isn't POSIX conformant.

Otherwise, the only recommendation you should follow is: treat fork code
the same way you treat signal handlers, and only ever call C<eio_init> in
the process that uses it, and only once ever.

=head1 INITIALISATION/INTEGRATION

Before you can call any eio functions you first have to initialise the
library. The library integrates into any event loop, but can also be used
without one, including in polling mode.

You have to provide the necessary glue yourself, however.

=over 4

=item int eio_init (void (*want_poll)(void), void (*done_poll)(void))

This function initialises the library. On success it returns C<0>, on
failure it returns C<-1> and sets C<errno> appropriately.

It accepts two function pointers specifying callbacks as arguments, both of
which can be C<0>, in which case the callback isn't called.

There is currently no way to change these callbacks later, or to
"uninitialise" the library again.

=item want_poll callback

The C<want_poll> callback is invoked whenever libeio wants attention (i.e.
it wants to be polled by calling C<eio_poll>). It is "edge-triggered",
that is, it will only be called once when eio wants attention, until all
pending requests have been handled.

This callback is called while locks are being held, so I<you must
not call any libeio functions inside this callback>. That includes
C<eio_poll>. What you should do is notify some other thread, or wake up
your event loop, and then call C<eio_poll>.

=item done_poll callback

This callback is invoked when libeio detects that all pending requests
have been handled. It is "edge-triggered", that is, it will only be
called once after C<want_poll>. To put it differently, C<want_poll> and
C<done_poll> are invoked in pairs: after C<want_poll> you have to call
C<eio_poll ()> until either C<eio_poll> indicates that everything has been
handled or C<done_poll> has been called, which signals the same.

Note that C<eio_poll> might return after C<done_poll> and C<want_poll>
have been called again, so watch out for races in your code.

As with C<want_poll>, this callback is called while locks are being held,
so you I<must not call any libeio functions from within this callback>.

=item int eio_poll ()

This function has to be called whenever there are pending requests that
need finishing. You usually call this after C<want_poll> has indicated
that you should do so, but you can also call this function regularly to
poll for new results.

If any request callback returns a non-zero value, then C<eio_poll ()>
immediately returns with that value as return value.

Otherwise, if all requests could be handled, it returns C<0>. If for some
reason not all requests have been handled, i.e. some are still pending, it
returns C<-1>.

=back

For libev, you would typically use an C<ev_async> watcher: the
C<want_poll> callback would invoke C<ev_async_send> to wake up the event
loop. Inside the callback set for the watcher, one would call C<eio_poll
()>.

If C<eio_poll ()> is configured to not handle all results in one go
(i.e. it returns C<-1>) then you should start an idle watcher that calls
C<eio_poll> until it returns something C<!= -1>.

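The return value conventions of C<eio_poll> can be explored without libeio.
The sketch below is a toy model only: C<demo_poll>, C<struct demo_queue>
and the C<max_reqs> parameter are hypothetical names, and the real library
keeps its result queue internally.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*demo_cb) (void *data);

struct demo_queue
{
  demo_cb *cbs;   /* callbacks of finished requests, in order */
  void **data;    /* per-request user data */
  size_t count;   /* number of queued results */
  size_t next;    /* index of the next result to handle */
};

/* handle up to max_reqs results (0 = no limit), mimicking the documented
   eio_poll conventions: a nonzero callback result is returned at once,
   -1 means results are still pending, 0 means everything was handled */
int
demo_poll (struct demo_queue *q, size_t max_reqs)
{
  size_t handled = 0;

  assert (q != 0);

  while (q->next < q->count)
    {
      int res;

      if (max_reqs && handled >= max_reqs)
        return -1; /* limit reached, results still pending */

      res = q->cbs[q->next] (q->data[q->next]);
      ++q->next;
      ++handled;

      if (res)
        return res; /* callback asked us to stop */
    }

  return 0; /* all results handled */
}

/* two sample callbacks for experimenting with the model */
int demo_cb_ok (void *data) { (void)data; return 0; }
int demo_cb_stop (void *data) { (void)data; return 42; }
```

A caller that sees C<-1> from the model has to call it again later, which
is the role the idle watcher plays in the libev integration.
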
A full-featured connector between libeio and libev would look as follows
(if C<eio_poll> is handling all requests, it can of course be simplified a
lot by removing the idle watcher logic):

   #include <ev.h>
   #include <eio.h>

   static struct ev_loop *loop;
   static ev_idle repeat_watcher;
   static ev_async ready_watcher;

   /* idle watcher callback, only used when eio_poll */
   /* didn't handle all results in one call */
   static void
   repeat (EV_P_ ev_idle *w, int revents)
   {
     if (eio_poll () != -1)
       ev_idle_stop (EV_A_ w);
   }

   /* eio has some results, process them */
   static void
   ready (EV_P_ ev_async *w, int revents)
   {
     if (eio_poll () == -1)
       ev_idle_start (EV_A_ &repeat_watcher);
   }

   /* wake up the event loop */
   static void
   want_poll (void)
   {
     ev_async_send (loop, &ready_watcher);
   }

   /* set up the watchers and libeio; the function name is illustrative */
   void
   my_init_eio (void)
   {
     loop = EV_DEFAULT;
     ev_idle_init (&repeat_watcher, repeat);
     ev_async_init (&ready_watcher, ready);
     ev_async_start (loop, &ready_watcher);
     eio_init (want_poll, 0);
   }

For most other event loops, you would typically use a pipe - the event
loop should be told to wait for read readiness on the read end. In
C<want_poll> you would write a single byte, in C<done_poll> you would try
to read that byte, and in the callback for the read end, you would call
C<eio_poll>.

You don't have to take special care in the case C<eio_poll> doesn't handle
all requests, as the done callback will not be invoked, so the event loop
will still signal readiness for the pipe until I<all> results have been
handled.

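The pipe technique can be sketched with plain POSIX calls. This is a
minimal illustration under stated assumptions, not code from libeio: all
C<demo_*> names are hypothetical, and C<demo_wake_pending> merely stands in
for the readiness check an event loop would perform on the read end.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

static int wake_fd[2] = { -1, -1 }; /* [0] = read end, [1] = write end */

/* create the pipe; returns 0 on success, -1 on error */
int
demo_wake_init (void)
{
  return pipe (wake_fd);
}

/* would be passed as want_poll: make the read end readable */
void
demo_want_poll (void)
{
  char dummy = 0;
  ssize_t r = write (wake_fd[1], &dummy, 1);
  (void)r; /* nothing sensible can be done on failure here */
}

/* would be passed as done_poll: consume the byte again */
void
demo_done_poll (void)
{
  char dummy;
  ssize_t r = read (wake_fd[0], &dummy, 1);
  (void)r;
}

/* returns 1 if the read end is readable right now, 0 otherwise;
   an event loop would watch wake_fd[0] instead of polling like this */
int
demo_wake_pending (void)
{
  struct pollfd pfd;

  assert (wake_fd[0] >= 0);
  pfd.fd = wake_fd[0];
  pfd.events = POLLIN;
  pfd.revents = 0;

  return poll (&pfd, 1, 0) > 0 && (pfd.revents & POLLIN) ? 1 : 0;
}
```

Neither callback calls into libeio, which matches the rule that
C<want_poll> and C<done_poll> run while locks are held. With the real
library one would presumably pass both to C<eio_init> and call C<eio_poll>
from the event loop's handler for the read end.
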
=head1 HIGH LEVEL REQUEST API

Libeio has both a high-level API, which consists of calling a request
function with a callback to be called on completion, and a low-level API
where you fill out request structures and submit them.

This section describes the high-level API.

=head2 REQUEST SUBMISSION AND RESULT PROCESSING

You submit a request by calling the relevant C<eio_TYPE> function with the
required parameters, a callback of type C<int (*eio_cb)(eio_req *req)>
(called C<eio_cb> below) and a freely usable C<void *data> argument.

The return value will either be 0, in case something went really wrong
(which can basically only happen on very fatal errors, such as C<malloc>
returning 0, which is rather unlikely), or a pointer to the newly-created
and submitted C<eio_req *>.

The callback will be called with an C<eio_req *> which contains the
results of the request. The members you can access inside that structure
vary from request to request, except for:

=over 4

=item C<ssize_t result>

This contains the result value from the call (usually the same as the
syscall of the same name).

=item C<int errorno>

This contains the value of C<errno> after the call.

=item C<void *data>

The C<void *data> member simply stores the value of the C<data> argument.

=back

The return value of the callback is normally C<0>, which tells libeio to
continue normally. If a callback returns a nonzero value, libeio will
stop processing results (in C<eio_poll>) and will return the value to its
caller.

Memory areas passed to libeio must stay valid as long as a request
executes, with the exception of paths, which are being copied
internally. Any memory libeio itself allocates will be freed after the
finish callback has been called. If you want to manage all memory passed
to libeio yourself you can use the low-level API.

For example, to open a file, you could do this:

   static int
   file_open_done (eio_req *req)
   {
     if (req->result < 0)
       {
         /* open() returned -1 */
         errno = req->errorno;
         perror ("open");
       }
     else
       {
         int fd = req->result;
         /* now we have the new fd in fd */
       }

     return 0;
   }

   /* the first three arguments are passed to open(2) */
   /* the remaining are priority, callback and data */
   if (!eio_open ("/etc/passwd", O_RDONLY, 0, 0, file_open_done, 0))
     abort (); /* something went wrong, we will all die!!! */

Note that you additionally need to call C<eio_poll> when the C<want_poll>
callback indicates that requests are ready to be processed.

=head2 CANCELLING REQUESTS

Sometimes the need for a request goes away before the request is
finished. In that case, one can cancel the request by a call to
C<eio_cancel>:

=over 4

=item eio_cancel (eio_req *req)

Cancel the request (and all its subrequests). If the request is currently
executing it might still continue to execute, and in other cases it might
still take a while till the request is cancelled.

Even if cancelled, the finish callback will still be invoked - the
callbacks of all cancellable requests need to check whether the request
has been cancelled by calling C<EIO_CANCELLED (req)>:

   static int
   my_eio_cb (eio_req *req)
   {
     if (EIO_CANCELLED (req))
       return 0; /* cancelled, nothing to do */

     /* otherwise process the result as usual */
     return 0;
   }

In addition, cancelled requests will I<either> have C<< req->result >>
set to C<-1> and C<< req->errorno >> to C<ECANCELED>, or I<otherwise> they
were successfully executed, despite being cancelled (e.g. when they have
already been executed at the time they were cancelled).

C<EIO_CANCELLED> is still true for requests that have successfully
executed, as long as C<eio_cancel> was called on them at some point.

=back

=head2 AVAILABLE REQUESTS

The following request functions are available. I<All> of them return the
C<eio_req *> on success and C<0> on failure, and I<all> of them have the
same three trailing arguments: C<pri>, C<cb> and C<data>. The C<cb> is
mandatory, but in most cases, you pass in C<0> as C<pri> and C<0> or some
custom data value as C<data>.

=head3 POSIX API WRAPPERS

These requests simply wrap the POSIX call of the same name, with the same
arguments. If a function is not implemented by the OS and cannot be emulated
in some way, then all of these return C<-1> and set C<errorno> to C<ENOSYS>.

=over 4

=item eio_open (const char *path, int flags, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_truncate (const char *path, off_t offset, int pri, eio_cb cb, void *data)

=item eio_chown (const char *path, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)

=item eio_chmod (const char *path, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_mkdir (const char *path, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_rmdir (const char *path, int pri, eio_cb cb, void *data)

=item eio_unlink (const char *path, int pri, eio_cb cb, void *data)

=item eio_utime (const char *path, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)

=item eio_mknod (const char *path, mode_t mode, dev_t dev, int pri, eio_cb cb, void *data)

=item eio_link (const char *path, const char *new_path, int pri, eio_cb cb, void *data)

=item eio_symlink (const char *path, const char *new_path, int pri, eio_cb cb, void *data)

=item eio_rename (const char *path, const char *new_path, int pri, eio_cb cb, void *data)

=item eio_mlock (void *addr, size_t length, int pri, eio_cb cb, void *data)

=item eio_close (int fd, int pri, eio_cb cb, void *data)

=item eio_sync (int pri, eio_cb cb, void *data)

=item eio_fsync (int fd, int pri, eio_cb cb, void *data)

=item eio_fdatasync (int fd, int pri, eio_cb cb, void *data)

=item eio_futime (int fd, eio_tstamp atime, eio_tstamp mtime, int pri, eio_cb cb, void *data)

=item eio_ftruncate (int fd, off_t offset, int pri, eio_cb cb, void *data)

=item eio_fchmod (int fd, mode_t mode, int pri, eio_cb cb, void *data)

=item eio_fchown (int fd, uid_t uid, gid_t gid, int pri, eio_cb cb, void *data)

=item eio_dup2 (int fd, int fd2, int pri, eio_cb cb, void *data)

These have the same semantics as the syscall of the same name, their
return value is available as C<< req->result >> later.

=item eio_read (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)

=item eio_write (int fd, void *buf, size_t length, off_t offset, int pri, eio_cb cb, void *data)

These two requests are called C<read> and C<write>, but actually wrap
C<pread> and C<pwrite>. On systems that lack these calls (such as cygwin),
libeio uses lseek/read_or_write/lseek and a mutex to serialise the
requests, so all these requests run serially and do not disturb each
other. However, they still disturb the file offset while they run, so it's
not safe to call these functions concurrently with non-libeio functions on
the same fd on these systems.

Not surprisingly, pread and pwrite are not thread-safe on Darwin (OS/X),
so it is advised not to submit multiple requests on the same fd on this
horrible pile of garbage.

=item eio_mlockall (int flags, int pri, eio_cb cb, void *data)

Like C<mlockall>, but the flag value constants are called
C<EIO_MCL_CURRENT> and C<EIO_MCL_FUTURE>.

=item eio_msync (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)

Just like C<msync>, except that the flag values are called C<EIO_MS_ASYNC>,
C<EIO_MS_INVALIDATE> and C<EIO_MS_SYNC>.

=item eio_readlink (const char *path, int pri, eio_cb cb, void *data)

If successful, the path read by C<readlink(2)> can be accessed via C<<
req->ptr2 >> and is I<NOT> null-terminated, with the length specified as
C<< req->result >>.

   if (req->result >= 0)
     {
       /* make a null-terminated copy of the link target */
       char *target = strndup ((char *)req->ptr2, req->result);

       free (target);
     }

=item eio_realpath (const char *path, int pri, eio_cb cb, void *data)

Similar to the realpath libc function, but unlike that one, C<<
req->result >> is C<-1> on failure. On success, the result is the length
of the returned path in C<ptr2> (which is I<NOT> 0-terminated) - this is
similar to C<eio_readlink>.

=item eio_stat (const char *path, int pri, eio_cb cb, void *data)

=item eio_lstat (const char *path, int pri, eio_cb cb, void *data)

=item eio_fstat (int fd, int pri, eio_cb cb, void *data)

Stats a file - if C<< req->result >> indicates success, then you can
access the C<struct stat>-like structure via C<< req->ptr2 >>:

   EIO_STRUCT_STAT *statdata = (EIO_STRUCT_STAT *)req->ptr2;

=item eio_statvfs (const char *path, int pri, eio_cb cb, void *data)

=item eio_fstatvfs (int fd, int pri, eio_cb cb, void *data)

Stats a filesystem - if C<< req->result >> indicates success, then you can
access the C<struct statvfs>-like structure via C<< req->ptr2 >>:

   EIO_STRUCT_STATVFS *statdata = (EIO_STRUCT_STATVFS *)req->ptr2;

=back

=head3 READING DIRECTORIES

Reading directories sounds simple, but can be rather demanding, especially
if you want to do stuff such as traversing a directory hierarchy or
processing all files in a directory. Libeio can assist these complex tasks
with its C<eio_readdir> call.

=over 4

=item eio_readdir (const char *path, int flags, int pri, eio_cb cb, void *data)

This is a very complex call. It basically reads through a whole directory
(via the C<opendir>, C<readdir> and C<closedir> calls) and returns either
the names or an array of C<struct eio_dirent>, depending on the C<flags>
argument.

The C<< req->result >> indicates either the number of files found, or
C<-1> on error. On success, null-terminated names can be found as C<<
req->ptr2 >>, and C<struct eio_dirent>s, if requested by C<flags>, can be
found via C<< req->ptr1 >>.

Here is an example that prints all the names:

   int i;
   char *names = (char *)req->ptr2;

   for (i = 0; i < req->result; ++i)
     {
       printf ("name #%d: %s\n", i, names);

       /* move to next name */
       names += strlen (names) + 1;
     }

Pseudo-entries such as F<.> and F<..> are never returned by C<eio_readdir>.

C<flags> can be any combination of:

=over 4

=item EIO_READDIR_DENTS

If this flag is specified, then, in addition to the names in C<ptr2>,
also an array of C<struct eio_dirent> is returned, in C<ptr1>. A C<struct
eio_dirent> looks like this:

   struct eio_dirent
   {
     int nameofs; /* offset of null-terminated name string in (char *)req->ptr2 */
     unsigned short namelen; /* size of filename without trailing 0 */
     unsigned char type; /* one of EIO_DT_* */
     signed char score; /* internal use */
     ino_t inode; /* the inode number, if available, otherwise unspecified */
   };

The only members you normally would access are C<nameofs>, which is the
byte-offset from C<ptr2> to the start of the name, C<namelen> and C<type>.

C<type> can be one of:

C<EIO_DT_UNKNOWN> - if the type is not known (very common) and you have to C<stat>
the name yourself if you need to know,
one of the "standard" POSIX file types (C<EIO_DT_REG>, C<EIO_DT_DIR>, C<EIO_DT_LNK>,
C<EIO_DT_FIFO>, C<EIO_DT_SOCK>, C<EIO_DT_CHR>, C<EIO_DT_BLK>)
or some OS-specific type (currently including
C<EIO_DT_MPC> - multiplexed char device (v7+coherent),
C<EIO_DT_NAM> - xenix special named file,
C<EIO_DT_MPB> - multiplexed block device (v7+coherent),
C<EIO_DT_NWK> - HP-UX network special,
C<EIO_DT_CMP> - VxFS compressed and
C<EIO_DT_DOOR> - solaris door).

This example prints all names and their type:

   int i;
   struct eio_dirent *ents = (struct eio_dirent *)req->ptr1;
   char *names = (char *)req->ptr2;

   for (i = 0; i < req->result; ++i)
     {
       struct eio_dirent *ent = ents + i;
       char *name = names + ent->nameofs;

       printf ("name #%d: %s (type %d)\n", i, name, ent->type);
     }

=item EIO_READDIR_DIRS_FIRST

When this flag is specified, then the names will be returned in an order
where likely directories come first, in optimal C<stat> order. This is
useful when you need to quickly find directories, or you want to find all
directories while avoiding a stat() of each entry.

If the system returns type information in readdir, then this is used
to find directories directly. Otherwise, likely directories are names
beginning with ".", or otherwise names with no dots, of which the
shorter names are tried first.

=item EIO_READDIR_STAT_ORDER

When this flag is specified, then the names will be returned in an order
suitable for stat()'ing each one. That is, when you plan to stat()
all files in the given directory, then the returned order is likely to
make this faster.

If both this flag and C<EIO_READDIR_DIRS_FIRST> are specified, then the
likely directories come first, resulting in a less optimal stat order.

=item EIO_READDIR_FOUND_UNKNOWN

This flag should not be specified when calling C<eio_readdir>. Instead,
it is being set by C<eio_readdir> (you can access the C<flags> via C<<
req->int1 >>) when any of the C<type>'s found were C<EIO_DT_UNKNOWN>. The
absence of this flag therefore indicates that all C<type>'s are known,
which can be used to speed up some algorithms.

A typical use case would be to identify all subdirectories within a
directory - you would ask C<eio_readdir> for C<EIO_READDIR_DIRS_FIRST>. If
then this flag is I<NOT> set, then all the entries at the beginning of the
returned array of type C<EIO_DT_DIR> are the directories. Otherwise, you
should start C<stat()>'ing the entries starting at the beginning of the
array, stopping as soon as you found all directories (the count can be
deduced by the link count of the directory).

=back

=back

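The packed name buffer and the "directories first" use case can be tried
out without libeio. The sketch below is illustrative only: C<struct
demo_dirent> is a reduced stand-in for C<struct eio_dirent>, the
C<DEMO_DT_*> constants are placeholders with arbitrary values, and both
helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* placeholders standing in for EIO_DT_UNKNOWN, EIO_DT_DIR and EIO_DT_REG;
   the numeric values are arbitrary and not taken from eio.h */
enum { DEMO_DT_UNKNOWN = 0, DEMO_DT_DIR = 1, DEMO_DT_REG = 2 };

/* reduced stand-in for struct eio_dirent */
struct demo_dirent
{
  int nameofs;            /* offset of the name inside the names buffer */
  unsigned short namelen; /* length of the name without trailing 0 */
  unsigned char type;     /* one of DEMO_DT_* */
};

/* walk a packed buffer of count null-terminated names (the ptr2 layout)
   and return the name with the given index, or 0 if out of range */
const char *
demo_nth_name (const char *names, size_t count, size_t index)
{
  size_t i;

  assert (names != 0);

  if (index >= count)
    return 0;

  for (i = 0; i < index; ++i)
    names += strlen (names) + 1; /* move to next name */

  return names;
}

/* with directories sorted first and no unknown types, the directories are
   exactly the leading run of DEMO_DT_DIR entries; count that run */
size_t
demo_count_leading_dirs (const struct demo_dirent *ents, size_t count)
{
  size_t n = 0;

  while (n < count && ents[n].type == DEMO_DT_DIR)
    ++n;

  return n;
}
```

When entries with an unknown type are present, the leading run is only a
lower bound and the remaining entries would still need a C<stat>, as
described for C<EIO_READDIR_FOUND_UNKNOWN>.
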
=head3 OS-SPECIFIC CALL WRAPPERS

These wrap OS-specific calls (usually Linux ones), and might or might not
be emulated on other operating systems. Calls that are not emulated will
return C<-1> and set C<errno> to C<ENOSYS>.

=over 4

=item eio_sendfile (int out_fd, int in_fd, off_t in_offset, size_t length, int pri, eio_cb cb, void *data)

Wraps the C<sendfile> syscall. The arguments follow the Linux version, but
libeio supports and will use similar calls on other systems such as
FreeBSD, HP/UX and Solaris.

If the OS doesn't support some sendfile-like call, or the call fails in a
way that indicates missing support for the given file descriptor type (for
example, Linux's sendfile might not support file to file copies), then
libeio will emulate the call in userspace, so there are almost no
limitations on its use.

=item eio_readahead (int fd, off_t offset, size_t length, int pri, eio_cb cb, void *data)

Calls C<readahead(2)>. If the syscall is missing, then the call is
emulated by simply reading the data (currently in 64kiB chunks).

=item eio_syncfs (int fd, int pri, eio_cb cb, void *data)

Calls Linux' C<syncfs> syscall, if available. If the call is missing, it
returns C<-1> and sets C<errno> to C<ENOSYS>, I<but still calls sync()>
if the C<fd> is C<< >= 0 >>, so you can probe for the availability of the
syscall by passing a negative C<fd> argument and checking for C<-1/ENOSYS>.

=item eio_sync_file_range (int fd, off_t offset, size_t nbytes, unsigned int flags, int pri, eio_cb cb, void *data)

Calls C<sync_file_range>. If the syscall is missing, then this is the same
as calling C<fdatasync>.

Flags can be any combination of C<EIO_SYNC_FILE_RANGE_WAIT_BEFORE>,
C<EIO_SYNC_FILE_RANGE_WRITE> and C<EIO_SYNC_FILE_RANGE_WAIT_AFTER>.

=item eio_fallocate (int fd, int mode, off_t offset, off_t len, int pri, eio_cb cb, void *data)

Calls C<fallocate> (note: I<NOT> C<posix_fallocate>!). If the syscall is
missing, then it returns failure and sets C<errno> to C<ENOSYS>.

The C<mode> argument can be C<0> (for behaviour similar to
C<posix_fallocate>), or C<EIO_FALLOC_FL_KEEP_SIZE>, which keeps the size
of the file unchanged (but still preallocates space beyond end of file).

=back

=head3 LIBEIO-SPECIFIC REQUESTS

These requests are specific to libeio and do not correspond to any OS call.

=over 4

=item eio_mtouch (void *addr, size_t length, int flags, int pri, eio_cb cb, void *data)

Reads (C<flags == 0>) or modifies (C<flags == EIO_MT_MODIFY>) the given
memory area, page-wise, that is, it reads (or reads and writes back) the
first octet of every page that spans the memory area.

This can be used to page in some mmapped file, or dirty some pages. Note
that dirtying is an unlocked read-write access, so races can ensue when
some other thread modifies the data stored in that memory area.

=item eio_custom (void (*execute)(eio_req *), int pri, eio_cb cb, void *data)

Executes a custom request, i.e., a user-specified callback.

The callback gets the C<eio_req *> as parameter and is expected to read
and modify any request-specific members. Specifically, it should set C<<
req->result >> to the result value, just like other requests.

Here is an example that simply calls C<open>, like C<eio_open>, but it
uses the C<data> member as filename and uses a hardcoded C<O_RDONLY>. If
you want to pass more/other parameters, you either need to pass some
struct or so via C<data> or provide your own wrapper using the low-level
API.

   /* finish callback, invoked from eio_poll */
   static int
   my_open_done (eio_req *req)
   {
     int fd = req->result;

     return 0;
   }

   /* execute callback, runs in a worker thread */
   static void
   my_open (eio_req *req)
   {
     req->result = open (req->data, O_RDONLY);
   }

   eio_custom (my_open, 0, my_open_done, "/etc/passwd");

=item eio_busy (eio_tstamp delay, int pri, eio_cb cb, void *data)

This is a request that takes C<delay> seconds to execute, but otherwise
does nothing - it simply puts one of the worker threads to sleep for this
long.

This request can be used to artificially increase load, e.g. for debugging
or benchmarking reasons.

=item eio_nop (int pri, eio_cb cb, void *data)

This request does nothing, except go through the whole request cycle. This
can be used to measure latency or in some cases to simplify code, but is
not really of much use.

=back

=head3 GROUPING AND LIMITING REQUESTS

There is one more rather special request, C<eio_grp>. It is a very special
aio request: Instead of doing something, it is a container for other eio
requests.

There are two primary use cases for this: a) bundle many requests into a
single, composite, request with a definite callback and the ability to
cancel the whole request with its subrequests and b) limiting the number
of "active" requests.

Further below you will find more discussion of these topics - first
follows the reference section detailing the request generator and the
related functions.

=over 4

=item eio_req *grp = eio_grp (eio_cb cb, void *data)

Creates, submits and returns a group request. Note that it doesn't have a
priority, unlike all other requests.

=item eio_grp_add (eio_req *grp, eio_req *req)

Adds a request to the request group.

=item eio_grp_cancel (eio_req *grp)

Cancels all requests I<in> the group, but I<not> the group request
itself. You can cancel the group request I<and> all subrequests via a
normal C<eio_cancel> call.

=back

=head4 GROUP REQUEST LIFETIME

Left alone, a group request will instantly move to the pending state and
will be finished at the next call of C<eio_poll>.

The usefulness stems from the fact that, if a subrequest is added to a
group I<before> a call to C<eio_poll>, via C<eio_grp_add>, then the group
will not finish until all the subrequests have finished.

So the usage cycle of a group request is like this: after it is created,
you normally instantly add a subrequest. If none is added, the group
request will finish on its own. As long as subrequests are added before
the group request is finished it will be kept from finishing, that is the
callbacks of any subrequests can, in turn, add more requests to the group,
and as long as any requests are active, the group request itself will not
finish.

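The lifetime rule can be pictured as simple bookkeeping. The following is
a toy model under stated assumptions, not how libeio implements groups:
C<struct demo_grp> and all C<demo_grp_*> functions are hypothetical and
only mirror the behaviour described in this section.

```c
#include <assert.h>

/* toy stand-in for a group request; real groups are managed by libeio,
   this only models the rule "finish once polled and nothing is active" */
struct demo_grp
{
  int active;   /* subrequests added but not yet finished */
  int polled;   /* nonzero once the poll loop has looked at the group */
  int finished; /* nonzero once the group callback would have run */
};

/* finish the group if it was polled and nothing is outstanding */
static void
demo_grp_check (struct demo_grp *grp)
{
  if (grp->polled && !grp->active)
    grp->finished = 1;
}

/* models eio_grp_add: only valid while the group has not finished */
void
demo_grp_add (struct demo_grp *grp)
{
  assert (!grp->finished);
  ++grp->active;
}

/* models a subrequest finishing, after its callback had the chance to
   add further subrequests */
void
demo_grp_sub_done (struct demo_grp *grp)
{
  assert (grp->active > 0);
  --grp->active;
  demo_grp_check (grp);
}

/* models eio_poll reaching the group request */
void
demo_grp_poll (struct demo_grp *grp)
{
  grp->polled = 1;
  demo_grp_check (grp);
}
```

In this model a group without subrequests finishes on the first poll,
while a group whose subrequests keep adding new ones stays alive until the
last of them is done.
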
=head4 CREATING COMPOSITE REQUESTS

Imagine you wanted to create an C<eio_load> request that opens a file,
reads it and closes it. This means it has to execute at least three eio
requests, but for various reasons it might be nice if that request looked
like any other eio request.

This can be done with groups:

=over 4

=item 1) create the request object

Create a group that contains all further requests. This is the request you
can return as "the load request".

=item 2) open the file, maybe

Next, open the file with C<eio_open> and add the request to the group
request and you are finished setting up the request.

If, for some reason, you cannot C<eio_open> (path is a null ptr?) you
can set C<< grp->result >> to C<-1> to signal an error and let the group
request finish on its own.

=item 3) open callback adds more requests

In the open callback, if the open was not successful, copy C<<
req->errorno >> to C<< grp->errorno >> and set C<< grp->result >> to
C<-1> to signal an error.

Otherwise, malloc some memory or so and issue a read request, adding the
read request to the group.

=item 4) continue issuing requests till finished

In the read callback, check for errors and possibly continue with
C<eio_close> or any other eio request in the same way.

As soon as no new requests are added the group request will finish. Make
sure you I<always> set C<< grp->result >> to some sensible value.

=back

=head4 REQUEST LIMITING

   void eio_grp_limit (eio_req *grp, int limit);

=head1 LOW LEVEL REQUEST API

=head1 ANATOMY AND LIFETIME OF AN EIO REQUEST

A request is represented by a structure of type C<eio_req>. To initialise
it, clear it to all zero bytes:

   eio_req req;

   memset (&req, 0, sizeof (req));

A more common way to initialise a new C<eio_req> is to use C<calloc>:

   eio_req *req = calloc (1, sizeof (*req));

In either case, libeio neither allocates, initialises nor frees the
C<eio_req> structure for you - it merely uses it.

=head1 CONFIGURATION

The functions in this section can sometimes be useful, but the default
configuration will do in most cases, so you can skip this section at
first.

=over 4

=item eio_set_max_poll_time (eio_tstamp nseconds)

This causes C<eio_poll ()> to return after it has detected that it was
running for C<nseconds> seconds or longer (this number can be fractional).

This can be used to limit the amount of time spent handling eio requests,
for example, in interactive programs, you might want to limit this time to
C<0.01> seconds or so.

Note that:

=over 4

=item a) libeio doesn't know how long your request callbacks take, so the
time spent in C<eio_poll> is up to one callback invocation longer than
the limit.

=item b) this is implemented by calling C<gettimeofday> after each
request, which can be costly.

=item c) at least one request will be handled.

=back

=item eio_set_max_poll_reqs (unsigned int nreqs)

When C<nreqs> is non-zero, then C<eio_poll> will not handle more than
C<nreqs> requests per invocation. This is a less costly way to limit the
amount of work done by C<eio_poll> than setting a time limit.

If you know your callbacks are generally fast, you could use this to
encourage interactiveness in your programs by setting it to, say, C<10>
or C<100>.

=item eio_set_min_parallel (unsigned int nthreads)

Make sure libeio can handle at least this many requests in parallel. It
might be able to handle more.

=item eio_set_max_parallel (unsigned int nthreads)

Set the maximum number of threads that libeio will spawn.

=item eio_set_max_idle (unsigned int nthreads)

Libeio uses threads internally to handle most requests, and will start and
stop threads on demand.

This call can be used to limit the number of idle threads (threads without
work to do): libeio will keep some threads idle in preparation for more
requests, but never more than C<nthreads> threads.

In addition to this, libeio will also stop threads when they are idle for
a few seconds, regardless of this setting.

=item unsigned int eio_nthreads ()

Return the number of worker threads currently running.

=item unsigned int eio_nreqs ()

Return the number of requests currently handled by libeio. This is the
total number of requests that have been submitted to libeio, but not yet
finished.

=item unsigned int eio_nready ()

Returns the number of ready requests, i.e. requests that have been
submitted but have not yet entered the execution phase.

=item unsigned int eio_npending ()

Returns the number of pending requests, i.e. requests that have been
executed and have results, but have not been finished yet by a call to
C<eio_poll>.

=back

=head1 EMBEDDING

Libeio can be embedded directly into programs. This functionality is not
documented and not (yet) officially supported.

Note that, when including C<libeio.m4>, you are responsible for defining
the compilation environment (C<_LARGEFILE_SOURCE>, C<_GNU_SOURCE> etc.).

If you need to know how, check the C<IO::AIO> perl module, which does
this.

=head1 COMPILETIME CONFIGURATION

These symbols, if used, must be defined when compiling F<eio.c>.

This symbol governs the stack size for each eio thread. Libeio itself
was written to use very little stackspace, but when using C<EIO_CUSTOM>
requests, you might want to increase this.

If this symbol is undefined (the default) then libeio will use its default
stack size (C<sizeof (void *) * 4096> currently). If it is defined, but
C<0>, then the default operating system stack size will be used. In all
other cases, the value must be an expression that evaluates to the desired
stack size.

=head1 PORTABILITY REQUIREMENTS

In addition to a working ISO-C implementation, libeio relies on a few
additional extensions:

=over 4

=item POSIX threads

To be portable, this module uses threads, specifically, the POSIX threads
library must be available (and working, which partially excludes many xBSD
systems, where C<fork ()> is buggy).

=item POSIX-compatible filesystem API

This is actually a harder portability requirement: The libeio API is quite
demanding regarding POSIX API calls (symlinks, user/group management and
so on).

=item C<double> must hold a time value in seconds with enough accuracy

The type C<double> is used to represent timestamps. It is required to
have at least 51 bits of mantissa (and 9 bits of exponent), which is good
enough until at least the year 4000. This requirement is fulfilled by
implementations implementing IEEE 754 (basically all existing ones).

=back

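Whether a given platform meets the C<double> requirement can be checked
with standard C only. This is a small sketch with hypothetical helper names
(C<demo_double_ok>, C<demo_can_resolve>). The timestamp used in the usage
note is an approximation of "the year 4000", computed here as 2030 average
Gregorian years of 31556952 seconds each, and is not a value taken from
libeio.

```c
#include <assert.h>
#include <float.h>

/* nonzero if this platform's double has the mantissa width the text asks for */
int
demo_double_ok (void)
{
  return FLT_RADIX == 2 && DBL_MANT_DIG >= 51;
}

/* nonzero if adding step seconds to timestamp t yields a different value,
   i.e. if a double can still resolve that step at time t */
int
demo_can_resolve (double t, double step)
{
  volatile double sum = t + step; /* volatile: force rounding to double */

  assert (step > 0.);
  return sum != t;
}
```

For example, C<demo_can_resolve (64060612560., 1e-5)> asks whether steps of
ten microseconds are still distinguishable near the year 4000. With an IEEE
754 C<double> they are, while single microseconds no longer are.
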
If you know of other additional requirements drop me a note.

=head1 AUTHOR

Marc Lehmann <libeio@schmorp.de>.