\input texinfo @c -*-texinfo-*-
@comment %**start of header (This is for running Texinfo on a region.)
@settitle Inter Process Communication
@comment %**end of header (This is for running Texinfo on a region.)

@ifinfo
This file documents the System V style inter process communication
primitives available under Linux.

Copyright @copyright{} 1992 krishna balasubramanian

Permission is granted to use this material and the accompanying
programs within the terms of the GNU GPL.
@end ifinfo

@titlepage
@center @titlefont{System V Inter Process Communication}
@center krishna balasubramanian,

@comment The following two commands start the copyright page.
@page
@vskip 0pt plus 1filll
Copyright @copyright{} 1992 krishna balasubramanian

Permission is granted to use this material and the accompanying
programs within the terms of the GNU GPL.
@end titlepage

@dircategory Miscellaneous
@direntry
* ipc: (ipc).           System V style inter process communication
@end direntry

@node Top, Overview, Notes, (dir)
@chapter System V IPC.

These facilities are provided to maintain compatibility with
programs developed on System V Unix systems and others
that rely on these System V mechanisms to accomplish inter
process communication (IPC).@refill

The specifics described here are applicable to the Linux implementation.
Other implementations may do things slightly differently.

@menu
* Overview::            What is System V IPC? Overall mechanisms.
* Messages::            System calls for message passing.
* Semaphores::          System calls for semaphores.
* Shared Memory::       System calls for shared memory access.
* Notes::               Miscellaneous notes.
@end menu

@node Overview, example, Top, Top

@noindent System V IPC consists of three mechanisms:

@itemize @bullet
@item
Messages : exchange messages with any process or server.
@item
Semaphores : allow unrelated processes to synchronize execution.
@item
Shared memory : allow unrelated processes to share memory.
@end itemize

@menu
* example::             Using shared memory.
* perms::               Description of access permissions.
* syscalls::            Overview of ipc system calls.
@end menu

Access to all resources is permitted on the basis of permissions
set up when the resource was created.@refill

A resource here is a message queue, a semaphore set (array)
or a shared memory segment.@refill

A resource must first be allocated by a creator before it is used.
The creator can assign a different owner. After use the resource
must be explicitly destroyed by the creator or owner.@refill

A resource is identified by a numeric @var{id}. Typically a creator
defines a @var{key} that may be used to access the resource. The user
process may then use this @var{key} in the @dfn{get} system call to obtain
the @var{id} for the corresponding resource. This @var{id} is then used for
all further access. A library call @code{ftok} is provided to translate
the pathname of an existing file, together with a project number,
into a numeric key.@refill
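
As a hedged illustration of the key derivation described above, the
sketch below wraps @code{ftok}. The helper names @code{project_key} and
@code{key_is_stable} are hypothetical and exist only for this example.
Note that @code{ftok} examines an existing file and does not expand shell
shorthand such as a leading tilde.

```c
#include <sys/types.h>
#include <sys/ipc.h>

/* Hypothetical helper: derive a key from an existing file and a project
 * number.  Returns (key_t) -1 if the file cannot be examined. */
key_t project_key(const char *path, int proj_id)
{
    return ftok(path, proj_id);
}

/* Returns 1 if two independent calls with the same arguments yield the
 * same valid key, which is what lets unrelated processes meet. */
int key_is_stable(const char *path, int proj_id)
{
    key_t a = project_key(path, proj_id);
    key_t b = project_key(path, proj_id);
    return a != (key_t) -1 && a == b;
}
```

A creator and its users would each call @code{project_key} with the same
path and project number and pass the result to the relevant @dfn{get}
call.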

There are system and implementation defined limits on the number and
sizes of resources of any given type. Some of these are imposed by the
implementation and others by the system administrator
when configuring the kernel (@pxref{msglimits}, @pxref{semlimits},
@pxref{shmlimits}).@refill

There is an @code{msqid_ds}, @code{semid_ds} or @code{shmid_ds} struct
associated with each message queue, semaphore array or shared segment.
Each IPC resource has an associated @code{ipc_perm} struct which defines
the creator, owner and access permissions for the resource.
These structures are detailed in the following sections.@refill

@node example, perms, Overview, Overview

Here is a code fragment with pointers on how to use shared memory. The
same methods are applicable to other resources.@refill

In a typical access sequence the creator allocates a new instance
of the resource with the @code{get} system call using the IPC_CREAT
flag.@refill

@noindent Creator process:@*

@example
key_t key;
int id;
int size = 0x5000;                /* 20 K */
int flags = 0664 | IPC_CREAT;     /* read-only for others */

/* "~creator/ipckey" stands for the path of an existing file;
   ftok itself does not expand "~". */
key = ftok ("~creator/ipckey", proc_id);
id = shmget (key, size, flags);
exit (0);                         /* quit leaving resource allocated */
@end example

Users then gain access to the resource using the same key.@*

@example
key_t key;
int id;
char *shmaddr;
char local_var;

key = ftok ("~creator/ipckey", proc_id);

id = shmget (key, 0, 004);             /* default size */
if (id == -1)
    perror ("shmget ...");

shmaddr = shmat (id, 0, SHM_RDONLY);   /* attach segment for reading */
if (shmaddr == (char *) -1)
    perror ("shmat ...");

local_var = *(shmaddr + 3);            /* read segment etc. */

shmdt (shmaddr);                       /* detach segment */
@end example

When the resource is no longer needed the creator should remove it.@*

@noindent Creator/owner process 2:

@example
key = ftok ("~creator/ipckey", proc_id);
id = shmget (key, 0, 0);
shmctl (id, IPC_RMID, NULL);
@end example
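
The create, attach, use, detach and remove sequence shown in the fragments
above can be condensed into one self-contained sketch. It is not the
author's program: it uses IPC_PRIVATE instead of an @code{ftok} key so
that it needs no key file, and the function name @code{shm_round_trip} is
hypothetical.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a private 20 K segment, store a string in it, read the string
 * back, then detach and remove the segment.
 * Returns 0 on success, -1 on any failure. */
int shm_round_trip(void)
{
    const char text[] = "hello";
    int ok = -1;

    int id = shmget(IPC_PRIVATE, 0x5000, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    char *addr = shmat(id, NULL, 0);        /* read-write attach */
    if (addr != (char *) -1) {
        memcpy(addr, text, sizeof text);    /* write into the segment */
        if (memcmp(addr, text, sizeof text) == 0)
            ok = 0;
        if (shmdt(addr) == -1)              /* detach */
            ok = -1;
    }

    /* Without this call the segment would outlive the process. */
    if (shmctl(id, IPC_RMID, NULL) == -1)
        ok = -1;
    return ok;
}
```

Calling @code{shm_round_trip()} from @code{main} and checking for a zero
return is enough to exercise it.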

@node perms, syscalls, example, Overview

Each resource has an associated @code{ipc_perm} struct which defines the
creator, owner and access permissions for the resource.@refill

@example
struct ipc_perm @{
    key_t  key;     /* set by creator */
    ushort uid;     /* owner euid and egid */
    ushort gid;
    ushort cuid;    /* creator euid and egid */
    ushort cgid;
    ushort mode;    /* access modes in lower 9 bits */
    ushort seq;     /* sequence number */
@};
@end example

The creating process is the default owner. The owner can be reassigned
by the creator and has the same privileges as the creator. Only the
owner, creator or super-user can delete the resource.@refill

The lowest nine bits of the flags parameter supplied by the user to the
system call are compared with the values stored in @code{ipc_perm.mode}
to determine if the requested access is allowed. In the case
that the system call creates the resource, these bits are initialized
from the user supplied value.@refill

As for files, access permissions are specified as read, write and exec
for user, group or other (though the exec perms are unused). For example
0624 grants read-write to owner, write-only to group and read-only
access to others.@refill

For shared memory, note that read-write access for segments is determined
by a separate flag which is not stored in the @code{mode} field.
Shared memory segments attached with write access can be read.@refill

The @code{cuid}, @code{cgid}, @code{key} and @code{seq} fields
cannot be changed by the user.@refill
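
The comparison of the requested bits with @code{mode} can be modelled in
a few lines. The following is a simplified sketch, not the kernel's code:
the names @code{ipc_access_ok} and @code{enum caller} are hypothetical,
the super-user case is ignored, and it assumes that the request bits are
folded across the three classes before being checked against the class
the caller belongs to.

```c
/* Which permission class the caller falls into (hypothetical names). */
enum caller { CALLER_OWNER, CALLER_GROUP, CALLER_OTHER };

/* Returns 1 if every read/write bit requested in flags is granted by
 * mode for the given class of caller, 0 otherwise. */
int ipc_access_ok(unsigned mode, unsigned flags, enum caller who)
{
    /* Fold the owner, group and other request bits into one triplet. */
    unsigned requested = ((flags >> 6) | (flags >> 3) | flags) & 07;
    unsigned granted = mode;

    if (who == CALLER_OWNER)
        granted >>= 6;
    else if (who == CALLER_GROUP)
        granted >>= 3;
    granted &= 07;

    return (requested & ~granted) == 0;
}
```

With the mode 0624 used as the example above, a group member asking for
write access (0020) is accepted while one asking for read access (0040)
is refused.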

@node syscalls, Messages, perms, Overview
@section IPC system calls

This section provides an overview of the IPC system calls. See the
specific sections on each type of resource for details.@refill

Each type of mechanism provides a @dfn{get}, @dfn{ctl} and one or more
@dfn{op} system calls that allow the user to create or procure the
resource (get), define its behaviour or destroy it (ctl) and manipulate
the resources (op).@refill

@subsection The @dfn{get} system calls

The @code{get} call typically takes a @var{key} and returns a numeric
@var{id} that is used for further access.
The @var{id} is an index into the resource table. A sequence
number is maintained and incremented when a resource is
destroyed so that access using an obsolete @var{id} is likely to fail.@refill

The user also specifies the permissions and other behaviour
characteristics for the current access. The flags are or-ed with the
permissions when invoking system calls as in:@refill

@example
msgflg = IPC_CREAT | IPC_EXCL | 0666;
id = msgget (key, msgflg);
@end example

@itemize @bullet
@item
@code{key} : IPC_PRIVATE => new instance of resource is initialized.
@item
IPC_CREAT : resource created for @var{key} if it does not exist.
@item
IPC_CREAT | IPC_EXCL : fail if resource exists for @var{key}.
@item
returns : an identifier used for all further access to the resource.
@end itemize

Note that IPC_PRIVATE is not a flag but a special @code{key}
that ensures (when the call is successful) that a new resource is
created.@refill

Use of IPC_PRIVATE does not make the resource inaccessible to other
users. For this you must set the access permissions appropriately.@refill

There is currently no way for a process to ensure exclusive access to a
resource. IPC_CREAT | IPC_EXCL only ensures (on success) that a new
resource was initialized. It does not imply exclusive access.@refill
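
One common use of IPC_CREAT | IPC_EXCL is to find out whether the caller
created the resource or merely procured an existing one. The sketch below
shows that pattern for message queues. The function name
@code{queue_create_or_open} is hypothetical, and the short window between
the two @code{msgget} calls (in which another process could remove the
queue) is deliberately left unhandled.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Returns the queue id for key, or -1 on failure.  *created is set to 1
 * if this call allocated the queue and to 0 if it already existed. */
int queue_create_or_open(key_t key, int perms, int *created)
{
    int id = msgget(key, IPC_CREAT | IPC_EXCL | perms);
    if (id != -1) {
        *created = 1;               /* we initialized a new queue */
        return id;
    }
    if (errno != EEXIST)
        return -1;                  /* a real error, not "already there" */

    *created = 0;
    return msgget(key, perms);      /* procure the existing queue */
}
```

As the text above points out, a value of 1 in @code{*created} says only
that the queue is new, not that the caller has it to itself.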

See also: @ref{msgget}, @ref{semget}, @ref{shmget}.@refill

@subsection The @dfn{ctl} system calls

Provides or alters the information stored in the structure that describes
the resource indexed by @var{id}.@refill

@example
/* buf is a struct msqid_ds allocated by the caller */
err = msgctl (id, IPC_STAT, &buf);
if (err == 0)
    printf ("creator uid = %d\n", buf.msg_perm.cuid);
@end example

Commands supported by all @code{ctl} calls:

@itemize @bullet
@item
IPC_STAT : read info on resource specified by id into user allocated
buffer. The user must have read access to the resource.@refill
@item
IPC_SET : write info from buffer into resource data structure. The
user must be owner, creator or super-user.@refill
@item
IPC_RMID : remove resource. The user must be the owner, creator or
super-user.@refill
@end itemize

The IPC_RMID command results in immediate removal of a message
queue or semaphore array. Shared memory segments, however, are
only destroyed upon the last detach after IPC_RMID is executed.@refill

The @code{semctl} call provides a number of command options that allow
the user to determine or set the values of the semaphores in an array.@refill

See also: @ref{msgctl}, @ref{semctl}, @ref{shmctl}.@refill

@subsection The @dfn{op} system calls

Used to send or receive messages, read or alter semaphore values,
attach or detach shared memory segments.
The IPC_NOWAIT flag will cause the operation to fail with error EAGAIN
(ENOMSG in the case of @code{msgrcv}) if the process has to wait on the
call.@refill

@code{flags} : IPC_NOWAIT => return with error if a wait is required.

See also: @ref{msgsnd}, @ref{msgrcv}, @ref{semop}, @ref{shmat},
@ref{shmdt}.@refill

@node Messages, msgget, syscalls, Top

A message resource is described by a struct @code{msqid_ds} which is
allocated and initialized when the resource is created. Some fields
in @code{msqid_ds} can then be altered (if desired) by invoking @code{msgctl}.
The memory used by the resource is released when it is destroyed by
a @code{msgctl} call.@refill

@example
struct msqid_ds @{
    struct ipc_perm msg_perm;
    struct msg *msg_first;      /* first message on queue (internal) */
    struct msg *msg_last;       /* last message in queue (internal) */
    time_t msg_stime;           /* last msgsnd time */
    time_t msg_rtime;           /* last msgrcv time */
    time_t msg_ctime;           /* last change time */
    struct wait_queue *wwait;   /* writers waiting (internal) */
    struct wait_queue *rwait;   /* readers waiting (internal) */
    ushort msg_cbytes;          /* number of bytes used on queue */
    ushort msg_qnum;            /* number of messages in queue */
    ushort msg_qbytes;          /* max number of bytes on queue */
    ushort msg_lspid;           /* pid of last msgsnd */
    ushort msg_lrpid;           /* pid of last msgrcv */
@};
@end example

To send or receive a message the user allocates a structure that looks
like a @code{msgbuf} but with an array @code{mtext} of the required size.
Messages have a type (positive integer) associated with them so that
(for example) a listener can choose to receive only messages of a
given type.@refill

@example
struct msgbuf @{
    long mtype;      /* type of message; see msgrcv */
    char mtext[1];   /* message text, stored inline after mtype */
@};
@end example

@code{mtext} is an array rather than a pointer because the text is copied
from the bytes that directly follow @code{mtype} in the user's buffer.@refill

The user must have write permissions to send and read permissions
to receive messages on a queue.@refill

When @code{msgsnd} is invoked, the user's message is copied into
an internal struct @code{msg} and added to the queue. A @code{msgrcv}
will then read this message and free the associated struct @code{msg}.@refill
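
The following sketch shows one way to declare a message structure with a
text array of the required size and to pass a message through a queue.
The struct @code{note}, the size @code{NOTE_TEXT_MAX} and the function
@code{note_round_trip} are hypothetical names chosen for this example. It
treats a zero return from @code{msgsnd} as success, which is what current
systems return.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define NOTE_TEXT_MAX 64            /* hypothetical payload size */

struct note {                       /* laid out like msgbuf */
    long mtype;                     /* must be positive */
    char mtext[NOTE_TEXT_MAX];      /* text is stored inline */
};

/* Send text (including its terminating NUL) as a message of the given
 * type on a private queue and receive it back into out.
 * Returns the number of text bytes received, or -1 on failure. */
int note_round_trip(long type, const char *text, char *out, size_t outsz)
{
    struct note snd, rcv;
    size_t len = strlen(text) + 1;
    int n = -1;

    if (type < 1 || len > NOTE_TEXT_MAX || len > outsz)
        return -1;

    int id = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    snd.mtype = type;
    memcpy(snd.mtext, text, len);
    /* The size argument counts only mtext, not the leading long. */
    if (msgsnd(id, &snd, len, IPC_NOWAIT) == 0) {
        n = (int) msgrcv(id, &rcv, sizeof rcv.mtext, type, IPC_NOWAIT);
        if (n > 0)
            memcpy(out, rcv.mtext, (size_t) n);
    }
    msgctl(id, IPC_RMID, NULL);     /* always remove the queue */
    return n;
}
```

Sending the string "ping" this way transfers five bytes, the four
characters plus the terminating NUL.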

@menu
* msglimits::           Implementation defined limits.
@end menu

@node msgget, msgsnd, Messages, Messages

A message queue is allocated by a @code{msgget} system call:

@example
int msgget (key_t key, int msgflg);
@end example

@itemize @bullet
@item
@code{key} : an integer usually got from @code{ftok()} or IPC_PRIVATE.@refill
@item
@code{msgflg} :
@itemize @minus
@item
IPC_CREAT : used to create a new resource if it does not already exist.
@item
IPC_EXCL | IPC_CREAT : used to ensure failure of the call if the
resource already exists.@refill
@item
rwxrwxrwx : access permissions.
@end itemize
@item
returns : msqid (an integer used for all further access) on success.
-1 on failure.@refill
@end itemize

A message queue is allocated if there is no resource corresponding
to the given key. The access permissions specified are then copied
into the @code{msg_perm} struct and the fields in @code{msqid_ds}
initialized. The user must use the IPC_CREAT flag or key = IPC_PRIVATE,
if a new instance is to be allocated. If a resource corresponding to
@var{key} already exists, the access permissions are verified.@refill

EACCES : (procure) Do not have permission for requested access.@*
EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@*
EIDRM : (procure) The resource was removed.@*
ENOSPC : All ids are taken (max of MSGMNI ids system-wide).@*
ENOENT : Resource does not exist and IPC_CREAT not specified.@*
ENOMEM : A new @code{msqid_ds} was to be created but memory could not
be allocated.

@node msgsnd, msgrcv, msgget, Messages

@example
int msgsnd (int msqid, struct msgbuf *msgp, int msgsz, int msgflg);
@end example

@itemize @bullet
@item
@code{msqid} : id obtained by a call to msgget.
@item
@code{msgsz} : size of msg text (@code{mtext}) in bytes.
@item
@code{msgp} : message to be sent. (msgp->mtype must be positive).
@item
@code{msgflg} : IPC_NOWAIT.
@item
returns : msgsz on success. -1 on error.
@end itemize

The message text and type are stored in the internal @code{msg}
structure. @code{msg_cbytes}, @code{msg_qnum}, @code{msg_lspid},
and @code{msg_stime} fields are updated. Readers waiting on the
queue are awakened.@refill

EACCES : Do not have write permission on queue.@*
EAGAIN : IPC_NOWAIT specified and queue is full.@*
EFAULT : msgp not accessible.@*
EIDRM : The message queue was removed.@*
EINTR : The queue was full and the process was interrupted while
waiting for space.@*
EINVAL : mtype < 1, msgsz > MSGMAX, msgsz < 0, msqid < 0 or unused.@*
ENOMEM : Could not allocate space for header and text.

@node msgrcv, msgctl, msgsnd, Messages

@example
int msgrcv (int msqid, struct msgbuf *msgp, int msgsz, long msgtyp,
            int msgflg);
@end example

@itemize @bullet
@item
msqid : id obtained by a call to msgget.
@item
msgsz : maximum size of message to receive.
@item
msgp : allocated by user to store the message in.
@item
msgtyp :
@itemize @minus
@item
0 => get first message on queue.
@item
> 0 => get first message of matching type.
@item
< 0 => get message with least type which is <= abs(msgtyp).
@end itemize
@item
msgflg :
@itemize @minus
@item
IPC_NOWAIT : Return immediately if message not found.
@item
MSG_NOERROR : The message is truncated if it is larger than msgsz.
@item
MSG_EXCEPT : Used with msgtyp > 0 to receive any msg except of specified
type.
@end itemize
@item
returns : size of message if found. -1 on error.
@end itemize

The first message that meets the @code{msgtyp} specification is
identified. For msgtyp < 0, the entire queue is searched for the
message with the smallest type.@refill

If its length does not exceed msgsz or if the user specified the
MSG_NOERROR flag, its text and type are copied to msgp->mtext and
msgp->mtype, and it is taken off the queue.@refill

The @code{msg_cbytes}, @code{msg_qnum}, @code{msg_lrpid},
and @code{msg_rtime} fields are updated. Writers waiting on the
queue are awakened.@refill

E2BIG : msg bigger than msgsz and MSG_NOERROR not specified.@*
EACCES : Do not have permission for reading the queue.@*
EFAULT : msgp not accessible.@*
EIDRM : msg queue was removed.@*
EINTR : No matching msg was found and the process was interrupted while
waiting for one.@*
EINVAL : msgsz > MSGMAX or msgsz < 0, msqid < 0 or unused.@*
ENOMSG : msg of requested type not found and IPC_NOWAIT specified.
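
The three kinds of @code{msgtyp} request described for @code{msgrcv} can
be seen side by side in the sketch below. The struct @code{tiny} and the
function @code{msgtyp_demo} are hypothetical names used only here.

```c
#include <sys/ipc.h>
#include <sys/msg.h>

struct tiny { long mtype; char mtext[1]; };

/* Queue three one-byte messages with types 3, 1, 2 (in that order), then
 * receive with msgtyp = 2, -3 and 0.  The types actually received are
 * stored in got[0..2].  Returns 0 on success, -1 on failure. */
int msgtyp_demo(long got[3])
{
    static const long send_order[3] = { 3, 1, 2 };
    static const long ask[3] = { 2, -3, 0 };
    struct tiny m;
    int rc = 0;

    int id = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    for (int i = 0; i < 3 && rc == 0; i++) {
        m.mtype = send_order[i];
        m.mtext[0] = 'x';
        if (msgsnd(id, &m, 1, IPC_NOWAIT) == -1)
            rc = -1;
    }
    for (int i = 0; i < 3 && rc == 0; i++) {
        if (msgrcv(id, &m, 1, ask[i], IPC_NOWAIT) == -1)
            rc = -1;
        else
            got[i] = m.mtype;       /* type of the message delivered */
    }
    msgctl(id, IPC_RMID, 0);        /* remove the queue in every case */
    return rc;
}
```

The request for type 2 skips the two messages queued ahead of it, the
request for -3 picks the message with the lowest type not above 3, and
the final request for 0 simply takes whatever is first on the queue.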

@node msgctl, msglimits, msgrcv, Messages

@example
int msgctl (int msqid, int cmd, struct msqid_ds *buf);
@end example

@itemize @bullet
@item
msqid : id obtained by a call to msgget.
@item
buf : allocated by user for reading/writing info.
@item
cmd : IPC_STAT, IPC_SET, IPC_RMID (@pxref{syscalls}).
@end itemize

IPC_STAT results in the copy of the queue data structure
into the user supplied buffer.@refill

In the case of IPC_SET, the queue size (@code{msg_qbytes})
and the @code{uid}, @code{gid}, @code{mode} (low 9 bits) fields
of the @code{msg_perm} struct are set from the user supplied values.
@code{msg_ctime} is updated.@refill

Note that only the super user may increase the limit on the size of a
message queue beyond MSGMNB.@refill

When the queue is destroyed (IPC_RMID), the sequence number is
incremented and all waiting readers and writers are awakened.
These processes will then return with @code{errno} set to EIDRM.@refill

EPERM : Insufficient privilege to increase the size of the queue (IPC_SET)
or remove it (IPC_RMID).@*
EACCES : Do not have permission for reading the queue (IPC_STAT).@*
EFAULT : buf not accessible (IPC_STAT, IPC_SET).@*
EIDRM : msg queue was removed.@*
EINVAL : invalid cmd, msqid < 0 or unused.
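
As an illustration of IPC_STAT followed by IPC_SET, the sketch below
lowers the byte limit of a queue, which needs no special privilege. The
function name @code{shrink_queue_limit} is hypothetical. Reading the
structure first matters because IPC_SET also writes back the owner and
mode fields.

```c
#include <sys/ipc.h>
#include <sys/msg.h>

/* Halve the byte limit of a fresh private queue and read it back.
 * Returns 0 if the new limit took effect, -1 otherwise. */
int shrink_queue_limit(void)
{
    struct msqid_ds ds;
    int rc = -1;

    int id = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    if (msgctl(id, IPC_STAT, &ds) == 0) {
        msglen_t wanted = ds.msg_qbytes / 2;

        ds.msg_qbytes = wanted;     /* uid, gid and mode stay as read */
        if (msgctl(id, IPC_SET, &ds) == 0 &&
            msgctl(id, IPC_STAT, &ds) == 0 &&
            ds.msg_qbytes == wanted)
            rc = 0;
    }
    msgctl(id, IPC_RMID, 0);        /* remove the queue in every case */
    return rc;
}
```

Raising the limit above the default in the same way would be expected to
fail with EPERM for an unprivileged caller.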

@node msglimits, Semaphores, msgctl, Messages
@subsection Limits on Message Resources

@noindent Sizeof various structures:

@example
msqid_ds    52   /* 1 per message queue .. dynamic */
msg         16   /* 1 for each message in system .. dynamic */
msgbuf       8   /* allocated by user */
@end example

@itemize @bullet
@item
MSGMNI : number of message queue identifiers ... policy.
@item
MSGMAX : max size of message.
Header and message space allocated on one page.
MSGMAX = (PAGE_SIZE - sizeof(struct msg)).
Implementation maximum MSGMAX = 4080.@refill
@item
MSGMNB : default max size of a message queue ... policy.
The super-user can increase the size of a
queue beyond MSGMNB by a @code{msgctl} call.@refill
@end itemize

Unused or unimplemented:@*
MSGTQL : max number of message headers system-wide.@*
MSGPOOL : total size in bytes of msg pool.

@node Semaphores, semget, msglimits, Top

Each semaphore has a value >= 0. An id provides access to an array
of @code{nsems} semaphores. Operations such as read, increment or decrement
semaphores in a set are performed by the @code{semop} call which processes
@code{nsops} operations at a time. Each operation is specified in a struct
@code{sembuf} described below. The operations are applied only if all of
them succeed.@refill

If you do not have a need for such arrays, you are probably better off using
the @code{test_bit}, @code{set_bit} and @code{clear_bit} bit-operations
defined in <asm/bitops.h>.@refill

Semaphore operations may also be qualified by a SEM_UNDO flag which
results in the operation being undone when the process exits.@refill

If a decrement cannot go through, a process will be put to sleep
on a queue waiting for the @code{semval} to increase unless it specifies
IPC_NOWAIT. A read operation can similarly result in a sleep on a
queue waiting for @code{semval} to become 0. (Actually there are
two queues per semaphore array).@refill

@noindent A semaphore array is described by:

@example
struct semid_ds @{
    struct ipc_perm sem_perm;
    time_t sem_otime;           /* last semop time */
    time_t sem_ctime;           /* last change time */
    struct wait_queue *eventn;  /* wait for a semval to increase */
    struct wait_queue *eventz;  /* wait for a semval to become 0 */
    struct sem_undo *undo;      /* undo entries */
    ushort sem_nsems;           /* no. of semaphores in array */
@};
@end example

@noindent Each semaphore is described internally by:

@example
struct sem @{
    short  sempid;              /* pid of last semop() */
    ushort semval;              /* current value */
    ushort semncnt;             /* num procs awaiting increase in semval */
    ushort semzcnt;             /* num procs awaiting semval = 0 */
@};
@end example

@menu
* semlimits::           Limits imposed by this implementation.
@end menu

@node semget, semop, Semaphores, Semaphores

A semaphore array is allocated by a @code{semget} system call:

@example
int semget (key_t key, int nsems, int semflg);
@end example

@itemize @bullet
@item
@code{key} : an integer usually got from @code{ftok} or IPC_PRIVATE
@item
@code{nsems} :
@itemize @minus
@item
number of semaphores in array (0 <= nsems <= SEMMSL <= SEMMNS)
@item
0 => don't care. Can be used when not creating the resource.
If successful you always get access to the entire array anyway.@refill
@end itemize
@item
@code{semflg} :
@itemize @minus
@item
IPC_CREAT : used to create a new resource
@item
IPC_EXCL : used with IPC_CREAT to ensure failure if the resource exists.
@item
rwxrwxrwx : access permissions.
@end itemize
@item
returns : semid on success. -1 on failure.
@end itemize

An array of nsems semaphores is allocated if there is no resource
corresponding to the given key. The access permissions specified are
then copied into the @code{sem_perm} struct for the array along with the
user-id etc. The user must use the IPC_CREAT flag or key = IPC_PRIVATE
if a new resource is to be created.@refill

EINVAL : nsems not in above range (allocate).@*
nsems greater than number in array (procure).@*
EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@*
EIDRM : (procure) The resource was removed.@*
ENOMEM : could not allocate space for semaphore array.@*
ENOSPC : No arrays available (SEMMNI), too few semaphores available (SEMMNS).@*
ENOENT : Resource does not exist and IPC_CREAT not specified.@*
EACCES : (procure) do not have permission for specified access.

@node semop, semctl, semget, Semaphores

Operations on semaphore arrays are performed by calling @code{semop}:

@example
int semop (int semid, struct sembuf *sops, unsigned nsops);
@end example

@itemize @bullet
@item
semid : id obtained by a call to semget.
@item
sops : array of semaphore operations.
@item
nsops : number of operations in array (0 < nsops <= SEMOPM).
@item
returns : semval for last operation. -1 on failure.
@end itemize

@noindent Operations are described by a structure @code{sembuf}:

@example
struct sembuf @{
    ushort sem_num;             /* semaphore index in array */
    short  sem_op;              /* semaphore operation */
    short  sem_flg;             /* operation flags */
@};
@end example

The value @code{sem_op} is to be added (signed) to the current value semval
of the semaphore with index sem_num (0 .. nsems -1) in the set.
Flags recognized in sem_flg are IPC_NOWAIT and SEM_UNDO.@refill

@noindent Two kinds of operations can result in wait:

@enumerate
@item
If sem_op is 0 (read operation) and semval is non-zero, the process
sleeps on a queue waiting for semval to become zero or returns with
error EAGAIN if (sem_flg & IPC_NOWAIT) is true.@refill
@item
If (sem_op < 0) and (semval + sem_op < 0), the process either sleeps
on a queue waiting for semval to increase or returns with error EAGAIN if
(sem_flg & IPC_NOWAIT) is true.@refill
@end enumerate

The array sops is first read in and preliminary checks performed on
the arguments. The operations are parsed to determine if any of
them needs write permissions or requests an undo operation.@refill

The operations are then tried and the process sleeps if any operation
that does not specify IPC_NOWAIT cannot go through. If a process sleeps
it repeats these checks on waking up. If any operation that requests
IPC_NOWAIT cannot go through at any stage, the call returns with errno
set to EAGAIN.@refill

Finally, operations are committed when all go through without an intervening
sleep. Processes waiting on the zero_queue or increment_queue are awakened
if any of the semvals becomes zero or is incremented respectively.@refill

E2BIG : nsops > SEMOPM.@*
EACCES : Do not have permission for requested (read/alter) access.@*
EAGAIN : An operation with IPC_NOWAIT specified could not go through.@*
EFAULT : The array sops is not accessible.@*
EFBIG : An operation had sem_num >= nsems.@*
EIDRM : The resource was removed.@*
EINTR : The process was interrupted on its way to a wait queue.@*
EINVAL : nsops is 0, semid < 0 or unused.@*
ENOMEM : SEM_UNDO requested. Could not allocate space for undo structure.@*
ERANGE : sem_op + semval > SEMVMX for some operation.
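
A one-element array used as a lock is perhaps the simplest application of
@code{semop}. The sketch below is an illustration under stated
assumptions rather than a recommended locking library: the names
@code{sem_lock_demo} and @code{union semun_arg} are hypothetical (the
caller has to declare the union itself on current Linux systems), and
failure is detected only by a return value of -1 so that it does not
depend on what a successful @code{semop} returns.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Argument union for semctl, declared by the caller. */
union semun_arg { int val; struct semid_ds *buf; unsigned short *array; };

/* Use a one-element semaphore array as a lock.  Returns 0 if the lock
 * behaves as described in the text, -1 otherwise. */
int sem_lock_demo(void)
{
    struct sembuf lock = { .sem_num = 0, .sem_op = -1,
                           .sem_flg = SEM_UNDO | IPC_NOWAIT };
    struct sembuf unlock = { .sem_num = 0, .sem_op = 1,
                             .sem_flg = SEM_UNDO };
    union semun_arg arg;
    int rc = -1;

    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    arg.val = 1;                                    /* 1 means unlocked */
    if (semctl(id, 0, SETVAL, arg) == 0 &&
        semop(id, &lock, 1) != -1 &&                /* semval 1 -> 0 */
        semop(id, &lock, 1) == -1 && errno == EAGAIN && /* would sleep */
        semop(id, &unlock, 1) != -1 &&              /* semval 0 -> 1 */
        semctl(id, 0, GETVAL) == 1)
        rc = 0;

    semctl(id, 0, IPC_RMID);                        /* remove the array */
    return rc;
}
```

Dropping IPC_NOWAIT from the @code{lock} operation would turn the second
attempt into a sleep on the queue waiting for semval to increase.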

@node semctl, semlimits, semop, Semaphores

@example
int semctl (int semid, int semnum, int cmd, union semun arg);
@end example

@itemize @bullet
@item
semid : id obtained by a call to semget.
@item
cmd :
@itemize @minus
@item
GETPID : return pid for the process that executed the last semop.
@item
GETVAL : return semval of semaphore with index semnum.
@item
GETNCNT : return number of processes waiting for semval to increase.
@item
GETZCNT : return number of processes waiting for semval to become 0.
@item
SETVAL : set semval = arg.val.
@item
GETALL : read all semvals into arg.array.
@item
SETALL : set all semvals with values given in arg.array.
@end itemize
@item
returns : 0 on success or as given above. -1 on failure.
@end itemize

The first five (GETPID through SETVAL) operate on the semaphore with
index semnum in the set.
The last two operate on all semaphores in the set.@refill

@noindent @code{arg} is a union:

@example
union semun @{
    int val;                    /* value for SETVAL */
    struct semid_ds *buf;       /* buffer for IPC_STAT and IPC_SET */
    ushort *array;              /* array for GETALL and SETALL */
@};
@end example

@itemize @bullet
@item
IPC_SET, SETVAL, SETALL : sem_ctime is updated.
@item
SETVAL, SETALL : Undo entries are cleared for altered semaphores in
all processes. Processes sleeping on the wait queues are
awakened if a semval becomes 0 or increases.@refill
@item
IPC_SET : sem_perm.uid, sem_perm.gid, sem_perm.mode are updated from
user supplied values.@refill
@end itemize

EACCES : do not have permission for specified access.@*
EFAULT : arg is not accessible.@*
EIDRM : The resource was removed.@*
EINVAL : semid < 0 or semnum < 0 or semnum >= nsems.@*
EPERM : IPC_RMID, IPC_SET ... not creator, owner or super-user.@*
ERANGE : arg.val (SETVAL) or arg.array[i] for some i (SETALL) is
outside the range 0 .. SEMVMX.
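
The whole-array commands and the single-semaphore commands can be
combined as in the sketch below. The names @code{sem_bulk_demo},
@code{NSEMS} and @code{union semun_arg} are hypothetical. The union is
declared locally because current Linux C libraries leave that to the
caller.

```c
#include <sys/ipc.h>
#include <sys/sem.h>

/* Argument union for semctl, declared by the caller. */
union semun_arg { int val; struct semid_ds *buf; unsigned short *array; };

#define NSEMS 3

/* Initialise a three-element array with SETALL, change element 1 with
 * SETVAL and read all values back into out[] with GETALL.
 * Returns 0 on success, -1 on failure. */
int sem_bulk_demo(unsigned short out[NSEMS])
{
    unsigned short init[NSEMS] = { 5, 0, 9 };
    union semun_arg arg;
    int rc = -1;

    int id = semget(IPC_PRIVATE, NSEMS, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    arg.array = init;
    if (semctl(id, 0, SETALL, arg) == 0) {      /* semnum is ignored */
        arg.val = 2;
        if (semctl(id, 1, SETVAL, arg) == 0) {  /* only element 1 */
            arg.array = out;
            if (semctl(id, 0, GETALL, arg) == 0)
                rc = 0;
        }
    }
    semctl(id, 0, IPC_RMID);                    /* remove the array */
    return rc;
}
```

After a successful call the caller's buffer holds 5, 2 and 9: the values
set by SETALL with the middle one replaced by SETVAL.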

@node semlimits, Shared Memory, semctl, Semaphores
@subsection Limits on Semaphore Resources

@noindent Sizeof various structures:

@example
semid_ds    44   /* 1 per semaphore array .. dynamic */
sem          8   /* 1 for each semaphore in system .. dynamic */
sembuf       6   /* allocated by user */
sem_undo    20   /* 1 for each undo request .. dynamic */
@end example

@itemize @bullet
@item
SEMVMX : 32767, semaphore maximum value (short).
@item
SEMMNI : number of semaphore identifiers (or arrays) system wide ... policy.
@item
SEMMSL : maximum number of semaphores per id.
1 semid_ds per array, 1 struct sem per semaphore
=> SEMMSL = (PAGE_SIZE - sizeof(semid_ds)) / sizeof(sem).
Implementation maximum SEMMSL = 500.@refill
@item
SEMMNS : maximum number of semaphores system wide ... policy.
Setting SEMMNS >= SEMMSL*SEMMNI makes it irrelevant.@refill
@item
SEMOPM : maximum number of operations in one semop call ... policy.
@end itemize

Unused or unimplemented:@*
SEMAEM : adjust on exit max value.@*
SEMMNU : number of undo structures system-wide.@*
SEMUME : maximum number of undo entries per process.

@node Shared Memory, shmget, semlimits, Top
@section Shared Memory

Shared memory is distinct from the sharing of read-only code pages or
the sharing of unaltered data pages that is available due to the
copy-on-write mechanism. The essential difference is that the
shared pages are dirty (in the case of shared memory) and can be
made to appear at a convenient location in the process' address space.@refill

@noindent A shared segment is described by:

@example
struct shmid_ds @{
    struct ipc_perm shm_perm;
    int    shm_segsz;           /* size of segment (bytes) */
    time_t shm_atime;           /* last attach time */
    time_t shm_dtime;           /* last detach time */
    time_t shm_ctime;           /* last change time */
    ulong  *shm_pages;          /* internal page table */
    ushort shm_cpid;            /* pid, creator */
    ushort shm_lpid;            /* pid, last operation */
    short  shm_nattch;          /* no. of current attaches */
@};
@end example

A @code{shmget} allocates a @code{shmid_ds} and an internal page table.
A @code{shmat} maps the segment into the process' address space with pointers
into the internal page table and the actual pages are faulted in
as needed. The memory associated with the segment must be explicitly
destroyed by calling @code{shmctl} with IPC_RMID.@refill

@menu
* shmlimits::           Limits imposed by this implementation.
@end menu

@node shmget, shmat, Shared Memory, Shared Memory

A shared memory segment is allocated by a @code{shmget} system call:

@example
int shmget (key_t key, int size, int shmflg);
@end example

@itemize @bullet
@item
key : an integer usually got from @code{ftok} or IPC_PRIVATE
@item
size : size of the segment in bytes (SHMMIN <= size <= SHMMAX).
@item
shmflg :
@itemize @minus
@item
IPC_CREAT : used to create a new resource
@item
IPC_EXCL : used with IPC_CREAT to ensure failure if the resource exists.
@item
rwxrwxrwx : access permissions.
@end itemize
@item
returns : shmid on success. -1 on failure.
@end itemize

A descriptor for a shared memory segment is allocated if there isn't one
corresponding to the given key. The access permissions specified are
then copied into the @code{shm_perm} struct for the segment along with the
user-id etc. The user must use the IPC_CREAT flag or key = IPC_PRIVATE
to allocate a new segment.@refill

If the segment already exists, the access permissions are verified,
and a check is made to see that it is not marked for destruction.@refill

@code{size} is effectively rounded up to a multiple of PAGE_SIZE as shared
memory is allocated in pages.@refill
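
The rounding mentioned above is ordinary power-of-two arithmetic. The
sketch below only computes the figure and makes no system call that
allocates anything. The function names are hypothetical, and the page
size is taken from @code{sysconf} rather than from a kernel constant.

```c
#include <stddef.h>
#include <unistd.h>

/* Round size up to a multiple of page, which must be a power of two. */
size_t round_up_to_page(size_t size, size_t page)
{
    return (size + page - 1) & ~(page - 1);
}

/* Memory a segment of the requested size can be expected to occupy on
 * the running system, given that segments are allocated in whole pages. */
size_t effective_shm_size(size_t size)
{
    return round_up_to_page(size, (size_t) sysconf(_SC_PAGESIZE));
}
```

With 4096-byte pages a request for a single byte therefore occupies one
full page, and the 0x5000 byte segment of the earlier example occupies
exactly five.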

EINVAL : (allocate) Size not in range specified above.@*
(procure) Size greater than size of segment.@*
EEXIST : (allocate) IPC_CREAT | IPC_EXCL specified and resource exists.@*
EIDRM : (procure) The resource is marked destroyed or was removed.@*
ENOSPC : (allocate) All ids are taken (max of SHMMNI ids system-wide).
Allocating a segment of the requested size would exceed the
system wide limit on total shared memory (SHMALL).@*
ENOENT : (procure) Resource does not exist and IPC_CREAT not specified.@*
EACCES : (procure) Do not have permission for specified access.@*
ENOMEM : (allocate) Could not allocate memory for shmid_ds or pg_table.

@node shmat, shmdt, shmget, Shared Memory

Maps a shared segment into the process' address space.

@example
char *shmat (int shmid, char *shmaddr, int shmflg);
@end example

@itemize @bullet
@item
shmid : id got from call to shmget.
@item
shmaddr : requested attach address.@*
If shmaddr is 0 the system finds an unmapped region.@*
If a non-zero value is indicated the value must be page
aligned or the user must specify the SHM_RND flag.@refill
@item
shmflg :@*
SHM_RDONLY : request read-only attach.@*
SHM_RND : attach address is rounded DOWN to a multiple of SHMLBA.
@item
returns : virtual address of attached segment. -1 on failure.
@end itemize

When shmaddr is 0, the attach address is determined by finding an
unmapped region in the address range 1G to 1.5G, starting at 1.5G
and coming down from there. The algorithm is very simple so you
are encouraged to avoid non-specific attaches.

@enumerate
@item
Determine attach address as described above.
@item
Check region (shmaddr, shmaddr + size) is not mapped and allocate
page tables (undocumented SHM_REMAP flag!).
@item
Map the region by setting up pointers into the internal page table.
@item
Add a descriptor for the attach to the task struct for the process.
@item
@code{shm_nattch}, @code{shm_lpid}, @code{shm_atime} are updated.
@end enumerate

@itemize @bullet
@item
The @code{brk} value is not altered.
@item
The segment is automatically detached when the process exits.
@item
The same segment may be attached as read-only or read-write and
more than once in the process' address space.
@item
A shmat can succeed on a segment marked for destruction.
@item
The request for a particular type of attach is made using the SHM_RDONLY flag.
There is no notion of a write-only attach. The requested attach
permissions must fall within those allowed by @code{shm_perm.mode}.
@end itemize
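
Two of the notes above, that one segment may be attached more than once
and with different access, can be checked with a short sketch. The
function name @code{double_attach} is hypothetical and both attach
addresses are left for the system to choose.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Attach one segment twice, read-write and read-only, and check that a
 * store through the first mapping is visible through the second.
 * On success returns 0 and stores in *nattch the attach count seen while
 * both mappings exist.  Returns -1 on failure. */
int double_attach(unsigned long *nattch)
{
    struct shmid_ds ds;
    int rc = -1;

    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    char *rw = shmat(id, NULL, 0);              /* read-write attach */
    char *ro = shmat(id, NULL, SHM_RDONLY);     /* read-only attach */
    if (rw != (char *) -1 && ro != (char *) -1) {
        rw[3] = 'Z';
        if (ro != rw && ro[3] == 'Z' &&
            shmctl(id, IPC_STAT, &ds) == 0) {
            *nattch = (unsigned long) ds.shm_nattch;
            rc = 0;
        }
    }
    if (ro != (char *) -1)
        shmdt(ro);
    if (rw != (char *) -1)
        shmdt(rw);
    shmctl(id, IPC_RMID, NULL);                 /* remove the segment */
    return rc;
}
```

A store through the read-only mapping would fault, which is why the
sketch writes only through @code{rw}.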

EACCES : Do not have permission for requested access.@*
EINVAL : shmid < 0 or unused, shmaddr not aligned, attach at brk failed.@*
EIDRM : resource was removed.@*
ENOMEM : Could not allocate memory for descriptor or page tables.

@node shmdt, shmctl, shmat, Shared Memory

@example
int shmdt (char *shmaddr);
@end example

@itemize @bullet
@item
shmaddr : attach address of segment (returned by shmat).
@item
returns : 0 on success. -1 on failure.
@end itemize

An attached segment is detached and @code{shm_nattch} decremented. The
occupied region in user space is unmapped. The segment is destroyed
if it is marked for destruction and @code{shm_nattch} is 0.
@code{shm_lpid} and @code{shm_dtime} are updated.@refill

EINVAL : No shared memory segment attached at shmaddr.

@node shmctl, shmlimits, shmdt, Shared Memory

Destroys allocated segments. Reads/Writes the control structures.

@example
int shmctl (int shmid, int cmd, struct shmid_ds *buf);
@end example

@itemize @bullet
@item
shmid : id got from call to shmget.
@item
cmd : IPC_STAT, IPC_SET, IPC_RMID (@pxref{syscalls}).
@itemize @minus
@item
IPC_SET : Used to set the owner uid, gid, and @code{shm_perm.mode} field.
@item
IPC_RMID : The segment is marked destroyed. It is only destroyed
on the last detach.@refill
@item
IPC_STAT : The shmid_ds structure is copied into the user allocated buffer.
@end itemize
@item
buf : used to read (IPC_STAT) or write (IPC_SET) information.
@item
returns : 0 on success, -1 on failure.
@end itemize

The user must execute an IPC_RMID shmctl call to free the memory
allocated by the shared segment. Otherwise all the pages faulted in
will continue to live in memory or swap.@refill
1178
EACCES : Do not have permission for requested access.@*
1180
EFAULT : buf is not accessible.@*
1182
EINVAL : shmid < 0 or unused.@*
1184
EIDRM : identifier destroyed.@*
1186
EPERM : not creator, owner or super-user (IPC_SET, IPC_RMID).
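
The three commands can be exercised together as in the following sketch.
It reads the control structure with IPC_STAT, changes the mode with
IPC_SET, reads it back, and finally issues IPC_RMID. The function name
@code{shmctl_demo} and the particular modes (0600, then 0640) are
assumptions made for the example.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical helper: returns 0 if IPC_STAT reports one attach and
   IPC_SET changes the permission bits, -1 otherwise. */
int shmctl_demo(void)
{
    struct shmid_ds ds;
    int ok = 0;

    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    void *addr = shmat(shmid, NULL, 0);

    /* IPC_STAT copies the shmid_ds structure into our buffer. */
    if (addr != (void *)-1 &&
        shmctl(shmid, IPC_STAT, &ds) == 0 && ds.shm_nattch == 1) {
        /* IPC_SET writes back the owner uid, gid and mode from the buffer. */
        ds.shm_perm.mode = 0640;
        if (shmctl(shmid, IPC_SET, &ds) == 0 &&
            shmctl(shmid, IPC_STAT, &ds) == 0 &&
            (ds.shm_perm.mode & 0777) == 0640)
            ok = 1;
    }

    /* Without IPC_RMID the segment would outlive this process. */
    shmctl(shmid, IPC_RMID, NULL);
    if (addr != (void *)-1)
        shmdt(addr);                    /* last detach destroys the segment */
    return ok ? 0 : -1;
}
```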
@node shmlimits, Notes, shmctl, Shared Memory
@subsection Limits on Shared Memory Resources

@itemize @bullet
@item
SHMMNI max number of shared segments system wide ... 4096.
@item
SHMMAX max shared memory segment size (bytes) ... 4M.
@item
SHMMIN min shared memory segment size (bytes).
1 byte (though PAGE_SIZE is the effective minimum size).@refill
@item
SHMALL max shared mem system wide (in pages) ... policy.
@item
SHMLBA segment low boundary address multiple.
Must be page aligned. SHMLBA = PAGE_SIZE.@refill
@end itemize

Unused or unimplemented:@*
SHMSEG : maximum number of shared segments per process.
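
The remark that PAGE_SIZE is the effective minimum size can be checked
with a small sketch: a segment requested with a size of 1 byte still
occupies a whole page once attached, and the attach address is page
aligned. The function name @code{min_segment_demo} is hypothetical, and
the page size is obtained with @code{sysconf}, which is an assumption of
this example rather than part of the IPC interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if a 1 byte segment attaches at a page
   aligned address and its whole first page is usable, -1 otherwise. */
int min_segment_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    int shmid = shmget(IPC_PRIVATE, 1, IPC_CREAT | 0600);  /* SHMMIN sized */
    if (shmid < 0)
        return -1;

    char *p = shmat(shmid, NULL, 0);
    shmctl(shmid, IPC_RMID, NULL);      /* destroyed on the detach below */
    if (p == (char *)-1)
        return -1;

    int ok = ((uintptr_t)p % (uintptr_t)page) == 0;   /* page aligned */

    p[page - 1] = 'x';                  /* last byte of the mapped page */
    ok = ok && p[page - 1] == 'x';

    shmdt(p);
    return ok ? 0 : -1;
}
```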
@node Notes, Top, shmlimits, Top
@section Miscellaneous Notes

All the IPC system calls are mapped onto a single one, @code{sys_ipc}.
This should be transparent to the user.@refill
@subsection Semaphore @code{undo} requests

There is one @code{sem_undo} structure associated with a process for
each semaphore which was altered (with an undo request) by the process.
@code{sem_undo} structures are freed only when the process exits.

One major cause for unhappiness with the undo mechanism is that
it does not fit in with the notion of having an atomic set of
operations on an array. The undo requests for an array and each
semaphore therein may have been accumulated over many @code{semop}
calls. Thus use the undo mechanism with private semaphores only.@refill

Should the process sleep in @code{exit} or should all undo
operations be applied with the IPC_NOWAIT flag in effect?
Currently those undo operations which go through immediately are
applied and those that require a wait are ignored silently.@refill
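
The undo mechanism can be observed with a sketch along the following
lines. A child process decrements a private semaphore with SEM_UNDO and
exits without restoring it; the parent then reads the value, which should
be back at its initial setting. The names @code{undo_demo} and
@code{union semun_arg} are hypothetical (the caller has to declare the
union used as the fourth @code{semctl} argument), as are the initial
value 1 and the 0600 mode.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The fourth semctl argument; declared by the caller. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Hypothetical helper: returns the semaphore value seen by the parent
   after the child has exited, or -1 on failure. */
int undo_demo(void)
{
    int value = -1;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    union semun_arg arg;
    arg.val = 1;                                  /* initial semval */
    if (semctl(semid, 0, SETVAL, arg) == 0) {
        pid_t pid = fork();
        if (pid == 0) {
            struct sembuf op = { .sem_num = 0, .sem_op = -1,
                                 .sem_flg = SEM_UNDO };
            semop(semid, &op, 1);   /* semval 1 -> 0, adjust of +1 recorded */
            _exit(0);               /* the undo is applied during exit */
        }
        if (pid > 0) {
            waitpid(pid, NULL, 0);
            value = semctl(semid, 0, GETVAL);     /* back to 1 */
        }
    }
    semctl(semid, 0, IPC_RMID);
    return value;
}
```

Here the adjustment goes through immediately, so the question of sleeping
in @code{exit} does not arise.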
@subsection Shared memory, @code{malloc} and the @code{brk}.

Note that since this section was written the implementation was
changed so that non-specific attaches are done in the region
1G to 1.5G. However much of the following is still worth thinking
about so I left it in.

On many systems, the shared memory is allocated in a special region
of the address space ... way up somewhere. As mentioned earlier,
this implementation attaches shared segments at the lowest possible
address. Thus if you plan to use @code{malloc}, it is wise to malloc a
large space and then proceed to attach the shared segments. This way
malloc sets the brk sufficiently above the region it will use.@refill

Alternatively you can use @code{sbrk} to adjust the @code{brk} value
as you make shared memory attaches. The implementation is not very
smart about selecting attach addresses. Using the system default
addresses will result in fragmentation if detaches do not occur
in the reverse order of the attaches.@refill

Taking control of the matter is probably best. The rule applied
is that attaches are allowed in unmapped regions other than
in the text space (see <a.out.h>). Also remember that attach addresses
and segment sizes are multiples of PAGE_SIZE.@refill

One more trap (I quote Bruno on this). If you use malloc() to get space
for your shared memory (i.e. to fix the @code{brk}), you must ensure you
get an unmapped address range. This means you must allocate more memory
than you had ever allocated before. Memory returned by malloc(), used,
then freed by free() and then again returned by malloc() is no good.
Neither is memory obtained from calloc().@refill

Note that a shared memory region remains a shared memory region until
you unmap it. Attaching a segment at the @code{brk} and calling malloc
after that will result in an overlap of what malloc thinks is its
space with what is really a shared memory region. For example in the case
of a read-only attach, you will not be able to write to the overlapped
region.@refill
@subsection Fork, exec and exit

On a fork, the child inherits attached shared memory segments but
not the semaphore undo information.@refill

In the case of an exec, the attached shared segments are detached.
The semaphore undo information however remains intact.@refill

Upon exit, all attached shared memory segments are detached.
The adjust values in the undo structures are added to the relevant semvals
if the operations are permitted. Disallowed operations are ignored.@refill
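
The inheritance of attached segments across a fork can be seen in a
short sketch: the parent attaches a segment, the child writes to it
through the inherited mapping, and the parent reads the value after the
child has exited. The name @code{fork_inherit_demo} and the value 42 are
arbitrary choices for the example.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the value the child stored in the shared
   segment (42 in this sketch), or -1 on failure. */
int fork_inherit_demo(void)
{
    int seen = -1;

    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;

    volatile int *shared = shmat(shmid, NULL, 0);
    shmctl(shmid, IPC_RMID, NULL);      /* destroyed after the last detach */
    if (shared == (void *)-1)
        return -1;

    *shared = 0;
    pid_t pid = fork();
    if (pid == 0) {
        *shared = 42;                   /* child uses the inherited attach */
        _exit(0);                       /* exit detaches the child's copy */
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);
        seen = *shared;                 /* parent sees the child's write */
    }

    shmdt((void *)shared);
    return seen;
}
```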
@subsection Other Features

These features of the current implementation are
likely to be modified in the future.

The SHM_LOCK and SHM_UNLOCK commands are available (super-user only) for
use with the @code{shmctl} call to prevent swapping of a shared segment.
The user must fault in any pages that are required to be present after
locking.

The IPC_INFO, MSG_STAT, MSG_INFO, SHM_STAT, SHM_INFO, SEM_STAT, SEM_INFO
@code{ctl} calls are used by the @code{ipcs} program to provide information
on allocated resources. These can be modified as needed or moved to a proc
file system interface.

Thanks to Ove Ewerlid, Bruno Haible, Ulrich Pegelow and Linus Torvalds
for ideas, tutorials, bug reports and fixes, and merriment. And more